Internally the dns code has the ability to try connecting to multiple
addresses in a sequence. Expose this, as we'll want it for reconnection.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Fixes the `short_channel_id` being serialized as 4 bytes block height,
3 bytes transaction index and 1 byte output number, to use 3+3+2 as
the spec says.
The reordering in the unit test structs is mainly to be able to still
use `eq_upto` for tests.
This was responsible for a huge number of loglines simply because we
log every subprocess start and termination. Moving the logging
upstream to where it is really needed gets rid of the polling, that
are successful. A highly unscientific test shows a reduction in
loglines produced by lightningd from 17000 to 10000 lines.
The single string-based hostname and port has been retired in favor of
having multiple `struct ipaddr`s from the `node_announcement`. This
breaks the hostnames and ports from IRC, but I didn't bother to
backport ipaddr for it since it is only used in the legacy daemon.
Rather a big commit, but I couldn't figure out how to split it
nicely. It introduces a new message from the channel to the master
signaling that the channel has been announced, so that the master can
take care of announcing the node itself. A provisorial announcement is
created and passed to the HSM, which signs it and passes it back to
the master. Finally the master injects it into gossipd which will take
care of broadcasting it.
We alternated between using a sha256 and using a privkey, but there are
numerous places where we have a random 32 bytes which are neither.
This fixes many of them (plus, struct privkey is now defined in terms of
struct secret).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The new onion uses the `channel_id` instead of the `node_id` of the
next hop to identify where to forward the payment. So we return the
exact channel chosen by the routing algo, to avoid having to look it
up again later.
We seem to be getting spurious reconnect failures on Travis, this
should fix it (15 seconds may be too long at worst case).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The callback on `key_negotiate` was closing the connection under
certain circumstances and would also `free` the key_negotiate, which
would then be freed again once it returns. We steal it off of the
connection during the callback and doing the free manually afterwards
to make sure this can't happen.
Thanks to @jgriffiths for tracking this one down.
Fixes#142
Reported-By: @bjd and @bgorlick
This fails on the old dev-restart tests, so we need to only enable it
for the new tests:
rusty@rusty-XPS-13-9360:~/devel/cvs/lightning (guilt/ping-pong)$ daemon/test/test-basic --restart --verbose
...
{ }
RESTARTING
dev-restart failed!
valgrind: mmap(0x38000000, 2265088) failed in UME with error 22 (Invalid argument).
valgrind: this can be caused by executables with very large text, data or bss segments.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This just means we put the outgoing_cltv_value where we used to put zeroes.
The old daemon simply ignores this, but the new one should check it as per
BOLT 4.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Appending the channel direction ensures that the directions will be
treated independently. Without this only one direction would be
forwarded to the peers.
The direction bit was computed in several spots and was inconsistent
in some cases. Now we compute it just in routing, and once when
starting up `channeld`, this avoids recomputing it all over the place.
Before exiting, `channeld` constructs and sends a `channel_update`
marking the channel as disabled. This is the pro-active signalling
that the channel may no longer be used.
The JSON-RPC call `getroute` and the functionality to compute the
actual route have been split so that we can reuse it independently of
the JSON-interface. Since this is now a routing-only method I also
moved it into `routing.[ch]` instead of `pay.c`.
The spec 4af8e1841151f0c6e8151979d6c89d11839b2f65 uses a 32-byte 'channel-id'
field, not to be confused with the 8-byte short ID used by gossip. Rename
appropriately, and update to the new handshake protocol.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We use a different 'struct peer' in the new daemons, so make sure
the structure isn't assumed in any shared files.
This is a temporary shim.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The peer structure is only for the old daemon; instead move the list
of all outgoing txs for rebroadcasting into struct topology (still
owned by peers, so they are removed when it exits).
One subtlety: on exit, struct topology is free before the peers,
so they end up removing from a freed list. Thus we actually free
every outgoing tx manually on topology free.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
There's no reason to think that the seed isn't reproducable from the
output: we don't want to give away our siphash seed and allow hashbombing,
so seed isaac with the SHA of the seed.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Usually if we get a packet while closing (onchain event), we're going
through pkt_in which discards it. However, if we're reconnecting, we
simply process the init packet and get upset because they've forgotten
us.
Hard to reproduce, but here's the log (in this case, test-routing --reconnect
and we have just done mutual close):
We reconnect in STATE_MUTUAL_CLOSING, send INIT pkt:
+19.397025114 023ec94fb93c669154ba7b08907276e8c8661b2e65d80fc2c089215d5395574898:DEBUG: Init with ack 1 opens + 9 sigs + 8 revokes + 1 shutdown + 1 closing
While waiting for response, we see the mutual close...
+19.398732602 lightningd(4637):DEBUG: reaped 6370: bitcoin-cli -regtest=1 -datadir=/tmp/bitcoin-lightning2 getblock 2a63b209e17aedc5b1bcc6c2f9e044f97c9c3ca136fc64a719f704d2f632df5f false
+19.401834422 lightningd(4637):DEBUG: Adding block 5fdf32f6d204f719a764fc36a13c9c7cf944e0f9c2c6bcb1c5ed7ae109b2632a
+19.405167334 023ec94fb93c669154ba7b08907276e8c8661b2e65d80fc2c089215d5395574898:DEBUG: Got UTXO spend for 8bb48a:0: 7f5e422f...
+19.412543610 023ec94fb93c669154ba7b08907276e8c8661b2e65d80fc2c089215d5395574898:DEBUG: anchor_spent: STATE_MUTUAL_CLOSING => STATE_CLOSE_ONCHAIN_MUTUAL
And we also see it buried "forever" (10 blocks in test mode), so we forget peer:
+19.423045014 023ec94fb93c669154ba7b08907276e8c8661b2e65d80fc2c089215d5395574898:DEBUG: Anchor at depth 13
+19.426775063 023ec94fb93c669154ba7b08907276e8c8661b2e65d80fc2c089215d5395574898:DEBUG: check_for_resolution: STATE_CLOSE_ONCHAIN_MUTUAL => STATE_CLOSED
+19.427613109 023ec94fb93c669154ba7b08907276e8c8661b2e65d80fc2c089215d5395574898:DEBUG: db_forget_peer(023ec94fb93c669154ba7b08907276e8c8661b2e65d80fc2c089215d5395574898)
+19.428130685 023ec94fb93c669154ba7b08907276e8c8661b2e65d80fc2c089215d5395574898:DEBUG: db_start_transaction(023ec94fb93c669154ba7b08907276e8c8661b2e65d80fc2c089215d5395574898)
+19.501027511 023ec94fb93c669154ba7b08907276e8c8661b2e65d80fc2c089215d5395574898:DEBUG: db_commit_transaction(023ec94fb93c669154ba7b08907276e8c8661b2e65d80fc2c089215d5395574898)
Now, we get their reply, but they've forgotten us:
+19.520208608 023ec94fb93c669154ba7b08907276e8c8661b2e65d80fc2c089215d5395574898:DEBUG: Decrypted header len 5
+19.520872035 023ec94fb93c669154ba7b08907276e8c8661b2e65d80fc2c089215d5395574898:DEBUG: Received packet LEN=5, type=PKT__PKT_INIT
+19.520999082 023ec94fb93c669154ba7b08907276e8c8661b2e65d80fc2c089215d5395574898:DEBUG: Our order counter is 19, their ack 0
+19.521078913 023ec94fb93c669154ba7b08907276e8c8661b2e65d80fc2c089215d5395574898:DEBUG: They acked 0, remote=16 local=15
+19.521447174 023ec94fb93c669154ba7b08907276e8c8661b2e65d80fc2c089215d5395574898:DEBUG: Queued pkt PKT__PKT_OPEN (order=19)
+19.522563794 023ec94fb93c669154ba7b08907276e8c8661b2e65d80fc2c089215d5395574898:DEBUG: Queued pkt PKT__PKT_OPEN_COMMIT_SIG (order=19)
+19.523517319 023ec94fb93c669154ba7b08907276e8c8661b2e65d80fc2c089215d5395574898:BROKEN: Can't rexmit 2 when local commit 15 and remote 16
+19.524613177 023ec94fb93c669154ba7b08907276e8c8661b2e65d80fc2c089215d5395574898:UNUSUAL: Sending PKT_ERROR: invalid ack
+19.526638447 023ec94fb93c669154ba7b08907276e8c8661b2e65d80fc2c089215d5395574898:DEBUG: Queued pkt PKT__PKT_ERROR (order=19)
+19.527508022 023ec94fb93c669154ba7b08907276e8c8661b2e65d80fc2c089215d5395574898:DEBUG: peer_comms_err: STATE_CLOSED => STATE_ERR_BREAKDOWN
We should never transition from STATE_CLOSED to STATE_ERR_BREAKDOWn,
and that's what this check prevents.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Or for blackbox tests --gdb1=<subdaemon> / --gdb2=<subdaemon>.
This makes the subdaemon wait as soon as it's execed, so we can attach
the debugger.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This seems rather easy to fix, the only case we do not want to set
`STATE_SHUTDOWN` us when we have updates which we have not committed
yet, which is handled separately in the other IF-branch.
The `dstate` reference was only an indirection to the `timers`
sub-structure anyway, so removing this indirection allows us to reuse
the timers in the subdaemon arch.
We used to have a permutation map; this reintroduces a variant which
uses the htlc pointers directly.
We need this because we have to send the htlc-tx signatures in output
order as part of the protocol: without two-stage HTLCs we only needed
to wire them up in the unilateral spend case so we simply brute-forced
the ordering.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>