Valgrind error file: /tmp/lightning-8k06jbb3/test_disconnect/lightning-7/valgrind-errors
==32307== Uninitialised byte(s) found during client check request
==32307== at 0x11EBAD: memcheck_ (mem.h:247)
==32307== by 0x11EC18: towire (towire.c:14)
==32307== by 0x11EF19: towire_short_channel_id (towire.c:92)
==32307== by 0x12203E: towire_channel_update (gen_peer_wire.c:918)
==32307== by 0x1148D4: send_channel_update (channel.c:185)
==32307== by 0x1175C5: peer_conn_broken (channel.c:1010)
==32307== by 0x13186F: destroy_conn (poll.c:173)
==32307== by 0x13188F: destroy_conn_close_fd (poll.c:179)
==32307== by 0x13B279: notify (tal.c:235)
==32307== by 0x13B721: del_tree (tal.c:395)
==32307== by 0x13BB3A: tal_free (tal.c:504)
==32307== by 0x130522: io_close (io.c:415)
==32307== Address 0xffefff87d is on thread 1's stack
==32307== in frame #2, created by towire_short_channel_id (towire.c:88)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is simpler than passing back and forth, for the moment at least. That
means we don't need to ask for a new one on reconnect.
This partially reverts the gossip handling in openingd, since it no longer
passes the gossip fd back. We also close it when the peer is freed, so
it needs initializing to -1.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We can go to release a gossip peer, and it can fail at the same time.
We work around the problem that the reply must be a gossipctl_release_peer_reply
with two fds, but it's not pretty.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Internally the dns code has the ability to try connecting to multiple
addresses in a sequence. Expose this, as we'll want it for reconnection.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We kill the existing connection if possible; this may mean simply
forgetting the prior peer altogether if it's in an early state.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Instead, send it the funding_signed message; it can watch, save to
database, and send it.
Now the openingd fundee path is a simple request and response, too.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Simplifies state machine. Master still has to calculate the tx to get
the signature and broadcast, but now the opening daemon funding path
is a simple request/response.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We want to use it in peer_control to generate the transaction, but we
really only need the funding_pubkey.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Like the fd, it's only useful when the peer is not in a daemon, so we
free & NULL it when that happens.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We steal it when we're closing connection, but we normally want to forget
it if connection just dies.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
1. We explicitly assert what state we're coming from, to make transitions
clearer.
2. Every transition has a state, even between owners while waiting for HSM.
3. Explicitly step through getting the HSM signature on the funding tx
before starting channeld, rather than doing it in parallel: makes
states clearer.
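
A minimal sketch of point 1, assuming hypothetical state and function names
(none of these identifiers are taken from the tree): the caller names the
state it expects to leave, and the setter asserts it before moving on.

```c
#include <assert.h>

/* Hypothetical states for illustration only. */
enum peer_state_example {
	EX_OPENINGD,
	EX_GETTING_HSM_SIG,
	EX_CHANNELD_STARTING
};

/* Hypothetical, cut-down peer. */
struct peer_example {
	enum peer_state_example state;
};

/* Move from old_state to next_state, asserting where we came from. */
static void peer_set_state_example(struct peer_example *peer,
				   enum peer_state_example old_state,
				   enum peer_state_example next_state)
{
	assert(peer->state == old_state);
	peer->state = next_state;
}
```

Naming the old state at each call site makes an unexpected transition fail
loudly at the point where it happens.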
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We need to do this on every connection, whether reconnecting or not,
so it makes sense for the handshake daemon to handle it and return
the feature fields.
Longer term I'm considering having the handshake daemon handle the
listening and connecting, and simply hand the fds back once the peers
are ready.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We currently create a peer struct, then complete handshake to find out
who it is. This means we have a half-formed peer, and worse: if it's
a reconnect we get two identical peers.
Add an explicit 'struct connection' for the handshake phase, and
construct a 'struct peer' once that's done.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Like many, I don't read any documentation besides the readme in the
repo, so I thought I could just pull some simple getting started info
into the readme to make it easy for people to get started :-)
Fixes the `short_channel_id` being serialized as 4 bytes block height,
3 bytes transaction index and 1 byte output number, to use 3+3+2 as
the spec says.
The reordering in the unit test structs is mainly to be able to still
use `eq_upto` for tests.
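
As a rough illustration of the 3+3+2 layout, here is a sketch that packs the
three fields into 8 big-endian bytes. The struct and function names are
hypothetical and are not the ones used in towire.c.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical struct for illustration only. */
struct short_channel_id_example {
	uint32_t blocknum; /* block height, must fit in 3 bytes */
	uint32_t txnum;    /* transaction index, must fit in 3 bytes */
	uint16_t outnum;   /* output number, 2 bytes */
};

/* Serialize as 3 bytes blocknum, 3 bytes txnum, 2 bytes outnum. */
static void scid_to_wire_example(const struct short_channel_id_example *scid,
				 uint8_t out[8])
{
	assert(scid->blocknum <= 0xFFFFFF);
	assert(scid->txnum <= 0xFFFFFF);
	out[0] = (uint8_t)(scid->blocknum >> 16);
	out[1] = (uint8_t)(scid->blocknum >> 8);
	out[2] = (uint8_t)scid->blocknum;
	out[3] = (uint8_t)(scid->txnum >> 16);
	out[4] = (uint8_t)(scid->txnum >> 8);
	out[5] = (uint8_t)scid->txnum;
	out[6] = (uint8_t)(scid->outnum >> 8);
	out[7] = (uint8_t)scid->outnum;
}

/* Inverse of scid_to_wire_example(). */
static void scid_from_wire_example(const uint8_t in[8],
				   struct short_channel_id_example *scid)
{
	scid->blocknum = ((uint32_t)in[0] << 16) | ((uint32_t)in[1] << 8) | in[2];
	scid->txnum = ((uint32_t)in[3] << 16) | ((uint32_t)in[4] << 8) | in[5];
	scid->outnum = (uint16_t)((in[6] << 8) | in[7]);
}
```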
This was responsible for a huge number of loglines, simply because we
log every subprocess start and termination. Moving the logging
upstream to where it is really needed gets rid of the lines for polls
that are successful. A highly unscientific test shows a reduction in
loglines produced by lightningd from 17000 to 10000 lines.
I caught the gossip daemon freeing a message, while it was queued to be
written. Using tal_dup_arr() is the Right Thing, as it handles taken()
properly automatically.
------------------------------- Valgrind errors --------------------------------
Valgrind error file: /tmp/lightning-rvc7d5oi/test_forward/lightning-3/valgrind-errors
==11057== Invalid read of size 8
==11057== at 0x1328F2: to_tal_hdr (tal.c:174)
==11057== by 0x133894: tal_len (tal.c:659)
==11057== by 0x11BBE7: do_write_wire (wire_io.c:103)
==11057== by 0x127B95: do_plan (io.c:369)
==11057== by 0x127C31: io_ready (io.c:390)
==11057== by 0x129461: io_loop (poll.c:295)
==11057== by 0x10CBB4: main (gossip.c:722)
==11057== Address 0x55a99d8 is 24 bytes inside a block of size 200 free'd
==11057== at 0x4C2ED5B: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==11057== by 0x133000: del_tree (tal.c:416)
==11057== by 0x132F77: del_tree (tal.c:405)
==11057== by 0x13333E: tal_free (tal.c:504)
==11057== by 0x1123F1: queue_broadcast (broadcast.c:38)
==11057== by 0x111EB0: handle_node_announcement (routing.c:918)
==11057== by 0x10B166: handle_gossip_msg (gossip.c:170)
==11057== by 0x10B76B: owner_msg_in (gossip.c:335)
==11057== by 0x12712E: next_plan (io.c:59)
==11057== by 0x127BD0: do_plan (io.c:376)
==11057== by 0x127C09: io_ready (io.c:386)
==11057== by 0x129461: io_loop (poll.c:295)
==11057== Block was alloc'd at
==11057== at 0x4C2DB2F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==11057== by 0x132AE7: allocate (tal.c:245)
==11057== by 0x1330A3: tal_alloc_ (tal.c:443)
==11057== by 0x1332A6: tal_alloc_arr_ (tal.c:491)
==11057== by 0x133FEC: tal_dup_ (tal.c:846)
==11057== by 0x112347: new_queued_message (broadcast.c:20)
==11057== by 0x11240B: queue_broadcast (broadcast.c:43)
==11057== by 0x111EB0: handle_node_announcement (routing.c:918)
==11057== by 0x10B166: handle_gossip_msg (gossip.c:170)
==11057== by 0x10B76B: owner_msg_in (gossip.c:335)
==11057== by 0x12712E: next_plan (io.c:59)
==11057== by 0x127BD0: do_plan (io.c:376)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
wire_io: make a copy in io_write_wire (unless taken()).
I hit a corner case where gossipd freed a duplicate while it was being
sent out; this kind of thing doesn't happen if io_write_wire() makes
a copy by default.
We also do a memcheck() here; this gives us a caller in the backtrace
if there are uninitialized bytes, rather than waiting until the write
which happens later.
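
A sketch of the copy-by-default idea, using plain malloc() and an explicit
"take" flag as stand-ins for tal and taken(). The struct and function names
here are hypothetical; this is not the io_write_wire() implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical holder for a message waiting to be written. */
struct out_msg_example {
	uint8_t *data;
	size_t len;
};

/* Queue msg for writing. Unless the caller hands over ownership
 * (take == true), we keep our own copy, so the caller may free its
 * buffer before the write actually happens. */
static bool queue_msg_example(struct out_msg_example *out,
			      uint8_t *msg, size_t len, bool take)
{
	assert(msg != NULL && len > 0);
	if (take) {
		out->data = msg;
	} else {
		out->data = malloc(len);
		if (!out->data)
			return false;
		memcpy(out->data, msg, len);
	}
	out->len = len;
	return true;
}
```

With the copy made up front, a caller freeing its buffer early no longer
affects the queued write.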
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
eg:
test_routing_gossip (__main__.LightningDTests) ... ERROR
======================================================================
ERROR: test_routing_gossip (__main__.LightningDTests)
----------------------------------------------------------------------
Traceback (most recent call last):
File "tests/test_lightningd.py", line 150, in tearDown
err_count += self.printValgrindErrors(node)
File "tests/test_lightningd.py", line 137, in printValgrindErrors
errors, fname = self.getValgrindErrors(node)
File "tests/test_lightningd.py", line 132, in getValgrindErrors
with open(error_file, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/lightning-l106st0a/test_routing_gossip/lightning-1/valgrind-errors'
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Now in sync with 8ee57b97738b1e9467a1342ca8373d40f0c4aca5.
Our tool doesn't need to convert them any more, but we actually had a
mis-typed field in the HSM which needed fixing.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The single string-based hostname and port has been retired in favor of
having multiple `struct ipaddr`s from the `node_announcement`. This
breaks the hostnames and ports from IRC, but I didn't bother to
backport ipaddr for it since it is only used in the legacy daemon.
Rather a big commit, but I couldn't figure out how to split it
nicely. It introduces a new message from the channel to the master
signaling that the channel has been announced, so that the master can
take care of announcing the node itself. A provisional announcement is
created and passed to the HSM, which signs it and passes it back to
the master. Finally the master injects it into gossipd which will take
care of broadcasting it.
We alternated between using a sha256 and using a privkey, but there are
numerous places where we have a random 32 bytes which are neither.
This fixes many of them (plus, struct privkey is now defined in terms of
struct secret).
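
Roughly, the relationship between the two types could look like the sketch
below. The field names are an assumption for illustration, and the comparison
helper is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* 32 random bytes that are neither a hash nor necessarily a key. */
struct secret {
	uint8_t data[32];
};

/* A private key is a secret with extra meaning, so it wraps one. */
struct privkey {
	struct secret secret;
};

/* Hypothetical helper: byte-for-byte comparison. Sketch only; memcmp()
 * is not constant time. */
static int secret_eq_example(const struct secret *a, const struct secret *b)
{
	assert(a && b);
	return memcmp(a->data, b->data, sizeof(a->data)) == 0;
}
```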
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>