We don't track them accurately when in onchaind, but we don't want to:
onchaind can be restarted at any time.
Once it's all settled, we're clear to clean them up.
Before this, valgrind could complain about deferncing hout->key.peer:
Valgrind error file: valgrind-errors.10876
==10876== Invalid read of size 4
==10876== at 0x41F8AF: peer_on_chain (peer_control.h:127)
==10876== by 0x42340D: notify_new_block (peer_htlcs.c:1461)
==10876== by 0x40A08D: connect_block (chaintopology.c:96)
==10876== by 0x40A96B: topology_changed (chaintopology.c:313)
==10876== by 0x40AC85: add_block (chaintopology.c:384)
==10876== by 0x40ABF0: gather_previous_blocks (chaintopology.c:363)
==10876== by 0x4051B3: process_rawblock (bitcoind.c:410)
==10876== by 0x4044DD: bcli_finished (bitcoind.c:155)
==10876== by 0x454665: destroy_conn (poll.c:183)
==10876== by 0x454685: destroy_conn_close_fd (poll.c:189)
==10876== by 0x45DF89: notify (tal.c:240)
==10876== by 0x45E43A: del_tree (tal.c:400)
==10876== Address 0x6929208 is 2,120 bytes inside a block of size 2,416 free'd
==10876== at 0x4C2EDEB: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==10876== by 0x45E513: del_tree (tal.c:421)
==10876== by 0x45E849: tal_free (tal.c:509)
==10876== by 0x41A8E9: handle_irrevocably_resolved (peer_control.c:1172)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
These need to be different for testing the example in BOLT 11.
We also use the cltv_final instead of deadline_blocks in the final hop:
various tests assumed 5 was OK, so we tweak utils.py.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Normally, we get an error as soon as we send WIRE_REVOKE_AND_ACK. But if the
commit timer goes off, we get some extra cycles, during which the other side
can reconnect. In this case, we simply kill the channeld before it fails,
and never check for the permfail string.
b'lightning_channeld(18613): TRACE: dev_disconnect: -WIRE_REVOKE_AND_ACK'
b'lightning_channeld(18613): TRACE: Trying commit'
b'lightning_channeld(18613): TRACE: htlc 0: SENT_ADD_REVOCATION->SENT_ADD_ACK_COMMIT'
b'lightning_channeld(18613): TRACE: htlc added REMOTE: local +0 remote -200000000'
b'lightning_channeld(18613): TRACE: sending_commit: HTLC REMOTE 0 = SENT_ADD_ACK_COMMIT/RCVD_ADD_ACK_COMMIT'
b'lightning_gossipd(18590): TRACE: Responder: Act 1'
b'lightning_channeld(18613): TRACE: Derived key 034aab0b5cb755de836cffb34c053ba115fba6fe75414e8f56261e23c80eabb1fe from basepoint 03e0a7bb422b254f54bc954be05bd6823a7b7a4b996ff8d3079ca211590fb5df39, point 02f3bf525b6ca595bf85d63e89c95fc59c0fde3ae434b55c8093bbb5c64849da37'
b'lightningd(18465): Connected json input'
b'lightningd(18465):jcon fd 16: Success'
b'lightningd(18465):jcon fd 16: Closing (Bad file descriptor)'
b'lightning_gossipd(18590): TRACE: Responder: Act 2'
b'lightning_gossipd(18590): TRACE: Responder: Act 3'
b'lightning_gossipd(18590): UPDATE WIRE_GOSSIP_PEER_CONNECTED'
b'lightning_gossipd(18590): UPDATE WIRE_GOSSIP_PEER_CONNECTED'
b'lightningd(18465): peer 0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518: Peer has reconnected, state CHANNELD_NORMAL'
b'lightning_channeld(18613): Status closed, but not exited. Killing'
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
And we report these through the getpeers JSON RPC again (carefully: in
our reconnect tests we can get duplicates which this patch now filters
out).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
In future it will have TOR support, so the name will be awkward.
We collect the to/fromwire functions in common/wireaddr.c, and the
parsing functions in lightningd/netaddress.c.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is a bit messier than I'd like, but we want to clearly remove all
dev code (not just have it uncalled), so we remove fields and functions
altogether rather than stub them out. This means we put #ifdefs in callers
in some places, but at least it's explicit.
We still run tests, but only a subset, and we run with NO_VALGRIND under
Travis to avoid increasing test times too much.
See-also: #176
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
test_routing_gossip (__main__.LightningDTests) ... lightningd: Outstanding taken pointers: lightningd/peer_control.c:2352:towire_errorfmt(ld, ((void *)0), "Can't resolve your address")
This caused by the other end closing due to the next bug.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
There is a race we see sometimes under valgrind on Travis which shows
gossipd receiving the node_announce from master before it reads the
channel_announce from channeld, and thus fails. The simplest solution
is to send the channel_announce and channel_update to master as well,
so it can ensure it sends them to gossipd in order
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
There are now only two kinds of subdaemons: global ones (hsmd, gossipd) and
per-peer ones. We can handle many callbacks internally now.
We can have a handler to set a new peer owner, and automatically do
the cleanup of the old one if necessary, since we now know which ones
are per-peer.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We currently rely on a zero exit status. That's the only difference between
onchain finished handling and other per-peer daemons, so instead we should
have an explicit "done" message. This is both clearer, and allows us to
unify.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Now the flow is much simpler from a lightningd POV:
1. If we want to connect to a peer, just send gossipd `gossipctl_reach_peer`.
2. Every new peer, gossipd hands up to lightningd, with global/local features
and the peer fd and a gossip fd using `gossip_peer_connected`
3. If lightningd doesn't want it, it just hands the peerfd and global/local
features back to gossipd using `gossipctl_handle_peer`
4. If a peer sends a non-gossip msg (eg `open_channel`) the gossipd sends
it up using `gossip_peer_nongossip`.
5. If lightningd wants to fund a channel, it simply calls `release_channel`.
Notes:
* There's no more "unique_id": we use the peer id.
* For the moment, we don't ask gossipd when we're told to list peers, so
connected peers without a channel don't appear in the JSON getpeers API.
* We add a `gossipctl_peer_addrhint` for the moment, so you can connect to
a specific ip/port, but using other sources is a TODO.
* We now (correctly) only give up on reaching a peer after we exchange init
messages, which changes the test_disconnect case.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We're going to make the ip/port optional, so they should go at the end.
In addition, using ip:port is nicer, for gethostbyaddr().
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We have to do a dance when we get a reconnect in openingd, because we
don't normally expect to free both owner and peer. It's a layering
violation: freeing a peer should clean up the owner's pointer to it,
to avoid a double free, and we can eliminate this dance.
The free order is now different, and the test_reconnect_openingd was
overprecise.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This fixes the only case where the master currently has to write directly
to the peer: re-sending an error. We make gossipd do it, by adding
a new gossipctl_fail_peer message.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We pull them from the database on-demand, where we're storing them
anyway. No need to keep them in memory as well.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
No idea why we were iterating over the list of stubs and then passing
in the index instead of a pointer to the stub directly.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
This wires in the loading of `struct htlc_stub`s on-demand when
starting `onchaind` so that we don't need to keep them in memory.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
peer_fail_permanent() frees peer->owner, but for bad_peer() we're
being called by the sd->badpeercb(), which then goes on to
io_close(conn) which is a child of sd.
We need to detach the two for this case, so neither tries to free the
other.
This leads to a corner case when the subd exits after the peer is gone:
subd->peer is NULL, so we have to handle that too.
Fixes: #282
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We have a race where we start onchaind, but state is unchanged, so checks
like peer_control.c's:
peer_ready = (peer->owner && peer->state == CHANNELD_AWAITING_LOCKIN);
if (!peer_ready) {
log_unusual(peer->log,
"Funding tx confirmed, but peer state %s %s",
peer_state_name(peer->state),
peer->owner ? peer->owner->name : "unowned");
} else {
subd_send_msg(peer->owner,
take(towire_channel_funding_locked(peer,
peer->scid)));
}
Can send to the wrong daemon.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We re-use the value for reasonable_depth given by the master, and we
tell it when our timeout transactions reach that depth.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
When we see an offered HTLC onchain, we need to use the preimage if we
know it. So we dump all the known HTLC preimages at startup, and send
new ones as we discover them.
This doesn't cover preimages we know because we're the final
recipient; that can happen if an HTLC hasn't been irrevocably
committed yet. We'll do that in a followup patch.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>