We fail this at the moment, since we rely on shutdown to do the cleanups
for us.
(Also had to fix the unclean shutdown path: the caller checks the rc unless
mayfail is set, and of course it's not zero since we just SIGTERM'd it).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It would always return bech32; fix that, and don't bother printing
it if they use the (new) 'all' parameter.
This API was introduced in 3e67c09d5e,
which means it wasn't in a release so no CHANGELOG entry necessary.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We didn't count some records before, so we could compare the two counters.
This is much simpler, and avoids reliance on bs.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This means we intercept the peer's gossip_timestamp_filter request
in the per-peer subdaemon itself. The rest of the semantics are fairly
simple however.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(We don't increment the gossip_store version, since there are only a
few commits since the last time we did this).
This lets the reader simply filter messages; this is especially nice since
the channel_announcement timestamp is *derived*, not in the actual message.
This also creates a 'struct gossip_hdr' which makes the code a bit
clearer.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Keeping the uintmap ordering all the broadcastable messages is expensive:
130MB for the million-channels project. But now we delete obsolete entries
from the store, we can have the per-peer daemons simply read that sequentially
and stream the gossip itself.
This is the most primitive version, where all gossip is streamed;
successive patches will bring back proper handling of timestamp filtering
and initial_routing_sync.
We add a gossip_state field to track what's happening with our gossip
streaming: it's initialized in gossipd, and currently always set, but
once we handle timestamps the per-peer daemon may do it when the first
filter is sent.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We use the high bit of the length field: this way we can still check
that the checksums are valid on deleted fields.
Once this is done, serially reading the gossip_store file will result
in a complete, ordered, minimal gossip broadcast. Also, the horrible
corner case where we might try to delete things from the store during
load time is completely gone: we only load non-deleted things.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We're about to bump version again, and the code to upgrade it was
quite hairy (and buggy!). It's not worthwhile for such a
poorly-tested path: I will just add code to limit how much incoming
gossip we get to avoid flooding when we upgrade, however.
I also use a modern gossip_store version in our test_gossip_store_load
test, instead of relying on the upgrade path.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
When we first receive a channel_update, we write both the
channel_announcement and that channel_update to the store: we need
that first update so we can set the channel_announcement timestamp.
However, the channel_update can be replaced later. This means we can
have a channel_announcement, a node_update which relies on it, then
the channel_update later.
So move the "this applies to a pending announcement" check lower, where
gossip_store can use it too. Has a nice side-effect of avoiding
one lookup of the node id.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
First, we should have a channel_update so we actually do some compaction!
(Reported-by @SimonVrouwe). But we should also handle the cases where:
1. A channel_announcement is *not* directly followed by a
channel_update (happens when the channel_update is replaced).
2. A node_announcement predates a channel_update for the peer
(again, can happen once a channel_update is replaced).
3. A local/private channel_creation is not directly followed by an
update.
In addition, we might as well check that we can *load* such a store,
before compaction.
This checks the corner cases which occur in real gossip stores.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It's possible that it hasn't got the node_announcement messages;
it will still list the nodes, however (the channel_announcement tells
it the nodes exist). Check for the alias field instead.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Basically, any "Bad" message from gossipd is something we should look
at. This covers failures loading the gossip_store, too!
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Reorg changes short_channel_id after lockin of private channel, while
one node restarts.
test that:
- peer->depth_togo in billboard decrements
- reorg and scid change is detected by running node and restarting node
- both `old` and `new` scids are in rtable
Also added a comment to test_blockchaintrack to clarify.
Now without bitcoind restart.
bitcoin-cli `prioritisetransaction` came to the rescue!
Its argument `fee_delta` (apparently) lowers the txs _effective_ feerate
soo low that bitcoind wont mine it ... untill we raise it when we want
it to be mined.
Because the call (wallet_extract_owned outputs) that prints that line can happen
_before_ or _after_ confirmation in block, adding `CONFIRMED` in the later.
This brings up an interesting quirk though, in that we report "3
attempts", where we really should have done one.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
My raspberry pi node hung up on my other node:
lightning_openingd-... chan #1: Got bad message from gossipd: 0db1
This is because we didn't handle that message in one path.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We try to look up the funding tx, but it's already spent that to fund
the channel, so we need txindex if this test is to work reliably.
It's not clear to me why this *ever* worked, but if fails on my new
ThreadRipper build machine with valgrind:
> wallettx = l1.bitcoin.rpc.getrawtransaction(wallettxid, True)
...
E bitcoin.rpc.InvalidAddressOrKeyError: {'code': -5, 'message': 'No such mempool transaction. Use -txindex to enable blockchain transaction queries. Use gettransaction for wallet transactions.'}
/usr/lib/python3/dist-packages/bitcoin/rpc.py:231: InvalidAddressOrKeyError
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Save some overhead, plus gets us ready for giving subdaemons direct
store access. This is the first time we *upgrade* the gossip_store,
rather than just discarding.
The downside is that we need to add an extra message after each
channel_announcement, containing the channel capacity.
After:
store_load_msec:28337-30288(28975+/-7.4e+02)
vsz_kb:582304-582316(582306+/-4.8)
store_rewrite_sec:11.240000-11.800000(11.55+/-0.21)
listnodes_sec:1.800000-1.880000(1.84+/-0.028)
listchannels_sec:22.690000-26.260000(23.878+/-1.3)
routing_sec:2.280000-9.570000(6.842+/-2.8)
peer_write_all_sec:48.160000-51.480000(49.608+/-1.1)
Differences:
-vsz_kb:582320
+vsz_kb:582316
-listnodes_sec:2.100000-2.170000(2.118+/-0.026)
+listnodes_sec:1.800000-1.880000(1.84+/-0.028)
-peer_write_all_sec:51.600000-52.550000(52.188+/-0.34)
+peer_write_all_sec:48.160000-51.480000(49.608+/-1.1)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
For performance reasons we start the lightningd instances in
parallel. However, if we only assign the numeric ids (used for log-prefixes
and home directories) when we are already running in parallel, we are not
guaranteed to get the numeric ids matching the return value of `get_nodes` or
`line_graph`. With this patch we now select numeric ids before parallelizing
the start.
Signed-off-by: Christian Decker <@cdecker>
Here I add the test for this 5 local_failed case in this commit.
There 5 cases for FORWARD_LOCAL_FAILED status:
1. When Msater resolves the reply about the next peer infor(sent by Gossipd), and need handle unknown next peer failure in channel_resolve_reply();
2. When Master handle the forward process with the htlc_in and the id of next hop, it tries to drive a new htlc_out but fails in forward_htlc();
3. When we send htlc_out, Master asks Channeld to add a new htlc into the outgoing channel but Channeld fails. Master need handle and store this failure in rcvd_htlc_reply();
4. When Channeld receives a new revoke message, if the state of corresponding htlc is RCVD_ADD_ACK_REVOCATION, Master will tries to resolve onionpacket and handle the failure before resolving the next hop in peer_got_revoke();
5. When Onchaind finds the htlc time out or missing htlc, Master need handle these failure as FORWARD_LOCAL_FAILED in if it's forward payment case.
We were triggering a second exception in the directory cleanup step by
attempting to access a field that'd only be set upon entering the test code
itself. That error did not contribute to the problem resolution, so now we
check whether that field is set before accessing it.
Signed-off-by: Christian Decker <decker.christian@gmail.com>