Even without optimization, it's faster to walk all the channels than
ping another daemon and wait for the response.
Changelog-Changed: Forwarding messages is now much faster (less inter-daemon traffic)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The idea is that gossipd can give us the cupdate we need for an error, and
we wire things up so that we ask for it (async) just before we send the
error to the subdaemon.
I tried many other things, but they were all too high-risk.
1. We need to ask gossipd every time, since it produces these lazily
(in particular, it doesn't actually generate an offline update unless
the channel is used).
2. We can't do async calls in random places, since we'll end up with
an HTLC in limbo. What if another path tries to fail it at the same time?
3. This allows us to use a temporary_node_failure error, and upgrade it
when gossipd replies. This doesn't change any existing assumptions.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Before this patch we used to send `double`s over the wire by just
copying them. This is not portable because the internal represenation
of a `double` is implementation specific.
Instead of this, multiply any floating-point numbers that come from
the outside (e.g. JSONs) by 1 million and round them to integers when
handling them.
* Introduce a new param_millionths() that expects a floating-point
number and returns it multipled by 1000000 as an integer.
* Replace param_double() and param_percent() with param_millionths()
* Previously the riskfactor would be allowed to be negative, which must
have been unintentional. This patch changes that to require a
non-negative number.
Changelog-None
This is the last venue we need to add custom featurebits to, so we also unmark
the test as xfail.
Changelog-Added: plugin: Plugins can now signal support for experimental protocol extensions by registering featurebits for `node_announcement`s, the connection handshake, and for invoices. For now this is limited to non-dynamic plugins only
Default is legacy. If we have future styles, new strings can be defined,
but for now it's "tlv" or "legacy".
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is mainly an internal-only change, especially since we don't
offer any globalfeatures.
However, LND (as of next release) will offer global features, and also
expect option_static_remotekey to be a *global* feature. So we send
our (merged) feature bitset as both global and local in init, and fold
those bitsets together when we get an init msg.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We now have a pointer to chainparams, that fails valgrind if we do anything
chain-specific before setting it.
Suggested-by: Rusty Russell <@rustyrussell>
I was seeing some accidental pruning under load / Travis, and in
particular we stopped accepting channel_updates because they were 103
seconds old. But making it too long makes the prune test untenable,
so restore a separate flag that this test can use.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This will let gossipd be more intelligent about gossiping before we're
synced, and also it might know how far behind we are.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It's generally clearer to have simple hardcoded numbers with an
#if DEVELOPER around it, than apparent variables which aren't, really.
Interestingly, our pruning test was always kinda broken: we have to pass
two cycles, since l2 will refresh the channel once to avoid pruning.
Do the more obvious thing, and cut the network in half and check that
l1 and l3 time out.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
There are some more #if DEVELOPER one-liners coming, this makes them
clear, but still lets them stand out.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
During sync it is highly likely that we can coalesce multiple calls and share
results among them. We also report back failures for non-existing blocks early
on, so we don't run into issues with blocks that our bitcoind doesn't have
yet.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
This was caused by us not checking against the max_blockheight, but rather the
min_blockheight which can be negative with a newly created node. This is still
safe since we check for duplicates anyway in `wallet_filteredblock_add`.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
If we were to just insert filtered blocks in the range that we will scan later
we'd be hitting the uniqueness constraints later.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
"result" should always be an object (so that we can add new fields),
so make that implicit in json_stream_success.
This makes our primitives well-formed: we previously used NULL as our
fieldname when calling the first json_object_start, which is a hack
since we're actually in an object and the fieldname is 'result' (which
was already written by json_object_start).
There were only two cases which didn't do this:
1. dev-memdump returned an array. No API guarantees on this.
2. shutdown returned a string.
I temporarily made shutdown return an empty object, which shouldn't
break anything, but I want to fix that later anyway.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
These are generalized from our internal implementations.
The main difference is that 'struct json_escaped' is now 'struct
json_escape', so we replace that immediately.
The difference between lightningd's json-writing ringbuffer and the
more generic ccan/json_out is that the latter has a better API and
handles escaping transparently if something slips through (though
it does offer direct accessors so you can mess things up yourself!).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This means there's now a semantic difference between the default `fromid`
and setting `fromid` explicitly to our own node_id. In the default case,
it means we don't charge ourselves fees on the route.
This means we can spend the full channel balance.
We still want to consider the pricing of local channels, however:
there's a *reason* to discount one over another, and that is to bias
things. So we add the first-hop fee to the *risk* value instead.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Keeping the uintmap ordering all the broadcastable messages is expensive:
130MB for the million-channels project. But now we delete obsolete entries
from the store, we can have the per-peer daemons simply read that sequentially
and stream the gossip itself.
This is the most primitive version, where all gossip is streamed;
successive patches will bring back proper handling of timestamp filtering
and initial_routing_sync.
We add a gossip_state field to track what's happening with our gossip
streaming: it's initialized in gossipd, and currently always set, but
once we handle timestamps the per-peer daemon may do it when the first
filter is sent.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
A new string field is added to the command structure and is specified at the creation of each native command, and in the JSON created by 'json_add_help_command()'.
This has two effects: most importantly, it avoids the problem where
lightningd creates a 800MB JSON blob in response to listchannels,
which causes OOM on the Raspberry Pi (our previous max allocation was
832MB). This is because lightning-cli can start draining the JSON
while we're filling the buffer, so we end up with a max allocation of
68MB.
But despite being less efficient (multiple queries to gossipd), it
actually speeds things up due to the parallelism:
MCP with -O3 -flto before vs after:
-listchannels_sec:8.980000-9.330000(9.206+/-0.14)
+listchannels_sec:7.500000-7.830000(7.656+/-0.11)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We now have a test blockchain for MCP which has the correct channels,
so this is not needed.
Also fix a benchmark script bug where 'mv "$DIR"/log
"$DIR"/log.old.$$' would fail if you log didn't exist from a previous run.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
New fields don't have to be spelled out twice.
The raw version are called _only, so we don't miss a call
accidentally. We can rename them when we finally deprecated old
fields.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We can save significant space by combining both sides: so much that we
can reduce the WIRE_LEN_LIMIT to something sane again.
MCP results from 5 runs, min-max(mean +/- stddev):
store_load_msec:34467-36764(35517.8+/-7.7e+02)
vsz_kb:2637488
store_rewrite_sec:35.310000-36.580000(35.816+/-0.44)
listnodes_sec:1.140000-2.780000(1.596+/-0.6)
listchannels_sec:55.390000-58.110000(56.998+/-0.99)
routing_sec:30.330000-30.920000(30.642+/-0.19)
peer_write_all_sec:50.640000-53.360000(51.822+/-0.91)
MCP notable changes from previous patch (>1 stddev):
-store_rewrite_sec:34.720000-35.130000(34.94+/-0.14)
+store_rewrite_sec:35.310000-36.580000(35.816+/-0.44)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I tried to just do gossipd, but it was uncontainable, so this ended up being
a complete sweep.
We didn't get much space saving in gossipd, even though we should save
24 bytes per node.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Pubkeys are not not actually DER encoding, but Pieter Wuille corrected
me: it's SEC 1 documented encoding.
Results from 5 runs, min-max(mean +/- stddev):
store_load_msec,vsz_kb,store_rewrite_sec,listnodes_sec,listchannels_sec,routing_sec,peer_write_all_sec
38922-39297(39180.6+/-1.3e+02),2880728,41.040000-41.160000(41.106+/-0.05),2.270000-2.530000(2.338+/-0.097),44.570000-53.980000(49.696+/-3),32.840000-33.080000(32.95+/-0.095),43.060000-44.950000(43.696+/-0.72)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This lets us benchmark without a valid blockchain.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Header from folded patch 'fixup!_gossipd__dev_option_to_allow_unknown_channels.patch':
fixup! gossipd: dev option to allow unknown channels.
Suggested-by: @cdecker
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
As a side-effect of using amount_msat in gossipd/routing.c, we explicitly
handle overflows and don't need to pre-prune ridiculous-fee channels.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We need to still accept it when parsing the database, but this flag
should allow upgrade testing for devs building on top
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We used a u16, and a 1000 multiplier, which meant we wrapped at
riskfactor 66. We also never undid the multiplier, so we ended up
applying 1000x the riskfactor they specified.
This changes us to pass the riskfactor with a 1M multiplier. The next
patch changes the definition of riskfactor to be more useful.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
As a general rule, lightningd shouldn't parse user packets. We move the
parsing into gossipd, and have it respond only to permanent failures.
Note that we should *not* unconditionally remove a channel on
WIRE_INVALID_ONION_HMAC, as this can be triggered (and we do!) by
feeding sendpay a route with an incorrect pubkey.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Christian points out that we can iterate by ->size rather than calling
json_next() to find the end (which traverses the entire object!).
Now ->size is reliable (since previous patch), this is OK.
Reported-by: @cdecker
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>