The idea is that gossipd can give us the cupdate we need for an error, and
we wire things up so that we ask for it (async) just before we send the
error to the subdaemon.
I tried many other things, but they were all too high-risk.
1. We need to ask gossipd every time, since it produces these lazily
(in particular, it doesn't actually generate an offline update unless
the channel is used).
2. We can't do async calls in random places, since we'll end up with
an HTLC in limbo. What if another path tries to fail it at the same time?
3. This allows us to use a temporary_node_failure error, and upgrade it
when gossipd replies. This doesn't change any existing assumptions.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is a common thing to do, so create a macro.
Unfortunately, it still needs the type arg, because the paramter may
be const, and the return cannot be, and C doesn't have a general
"(-const)" cast.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
common should not include specific per-daemon files. Turns out this
caused a lot of indirect includes to be exposed.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It was a pointer into the list of plugins for the hook, but it was rather
unstable: if a plugin exits after handling the event we could end up skipping
a later plugin. We now rely on the much more stable `call_chain` list, so we
can clean up that useless field.
We are attaching the destructor to notify us when the plugin exits, but we
also need to clear them once the request is handled correctly, so we don't
call the destructor when it exits later.
We make the current state of `lightningd` explicit so we don't have to
identify a shutdown by its side-effects. We then use this in order to prevent
the killing and freeing of plugins to continue down the chain of registered
plugins.
We were waiting for both stdin and stdout to close, however that resulted in
us deferring cleanup indefinitely since we did not poll stdout for being
writable most of the time. On the other hand we are almost always polling
the plugin's stdout, so that notifies us as soon as the plugin stops.
Changelog-Fixed: plugin: Plugins no longer linger indefinitely if their process terminates
Before this patch we used to send `double`s over the wire by just
copying them. This is not portable because the internal represenation
of a `double` is implementation specific.
Instead of this, multiply any floating-point numbers that come from
the outside (e.g. JSONs) by 1 million and round them to integers when
handling them.
* Introduce a new param_millionths() that expects a floating-point
number and returns it multipled by 1000000 as an integer.
* Replace param_double() and param_percent() with param_millionths()
* Previously the riskfactor would be allowed to be negative, which must
have been unintentional. This patch changes that to require a
non-negative number.
Changelog-None
This completes the conversion; any in-flight HTLC failures get turned into temporary_node_failures.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This cleans up the "local failure" callers for incoming HTLCs to hand
an onionreply instead of making us generate it from the code inside
make_failmsg.
(The db path still needs make_failmsg, so that's next).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-deprecated: Plugins: htlc_accepted_hook "failure_code" only handles simple cases now, use "failure_message".
Unfortunately the invoice_payment_hook can give us a failcode, so I simply
restrict it to the two sensible ones.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-deprecated: plugins: invoice_payment_hook "failure_code" only handles simple cases now, use "failure_message".
We tell channeld that an htlc is bad by sending it a 'struct
failed_htlc'. This usually contains an onionreply to forward, but for
the case where the onion itself was bad, it contains a failure code
instead.
This makes the "send a failed_htlc for a bad onion" a completely
separate code path, then we can work on removing failcodes from the
other path.
In several places 'failcode' is now changed to 'badonion' to reflect
that it can only be a BADONION failcode.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
At the moment, we store e.g. WIRE_TEMPORARY_CHANNEL_FAILURE, and then
lightningd has a large demux function which turns that into the correct
error message.
Such an enum demuxer is an anti-pattern.
Instead, store the message directly for output HTLCs; channeld now
sends us an error message rather than an error code.
For input HTLCs we will still need the failure code if the onion was
bad (since we need to prompt channeld to send a completely different
message than normal), though we can (and will!) eliminate its use in
non-BADONION failure cases.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We're going to change our internal structure next, so this is preparation.
We populate existing errors with temporary node failures, for simplicity.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Instead of making it ourselves, lightningd does it. Now we only have
two cases of failed htlcs: completely malformed (BADONION), and with
an already-wrapped onion reply to send.
This makes channeld's job much simpler.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I hadn't realized that lightningd asks gossipd every time we forward
a payment. But I'm going to abuse it here to get the latest channel_update,
otherwise (as lightningd takes over error message generation) lightningd
needs to do an async request at various painful points.
So have gossipd tell us the lastest update (stripped so compatible with
the strange in-onion-error format).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Turn it into temporary node failure: this only happens if we restart
with a failed htlc in, but it's clearer and more robust to handle it
generically.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
For incoming htlcs, we need failure details in case we need to
re-xmit them. But for outgoing htlcs, lightningd is telling us it
already knows they've failed, so we just need to flag them failed
and don't need the details.
Internally, we set the ->fail to a dummy non-NULL value; this is
cleaned up next.
This matters for the next patch, which moves onion handling into
lightningd.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
1. forward_htlc sets hout to NULL.
2. forward_htlc passes &hout to send_htlc_out.
3. forward_htlc checks the failcode and frees(NULL) and sets hout to NULL
(again). This in fact covers every failcode which send_htlc_out returns.
We should ensure send_htlc_out sets *houtp to NULL on failure; in fact,
both callers pass houtp, so we can make it unconditional.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We don't compile with NDEBUG defined, but if we did, this code would
vanish. I did a quick audit, inspired by @ZmnSCPxj.
I actually hacked up something to compile with NDEBUG (many unused vars
resulted, and of course unit tests are allowed to rely on assert()), and
after this the testsuite still passes.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I reproduced this by putting a sleep(60) in the pay plugin, then
'lightning-cli pay', 'lightning-cli plugin stop pay' and then ^C
the `lightning-cli pay`:
2020-02-14T00:33:11.217Z INFO plugin-pay: Killing plugin: pay stopped by lightningd via RPC
2020-02-14T00:33:15.250Z DEBUG lightningd: Still waiting for initial block download
==5157== Invalid read of size 8
==5157== at 0x12A29C: destroy_jcon (jsonrpc.c:149)
==5157== by 0x1C6F2A: notify (tal.c:235)
==5157== by 0x1C7441: del_tree (tal.c:397)
==5157== by 0x1C7493: del_tree (tal.c:407)
==5157== by 0x1C77DD: tal_free (tal.c:481)
==5157== by 0x1B7380: io_close (io.c:450)
==5157== by 0x1B71A7: do_plan (io.c:401)
==5157== by 0x1B7214: io_ready (io.c:417)
==5157== by 0x1B94AC: io_loop (poll.c:445)
==5157== by 0x1291C9: io_loop_with_timers (io_loop_with_timers.c:24)
==5157== by 0x12EC7E: main (lightningd.c:928)
==5157== Address 0x4ebab98 is 40 bytes inside a block of size 88 free'd
==5157== at 0x483BA3F: free (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==5157== by 0x1C750F: del_tree (tal.c:416)
==5157== by 0x1C7493: del_tree (tal.c:407)
==5157== by 0x1C77DD: tal_free (tal.c:481)
==5157== by 0x153856: clear_plugin (plugin_control.c:209)
==5157== by 0x1538FF: plugin_dynamic_stop (plugin_control.c:225)
==5157== by 0x153C51: json_plugin_control (plugin_control.c:295)
==5157== by 0x12B4EC: command_exec (jsonrpc.c:588)
==5157== by 0x12B8AB: rpc_command_hook_callback (jsonrpc.c:679)
==5157== by 0x154575: plugin_hook_call_ (plugin_hook.c:170)
==5157== by 0x12BCD3: plugin_hook_call_rpc_command (jsonrpc.c:756)
==5157== by 0x12BD04: call_rpc_command_hook (jsonrpc.c:764)
==5157== Block was alloc'd at
==5157== at 0x483A7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==5157== by 0x1C6F98: allocate (tal.c:245)
==5157== by 0x1C7559: tal_alloc_ (tal.c:423)
==5157== by 0x15135A: plugin_rpcmethod_add (plugin.c:706)
==5157== by 0x151600: plugin_rpcmethods_add (plugin.c:756)
==5157== by 0x151BDD: plugin_parse_getmanifest_response (plugin.c:893)
==5157== by 0x151C9C: plugin_manifest_cb (plugin.c:915)
==5157== by 0x14FFB9: plugin_response_handle (plugin.c:258)
==5157== by 0x150165: plugin_read_json_one (plugin.c:356)
==5157== by 0x1502BC: plugin_read_json (plugin.c:388)
==5157== by 0x1B65ED: next_plan (io.c:59)
==5157== by 0x1B71D2: do_plan (io.c:407)
Fixes: #3509
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
If the peer is not connected, or other error which means we don't
actually create an outgoing HTLC, we don't record the
short_channel_id. This is unhelpful!
Pass the scid down to the wallet code, and explicitly hand the
scid and amount down to the notification code rather than handing it
the htlc_out (which it doesn't need).
Changelog-Changed: JSON API: `listforwards` now shows `out_channel` even if we couldn't forward.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This avoids the getblockhash+getblock, and more importantly that was the
last functionality making use of bitcoind_getrawblock() and bitcoin_getblockhash(),
so we can also get rid of them.
This adds `getchaininfo` and `getrawblockbyheight` handling lightningd-side,
and use them in setup_topology().
We then remove legacy bitcoind_getblockcount() (we already get the count in
`getchaininfo`), bitcoind_getblockchaininfo() (it was only used in setup_topology()),
and wait_for_bitcoind() (this was specific to bitcoin-core and we assume our Bitcoin
backend to be functional if the plugin responds to `init`).
Add new check if we're funder trying to add HTLC, keeping us
with enough extra funds to pay for another HTLC the peer might add.
We also need to adjust the spendable_msat calculation, and update
various tests which try to unbalance channels. We eliminate
the now-redundant test_channel_drainage entirely.
Changelog-Fixed: Corner case where channel could become unusable (https://github.com/lightningnetwork/lightning-rfc/issues/728)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Make the `htlc_accepted` hook the first chained hook in our repertoire. The
plugins are called one after the other in order until we have no more plugins
or the HTLC was handled by one of the plugins. If no plugins handles the HTLC
we continue to handle it internally like always.
Handling in this case means the plugin returns either `{"result": "resolve",
...}` or `{"result": "fail", ...}`.
Changelog-Changed: plugin: Multiple plugins can now register for the htlc_accepted hook.
This used to be necessary because we allocated the `plugin_hook_request` off
of the plugin instance (only tal allocated object we could grab at that
time. Now the plugin was replaced by a list, which itself is tal-allocated,
making that workaround pointless, or even wrong once we have multiple plugins
registering for that hook.
We will be using `plugin_hook_call_next` as part of the loop to traverse all
plugins that registered the hook, so group initialization in the init function
and move per-plugin logic into `plugin_hook_call_next`
We are about to call multiple plugins, and we'll have to pass the payload into
each call. Sadly the serialized stream gets consumed during the call, so keep
the unserialized payload around.
Switch from having a single plugin to a list of plugins. If the hook is of
type single we will enforce that constraint on the number of registered
plugins when attempting to add.
The newly introduced type is used to determine what the call semantics of the
hook are. We have `single` corresponding to the old behavior, as well as
`chain` which allows multiple plugins to register for the hook, and they are
then called sequentially (if all plugins return `{"result": "continue"}`) or
exit the chain if the hook event was handled.