lightningd can crash on shutdown if it's in the middle of getchaintips;
we free the conn, the finished callback is called (process_chaintips),
and it reports that it received an empty result.
The simplest fix is to set a flag in the struct bitcoind destructor,
and avoid the callback.
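
A rough illustration of the flag idea, in Python rather than the C used by lightningd. The class and method names (ChainPoller, finish, close) are hypothetical and only model the pattern; they are not the struct bitcoind code.

```python
class ChainPoller:
    """Hypothetical model of an object that owns in-flight requests."""

    def __init__(self, on_result):
        self.on_result = on_result
        self.shutting_down = False
        self.pending = []

    def request(self, name):
        self.pending.append(name)

    def finish(self, name, output):
        # Runs whenever a request's connection goes away, with whatever
        # output was collected so far (possibly nothing).
        self.pending.remove(name)
        if self.shutting_down:
            return  # teardown in progress: do not report an empty result
        self.on_result(name, output)

    def close(self):
        # Plays the role of the destructor: set the flag first, then
        # release the in-flight requests.
        self.shutting_down = True
        for name in list(self.pending):
            self.finish(name, "")


results = []
poller = ChainPoller(lambda name, output: results.append((name, output)))
poller.request("getchaintips")
poller.close()  # results stays empty: the callback was skipped
```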
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Either when it exits with a signal, or sends an error status message.
Then we make test_lightningd.py use it.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We simply kill lightningd; we should stop it properly and have a timeout
to kill it if that fails. However, that's beyond my python skills :(
So we just look for crash.log. Unfortunately, we usually kill
lightningd before it's finished writing it. So we look for it and
don't kill lightningd, just wait in this case.
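
For reference, the "stop it properly, with a timeout" approach could look roughly like the sketch below. This is not what this commit does; stop_daemon is a hypothetical helper, and a sleeping child process stands in for lightningd.

```python
import subprocess
import sys


def stop_daemon(proc, timeout=10.0):
    """Ask proc to exit, and kill it only if it is still alive after timeout."""
    proc.terminate()  # polite request (SIGTERM on POSIX)
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.kill()  # it ignored the request: force it
        return proc.wait()


# Stand-in for the daemon: a child process that just sleeps.
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
returncode = stop_daemon(child, timeout=5.0)
```

Waiting before the kill is what would give the daemon time to finish writing crash.log.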
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This change is really to allow us to have a --dev-fail-on-subdaemon-fail option
so we can handle failures from subdaemons generically.
It also neatens handling so we can have an explicit callback for "peer
did something wrong" (which matters if we want to close the channel in
that case).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
1. Remove reference to old $(LIGHTNINGD_OLD_LIB_OBJS) var (in handshaked too).
2. Make check depend directly on unit tests, instead of weird lightningd/tests
variable.
3. check-source-bolt and check-whitespace are automatic for $(ALL_TEST_PROGRAMS)
so we don't need them here.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is the step where we broadcast the transaction to the network and
a nice place to extract the change from the transaction.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
We hit:
assert(!peer->handle_master_reply);
#4 0x000055bba3b030a0 in master_sync_reply (peer=0x55bba41c0030,
msg=0x55bba41c6a80 "", replytype=WIRE_CHANNEL_GOT_COMMITSIG_REPLY,
handle=0x55bba3b041cf <handle_reply_wake_peer>) at channeld/channel.c:518
#5 0x000055bba3b049bc in handle_peer_commit_sig (conn=0x55bba41c10d0,
peer=0x55bba41c0030, msg=0x55bba41c6a80 "") at channeld/channel.c:959
#6 0x000055bba3b05c69 in peer_in (conn=0x55bba41c10d0, peer=0x55bba41c0030,
msg=0x55bba41c67c0 "") at channeld/channel.c:1339
#7 0x000055bba3b123eb in peer_decrypt_body (conn=0x55bba41c10d0,
pcs=0x55bba41c0030) at common/cryptomsg.c:155
#8 0x000055bba3b2c63b in next_plan (conn=0x55bba41c10d0, plan=0x55bba41c1100)
at ccan/ccan/io/io.c:59
We got a commit_sig from the peer while waiting for the master to
reply to acknowledge the commitsig we want to send
(handle_sending_commitsig_reply).
The fix is to always talk to the master synchronously, and not try to
process anything but messages from the master daemon while we wait.
This avoids the whole class of problems.
There's a fairly simple way to do this, as ccan/io lets you override
its poll call: we process any outstanding master requests there, or
add the master fd to the pollfds array.
Fixes: #266
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It's no longer used and we definitely do not want to run with an
outdated or future db, so we'll terminate if we can't upgrade or
the version is newer than what we understand.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
For the permfail tests the sendpay call is supposed to fail, so this
was printing stacktraces upon success. Running in futures captures any
thrown exceptions and rethrows them when calling `result()`; in our
case we just ignore them.
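
A minimal sketch of that behaviour using concurrent.futures from the standard library. failing_sendpay is a hypothetical stand-in for the real RPC call, not the test code itself.

```python
from concurrent.futures import ThreadPoolExecutor


def failing_sendpay():
    # Hypothetical stand-in for an RPC call that is expected to fail.
    raise ValueError("sendpay failed")


with ThreadPoolExecutor(max_workers=1) as executor:
    future = executor.submit(failing_sendpay)

# The exception was captured by the future; nothing was raised or
# printed in the submitting thread.
stored = future.exception()

# It only resurfaces if somebody asks for the result.
try:
    future.result()
    rethrown = False
except ValueError:
    rethrown = True
```

A test that expects the failure simply never calls `result()`.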
Signed-off-by: Christian Decker <decker.christian@gmail.com>
So far we were always using the deadline in the announcements. That's
obviously not good, so this introduces the parameter as per spec.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
We weren't killing it. Eventually it would die, and peer_owner_finished()
would access subd->peer->owner, but that peer was freed already.
Closes: #261
Reported-by: Christian Decker <decker.christian@gmail.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
To reproduce the next bug, I had to ensure that one node keeps thinking it's
disconnected, then the other node reconnects, then the first node realizes
it's disconnected.
This code does that, adding a '0' dev-disconnect modifier. That means
we fork off a process which (due to pipebuf) will accept a little
data, but when the dev_disconnect file is truncated (a hacky, but
effective, signalling mechanism) will exit, as if the socket finally
realized it's not connected any more.
The python tests hang waiting for the daemon to terminate if you leave
the blackhole around; to give a clue as to what's happening in this
case I moved the log dump to before killing the daemon.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
In this case, we unset the old subd->peer, then freed subd.
peer_owner_finished dereferenced subd->peer->owner, and boom:
test_disconnect_funder (__main__.LightningDTests) ... Fatal signal 11. Log dumped in crash.log
------------------------------- Valgrind errors --------------------------------
Valgrind error file: valgrind-errors.2882
==2882== Invalid read of size 8
==2882== at 0x413F74: peer_owner_finished (peer_control.c:679)
==2882== by 0x41EA2C: destroy_subd (subd.c:381)
==2882== by 0x459700: notify (tal.c:240)
==2882== by 0x459BB1: del_tree (tal.c:400)
==2882== by 0x459FC0: tal_free (tal.c:509)
==2882== by 0x413796: peer_reconnected (peer_control.c:493)
==2882== by 0x413A6A: add_peer (peer_control.c:592)
==2882== by 0x40ED1F: handshake_succeeded (new_connection.c:186)
==2882== by 0x41E3DD: sd_msg_reply (subd.c:262)
==2882== by 0x41E6BB: sd_msg_read (subd.c:318)
==2882== by 0x41E4E6: read_fds (subd.c:283)
==2882== by 0x44DEB4: next_plan (io.c:59)
==2882== Address 0x838 is not stack'd, malloc'd or (recently) free'd
==2882==
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Addresses #207 by adding a method to retrieve available funds from the
wallet.
Reported-by: @jl777
Signed-off-by: Christian Decker <decker.christian@gmail.com>
This was causing some compilation trouble on 32bit systems, see #256.
Reported-by: @shsmith
Signed-off-by: Christian Decker <decker.christian@gmail.com>
The logic of dispatching the announcement_signatures message was
distributed over several places and daemons. This aims to simplify it
by moving it all into `channeld`, making peer_control only report
announcement depth to `channeld`, which then takes care of the
rest. We also do not reuse the funding_locked tx watcher since it is
easier to just fire off a new watcher with the specific purpose of
waiting for the announcement_depth.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
1. The code to skip over padding didn't take into account max.
2. It also didn't use symbolic names.
3. We are not supposed to fail on unknown addresses, just stop parsing.
4. We don't use the read_ip/write_ip code, so get rid of it.
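
The parsing rules in points 1 to 3 can be sketched as follows, in Python for brevity. The type values and payload sizes (0 for padding, 1 for IPv4 plus port, 2 for IPv6 plus port) are assumptions made for the example, and parse_addresses is a hypothetical name, not the function touched here.

```python
import struct

# Symbolic names instead of bare numbers (values assumed for this sketch).
ADDR_PADDING = 0
ADDR_IPV4 = 1
ADDR_IPV6 = 2

# Payload size per known type: address bytes plus a 2-byte port.
PAYLOAD_LEN = {ADDR_IPV4: 4 + 2, ADDR_IPV6: 16 + 2}


def parse_addresses(buf):
    """Return (type, host_bytes, port) tuples parsed from buf."""
    addrs = []
    pos = 0
    end = len(buf)
    # Every read, including skipping padding, is bounded by end.
    while pos < end:
        kind = buf[pos]
        pos += 1
        if kind == ADDR_PADDING:
            continue
        size = PAYLOAD_LEN.get(kind)
        if size is None:
            break  # unknown type: stop parsing, keep what we have
        if pos + size > end:
            break  # truncated entry: treated like "stop" in this sketch
        payload = buf[pos:pos + size]
        pos += size
        (port,) = struct.unpack(">H", payload[-2:])
        addrs.append((kind, payload[:-2], port))
    return addrs


# Two padding bytes, one IPv4 entry, then an unknown type 99.
example = bytes([0, 0, 1, 127, 0, 0, 1, 0x26, 0x07, 99, 1, 2, 3])
parsed = parse_addresses(example)
```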
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I missed these when I removed the legacy daemon. We also remove the
min_blocks field which was always 0.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Use a negative timestamp as the flag for this, making the test simple.
This allows valgrind to detect that we're accessing them prematurely,
including across the wire on gossip_getchannels_entry.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The test could actually go either way, since for 1000000 the fee is
the same either way.
Increase to 300000, and add an extra test when the alternate path
is disabled.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I had a routing problem, and wrote a simple unit test which passed. So
I wrote one which copied the failure case (and importantly, had a non-1
fee factor), which triggered it.
In that real example, we underflowed which resulted in us not finding
a route. Simply don't consider routes which are infinite.
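
As an illustration only, here is a small Bellman-Ford style sketch that skips nodes whose cost is still infinite. The INFINITE sentinel, the channel tuple layout and cheapest_total are assumptions made for the example; the real routing code is C and its arithmetic differs.

```python
# Hypothetical sentinel, mimicking a fixed-width "no route yet" marker.
INFINITE = 2**64 - 1


def cheapest_total(nodes, channels, src, dst, amount):
    """Work backwards from dst; channels are (from, to, base_fee, ppm)."""
    # total[n] is what node n must send so that dst receives amount.
    total = {n: INFINITE for n in nodes}
    total[dst] = amount
    for _ in range(len(nodes) - 1):
        for frm, to, base_fee, ppm in channels:
            if total[to] == INFINITE:
                # No route via this node yet. Doing fee arithmetic on the
                # sentinel is what can wrap around in fixed-width integers.
                continue
            needed = total[to] + base_fee + total[to] * ppm // 1_000_000
            if needed < total[frm]:
                total[frm] = needed
    return None if total[src] == INFINITE else total[src]


nodes = ["a", "b", "c", "d"]
channels = [("a", "b", 1, 1000), ("b", "c", 2, 0)]
via_b = cheapest_total(nodes, channels, "a", "c", 100000)
no_route = cheapest_total(nodes, channels, "d", "c", 100000)
```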
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
update-mocks was broken, since it assumed the daemon/ directory.
We now use "make" directly to build the test file and harvest errors,
and are more robust if it simply doesn't compile (i.e. fails, but no
linker errors).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>