We erase peer data after the last channel close transaction for that
peer is 100 blocks deep. We were failing to finish the migration because
the peer_id lookup on these was failing.
Now we ignore any channel with a null peer_id.
Fixes#3768
We use a database snapshot with 3 channels -- two of which have HTLCs
dangling and one is an initial open channel tx in the 'old' tx hex
format in last_tx and confirm that they are successfully updated to PSBT
format on start.
Valgrind doesn't really like crashes if compiled without DEVELOPER since that
seems to compile out the debug symbols, resulting in the following error:
```
Optimistic lock on the database failed. There may be a concurrent access to the database. Aborting since concurrent access is unsafe.
lightningd: FATAL SIGNAL 6 (version 0.0.99)
lightningd/lightningd: libbacktrace: no debug info in ELF executable
lightningd/lightningd: libbacktrace: no debug info in ELF executable
lightningd/lightningd: libbacktrace: no debug info in ELF executable
lightningd/lightningd: libbacktrace: no debug info in ELF executable
lightningd/lightningd: libbacktrace: no debug info in ELF executable
lightningd/lightningd: libbacktrace: no debug info in ELF executable
lightningd/lightningd: libbacktrace: no debug info in ELF executable
lightningd/lightningd: libbacktrace: no debug info in ELF executable
lightningd/lightningd: libbacktrace: no debug info in ELF executable
lightningd/lightningd: libbacktrace: no debug info in ELF executable
lightningd/lightningd: libbacktrace: no debug info in ELF executable
lightningd/lightningd: libbacktrace: no debug info in ELF executable
lightningd/lightningd: libbacktrace: no debug info in ELF executable
lightningd: FATAL SIGNAL 11 (version 0.0.99)
lightningd/lightningd: libbacktrace: no debug info in ELF executable
lightningd/lightningd: libbacktrace: no debug info in ELF executable
lightningd/lightningd: libbacktrace: no debug info in ELF executable
lightningd/lightningd: libbacktrace: no debug info in ELF executable
2020-01-07 15:26:03.539 EST [11583] LOG: unexpected EOF on client connection with an open transaction
--------------------------- Captured stdout teardown ---------------------------
DEBUG:root:Calling stop with payload None
------------------------------- Valgrind errors --------------------------------
Valgrind error file: valgrind-errors.11409
==11409== Jump to the invalid address stated on the next line
==11409== at 0x0: ???
==11409== by 0x1C00A8: backtrace_full (backtrace.c:127)
==11409== by 0x147B0A: send_backtrace (daemon.c:46)
==11409== by 0x147B55: crashdump (daemon.c:54)
==11409== by 0x6071F1F: ??? (in /lib/x86_64-linux-gnu/libc-2.27.so)
==11409== by 0x6071E96: __libc_signal_restore_set (nptl-signals.h:80)
==11409== by 0x6071E96: raise (raise.c:48)
==11409== by 0x6073800: abort (abort.c:79)
==11409== by 0x12B2FF: fatal (log.c:819)
==11409== by 0x16FA3B: db_data_version_incr (db.c:826)
==11409== by 0x16FA9E: db_commit_transaction (db.c:841)
==11409== by 0x124D20: io_loop_with_timers (io_loop_with_timers.c:34)
==11409== by 0x129260: main (lightningd.c:860)
==11409== Address 0x0 is not stack'd, malloc'd or (recently) free'd
==11409==
==11409==
==11409== Process terminating with default action of signal 11 (SIGSEGV)
==11409== Bad permissions for mapped region at address 0x0
==11409== at 0x0: ???
==11409== by 0x1C00A8: backtrace_full (backtrace.c:127)
==11409== by 0x147B0A: send_backtrace (daemon.c:46)
==11409== by 0x147B55: crashdump (daemon.c:54)
==11409== by 0x6071F1F: ??? (in /lib/x86_64-linux-gnu/libc-2.27.so)
--------------------------------------------------------------------------------
```
The optimistic lock prevents multiple instances of c-lightning making
concurrent modifications to the database. That would be unsafe as it messes up
the state in the DB. The optimistic lock is implemented by checking whether a
gated update on the previous value of the `data_version` actually results in
an update. If that's not the case the DB has been changed under our feet.
The lock provides linearizability of DB modifications: if a database is
changed under the feet of a running process that process will `abort()`, which
from a global point of view is as if it had crashed right after the last
successful commit. Any process that also changed the DB must've started
between the last successful commit and the unsuccessful one since otherwise
its counters would not have matched (which would also have aborted that
transaction). So this reduces all the possible timelines to an equivalent
where the first process died, and the second process recovered from the DB.
This is not that interesting for `sqlite3` where we are also protected via the
PID file, but when running on multiple hosts against the same DB, e.g., with
`postgres`, this protection becomes important.
Changelog-Added: DB: Optimistic logging prevents instances from running concurrently against the same database, providing linear consistency to changes.
This leads to all sorts of problems; in particular it's incredibly
slow (days, weeks!) if bitcoind is a long way back. This also changes
the behaviour of a rescan argument referring to a future block: we will
also refuse to start in that case, which I think is the correct behavior.
We already ignore bitcoind if it goes backwards while we're running.
Also cover a false positive memleak.
Changelog-Fixed: If bitcoind goes backwards (e.g. reindex) refuse to start (unless forced with --rescan).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Sometimes the l3 seeker asks for scids, and the reply contains the
channel which is then closed by the time it checks, so it considers
the updates bad gossip.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This was the initial issue that was addressed by #2756 and now we just test
that all is working as expected.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
This is just the test that we use to verify block backfilling below the wallet
birth height is working correctly.
Signed-off-by: Christian Decker <decker.christian@gmail.com>