@djkazic
Last active April 22, 2025 16:31

postgres_kvdb tuning guide for LND

Introduction

Today, LND supports several database backends, the default being bbolt. Note that all of the tuning parameters in this guide target the KV schema (postgres_kvdb), not the newer native SQL schema.

Postgres is a more battle-tested database and comes with several features that benefit performance and data durability:

  • Async / sync replication
  • Vacuum for dead tuples cleanup
  • Optimizations around index use (with the SQL schema)
  • Write transaction parallelism (bbolt is single writer)

This doc will go through some common configuration recommendations as well as some queries for monitoring database activity.

Configuration recommendations

It's best to use zstd for WAL compression if you can spare the CPU for it. On low-power hardware (think a Raspberry Pi), compression can create CPU contention and actually slow things down; most modern hardware won't get bogged down by it.

wal_compression = zstd
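
To gauge what the compression buys you, compare WAL generation before and after the change. A minimal check, assuming PostgreSQL 14 or newer (which added the pg_stat_wal view):

-- Snapshot WAL generation; compare readings before and after enabling
-- wal_compression. wal_fpi counts full-page images, the part of the WAL
-- that compression shrinks.
SELECT wal_records, wal_fpi, pg_size_pretty(wal_bytes) AS wal_bytes
FROM pg_stat_wal;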

Raise the checkpoint timeout from its default of 5m. This allows Postgres to coalesce changes, which results in smaller WAL deltas being shipped for replication. The tradeoff is that if you go too high here, startup recovery performance takes a hit.

checkpoint_timeout = 10min
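
To verify the timeout is behaving as intended, watch how checkpoints are being triggered. A sketch, assuming PostgreSQL 16 or earlier (PostgreSQL 17 moved these counters to pg_stat_checkpointer):

-- Mostly timed checkpoints is the healthy state; a high requested count
-- means the WAL fills max_wal_size before checkpoint_timeout elapses.
SELECT checkpoints_timed, checkpoints_req FROM pg_stat_bgwriter;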

Config options continued:

random_page_cost = 1.1

The default is 4, which causes the query planner to avoid index scans; 1.1 better reflects the cost of random reads on modern SSDs.

jit_above_cost = -1
jit = off

Disabling JIT is helpful because it mainly benefits long-running analytical queries, and postgres_kvdb does not issue those.

autovacuum_vacuum_cost_limit = 2000

Autovacuums are important for sustained and consistent performance; raising the cost limit lets each vacuum pass do more work before throttling itself.
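
One way to confirm autovacuum is keeping up is to check dead-tuple counts and the last autovacuum time per table:

-- Tables with a large n_dead_tup and an old (or NULL) last_autovacuum
-- are falling behind and will degrade query performance.
SELECT relname, n_dead_tup, last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;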

shared_preload_libraries = 'auto_explain'
auto_explain.log_min_duration = '500ms'
auto_explain.log_analyze = on
auto_explain.log_buffers = on

The auto_explain extension logs an EXPLAIN ANALYZE for any query that exceeds a runtime threshold (500ms here).
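
If you can't restart Postgres to change shared_preload_libraries, auto_explain can also be loaded into a single session (superuser required); the output still goes to the server log:

-- Per-session alternative to shared_preload_libraries; only affects
-- queries issued from this session.
LOAD 'auto_explain';
SET auto_explain.log_min_duration = '500ms';
SET auto_explain.log_analyze = on;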

max_locks_per_transaction = 128
max_pred_locks_per_transaction = 1024

LND needs a lot of locks per transaction to function, and you may see an "out of shared memory" error if you leave these at the default values. If these values are insufficient, double them until the error goes away. Larger databases need more headroom: one node runner with a ~33GB database needed max_locks_per_transaction = 512 and max_pred_locks_per_transaction = 4096 to run lndinit migrate-db and lnd successfully.
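
To see how much lock headroom you have while LND is under load, you can inspect pg_locks:

-- Rough view of lock pressure; the totals trend with how close you are
-- to max_locks_per_transaction (times max_connections).
SELECT locktype, mode, count(*)
FROM pg_locks
GROUP BY locktype, mode
ORDER BY count(*) DESC;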

Configurations that need per-machine tuning

work_mem = 16MB
shared_buffers = 6000MB
synchronous_standby_names = 'walreceiver'

These options are meant to be customized based on the instance's available resources. The values provided are from my personal machine, which has 6 cores / 12 threads and 32GB of memory; during the case studies they were tailored to each node runner's machine. Note that synchronous_standby_names is the odd one out: rather than a resource knob, it names the standby (by its application_name, a custom name rather than a special keyword) that must confirm WAL receipt before a commit returns. Omit it if you don't run a replica, though I wouldn't recommend running production without synchronous replication.
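
As a rough sanity check on shared_buffers sizing, look at the cache hit ratio for the lnd database; a ratio persistently well below ~99% suggests more memory would help:

-- blks_hit are reads served from shared_buffers; blks_read had to go
-- to the OS page cache or disk.
SELECT datname,
       round(100.0 * blks_hit / nullif(blks_hit + blks_read, 0), 2) AS cache_hit_pct
FROM pg_stat_database
WHERE datname = 'lnd';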

Queries

List connection count

lnd=# SELECT datname, COUNT(datid) FROM pg_stat_activity GROUP BY datname;
 datname | count
---------+-------
 lnd     |     3

View connections with breakdown by active/idle

lnd=# SELECT
    count(*) AS total_connections,
    count(*) FILTER (WHERE state = 'active') AS active_connections,
    count(*) FILTER (WHERE state = 'idle') AS idle_connections
FROM pg_stat_activity WHERE datname='lnd';
 total_connections | active_connections | idle_connections
-------------------+--------------------+------------------
                 3 |                  1 |                1

View currently executing queries and their state

lnd=# SELECT state, query
FROM pg_stat_activity WHERE datname='lnd';
        state        |                              query
---------------------+------------------------------------------------------------------
 idle                | commit
 idle in transaction | SELECT value FROM channeldb_kv WHERE parent_id=857413 AND key=$1
(2 rows)
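
View long-running transactions

Sessions stuck "idle in transaction" hold locks and prevent vacuum from reclaiming dead tuples; this variant of the query above surfaces them with their age:

SELECT pid, state,
       now() - xact_start AS xact_age,
       query
FROM pg_stat_activity
WHERE datname = 'lnd' AND xact_start IS NOT NULL
ORDER BY xact_age DESC;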

Case studies

In testing postgres configurations, I reached out to two node runners with large amounts of traffic on their routing nodes (> 200k forwards lifetime over ~3y).

They tried the configurations above but initially were unsure that their nodes were any faster. This turned out to be due to an inefficient LNDg query; once my PR fixing it was merged, LNDg performance was much quicker.

Context: cryptosharks131/lndg#404

On a before/after basis, we observed an approximate 40% increase in TPS compared to default postgres configurations.

When we tested individual configuration options, random_page_cost had the largest impact by far. I theorize this is because it makes the postgres query planner favor index scans over sequential scans. The default value of 4 models 90% of reads being serviced from cache (which holds) and a random access costing 40x a sequential one; that was true for rotational storage media, but the assumption doesn't hold for modern SSDs.

The runner-up for most improvement was more subtle: autovacuum thresholds. Without routine autovacuums, postgres performance steadily declined. By ensuring that autovacuums ran on a roughly daily basis, we kept performance consistent over time.

RPCs

  • ListPayments
  • ForwardingHistory

Both of these RPCs are currently slow. However, with some smart querying (for example, advancing both start_date and index_offset when paging through ForwardingHistory) the performance is tolerable.

Once more SQL schemas are available, performance should increase drastically as postgres will be able to better "understand" how to serialize transactions.

Note: as of lnd v0.18.5 the Postgres write lock is removed. This should translate to better write performance compared to prior versions.

djkazic commented Apr 7, 2025

> synchronous_standby_names = 'walreceiver' is mentioned twice.

Removed the second mention. Thanks!

@ZZiigguurraatt

I think you should remove synchronous_standby_names = 'walreceiver' or provide more information on what it actually is. This seems to be another DB host that it is replicating to that is named walreceiver. All other options in this doc we can copy and paste and they will work, except this one which is a custom name, but we don't have any context here to inform us of this and that it is a custom name and not a special keyword. You've put them in a section stating "These options are meant to be customized based on the instance's available resources.", but this one is not really about resource size.

@ZZiigguurraatt

I think you should also indicate what tuning parameters help with the KV schema that is being phased out, and which are general postgres SQL schema tuning parameters.

@ZZiigguurraatt

Also, seems like the single quotes (') in some of the values aren't actually needed. They were giving me problems in my docker compose file and leaving them out worked.

djkazic commented Apr 21, 2025

The quotes are for postgresql.conf use

djkazic commented Apr 21, 2025

> I think you should also indicate what tuning parameters help with the KV schema that is being phased out, and which are general postgres SQL schema tuning parameters.

They are all for KV

djkazic commented Apr 21, 2025

> I think you should remove synchronous_standby_names = 'walreceiver' or provide more information on what it actually is. This seems to be another DB host that it is replicating to that is named walreceiver. All other options in this doc we can copy and paste and they will work, except this one which is a custom name, but we don't have any context here to inform us of this and that it is a custom name and not a special keyword. You've put them in a section stating "These options are meant to be customized based on the instance's available resources.", but this one is not really about resource size.

Sure. I can remove it. I wouldn't recommend running without synchronous replication for production though

@ZZiigguurraatt

> > I think you should also indicate what tuning parameters help with the KV schema that is being phased out, and which are general postgres SQL schema tuning parameters.
>
> They are all for KV

Maybe put that in the title or very first sentence?

@ZZiigguurraatt

> > I think you should remove synchronous_standby_names = 'walreceiver' or provide more information on what it actually is. This seems to be another DB host that it is replicating to that is named walreceiver. All other options in this doc we can copy and paste and they will work, except this one which is a custom name, but we don't have any context here to inform us of this and that it is a custom name and not a special keyword. You've put them in a section stating "These options are meant to be customized based on the instance's available resources.", but this one is not really about resource size.
>
> Sure. I can remove it. I wouldn't recommend running without synchronous replication for production though

That makes sense. It's possible I'm expecting this document to be something it originally was not, but maybe we now need an entire postgres setup tutorial that is more comprehensive, more clearly lays out optional stuff like this, and covers the other side of it (setting up the mirror) too.

@ZZiigguurraatt

When you say "It's best to use zstd for WAL compression if you can spare the CPU for it.", I'm confused by this statement. I have not used this because I'm left to believe that if I don't have a powerful CPU, it will likely make things slower for me instead of faster?

@ZZiigguurraatt

I have a database of the following size

# du -hs pgdata/
33G	pgdata/
#

and I needed the following larger values to successfully run lndinit migrate-db and lnd:

max_locks_per_transaction=512
max_pred_locks_per_transaction=4096
checkpoint_timeout=30min


Not mentioned above, I also had to set

max_wal_size=4GB
effective_io_concurrency=32


I am not sure if one, some, or all of these changes were actually required for successful running. It took too long to get to an error condition with this large DB for me to have time to try every permutation. Also, I did not get auto_explain working with my docker compose setup until after I found the above successful condition. Maybe auto_explain would have helped me locate the right parameters needed in advance before erroring out?

I set the following lower than your recommended values because I did not have a lot of free RAM in my system. I'm not sure if me setting this lower caused some of the above to have to be increased?

shared_buffers=2000MB

djkazic commented Apr 22, 2025

> > > I think you should also indicate what tuning parameters help with the KV schema that is being phased out, and which are general postgres SQL schema tuning parameters.
> >
> > They are all for KV
>
> Maybe put that in the title or very first sentence?

Updated.

djkazic commented Apr 22, 2025

> When you say "It's best to use zstd for WAL compression if you can spare the CPU for it.", I'm confused by this statement. I have not used this because I'm left to believe that if I don't have a powerful CPU, it will likely make things slower for me instead of faster?

The intent is to signal that low-power systems shouldn't enable zstd, as it can create CPU contention and slow things down. For most modern hardware though, it won't really get bogged down. Thinking more along the lines of an RPi.

djkazic commented Apr 22, 2025

> I have a database of the following size
>
> # du -hs pgdata/
> 33G	pgdata/
> #
>
> and I needed the following larger values to successfully run lndinit migrate-db and lnd:
>
> max_locks_per_transaction=512
> max_pred_locks_per_transaction=4096
> checkpoint_timeout=30min
>
> Not mentioned above, I also had to set
>
> max_wal_size=4GB
> effective_io_concurrency=32
>
> I am not sure if one, some, or all of these changes were actually required for successful running. It took too long to get to an error condition with this large DB for me to have time to try every permutation. Also, I did not get auto_explain working with my docker compose setup until after I found the above successful condition. Maybe auto_explain would have helped me locate the right parameters needed in advance before erroring out?
>
> I set the following lower than your recommended values because I did not have a lot of free RAM in my system. I'm not sure if me setting this lower caused some of the above to have to be increased?
>
> shared_buffers=2000MB

Yep, larger DBs will absolutely require some further tuning. The idea of the doc is to define a starting point. I'll add a note about the locks.
