Reduce pinning and buffer content locking for btree scans

Enterprise / PostgreSQL - Kevin Grittner [postgresql.org] - 25 March 2015 14:24 UTC

Even though the main benefit of the Lehman and Yao algorithm for btrees is that no locks need be held between page reads in an index search, we were holding a buffer pin on each leaf page after it was read until we were ready to read the next one. The reason was so that we could treat this as a weak lock to create an "interlock" with vacuum's deletion of heap line pointers, even though our README file pointed out that this was not necessary for a scan using an MVCC snapshot.

The main goal of this patch is to reduce the blocking of vacuum processes by in-progress btree index scans (including scans behind idle cursors), but the code rearrangement also allows one fewer buffer content lock to be taken when a forward scan steps from one page to the next, which results in a small but consistent performance improvement in many workloads.
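To illustrate why a lingering pin blocks vacuum, here is a minimal sketch in C. It is a toy model, not PostgreSQL code: `PageModel`, `pin_page`, `unpin_page`, and `vacuum_try_cleanup` are hypothetical names invented for this example. It assumes only the rule that the "super-exclusive" lock can be granted when vacuum's own pin is the sole pin on the page.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one shared buffer; not PostgreSQL's real buffer descriptor. */
typedef struct PageModel {
    int pin_count;              /* number of backends currently pinning the page */
} PageModel;

/* A scan (or vacuum) takes a pin so the page cannot be evicted or cleaned up. */
static void pin_page(PageModel *page)
{
    page->pin_count++;
}

static void unpin_page(PageModel *page)
{
    assert(page->pin_count > 0);
    page->pin_count--;
}

/*
 * Model of vacuum asking for the super-exclusive lock: it pins the page
 * itself and succeeds only if nobody else holds a pin at that moment.
 */
static bool vacuum_try_cleanup(PageModel *page)
{
    bool granted;

    pin_page(page);
    granted = (page->pin_count == 1);
    unpin_page(page);
    return granted;
}
```

In this model, a scan that calls `pin_page` on a leaf page and then sits idle makes `vacuum_try_cleanup` fail until the scan calls `unpin_page`. Releasing the pin right after the page is read removes that wait.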

This patch leaves behavior unchanged for three cases, which can be addressed separately so that each can be evaluated on its own merits: a scan that uses a non-MVCC snapshot, an index-only scan, and a scan of a btree index for which modifications are not WAL-logged. If later patches allow all of these cases to drop the buffer pin after reading a leaf page, then the btree vacuum process can be simplified; it will no longer need the "super-exclusive" lock to delete tuples from a page.
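The three exceptions above amount to a simple predicate, sketched below in C. This is an illustration of the rule as the commit message states it, not the patch's actual code: the `ScanModel` struct, its field names, and `may_drop_pin_after_leaf_read` are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified scan description; not PostgreSQL's real scan struct. */
typedef struct ScanModel {
    bool uses_mvcc_snapshot;    /* false for non-MVCC snapshots */
    bool is_index_only_scan;    /* true when the scan returns index tuples only */
    bool index_is_wal_logged;   /* false when index changes are not WAL-logged */
} ScanModel;

/*
 * Returns true when, per the commit message, the scan may release its pin
 * on a leaf page as soon as the page has been read. Any one of the three
 * excluded cases keeps the old behavior of holding the pin.
 */
static bool may_drop_pin_after_leaf_read(const ScanModel *scan)
{
    assert(scan != NULL);
    return scan->uses_mvcc_snapshot
        && !scan->is_index_only_scan
        && scan->index_is_wal_logged;
}
```

An ordinary MVCC scan of a WAL-logged index satisfies all three conditions and so no longer blocks vacuum on the leaf page it last read. Changing any single field to its excluded value makes the predicate return false.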

Reviewed by Heikki Linnakangas and Kyotaro Horiguchi

2ed5b87f Reduce pinning and buffer content locking for btree scans.
src/backend/access/nbtree/README | 83 +++++++++++-------
src/backend/access/nbtree/nbtinsert.c | 4 +-
src/backend/access/nbtree/nbtree.c | 86 +++++++++++-------
src/backend/access/nbtree/nbtsearch.c | 156 +++++++++++++++++++++++++--------
src/backend/access/nbtree/nbtutils.c | 88 +++++++++++++------
src/backend/access/nbtree/nbtxlog.c | 3 +-
src/backend/storage/buffer/README | 4 +-
src/include/access/nbtree.h | 36 +++++++-
8 files changed, 327 insertions(+), 133 deletions(-)

Upstream: git.postgresql.org

