Improve VACUUM and ANALYZE by avoiding early lock queue

Enterprise / PostgreSQL - Michael Paquier [paquier.xyz] - 27 August 2018 00:11 EDT

A caller of VACUUM can perform early lookup obtention which can cause other sessions to block on the request done, causing potentially DOS attacks as even a non-privileged user can attempt a vacuum fill of a critical catalog table to block even all incoming connection attempts.

Contrary to TRUNCATE, a client could attempt a system-wide VACUUM after building the list of relations to VACUUM, which can cause vacuum_rel() or analyze_rel() to try to lock the relation but the operation would just block. When the client specifies a list of relations and the relation needs to be skipped, ownership checks are done when building the list of relations to work on, preventing a later lock attempt.

vacuum_rel() already had the sanity checks needed, except that those were applied too late. This commit refactors the code so as relation skips are checked beforehand, making it safer to avoid too early locks, for both manual VACUUM with and without a list of relations specified.

An isolation test is added emulating the fact that early locks do not happen anymore, issuing a WARNING message earlier if the user calling
VACUUM is not a relation owner.

When a partitioned table is listed in a manual VACUUM or ANALYZE command, its full list of partitions is fetched, all partitions get added to the list to work on, and then each one of them is processed one by one, with ownership checks happening at the later phase of
vacuum_rel() or analyze_rel(). Trying to do early ownership checks for each partition is proving to be tedious as this would result in deadlock risks with lock upgrades, and skipping all partitions if the listed partitioned table is not owned would result in a behavior change compared to how Postgres 10 has implemented vacuum for partitioned tables. The original problem reported related to early lock queue for critical relations is fixed anyway, so priority is given to avoiding a backward-incompatible behavior.

a556549d7e Improve VACUUM and ANALYZE by avoiding early lock queue
src/backend/commands/analyze.c | 28 ++---
src/backend/commands/vacuum.c | 156 +++++++++++++++++-------
src/include/commands/vacuum.h | 3 +
src/test/isolation/expected/vacuum-conflict.out | 149 ++++++++++++++++++++++
src/test/isolation/isolation_schedule | 1 +
src/test/isolation/specs/vacuum-conflict.spec | 51 ++++++++
6 files changed, 328 insertions(+), 60 deletions(-)

Upstream: git.postgresql.org


  • Share