Prevent numeric overflows in parallel numeric aggregates

Enterprise / PostgreSQL - Dean Rasheed [gmail.com] - 5 July 2021 09:16 UTC

Formerly, various numeric aggregate functions supported parallel aggregation by having each worker convert its partial aggregate values to Numeric and use numeric_send() to serialize its state. That's problematic because the range of Numeric is smaller than that of NumericVar, so a partial value can overflow (on either side of the decimal point) in cases that would succeed in non-parallel mode.

Fix by serializing NumericVars instead, to avoid the overflow risk and ensure that parallel and non-parallel modes work the same.
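
The failure mode described above can be illustrated with a toy model. This is only a sketch, not the numeric.c code: it uses plain Python integers, a made-up range limit (NARROW_LIMIT) standing in for the narrower Numeric range, and hypothetical helper names (to_narrow, serialize_narrow, serialize_wide, parallel_sum, serial_sum). It shows how a worker's partial state can exceed the narrow range even though the final result fits, and how serializing the wide state avoids the error.

```python
# Toy analogy only: NARROW_LIMIT and all function names are hypothetical,
# chosen for illustration. They are not PostgreSQL APIs.
NARROW_LIMIT = 10**6  # stand-in for the smaller range of the narrow type


def to_narrow(value):
    """Convert to the narrow type, failing if the value is out of range."""
    if abs(value) > NARROW_LIMIT:
        raise OverflowError("value overflows narrow format")
    return value


def serialize_narrow(partial_sum):
    """Old approach: force the partial state through the narrow type."""
    return str(to_narrow(partial_sum)).encode()


def serialize_wide(partial_sum):
    """New approach: serialize the wide internal state directly."""
    return str(partial_sum).encode()


def deserialize(data):
    return int(data.decode())


def parallel_sum(chunks, serialize):
    """Each 'worker' sums one chunk, serializes it, and a leader combines."""
    states = [serialize(sum(chunk)) for chunk in chunks]
    total = sum(deserialize(state) for state in states)
    return to_narrow(total)  # only the final result must fit the narrow type


def serial_sum(values):
    """Non-parallel mode: the running state stays wide until the end."""
    return to_narrow(sum(values))


# Each worker's partial sum is out of the narrow range, but the total is 0.
chunks = [[900_000, 900_000], [-900_000, -900_000]]
values = [v for chunk in chunks for v in chunk]
```

With these inputs, serial_sum(values) and parallel_sum(chunks, serialize_wide) both succeed and agree, while parallel_sum(chunks, serialize_narrow) raises OverflowError on the first worker's partial state.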

A side benefit is that this improves the efficiency of the serialization/deserialization code, which can make a noticeable difference to performance with large numbers of parallel workers.

No back-patch due to risk from changing the binary format of the aggregate serialization states, as well as lack of prior field complaints and low probability of such overflows in practice.

Patch by me. Thanks to David Rowley for review and performance testing, and Ranier Vilela for an additional suggestion.

Discussion: https://postgr.es/m/CAEZATCUmeFWCrq2dNzZpRj5+6LfN85jYiDoqm+ucSXhb9U2TbA@mail.gmail.com

f025f2390e Prevent numeric overflows in parallel numeric aggregates.
src/backend/utils/adt/numeric.c | 255 ++++++++++++++++------------------
src/test/regress/expected/numeric.out | 50 +++++++
src/test/regress/sql/numeric.sql | 36 +++++
3 files changed, 203 insertions(+), 138 deletions(-)

Upstream: git.postgresql.org

