Recognize some OR clauses as compatible with functional dependencies

18 March 2020 15:41 UTC

Since commit 8f321bd16c, functional dependencies can handle IN clauses. That introduced a possible (and surprising) inconsistency, because an IN clause may also be expressed as an OR clause, and OR clauses are still considered incompatible. For example

a IN (1, 2, 3)

may be rewritten as

(a = 1 OR a = 2 OR a = 3)

The IN clause will work fine with functional dependencies, but the OR clause will force the estimation to fall back to plain per-column estimates, possibly introducing significant estimation errors.

This commit recognizes OR clauses equivalent to an IN clause (when all arguments are compatible and reference the same attribute) as a special case, compatible with functional dependencies. This allows applying functional dependencies, just like for IN clauses.
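
The check described above can be sketched roughly as follows. This is a minimal, standalone illustration and not the code in dependencies.c: the SimpleClause type, the ClauseKind enum and the function or_clause_is_in_like are hypothetical names that stand in for PostgreSQL's expression tree, and "compatible" is reduced here to "an equality between an attribute and a constant".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified clause model (not PostgreSQL's Node tree). */
typedef enum
{
    CLAUSE_EQ_CONST,            /* "attribute = constant" */
    CLAUSE_OTHER                /* anything else, treated as incompatible */
} ClauseKind;

typedef struct
{
    ClauseKind  kind;
    int         attnum;         /* attribute referenced by the clause */
} SimpleClause;

/*
 * Return true if the arguments of an OR clause are all compatible
 * equality clauses on one and the same attribute, i.e. the OR clause
 * could equally be written as "attribute IN (...)".  On success the
 * shared attribute number is stored in *attnum.
 */
bool
or_clause_is_in_like(const SimpleClause *args, size_t nargs, int *attnum)
{
    size_t      i;

    assert(args != NULL && attnum != NULL);

    if (nargs == 0)
        return false;

    for (i = 0; i < nargs; i++)
    {
        /* every argument must be a compatible clause on its own */
        if (args[i].kind != CLAUSE_EQ_CONST)
            return false;

        /* and all arguments must reference the same attribute */
        if (args[i].attnum != args[0].attnum)
            return false;
    }

    *attnum = args[0].attnum;
    return true;
}
```

With this model, (a = 1 OR a = 2 OR a = 3) passes the check, while (a = 1 OR b = 2) does not, because the arguments reference different attributes.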

This does not eliminate the difference in estimating the clause itself, i.e. the IN clause and the OR clause still use different formulas. It would be possible to change that (for these special OR clauses), but that is not really about extended statistics; it has always worked like this. Moreover, the errors are usually much smaller than those caused by ignoring dependencies.
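
To give a feel for how small that remaining difference can be, here is a sketch comparing two ways of combining per-value selectivities. It rests on assumptions that are not stated in this commit message: that the IN-style estimate adds the selectivities of the individual values (treating them as disjoint), and that the OR-style estimate combines them as independent events using s1 + s2 - s1 * s2. The function names are hypothetical and the code is not taken from the PostgreSQL source.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Assumed IN-style estimate: the listed values are mutually exclusive,
 * so their selectivities are simply added up (and clamped to 1.0).
 */
double
in_style_selectivity(const double *sel, size_t n)
{
    double      s = 0.0;
    size_t      i;

    assert(sel != NULL || n == 0);

    for (i = 0; i < n; i++)
        s += sel[i];

    return (s > 1.0) ? 1.0 : s;
}

/*
 * Assumed OR-style estimate: the arms are treated as independent
 * events and folded in one at a time with s1 + s2 - s1 * s2.
 */
double
or_style_selectivity(const double *sel, size_t n)
{
    double      s = 0.0;
    size_t      i;

    assert(sel != NULL || n == 0);

    for (i = 0; i < n; i++)
        s = s + sel[i] - s * sel[i];

    return s;
}
```

Under these assumptions, three values with selectivity 0.01 each give 0.03 for the IN-style estimate and 0.029701 for the OR-style estimate, a difference of about 1%.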

Author: Tomas Vondra

ccaa3569f5 Recognize some OR clauses as compatible with functional dependencies
src/backend/statistics/dependencies.c | 65 +++++++++++++++++++++++++--------
src/test/regress/expected/stats_ext.out | 52 ++++++++++++++++++++++++++
src/test/regress/sql/stats_ext.sql | 20 ++++++++++
3 files changed, 121 insertions(+), 16 deletions(-)


