Fix full text search to handle NOT above a phrase search correctly

Enterprise / PostgreSQL - Tom Lane - 27 April 2020 16:21 EDT

Queries such as '!(foo<->bar)' failed to find matching rows when implemented as a GiST or GIN index search. The cause was a failure to treat phrase searches as tri-valued when evaluating a query without any position information for the target tsvector. In that situation we can only say that the phrase operator might match, not that it does match; and therefore its NOT also might match. The previous coding incorrectly inverted the approximate phrase result to decide that there was certainly no match.
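
As an illustration only, here is a small Python model of the logic error. It is not the PostgreSQL C code: the Ternary type and every function name below are hypothetical, and a tsvector without position data is modelled as a plain set of lexemes.

```python
from enum import Enum


class Ternary(Enum):
    """Illustrative three-valued result; names are assumptions."""
    NO = 0
    MAYBE = 1
    YES = 2


def ternary_not(value):
    """NOT in three-valued logic: MAYBE stays MAYBE."""
    if value is Ternary.YES:
        return Ternary.NO
    if value is Ternary.NO:
        return Ternary.YES
    return Ternary.MAYBE


def phrase_without_positions(lexemes, left, right):
    """Evaluate left<->right when only lexeme presence is known."""
    if left in lexemes and right in lexemes:
        # Both words occur, but adjacency cannot be checked.
        return Ternary.MAYBE
    return Ternary.NO


def old_not_phrase(lexemes, left, right):
    """Model of the bug: collapse to a bool first, then invert."""
    approx = phrase_without_positions(lexemes, left, right) is not Ternary.NO
    return not approx


def new_not_phrase(lexemes, left, right):
    """Model of the fix: invert the ternary value itself."""
    return ternary_not(phrase_without_positions(lexemes, left, right))


# A row such as "bar foo" holds both lexemes but not the phrase "foo bar",
# so it should match !(foo<->bar).
row = {"foo", "bar"}
print(old_not_phrase(row, "foo", "bar"))  # definite "no match": row is lost
print(new_not_phrase(row, "foo", "bar"))  # MAYBE: row must be rechecked
```

In this model the old path reports a definite non-match for a row that should be returned, while the ternary path keeps it as a candidate to be rechecked against the full data.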

To fix, we need to make TS_phrase_execute return a real ternary result, and then bubble that up accurately in TS_execute. As long as we have to do that anyway, we can simplify the baroque things TS_phrase_execute was doing internally to manage tri-valued searching with only a bool as explicit result.

For now, I left the externally-visible result of TS_execute as a plain bool. There do not appear to be any outside callers that need to distinguish a three-way result, given that they passed in a flag saying what to do in the absence of position data. This might need to change someday, but we wouldn't want to back-patch such a change.
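
A rough Python sketch of that shape, under stated assumptions: a ternary value is propagated through the whole query tree and collapsed to a bool only at the top, using a caller-supplied flag. The tuple-based query tree, the function names, and the maybe_counts_as_match parameter are all hypothetical; the real code is C and works on tsquery items, and phrase operands are simplified here to two plain lexemes.

```python
from enum import IntEnum


class Ternary(IntEnum):
    """Illustrative three-valued result, ordered NO < MAYBE < YES."""
    NO = 0
    MAYBE = 1
    YES = 2


def evaluate(node, lexemes):
    """Recursively evaluate a query tree against a position-less lexeme set."""
    kind = node[0]
    if kind == "term":
        return Ternary.YES if node[1] in lexemes else Ternary.NO
    if kind == "phrase":
        # Without positions, presence of both words only means "possible".
        both = node[1] in lexemes and node[2] in lexemes
        return Ternary.MAYBE if both else Ternary.NO
    if kind == "not":
        # Swaps YES and NO, leaves MAYBE unchanged.
        return Ternary(2 - evaluate(node[1], lexemes))
    results = [evaluate(child, lexemes) for child in node[1:]]
    if kind == "and":
        return min(results)
    if kind == "or":
        return max(results)
    raise ValueError(f"unknown node kind: {kind!r}")


def execute(node, lexemes, maybe_counts_as_match=True):
    """Bool-returning wrapper: the caller's flag decides what MAYBE means."""
    result = evaluate(node, lexemes)
    if result is Ternary.MAYBE:
        return maybe_counts_as_match
    return result is Ternary.YES


# !(foo<->bar) against a row that contains both lexemes
query = ("not", ("phrase", "foo", "bar"))
print(execute(query, {"foo", "bar"}))
```

Keeping the three-way value internal means callers still see a plain bool, while NOT, AND and OR no longer lose the distinction between "no" and "not sure" partway through the tree.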

Although tsginidx.c has its own TS_execute_ternary implementation for use at upper index levels, that sadly managed to get this case wrong as well :-(. Fortunately, fixing that one is a lot easier.

Per bug #16388 from Charles Offenbacher. Back-patch to 9.6 where phrase search was introduced.


src/backend/utils/adt/tsginidx.c | 23 ++-
src/backend/utils/adt/tsvector_op.c | 269 ++++++++++++++++---------
src/test/regress/data/ | 40 ++--
src/test/regress/expected/tsearch.out | 366 +++++++++++++++++++++++++++++++++-
src/test/regress/expected/tstypes.out | 78 ++++++++
src/test/regress/sql/tsearch.sql | 60 ++++++
src/test/regress/sql/tstypes.sql | 13 ++
7 files changed, 734 insertions(+), 115 deletions(-)


