enable intuit under anchored \G, and fix a bug

David Mitchell [iabyn.com] - 28 July 2013 04:33 UTC

Since 1999, regcomp has had approximately the following comment and code:

    /* XXXX Currently intuiting is not compatible with ANCH_GPOS.
       This should be changed ASAP! */
    if ((r->check_substr || r->check_utf8) && !(r->extflags & RXf_ANCH_GPOS)) {
        r->extflags |= RXf_USE_INTUIT;


However, it appears that since that time, intuit has had (at least some) support for anchored \G added. Note also that the RXf_USE_INTUIT flag (up until a few commits ago) was only used by *callers* of regexec() to decide whether to call intuit() first. regexec() itself also calls intuit() internally on occasion, and in those cases it checks just the check_substr and check_utf8 fields directly rather than the RXf_USE_INTUIT flag, so there it is already using intuit even in the presence of anchored \G.

So, in the grand perl tradition of "make the change and see if anything in the test suite breaks", that's what I've done for this commit (i.e. removed the RXf_ANCH_GPOS check above).
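The effect of dropping that check can be sketched in isolation. This is only a toy model: the struct toy_rx type, the TOY_* bit values and both function names are made up for illustration, and perl's real regexp struct and RXf_* constants differ.

```c
#include <stddef.h>

/* Hypothetical bit values for illustration only. */
#define TOY_ANCH_GPOS  0x1u   /* pattern starts with an anchored \G */
#define TOY_USE_INTUIT 0x2u   /* callers should try intuit first */

/* Hypothetical cut-down stand-in for the compiled regex. */
struct toy_rx {
    const char *check_substr;  /* NULL if absent */
    const char *check_utf8;    /* NULL if absent */
    unsigned    extflags;
};

/* Before this commit: an anchored \G vetoes the flag. */
static void toy_set_intuit_before(struct toy_rx *r)
{
    if ((r->check_substr || r->check_utf8) && !(r->extflags & TOY_ANCH_GPOS))
        r->extflags |= TOY_USE_INTUIT;
}

/* After this commit: only the presence of a check string matters. */
static void toy_set_intuit_after(struct toy_rx *r)
{
    if (r->check_substr || r->check_utf8)
        r->extflags |= TOY_USE_INTUIT;
}
```

For a pattern with a check string and TOY_ANCH_GPOS set, the first function leaves TOY_USE_INTUIT clear and the second sets it; a pattern with no check string is unaffected either way.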

So intuit is now normally called even in the presence of anchored \G. This means that something like "aaaa" =~ /\G.*xx/ will now quickly fail in intuit rather than more slowly failing in regmatch().
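To see why that failure is quick, here is a very rough sketch of the idea behind intuit: if some literal string must appear in any successful match ("xx" for the pattern above), a plain substring search can rule the match out before the full matcher runs. The struct toy_regex type and the toy_intuit() function are hypothetical names, and this is far simpler than what perl's real intuit code does.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for a compiled pattern, keeping only the
 * literal that every successful match must contain. */
struct toy_regex {
    const char *check_substr;   /* NULL if no such literal is known */
};

/* Returns false when no match is possible from `start` onwards;
 * returns true when the full matcher still has to be run. */
static bool toy_intuit(const struct toy_regex *rx, const char *start)
{
    if (rx->check_substr == NULL)
        return true;            /* nothing to pre-check */
    return strstr(start, rx->check_substr) != NULL;
}
```

With check_substr set to "xx", toy_intuit() rejects "aaaa" after a single substring scan, while "aaxx" is passed on to the full matcher.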

Note that I have no actual knowledge of whether intuit is *really* anchored-\G-safe.

As it happens one thing in the test suite did break, and this was due to the following code, added back in 1997:

    if (
        ...
        && (!(RExC_seen & REG_SEEN_GPOS) || (r->extflags & RXf_ANCH_GPOS))
    )
        r->extflags |= RXf_CHECK_ALL;

It was clearly meant to say that if either of those \G flags were present, don't set the RXf_CHECK_ALL flag (which enables intuit-only matches). But the '!' was set to cover the first condition only, rather than both. Presumably this had never been spotted before due to skipping intuit under anchored \G.
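The difference between the two readings is easy to check with a pair of toy predicates. The function names and the boolean parameters are made up for illustration: seen_gpos stands for (RExC_seen & REG_SEEN_GPOS) and anch_gpos for (r->extflags & RXf_ANCH_GPOS).

```c
#include <stdbool.h>

/* As written: the '!' applies to the first condition only. */
static bool gpos_ok_as_written(bool seen_gpos, bool anch_gpos)
{
    return !seen_gpos || anch_gpos;
}

/* As presumably intended: neither \G flag may be present. */
static bool gpos_ok_as_intended(bool seen_gpos, bool anch_gpos)
{
    return !(seen_gpos || anch_gpos);
}
```

The two agree when anch_gpos is false. When anch_gpos is true, the first returns true and the second returns false, so the as-written form lets RXf_CHECK_ALL be set in exactly the anchored \G case it was meant to exclude.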

[Actually this commit broke some other stuff too, not covered by the test suite. See the next commit. Hooray for git rebase -i and history re-writing!]

f1fb9b0 enable intuit under anchored \G, and fix a bug
regcomp.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)

Upstream: perl5.git.perl.org

