Subject: Re: [perl #34195] Regex: Alternations within negative lookahead assertions
Date: Mon, 21 Feb 2005 11:47:10 +0000

From: hv@no-spam

Mike Rosulek (via RT) <perlbug-followup@no-spam> wrote:
[...]
:if ( ("a" x 20) =~ /^(a*?)(?!(aa|aaaa)*$)/ ) {
: print "first matched: ($1|$')\n";
: ## doesn't work!
:}
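For reference, a minimal sketch of what the quoted pattern ought to do on strings of different lengths. The inner pattern (aa|aaaa)*$ can only consume an even number of characters, so the lookahead should succeed as soon as an odd number of "a"s remains. The helper expected_prefix_len is hypothetical (it is not part of the original report); it computes the expected length of $1 by arithmetic alone, so it can be compared against whatever the regex engine under test actually returns.

```perl
use strict;
use warnings;

# Hypothetical helper: smallest k such that the remaining n-k characters
# cannot be split into "aa"/"aaaa" pieces, i.e. n-k is odd.
sub expected_prefix_len {
    my ($n) = @_;
    for my $k (0 .. $n) {
        return $k if ($n - $k) % 2 == 1;
    }
    return;    # no such k exists (only when n == 0)
}

# Compare the arithmetic expectation with what the engine reports.
for my $n (1 .. 20) {
    my $str  = "a" x $n;
    my $want = expected_prefix_len($n);
    my $got  = $str =~ /^(a*?)(?!(aa|aaaa)*$)/ ? length($1) : undef;
    printf "n=%2d expected=%s got=%s\n",
        $n, $want // 'no match', $got // 'no match';
}
```

Any row where "expected" and "got" differ points at an engine problem rather than at the pattern.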

This first fails with ("a" x 8), and it appears that this is because patch #20538 for [perl #23030] is insufficient. (See:
http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/2003-08/msg00511.html for the details and the followup for the patch.)

The test case there was:
"......abef" =~ /.*a(?!(b|cd)*e).*f/
incorrectly reporting a match.
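To spell out why a match is wrong there, here is a small sketch that checks the lookahead by hand. It relies on the string containing exactly one "a", so the assertion can only be tried at one position. The variable names are illustrative only.

```perl
use strict;
use warnings;

my $str = "......abef";

# The only "a" is at offset 6, so the lookahead can only be tried
# immediately after it.
my $after_a = substr($str, index($str, "a") + 1);    # "bef"

# The inner pattern on its own, anchored where the lookahead would start.
my $inner_matches = $after_a =~ /^(?:b|cd)*e/ ? 1 : 0;

# With a single "a" and an "f" at the end, the overall match should
# succeed exactly when the inner pattern fails.
my $should_match = $inner_matches ? 0 : 1;
my $does_match   = $str =~ /.*a(?!(b|cd)*e).*f/ ? 1 : 0;

print "inner matches: $inner_matches\n";
print "should match:  $should_match\n";
print "does match:    $does_match\n";
```

Here the inner pattern matches "be", so the negative assertion must fail and the whole regex should not match.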

I'm not sure that I fully understand the failure mode in this case, but I think the problem is that we need to be able to distinguish between cached successes and cached failures. If so, the cache cannot work without implementing the rather more expensive option (a) from the above message.

Another option would be to disable the cache whenever we're inside one or more negative assertions, but I suspect even that would involve a fairly large patch to implement.

Hugo