perl.perl5.porters: [perl #34195] Regex: Alternations within negative lookahead assertions
Date: 20 Feb 2005 22:22:46 -0000

Subject: [perl #34195] Regex: Alternations within negative lookahead assertions
From: perlbug-followup@no-spam (Mike Rosulek)

# New Ticket Created by Mike Rosulek
# Please include the string: [perl #34195]
# in the subject line of all future correspondence about this issue.
# <URL: https://rt.perl.org/rt3/Ticket/Display.html?id=34195 >

This is a bug report for perl from mike@no-spam generated with the help of perlbug 1.35 running under perl v5.8.4.

-----------------------------------------------------------------
[Please enter your report here]

## this seems to be related to Ticket #23030

## /^(aa|aaaa)*$/ is equivalent to /^(aa)*$/
## they both match strings of a's of even length
## this works:

if ( ("a" x 19) !~ /^(aa)*$/ ) {
    print "19 a's don't match /^(aa)*\$/\n";
}

## and so does this:

if ( ("a" x 19) !~ /^(aa|aaaa)*$/ ) {
    print "19 a's don't match /^(aa|aaaa)*\$/\n";
}

## thus ("a" x 20) should match /^(a*?)(?!(aa|aaaa)*$)/
## with $1 = "a", but it doesn't!

if ( ("a" x 20) =~ /^(a*?)(?!(aa|aaaa)*$)/ ) {
    print "first matched: ($1|$')\n";
    ## doesn't work!
}

## it works without the alternation
if ( ("a" x 20) =~ /^(a*?)(?!(aa)*$)/ ) {
    print "second matched: ($1|$')\n";
}

## changing the * to + causes it to match with
## $1 = ("a" x 19), which is closer, but still
## incorrect
if ( ("a" x 20) =~ /^(a*?)(?!(aa|aaaa)+$)/ ) {
    print "third matched: ($1|$')\n";
}

## also, changing the order to (aaaa|aa)* doesn't work either.

[Please do not change anything below this line]
-----------------------------------------------------------------

---
Flags:
category=core
severity=low
---
Site configuration information for perl v5.8.4:

Configured by Debian Project at Thu Feb 3 01:11:27 EST 2005.

Summary of my perl5 (revision 5 version 8 subversion 4) configuration:
Platform:
osname=linux, osvers=2.4.27-ti1211, archname=i386-linux-thread-multi
uname='linux kosh 2.4.27-ti1211 #1 sun sep 19 18:17:45 est 2004 i686 gnulinux '
config_args='-Dusethreads -Duselargefiles -Dccflags=-DDEBIAN -Dcccdlflags=-fPIC -Darchname=i386-linux -Dprefix=/usr -Dprivlib=/usr/share/perl/5.8 -Darchlib=/usr/lib/perl/5.8
-Dvendorprefix=/usr -Dvendorlib=/usr/share/perl5
-Dvendorarch=/usr/lib/perl5 -Dsiteprefix=/usr/local -Dsitelib=/usr/local/share/perl/5.8.4
-Dsitearch=/usr/local/lib/perl/5.8.4 -Dman1dir=/usr/share/man/man1
-Dman3dir=/usr/share/man/man3 -Dsiteman1dir=/usr/local/man/man1
-Dsiteman3dir=/usr/local/man/man3 -Dman1ext=1 -Dman3ext=3perl -Dpager=/usr/bin/sensible-pager -Uafs -Ud_csh -Uusesfio -Uusenm -Duseshrplib -Dlibperl=libperl.so.5.8.4 -Dd_dosuid -des'
hint=recommended, useposix=true, d_sigaction=define
usethreads=define use5005threads=undef useithreads=define usemultiplicity=define
useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
use64bitint=undef use64bitall=undef uselongdouble=undef
usemymalloc=n, bincompat5005=undef
Compiler:
cc='cc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBIAN -fno-strict-aliasing -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
optimize='-O2',
cppflags='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBIAN -fno-strict-aliasing -I/usr/local/include'
ccversion='', gccversion='3.3.5 (Debian 1:3.3.5-8)', gccosandvers=''
intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
alignbytes=4, prototype=define
Linker and Libraries:
ld='cc', ldflags =' -L/usr/local/lib'
libpth=/usr/local/lib /lib /usr/lib
libs=-lgdbm -lgdbm_compat -ldb -ldl -lm -lpthread -lc -lcrypt
perllibs=-ldl -lm -lpthread -lc -lcrypt
libc=/lib/libc-2.3.2.so, so=so, useshrplib=true, libperl=libperl.so.5.8.4
gnulibc_version='2.3.2'
Dynamic Linking:
dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib'

Locally applied patches:

---
@INC for perl v5.8.4:
/etc/perl
/usr/local/lib/perl/5.8.4
/usr/local/share/perl/5.8.4
/usr/lib/perl5
/usr/share/perl5
/usr/lib/perl/5.8
/usr/share/perl/5.8
/usr/local/lib/site_perl
/usr/local/lib/perl/5.8.3
/usr/local/share/perl/5.8.3
/usr/local/lib/perl/5.8.2
/usr/local/share/perl/5.8.2
/usr/local/lib/perl/5.8.1
/usr/local/share/perl/5.8.1
/usr/local/lib/perl/5.8.0
/usr/local/share/perl/5.8.0
.


Date: Mon, 21 Feb 2005 11:29:12 +0100

Subject: Re: [perl #34195] Regex: Alternations within negative lookahead assertions
From: demerphq@no-spam (Demerphq)

On 20 Feb 2005 22:22:46 -0000, via RT Mike Rosulek <perlbug-followup@no-spam> wrote:
> # New Ticket Created by Mike Rosulek
> # Please include the string: [perl #34195]
> # in the subject line of all future correspondence about this issue.
> # <URL: https://rt.perl.org/rt3/Ticket/Display.html?id=34195 >
>
> This is a bug report for perl from mike@no-spam
> generated with the help of perlbug 1.35 running under perl v5.8.4.
>
> -----------------------------------------------------------------
> [Please enter your report here]
>
> ## this seems to be related to Ticket #23030
>
> ## /^(aa|aaaa)*$/ is equivalent to /^(aa)*$/
> ## they both match strings of a's of even length
>
> ## this works:
>
> if ( ("a" x 19) !~ /^(aa)*$/ ) {
>     print "19 a's don't match /^(aa)*\$/\n";
> }
>
> ## and so does this:
>
> if ( ("a" x 19) !~ /^(aa|aaaa)*$/ ) {
>     print "19 a's don't match /^(aa|aaaa)*\$/\n";
> }
>
> ## thus ("a" x 20) should match /^(a*?)(?!(aa|aaaa)*$)/
> ## with $1 = "a", but it doesn't!

Yeah, I agree. Blead perl shows this problem too. Judging from the debug output, it looks like it has to do with caching:

"Detected a super-linear match, switching on caching..."

is reported just before the incorrect failure.

yves
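
The expected result claimed in the report ($1 = "a") can be cross-checked without any lookahead. The sketch below is not part of the original thread; the helper name shortest_rejecting_prefix is hypothetical. It walks over prefix lengths by hand and tests each remainder against the anchored pattern, which is what /^(a*?)(?!PATTERN$)/ should be doing. Under that assumption, a prefix length of 1 is expected for all three patterns from the report.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the report): return the smallest prefix
# length k such that the rest of the string does NOT match $tail_re.
# This mimics what /^(a*?)(?!...$)/ should find, using no lookahead.
# It assumes the string consists only of a's, as in the report.
sub shortest_rejecting_prefix {
    my ($str, $tail_re) = @_;
    for my $k (0 .. length $str) {
        my $rest = substr($str, $k);
        return $k if $rest !~ $tail_re;
    }
    return;    # every remainder matched, so the lookahead version should fail
}

my $str = "a" x 20;
for my $tail_re (qr/^(?:aa|aaaa)*$/, qr/^(?:aa)*$/, qr/^(?:aa|aaaa)+$/) {
    my $k = shortest_rejecting_prefix($str, $tail_re);
    print $tail_re, " => expected length of \$1: ",
        (defined $k ? $k : "no match"), "\n";
}
```

Comparing these lengths with what the lookahead patterns actually capture on a given perl shows whether that build is affected.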