Regex doesn't match - what am I doing wrong?

This topic is closed.
  • Sriram

    Regex doesn't match - what am I doing wrong?

    Hi,

    I am having trouble matching a regex that combines a negated character
    class and an anchor ($). Basically, I want to match all strings that
    don't end in a digit. So I tried:

    bash-2.05a@bermuda:15$ perl -e 'while (<STDIN>) { if (/[^0-9]$/) { print;}}'
    skdsklds
    skdsklds
    sklskl2 <== why does this match? it ends in a digit.
    sklskl2

    This matched all strings regardless of whether or not they ended in a
    digit. But the complemented regex seems to work fine:

    bash-2.05a@bermuda:13$ perl -e 'while (<STDIN>) { if (/[0-9]$/) { print;}}'
    sdkldsklds2
    sdkldsklds2
    sdsk2
    sdsk2
    sks <==== doesn't match as expected

    I replaced [0-9] with [\d] but got the same results.

    On the other hand, grep works as expected:

    bash-2.05a@bermuda:9$ grep '[0-9]$'
    sdksdjk2
    sdksdjk2
    22221
    22221
    sdjkdsjk <== doesn't match as expected

    bash-2.05a@bermuda:11$ grep '[^0-9]$'
    sdklsdklds
    sdklsdklds
    sdkldslk2 <== doesn't match as expected

    What am I doing wrong? Here's the perl version info:

    bash-2.05a@bermuda:17$ perl -V
    Summary of my perl5 (revision 5.0 version 6 subversion 1) configuration:
    Platform:
    osname=linux, osvers=2.4.17-0.13smp, archname=i386-linux
    uname='linux daffy.perf.redhat.com 2.4.17-0.13smp #1 smp fri feb 1
    10:30:48 est 2002 i686 unknown '
    config_args='-des -Doptimize=-O2 -march=i386 -mcpu=i686 -Dcc=gcc
    -Dcf_by=Red Hat, Inc. -Dcccdlflags=-fPIC -Dinstallprefix=/usr
    -Dprefix=/usr -Darchname=i386-linux -Dvendorprefix=/usr
    -Dsiteprefix=/usr -Uusethreads -Uuseithreads
    -Uuselargefiles -Dd_dosuid -Dd_semctl_semun -Di_db -Di_ndbm -Di_gdbm
    -Di_shadow -Di_syslog -Dman3ext=3pm
    '
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=undef use5005threads=undef useithreads=undef
    usemultiplicity=undef
    useperlio=undef d_sfio=undef uselargefiles=undef usesocks=undef
    use64bitint=undef use64bitall=undef uselongdouble=undef
    Compiler:
    cc='gcc', ccflags ='-fno-strict-aliasing -I/usr/local/include',
    optimize='-O2 -march=i386 -mcpu=i686',
    cppflags='-fno-strict-aliasing -I/usr/local/include'
    ccversion='', gccversion='2.96 20000731 (Red Hat Linux 7.2
    2.96-109)', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
    d_longlong=define, longlongsize=8, d_longdbl=define,
    longdblsize=12
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t',
    lseeksize=4
    alignbytes=4, usemymalloc=n, prototype=define
    Linker and Libraries:
    ld='gcc', ldflags =' -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib
    libs=-lnsl -ldl -lm -lc -lcrypt -lutil
    perllibs=-lnsl -ldl -lm -lc -lcrypt -lutil
    libc=/lib/libc-2.2.5.so, so=so, useshrplib=false,
    libperl=libperl.a
    Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef,
    ccdlflags='-rdynamic'
    cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib'


    Characteristics of this binary (from libperl):
    Compile-time options:
    Built under linux
    Compiled at Apr 1 2002 12:23:22
    @INC:
    /usr/lib/perl5/5.6.1/i386-linux
    /usr/lib/perl5/5.6.1
    /usr/lib/perl5/site_perl/5.6.1/i386-linux
    /usr/lib/perl5/site_perl/5.6.1
    /usr/lib/perl5/site_perl/5.6.0
    /usr/lib/perl5/site_perl
    /usr/lib/perl5/vendor_perl/5.6.1/i386-linux
    /usr/lib/perl5/vendor_perl/5.6.1
    /usr/lib/perl5/vendor_perl
  • Arno H.P. Reuser

    #2
    Re: Regex doesn't match - what am I doing wrong?

    On Wed, 10 Mar 2004 14:47:40 -0800, Sriram wrote:

    The string 'sklskl2' does not end in a digit; it ends in a newline
    character. That's probably why everything always matches. If you chop $_
    first, like

    perl -e 'while (<stdin>){chop $_; if (/[^0-9]$/) {print } }'

    results will look more promising.

    Hope this helps.
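
To make the point above concrete, here is a small sketch (not part of the original reply) that prints what /[^0-9]$/ actually matched on an unchopped line. The function names show_match and non_digit_ending are made up for illustration, and the sample inputs are the ones from the question. It also assumes chomp as a stand-in for chop: chomp removes only a trailing newline, whereas chop removes the last character whatever it is.

```shell
# Hypothetical helper: print the text that /[^0-9]$/ matched, with the
# newline made visible as a literal \n.
show_match() {
    perl -ne 'if (/[^0-9]$/) { (my $m = $&) =~ s/\n/\\n/; print "matched: [$m]\n" }'
}

# Hypothetical helper: strip the trailing newline first, then print only the
# lines whose last real character is not a digit.
non_digit_ending() {
    perl -ne 'chomp; print "$_\n" if /[^0-9]$/'
}

printf 'sklskl2\n' | show_match                    # prints: matched: [\n]
printf 'skdsklds\nsklskl2\n' | non_digit_ending    # prints: skdsklds
```

The first command suggests that the negated class is matching the newline itself, with $ then matching at the very end of the string. Once the newline is gone, the last real character has to satisfy [^0-9].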

    > Hi,
    >
    > I am having trouble matching a regex that combines a negated character
    > class and an anchor ($). Basically, I want to match all strings that
    > don't end in a digit. So I tried:
    >
    > bash-2.05a@bermuda:15$ perl -e 'while (<STDIN>) { if (/[^0-9]$/) { print;}}'
    > skdsklds
    > skdsklds
    > sklskl2 <== why does this match? it ends in a digit.
    > sklskl2


    • Sriram

      #3
      Re: Regex doesn't match - what am I doing wrong?

      Hi,

      Thanks! That works much better; so does '/[^0-9]\n/' or !/[0-9]$/.

      My Perl book says "The $ and \Z assertions can match not only at the
      end of the string, but also one character earlier than that, if the
      last character of the string happens to be a newline." This is also
      how I'm used to regexes working in grep/sed etc.

      I'm baffled that $ behaves differently when used in conjunction with a
      negated character class. The complemented regex /[0-9]$/ works without
      chop. Why the subtle difference?

      Sriram

      "Arno H.P. Reuser" <bibliothecaris@xs4all.nl> wrote in message news:<pan.2004.03.11.08.12.18.223363@xs4all.nl>...
      > On Wed, 10 Mar 2004 14:47:40 -0800, Sriram wrote:
      >
      > The string 'sklskl2' does not end in a digit; it ends in a newline
      > character. That's probably why everything always matches. If you chop $_
      > first, like
      >
      > perl -e 'while (<stdin>){chop $_; if (/[^0-9]$/) {print } }'
      >
      > results will look more promising.
      >
      > Hope this helps.
      >
      > > Hi,
      > >
      > > I am having trouble matching a regex that combines a negated character
      > > class and an anchor ($). Basically, I want to match all strings that
      > > don't end in a digit. So I tried:
      > >
      > > bash-2.05a@bermuda:15$ perl -e 'while (<STDIN>) { if (/[^0-9]$/) { print;}}'
      > > skdsklds
      > > skdsklds
      > > sklskl2 <== why does this match? it ends in a digit.
      > > sklskl2
