RegExp for a substring

  • Csaba Gabor

    RegExp for a substring

    Suppose I want to check that a string, $str, starts with at least the
    first 3 letters of a given word, say "delete". Can I do that compactly
    with a regular expression? The following are not my idea of compact:

    preg_match("/^del(|e|et|ete)\\b/i", $str)
    Quadratic in the length of the given word


    preg_match("/^del(e(t(e)?)?)?\\b/i", $str)
    Really now. But at least it's linear


    Thanks.
    Csaba Gabor from Vienna
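
The nested "linear" form above can also be generated rather than typed by hand. A minimal sketch, assuming a hypothetical helper named abbrevPattern() (not something from the thread) that takes the word and the minimum number of leading letters, and uses non-capturing groups instead of the capturing ones in the post:

```php
<?php
// Hypothetical helper (not from the thread): builds the nested-optional
// pattern for "at least $min leading letters of $word".
function abbrevPattern(string $word, int $min): string {
    // The mandatory prefix, escaped for use inside /.../ delimiters
    $head = preg_quote(substr($word, 0, $min), '/');
    $tail = '';
    // Wrap from the last letter inward: (?:e)? -> (?:t(?:e)?)? -> ...
    for ($i = strlen($word) - 1; $i >= $min; $i--) {
        $tail = '(?:' . preg_quote($word[$i], '/') . $tail . ')?';
    }
    return '/^' . $head . $tail . '\\b/i';
}

// For 'delete' with a minimum of 3 this yields /^del(?:e(?:t(?:e)?)?)?\b/i
echo abbrevPattern('delete', 3), "\n";
```

The pattern grows linearly with the word, and nobody has to read or maintain the nesting by eye.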

  • Rik

    #2
    Re: RegExp for a substring

    Csaba Gabor wrote:
    > Suppose I want to check that a string, $str, starts with at least the
    > first 3 letters of a given word, say "delete". Can I do that
    > compactly with a regular expression? The following are not my idea
    > of compact:
    >
    > preg_match("/^del(|e|et|ete)\\b/i", $str)
    > Quadratic in the length of the given word
    >
    >
    > preg_match("/^del(e(t(e)?)?)?\\b/i", $str)
    > Really now. But at least it's linear
    >

    Euhm, why at LEAST 3?
    You don't seem to be using 'more' matches.
    Or is this just a quick example?

    Normally, I wouldn't use regexes for this. The way you made them is about as
    compact as they get for this particular purpose.

    if (substr($str, 0, 3) == substr($needle, 0, 3)) {
        // code...
    }

    If you're trying to do what I think you want, maybe this is the code for
    you:
    $needle = 'delete';
    preg_match("/^[a-z]+\b/i", $str,$match);
    if(strlen($match[0])>=3 && $match[0]==substr($needle,0,strlen($match[0]))){
    //code if it matches
    }

    By no means shorter, but a lot more versatile.

    Grtz,
    --
    Rik Wasmus



    • Csaba Gabor

      #3
      Re: RegExp for a substring

      Rik wrote:
      > Csaba Gabor wrote:
      > > Suppose I want to check that a string, $str, starts with at least the
      > > first 3 letters of a given word, say "delete". Can I do that
      > > compactly with a regular expression? The following are not my idea
      > > of compact:
      > >
      > > preg_match("/^del(|e|et|ete)\\b/i", $str)
      > > Quadratic in the length of the given word
      > >
      > > preg_match("/^del(e(t(e)?)?)?\\b/i", $str)
      > > Really now. But at least it's linear
      > >
      > Euhm, why at LEAST 3?
      > You don't seem to be using 'more' matches.
      > Or is this just a quick example?

      Yes, just a quick example. It came about as a way to detect certain
      command line options. A person should be able to abbreviate any
      command line option, provided the abbreviation is still uniquely
      applicable (e.g. 'deleg' cannot abbreviate 'delete', and if 'define' is
      another option then 'de' can abbreviate neither 'delete' nor 'define').
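
The uniqueness rule described here can be sketched without a regular expression. This is only an illustration under stated assumptions: resolveOption() is a hypothetical name, the option list is passed in as a plain array, and an exact match is assumed to win even when it is also a prefix of another option (the thread does not say how that case should be handled).

```php
<?php
// Hypothetical helper (not from the thread): returns the one option that
// $input abbreviates, or null when it matches none or more than one.
function resolveOption(string $input, array $options): ?string {
    if ($input === '') {
        return null; // an empty string would "abbreviate" every option
    }
    $hits = [];
    foreach ($options as $option) {
        // Assumption: an exact (case-insensitive) match wins outright
        if (strcasecmp($input, $option) === 0) {
            return $option;
        }
        // Case-insensitive comparison of the first strlen($input) bytes
        if (strncasecmp($input, $option, strlen($input)) === 0) {
            $hits[] = $option;
        }
    }
    // Exactly one candidate means the abbreviation is unambiguous
    return count($hits) === 1 ? $hits[0] : null;
}

$options = ['delete', 'define'];
var_dump(resolveOption('del', $options));   // string(6) "delete"
var_dump(resolveOption('de', $options));    // NULL (ambiguous)
var_dump(resolveOption('deleg', $options)); // NULL (not a prefix)
```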

      > Normally, I wouldn't use regexes for this. The way you made them is about as
      > compact as they get for this particular purpose.
      >
      > if (substr($str, 0, 3) == substr($needle, 0, 3)) {
      >     // code...
      > }
      >
      > If you're trying to do what I think you want, maybe this is the code for you:
      > $needle = 'delete';
      > preg_match("/^[a-z]+\b/i", $str,$match);
      > if(strlen($match[0])>=3 && $match[0]==substr($needle,0,strlen($match[0]))){
      > //code if it matches
      > }
      >
      > By no means shorter, but a lot more versatile.

      Thanks for the response. It's what I also came up with (with some
      lower casing), but it struck me as a lot of code to do something fairly
      trivial.

      function abbreviates($needle, $haystack) {
        // returns the first word in $haystack if it is $needle or
        // an abbreviation. Otherwise returns "". Case insensitive
        if (!preg_match("/^\\s*(\\w+)\\b/", $haystack, $match)) return "";
        return strcasecmp($m=$match[1],substr($needle,0,strlen($m)))
          ? "" : $m;
      }

      Csaba


      • Paul Lautman

        #4
        Re: RegExp for a substring

        Csaba Gabor wrote:
        > Rik wrote:
        >> Csaba Gabor wrote:
        What you seem to be trying to achieve is similar to REXX's ABBREV
        function, which takes 3 arguments:

        Abbrev( information, info [, length ] )

        information -- reference string
        The string that may start with the abbreviated text value.

        info -- test substring
        The abbreviated text value.

        length -- minimal substring length
        The minimal required length of the test substring -- the default minimum
        length is 0 !

        Which I would solve like this:

        function abbrev($full,$part,$len=0) {
          return !(strlen($part)<$len || substr($full,0,strlen($part)) !== $part);
        }
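
A few calls show how the function above behaves at the edges. The function is repeated so the snippet runs on its own; the sample inputs are made up for illustration. Note that, unlike the earlier posts in this thread, the comparison here is case-sensitive.

```php
<?php
// abbrev() as posted above, repeated so the snippet is self-contained.
function abbrev($full, $part, $len = 0) {
    return !(strlen($part) < $len || substr($full, 0, strlen($part)) !== $part);
}

var_dump(abbrev('delete', 'del', 3));   // bool(true)
var_dump(abbrev('delete', 'de', 3));    // bool(false): shorter than the minimum
var_dump(abbrev('delete', 'deleg'));    // bool(false): not a prefix
var_dump(abbrev('delete', 'deleted'));  // bool(false): longer than the full word
var_dump(abbrev('delete', 'DEL'));      // bool(false): comparison is case-sensitive
```

If case-insensitive matching is wanted, one option would be to lower-case both arguments with strtolower() before comparing.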

