simple regex problem

  • Chamomile

    simple regex problem

    I have to split strings of the type:

    $str1 =' Large ladies hats 1.365 0.334';
    $str2 = 'Pins 0.335 0.22';

    into separate variables (or array members):
    say, $textStr, $number1, $number2 or $pieces[0], $pieces[1], $pieces[2],
    for inclusion in 3 separate fields of a database row.
    I am new to the world of regex and split() or preg_split() etc. and have
    been trying all day to do this seemingly simple task, but keep running
    into problems till my head spins.
    Can anyone give me some pointers?
    (I have tried the manuals!)




  • John Dunlop

    #2
    Re: simple regex problem

    Chamomile multiposted:
    > I have to split strings of the type:
    >
    > $str1 =' Large ladies hats 1.365 0.334';
    > $str2 = 'Pins 0.335 0.22';

    What type is that exactly?
    > into separate variables (or array members)

    preg_split('`\s{2,}`',$string)

    That returns an array containing substrings of $string split along
    boundaries of two or more whitespace characters.
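
As a minimal sketch of that call: the sample string below is hypothetical and assumes the gaps between the fields are at least two spaces wide, while the words inside the text part are separated by single spaces.

```php
<?php
// Hypothetical input: fields separated by two or more spaces.
$string = 'Large ladies hats  1.365   0.334';

// Split wherever two or more whitespace characters occur in a row.
// Single spaces (as inside "Large ladies hats") are left alone.
$pieces = preg_split('`\s{2,}`', $string);

// $pieces is array('Large ladies hats', '1.365', '0.334')
print_r($pieces);
```

If any gap between fields is only one space wide, this pattern will not split there.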

    --
    Jock

    • Chamomile

      #3
      Re: simple regex problem


      "John Dunlop" <john+usenet@johndunlop.info> wrote in message
      news:MPG.1a8f434f8d7d31cd989684@News.Individual.NET...
      > Chamomile multiposted:
      >
      > > I have to split strings of the type:
      > >
      > > $str1 =' Large ladies hats 1.365 0.334';
      > > $str2 = 'Pins 0.335 0.22';
      >
      > What type is that exactly?
      >
      > > into separate variables (or array members)
      >
      > preg_split('`\s{2,}`',$string)
      >
      > That returns an array containing substrings of $string split along
      > boundaries of two or more whitespace characters.
      >
      > --
      > Jock

      Thank you, Jock.
      Yes, I suppose I should have said 'kind' not 'type'.
      'multiposting' (by which I assume you mean also posting on alt.php) is a
      bad thing then?
      I am not well versed in the etiquette of newsgroups.
      Also, I should have said that the boundary between the first (text) portion
      of the string is sometimes only 1 space (unforgivable I know) and I have
      been trying to use an alpha vs. numeric comparison to do the first split.



      • John Dunlop

        #4
        Re: simple regex problem

        Chamomile wrote:
        > Yes, I suppose I should have said 'kind' not 'type'.

        I wasn't nitpicking your word choice. Sorry if it came across as if
        I was -- my phraseology was obviously poor in that case. I was
        trying to get a better idea as to how you wanted to split the string.
        > 'multiposting' (by which I assume you mean also posting on alt.php) is a
        > bad thing then?

        Yes. Most definitely.

        ("Multiposting" is posting the same article separately to multiple
        newsgroups; "crossposting" is simultaneously sending a *single*
        article to multiple newsgroups.)

        If someone were to followup to your article in alt.php, there'd be
        two threads discussing exactly the same subject, which'd pointlessly
        cover the same ground. It so happens that most folks read both
        groups, but the possibility remains.

        If you want to post the same article to different newsgroups,
        crosspost, and seriously consider setting followups to the group
        where the discussion is most topical. It's usually unnecessary to
        even crosspost though. Some groups condemn crossposting; some
        moderated groups even prevent crossposting.


        > Also, I should have said that the boundary between the first (text) portion
        > of the string is sometimes only 1 space (unforgivable I know)

        That makes it slightly more tricky.
        > and I have been trying to use an alpha vs. numeric comparison to do the
        > first split.

        Right. That's probably the best way, unless there are other
        constraints you're hiding. ;-)

        Consider:

        preg_split('`\s+(?=\d)`',$string)

        That returns an array containing substrings of $string split along
        boundaries of one or more whitespace characters that are followed by
        a decimal digit.
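
Applied to the two sample strings from the original post, that might look like the sketch below. The helper name split_row() is hypothetical, and the trim() call is an added assumption to drop the leading space in $str1.

```php
<?php
// split_row() is a hypothetical helper name, not a PHP built-in.
function split_row($string)
{
    // trim() removes the leading space seen in $str1 before splitting.
    // The pattern splits on whitespace that is followed by a digit.
    return preg_split('`\s+(?=\d)`', trim($string));
}

$str1 = ' Large ladies hats 1.365 0.334';
$str2 = 'Pins 0.335 0.22';

// Unpack the three pieces into separate variables.
list($textStr, $number1, $number2) = split_row($str1);
// $textStr is 'Large ladies hats', $number1 is '1.365', $number2 is '0.334'

// Or keep them as array members.
$pieces = split_row($str2);
// $pieces is array('Pins', '0.335', '0.22')
```

One caveat: if the text part itself contains a word that starts with a digit (say, "Size 10 hats"), this pattern would split there as well, so it relies on digits appearing only in the two trailing numbers.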

        --
        Jock

        • Chamomile

          #5
          Re: simple regex problem

          I take your points about cross and multiposting.
          > > Also, I should have said that the boundary between the first (text) portion
          > > of the string is sometimes only 1 space (unforgivable I know)
          >
          > That makes it slightly more tricky.
          >
          > > and I have been trying to use an alpha vs. numeric comparison to do the
          > > first split.
          >
          > Right. That's probably the best way, unless there are other
          > constraints you're hiding. ;-)

          no, I'm finding this difficult enough..
          > Consider:
          >
          > preg_split('`\s+(?=\d)`',$string)
          >
          > That returns an array containing substrings of $string split along
          > boundaries of one or more whitespace characters that are followed by
          > a decimal digit.

          Yes, that works! I'll use that solution to try to reverse engineer the
          preg_split() thing and so gain some insight into how it works. I seem to
          find regex stuff incredibly difficult - I hope it's just lack of practice,
          not an incurable brain deficit.
          Thanks for your help.
          mjg


          • John Dunlop

            #6
            Re: simple regex problem

            Chamomile wrote:
            > [John Dunlop wrote:]
            >
            > > preg_split('`\s+(?=\d)`',$string)
            > >
            > > That returns an array containing substrings of $string split along
            > > boundaries of one or more whitespace characters that are followed by
            > > a decimal digit.
            >
            > Yes, that works! I'll use that solution to try to reverse engineer the
            > preg_split() thing and so gain some insight into how it works.

            The pattern itself isn't too complicated:

            `\s+(?=\d)`

            Firstly, the "\s" is a character type, which stands for any
            whitespace character. The quantifier tells how many times that type
            is allowed: "+" means one or more times.

            The "(?=" starts a zero-width positive look-ahead assertion. A look-
            ahead assertion looks at the characters following the current
            character in the string, but does not "consume" them. So the pattern
            is looking ahead of the last whitespace character. The "\d" is
            another character type, this time meaning any decimal digit (the
            character class [0-9]).

            The details are covered in the Manual.


            preg_split() simply uses that pattern to return an array of substrings
            from the original string. The original string is split up along
            matches of the pattern, which are only ever whitespace characters
            since no decimal digits are taken up by the pattern.
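
To see what "not consuming" the digit means in practice, here is a small comparison sketch. The second pattern, without the look-ahead, is only there for contrast and is not something suggested earlier in the thread.

```php
<?php
$string = 'Pins 0.335 0.22';

// With the look-ahead, only the whitespace is matched and removed.
$withLookahead = preg_split('`\s+(?=\d)`', $string);
// $withLookahead is array('Pins', '0.335', '0.22')

// Without it, the digit is part of the match, so it is removed too.
$withoutLookahead = preg_split('`\s+\d`', $string);
// $withoutLookahead is array('Pins', '.335', '.22')

print_r($withLookahead);
print_r($withoutLookahead);
```

The leading "0" of each number survives only in the first result, because the look-ahead checks for the digit without making it part of the delimiter.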



            --
            Jock
