Regular expression, (preg_split etc...), some help please.

  • Sims

    Regular expression, (preg_split etc...), some help please.

    Hi,

    I need some help splitting data with a regular expression.

    Consider the string

    '1,2,3'. I can split it using preg_split("/,/", '1,2,3') and I correctly
    get [0]=1, [1]=2, [2]=3.

    Now if I have

    '1,"2,3"' I could split it using preg_split("/(?<!\"),/\d", '1,"2,3"') and I
    correctly get [0]=1, [1]="2,3".

    But it clearly does not work in some more advanced cases, for example

    '1," 2 , 3"' or '1,"2 , 3 "', mainly because the \d is no longer useful.

    So how can I match a separator that is *not within* double quotes?

    I think I might have to write my own split function, especially if I have an
    extreme case like '1," 2 , \" 3"' (note the escaped quote).

    Many thanks for your input.

    Sims





  • CC Zona

    #2
    Re: Regular expression, (preg_split etc...), some help please.

    In article <c0f7r6$15t48v$1@ID-162430.news.uni-berlin.de>,
    "Sims" <siminfrance@hotmail.com> wrote:

    > Consider the string
    >
    > '1,2,3', I can split it using, preg_split("/,/", '1,2,3') and i correctly
    > get [0]=1, [1]=2,[2]=3.
    >
    > Now if i have
    >
    > '1,"2,3"' i could split it using preg_split("/(?<!\"),/\d", '1,"2,3"') and i
    > correctly get [0]=1, [1]="2,3".
    >
    > But it clearly does not work in some more advanced cases, for example
    >
    > '1," 2 , 3"' or '1,"2 , 3 "' mainly because the /d is no longer useful.

    Is the overall goal to extract numbers having arbitrary separators? If so,
    how about splitting on "/\D+/"?

    --
    CC


    • CC Zona

      #3
      Re: Regular expression, (preg_split etc...), some help please.

      In article <c0f7r6$15t48v$1@ID-162430.news.uni-berlin.de>,
      "Sims" <siminfrance@hotmail.com> wrote:

      > Consider the string
      >
      > '1,2,3', I can split it using, preg_split("/,/", '1,2,3') and i correctly
      > get [0]=1, [1]=2,[2]=3.
      >
      > Now if i have
      >
      > '1,"2,3"' i could split it using preg_split("/(?<!\"),/\d", '1,"2,3"') and i
      > correctly get [0]=1, [1]="2,3".
      >
      > But it clearly does not work in some more advanced cases, for example
      >
      > '1," 2 , 3"' or '1,"2 , 3 "' mainly because the /d is no longer useful.

      (Oops, never mind that last post.)

      --
      CC


      • Ewoud Dronkert

        #4
        Re: Regular expression, (preg_split etc...), some help please.

        On Thu, 12 Feb 2004 08:49:59 +0200, Sims wrote:
        > I need some help to split data using regular expression

        No you don't :) http://www.php.net/fgetcsv


        • Sims

          #5
          Re: Regular expression, (preg_split etc...), some help please.


          "Ewoud Dronkert" <me@privacy.net> wrote in message
          news:n3im20teqta7adlm9e3anpclg332a0jt33@4ax.com...
          > On Thu, 12 Feb 2004 08:49:59 +0200, Sims wrote:
          > > I need some help to split data using regular expression
          >
          > No you don't :) http://www.php.net/fgetcsv

          Sorry, but that reads from a file; I am reading from a string (a single line).
          I do want the same sort of output, just not from a file.

          I will have a look at the C code to see if I can create a PHP function to
          achieve what I need.

          Sims.
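
For reading one line from a string rather than a file: PHP versions from 5.3 onwards include str_getcsv(), which applies fgetcsv()-style parsing to a string. It postdates this thread, so treat the following as a minimal sketch for a current PHP install, not something that was available to the posters; the variable names are illustrative.

```php
<?php
// Sketch: parse one CSV line held in a string, no file involved.
// str_getcsv() is built in from PHP 5.3 onwards.
// Delimiter, enclosure and escape character are passed explicitly.

$simple = str_getcsv('1," 2 , 3"', ',', '"', '\\');
print_r($simple); // two fields: '1' and ' 2 , 3' (quotes removed, inner spaces kept)

// The "extreme" case from the first post, with a backslash-escaped quote.
// The escaped quote does not end the field, so this is still two fields.
$escaped = str_getcsv('1," 2 , \" 3"', ',', '"', '\\');
print_r($escaped);
```

How the backslash itself is reported in the second result is worth checking on the PHP version in use before relying on it.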



          • Rahul Anand

            #6
            Re: Regular expression, (preg_split etc...), some help please.

            "Sims" <siminfrance@hotmail.com> wrote in message news:<c0f7r6$15t48v$1@ID-162430.news.uni-berlin.de>...
            > Hi,
            >
            > I need some help to split data using regular expression
            >
            > Consider the string
            >
            > '1,2,3', I can split it using, preg_split("/,/", '1,2,3') and i correctly
            > get [0]=1, [1]=2,[2]=3.
            >
            > Now if i have
            >
            > '1,"2,3"' i could split it using preg_split("/(?<!\"),/\d", '1,"2,3"') and i
            > correctly get [0]=1, [1]="2,3".
            >
            > But it clearly does not work in some more advanced cases, for example
            >
            > '1," 2 , 3"' or '1,"2 , 3 "' mainly because the /d is no longer useful.
            >
            > So how can i search for a regular expression that is *not within*
            > apostrophes?
            >
            > I think i might have to write my own split function especially if i have an
            > extreme case like, '1," 2 , \" 3"', (note the escape apostrophe).
            >
            > Many thanks for you input.
            >
            > Sims

            My guess is the following RegExp.

            [SNIP]

            $str = '1," 2 ,\" 3 ", 4, 5';
            $arr = preg_match_all('#(?<=")\s*[\d]+.*(?=[^\\\]")|[\d]+#U', $str, $matches);
            print_r($matches);

            [/SNIP]

            Try and let me know if it does not work as per your requirements.

            --
            Cheers,
            Rahul Anand


            • Sims

              #7
              Re: Regular expression, (preg_split etc...), some help please.

              > My guess is following RegExp.
              >
              > [SNIP]
              >
              > $str = '1," 2 ,\" 3 ", 4, 5';
              > $arr = preg_match_all('#(?<=")\s*[\d]+.*(?=[^\\\]")|[\d]+#U', $str, $matches);
              > print_r($matches);
              >
              > [/SNIP]

              If it is a guess then it is brilliant, thanks :).

              It works 99.9% of the time.

              The only time it does not work is when you have something like $str =
              '1," 2 ,\" 3a", 4, 5'; (note the 'a' after the '3').
              Your RegExp drops the last character (the letter).
              Your expression is very complicated for me, so I am going to spend some
              time trying to work it all out.

              Thanks a million.



              • Ewoud Dronkert

                #8
                Re: Regular expression, (preg_split etc...), some help please.

                On Thu, 12 Feb 2004 11:42:08 +0200, Sims wrote:
                >> No you don't :) http://www.php.net/fgetcsv
                >
                > Sorry but that is reading from a file, i am reading from a line.

                So post the data to a script, open a filepointer to "php://input" and
                read from that. Or write data to diskfile and read from diskfile (duh).
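
Neither the POST round-trip nor a disk file is strictly required to reuse fgetcsv(): PHP's php://memory stream wrapper gives a file pointer backed by RAM. A minimal sketch, assuming PHP 7 or later for the type hints; csv_line_to_array() is a hypothetical helper name, not part of PHP or of this thread.

```php
<?php
// Sketch: feed a string to fgetcsv() through an in-memory stream.
// csv_line_to_array() is a hypothetical helper name.
function csv_line_to_array(string $line): array
{
    $handle = fopen('php://memory', 'r+'); // backed by RAM, never touches disk
    fwrite($handle, $line);
    rewind($handle);                       // back to the start before reading
    $fields = fgetcsv($handle, 0, ',', '"', '\\');
    fclose($handle);

    // fgetcsv() returns false when there is nothing to read.
    return $fields === false ? [] : $fields;
}

print_r(csv_line_to_array('1,"2 , 3 ",4')); // fields: '1', '2 , 3 ', '4'
```

This keeps fgetcsv()'s quoting rules while avoiding the file I/O that the objection further down the thread is about.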


                • Sims

                  #9
                  Re: Regular expression, (preg_split etc...), some help please.


                  "Ewoud Dronkert" <me@privacy.net> wrote in message
                  news:akln20p6p3900d6i70g1qelpgoc8bs362t@4ax.com...
                  > On Thu, 12 Feb 2004 11:42:08 +0200, Sims wrote:
                  > >> No you don't :) http://www.php.net/fgetcsv
                  > >
                  > > Sorry but that is reading from a file, i am reading from a line.
                  >
                  > So post the data to a script, open a filepointer to "php://input" and
                  > read from that. Or write data to diskfile and read from diskfile

                  Sorry, that is simply not good practice at all; in fact it is very bad
                  programming.

                  Reading and writing data to a file just to use a function? I would bring my
                  server to a standstill.
                  I think I would rather use a more realistic function like the one offered by
                  Rahul.

                  > (duh).

                  Thanks.

                  Sims



                  • Ewoud Dronkert

                    #10
                    Re: Regular expression, (preg_split etc...), some help please.

                    On Thu, 12 Feb 2004 21:59:30 +0200, Sims wrote:
                    > Sorry that is simply not good practice at all, in fact it is very bad
                    > programming.

                    Then provide some more details of the environment/circumstances/typical
                    situation. You gave none. For all we know, you were trying to convert a
                    small one-time output by your heartrate monitor or something.

                    Please don't put me down for failing prerequisites you did not mention.


                    • Sims

                      #11
                      Re: Regular expression, (preg_split etc...), some help please.

                      > Then provide some more details of the environment/circumstances/typical
                      > situation.

                      You are trying to dig yourself out. I gave a list of strings, what I
                      wanted to achieve with them, and the problems I was coming across.
                      My environment, circumstances and typical situation are not relevant in my particular case.

                      I wanted to get an array from a string; that is more than enough info.
                      Knowing that I have an Apache server on Win2003 is of no use to the
                      problem (to a knowledgeable programmer anyway).

                      > You gave none. For all we know, you were trying to convert a
                      > small one-time output by your heartrate monitor or something.

                      Still, creating a file and reading it is simply wrong, regardless of what I was
                      trying to achieve.

                      > Please don't put me down for failing prerequisites you did not mention.

                      Then don't quote Homer and read the OP properly.

                      Regards.

                      Sims



                      • John Dunlop

                        #12
                        Re: Regular expression, (preg_split etc...), some help please.

                        Sims wrote:
                        > Now if i have
                        >
                        > '1,"2,3"' i could split it using preg_split("/(?<!\"),/\d", '1,"2,3"')

                        There's a typo in there somewhere, I believe.

                        > and i correctly get [0]=1, [1]="2,3".
                        >
                        > But it clearly does not work in some more advanced cases, for example
                        >
                        > '1," 2 , 3"' or '1,"2 , 3 "' mainly because the /d is no longer useful.

                        This is very similar to -- and based on -- Rahul Anand's "guess". :-)

                        A jungle of assertions!:

                        preg_match_all(
                        '`(?<=").*?(?=(?<!\\\)")|\d+`s',
                        $string,
                        $array);

                        Describing it at a high-ish level, the pattern looks for either of
                        two alternatives: a quoted substring of zero or more characters, or
                        one or more decimal digits. A quoted substring, intuitively, begins
                        and ends with double-quotes.

                        Describing it at a much lower level, the pattern uses a positive
                        look-behind assertion to check whether a quoted substring follows,
                        i.e. whether a double-quote precedes the current matching point.
                        Everything until the closing double-quote is part of the quoted
                        substring. The closing double-quote is found using two more
                        assertions: firstly, a positive look-ahead assertion checks that the
                        next character is a double-quote; secondly, a negative look-behind
                        assertion checks that this double-quote is not preceded by a
                        backslash. (Three backslashes are needed because the pattern sits in
                        a single-quoted PHP string: PHP reduces them to two, which PCRE reads
                        as one literal backslash.) If no match can be found for a quoted
                        substring, one or more decimal digits are looked for, using the \d
                        character type.

                        The s pattern modifier means the dot metacharacter matches newlines,
                        which it doesn't by default.

                        All these grim details of the syntax are explained in the Manual's
                        section on PCRE pattern syntax.
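
The pattern described above can be tried as follows, here with the '3a' input that tripped up the earlier attempt. This is only a sketch, and the second call shows a limitation worth knowing about: unquoted fields are matched only by \d+, and the text between two adjacent quoted fields is itself picked up as if it were quoted.

```php
<?php
// The pattern from the post above, with the stray spaces removed.
// In a single-quoted PHP string, \\\) reaches PCRE as \\) :
// a literal backslash, then the closing parenthesis of the look-behind.
$pattern = '`(?<=").*?(?=(?<!\\\)")|\d+`s';

// The input with a letter after the 3: the quoted part survives intact.
preg_match_all($pattern, '1," 2 , \" 3a", 4, 5', $array);
print_r($array[0]); // '1', ' 2 , \" 3a', '4', '5'

// Two quoted fields in a row: the comma between them follows a
// double-quote and precedes one, so it is matched as well.
preg_match_all($pattern, '"a","b"', $adjacent);
print_r($adjacent[0]); // 'a', ',', 'b'
```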

                        --
                        Jock


                        • Sims

                          #13
                          Re: Regular expression, (preg_split etc...), some help please.

                          > > Now if i have
                          > >
                          > > '1,"2,3"' i could split it using preg_split("/(?<!\"),/\d", '1,"2,3"')
                          >
                          > There's a typo in there somewhere, I believe.

                          Yes indeed, sorry, I should have copied and pasted instead.

                          > > and i correctly get [0]=1, [1]="2,3".
                          > >
                          > > But it clearly does not work in some more advanced cases, for example
                          > >
                          > > '1," 2 , 3"' or '1,"2 , 3 "' mainly because the /d is no longer useful.
                          >
                          > This is very similar to -- and based on -- Rahul Anand's "guess". :-)

                          I could see that Rahul was not far from it.

                          > A jungle of assertions!:
                          >
                          > preg_match_all(
                          > '`(?<=").*?(?=(?<!\\\)")|\d+`s',
                          > $string,
                          > $array)

                          So using an example like

                          $string = 'a1a ,2b , 3c, " 4, \"aaa, 5", 7';
                          preg_match_all('`(?<=").*?(?=(?<!\\\)")|\d+`s', $string, $array);
                          print_r($array);

                          I get output like

                          Array ( [0] => Array ( [0] => 1 [1] => 2 [2] => 3 [3] => 4, \"aaa, 5 [4]
                          => 7 ) )

                          Why are some of the letters ignored?
                          > Describing it at a high-ish level, the pattern looks for either of
                          > two alternatives: zero or more quoted substrings, or one or more
                          > decimal digits. A quoted substring, intuitively, begins and ends
                          > with double-quotes.

                          What part looks for a decimal number?
                          I am afraid my description should not have included only numbers.
                          I want the RegEx to work for anything,

                          so that a case like

                          $string = 'xx,xx , xx," x, \"x, x", " x, x", xx, "xx", x';

                          would work regardless of whether 'x' represents a letter, a number or a symbol
                          (apart from x = " itself).
                          > All these grim details of the syntax are explained in the Manual's
                          > section on PCRE pattern syntax.

                          Once i have your and Rahul's example in front of me then i can try and work
                          it out but the number of assertions is simply mind boggling.
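
For the general case (any character for x, not only digits), a small hand-written scanner may be easier to follow than a stack of assertions. A minimal sketch, assuming the comma is the only separator, the double quote is the only quoting character, and a backslash inside quotes escapes the next character; split_quoted() is a hypothetical name, and whether to trim whitespace around fields is left open.

```php
<?php
// Sketch of a hand-written splitter; split_quoted() is a hypothetical name.
// Assumptions: "," separates fields, '"' quotes them, and a backslash
// inside quotes makes the next character literal (the backslash is dropped).
function split_quoted(string $line): array
{
    $fields   = [];
    $current  = '';
    $inQuotes = false;
    $length   = strlen($line);

    for ($i = 0; $i < $length; $i++) {
        $char = $line[$i];

        if ($inQuotes && $char === '\\' && $i + 1 < $length) {
            $current .= $line[++$i];  // keep the escaped character as-is
        } elseif ($char === '"') {
            $inQuotes = !$inQuotes;   // entering or leaving a quoted part
        } elseif ($char === ',' && !$inQuotes) {
            $fields[] = $current;     // an unquoted comma ends the field
            $current  = '';
        } else {
            $current .= $char;
        }
    }
    $fields[] = $current;             // the last field has no trailing comma

    return $fields;
}

print_r(split_quoted('xx,xx , xx," x, \"x, x", " x, x", xx, "xx", x'));
```

Whitespace outside the quotes is kept as part of the field; wrapping the result in array_map('trim', ...) removes it if that is unwanted.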
                          > --
                          > Jock

                          Many thanks

                          Sims



                          • Tom Thackrey

                            #14
                            Re: Regular expression, (preg_split etc...), some help please.


                            On 12-Feb-2004, "Sims" <siminfrance@hotmail.com> wrote:

                            > > My guess is following RegExp.
                            > >
                            > > [SNIP]
                            > >
                            > > $str = '1," 2 ,\" 3 ", 4, 5';
                            > > $arr = preg_match_all('#(?<=")\s*[\d]+.*(?=[^\\\]")|[\d]+#U', $str, $matches);
                            > > print_r($matches);
                            > >
                            > > [/SNIP]
                            >
                            > If it is a guess then it is brilliant, thanks : ).
                            >
                            > It works 99.9%.
                            >
                            > The only time it does not work is when you have something like... $str =
                            > '1," 2 ,\" 3a", 4, 5';, (Note the 'a' after the '3').
                            > Your RegExp removes the last item, (letter),
                            > Your expression is very very complicated for me so i am going to spend some
                            > time trying to work it all out.
                            >
                            > Thanks a million.

                            There are several classes for handling comma delimited stuff on
                            phpclasses.org, search on 'comma'. You should be able to hack one of them
                            into parsing your string.

                            --
                            Tom Thackrey

                            tom (at) creative (dash) light (dot) com
                            do NOT send email to jamesbutler@willglen.net (it's reserved for spammers)


                            • Ewoud Dronkert

                              #15
                              Re: Regular expression, (preg_split etc...), some help please.

                              On Thu, 12 Feb 2004 22:25:46 +0200, Sims wrote:
                              > You are trying to dig yourself out

                              I give up.
