pyparsing

  • Boštjan Jerko

    pyparsing

    Hello !

    I am trying to understand pyparsing. Here is a little test program
    to check the Optional class:

    from pyparsing import Word,nums,Literal,Optional

    lbrack=Literal("[").suppress()
    rbrack=Literal("]").suppress()
    ddot=Literal(":").suppress()
    start = Word(nums+".")
    step = Word(nums+".")
    end = Word(nums+".")

    sequence=lbrack+start+Optional(ddot+step)+ddot+end+rbrack

    tokens = sequence.parseString("[0:0.1:1]")
    print tokens

    tokens1 = sequence.parseString("[1:2]")
    print tokens1

    The first parse (tokens) works, but the second string ("[1:2]")
    raises an error. I don't get it. I used Optional for ddot and
    step, so I expected them to be optional.

    Any hints what I am doing wrong?

    The versions are pyparsing 1.1.2 and Python 2.3.3.

    Thanks,

    B.
  • Daniel 'Dang' Griffith

    #2
    Re: pyparsing

    On Thu, 13 May 2004 08:05:32 +0200, bostjan.jerko@mf.uni-lj.si
    (Boštjan Jerko) wrote:

    >sequence=lbrack+start+Optional(ddot+step)+ddot+end+rbrack
    >[...]
    >The first parse (tokens) works, but the second string ("[1:2]")
    >raises an error. I don't get it. I used Optional for ddot and
    >step, so I expected them to be optional.
    >
    >Any hints what I am doing wrong?

    I don't see anything "obviously" wrong, but changing it as follows
    seems to resolve the problem (I added a few intermediate rules to
    make it more obvious):

    pref = lbrack + start
    midf = ddot + step
    suff = ddot + end + rbrack
    sequence = pref + midf + suff | pref + suff

    I've run into "this kind of thing" now and again, and have always
    been able to resolve it by reorganizing my rules.
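
A rough, runnable sketch of this reorganisation follows. It assumes a pyparsing release in which `parseString` and `asList` are still available; the `results` dictionary is only there for illustration and is not part of the original post.

```python
from pyparsing import Literal, Word, nums

lbrack = Literal("[").suppress()
rbrack = Literal("]").suppress()
ddot = Literal(":").suppress()
start = Word(nums + ".")
step = Word(nums + ".")
end = Word(nums + ".")

# Intermediate rules, as in the post above.
pref = lbrack + start
midf = ddot + step
suff = ddot + end + rbrack

# '+' binds tighter than '|', so this reads as
# (pref + midf + suff) | (pref + suff): try the three-number form first,
# then fall back to the two-number form.
sequence = pref + midf + suff | pref + suff

results = {}
for text in ("[0:0.1:1]", "[1:2]"):
    results[text] = sequence.parseString(text).asList()
    print(text, "->", results[text])
```

Both inputs parse here because each alternative is a complete pattern, so a failure in the first one restarts matching from the opening bracket.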

    --dang


    • Paul McGuire

      #3
      Re: pyparsing

      "Boštjan Jerko" <bostjan.jerko@mf.uni-lj.si> wrote in message
      news:87fza46evn.fsf@bostjan-pc.mf.uni-lj.si...
      > sequence=lbrack+start+Optional(ddot+step)+ddot+end+rbrack
      > [...]
      > The first parse (tokens) works, but the second string ("[1:2]")
      > raises an error. I don't get it. I used Optional for ddot and
      > step, so I expected them to be optional.
      >
      > Any hints what I am doing wrong?

      Bostjan -

      Here's how pyparsing is processing your input strings:

      [0:0.1:1]
      [ = lbrack
      0 = start
      :0.1 = ddot + step (Optional match)
      : = ddot
      1 = end
      ] = rbrack

      [1:2]
      [ = lbrack
      1 = start
      :2 = ddot + step (Optional match)
      ] = oops! expected ddot -> failure
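
The trace above can be reproduced with a short script. This is a minimal sketch, assuming a pyparsing release that still provides `parseString`; the names `number`, `first` and `failed` are illustrative and do not come from the thread.

```python
from pyparsing import Literal, Optional, ParseException, Word, nums

lbrack = Literal("[").suppress()
rbrack = Literal("]").suppress()
ddot = Literal(":").suppress()
number = Word(nums + ".")

# Same shape as the grammar in the question.
sequence = lbrack + number + Optional(ddot + number) + ddot + number + rbrack

# Three numbers: the Optional part consumes ":0.1", the rest matches ":1]".
first = sequence.parseString("[0:0.1:1]").asList()
print(first)  # ['0', '0.1', '1']

# Two numbers: the Optional part consumes ":2", and the mandatory ddot
# then finds "]" instead of ":". Once the Optional has matched, the
# sequence does not go back and retry with it skipped.
failed = False
try:
    sequence.parseString("[1:2]")
except ParseException as err:
    failed = True
    print("no match:", err)
```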


      Dang Griffith proposed one alternative construct, here's another, perhaps
      more explicit:
      lbrack + ( ( ddot + step + ddot + end ) | (ddot + end) ) + rbrack

      Note that the order of the inner construct is important, so as to not match
      ddot+end before trying ddot+step+ddot+end; '|' is a greedy matching
      operator, creating a MatchFirst object from pyparsing's class library. You
      could avoid this confusion by using '^', which generates an Or object:
      lbrack + ( (ddot + end) ^ ( ddot + step + ddot + end ) ) + rbrack
      This will evaluate both subconstructs, and choose the longer of the two.
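
The difference between the two operators can be checked with a small script. This is a sketch under a few assumptions: the grammar includes `start` after the opening bracket (a complete version needs it), the pyparsing release still offers `parseString`, and the helper `try_parse` and the names `short_first`, `long_first` and `longest` are made up for this illustration.

```python
from pyparsing import Literal, ParseException, Word, nums

lbrack = Literal("[").suppress()
rbrack = Literal("]").suppress()
ddot = Literal(":").suppress()
start = Word(nums + ".")
step = Word(nums + ".")
end = Word(nums + ".")


def try_parse(expr, text):
    """Return the token list, or None if the text does not match."""
    try:
        return expr.parseString(text).asList()
    except ParseException:
        return None


# '|' builds a MatchFirst: the first alternative that matches wins.
short_first = lbrack + start + ((ddot + end) | (ddot + step + ddot + end)) + rbrack
long_first = lbrack + start + ((ddot + step + ddot + end) | (ddot + end)) + rbrack

# '^' builds an Or: every alternative is tried and the longest match wins,
# so the order of the alternatives no longer matters.
longest = lbrack + start + ((ddot + end) ^ (ddot + step + ddot + end)) + rbrack

for name, expr in (("short_first", short_first),
                   ("long_first", long_first),
                   ("longest", longest)):
    for text in ("[0:0.1:1]", "[1:2]"):
        print(name, text, "->", try_parse(expr, text))
```

With the short alternative listed first, `|` accepts ":0.1" as `ddot + end` and the closing bracket then fails to match, so only the `long_first` and `longest` variants handle both inputs.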

      Or you can use another pyparsing helper, the delimited list
      lbrack + delimitedlist( Word(nums+"."), delim=":") + rbrack
      This implicitly suppresses delimiters, so that all you will get back are
      ["0","0.1","1"] in the first case and ["1","2"] in the second.
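
A minimal sketch of the delimited-list variant, using the helper's actual spelling `delimitedList`. It assumes a pyparsing release that still exports that name (newer releases also offer a PEP 8 style replacement and may mark the old one as deprecated); the `parsed` dictionary is illustrative only.

```python
from pyparsing import Literal, Word, delimitedList, nums

lbrack = Literal("[").suppress()
rbrack = Literal("]").suppress()

# One or more numbers separated by ':'; the delimiters are dropped
# from the results.
sequence = lbrack + delimitedList(Word(nums + "."), delim=":") + rbrack

parsed = {}
for text in ("[0:0.1:1]", "[1:2]", "[1:2:3:4]"):
    parsed[text] = sequence.parseString(text).asList()
    print(text, "->", parsed[text])
```

One design consequence worth noting: this grammar is looser than the original, since it accepts any number of items (the third input has four), so a length check on the result is needed if only two or three values are valid.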

      Happy pyparsing!
      -- Paul



      • Paul McGuire

        #4
        Re: pyparsing (errata)

        > Dang Griffith proposed one alternative construct, here's another, perhaps
        > more explicit:
        > lbrack + ( ( ddot + step + ddot + end ) | (ddot + end) ) + rbrack

        should be:
        lbrack + start + ( ( ddot + step + ddot + end ) | (ddot + end) ) + rbrack

        > Note that the order of the inner construct is important, so as to not match
        > ddot+end before trying ddot+step+ddot+end; '|' is a greedy matching
        > operator, creating a MatchFirst object from pyparsing's class library. You
        > could avoid this confusion by using '^', which generates an Or object:
        > lbrack + ( (ddot + end) ^ ( ddot + step + ddot + end ) ) + rbrack

        should be:
        lbrack + start + ( (ddot + end) ^ ( ddot + step + ddot + end ) ) + rbrack

        > This will evaluate both subconstructs, and choose the longer of the two.
        >
        > Or you can use another pyparsing helper, the delimited list
        > lbrack + delimitedlist( Word(nums+"."), delim=":") + rbrack

        at least this one is correct! No, wait, I mis-cased delimitedList!
        should be:
        lbrack + delimitedList( Word(nums+"."), delim=":") + rbrack

        > This implicitly suppresses delimiters, so that all you will get back are
        > ["0","0.1","1"] in the first case and ["1","2"] in the second.
        >
        > Happy pyparsing!
        > -- Paul

        Sorry for the sloppiness,
        -- Paul



        • Boštjan Jerko

          #5
          Re: pyparsing (errata)

          Paul,

          thanks for the explanation.

          Boštjan

          On Fri, 14 May 2004, ptmcg@austin.rr._bogus_.com spake:
          > [...]
          > Sorry for the sloppiness,
          > -- Paul
