recursion in grammar?

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • Peri

    recursion in grammar?

    I'm trying to create Python parser/interpreter using ANTLR.
    Reading grammar from language refference I found:
    or_expr::= xor_expr | or_expr "|" xor_expr

    For me it looks like infinite recursion. And so it says ANTLR. Maybe I
    don't understand EBNF notation. For me it should look like this.
    or_expr::= xor_expr | xor_expr "|" xor_expr

    and in ANTLR grammar file like this:

    or_expr: xor_expr { "\|" xor_expr }, or rather
    or_expr: xor_expr ( "\|" xor_expr )*

    Do I think good?
    ANyone heard of Python grammar for ANTLR?
  • Michael Hudson

    #2
    Re: recursion in grammar?

    peri12345@poczt a.onet.pl (Peri) writes:
    [color=blue]
    > I'm trying to create Python parser/interpreter using ANTLR.
    > Reading grammar from language refference I found:
    > or_expr::= xor_expr | or_expr "|" xor_expr
    >
    > For me it looks like infinite recursion.[/color]

    Isn't this just left recursion (a standard LL(1) trick)?
    [color=blue]
    > And so it says ANTLR. Maybe I don't understand EBNF notation. For
    > me it should look like this. or_expr::= xor_expr | xor_expr "|"
    > xor_expr[/color]

    That wouldn't let you write "1 | 2 | 4", would it?

    It's not really different from recursion in programming languages (or
    mathematical induction) -- the Python grammar is a finite way of
    describing sequences of tokens of arbitrary length. You have to have
    *some* trick like recursion in there for this to work.

    I can't help you with ANTLR.

    Cheers,
    mwh


    --
    I would hereby duly point you at the website for the current pedal
    powered submarine world underwater speed record, except I've lost
    the URL. -- Callas, cam.misc

    Comment

    • Terry Reedy

      #3
      Re: recursion in grammar?


      "Peri" <peri12345@pocz ta.onet.pl> wrote in message
      news:34ea966a.0 312270509.5ecc2 095@posting.goo gle.com...[color=blue]
      > I'm trying to create Python parser/interpreter using ANTLR.
      > Reading grammar from language refference I found:
      > or_expr::= xor_expr | or_expr "|" xor_expr
      >
      > For me it looks like infinite recursion.[/color]

      Better to call it unbounded recursion. This is what makes general CF
      languages/grammars more extensive/powerful that regular languages.

      In particular, the recursion is written as left recursion.
      The equivalent right recursion would be
      or_expr::= xor_expr | xor_expr "|" or_expr
      The recursive 'call' is to the right of the connector literal instead of
      the left.

      There are two types of linear CF parsers: bottom-up LR and top-down LL.
      Each handles recursion written one way and chokes on recursion written the
      other way. The two types have opposite requirements. (I forget which is
      which.) Since Antler choked on left recursion, I presume that it is the
      'other' type than Python's and that it requires right rather than left
      recursion. So try reversing all the recursive definitions. (Or try first
      the one that gives first error message and see if you get farther.)

      Terry J. Reedy


      Comment

      • Stephen Horne

        #4
        Re: recursion in grammar?

        On Sat, 27 Dec 2003 13:04:15 -0500, "Terry Reedy" <tjreedy@udel.e du>
        wrote:
        [color=blue]
        >
        >"Peri" <peri12345@pocz ta.onet.pl> wrote in message
        >news:34ea966a. 0312270509.5ecc 2095@posting.go ogle.com...[color=green]
        >> I'm trying to create Python parser/interpreter using ANTLR.
        >> Reading grammar from language refference I found:
        >> or_expr::= xor_expr | or_expr "|" xor_expr
        >>
        >> For me it looks like infinite recursion.[/color]
        >
        >Better to call it unbounded recursion. This is what makes general CF
        >languages/grammars more extensive/powerful that regular languages.
        >
        >In particular, the recursion is written as left recursion.
        >The equivalent right recursion would be
        >or_expr::= xor_expr | xor_expr "|" or_expr
        >The recursive 'call' is to the right of the connector literal instead of
        >the left.
        >
        >There are two types of linear CF parsers: bottom-up LR and top-down LL.
        >Each handles recursion written one way and chokes on recursion written the
        >other way. The two types have opposite requirements. (I forget which is
        >which.) Since Antler choked on left recursion, I presume that it is the
        >'other' type than Python's and that it requires right rather than left
        >recursion. So try reversing all the recursive definitions. (Or try first
        >the one that gives first error message and see if you get farther.)[/color]

        Unless I am seriously mistaken, bottom-up LR parsers do not choke on
        either left or right recursion. Right recursion, IIRC, requires
        unbounded stack space in principle, but works OK in practice for the
        simple reason that the input always defines a finite bound. It is
        certainly less efficient, and carrys the risk that a complex input
        might overflow the available stack space, but those are rarely serious
        issues these days.

        If LR parsing could not handle both left and right recursion, it
        equally could not handle both left-associative and right-associative
        operators. Of course most real parsers have specialised support to
        optimise these common cases, but pure-form bottom-up LR parsing
        algorithms such as LR(1) and LALR(1) are perfectly capable of handling
        both left and right associative operators, and they do so using left
        and right recursion as appropriate.

        Which raises an important point - if you change the grammar to swap
        the recursions from left to right or whatever, that results in a major
        code-breaking change in how the parser interprets the language.
        Left-associative expressions will be parsed as right-associative and
        visa versa. Maybe worthwhile as a test to see what happens, but no
        good for a final parser that is intended to work on real Python code.
        Well, not without some post-processing of the resulting AST anyway.

        ANTLR definitely uses LL parsing. I don't know about Pythons parsing
        engine, though I suspect it uses LR-style parsing. It is a basic fact
        of life that LL parsers and LR parsers do have quite different
        limitations, and if Python uses LR parsing it will probably be
        impossible to build a parser for the (unmodified) Python grammar using
        an LL tool such as ANTLR. However, I find it hard to believe that
        ANTLR has no means of working around these issues with some relatively
        simple tweaks to the grammar specification.

        I would raise the question in comp.compilers, or check if ANTLR has a
        mailing list.


        --
        Steve Horne

        steve at ninereeds dot fsnet dot co dot uk

        Comment

        • Michael Hudson

          #5
          Re: recursion in grammar?

          Stephen Horne <steve@ninereed s.fsnet.co.uk> writes:
          [color=blue]
          > ANTLR definitely uses LL parsing. I don't know about Pythons parsing
          > engine, though I suspect it uses LR-style parsing.[/color]

          Nope, LL(1). One thing you should note is that the grammar in the
          docs is *not* the grammar used by Python's parser generator -- that's
          Grammar/Grammar in the source distribution. I'm not sure, but I
          suspect that the grammar in the docs is nastily ambiguous. Certainly
          the actual Python parser lets through some stuff that get's thrown out
          in the compiler with SyntaxErrors.

          Cheers,
          mwh

          --
          Good? Bad? Strap him into the IETF-approved witch-dunking
          apparatus immediately! -- NTK now, 21/07/2000

          Comment

          • logistix at cathoderaymission.net

            #6
            Re: recursion in grammar?

            Michael Hudson <mwh@python.net > wrote in message news:<m365fzsxt z.fsf@pc150.mat hs.bris.ac.uk>. ..[color=blue]
            > Stephen Horne <steve@ninereed s.fsnet.co.uk> writes:
            >[color=green]
            > > ANTLR definitely uses LL parsing. I don't know about Pythons parsing
            > > engine, though I suspect it uses LR-style parsing.[/color]
            >
            > Nope, LL(1). One thing you should note is that the grammar in the
            > docs is *not* the grammar used by Python's parser generator -- that's
            > Grammar/Grammar in the source distribution. I'm not sure, but I
            > suspect that the grammar in the docs is nastily ambiguous. Certainly
            > the actual Python parser lets through some stuff that get's thrown out
            > in the compiler with SyntaxErrors.
            >
            > Cheers,
            > mwh[/color]

            Even the file Grammar/Grammar isn't quite LL(1). It's close, but not
            quite. For example:

            comp_op: '<'|'>'|'=='|'> ='|'<='|'<>'|'! ='|'in'|'not' 'in'|'is'|'is'
            'not'

            Would never be able to get to the 'is' 'not' section with LL. I
            believe the parser generator straightens this out when it builds the
            DFA's.

            Comment

            Working...