Parsing - is this a sensible idea?

  • gw7rib@aol.com

    Parsing - is this a sensible idea?

    I have a program that needs to do a small amount of relatively simple
    parsing. The routines I've written work fine, but the code using them
    is a bit long-winded.

    I therefore had the idea of creating a class to do parsing. It could
    be used as follows:

    int a, n, x, y;
    Parser par;
    par << string;
    if (par >> "From" >> ' ' >> x >> ' ' >> "to" >> ' ' >> y) a = 1;
    else if (par >> "Number" >> ' ' >> n) a = 2;
    else a = 3;

    Then if string is "From 3 to 5" this will set a=1, x=3, y=5. If the
    string is "Number 2" this will set a=2 and n=2. If string is
    "Other" then a=3. For convenience, I'll assume that an input of "From
    4 other" is allowed to alter the value of x while returning a=3.

    I think I could write a class that would do this. It would need to
    keep track of whether the current parsing was succeeding and, if so,
    how far through the string it had got. It would need overloaded >>
    operators, obviously, some of them taking references. And it would
    need a conversion operator, which I think would need to be to void *,
    which would not only return whether the current parse had succeeded
    but would also reset the flag and counter ready for another attempt.

    So my questions are, is this a sensible thing to try to do, and are
    there any potential snags that I haven't spotted?

    Thanks.
    Paul.
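
For readers following the thread, the class described in this post could be sketched roughly as follows, assuming C++11. The class layout and the names `text_`, `pos_`, `ok_` and `classify` are hypothetical and do not come from the post. `explicit operator bool` stands in for the proposed `operator void*`, and `' '` is assumed to skip zero or more whitespace characters.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical sketch of the Parser described in the post; names are assumptions.
class Parser {
public:
    // Load a new string and start parsing from its beginning.
    Parser& operator<<(const std::string& s) {
        text_ = s;
        pos_ = 0;
        ok_ = true;
        return *this;
    }

    // Match a literal: succeeds only if the text at the cursor starts with lit.
    Parser& operator>>(const char* lit) {
        assert(lit != nullptr);
        if (!ok_) return *this;
        const std::size_t len = std::strlen(lit);
        if (text_.compare(pos_, len, lit) == 0) pos_ += len;
        else ok_ = false;
        return *this;
    }

    // Any char argument skips whitespace (assumption: zero or more characters).
    Parser& operator>>(char) {
        if (!ok_) return *this;
        while (pos_ < text_.size() &&
               std::isspace(static_cast<unsigned char>(text_[pos_]))) ++pos_;
        return *this;
    }

    // Read an unsigned decimal integer into out (overflow is not checked).
    Parser& operator>>(int& out) {
        if (!ok_) return *this;
        std::size_t p = pos_;
        int value = 0;
        while (p < text_.size() &&
               std::isdigit(static_cast<unsigned char>(text_[p]))) {
            value = value * 10 + (text_[p] - '0');
            ++p;
        }
        if (p == pos_) ok_ = false;  // no digits at the cursor
        else { out = value; pos_ = p; }
        return *this;
    }

    // Report whether the chain succeeded, then rewind for the next attempt.
    explicit operator bool() {
        const bool result = ok_;
        ok_ = true;
        pos_ = 0;
        return result;
    }

private:
    std::string text_;
    std::size_t pos_ = 0;
    bool ok_ = true;
};

// The example from the post, wrapped in a function that returns a.
int classify(const std::string& input, int& x, int& y, int& n) {
    Parser par;
    par << input;
    if (par >> "From" >> ' ' >> x >> ' ' >> "to" >> ' ' >> y) return 1;
    else if (par >> "Number" >> ' ' >> n) return 2;
    return 3;
}
```

Two snags show up in this sketch. A failed chain may already have written to x, as the post allows. Trailing input after a successful match is not checked, so "From 3 to 5 and more" would also give a = 1.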
  • Erik Wikström

    #2
    Re: Parsing - is this a sensible idea?

    On 2008-11-16 22:16, gw7rib@aol.com wrote:
    > So my questions are, is this a sensible thing to try to do, and are
    > there any potential snags that I haven't spotted?

    If you need to parse a lot you should probably try a tool like yacc or
    some other parser-generator. If you only need to be able to parse a very
    small grammar (and want a good exercise) you can try to write the state
    machine by hand.

    Your example looks like a runtime construct (though perhaps you can make
    it compile-time with some fancy template metaprogramming), which does
    not sound like a good idea to me.

    --
    Erik Wikström


    • gw7rib@aol.com

      #3
      Re: Parsing - is this a sensible idea?

      On 16 Nov, 21:42, Erik Wikström <Erik-wikst...@telia.com> wrote:
      > If you need to parse a lot you should probably try a tool like yacc or
      > some other parser-generator. If you only need to be able to parse a very
      > small grammar (and want a good exercise) you can try to write the state
      > machine by hand.

      I don't think I'm going to be doing that much parsing, though I'll
      bear that in mind if I do.

      > Your example looks like a runtime construct (though perhaps you can make
      > it compile-time with some fancy template metaprogramming), which does
      > not sound like a good idea to me.

      How my example works - par >> "text" will check to see whether the
      next bit of the string to be parsed contains the characters "text".
      par >> n will check to see if the next bit of the string is a number,
      and if so, set n to that number. par >> ' ' will skip whitespace. The
      routine doesn't build up a "template" of what the string is supposed
      to look like, it just checks each bit of it in turn, as I would have
      thought any parser needs to.

      Thanks for any further thoughts.
      Paul.


      • Joe Smith

        #4
        Re: Parsing - is this a sensible idea?


        Paul wrote:
        > How my example works - par >> "text" will check to see whether the
        > next bit of the string to be parsed contains the characters "text".
        > par >> n will check to see if the next bit of the string is a number,
        > and if so, set n to that number. par >> ' ' will skip whitespace. The
        > routine doesn't build up a "template" of what the string is supposed
        > to look like, it just checks each bit of it in turn, as I would have
        > thought any parser needs to.

        It is definitely possible.

        The only part of your design that sticks out as really weird is the
        side effects of the conversion operator. I would prefer to have the
        operator>> overloads return copies of the original with the changed
        member variables. If you use a reference-counting smart pointer for
        the string, your class would be no larger than four integers on most
        platforms (one for the pointer, one for its reference count, one for
        the position, and less than one for the flag). The cost of copying
        four integers is not terrible. If all the lines you want to parse are
        fairly short, as in your examples, you won't be making too many
        copies. This is likely a reasonable tradeoff for avoiding the magic
        in operator void*().

        In general, though, returning copies does not scale. On the other
        hand, your design has limited scalability too, as advanced parsing
        requires more sophisticated techniques. But considering your
        examples, it sounds like you don't need a powerful parser, just
        something to parse simple strings, so all this might be just fine
        for you.
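
The copy-returning variant suggested in this reply might look something like the sketch below, assuming C++11. `std::shared_ptr` is used as the reference-counting smart pointer, and the names `ParseState` and `classify_by_value` are hypothetical. Every `operator>>` is `const` and returns a modified copy, so the conversion to `bool` needs no side effects.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstring>
#include <memory>
#include <string>

// Hypothetical value-type variant: each operator>> returns a modified copy.
class ParseState {
public:
    explicit ParseState(const std::string& s)
        : text_(std::make_shared<std::string>(s)) {}

    // Match a literal at the cursor.
    ParseState operator>>(const char* lit) const {
        assert(lit != nullptr);
        ParseState next = *this;
        const std::size_t len = std::strlen(lit);
        if (ok_ && text_->compare(pos_, len, lit) == 0) next.pos_ += len;
        else next.ok_ = false;
        return next;
    }

    // Skip zero or more whitespace characters (an assumption, as in the thread
    // the exact rule for ' ' is not spelled out).
    ParseState operator>>(char) const {
        ParseState next = *this;
        while (next.ok_ && next.pos_ < text_->size() &&
               std::isspace(static_cast<unsigned char>((*text_)[next.pos_])))
            ++next.pos_;
        return next;
    }

    // Read an unsigned decimal integer into out (overflow is not checked).
    ParseState operator>>(int& out) const {
        ParseState next = *this;
        std::size_t p = pos_;
        int value = 0;
        while (ok_ && p < text_->size() &&
               std::isdigit(static_cast<unsigned char>((*text_)[p]))) {
            value = value * 10 + ((*text_)[p] - '0');
            ++p;
        }
        if (!ok_ || p == pos_) next.ok_ = false;
        else { out = value; next.pos_ = p; }
        return next;
    }

    // Plain query: nothing is reset here.
    explicit operator bool() const { return ok_; }

private:
    std::shared_ptr<const std::string> text_;  // shared, never copied per step
    std::size_t pos_ = 0;
    bool ok_ = true;
};

int classify_by_value(const std::string& input, int& x, int& y, int& n) {
    const ParseState par(input);  // never modified; each chain starts here
    if (par >> "From" >> ' ' >> x >> ' ' >> "to" >> ' ' >> y) return 1;
    if (par >> "Number" >> ' ' >> n) return 2;
    return 3;
}
```

Because `par` itself never changes, each `if` starts again from the beginning of the string without any reset step. The price is one small copy per `>>`.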


        • James Kanze

          #5
          Re: Parsing - is this a sensible idea?

          On Nov 16, 11:09 pm, gw7...@aol.com wrote:
          > How my example works - par >> "text" will check to see whether
          > the next bit of the string to be parsed contains the
          > characters "text".

          I think that that's what I really don't care for in it. One
          expects >> to read, not to check.

          What's wrong with just using boost::regex?

          --
          James Kanze (GABI Software) email:james.kanze@gmail.com
          Conseils en informatique orientée objet/
          Beratung in objektorientierter Datenverarbeitung
          9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
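
The regular-expression route suggested in this reply could be sketched as follows. To keep the example free of external dependencies it uses `std::regex` from C++11 in place of `boost::regex`; Boost.Regex provides `regex`, `smatch` and `regex_match` under the same names in namespace `boost`. The function name `classify_regex` and the two patterns are assumptions made for illustration.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Same three-way classification, using std::regex (swapped in for boost::regex).
int classify_regex(const std::string& input, int& x, int& y, int& n) {
    // Assumed patterns: at least one whitespace character between tokens.
    static const std::regex from_to(R"(From\s+(\d+)\s+to\s+(\d+))");
    static const std::regex number(R"(Number\s+(\d+))");

    std::smatch m;
    if (std::regex_match(input, m, from_to)) {
        assert(m.size() == 3);  // whole match plus two capture groups
        x = std::stoi(m[1].str());  // std::stoi throws if the value overflows
        y = std::stoi(m[2].str());
        return 1;
    }
    if (std::regex_match(input, m, number)) {
        n = std::stoi(m[1].str());
        return 2;
    }
    return 3;
}
```

Unlike the chained-operator design, `regex_match` requires the whole string to match, and the variables are written only after a complete match, so "From 4 other" leaves x untouched.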


          • Jerry Coffin

            #6
            Re: Parsing - is this a sensible idea?

            In article <52baba84-3b2a-40fc-b95b-e8d9e3660c5a@a17g2000prm.googlegroups.com>,
            gw7rib@aol.com says...
            > I have a program that needs to do a small amount of relatively simple
            > parsing. The routines I've written work fine, but the code using them
            > is a bit long-winded.
            >
            > I therefore had the idea of creating a class to do parsing. It could
            > be used as follows:

            Depending on what you're doing, I'd consider using a regular expression
            library such as boost::regex, or a template-based parser generator such
            as boost::Spirit 2.

            --
            Later,
            Jerry.

            The universe is a figment of its own imagination.


            • rwp

              #7
              Re: Parsing - is this a sensible idea?

              I wrote a class like that a few years ago and it turned out to be quite
              useful.

              Example code:

              string Part1, Part2, Key;
              parse_str(Line) >> Part1 >> "%" >> Key >> "%" >> Part2 >> "";
              if (Key.size() != 0) ...


              The class was modelled after the Rexx parse command so it uses some
              special strings like
              "." for word
              "10" to go to position 10 in the string
              "+10" to go 10 positions forward in the string
              "," to go to the next line in the string etc.

              The construction of the class is as follows:

              // Constructor that just saves the string variable internally
              parse_str(const string& in_s) ...

              // Method that picks up an integer variable to assign a value to and
              // returns the object so the chain of >> operators can continue
              parse_str& operator>>(int& ival)
              {
                  wordstep();
                  (this->*m_try_assign)();
                  m_pvar = (void*)&ival;
                  m_try_assign = &parse_str::try_assign_int;
                  m_wordmatch = 1;
                  return *this;
              }

              // Method that recognizes special strings and search items
              parse_str& operator>>(const char* in_psz) ...

              // Method that converts a part of the parse string to an integer
              int try_assign_int() ...

              // Variables
              void* m_pvar;        // pointer to the variable to set a value to
              const string m_str;  // string passed in as argument to the constructor
              int (t_parse_string::* m_try_assign)(void);  // pointer to the method that assigns the variable
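
The fragments above suggest a deferred-assignment scheme: `operator>>` on a variable only records where the value should go, and the value is filled in once a later `>>` shows where that field ends. A much-reduced sketch of that idea follows. It is not the class from this post: it handles only literal delimiters and the empty string (taken here to mean "rest of the string"), it omits the Rexx-style special strings and word stepping, and the members `m_pos`, `m_assign`, `assign_string` and `assign_int` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>

// Hypothetical, simplified sketch of deferred assignment.
class parse_str {
public:
    explicit parse_str(const std::string& s) : m_str(s) {}

    // Remember a string target; it is filled when the next delimiter is seen.
    parse_str& operator>>(std::string& sval) {
        m_pvar = &sval;
        m_assign = &parse_str::assign_string;
        return *this;
    }

    // Remember an int target in the same way.
    parse_str& operator>>(int& ival) {
        m_pvar = &ival;
        m_assign = &parse_str::assign_int;
        return *this;
    }

    // A literal ends the pending field; "" means "take the rest of the string".
    parse_str& operator>>(const char* delim) {
        assert(delim != nullptr);
        const std::string d(delim);
        std::size_t end = d.empty() ? std::string::npos : m_str.find(d, m_pos);
        if (end == std::string::npos) end = m_str.size();  // delimiter not found
        if (m_assign) (this->*m_assign)(m_str.substr(m_pos, end - m_pos));
        m_assign = nullptr;
        m_pos = std::min(end + d.size(), m_str.size());
        return *this;
    }

private:
    void assign_string(const std::string& field) {
        *static_cast<std::string*>(m_pvar) = field;
    }
    void assign_int(const std::string& field) {
        // std::atoi yields 0 when the field is not numeric.
        *static_cast<int*>(m_pvar) = std::atoi(field.c_str());
    }

    const std::string m_str;   // text being parsed
    std::size_t m_pos = 0;     // cursor into m_str
    void* m_pvar = nullptr;    // pending target variable
    void (parse_str::*m_assign)(const std::string&) = nullptr;  // pending assigner
};
```

With this sketch, `parse_str("abc%key%rest") >> Part1 >> "%" >> Key >> "%" >> Part2 >> "";` fills the three strings in turn, and a line without "%" leaves Key empty. Two variables in a row with no literal between them are not supported here: the first target is simply overwritten.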





