Help with regexp please

  • Felix Collins


    Hi,
    I'm not a regexp expert and had a bit of trouble with the following
    search.

    I have an "outline number" system like

    1
    1.2
    1.2.3
    1.3
    2
    3
    3.1

    etc.

    I want to parse an outline number and return the parent.

    So for example...

    parent("1.2.3.4") returns "1.2.3"

    The only way I can figure is to do two searches feeding the output of
    the first into the input of the second.

    Here is the code fragment...

    import re

    m = re.compile(r'(\d+\.)+').match("1.2.3.4")
    n = re.compile(r'\d+(\.\d+)+').match(m.string[m.start():m.end()])
    parentoutlinenumber = n.string[n.start():n.end()]

    parentoutlinenumber
    1.2.3

    How do I get that into one regexp?

    Thanks for any help...

    Felix
  • Scott David Daniels

    #2
    Re: Help with regexp please

    Felix Collins wrote:
    > Hi,
    > I'm not a regexp expert and had a bit of trouble with the following search.
    > I have an "outline number" system like
    > 1
    > 1.2
    > 1.2.3
    > I want to parse an outline number and return the parent.

    Seems to me regex is not the way to go:
    def parent(string):
        return string[:string.rindex('.')]
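A quick check of the slice-based approach, as a sketch only: the name `parent_rindex` and the sample values are made up for illustration. Note that `str.rindex` raises `ValueError` when the string contains no '.', so a top-level number such as "3" needs separate handling.

```python
def parent_rindex(number):
    """Return everything before the last '.' in an outline number."""
    # str.rindex raises ValueError if '.' does not occur in the string.
    return number[:number.rindex('.')]


print(parent_rindex("1.2.3.4"))  # 1.2.3
print(parent_rindex("3.1"))      # 3

try:
    parent_rindex("3")
except ValueError:
    print("top-level outline number has no parent")
```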


    • Christopher Subich

      #3
      Re: Help with regexp please

      Scott David Daniels wrote:
      > Felix Collins wrote:
      >> I have an "outline number" system like
      >> 1
      >> 1.2
      >> 1.2.3
      >> I want to parse an outline number and return the parent.
      >
      > Seems to me regex is not the way to go:
      > def parent(string):
      >     return string[:string.rindex('.')]

      Absolutely, regex is the wrong solution for this problem. I'd suggest
      using rsplit, though, since that will Do The Right Thing when a
      top-level outline number is passed:
      def parent(string):
          return string.rsplit('.', 1)[0]

      Your solution will throw an exception (ValueError) for a top-level
      number, which may or may not be the right behaviour.
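To make the top-level behaviour concrete, here is a minimal sketch; the name `parent_rsplit` and the inputs are illustrative assumptions. With `maxsplit=1`, `str.rsplit` splits once at the last '.', and if there is no '.' it returns a one-element list holding the original string.

```python
def parent_rsplit(number):
    """Return the parent outline number, or the number itself if top-level."""
    # Split once, from the right; element 0 is everything before the last '.'.
    return number.rsplit('.', 1)[0]


print(parent_rsplit("1.2.3.4"))  # 1.2.3
print(parent_rsplit("3.1"))      # 3
print(parent_rsplit("3"))        # 3
```

A top-level number comes back unchanged, so a caller that needs to detect "no parent" would have to compare the result with the input.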


      • Felix Collins

        #4
        Re: Help with regexp please

        Christopher Subich wrote:
        > Scott David Daniels wrote:
        Thanks to you both. Wow! what a quick response!

        >string.rsplit('.',1)[0]

        Clever Python! ;-)


        Sorry, I mainly code in C so I'm not very Pythonic in my thinking.
        Thanks again...

        Felix


        • Terry Hancock

          #5
          Re: Help with regexp please

          On Thursday 21 July 2005 11:39 pm, Felix Collins wrote:
          > Christopher Subich wrote:
          > > Scott David Daniels wrote:
          > Thanks to you both. Wow! what a quick response!
          > >string.rsplit('.',1)[0]
          > Clever Python! ;-)
          > Sorry, I mainly code in C so I'm not very Pythonic in my thinking.
          > Thanks again...

          I think this is the "regexes can't count" problem. When the repetition
          count matters, you usually need something else. Usually some
          combination of string and list methods will do the trick, as here.

          --
          Terry Hancock ( hancock at anansispaceworks.com )
          Anansi Spaceworks http://www.anansispaceworks.com


          • Christopher Subich

            #6
            Re: Help with regexp please

            Terry Hancock wrote:
            > I think this is the "regexes can't count" problem. When the repetition
            > count matters, you usually need something else. Usually some
            > combination of string and list methods will do the trick, as here.

            Not exactly, regexes are just fine at doing things like "first" and
            "last." The "regexes can't count" saying applies mostly to activities
            that reduce to parentheses matching at arbitrary nesting.
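As an illustration of that distinction, here is a small sketch (the function name `is_balanced` is invented for this example). Checking parentheses nested to arbitrary depth needs a running counter, which a classical regular expression cannot keep, whereas "everything before the last dot" needs no counting at all.

```python
def is_balanced(text):
    """Return True if every '(' in text is closed by a matching ')'."""
    depth = 0
    for ch in text:
        if ch == '(':
            depth += 1
        elif ch == ')':
            depth -= 1
            if depth < 0:  # a ')' appeared before its '('
                return False
    return depth == 0


print(is_balanced("(a (b) (c (d)))"))  # True
print(is_balanced("(a (b)"))           # False
```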

            The OP's problem could easily be written as a regex substitution, it's
            just that there's no need to; I believe that the sub would be
            (completely untested, and I'm probably going to use the wrong call to
            re.sub anyway since I don't have the docs open):

            re.sub(outline_value,'([0-9.]+)\.[0-9]+','\1')

            It's just that the string.rsplit call is much more legible, much more
            intuitive, doesn't do strange things if it's accidentally called on a
            top-level outline value, and also extends immediately to handle
            outlines of the form I.1.a.i.
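For reference, `re.sub` takes its arguments in the order pattern, replacement, string, so the untested call quoted in this post would need reordering, and raw strings are needed so that `\1` is read as a group reference. The sketch below is one way to write it; the function names `parent_regex` and `parent_rsplit` are assumptions for illustration, and the comparison on "I.1.a.i" shows the extensibility point made above.

```python
import re


def parent_regex(outline_value):
    """Strip the last '.N' component with a single regex substitution."""
    # Argument order is (pattern, replacement, string). The greedy group
    # backtracks until a final '.' followed by digits remains to be matched.
    return re.sub(r'([0-9.]+)\.[0-9]+', r'\1', outline_value)


def parent_rsplit(outline_value):
    """Same idea using a single right-hand split."""
    return outline_value.rsplit('.', 1)[0]


print(parent_regex("1.2.3.4"))   # 1.2.3
print(parent_rsplit("1.2.3.4"))  # 1.2.3

# The digit-only pattern finds nothing to replace in a mixed outline,
# so the regex version returns it unchanged; rsplit still works.
print(parent_regex("I.1.a.i"))   # I.1.a.i
print(parent_rsplit("I.1.a.i"))  # I.1.a
```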
