returning regex matches as lists

  • Jonathan Lukens

    returning regex matches as lists

    I am in the last phase of building a Django app based on something I
    wrote in Java a while back. Right now I am stuck on how to return the
    matches of a regular expression as a list *at all*, and in particular
    given that the regex has a number of groupings. The only method I've
    seen that returns a list is .findall(string), but then I get back the
    groups as tuples, which is sort of a problem.

    Thank you,
    Jonathan
  • John Machin

    #2
    Re: returning regex matches as lists

    On Feb 16, 6:07 am, Jonathan Lukens <jonathan.luk...@gmail.com> wrote:
    > I am in the last phase of building a Django app based on something I
    > wrote in Java a while back. Right now I am stuck on how to return the
    > matches of a regular expression as a list *at all*, and in particular
    > given that the regex has a number of groupings. The only method I've
    > seen that returns a list is .findall(string), but then I get back the
    > groups as tuples, which is sort of a problem.
    >
    It would help if you explained what you want the contents of the list
    to be, why you want a list as opposed to a tuple or a generator or
    whatever ... we can't be expected to imagine why getting groups as
    tuples is "sort of a problem".

    Use a concrete example, e.g.
    >>> import re
    >>> regex = re.compile(r'(\w+)\s+(\d+)')
    >>> text = 'python 1 junk xyzzy 42 java 666'
    >>> r = regex.findall(text)
    >>> r
    [('python', '1'), ('xyzzy', '42'), ('java', '666')]
    >>>
    What would you like to see instead?
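If the goal is simply lists instead of tuples, one possible answer is to convert what findall returns. A minimal sketch, assuming Python 3 and reusing the regex and text from the example above; the names as_lists and flat are only illustrative:

```python
import re

regex = re.compile(r'(\w+)\s+(\d+)')
text = 'python 1 junk xyzzy 42 java 666'

# findall returns one tuple of groups per match
pairs = regex.findall(text)

# Option 1: a list of lists, one inner list per match
as_lists = [list(pair) for pair in pairs]

# Option 2: a single flat list of every captured group, in order
flat = [item for pair in pairs for item in pair]

print(as_lists)
print(flat)
```

Which shape is appropriate depends on what the calling code expects, which is the question being asked here.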


    • Gabriel Genellina

      #3
      Re: returning regex matches as lists

      On Fri, 15 Feb 2008 17:07:21 -0200, Jonathan Lukens
      <jonathan.lukens@gmail.com> wrote:
      > I am in the last phase of building a Django app based on something I
      > wrote in Java a while back. Right now I am stuck on how to return the
      > matches of a regular expression as a list *at all*, and in particular
      > given that the regex has a number of groupings. The only method I've
      > seen that returns a list is .findall(string), but then I get back the
      > groups as tuples, which is sort of a problem.
      Do you want something like this?

      py> re.findall(r"([a-z]+)([0-9]+)", "foo bar3 w000 no abc123")
      [('bar', '3'), ('w', '000'), ('abc', '123')]
      py> re.findall(r"(([a-z]+)([0-9]+))", "foo bar3 w000 no abc123")
      [('bar3', 'bar', '3'), ('w000', 'w', '000'), ('abc123', 'abc', '123')]
      py> groups = re.findall(r"(([a-z]+)([0-9]+))", "foo bar3 w000 no abc123")
      py> groups
      [('bar3', 'bar', '3'), ('w000', 'w', '000'), ('abc123', 'abc', '123')]
      py> [group[0] for group in groups]
      ['bar3', 'w000', 'abc123']
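An alternative sketch, assuming only the whole matches are wanted: when the pattern contains no capturing groups, findall returns a plain list of strings, so the extra outer parentheses and the list comprehension are not needed. The name whole_matches is illustrative; (?:...) is the standard non-capturing group syntax.

```python
import re

text = "foo bar3 w000 no abc123"

# (?:...) groups without capturing, so findall has no groups to report
# and falls back to returning each whole match as a string
whole_matches = re.findall(r"(?:[a-z]+)(?:[0-9]+)", text)

print(whole_matches)
```

Here the non-capturing groups are not strictly required, since r"[a-z]+[0-9]+" behaves the same way; they matter once a group is needed for alternation or repetition.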

      --
      Gabriel Genellina


      • Jonathan Lukens

        #4
        Re: returning regex matches as lists

        John,
        > (1) raw string for improved legibility
        > ru'(?u)\b([á-ñ]{2,}\s+)([<<"][Á-Ñá-ñ]+)(\s*-?[Á-Ñá-ñ]+)*([>>"])'
        This actually escaped my notice after I had posted -- the letters with
        diacritics are incorrectly decoded Cyrillic letters -- I suppose I
        could use the Unicode escape sequences (the sets [á-ñ] and [Á-Ñá-ñ] are
        the Cyrillic equivalents of [a-z] and [A-Za-z]) but then suddenly the
        legibility goes out the window again.
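One way to use escape sequences without losing legibility is to name the ranges once and build the pattern from them. This is only a sketch under assumptions, not the original regex: it assumes Python 3, assumes the Cyrillic sets are U+0430 to U+044F and U+0410 to U+042F, and assumes the angled quotation marks are U+00AB and U+00BB. The constant names and the sample text are hypothetical.

```python
import re

# Assumed ranges; the letter U+0451 (yo) falls outside them
LOWER = "\u0430-\u044f"    # lowercase Cyrillic a..ya
UPPER = "\u0410-\u042f"    # uppercase Cyrillic A..YA
OPEN_Q = "\u00ab\""        # left angled quotation mark or a plain double quote
CLOSE_Q = "\u00bb\""       # right angled quotation mark or a plain double quote

pattern = re.compile(
    rf"\b([{LOWER}]{{2,}}\s+)"          # a lowercase word and trailing whitespace
    rf"([{OPEN_Q}][{UPPER}{LOWER}]+)"   # opening quote and the first quoted word
    rf"(\s*-?[{UPPER}{LOWER}]+)*"       # further words, optionally hyphenated
    rf"([{CLOSE_Q}])"                   # closing quote
)

sample = "газета «Правда»"
m = pattern.search(sample)
print(m.group(0) if m else None)
```

In Python 3, str patterns match Unicode by default, so the (?u) flag is no longer needed.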
        > (3) what appears between [] is a set of characters, so [<<"] is the
        > same as [<"] and probably isn't doing what you expect; have you tested
        > this regex for correctness?
        These were angled quotation marks in the original Unicode. Sorry
        again. The regex matches everything it is supposed to. The extra
        parentheses were there because I had somehow missed the .group method,
        and it had been returning only what was in the one needed set of
        parentheses.
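For reference, a minimal sketch of the .group method used with finditer, assuming Python 3. The pattern and text are borrowed from John's earlier example rather than from the real application, and the variable names are illustrative.

```python
import re

pattern = re.compile(r"(\w+)\s+(\d+)")
text = "python 1 junk xyzzy 42 java 666"

# finditer yields match objects; group(0) is the whole match,
# group(n) is the n-th parenthesized group
whole_matches = [m.group(0) for m in pattern.finditer(text)]
numbers_only = [m.group(2) for m in pattern.finditer(text)]

print(whole_matches)
print(numbers_only)
```

Because group(0) always gives the full match, no extra outer parentheses are needed in the pattern.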
        > I can't imagine how "not a programmer" implies "interested to know if
        > there is a more elegant way".
        More carefully stated: "I am self-taught, have no real training or
        experience as a programmer, and would be interested in seeing how a
        programmer with training and experience would go about this."

        Thank you,
        Jonathan
