regex + string formatting?

  • Patrick C
    New Member
    • Apr 2007
    • 54

    regex + string formatting?

    Can I regex (don't know if that's truly a verb, but I digress...) in string functions such as split, replace or strip?

    What I want to do is work with a file that has lines like

    asd5Jkl;lk
    132bn8K;lk

    I want to split them wherever a number is followed by a capital letter.

    I immediately thought of using a regex with something like pattern = "[0-9][A-Z]"

    Ideally I'd split between the number and the letter, say with a '\n', so
    [0-9][A-Z] would become [0-9] \n [A-Z]

    Any thoughts on this?
  • bvdet
    Recognized Expert Specialist
    • Oct 2006
    • 2851

    #2
    The following slices the string at the end of each match and rejoins the two pieces with '\n', repeating until no match is left.
    [code=Python]
    import re

    def split_on_re(s):
        # Insert a newline between each digit and the capital letter after it.
        # The lookahead (?=[A-Z]) checks for the capital without consuming it,
        # so m.end() is the position right after the digit.
        patt = re.compile(r'([0-9])(?=[A-Z])')
        while True:
            m = patt.search(s)
            if m:
                s = '\n'.join([s[:m.end()], s[m.end():]])
            else:
                break
        return s

    print(repr(split_on_re('asd5Jkl;lk132bn8K;lk8J')))

    >>> 'asd5\nJkl;lk132bn8\nK;lk8\nJ'
    [/code]
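As for the original question about using a regex with replace: the str methods only take literal strings, but the re module has its own sub(), which should give the same result in a single call. A minimal sketch, assuming the pattern from the question with each character class captured; the helper name break_on_digit_capital is made up for illustration.

```python
import re

# The pattern from the question, with each class captured so that
# both characters can be written back around the inserted newline.
DIGIT_THEN_CAPITAL = re.compile(r'([0-9])([A-Z])')

def break_on_digit_capital(s):
    # Hypothetical helper: \1 is the digit, \2 the capital letter.
    return DIGIT_THEN_CAPITAL.sub(r'\1\n\2', s)

print(repr(break_on_digit_capital('asd5Jkl;lk132bn8K;lk8J')))
# prints 'asd5\nJkl;lk132bn8\nK;lk8\nJ'
```

The loop version above and this one should agree on any input, since a digit followed by a capital can never overlap the next such pair.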


    • ghostdog74
      Recognized Expert Contributor
      • Apr 2006
      • 511

      #3
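      Code:
      import sys

      for line in open("file"):
          for n, ch in enumerate(line):
              # Look one character ahead; the bounds check avoids an IndexError
              # when the last line ends in a digit with no trailing newline.
              if ch.isdigit() and n + 1 < len(line) and line[n + 1].isupper():
                  sys.stdout.write(ch + "\n")
              else:
                  sys.stdout.write(ch)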
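If a list of pieces is wanted rather than text with newlines inserted, re.split is the regex counterpart of str.split. A rough sketch, assuming Python 3.7 or later (earlier versions do not split on a pattern that matches an empty string); split_pieces is a made-up name and the two sample lines are the ones from the question.

```python
import re

# Zero-width pattern: matches the gap that has a digit just before it
# and a capital letter just after it, consuming neither character.
BOUNDARY = re.compile(r'(?<=[0-9])(?=[A-Z])')

def split_pieces(line):
    # Hypothetical helper; relies on Python 3.7+ empty-match splitting.
    return BOUNDARY.split(line)

for line in ['asd5Jkl;lk', '132bn8K;lk']:
    print(split_pieces(line))
# prints ['asd5', 'Jkl;lk'] and then ['132bn8', 'K;lk']
```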
