how to split text into lines?

  • kj

    how to split text into lines?



    In Perl, one can break a chunk of text into an array of lines while
    preserving the trailing line-termination sequence in each line, if
    any, by splitting the text on the regular expression /^/:

    DB<1> x split(/^/, "foo\nbar\nbaz")
    0 'foo
    '
    1 'bar
    '
    2 'baz'

    But nothing like this seems to work in Python:
    >>> re.split('^', 'foo\nbar\nbaz')
    ['foo\nbar\nbaz']

    (One gets the same result if one adds the re.MULTILINE flag to the
    re.split call.)

    What's the Python idiom for splitting text into lines, preserving
    the end-of-line sequence in each line?
    --
    NOTE: In my address everything before the first period is backwards;
    and the last period, and everything after it, should be discarded.
  • kj

    #2
    Re: how to split text into lines?

    In <g6qjmc$f2o$1@reader1.panix.com> kj <socyl@987jk.com.invalid> writes:

    > In Perl, one can break a chunk of text into an array of lines while
    > preserving the trailing line-termination sequence in each line, if
    > any, by splitting the text on the regular expression /^/:
    >
    > DB<1> x split(/^/, "foo\nbar\nbaz")
    > 0 'foo
    > '
    > 1 'bar
    > '
    > 2 'baz'
    >
    > But nothing like this seems to work in Python:
    >
    > >>> re.split('^', 'foo\nbar\nbaz')
    > ['foo\nbar\nbaz']
    >
    > (One gets the same result if one adds the re.MULTILINE flag to the
    > re.split call.)
    >
    > What's the Python idiom for splitting text into lines, preserving
    > the end-of-line sequence in each line?

    Sorry, I should have googled this first. I just found splitlines()...

    Still, for my own edification, is there a way to achieve the same
    effect using re.split?

    TIA!

    kynn



    • Miles

      #3
      Re: how to split text into lines?

      On Wed, Jul 30, 2008 at 4:45 PM, kj wrote:
      >> What's the Python idiom for splitting text into lines, preserving
      >> the end-of-line sequence in each line?
      >
      > Sorry, I should have googled this first. I just found splitlines()...
      >
      > Still, for my own edification, is there a way to achieve the same
      > effect using re.split?

      Not directly: re.split doesn't split on zero-length matches.

      -Miles
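
A rough sketch of one way around the limitation described above: instead of splitting between lines, match each line together with its terminator. The helper name `split_keepends_re` is made up for this illustration, and the pattern assumes "\n" is the only line terminator (splitlines() recognises several more). Note also that, since Python 3.7, re.split does split on empty matches, so newer interpreters behave differently from the session quoted in this thread.

```python
import re

# Hypothetical helper, not part of the re module.
# Each match is either "anything up to and including a newline"
# or a final run of characters with no newline after it.
_LINE = re.compile(r"[^\n]*\n|[^\n]+")

def split_keepends_re(text):
    """Return the lines of text, each keeping its trailing '\\n' if any."""
    return _LINE.findall(text)

print(split_keepends_re("foo\nbar\nbaz"))
```

Joining the returned pieces should give back the original text, which is a quick way to check that nothing was dropped.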


      • alex23

        #4
        Re: how to split text into lines?

        kj wrote:
        > Sorry, I should have googled this first. I just found splitlines()...
        >
        > Still, for my own edification, is there a way to achieve the same
        > effect using re.split?

        re.split(os.linesep, <string>) works the same as <string>.splitlines()

        Neither retain the EOL for each line, though. The only way I'm aware
        of is to re-add it:

        [s+os.linesep for s in re.split(os.linesep, <string>)]

        Was that what you were after?
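
A small check of the comparison above, as a sketch only. It assumes "\n" line endings and uses a literal "\n" rather than os.linesep, since os.linesep is "\r\n" on Windows and would not match text containing bare "\n". The two calls agree on text without a trailing newline, but differ when the text ends with one.

```python
import re

text = "foo\nbar\n"

# re.split leaves an empty string after the final newline.
by_regex = re.split("\n", text)

# str.splitlines does not produce that trailing empty string.
by_method = text.splitlines()

print(by_regex)
print(by_method)
```

For the same reason, re-adding a terminator to every piece of the re.split result can add one that was not in the original text.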


        • Chris

          #5
          Re: how to split text into lines?

          On Jul 31, 7:26 am, alex23 <wuwe...@gmail.com> wrote:
          > kj wrote:
          >> Sorry, I should have googled this first. I just found splitlines()...
          >>
          >> Still, for my own edification, is there a way to achieve the same
          >> effect using re.split?
          >
          > re.split(os.linesep, <string>) works the same as <string>.splitlines()
          >
          > Neither retain the EOL for each line, though. The only way I'm aware
          > of is to re-add it:
          >
          > [s+os.linesep for s in re.split(os.linesep, <string>)]
          >
          > Was that what you were after?

          or what about 'string'.splitlines(True) as that retains newline
          characters. ;)
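
For reference, a minimal example of the splitlines(True) call mentioned above. The sample text is made up and mixes "\n" and "\r\n" endings to show that each line keeps whatever terminator it had.

```python
text = "foo\nbar\r\nbaz"

# Passing True (the keepends argument) keeps each line's terminator.
lines = text.splitlines(True)
print(lines)

# Because nothing is discarded, joining the pieces restores the input.
assert "".join(lines) == text
```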


          • alex23

            #6
            Re: how to split text into lines?

            Chris wrote:
            > or what about 'string'.splitlines(True) as that retains newline
            > characters. ;)

            Okay, you win :)

            Man, you'd think with the ease of object introspection I'd have at
            least looked at its docstring :)

            Cheers, Chris!
