Removing dots - please help me out

  • Aristotle

    Could you please help me out with regular expressions? I'm trying to
    write a Perl script that processes some text, and I'm stuck on the
    following.

    I need to remove from the text:
    1. dots followed by a space and a word starting with a lowercase letter
    2. dots followed directly by a word starting with a lowercase letter

    For example:

    "pure text here bla bla bla. more text follows" --> changes to
    "pure text here bla bla bla more text follows"

    and

    "pure text here bla bla bla.more text follows" --> changes to
    "pure text here bla bla bla more text follows"

    Need to remove just the dots, not letters.

    No matter how hard I tried, I could not make it work. I tried various
    things, e.g. $line =~ s/\. (?=[a-z])/ /g; I'd really appreciate your
    help. It's a must-do and I don't have anyone else to ask.

    Thanks in advance.
  • Gunnar Hjalmarsson

    #2
    Re: Removing dots - please help me out

    Aristotle wrote:
    > need to remove from the text
    > 1. dots followed by space & words starting with lower case letters
    > 2. dots followed by only by words starting with lower case letters
    >
    > ie
    >
    > "pure text here bla bla bla. more text follows" --> changes to
    > "pure text here bla bla bla more text follows"
    >
    > and
    >
    > "pure text here bla bla bla.more text follows" --> changes to
    > "pure text here bla bla bla more text follows"
    >
    > Need to remove just the dots, not letters.
    >
    > No matter how hard i tried, i could not make it work. Tried various
    > things eg $line =~ s/\. (?=[a-z])/ /g;

    You seem to be close, but since the dot may or may not be followed by
    a space, you'd better say so:

    s/\. ?(?=[a-z])/ /g;
    -----^
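
A minimal sketch of the substitution above applied to the two sample strings from the question. The helper name `remove_dots` and the third sample string are assumptions added for illustration; only the regex itself comes from the thread. The lookahead `(?=[a-z])` checks for a lowercase letter without consuming it, so only the dot and the optional space are replaced.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): replaces a dot, plus one
# optional following space, with a single space when the next character
# is a lowercase letter. The lookahead leaves that letter untouched.
sub remove_dots {
    my ($line) = @_;
    $line =~ s/\. ?(?=[a-z])/ /g;
    return $line;
}

my @samples = (
    "pure text here bla bla bla. more text follows",  # dot + space + lowercase
    "pure text here bla bla bla.more text follows",   # dot + lowercase
    "End of sentence. New sentence starts here.",     # dot + uppercase: kept
);

print remove_dots($_), "\n" for @samples;
```

The first two strings should both come out as "pure text here bla bla bla more text follows", while the third is left alone because its dots are followed by an uppercase letter or by nothing.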

    HTH

    --
    Gunnar Hjalmarsson
    Email: http://www.gunnar.cc/cgi-bin/contact.pl

    • Aristotle

      #3
      Re: Removing dots - please help me out

      Thanks, it does seem to work correctly.
      However, there are still some dots that are not being removed,
      for example when there is a 'return' after the dot:

      "Asthma in children of sycotic parents ( Nat .
      s ) Spasms of the glottis with clucking in the larynx ; air "

      Of course, I first use
      $line =~ s/\n//g;
      to remove the newline characters, and then
      $line =~ s/\. ?(?=[a-z])/ /g;
      but these dots still escape.

      Any ideas why it doesn't work?


      > Gunnar Hjalmarsson <noreply@gunnar.cc> wrote in message news:<HVH5c.525 20$mU6.
      > You seem to be close, but since the dot may or may not be followed by
      > a space, you'd better say so:
      >
      > s/\. ?(?=[a-z])/ /g;
      > -----^

      • Aristotle

        #4
        Re: Removing dots - please help me out

        Please ignore my last message about dots before line breaks escaping the regex.
        It was my mistake: I needed to apply the regex a second time at a different point.
        Thanks.
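
The thread does not show the final code. As one possible alternative to applying the substitution twice, the sketch below assumes the whole text is held in a single string and swaps the optional space for `\s*`, which also absorbs a line break between the dot and the lowercase letter. The helper name `remove_dots_across_lines` is hypothetical, and the sample is shortened from the text quoted in post #3.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): works on the whole text
# rather than on one line at a time.
sub remove_dots_across_lines {
    my ($text) = @_;
    # \s* matches any run of whitespace, including "\n", so a dot at the
    # end of one line followed by a lowercase word on the next is caught.
    $text =~ s/\.\s*(?=[a-z])/ /g;
    return $text;
}

my $text = "Asthma in children of sycotic parents ( Nat .\n"
         . "s ) Spasms of the glottis";

print remove_dots_across_lines($text), "\n";
```

Because the dot and all following whitespace collapse into a single space, the two lines are joined at that point; a separate `s/\n//g` pass would still be needed to remove line breaks that do not follow a dot.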