regexp to list all sentences and sub sentences, with overlapping?

This topic is closed.
  • Tony

    regexp to list all sentences and sub sentences, with overlapping?

    Hello,

    Can someone please point me toward a regular expression that goes
    through a string and constructs a list of sentences and part sentences,
    where words are gradually dropped from the front of the current
    sentence? Sound confusing?

    Well perhaps an example would help? Given...

    "Different countries have different ideas. Merry Christmas to all."

    I'd like to output:

    Different countries have different ideas.
    countries have different ideas.
    have different ideas.
    different ideas.
    Merry Christmas to all.
    Christmas to all.
    to all.

    Is that possible?

    Thanks in advance,

    Tony
  • Jürgen Exner

    #2
    Re: regexp to list all sentences and sub sentences, with overlapping?

    Tony wrote:
    > Can someone please point me toward a regular expression that goes
    > through a string and constructs a list of sentences and part sentences,
    > where words are gradually dropped from the front of the current
    > sentence? Sound confusing?
    >
    > Well perhaps an example would help? Given...
    >
    > "Different countries have different ideas. Merry Christmas to all."
    >
    > I'd like to output:
    >
    > Different countries have different ideas.
    > countries have different ideas.
    > have different ideas.
    > different ideas.
    > Merry Christmas to all.
    > Christmas to all.
    > to all.
    >
    > Is that possible?

    Maybe, I don't know.
    But I question if REs are the best tool for the job.

    Two splits with two nested loops will do quite nicely:

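    use warnings; use strict;
    my $s = "Different countries have different ideas. Merry Christmas to all.";
    # \s* also eats the space after each period, so no sentence starts with a blank
    my @sentences = split /\.\s*/, $s;
    for (@sentences) {
        my @words = split ' ', $_;            # split on runs of whitespace
        while (@words) {
            print join(' ', @words), ".\n";   # put back the period that split removed
            shift @words;                     # drop the first word
        }
    }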

    Just replace the print with a push to your result list if you want to have a
    list instead.
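
A minimal sketch of that push variant, assuming every sentence ends in a period and that one-word fragments should be skipped, as in the sample output in the question. The name `sub_sentences` is hypothetical and not from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: returns each sentence plus every shorter version
# obtained by dropping words from the front. Assumes "." ends a sentence.
sub sub_sentences {
    my ($text) = @_;
    my @result;
    for my $sentence (split /\.\s*/, $text) {
        my @words = split ' ', $sentence;
        # Stop at two words, matching the sample output in the question.
        while (@words > 1) {
            push @result, join(' ', @words) . '.';
            shift @words;
        }
    }
    return @result;
}

print "$_\n" for sub_sentences(
    "Different countries have different ideas. Merry Christmas to all.");
```

Changing `@words > 1` back to `@words` would also keep the single-word fragments.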

    jue



    • Andy De Petter

      #3
      Re: regexp to list all sentences and sub sentences, with overlapping?

      hawkmoon1972@hotmail.com (Tony) wrote in news:c90e5468.0311260046.693d35c1@posting.google.com:

      > Is that possible?

      Everything is possible with perl. ;)

      my $s = "Different countries have different ideas. Merry Christmas to all.";

      while ($s =~ m/\s/) {
          print $s."\n";
          $s =~ s/[^\s]+\s(.*)/$1/;
      }

      Hth,

      -Andy

      --
      Andy De Petter - http://www.techos.be/andy - naql@gunvobk.or (ROT13)
      Expert IT Analyst - Belgacom ANS/NTA/NST - http://www.belgacom.be
      "Cogito Ergo Sum - I think, therefore I am."
      -- R. Descartes


      • Tony

        #4
        Re: regexp to list all sentences and sub sentences, with overlapping?

        Very impressive. Thank you very much.

        But what is the second "\s" for in: $s =~ s/[^\s]+\s(.*)/$1/;

        I've also decided to implement a second loop, and this time drop off
        the LAST word each time. Is there a better regexp than the below
        (which seems to be working):

        $s =~ s/(.*)[\$\s]+.+$/$1/;



        Andy De Petter <nqrcrggr@fxlarg.or> wrote in message news:<Xns943F6BDE93314adepetteskynetbe@195.238.3.180>...
        > hawkmoon1972@hotmail.com (Tony) wrote in news:c90e5468.0311260046.693d35c1@posting.google.com:
        >
        > > Is that possible?
        >
        > Everything is possible with perl. ;)
        >
        > my $s = "Different countries have different ideas. Merry Christmas to all.";
        >
        > while ($s =~ m/\s/) {
        >     print $s."\n";
        >     $s =~ s/[^\s]+\s(.*)/$1/;
        > }


        • Andy De Petter

          #5
          Re: regexp to list all sentences and sub sentences, with overlapping?

          hawkmoon1972@hotmail.com (Tony) wrote in news:c90e5468.0311270049.41f4b553@posting.google.com:

          > But what is the second "\s" for in: $s =~ s/[^\s]+\s(.*)/$1/;

          To check whether there's still a space after a detected word.
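
A small sketch of what that substitution does on its own, wrapped in a hypothetical helper named `drop_first_word` (the name is not from the thread). It assumes single-line input with words separated by one space.

```perl
use strict;
use warnings;

# Hypothetical helper: applies the substitution from the post above once.
# [^\s]+ matches the first word, \s matches the whitespace character after
# it, and (.*) captures the rest of the line, which replaces the whole match.
sub drop_first_word {
    my ($s) = @_;
    $s =~ s/[^\s]+\s(.*)/$1/;
    return $s;
}

print drop_first_word("have different ideas."), "\n";  # prints "different ideas."
print drop_first_word("ideas."), "\n";                 # no whitespace left, so unchanged
```

Because the pattern needs a whitespace character after the word, a string holding only one word is left alone, which is also why the loop condition `m/\s/` ends the loop at that point.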

          > I've also decided to implement a second loop, and this time drop off
          > the LAST word each time. Is there a better regexp than the below
          > (which seems to be working):
          >
          > $s =~ s/(.*)[\$\s]+.+$/$1/;

          $s =~ s/(.*)\s+[^\s]+\.?/$1/;

          (or something like that)
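
One possible way to put a drop-the-last-word step into a loop, as a sketch only. It uses a different, end-anchored pattern (`\s+\S+$`) rather than either regex quoted above, and the helper name `drop_from_end` is hypothetical. It assumes a single sentence on one line.

```perl
use strict;
use warnings;

# Hypothetical helper: lists a sentence and each shorter version obtained
# by dropping the last word, stopping once only one word would remain.
sub drop_from_end {
    my ($s) = @_;
    my @result;
    while ($s =~ /\s/) {
        push @result, $s;
        $s =~ s/\s+\S+$//;   # remove the last word and the whitespace before it
    }
    return @result;
}

print "$_\n" for drop_from_end("Merry Christmas to all.");
```

Note that the trailing period belongs to the last word, so it disappears with the first drop.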

          -Andy

