Can't get RegEx to work, pls help

  • H

    Can't get RegEx to work, pls help

    This is kind of a follow-up to one of my previous questions, and it has to
    do with regex.
    I have a string consisting of several words. What would a good regex look
    like to get one match for every word?

    For example :
    String myString =" This is the string that stupid H can't split up";

    // A good regex is needed here, so that the result looks something like this:

    Match1 = This
    Match2 = is
    Match3 = the
    Match4 = string
    Match5 = that
    etc etc.
    All I've managed so far is to get really weird matches, so please help this
    annoying and frustrated newbie.

    TIA




  • William Ryan

    #2
    Re: Can't get RegEx to work, pls help

    You can use Regex.Split and split on spaces. If you just want every word,
    you can use the regular old String.Split instead. For the regex version
    (which is much more powerful if you need it):

    String[] s = Regex.Split(myString, " ");

    When you iterate through the array, each element will contain one of your
    words. If you need to split on things like commas, or on separators that
    are more than a single character, definitely go with Regex.Split.
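The original question asked for one match per word rather than a split. A minimal sketch of that approach, assuming that a "word" means a run of word characters and apostrophes; the class name WordMatchDemo and the helper GetWords are hypothetical and not from the thread:

```csharp
using System;
using System.Text.RegularExpressions;

class WordMatchDemo
{
    // Hypothetical helper: returns every word found in the input.
    public static string[] GetWords(string input)
    {
        // [\w']+ matches runs of letters, digits, underscores and apostrophes,
        // so "can't" stays in one piece and whitespace is never part of a match.
        MatchCollection matches = Regex.Matches(input, @"[\w']+");

        string[] words = new string[matches.Count];
        for (int i = 0; i < matches.Count; i++)
        {
            words[i] = matches[i].Value;
        }
        return words;
    }

    static void Main()
    {
        string myString = " This is the string that stupid H can't split up";
        string[] words = GetWords(myString);

        // Prints "Match1 = This", "Match2 = is", and so on.
        for (int i = 0; i < words.Length; i++)
        {
            Console.WriteLine("Match{0} = {1}", i + 1, words[i]);
        }
    }
}
```

Because the pattern only ever matches word characters, runs of spaces and stray punctuation never produce empty entries, which is the main difference from splitting on a single space.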

    HTH,

    Bill



    • H

      #3
      Re: Can't get RegEx to work, pls help

      Thanks! I tried that before, but it didn't work. Now it works, though.
      Must be magic.

      But it still doesn't do everything I need, as I want to get rid of all
      the empty entries. With your advice, if there are three spaces in a row,
      one of them is counted as a word. I also want to filter out characters
      like ? ! , . and so on.
      Any hint as to what the expression would look like?





      • Steve - DND

        #4
        Re: Can't get RegEx to work, pls help

        > But it still won't work, as I want to be rid of all empty spaces. By using
        > your advice if there are 3 spaces in a row, one of them would be counted as
        > a word. Also I want to filter out characters like ? ! , . and so on.
        > So no hint of how the expression would look like ?

        string test = " This is the string . that? stupid H can't split up";
        string cleanExp = @"[\?,\.!]+";
        string cleaned = Regex.Replace(test, cleanExp, " ");
        string splitExp = @"\s+";
        string[] split = Regex.Split(cleaned.Trim(), splitExp);
        foreach (string s in split) {
            Console.WriteLine(s);
        }

        The operation is best done in multiple steps. That way, any leading or
        trailing characters other than spaces can be replaced and trimmed off
        before splitting on any white space that is one or more characters in
        length. If you additionally want to strip out punctuation contained
        within words (such as the apostrophe in "can't" above), then your first
        step should be to go through and replace those punctuation characters
        with an empty string.
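The extra first step described above might look roughly like the sketch below. It assumes the apostrophe is the only in-word punctuation to drop; the class name StripAndSplitDemo and the helper SplitWords are hypothetical and not from the thread:

```csharp
using System;
using System.Text.RegularExpressions;

class StripAndSplitDemo
{
    // Hypothetical helper: cleans the input in steps, then splits it into words.
    public static string[] SplitWords(string input)
    {
        // Step 1: remove punctuation inside words by replacing it with an
        // empty string (assumption: only the apostrophe needs this).
        string noInner = Regex.Replace(input, @"'", "");

        // Step 2: replace runs of the remaining punctuation with a space.
        string cleaned = Regex.Replace(noInner, @"[\?,\.!]+", " ");

        // Step 3: trim the ends and split on one or more whitespace characters.
        return Regex.Split(cleaned.Trim(), @"\s+");
    }

    static void Main()
    {
        string test = " This is the string . that? stupid H can't split up";
        foreach (string s in SplitWords(test))
        {
            Console.WriteLine(s);
        }
    }
}
```

With this ordering, "can't" comes out as "cant" rather than being broken into two words, because the apostrophe is deleted before any separator is turned into a space.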

        Steve



        • H

          #5
          Re: Can't get RegEx to work, pls help

          Excellent! Thanks, this I can understand and work from. Thanks again!

