VB.net string help

  • CroCrew
    Recognized Expert Contributor
    • Jan 2008
    • 564


    Hello Everyone,

    First off I would like to thank anyone that helps me out with this problem that I have.

    Ok, the environment I am working in is VB.net.

    I have a string that I am reading in from a file. There are “chunks” of data within the string that I want to keep, and I couldn't care less about the rest of the data. Here is a small example of the string:

    xString = “434623[73899]256[346]37856[3634][367][8922] 45745[12954]35478”

    The “chunks” that I am interested in are within the brackets “[###]”. What I would like to do is build an array something like this:

    yString(0) = 73899
    yString(1) = 346
    yString(2) = 3634
    yString(3) = 367
    yString(4) = 8922
    yString(5) = 12954

    Anyone have a fast way of doing this? Examples?

    Again; Thanks for all your help,
    CroCrew~
  • tlhintoq
    Recognized Expert Specialist
    • Mar 2008
    • 3532

    #2
    I'm sure someone will have a way of doing this with RegEx, but I'm not good with RegEx.

    Me... I'm more brute force. Use a 'for' loop to look at each character. If it is a '[' note the position... keep looping until you find a ']' and note the position. Your desired substring is between the two positions you just found. Keep going to the end of the string.
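
    For reference, that scan might look something like the C# sketch below (C# to match the other code in this thread). The class and method names, BracketScanner and ExtractBracketed, are made up for illustration, and the sketch assumes brackets are never nested.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, named for illustration only.
public static class BracketScanner
{
    // Walks the string once and collects the text between each '[' and the next ']'.
    // Assumes brackets are not nested.
    public static List<string> ExtractBracketed(string source)
    {
        var results = new List<string>();
        int open = -1; // index of the most recent unmatched '[', or -1 if none

        for (int i = 0; i < source.Length; i++)
        {
            if (source[i] == '[')
            {
                open = i; // note the opening position
            }
            else if (source[i] == ']' && open >= 0)
            {
                // the desired substring sits between the two noted positions
                results.Add(source.Substring(open + 1, i - open - 1));
                open = -1;
            }
        }
        return results;
    }
}
```

    Called on the sample string from the first post, this should give the six values 73899, 346, 3634, 367, 8922 and 12954. An empty pair "[]" would yield an empty string, so filter those out if they are not wanted.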


    • lonekorean
      New Member
      • Feb 2010
      • 4

      #3
      Well, here's the regex way. :)

      Try this regex pattern: (?<=\[)\d+(?=\])

      The numbers you want will be in the collection of matches you get from the result.

      You can see it in action here: http://regexstorm.net/Tester.aspx?p=...b12954%5d35478


      • tlhintoq
        Recognized Expert Specialist
        • Mar 2008
        • 3532

        #4
        I'm learning too here...

        The most complex RegEx I've ever written does things like check whether a value is alphabetic, whether it is positive, and so on. The value is evaluated inside the function, which returns a result.

        It seems every example I run across is about matching or validating. They all seem to get boolean results.

        Code:
        // Function to test for Positive Integers with zero inclusive
        public static bool IsWholeNumber(String strNumber)
        {
             Regex objNotWholePattern = new Regex("[^0-9]");
             return !objNotWholePattern.IsMatch(strNumber);
        }
        How would one use the RegEx you supplied to get a string[] returned, as in the OP's need/example?

        Code:
        string xString = "434623[73899]256[346]37856[3634][367][8922]45745[12954]35478";
        Regex BetweenTheBrackets = new Regex(@"(?<=\[)\d+(?=\])");
        string[] Results = ???? What goes here? How to format the use?


        • tlhintoq
          Recognized Expert Specialist
          • Mar 2008
          • 3532

          #5
          I'm not trying to hijack this thread, just work out a complete function that the OP could use.

          So I tried to do some research and figure out the answer to my own question.
          That's what we do here, right? As I always advise others I started with MSDN and found this helpful example:


          That led me to build this function:
          Code:
                  // Function to return a List<string> from a source string and Regex pattern
                  public static List<string> FindWithin(string SourceString, string Pattern)
                  {
                      Match m;
                      CaptureCollection cc;
                      GroupCollection gc;
                      List<string> Results = new List<string>();
          
                      Regex r = new Regex(Pattern);
                      // Define the string to search.
                      m = r.Match(SourceString);
                      gc = m.Groups;
          
                      // Loop through each group.
                      for (int i = 0; i < gc.Count; i++)
                      {
                          cc = gc[i].Captures;// So this is still an array of sorts
                          
                          // Loop through each capture in group.
                          for (int ii = 0; ii < cc.Count; ii++)
                          {
                              Results.Add(cc[ii].ToString());
                          }
                      }
                      return Results;
                  }
          and then call it like so
          Code:
                  private void regexToolStripMenuItem_Click(object sender, EventArgs e)
                  {
                      List<string> SomeStrings =
                      FindWithin("434623[73899]256[346]37856[3634][367][8922]45745[12954]35478",
                                                   @"(?<=\[)\d+(?=\])+");
                      int count = SomeStrings.Count;
          
                  }
          But it seems that no matter how I call it (with or without the trailing plus, which I thought might mean all matches rather than just the first), the gc collection still only matches the first bracketed set of numbers, [73899], not all of them.

          So what am I as a Regex newbie missing?


          • lonekorean
            New Member
            • Feb 2010
            • 4

            #6
            No worries, here's a code snippet for you:
            Code:
                    Regex r = new Regex(@"(?<=\[)\d+(?=\])");
                    MatchCollection mc = r.Matches("434623[73899]256[346]37856[3634][367][8922] 45745[12954]35478");
                    foreach (Match m in mc)
                    {
                        string s = m.Value;
                    }
            Basically, we instantiate a new Regex object with the pattern we want to run. The Regex object has a method Matches() on it that takes in the body of text to run the pattern on, and returns the collection of matches (these are the results we want). For each such match in this collection, we can look at the Value property to get the actual string value we want.

            In this particular case, you don't need to worry about groups or captures. We can get what we need just from looking at matches.

            Hope that helps. :)


            • tlhintoq
              Recognized Expert Specialist
              • Mar 2008
              • 3532

              #7
              Thank you for that fine explanation. That allowed me to correct the function to:
              Code:
                      // Function to return a List<string> of every match of a Regex pattern in a source string
                      public static List<string> FindWithin(string SourceString, string Pattern)
                      {
                          List<string> Results = new List<string>();
                          Regex r = new Regex(Pattern);
                          MatchCollection mc = r.Matches(SourceString);
                          foreach (Match m in mc)
                          {
                              Results.Add(m.Value);
                          }
                          return Results;
                      }
              being called from a testing menu option like
              Code:
                      private void regexToolStripMenuItem_Click(object sender, EventArgs e)
                      {
                          List<string> SomeStrings =
                              Saint.StRegEx.FindWithin("434623[73899]256[346]37856[3634][367][8922]45745[12954]35478",
                                                       @"(?<=\[)\d+(?=\])");
                          int count = SomeStrings.Count;
              
                      }
              Very efficient and reusable. Thanks for holding the hands of a couple newbies. I see myself taking a lot of time to learn Regex for future projects.


              • lonekorean
                New Member
                • Feb 2010
                • 4

                #8
                No problem, happy to help. :)

                Pardon the self-promotion, but I have a site you might find useful for your future regex endeavors: Regex Storm.

                It runs on the .NET regex engine, and I tried to cater it to programmers like us that need to come up with regex to use in our code.


                • CroCrew
                  Recognized Expert Contributor
                  • Jan 2008
                  • 564

                  #9
                  Thanks for all the help guys!!


                  • CroCrew
                    Recognized Expert Contributor
                    • Jan 2008
                    • 564

                    #10
                    I am not good at working with RegEx. I just found out that the data could contain “any” characters within the brackets. Well, one good thing is that there cannot be a bracket within a bracket, so I have that going for me </grin>.

                    Here is another small example of some data:

                    xString = “dd#[AN$$970]##[DaD]134[(fff4)][ttt][111] *j^333[323]ff-32”

                    What would I use as my expression to capture “any” characters?

                    Again; thanks for all the help,
                    CroCrew~


                    • lonekorean
                      New Member
                      • Feb 2010
                      • 4

                      #11
                      Not a problem. Replace the "\d+" with ".+?"

                      Here's the modified regex in action: find anything enclosed in brackets

                      To explain, \d matches any digit, and the + means "one or more". So \d+ matches any series of digits, AKA a number.

                      In the new pattern, the . matches any character (except newlines), the + means the same thing (one or more), and the ? in this context means "but match as few as possible". The ? is important because it keeps the pattern from being "greedy" and running past the closing bracket. To see what greedy matching does, take the ? out on that testing page.
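
                      As a rough illustration of that difference, the C# sketch below runs both the lazy and the greedy version of the pattern against the sample string from post #10. The BracketPatterns class and its AllMatches and Demo methods are hypothetical names used only for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper class, named for illustration only.
public static class BracketPatterns
{
    // Returns the text of every match of pattern in source.
    public static List<string> AllMatches(string source, string pattern)
    {
        var results = new List<string>();
        foreach (Match m in Regex.Matches(source, pattern))
        {
            results.Add(m.Value);
        }
        return results;
    }

    public static void Demo()
    {
        string sample = "dd#[AN$$970]##[DaD]134[(fff4)][ttt][111] *j^333[323]ff-32";

        // Lazy quantifier: each match stops at the first closing bracket.
        List<string> lazy = AllMatches(sample, @"(?<=\[).+?(?=\])");

        // Greedy quantifier: one match that runs on to the last closing bracket.
        List<string> greedy = AllMatches(sample, @"(?<=\[).+(?=\])");

        Console.WriteLine(string.Join(" | ", lazy));
        Console.WriteLine(string.Join(" | ", greedy));
    }
}
```

                      With the lazy pattern there should be six matches (AN$$970, DaD, (fff4), ttt, 111, 323). With the greedy pattern there is a single match that starts at AN$$970 and swallows everything up to the 323 before the last closing bracket.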


                      • CroCrew
                        Recognized Expert Contributor
                        • Jan 2008
                        • 564

                        #12
                        Thanks, tlhintoq & lonekorean, for all your help!

