Regular expression : Grouping decimal values and double quote

  • Ahmad A. Rahman

    Hi all,

    I have a problem constructing a regular expression using .net.

    I have a comma-separated string and I want to capture each field as its
    own group, but I can't get a number with a decimal part captured as a
    single group.

    Example string: 1, 2.3, "two", "three"

    So, I want to split this string into 4 groups: (1), (2.3), (two) and (three).

    The best regular expression that I have so far is:
    (?:^|\s*\,\s*)((?:"(?<SubString>(?:""|[^"])*)")+)|((?<SubString>(\d))+)

    But this regex will return (1), (2), (3), (two) and (three).

    So, what is the right regular expression to do this? Please help.

    Thanks.
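A small illustration of why the expression in the post above splits 2.3 into (2) and (3): `\d` matches exactly one digit, and nothing in that branch accepts a decimal point. The sketch below uses Python's `re` module rather than .NET, and the variable names are only illustrative; it is meant to show the idea, not to be the final pattern.

```python
import re

text = '1, 2.3, "two", "three"'

# \d matches one digit at a time, so "2.3" is reported as "2" and "3".
single_digits = re.findall(r'\d', text)

# Digits followed by an optional fractional part keep "2.3" together.
decimals = re.findall(r'\d+(?:\.\d+)?', text)

print(single_digits)  # ['1', '2', '3']
print(decimals)       # ['1', '2.3']
```

The same `\d+(?:\.\d+)?` fragment is valid in .NET regular expressions as well.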




  • Kalpesh Shah

    #2

    Will String.Split(", ") do?

    <Kalpesh/>


    • Ahmad A. Rahman

      #3

      No. Consider this string:
      string s = "1, 2.3, \"ab,\"\"c\" , \"e.ff,;$\"";

      I want to split it into (1), (2.3), (ab,""c) and (e.ff,;$).

      Got my point? Please help.
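To make the objection to a plain split concrete, here is a rough sketch in Python (not .NET) using the string from the post above. The second input, `tricky`, is a hypothetical example added only for illustration.

```python
s = '1, 2.3, "ab,""c" , "e.ff,;$"'

# Splitting on ", " leaves the surrounding quotes (and a stray space) in place.
by_comma_space = s.split(", ")

# Splitting on "," alone cuts through the quoted fields.
by_comma = s.split(",")

# Hypothetical input: a quoted field that itself contains ", ".
tricky = '1, "a, b"'
tricky_parts = tricky.split(", ")

print(by_comma_space)  # ['1', '2.3', '"ab,""c" ', '"e.ff,;$"']
print(len(by_comma))   # 6
print(tricky_parts)    # ['1', '"a', 'b"']
```

Either way the splitter has no notion of being inside a quoted field, which is what the regular expressions later in the thread try to supply.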

      • Sherif ElMetainy

        #4

        Hello

        Try this expression
        (?:^|\s*\,\s*)(?:(?:"(?<SubString>(?:""|[^"])*)"\s*)|(?:\s*(?<SubString>(?:\s*[^\s,]+)*)\s*))

        Best regards
        Sherif

        • Ahmad A. Rahman

          #5

          Thanks a lot, ElMetainy!

          It works, but that regex also matches invalid decimal values. For example,
          it still matches 1.23aa. The same happens with quoted values: I want each
          one to start and end with a double quote, with nothing after the closing
          quote except a comma or the end of the string. Got my point?

          Just to make it clear:
          - 1.23, "abc"qwe" = valid
          - 1.23x, "abc"qwe" = invalid
          - 1.23, "abc"qwe"xx = invalid

          One more thing: is there a good (but free) regex tutorial, as an ebook
          or a website?

          Still hoping for assistance here.

          Thank you.





          • Sherif ElMetainy

            #6

            Hello Ahmad

            Try this one
            ^(?:(?:(?:^|,)\s*)(?:(?:"(?<SubString>(""|[^"])*)")|(?<SubString>\d+(?:\.\d+)?))(?=\s*(?:$|,))\s*)+$

            Best regards,
            Sherif
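The expression in the post above relies on two .NET features: the same group name used in both alternatives, and every capture of a repeated group being kept. As a rough sketch of the same validation in Python (an assumption of this note, not something from the thread), the work can be split into a whole-line check and a separate field scan, because Python's `re` rejects duplicate group names and keeps only the last capture of a repeated group. The names `FIELD`, `LINE`, `TOKEN` and `parse` are illustrative.

```python
import re

# A field is a quoted string ("" stands for one literal quote) or a decimal number.
FIELD = r'(?:"(?:""|[^"])*"|\d+(?:\.\d+)?)'

# Whole-line check: fields separated by commas, optional whitespace, nothing else.
LINE = re.compile(rf'\s*{FIELD}\s*(?:,\s*{FIELD}\s*)*')

# Field scan: the same two alternatives, each with its own group name.
TOKEN = re.compile(r'"(?P<quoted>(?:""|[^"])*)"|(?P<number>\d+(?:\.\d+)?)')


def parse(line):
    """Return the list of field values, or None if the line is invalid."""
    if LINE.fullmatch(line) is None:
        return None
    # lastgroup names whichever alternative matched for this field.
    return [m.group(m.lastgroup) for m in TOKEN.finditer(line)]


print(parse('1.23, "abc""qwe"'))  # ['1.23', 'abc""qwe']
print(parse('1.23x, "abc"'))      # None
print(parse('1.23, "abc"xx'))     # None
```

As with the .NET expression, a single unescaped quote inside a quoted value is rejected, and the captured text still contains the doubled `""`.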

            • Ahmad A. Rahman

              #7

              Hi ElMetainy,

              That one works, but I also need a single double-quote character to be
              allowed between the enclosing double quotes.
              Like in my previous post:

              1.23, "abc"qwe" = valid (and by using MatchCollection on <SubString>, this
              will return [1.23] and [abc"qwe])
              1.23, "abc"qwe"xx = invalid
              1.23xx, "abc"qwe" = invalid

              Can you help me...just a little bit more? :) You almost got it right.

              P.S. Sorry, I still haven't had time to learn regex, but I really need a
              quick solution right now.

              • Sherif ElMetainy

                #8

                Hello

                This may be too complicated for a regular expression.

                The question is how to treat a double quote next to a comma. A ','
                between double quotes is part of the string, and a double quote
                between double quotes is also part of the string. Consider:

                1.23,"aa,ddd",123 should match [1.23], [aa,ddd] and [123].

                1.23,"abc"qwe,"ee",124: should this match [1.23], [abc"qwe,"ee] and
                [124], or be considered invalid?
                Making that decision requires understanding the nature of the data
                (for example, being able to distinguish a contact's first name from
                his nickname), which is not possible with regular expressions.

                This is why it is difficult to match a single double quote between
                two double quotes.

                This is where "" resolves the ambiguity:
                1.23,"abc""qwe,""ee",124 should match [1.23], [abc"qwe,"ee] and [124],
                meaning that two consecutive double quotes within double quotes are
                treated as one double quote that is part of the string. The "" escape
                is standard in formats like CSV.


                Best regards,
                Sherif
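The doubled-quote convention described in the post above is the one CSV parsers already implement. As a minimal sketch, assuming Python and its standard `csv` module are acceptable stand-ins for the .NET code in this thread, the escaped line parses cleanly while the ambiguous one is rejected in strict mode. The helper name `parse_strict` is illustrative.

```python
import csv


def parse_strict(line):
    """Parse one CSV line; return None if the quoting is malformed."""
    try:
        return next(csv.reader([line], strict=True))
    except csv.Error:
        return None


escaped = '1.23,"abc""qwe,""ee",124'
ambiguous = '1.23,"abc"qwe,"ee",124'

fields = parse_strict(escaped)
print(fields)                   # ['1.23', 'abc"qwe,"ee', '124']
print(parse_strict(ambiguous))  # None
```

Note that the parser unescapes `""` to a single `"` in the returned value, unlike the regex captures earlier in the thread.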

                • Ahmad A. Rahman

                  #9

                  Hi,

                  I know it is complicated; that's why I'm here.

                  But I think I have a way out now. I can use MatchCollection to break the
                  string apart at the commas, and use another regex to validate each
                  piece. :)

                  Anyway, you've been a great help ElMetainy. Thanks a lot.

                  Bye.
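The two-step plan from the last post (break the string at the commas, then validate each piece with a second regex) might look roughly like the sketch below. It is written in Python rather than .NET, the names `FIELD` and `parse_line` are illustrative, and it assumes that no field contains a comma, a limitation the thread never resolves.

```python
import re

# One piece: either "..." (anything between the outer quotes) or a decimal number.
FIELD = re.compile(r'\s*(?:"(?P<quoted>.*)"|(?P<number>\d+(?:\.\d+)?))\s*')


def parse_line(line):
    """Return the field values, or None if any piece is invalid."""
    values = []
    for piece in line.split(","):  # assumption: no commas inside fields
        m = FIELD.fullmatch(piece)
        if m is None:
            return None
        quoted = m.group("quoted")
        values.append(quoted if quoted is not None else m.group("number"))
    return values


print(parse_line('1.23, "abc"qwe"'))    # ['1.23', 'abc"qwe']
print(parse_line('1.23x, "abc"qwe"'))   # None
print(parse_line('1.23, "abc"qwe"xx'))  # None
```

This reproduces the three valid/invalid examples given earlier in the thread, at the cost of giving up commas inside quoted values.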