Java Regex: Parsing Problem

  • zeny
    New Member
    • Jul 2006
    • 44

    Hi everyone!

    How do I split a string that contains parentheses, given that parentheses are the characters regular expressions use for grouping?

    For example, how can I split the string "by Gregor Hohpe and Bobby Woolf (2003)" so that the result is 2 strings: "by Gregor Hohpe and Bobby Woolf " and "(2003)"?

    I've tried several regular expressions like "(([0-9]))" without success, and I'm really out of ideas. I would much appreciate any suggestions for a regular expression.

    Best Regards
  • JosAH
    Recognized Expert MVP
    • Mar 2007
    • 11453

    #2
    You can escape these special characters: \( and \). Note that the backslash is
    a special character for the javac compiler as well, so if you want to pass these
    characters to the regexp compiler in a String literal you have to use \\( and \\).
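A minimal sketch of that escaping, using the string from the original question. The class and method names here are made up for illustration; only `java.util.regex.Pattern` and `Matcher` are real library types. It also shows why an unescaped attempt like "(([0-9]))" does not help: the parentheses are read as two nested groups around a single digit.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the thread.
class ParenEscapeDemo {
    // "\\(" in Java source reaches the regex compiler as \( and matches a literal '('.
    private static final Pattern NUMBER_IN_PARENS = Pattern.compile("\\([0-9]+\\)");

    // Returns the first parenthesized run of digits, or null if there is none.
    static String firstParenthesizedNumber(String s) {
        Matcher m = NUMBER_IN_PARENS.matcher(s);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String s = "by Gregor Hohpe and Bobby Woolf (2003)";
        System.out.println(firstParenthesizedNumber(s)); // (2003)

        // Unescaped, the parentheses only group: this matches a single digit.
        Matcher m = Pattern.compile("(([0-9]))").matcher(s);
        if (m.find()) {
            System.out.println(m.group()); // 2
        }
    }
}
```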

    Note that you can't 'count' matching parentheses using regular expressions, i.e.
    if you want to check whether or not parentheses are balanced you can't use
    regular expressions for that.
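Since java.util.regex has no recursive patterns, the usual workaround for a balance check is a plain counter over the characters. The following is only a sketch under that assumption; the class and method names are hypothetical.

```java
// Hypothetical helper, not part of the thread: checks '(' / ')' balance without regex.
class ParenBalance {
    // Returns true if every '(' has a matching ')' and no ')' appears before its '('.
    static boolean isBalanced(String s) {
        int depth = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '(') {
                depth++;
            } else if (c == ')') {
                depth--;
                if (depth < 0) {
                    return false; // a ')' closed something that was never opened
                }
            }
        }
        return depth == 0; // anything still open means unbalanced
    }
}
```

For example, `ParenBalance.isBalanced("(a(b)c)")` is true, while `")("` and `"(()"` are both rejected.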

    kind regards,

    Jos


    • prometheuzz
      Recognized Expert New Member
      • Apr 2007
      • 197

      #3
      Originally posted by JosAH
      You can escape these special characters: \( and \). Note that the backslash is
      a special character for the javac compiler as well, so if you want to pass these
      characters to the regexp compiler in a String literal you have to use \\( and \\).
      @OP:
      And if you want to keep the parentheses intact, you will have to use some sort of look-around voodoo:
      [CODE=java]String[] array = s.split("(?=\\()|(?<=\\))");[/CODE]
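A runnable sketch of the look-around split applied to the string from the original question. The pattern is written with no space inside the lookahead, so it splits before every '(' and after every ')'. The class and method names are made up for illustration.

```java
import java.util.Arrays;

// Hypothetical demo class, not part of the thread.
class ParenSplitDemo {
    // (?=\() is a zero-width position just before '(',
    // (?<=\)) is a zero-width position just after ')'.
    // Nothing is consumed, so the parentheses stay in the pieces.
    static String[] splitAroundParens(String s) {
        return s.split("(?=\\()|(?<=\\))");
    }

    public static void main(String[] args) {
        String[] parts = splitAroundParens("by Gregor Hohpe and Bobby Woolf (2003)");
        System.out.println(Arrays.toString(parts));
        // [by Gregor Hohpe and Bobby Woolf , (2003)]
    }
}
```

The match after the final ')' would produce an empty trailing piece, but `String.split` with the default limit drops trailing empty strings, so exactly two pieces come back.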


      Originally posted by JosAH
      Note that you can't 'count' matching parentheses using regular expressions, i.e.
      if you want to check whether or not parentheses are balanced you can't use
      regular expressions for that.

      kind regards,

      Jos
      I believe some regex engines can do such recursive things (Perl's if I'm not mistaken?). But you're right, Java's regex certainly can't.


      • JosAH
        Recognized Expert MVP
        • Mar 2007
        • 11453

        #4
        Originally posted by prometheuzz
        I believe some regex engines can do such recursive things (Perl's if I'm not mistaken?). But you're right, Java's regex certainly can't.
        Perl is a write-only atrocity ;-)

        kind regards,

        Jos (I didn't write any of the above, honest! ;-)


        • zeny
          New Member
          • Jul 2006
          • 44

          #5
          Thanks to you all, I've got it working now! The help is very much appreciated!

          Best regards
