RegExp split for Spell Check

  • pr

    #16
    Re: RegExp split for Spell Check

    Thomas 'PointedEars' Lahn wrote:
    pr wrote:
    > alert('<font face="arial" size=2>test</font><p>yo this is a test'
    >   .replace(/\s(?=[^<]*>)/g, "~").split(/<p>|\s/).join("\n"));
         ^^^^^^^
    | I need a pattern that will split without replacing.
    >
    Hanged if I can think of a good reason why, but well spotted, Thomas,
    this is more efficient in any case:

    alert('<font face="arial" size=2>test</font><p>yo this is a test'
      .split(/\s(?![^<]*>)|<p>/).join("\n"));
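For anyone who wants to try the split above outside a browser, here is a minimal sketch that wraps the same pattern in a function and logs the pieces instead of calling alert(). The function name `splitOutsideTags` is made up for illustration, and the pattern still assumes the text contains no stray "<" or ">" characters outside of tags.

```javascript
// Split on a whitespace character that is NOT inside a tag, or on a literal <p>.
// \s(?![^<]*>) matches whitespace only if it is not followed by a run of
// non-"<" characters and then ">", which is what "inside a tag" looks like.
function splitOutsideTags(html) {
  return html.split(/\s(?![^<]*>)|<p>/);
}

var s = '<font face="arial" size=2>test</font><p>yo this is a test';
var parts = splitOutsideTags(s);

// One array element per line, as in the alert() version.
console.log(parts.join("\n"));
```

For the sample string this yields the six elements the original poster listed, with the whole font element kept as the first one.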


    • Randy Webb

      #17
      Re: RegExp split for Spell Check

      Thomas 'PointedEars' Lahn said the following on 12/4/2007 2:38 PM:
      SmokeWilliams wrote:
      >I need a pattern that will split without replacing. So I need to
      >split on spaces or carriage returns, but not spaces that are within
      >html tags. I know there are better ways, but I am using an IFrame in
      >IE and I work for a government agency which doesn't allow me to use
      >open source. I am depending on a RegEx wizard out there to supply me
      >with the pattern.
      >>
      >So I need a pattern that matches any space or carriage return that is
      >not within an html tag.
      >>
      ><font face="arial" size=2>test</font><p>yo this is a test
      >>
      >Splitting this text should return an array containing:
      >1: <font face="arial" size=2>test</font>
      >2: yo
      >3: this
      >4: is
      >5: a
      >6: test
      >
      Suppose you have
      >
      var s = '<font face="arial" size=2>test</font><p>yo this is a test';
      >
      Either you have a weird idea of "html tag" (HTML is an acronym, BTW),
      or (which is more likely) instead you want the resulting array to be
      >
      ['', 'test', '', 'yo', 'this', 'is', 'a', 'test']
      No, that is not what he said. Perhaps you should try reading what he
      wrote and the intended results. Your "solution" leaves out the 1. listed
      above.

      --
      Randy
      Chance Favors The Prepared Mind
      comp.lang.javascript FAQ - http://jibbering.com/faq/index.html
      Javascript Best Practices - http://www.JavascriptToolbox.com/bestpractices/


      • Thomas 'PointedEars' Lahn

        #18
        Re: RegExp split for Spell Check

        Randy Webb wrote:
        Thomas 'PointedEars' Lahn said the following on 12/4/2007 2:38 PM:
        >SmokeWilliams wrote:
        >>I need a pattern that will split without replacing. So I need to
        >>split on spaces or carriage returns, but not spaces that are within
        >>html tags. I know there are better ways, but I am using an IFrame in
        >>IE and I work for a government agency which doesn't allow me to use
        >>open source. I am depending on a RegEx wizard out there to supply me
        >>with the pattern.
        >>>
        >>So I need a pattern that matches any space or carriage return that is
        >>not within an html tag.
        >>>
        >><font face="arial" size=2>test</font><p>yo this is a test
        >>>
        >>Splitting this text should return an array containing:
        >>1: <font face="arial" size=2>test</font>
        >>2: yo
        >>3: this
        >>4: is
        >>5: a
        >>6: test
        >Suppose you have
        >>
        > var s = '<font face="arial" size=2>test</font><p>yo this is a test';
        >>
        >Either you have a weird idea of "html tag" (HTML is an acronym, BTW),
        >or (which is more likely) instead you want the resulting array to be
        >>
        > ['', 'test', '', 'yo', 'this', 'is', 'a', 'test']
        >
        No, that is not what he said.
        Are you stupid or what? I *know* that this is not what he said. However, I
        don't think he really knows what he wants. Because it does not make sense
        for a spell checker in a structural editor to ignore HTML element content.
        And therefore, I posted my solution as it is.
        Perhaps you should try reading what he wrote
        You should read what I wrote, not what you wanted me to have written.

        So much for reading.
        and the intended results. Your "solution" leaves out the 1. listed
        above.
        I know.


        PointedEars
        --
        var bugRiddenCrashPronePieceOfJunk = (
        navigator.userAgent.indexOf('MSIE 5') != -1
        && navigator.userAgent.indexOf('Mac') != -1
        ) // Plone, register_function.js:16


        • Randy Webb

          #19
          Re: RegExp split for Spell Check

          Thomas 'PointedEars' Lahn said the following on 12/4/2007 4:25 PM:
          Randy Webb wrote:
          >Thomas 'PointedEars' Lahn said the following on 12/4/2007 2:38 PM:
          >>SmokeWilliams wrote:
          >>>I need a pattern that will split without replacing. So I need to
          >>>split on spaces or carriage returns, but not spaces that are within
          >>>html tags. I know there are better ways, but I am using an IFrame in
          >>>IE and I work for a government agency which doesn't allow me to use
          >>>open source. I am depending on a RegEx wizard out there to supply me
          >>>with the pattern.
          >>>>
          >>>So I need a pattern that matches any space or carriage return that is
          >>>not within an html tag.
          >>>>
          >>><font face="arial" size=2>test</font><p>yo this is a test
          >>>>
          >>>Splitting this text should return an array containing:
          >>>1: <font face="arial" size=2>test</font>
          >>>2: yo
          >>>3: this
          >>>4: is
          >>>5: a
          >>>6: test
          >>Suppose you have
          >>>
          >> var s = '<font face="arial" size=2>test</font><p>yo this is a test';
          >>>
          >>Either you have a weird idea of "html tag" (HTML is an acronym, BTW),
          >>or (which is more likely) instead you want the resulting array to be
          >>>
          >> ['', 'test', '', 'yo', 'this', 'is', 'a', 'test']
          >No, that is not what he said.
          >
          Are you stupid or what?
          If imitation is the sincerest form of flattery, you flatter the shit out
          of me sometimes.
          I *know* that this is not what he said. However, I don't think he really
          knows what he wants.
          He knows *exactly* what he wants, he just isn't sure how to implement
          it. Subtle difference my friend.

          EOD.

          --
          Randy
          Chance Favors The Prepared Mind
          comp.lang.javascript FAQ - http://jibbering.com/faq/index.html
          Javascript Best Practices - http://www.JavascriptToolbox.com/bestpractices/


          • Thomas 'PointedEars' Lahn

            #20
            Re: RegExp split for Spell Check

            Randy Webb wrote:
            Thomas 'PointedEars' Lahn said the following on 12/4/2007 4:25 PM:
            >Randy Webb wrote:
            >>Thomas 'PointedEars' Lahn said the following on 12/4/2007 2:38 PM:
            >>>SmokeWilliams wrote:
            >>>>I need a pattern that will split without replacing. So I need to
            >>>>split on spaces or carriage returns, but not spaces that are within
            >>>>html tags. [...]
            >>>>>
            >>>>So I need a pattern that matches any space or carriage return that is
            >>>>not within an html tag.
            >>>>>
            >>>><font face="arial" size=2>test</font><p>yo this is a test
            >>>>>
            >>>>Splitting this text should return an array containing:
            >>>>1: <font face="arial" size=2>test</font>
            >>>>2: yo
            >>>>3: this
            >>>>4: is
            >>>>5: a
            >>>>6: test
            >>>Suppose you have
            >>>>
            >>> var s = '<font face="arial" size=2>test</font><p>yo this is a test';
            >>>>
            >>>Either you have a weird idea of "html tag" (HTML is an acronym, BTW),
            >>>or (which is more likely) instead you want the resulting array to be
            >>>>
            >>> ['', 'test', '', 'yo', 'this', 'is', 'a', 'test']
            >>No, that is not what he said.
            >Are you stupid or what?
            >
            If imitation is the sincerest form of flattery, you flatter the shit out
            of me sometimes.
            I have tried your lower level of language this time so that you may better
            understand me. Obviously, I am not very good at it. Sorry.
            >I *know* that this is not what he said. However, I don't think he really
            >knows what he wants.
            >
            He knows *exactly* what he wants,
            No, he does not. He has the idea of a spell checker and the problem that he
            can not simply split on whitespace because of whitespace in HTML tags:
            >>>>So I need a pattern that matches any space or carriage return that is
            >>>>not within an html tag.
            But his example says otherwise. So I assumed that what he really wants
            is to exclude the tags from consideration which leaves only the plain text
            for the spell check. And that my solution allows. It is also a solution
            that works with any script engine that supports regular expressions, while
            solutions including negative lookahead or non-greedy matching do not.
            However, these solutions posted so far have assumed that he wants exactly
            the result he has posted; they did not take the practical application, or
            rather the lack thereof, of that result into account, and they did not take
            into account that he may have posted merely a bad example.
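
The solution referred to here is not quoted in this part of the thread. As a guess at what a lookahead-free approach could look like, the sketch below treats a whole tag or a run of whitespace as a separator. The regular expression and the helper name `wordsOutsideTags` are assumptions for illustration only, not the code that was originally posted.

```javascript
// Assumed pattern: a whole tag (<...>) or a run of whitespace is a separator.
// It uses neither lookahead nor non-greedy quantifiers.
var SEPARATOR = /<[^>]*>|\s+/;

// Hypothetical helper: returns only the non-empty pieces between separators,
// i.e. the plain-text words a spell checker would look at.
function wordsOutsideTags(html) {
  var pieces = html.split(SEPARATOR);
  var words = [];
  for (var i = 0; i < pieces.length; i++) {
    if (pieces[i] !== "") {
      words.push(pieces[i]);
    }
  }
  return words;
}

var s = '<font face="arial" size=2>test</font><p>yo this is a test';
var raw = s.split(SEPARATOR);   // may contain empty strings where tags were
var words = wordsOutsideTags(s);

console.log(raw);
console.log(words);
```

Dropping the empty strings in a separate loop keeps the word list the same whether or not a given script engine returns empty pieces from split().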

            Of course, much of that remains speculation until he clears that up. But I
            have explicitly stated in my posting that my solution was _not_ to provide
            the result that he posted last. And so your followup to that was
            unnecessary and the style in which it was written was completely uncalled
            for. If you only had read not only *his* postings, but also *my* posting
            *properly*.
            he just isn't sure how to implement it.
            He is pretty much unsure about anything so far.
            Subtle difference my friend.
            Don't be familiar with me until you have earned it.


            PointedEars


            • Randy Webb

              #21
              Re: RegExp split for Spell Check

              Thomas 'PointedEars' Lahn said the following on 12/4/2007 7:48 PM:
              Randy Webb wrote:
              >Thomas 'PointedEars' Lahn said the following on 12/4/2007 4:25 PM:
              >>Randy Webb wrote:
              >>>Thomas 'PointedEars' Lahn said the following on 12/4/2007 2:38 PM:
              >>>>SmokeWilliams wrote:
              >>>>>I need a pattern that will split without replacing. So I need to
              >>>>>split on spaces or carriage returns, but not spaces that are within
              >>>>>html tags. [...]
              >>>>>>
              >>>>>So I need a pattern that matches any space or carriage return that is
              >>>>>not within an html tag.
              >>>>>>
              >>>>><font face="arial" size=2>test</font><p>yo this is a test
              >>>>>>
              >>>>>Splitting this text should return an array containing:
              >>>>>1: <font face="arial" size=2>test</font>
              >>>>>2: yo
              >>>>>3: this
              >>>>>4: is
              >>>>>5: a
              >>>>>6: test
              >>>>Suppose you have
              >>>>>
              >>>> var s = '<font face="arial" size=2>test</font><p>yo this is a test';
              >>>>>
              >>>>Either you have a weird idea of "html tag" (HTML is an acronym, BTW),
              >>>>or (which is more likely) instead you want the resulting array to be
              >>>>>
              >>>> ['', 'test', '', 'yo', 'this', 'is', 'a', 'test']
              >>>No, that is not what he said.
              >>Are you stupid or what?
              >If imitation is the sincerest form of flattery, you flatter the shit out
              >of me sometimes.
              >
              I have tried your lower level of language this time so that you may better
              understand me. Obviously, I am not very good at it. Sorry.
              You still flatter the shit out of me. You are failing in your endeavor,
              but I am still flattered.
              >>I *know* that this is not what he said. However, I don't think he really
              >>knows what he wants.
              >He knows *exactly* what he wants,
              >
              No, he does not. He has the idea of a spell checker and the problem that he
              can not simply split on whitespace because of whitespace in HTML tags:
              He wants a spell checker. He knows what he wants, he just doesn't know
              the best way to implement it. And, the "best solution" doesn't involve a
              regular expression, just a simple split on the text.
              >Subtle difference my friend.
              >
              Don't be familiar with me until you have earned it.
              If I thought, for one minute, that you even came close to understanding
              what I wrote really means, then your statement wouldn't be so ludicrous.

              I am way more familiar with you than I ever cared to be.

              --
              Randy
              Chance Favors The Prepared Mind
              comp.lang.javascript FAQ - http://jibbering.com/faq/index.html
              Javascript Best Practices - http://www.JavascriptToolbox.com/bestpractices/


              • Thomas 'PointedEars' Lahn

                #22
                Re: RegExp split for Spell Check

                Randy Webb wrote:
                Thomas 'PointedEars' Lahn said the following on 12/4/2007 7:48 PM:
                >Randy Webb wrote:
                >>Thomas 'PointedEars' Lahn said the following on 12/4/2007 4:25 PM:
                >>>I *know* that this is not what he said. However, I don't think he really
                >>>knows what he wants.
                >>He knows *exactly* what he wants,
                >No, he does not. He has the idea of a spell checker and the problem that he
                >can not simply split on whitespace because of whitespace in HTML tags:
                ^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^ ^^^^^^^^^
                He wants a spell checker. He knows what he wants, he just doesn't know
                the best way to implement it. And, the "best solution" doesn't involve a
                regular expression, just a simple split on the text.
                Learn to read.


                PointedEars


                • Randy Webb

                  #23
                  Re: RegExp split for Spell Check

                  Thomas 'PointedEars' Lahn said the following on 12/5/2007 6:49 PM:
                  Randy Webb wrote:
                  >Thomas 'PointedEars' Lahn said the following on 12/4/2007 7:48 PM:
                  >>Randy Webb wrote:
                  >>>Thomas 'PointedEars' Lahn said the following on 12/4/2007 4:25 PM:
                  >>>>I *know* that this is not what he said. However, I don't think he really
                  >>>>knows what he wants.
                  >>>He knows *exactly* what he wants,
                  >>No, he does not. He has the idea of a spell checker and the problem that he
                  >>can not simply split on whitespace because of whitespace in HTML tags:
                  ^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^ ^^^^^^^^^
                  >He wants a spell checker. He knows what he wants, he just doesn't know
                  >the best way to implement it. And, the "best solution" doesn't involve a
                  >regular expression, just a simple split on the text.
                  >
                  Learn to read.
                  "He knows what he wants, he just doesn't know how to implement it".
                  Let's see if I can read what I write and if you can understand what I write.

                  He wants a spell checker.
                  He thinks that to implement it he has to split the HTML code.
                  He doesn't.
                  To implement a spell checker, you simply read the *text* of the page.
                  Then you split the text on spaces.
                  Then you spell check the words from the *text* of the page.
                  You find mis-spelled words and notify the user.
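
The steps above can be sketched roughly as follows. How the text is read from the page is an assumption here (in a browser it might come from something like the edited document's body text), and the tiny word list and the function name `findMisspelled` are hypothetical. A real checker would need a proper dictionary.

```javascript
// Hypothetical tiny dictionary; a real spell checker needs a full word list.
var dictionary = { "yo": true, "this": true, "is": true, "a": true, "test": true };

// Takes the plain *text* of the page (no markup) and returns the unknown words.
function findMisspelled(text, knownWords) {
  var words = text.split(/\s+/);   // split the text on runs of whitespace
  var misspelled = [];
  for (var i = 0; i < words.length; i++) {
    // Strip leading/trailing non-letters and ignore case before the lookup.
    var word = words[i].replace(/^[^A-Za-z]+|[^A-Za-z]+$/g, "").toLowerCase();
    if (word !== "" && !Object.prototype.hasOwnProperty.call(knownWords, word)) {
      misspelled.push(word);
    }
  }
  return misspelled;
}

// Assumption: pageText stands in for text read from the page; here it is
// just a plain string so the sketch can run anywhere.
var pageText = "test yo this is a tset";
var result = findMisspelled(pageText, dictionary);

// The list of words to report to the user.
console.log(result);
```

No markup is parsed at any point, since the input is already the text of the page rather than its HTML source.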

                  Now, since he thinks he has to read the HTML code to implement it, then
                  he doesn't know how to implement it.

                  Your problem isn't that you can't read, you refuse to understand what
                  you read sometimes.

                  But, you did manage to flatter me again.

                  --
                  Randy
                  Chance Favors The Prepared Mind
                  comp.lang.javascript FAQ - http://jibbering.com/faq/index.html
                  Javascript Best Practices - http://www.JavascriptToolbox.com/bestpractices/
