Standard newline character

This topic is closed.
  • bgold12

    Standard newline character

    Will newlines ever be standardized? I recently discovered that in a
    textarea, internet explorer adds \r\n for every newline you enter,
    while firefox adds \n. I know \r is also used in some places... will
    this ever be fixed?
  • Grant

    #2
    Re: Standard newline character

    On Mon, 13 Oct 2008 00:27:49 -0700 (PDT), bgold12 <bgold12@gmail.com> wrote:
    >Will newlines ever be standardized? I recently discovered that in a
    >textarea, internet explorer adds \r\n for every newline you enter,
    >while firefox adds \n. I know \r is also used in some places... will
    >this ever be fixed?
    No; windows uses \r\n, Mac uses \r and unix + linux do it right with \n
    for a newline :) I find when processing text areas it's best to filter
    all control chars to spaces then remove dup'd spaces, gets rid of nasties
    and reformats the thing nicely.
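A minimal sketch of the filtering Grant describes, assuming "control chars" means the C0 range plus DEL. The function name `flatten_textarea` is made up for illustration, and the final trim of leading and trailing spaces is an extra assumption. Note that this throws away the line structure entirely, which is only acceptable if the text is meant to end up on one line.

```python
import re

def flatten_textarea(text: str) -> str:
    """Replace control characters with spaces, then collapse runs of spaces."""
    # C0 control characters (this includes \r, \n and \t) and DEL become spaces.
    spaced = re.sub(r"[\x00-\x1f\x7f]", " ", text)
    # Collapse runs of two or more spaces, then trim the ends (assumption).
    return re.sub(r" {2,}", " ", spaced).strip()

print(flatten_textarea("line one\r\nline two\n\tline three\r"))
# prints: line one line two line three
```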

    Grant.
    --


    • viza

      #3
      Re: Standard newline character

      On Mon, 13 Oct 2008 00:27:49 -0700, bgold12 wrote:
      >Will newlines ever be standardized? I recently discovered that in a
      >textarea, internet explorer adds \r\n for every newline you enter, while
      >firefox adds \n. I know \r is also used in some places... will this ever
      >be fixed?
      They are already standardized.

      All text that you send or receive over the network must use \r\n.

      All text that you read or write in C uses \n.

      If you are using some interpreted language (eg javascript), read the
      manual for your interpreter to see which standard it follows.


      • Jukka K. Korpela

        #4
        Re: Standard newline character

        viza wrote:
        >On Mon, 13 Oct 2008 00:27:49 -0700, bgold12 wrote:
        >>
        >>Will newlines ever be standardized? I recently discovered that in a
        >>textarea, internet explorer adds \r\n for every newline you enter,
        >>while firefox adds \n. I know \r is also used in some places... will
        >>this ever be fixed?
        >>
        >They are already standardized.
        Indeed they are. There are many standards to choose from, so nobody needs to
        be nonstandard!
        >All text that you send or receive over the network must use \r\n.
        Not correct. There is no such standard. Internet message headers have an
        Internet-standard, but that's just headers, not e.g. HTML or form data.
        >All text that you read or write in C uses \n.
        Incorrect and irrelevant to the topic.

        Regarding HTML, consult the HTML specifications. They specify, partly
        somewhat sloppily, that browsers should or shall accept any of CR, LF, and
        CR LF as end of line.

        The question was about form data from textarea elements. There the
        "standard" says that browsers shall canonicalize line ends to CR LF (which
        is what you seem to mean by \r\n, which is _not_ an HTML notation or
        metanotation).

        I just tested how Firefox behaves, and it correctly sends CR LF (encoded as
        %0D%0A) when a newline is entered in a textarea. So the original question
        probably reflects a misunderstanding or misinterpretation.
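As a rough illustration of the canonicalization described above, here is a sketch in Python (not browser code) that normalizes any mix of line ends to CR LF and then percent-encodes the value the way a URL-encoded form field would be. The helper names `to_crlf` and `encode_field` are hypothetical; `quote_plus` is the standard-library function for this kind of encoding.

```python
import re
from urllib.parse import quote_plus

def to_crlf(text: str) -> str:
    """Turn CR LF, bare CR and bare LF into CR LF."""
    # "\r\n" is tried first, so an existing CR LF pair is left as one line end.
    return re.sub(r"\r\n|\r|\n", "\r\n", text)

def encode_field(value: str) -> str:
    """Percent-encode a form field value after normalizing its line ends."""
    return quote_plus(to_crlf(value))

print(encode_field("first\nsecond"))
# prints: first%0D%0Asecond
```

Whatever the client does, a server-side script that cares about line ends is probably safest normalizing its input in the same way before processing it.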

        --
        Yucca, http://www.cs.tut.fi/~jkorpela/


        • viza

          #5
          Re: Standard newline character

          Hi

          On Mon, 13 Oct 2008 20:03:40 +0300, Jukka K. Korpela wrote:
          >viza wrote:
          >>On Mon, 13 Oct 2008 00:27:49 -0700, bgold12 wrote:
          >>>
          >>>Will newlines ever be standardized? I recently discovered that in a
          >>>textarea, internet explorer adds \r\n for every newline you enter,
          >>>while firefox adds \n. I know \r is also used in some places... will
          >>>this ever be fixed?
          >>All text that you send or receive over the network must use \r\n.
          >>
          >Not correct. There is no such standard. Internet message headers have an
          >Internet-standard, but that's just headers, not e.g. HTML or form data.
          Perhaps a little over-generalized. At least HTML message bodies in email
          must use CR LF (or be base64 encoded etc).
          >>All text that you read or write in C uses \n.
          >>
          >Incorrect and irrelevant to the topic.
          This is both correct and relevant.

          See C99 7.19.2.1 and 7.19.2.2.

          For example, if you fopen() a text file (in text mode) on Windows, then
          the CR LF pairs on disk are required to be converted to LF before you
          fgetc() them, and LF is converted back to CR LF when you write.
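The same idea can be tried without a C compiler. The sketch below is a Python analogue, not the C library behaviour itself: Python's text mode translates CR LF (and bare CR) to "\n" on every platform when reading, whereas C text mode does whatever the host system's convention requires. The file contents are made up for the demonstration.

```python
import os
import tempfile

# Write raw bytes so the file contains CR LF on every platform.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"one\r\ntwo\r\n")
    path = tmp.name

# Default text mode (newline=None): line ends arrive as "\n".
with open(path, "r", encoding="ascii") as f:
    translated = f.read()

# newline="" switches translation off, so the bytes on disk show through.
with open(path, "r", encoding="ascii", newline="") as f:
    untranslated = f.read()

os.remove(path)
print(repr(translated))    # prints: 'one\ntwo\n'
print(repr(untranslated))  # prints: 'one\r\ntwo\r\n'
```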
          >The question was about form data from textarea elements. There the
          >"standard" says that browsers shall canonicalize line ends to CR LF
          >I just tested how Firefox behaves, and it correctly sends CR LF (encoded
          >as %0D%0A) when a newline is entered in a textarea. So the original
          >question probably reflects a misunderstanding or misinterpretation.
          So the firefox js interpreter behaves the same as the c library does?


          • Jukka K. Korpela

            #6
            Re: Standard newline character

            viza wrote:
            >Perhaps a little over-generalized.
            _What_ is over-generalized in your opinion? Surely your claim that "All text
            that you send or receive over the network must use \r\n" was worse than
            over-generalization: patently false.
            >At least HTML message bodies in
            >email must use CR LF (or be base64 encoded etc).
            HTML in email is off-topic in this group, and it's typically nonstandard and
            program-dependent, and it can surely be encoded in many ways.
            >>>All text that you read or write in C uses \n.
            >>>
            >>Incorrect and irrelevant to the topic.
            >>
            >This is both correct and relevant.
            C is surely not HTML.
            >See C99 7.19.2.1 and 7.19.2.2.
            Why would I do that? You pick up one version of the C language and ask me to
            look at some vaguely identified document on it, in a context where C is
            definitely off-topic. And I have used C decades ago and I know that it has
            been used even in systems that have _no_ line break characters (but
            designate line structure otherwise).
            >>I just tested how Firefox behaves, and it correctly sends CR LF
            >>(encoded as %0D%0A) when a newline is entered in a textarea. So the
            >>original question probably reflects a misunderstanding or
            >>misinterpretation.
            >>
            >So the firefox js interpreter behaves the same as the c library does?
            Where did you pick up "js" now? Why would I use "js" when I want to test
            basic form data handling in a browser?

            You seem to contribute nothing but confusion in this discussion. Please do
            not hesitate to come back when you have something to say about HTML
            authoring for the WWW and you have some idea of what you are talking about.

            --
            Yucca, http://www.cs.tut.fi/~jkorpela/


            • Ben C

              #7
              Re: Standard newline character

              On 2008-10-13, Jukka K. Korpela <jkorpela@cs.tut.fi> wrote:
              >viza wrote:
              [...]
              >>See C99 7.19.2.1 and 7.19.2.2.
              >>
              >Why would I do that? You pick up one version of the C language and ask me to
              >look at some vaguely identified document on it, in a context where C is
              >definitely off-topic. And I have used C decades ago and I know that it has
              >been used even in systems that have _no_ line break characters (but
              >designate line structure otherwise).
              You're still supposed to write, for example, fputc('\n', stdout). fputc
              will take care of writing whatever bytes are supposed to represent the
              end of a line on the system you're on. I think that's viza's point.
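To make the point concrete without C, here is a small Python sketch of the same mechanism: the program always writes "\n", and the text layer decides which bytes come out. The helper `bytes_written` is hypothetical; the `newline` argument of `io.TextIOWrapper` stands in for "the system you're on", and passing `None` selects the host convention (`os.linesep`).

```python
import io
import os
from typing import Optional

def bytes_written(text: str, newline: Optional[str]) -> bytes:
    """Return the bytes a text-mode stream emits when `text` is written."""
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding="ascii", newline=newline)
    stream.write(text)
    stream.flush()  # push buffered text through to the byte buffer
    return raw.getvalue()

print(bytes_written("end of line\n", "\r\n"))  # prints: b'end of line\r\n'
print(bytes_written("end of line\n", "\n"))    # prints: b'end of line\n'
print(bytes_written("end of line\n", None) == ("end of line" + os.linesep).encode("ascii"))
# prints: True
```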


              • Jukka K. Korpela

                #8
                Re: Standard newline character

                Ben C wrote:
                >You're still supposed to write, for example, fputc('\n', stdout).
                >fputc will take care of writing whatever bytes are supposed to
                >represent the end of a line on the system you're on. I think that's
                >viza's point.
                I don't think so, and I don't think viza has any point. The fact that the
                notation '\n' will be implemented in a system-dependent manner speaks
                against viza's off-topic rants.

                --
                Yucca, http://www.cs.tut.fi/~jkorpela/


                • Andy Dingley

                  #9
                  Re: Standard newline character

                  On 14 Oct, 02:22, "David E. Ross" <nob...@nowhere.not> wrote:
                  >Once upon a time, long, long ago -- before the Internet, even before
                  >computers -- printed messages could be sent electrically via telex.
                  >Someone would sit at a keyboard and type; the message would print
                  >remotely.  Transmissions of 9,600 bits per second (9.6 kbps) were
                  >considered fast.
                  Telex didn't run anything close to 9600 bps, although Baudot did (like
                  ASCII) support CR & LF as separate codes. Teleprinters might have run
                  at 9600, but not Telex.
                  >Telex printers had "flying print heads".
                  Telex printers had all sorts of things. My teleprinter 7 had type bars
                  like an old manual typewriter and, like a typewriter, moved the
                  _paper_ carriage from side to side.


                  • Jim Moe

                    #10
                    Re: Standard newline character

                    On 10/13/08 06:22 pm, David E. Ross wrote:
                    >
                    >History lesson follows:
                    >
                    >Once upon a time, long, long ago -- before the Internet, even before
                    >computers -- printed messages could be sent electrically via telex.
                    >Someone would sit at a keyboard and type; the message would print
                    >remotely. Transmissions of 9,600 bits per second (9.6 kbps) were
                    >considered fast.
                    >
                    >Another option was the use of paper tape which allowed an operator to
                    >prepare a transmission offline at a paper punch keyboard. Then the paper
                    >tape was loaded into the telex for transmission. Hanging chads plagued
                    >more than just elections.
                    >
                    >While Windows uses CR/LF (only one CR) and UNIX uses merely CR, there
                    >might still be some systems that use CR/CR/LF.
                    >
                    UNIX uses LF as a newline character, not CR.

                    --
                    jmm (hyphen) list (at) sohnen-moe (dot) com
                    (Remove .AXSPAMGN for email)


                    • Jim Moe

                      #11
                      Re: Standard newline character

                      On 10/13/08 11:47 am, viza wrote:
                      >>
                      >>>All text that you send or receive over the network must use \r\n.
                      >>>
                      >>Not correct. There is no such standard. Internet message headers have an
                      >>Internet-standard, but that's just headers, not e.g. HTML or form data.
                      >>
                      >Perhaps a little over-generalized. At least HTML message bodies in email
                      >must use CR LF (or be base64 encoded etc).
                      >
                      You are confusing the RFC822 standard with HTML. Not the same at all.
                      RFC822 defines a newline as cr-lf; the pair is a requirement, the
                      characters separately are not allowed.
                      HTML has no such requirement. In fact a 100,000 character page can
                      contain no newline characters whatsoever, of any variety. Browsers are
                      designed to recognize the various newline combinations and treat them all
                      as whitespace. Web servers simply do not care.

                      --
                      jmm (hyphen) list (at) sohnen-moe (dot) com
                      (Remove .AXSPAMGN for email)


                      • viza

                        #12
                        Re: Standard newline character

                        On Tue, 14 Oct 2008 15:13:26 -0700, Jim Moe wrote:
                        >On 10/13/08 11:47 am, viza wrote:
                        >>>
                        >>>>All text that you send or receive over the network must use \r\n.
                        >>>>
                        >>>Not correct. There is no such standard. Internet message headers have
                        >>>an Internet-standard, but that's just headers, not e.g. HTML or form
                        >>>data.
                        >>>
                        >>Perhaps a little over-generalized. At least HTML message bodies in
                        >>email must use CR LF (or be base64 encoded etc).
                        >HTML has no such requirement. In fact a 100,000 character page can
                        >contain no newline characters whatsoever, of any variety. Browsers are
                        >designed to recognize the various newline combinations and treat them
                        >all as whitespace. Web servers simply do not care.
                        HTML sent over HTTP _from_ a server can use any or no newlines, but the
                        O.P. is programming for a textarea on the client side, so all text that
                        *he/she* sends over the network should use CR LF.

                        >You are confusing the RFC822 standard with HTML. Not the same at all.
                        >RFC822 defines a newline as cr-lf; the pair is a requirement, the
                        >characters separately are not allowed.
                        (PS: You mean rfc2822 - the (obsolete) rfc822 did allow bare CR and LF)
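For anyone who wants to check a message against the "pair only" rule discussed here, a rough sketch in Python follows. It only looks for a CR without a following LF or an LF without a preceding CR; it is not a full RFC 2822 validator, and the name `has_bare_cr_or_lf` is made up for illustration.

```python
import re

# A CR not followed by LF, or an LF not preceded by CR.
BARE_CR_OR_LF = re.compile(rb"\r(?!\n)|(?<!\r)\n")

def has_bare_cr_or_lf(message: bytes) -> bool:
    """Return True if CR or LF occurs anywhere outside a CR LF pair."""
    return BARE_CR_OR_LF.search(message) is not None

print(has_bare_cr_or_lf(b"Subject: test\r\n\r\nbody\r\n"))  # prints: False
print(has_bare_cr_or_lf(b"Subject: test\n\nbody\n"))        # prints: True
```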





                        • David Stone

                          #13
                          Re: Standard newline character

                          In article <avSdnZHRX601i2jVnZ2dnUVZ_gGdnZ2d@giganews.com>,
                          Jim Moe <jmm-list.AXSPAMGN@sohnen-moe.com> wrote:
                          >On 10/13/08 06:22 pm, David E. Ross wrote:
                          >>
                          >>History lesson follows:
                          >>
                          >>Once upon a time, long, long ago -- before the Internet, even before
                          >>computers -- printed messages could be sent electrically via telex.
                          >>Someone would sit at a keyboard and type; the message would print
                          >>remotely. Transmissions of 9,600 bits per second (9.6 kbps) were
                          >>considered fast.
                          >>Another option was the use of paper tape which allowed an operator to
                          >>prepare a transmission offline at a paper punch keyboard. Then the paper
                          >>tape was loaded into the telex for transmission. Hanging chads plagued
                          >>more than just elections.
                          >>
                          >>While Windows uses CR/LF (only one CR) and UNIX uses merely CR, there
                          >>might still be some systems that use CR/CR/LF.
                          >UNIX uses LF as a newline character, not CR.
                          Probably thinking of classic Mac OS, which use(d|s) CR


                          • Dr J R Stockton

                            #14
                            Re: Standard newline character

                            In comp.infosystems.www.authoring.html message
                            <2pKdncS9lcAahGjVnZ2dnUVZ_r_inZ2d@giganews.com>, Tue, 14 Oct 2008 15:13:26,
                            Jim Moe <jmm-list.AXSPAMGN@sohnen-moe.com> posted:
                            >RFC822 defines a newline as cr-lf; the pair is a requirement, the
                            >characters separately are not allowed.
                            >HTML has no such requirement. In fact a 100,000 character page can
                            >contain no newline characters whatsoever, of any variety. Browsers are
                            >designed to recognize the various newline combinations and treat them all
                            >as whitespace. Web servers simply do not care.
                            HTML must recognise [CR|LF]+ newlines within <pre>. I don't know
                            whether all combinations and permutations of [CR|LF]+ give the same
                            number of new lines in all systems.
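One way to see why the count can differ is to pick a rule and apply it to a few sequences. The sketch below assumes the rule that each of CR LF, CR and LF is exactly one line end, with CR LF matched as a pair first; this is an assumption for illustration, and software that follows a different rule will give different numbers. The name `count_line_breaks` is hypothetical.

```python
import re

# "\r\n" comes first so a CR LF pair is matched as one unit.
LINE_END = re.compile(r"\r\n|\r|\n")

def count_line_breaks(sequence: str) -> int:
    """Count line ends, treating a CR LF pair as a single break."""
    return len(LINE_END.findall(sequence))

for seq in ["\r\n", "\n\r", "\r\r\n", "\n\n", "\r\n\r\n"]:
    print(repr(seq), count_line_breaks(seq))
# prints:
# '\r\n' 1
# '\n\r' 2
# '\r\r\n' 2
# '\n\n' 2
# '\r\n\r\n' 2
```

Under this rule the order matters: LF CR is two breaks while CR LF is one, so "all permutations" cannot give the same count.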

                            --
                            (c) John Stockton, Surrey, UK. ?@merlyn.demon.co.uk Turnpike v6.05 MIME.
                            Web <URL:http://www.merlyn.demon.co.uk/> - FAQish topics, acronyms, & links.
                            Proper <= 4-line sig. separator as above, a line exactly "-- " (SonOfRFC1036)
                            Do not Mail News to me. Before a reply, quote with ">" or "" (SonOfRFC1036)


                            • David E. Ross

                              #15
                              Re: Standard newline character

                              On 10/14/2008 3:01 PM, Jim Moe wrote [in part]:
                              >On 10/13/08 06:22 pm, I previously wrote [also in part]:
                              >>Another option was the use of paper tape which allowed an operator to
                              >>prepare a transmission offline at a paper punch keyboard. Then the paper
                              >>tape was loaded into the telex for transmission. Hanging chads plagued
                              >>more than just elections.
                              >>While Windows uses CR/LF (only one CR) and UNIX uses merely CR, there
                              >>might still be some systems that use CR/CR/LF.
                              >>
                              >UNIX uses LF as a newline character, not CR.
                              >
                              Yes. I misread my own notes from a study I did 5 years ago.

                              --

                              David E. Ross
                              <http://www.rossde.com/>

                              Q: What's a President Bush cocktail?
                              A: Business on the rocks.
