C Text/Binary Files

  • Bartc

    C Text/Binary Files

    The stdin/stdout files of C seem to be always in Text mode.

    Is there any way of running a C program so that these (but especially
    stdout) are in Binary mode instead?

    (I'm in the process of wrapping a different language around C which doesn't
    want the concept of text and binary files. But if I output a string such as
    "ONE\nTWO\n", this will behave differently between stdout and a regular
    (binary) file. Examples on my OS:

    "\n" Output 13,10 in text mode; 10 in binary mode
    "\w" Output 13,13,10 in text mode; 13,10 in binary mode

    (\w is a new escape code equivalent to \r\n). Workarounds will be awkward
    (and I could never stop \n expanding to 13,10 for stdout) so would be nice
    to avoid them)

    --
    Thanks,

    Bartc


  • pete

    #2
    Re: C Text/Binary Files

    Bartc wrote:
    > The stdin/stdout files of C seem to be always in Text mode.

    That's what the standards say.

    --
    pete


    • santosh

      #3
      Re: C Text/Binary Files

      Bartc wrote:
      > The stdin/stdout files of C seem to be always in Text mode.
      >
      > Is there any way of running a C program so that these (but especially
      > stdout) are in Binary mode instead?

      Yes, use freopen like this:

      FILE *fin, *fout, *ferr;

      fin = freopen(NULL, "rb", stdin);
      fout = freopen(NULL, "ab", stdout);
      ferr = freopen(NULL, "ab", stderr);

      You could assign the return value to stdin, stdout and stderr themselves,
      but the standard says that they are not necessarily modifiable
      lvalues. However, it will probably work on most systems you would care
      about.

      See section 7.19.5.4 of the standard for details.

      <snip>


      • Ben Bacarisse

        #4
        Re: C Text/Binary Files

        santosh <santosh.k83@gmail.com> writes:
        > Bartc wrote:
        >> The stdin/stdout files of C seem to be always in Text mode.
        >>
        >> Is there any way of running a C program so that these (but especially
        >> stdout) are in Binary mode instead?
        >
        > Yes, use freopen like this:
        >
        > FILE *fin, *fout, *ferr;
        >
        > fin = freopen(NULL, "rb", stdin);
        > fout = freopen(NULL, "ab", stdout);
        > ferr = freopen(NULL, "ab", stderr);
        >
        > You could assign the return value to stdin, stdout and stderr themselves,
        > but the standard says that they are not necessarily modifiable
        > lvalues. However, it will probably work on most systems you would care
        > about.

        More importantly, freopen is not guaranteed to do what Bartc wants.
        Thus the key information is not what the standard says but what
        typical implementations do on systems where there is a difference
        between text and binary mode. I can give only one data point:
        lcc-win32 returns NULL from the freopen call (for stdout).

        --
        Ben.


        • santosh

          #5
          Re: C Text/Binary Files

          Ben Bacarisse wrote:
          > santosh <santosh.k83@gmail.com> writes:
          >> Bartc wrote:
          >>> The stdin/stdout files of C seem to be always in Text mode.
          >>>
          >>> Is there any way of running a C program so that these (but
          >>> especially stdout) are in Binary mode instead?
          >>
          >> Yes, use freopen like this:
          >>
          >> FILE *fin, *fout, *ferr;
          >>
          >> fin = freopen(NULL, "rb", stdin);
          >> fout = freopen(NULL, "ab", stdout);
          >> ferr = freopen(NULL, "ab", stderr);
          >>
          >> You could assign the return value to stdin, stdout and stderr themselves,
          >> but the standard says that they are not necessarily modifiable
          >> lvalues. However, it will probably work on most systems you would care
          >> about.
          >
          > More importantly, freopen is not guaranteed to do what Bartc wants.
          > Thus the key information is not what the standard says but what
          > typical implementations do on systems where there is a difference
          > between text and binary mode. I can give only one data point:
          > lcc-win32 returns NULL from the freopen call (for stdout).

          And it fails for stdin in the same way. It's perhaps surprising that it
          should fail. What difficulty would an implementation like lcc-win32 have
          with this?


          • Ali Karaali

            #6
            Re: C Text/Binary Files

            > See section 7.19.5.4 of the standard for details.
            >
            > <snip>

            Anyway, how can I find the standard documents?


            • Bartc

              #7
              Re: C Text/Binary Files

              "Bartc" <bc@freeuk.com> wrote in message
              news:LCA7k.14088$E41.12364@text.news.virginmedia.com...
              > The stdin/stdout files of C seem to be always in Text mode.

              Thanks for the replies.

              I think if I use exclusively "\w" for newlines (i.e. "\r\n") in strings and
              internal functions that generate newlines, then this will work for binary
              files.

              For stdout, this will generate (on my OS) 13,13,10, but for console output
              that is not critical. The only problem arises when stdout is piped or
              redirected to a file at the OS command line; then I will need to process
              the output to take out the extra 13.

              I can live with that.

              I have tried freopen() as suggested, and that sort of works, but output is
              then sent to a file. So this is an alternative perhaps to redirection by the
              OS and the mode /will/ be binary.

              --
              Bartc



              • Ben Bacarisse

                #8
                Re: C Text/Binary Files

                Ali Karaali <alicpp@gmail.com> writes:
                >> See section 7.19.5.4 of the standard for details.
                >>
                >> <snip>
                >
                > Anyway, how can I find the standard documents?

                http://www.open-std.org/jtc1/sc22/wg...docs/n1256.pdf is a recent
                draft of C99. The same site has lots of other useful documents.

                --
                Ben.


                • rahul

                  #9
                  Re: C Text/Binary Files

                  On Jun 23, 8:21 am, santosh <santosh....@gmail.com> wrote:
                  > And it similarly fails for stdin too. It's perhaps surprising that it
                  > should fail. What difficulty would an implementation like lcc-win32 have
                  > with this?

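                  The following works for me:

                  #include <stdio.h>
                  #include <stdlib.h>

                  int
                  main(void) {
                      /* Assigning to stdout is not portable: the standard does not
                         require stdout to be a modifiable lvalue. */
                      stdout = freopen(NULL, "ab", stdout);
                      return stdout != NULL ? EXIT_SUCCESS : EXIT_FAILURE;
                  }

                  I compiled that with gcc on Linux. It probably works because Linux/Unix
                  does not distinguish between text and binary mode.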


                  • Richard Bos

                    #10
                    Re: C Text/Binary Files

                    santosh <santosh.k83@gmail.com> wrote:
                    > Bartc wrote:
                    >> The stdin/stdout files of C seem to be always in Text mode.
                    >>
                    >> Is there any way of running a C program so that these (but especially
                    >> stdout) are in Binary mode instead?
                    >
                    > Yes, use freopen like this:
                    >
                    > FILE *fin, *fout, *ferr;
                    >
                    > fin = freopen(NULL, "rb", stdin);
                    > fout = freopen(NULL, "ab", stdout);
                    > ferr = freopen(NULL, "ab", stderr);

                    Note that freopen() with a null first argument is new in C99. In C89,
                    you had to give a new file name.

                    Richard


                    • Keith Thompson

                      #11
                      Re: C Text/Binary Files

                      "Bartc" <bc@freeuk.com> writes:
                      > "Bartc" <bc@freeuk.com> wrote in message
                      > news:LCA7k.14088$E41.12364@text.news.virginmedia.com...
                      >> The stdin/stdout files of C seem to be always in Text mode.
                      >
                      > Thanks for the replies.
                      >
                      > I think if I use exclusively "\w" for newlines (i.e. "\r\n") in strings and
                      > internal functions that generate newlines, then this will work for binary
                      > files.
                      [...]

                      What is "\w"? It's not a standard escape sequence; its value is
                      implementation-defined.

                      --
                      Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.net/~kst>
                      Nokia
                      "We must do something. This is something. Therefore, we must do this."
                      -- Antony Jay and Jonathan Lynn, "Yes Minister"


                      • Harald van Dĳk

                        #12
                        Re: C Text/Binary Files

                        On Mon, 23 Jun 2008 12:59:01 -0700, Keith Thompson wrote:
                        > What is "\w"? It's not a standard escape sequence; its value is
                        > implementation-defined.

                        "\w" does not match the syntax of a string literal, so by the rule of the
                        longest match this is tokenised as {"}{\}{w}{"} . The behaviour is
                        undefined if a double quote character occurs as a single token. There need
                        not be any value given to "\w", and if there is, it need not be documented.


                        • Bartc

                          #13
                          Re: C Text/Binary Files

                          "Keith Thompson" <kst@cts.com> wrote in message
                          news:lzk5gfkjne.fsf@stalkings.ghoti.net...
                          > "Bartc" <bc@freeuk.com> writes:
                          >> "Bartc" <bc@freeuk.com> wrote in message
                          >> news:LCA7k.14088$E41.12364@text.news.virginmedia.com...
                          >>> The stdin/stdout files of C seem to be always in Text mode.
                          >>
                          >> Thanks for the replies.
                          >>
                          >> I think if I use exclusively "\w" for newlines (i.e. "\r\n") in strings
                          >> and internal functions that generate newlines, then this will work for
                          >> binary files.
                          > [...]
                          >
                          > What is "\w"? It's not a standard escape sequence; its value is
                          > implementation-defined.

                          Sorry. In my original post I'd indicated (not very clearly) that \w was a
                          new escape in a language I was creating to wrap around C.

                          So it's not a C escape but is translated to "\r\n". It represents a
                          'Windows newline' (or, more generally, the full newline sequence used
                          in the target OS).

                          --
                          Bartc






                          • Keith Thompson

                            #14
                            Re: C Text/Binary Files

                            Harald van Dĳk <truedfx@gmail.com> writes:
                            > On Mon, 23 Jun 2008 12:59:01 -0700, Keith Thompson wrote:
                            >> What is "\w"? It's not a standard escape sequence; its value is
                            >> implementation-defined.
                            >
                            > "\w" does not match the syntax of a string literal, so by the rule
                            > of the longest match this is tokenised as {"}{\}{w}{"} . The
                            > behaviour is undefined if a double quote character occurs as a
                            > single token. There need not be any value given to "\w", and if
                            > there is, it need not be documented.

                            I believe you're mostly or entirely right, and I was wrong.

                            I misinterpreted the second clause of C99 6.4.4.4p10:

                            The value of an integer character constant containing more than
                            one character (e.g., 'ab'), or containing a character or escape
                            sequence that does not map to a single-byte execution character,
                            is implementation-defined.

                            as applying to things like '\w'; instead, it applies to things like
                            '\xffffffff'.

                            "\w" is split into 4 preprocessing tokens:
                            " \ w "
                            The " is not a punctuator; it's in the category "each non-white-space
                            character that cannot be one of the above" (C99 6.4), which means the
                            behavior is undefined.

                            In addition, though, this preprocessing token cannot be converted to a
                            token. The constraint in 6.4p2 is:

                            Each preprocessing token that is converted to a token shall have
                            the lexical form of a keyword, an identifier, a constant, a string
                            literal, or a punctuator.

                            So, assuming that "\w" isn't surrounded by something like "#if 0"
                            ... "#endif", it would seem to be a constraint violation. By C99
                            5.1.1.3, this requires a diagnostic even if the behavior is also
                            undefined.

                            Note that, by the same reasoning, "abcd\w" should be split into 5
                            preprocessing tokens:

                            " abcd \ w "

                            which just seems confusing. But since such cases require a diagnostic
                            anyway, a compiler doesn't actually have to pp-tokenize it that way;
                            as long as it prints a warning or error message, its job is done.

                            Still, I think the description would have been simpler if a \ followed
                            by any character in a character or string literal were allowed
                            syntactically, with a constraint limiting the following character to
                            the ones that are specified. Then "\w" would be a single pp-token and
                            a single token (a string literal), with a diagnostic required because
                            of the constraint violation.

                            --
                            Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.net/~kst>
                            Nokia
                            "We must do something. This is something. Therefore, we must do this."
                            -- Antony Jay and Jonathan Lynn, "Yes Minister"

                            Comment

                            • Antoninus Twink

                              #15
                              Re: C Text/Binary Files

                              On 23 Jun 2008 at 21:43, Keith Thompson wrote:
                              䡡牡汤⁶慮 ⁄ij欠㱴牵敤晸䁧 浡楬⹣潭㸠睲楴敳?
                              㸠佮⁍潮Ⱐ㈳⁊畮′〰 㠠ㄲ㨵㤺〱‭ 〷〰Ⱐ䭥楴栠周潭灳潮⁷ 牯瑥?
                              You may want to check whether you really mean to include this header:
                              Content-Type: text/plain; charset=utf-16be
