Error in scanf implementation or error in example in standard?

  • Simon Biber

    Error in scanf implementation or error in example in standard?

    The following Example 3 is given in the 1999 C standard for the function
    fscanf:

    EXAMPLE 3 To accept repeatedly from stdin a quantity, a unit of
    measure, and an item name:

    #include <stdio.h>
    /* ... */
    int count; float quant; char units[21], item[21];
    do {
        count = fscanf(stdin, "%f%20s of %20s", &quant, units, item);
        fscanf(stdin,"%*[^\n]");
    } while (!feof(stdin) && !ferror(stdin));

    If the stdin stream contains the following lines:

    2 quarts of oil
    -12.8degrees Celsius
    lots of luck
    10.0LBS of
    dirt
    100ergs of energy

    the execution of the above example will be analogous to the following
    assignments:

    quant = 2; strcpy(units, "quarts"); strcpy(item, "oil");
    count = 3;
    quant = -12.8; strcpy(units, "degrees");
    count = 2; // "C" fails to match "o"
    count = 0; // "l" fails to match "%f"
    quant = 10.0; strcpy(units, "LBS"); strcpy(item, "dirt");
    count = 3;
    count = 0; // "100e" fails to match "%f"
    count = EOF;

    I have tested several implementations and none of them get the last case
    right. In no case does fscanf return 0 indicating failure to match
    "100ergs of energy" with "%f".

    The actual behaviour varies. Some will match '100', leaving the 'e' unread:

    quant = 100; strcpy(units, "ergs"); strcpy(item, "energy");
    count = 3;

    While others will match '100e', leaving the 'r' unread:

    quant = 100; strcpy(units, "rgs"); strcpy(item, "energy");
    count = 3;

    But I am yet to come across an implementation that does what the example
    in the Standard specifies. Is this a failure in the implementations or
    in the Standard itself?
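
For anyone who wants to repeat the test on their own C library, here is a minimal sketch. It assumes sscanf treats a string the same way fscanf treats a stream, so no input file is needed; probe_line and report_line are hypothetical helper names, not something from the post. What gets reported for "100ergs of energy" depends on the implementation, so no expected output is given for that line.

```c
#include <stdio.h>

/* Hypothetical helper: applies the Standard's format string to one line
   held in memory and returns the count that sscanf reports. */
int probe_line(const char *line, float *quant, char units[21], char item[21])
{
    *quant = 0.0f;
    units[0] = item[0] = '\0';
    return sscanf(line, "%f%20s of %20s", quant, units, item);
}

/* Prints the outcome for one line so that C libraries can be compared. */
void report_line(const char *line)
{
    float quant;
    char units[21], item[21];
    int count = probe_line(line, &quant, units, item);

    printf("%-22s -> count=%d quant=%g units=\"%s\" item=\"%s\"\n",
           line, count, quant, units, item);
}
```

Calling report_line("100ergs of energy") on each platform of interest shows which of the behaviours described above that library exhibits. The fourth input spans two lines in the example, so it would be passed as "10.0LBS of\ndirt".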

    --
    Simon.
  • Robert Gamble

    #2
    Re: Error in scanf implementation or error in example in standard?

    Simon Biber wrote:
    > The following Example 3 is given in the 1999 C standard for the function
    > fscanf:
    >
    > EXAMPLE 3 To accept repeatedly from stdin a quantity, a unit of
    > measure, and an item name:
    >
    > #include <stdio.h>
    > /* ... */
    > int count; float quant; char units[21], item[21];
    > do {
    >     count = fscanf(stdin, "%f%20s of %20s", &quant, units, item);
    >     fscanf(stdin,"%*[^\n]");
    > } while (!feof(stdin) && !ferror(stdin));
    >
    > If the stdin stream contains the following lines:
    >
    > 2 quarts of oil
    > -12.8degrees Celsius
    > lots of luck
    > 10.0LBS of
    > dirt
    > 100ergs of energy
    >
    > the execution of the above example will be analogous to the following
    > assignments:
    >
    > quant = 2; strcpy(units, "quarts"); strcpy(item, "oil");
    > count = 3;
    > quant = -12.8; strcpy(units, "degrees");
    > count = 2; // "C" fails to match "o"
    > count = 0; // "l" fails to match "%f"
    > quant = 10.0; strcpy(units, "LBS"); strcpy(item, "dirt");
    > count = 3;
    > count = 0; // "100e" fails to match "%f"
    > count = EOF;
    >
    > I have tested several implementations and none of them get the last case
    > right. In no case does fscanf return 0 indicating failure to match
    > "100ergs of energy" with "%f".
    >
    > The actual behaviour varies. Some will match '100', leaving the 'e' unread:
    >
    > quant = 100; strcpy(units, "ergs"); strcpy(item, "energy");
    > count = 3;
    >
    > While others will match '100e', leaving the 'r' unread:
    >
    > quant = 100; strcpy(units, "rgs"); strcpy(item, "energy");
    > count = 3;
    >
    > But I am yet to come across an implementation that does what the example
    > in the Standard specifies. Is this a failure in the implementations or
    > in the Standard itself?

    Footnote 245 in n1124 states:
    "fscanf pushes back at most one input character onto the input stream.
    Therefore, some sequences that are acceptable to strtod, strtol, etc.,
    are unacceptable to fscanf."

    This was added in response to Defect Report #22:
    http://www.open-std.org/jtc1/sc22/wg...cs/dr_022.html.

    In the case of 100ergs, fscanf reads up to the r before realizing that
    the "e" is not part of the number but at that point, given the one
    character pushback limit, it can no longer push back both the r and the
    e so it has to return with a failure since 100e is not a valid number.
    Many implementations allow more than one character pushback and take
    advantage of this fact in the fscanf function, hence the behavior you
    have seen. Technically such implementations are in violation of the
    Standard but the sentiment among many implementors is that the
    requirement is unjustified and they just live with non-conformance.
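
The mechanics described above can be sketched with a toy model. Everything here is an assumption made for illustration: struct src stands in for a stream that has exactly one slot of pushback, scan_float_1pb is a hypothetical name, and only decimal forms are handled (no hex, inf or nan). It is not how any particular library implements fscanf.

```c
#include <ctype.h>
#include <stdio.h>   /* EOF */
#include <stdlib.h>  /* strtod, size_t */

/* Toy input source with exactly one slot of pushback. */
struct src {
    const char *p;   /* next unread character */
    int pushed;      /* nonzero while the single pushback slot is in use */
};

static int src_get(struct src *s)
{
    s->pushed = 0;
    return *s->p ? (unsigned char)*s->p++ : EOF;
}

/* Un-reads the most recently read character; a second pushback in a row
   is refused. */
static int src_unget(struct src *s)
{
    if (s->pushed)
        return EOF;
    s->pushed = 1;
    s->p--;
    return 0;
}

/* Appends c to buf (if there is room) and returns the next character. */
static int take(struct src *s, char *buf, size_t *n, int c)
{
    if (*n < 63)
        buf[(*n)++] = (char)c;
    return src_get(s);
}

/* Keeps reading while the text could still grow into a number, pushes back
   only the one character that stopped the scan, then checks what was
   collected. Returns 1 on success, 0 on a matching failure. */
int scan_float_1pb(struct src *s, double *out)
{
    char buf[64], *end;
    size_t n = 0;
    int digits = 0;
    int c = src_get(s);

    while (isspace(c))
        c = src_get(s);
    if (c == '+' || c == '-')
        c = take(s, buf, &n, c);
    while (isdigit(c)) { digits++; c = take(s, buf, &n, c); }
    if (c == '.') {
        c = take(s, buf, &n, c);
        while (isdigit(c)) { digits++; c = take(s, buf, &n, c); }
    }
    if (digits > 0 && (c == 'e' || c == 'E')) {
        c = take(s, buf, &n, c);          /* the 'e' is now consumed */
        if (c == '+' || c == '-')
            c = take(s, buf, &n, c);
        while (isdigit(c))
            c = take(s, buf, &n, c);
    }
    if (c != EOF)
        src_unget(s);                     /* one character goes back, no more */
    buf[n] = '\0';

    double v = strtod(buf, &end);
    if (n == 0 || end != buf + n)         /* e.g. "100e" is not a number */
        return 0;
    *out = v;
    return 1;
}
```

Fed "100ergs of energy", the scanner has already consumed the e by the time it sees the r, can only return the r, and is left holding "100e", which is the matching failure the Standard's example shows.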

    Robert Gamble


    • Richard Heathfield

      #3
      Re: Error in scanf implementation or error in example in standard?

      Robert Gamble said:

      <snip>
      > Many implementations allow more than one character pushback and take
      > advantage of this fact in the fscanf function, hence the behavior you
      > have seen. Technically such implementations are in violation of the
      > Standard

      Why?

      --
      Richard Heathfield
      "Usenet is a strange place" - dmr 29/7/1999

      email: rjh at the above domain, - www.


      • Richard Bos

        #4
        Re: Error in scanf implementation or error in example in standard?

        "Robert Gamble" <rgamble99@gmail.com> wrote:
        > Simon Biber wrote:
        >> I have tested several implementations and none of them get the last case
        >> right. In no case does fscanf return 0 indicating failure to match
        >> "100ergs of energy" with "%f".
        >>
        >> The actual behaviour varies. Some will match '100', leaving the 'e' unread:
        >>
        >> quant = 100; strcpy(units, "ergs"); strcpy(item, "energy");
        >> count = 3;
        >>
        >> While others will match '100e', leaving the 'r' unread:
        >>
        >> quant = 100; strcpy(units, "rgs"); strcpy(item, "energy");
        >> count = 3;
        >>
        >> But I am yet to come across an implementation that does what the example
        >> in the Standard specifies. Is this a failure in the implementations or
        >> in the Standard itself?
        >
        > Footnote 245 in n1124 states:
        > "fscanf pushes back at most one input character onto the input stream.
        > Therefore, some sequences that are acceptable to strtod, strtol, etc.,
        > are unacceptable to fscanf."

        True, but feetneet are not normative. Strictly speaking, there's a
        conflict between two parts of the Standard; the footnote makes it clear
        that in this case, the intent was that the part about a single character
        pushback buffer for input streams overrides the part about parsing
        numbers, but it would be better if that were made explicit in the
        _normative_ text in the next TC.

        Richard


        • Robert Gamble

          #5
          Re: Error in scanf implementation or error in example in standard?

          Richard Heathfield wrote:
          > Robert Gamble said:
          >
          > <snip>
          >
          >> Many implementations allow more than one character pushback and take
          >> advantage of this fact in the fscanf function, hence the behavior you
          >> have seen. Technically such implementations are in violation of the
          >> Standard
          >
          > Why?

          Why what? Why such implementations aren't technically conforming?
          Because implementations that push back more than one character in the
          fscanf family of functions do not behave as mandated by the Standard.
          I am not sure I understand your point, perhaps you could clarify with a
          multi-word response.

          Robert Gamble


          • Ben Pfaff

            #6
            Re: Error in scanf implementation or error in example in standard?

            "Robert Gamble" <rgamble99@gmail.com> writes:
            > Many implementations allow more than one character pushback and take
            > advantage of this fact in the fscanf function, hence the behavior you
            > have seen. Technically such implementations are in violation of the
            > Standard but the sentiment among many implementors is that the
            > requirement is unjustified and they just live with non-conformance.

            C99 says this in the description of the ungetc function:

            One character of pushback is guaranteed. If the ungetc
            function is called too many times on the same stream without
            an intervening read or file positioning operation on that
            stream, the operation may fail.

            I don't see a requirement that *only* one character of pushback
            be supported, only that *at least* one character of pushback be
            supported.
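
The "at least one" reading can be checked directly. A small sketch, assuming a readable stream is at hand; pushback_depth is a hypothetical helper name. Only the first ungetc is guaranteed to succeed, and any depth beyond that is whatever the library happens to provide, so no particular result is claimed here.

```c
#include <stdio.h>

/* Counts how many consecutive ungetc calls succeed on f, up to limit,
   then reads the pushed-back characters off again. */
int pushback_depth(FILE *f, int limit)
{
    int n = 0;

    while (n < limit && ungetc('a' + n % 26, f) != EOF)
        n++;
    for (int i = 0; i < n; i++)
        (void)getc(f);   /* drain what was pushed back */
    return n;
}
```

For example, pushback_depth(stdin, 8) returns at least 1 on a conforming library; a larger value only says that this library's streams offer more, which the ungetc text permits.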

            On the other hand, perhaps you are talking about the following
            text and footnote for the fscanf function; your article seems
            ambiguous to me:

            An input item is read from the stream, unless the specification
            includes an n specifier. An input item is defined as the
            longest sequence of input characters which does not exceed
            any specified field width and which is, or is a prefix of, a
            matching input sequence.242)

            242) fscanf pushes back at most one input character onto the
            input stream. Therefore, some sequences that are
            acceptable to strtod, strtol, etc., are unacceptable
            to fscanf.
            --
            int main(void){char p[]="ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz.\
             \n",*q="kl BIcNBFr.NKEzjwCIxNJC";int i=sizeof p/2;char *strchr();int putchar(\
             );while(*q){i+=strchr(p,*q++)-p;if(i>=(int)sizeof p)i-=sizeof p-1;putchar(p[i]\
             );}return 0;}


            • Robert Gamble

              #7
              Re: Error in scanf implementation or error in example in standard?

              Richard Bos wrote:
              > "Robert Gamble" <rgamble99@gmail.com> wrote:
              >
              >> Simon Biber wrote:
              >>> I have tested several implementations and none of them get the last case
              >>> right. In no case does fscanf return 0 indicating failure to match
              >>> "100ergs of energy" with "%f".
              >>>
              >>> The actual behaviour varies. Some will match '100', leaving the 'e' unread:
              >>>
              >>> quant = 100; strcpy(units, "ergs"); strcpy(item, "energy");
              >>> count = 3;
              >>>
              >>> While others will match '100e', leaving the 'r' unread:
              >>>
              >>> quant = 100; strcpy(units, "rgs"); strcpy(item, "energy");
              >>> count = 3;
              >>>
              >>> But I am yet to come across an implementation that does what the example
              >>> in the Standard specifies. Is this a failure in the implementations or
              >>> in the Standard itself?
              >>
              >> Footnote 245 in n1124 states:
              >> "fscanf pushes back at most one input character onto the input stream.
              >> Therefore, some sequences that are acceptable to strtod, strtol, etc.,
              >> are unacceptable to fscanf."
              >
              > True, but feetneet are not normative.

              And neither are the examples for that matter.

              > Strictly speaking, there's a
              > conflict between two parts of the Standard; the footnote makes it clear
              > that in this case, the intent was that the part about a single character
              > pushback buffer for input streams overrides the part about parsing
              > numbers, but it would be better if that were made explicit in the
              > _normative_ text in the next TC.

              I certainly agree that it would have been nice if this footnote was
              part of the normative text; I don't know why it isn't. The only
              conflict I see is the one in the C90 Standard, which was addressed in DR
              022. Although the footnote is non-normative, it, along with the example
              and the fact that it was the result of a DR, makes it abundantly clear
              what the intent was. If intent isn't enough, though, a careful reading
              of the normative changes made in the DR (which were carried through to
              C99) yields the same result, even if not as clearly spelled out.

              Robert Gamble


              • Richard Heathfield

                #8
                Re: Error in scanf implementation or error in example in standard?

                Robert Gamble said:
                > Richard Heathfield wrote:
                >> Robert Gamble said:
                >>
                >> <snip>
                >>
                >>> Many implementations allow more than one character pushback and take
                >>> advantage of this fact in the fscanf function, hence the behavior you
                >>> have seen. Technically such implementations are in violation of the
                >>> Standard
                >>
                >> Why?
                >
                > Why what? Why such implementations aren't technically conforming?

                Yes.

                > Because implementations that push back more than one character in the
                > fscanf family of functions do not behave as mandated by the Standard.

                Why not?

                > I am not sure I understand your point, perhaps you could clarify with a
                > multi-word response.

                <grin> Okay, let me see if I can make it clearer. Maybe you're right that
                providing more than the minimum level of pushback is against the rules, and
                maybe you're not. I can see why an implementation *must* provide at least
                one character of pushback, but where is it *forbidden* from providing more?

                --
                Richard Heathfield
                "Usenet is a strange place" - dmr 29/7/1999

                email: rjh at the above domain, - www.


                • Robert Gamble

                  #9
                  Re: Error in scanf implementation or error in example in standard?

                  Ben Pfaff wrote:
                  > "Robert Gamble" <rgamble99@gmail.com> writes:
                  >
                  >> Many implementations allow more than one character pushback and take
                  >> advantage of this fact in the fscanf function, hence the behavior you
                  >> have seen. Technically such implementations are in violation of the
                  >> Standard but the sentiment among many implementors is that the
                  >> requirement is unjustified and they just live with non-conformance.
                  >
                  > C99 says this in the description of the ungetc function:
                  >
                  > One character of pushback is guaranteed. If the ungetc
                  > function is called too many times on the same stream without
                  > an intervening read or file positioning operation on that
                  > stream, the operation may fail.
                  >
                  > I don't see a requirement that *only* one character of pushback
                  > be supported, only that *at least* one character of pushback be
                  > supported.

                  I was speaking specifically of the pushback used by the fscanf function,
                  which I thought was clear based on the footnote that I cited. I
                  certainly did not mean to imply that multi-character pushback was
                  itself incorrect, just its use in the fscanf function.

                  > On the other hand, perhaps you are talking about the following
                  > text and footnote for the fscanf function; your article seems
                  > ambiguous to me:
                  >
                  > An input item is read from the stream, unless the specification
                  > includes an n specifier. An input item is defined as the
                  > longest sequence of input characters which does not exceed
                  > any specified field width and which is, or is a prefix of, a
                  > matching input sequence.242)
                  >
                  > 242) fscanf pushes back at most one input character onto the
                  > input stream. Therefore, some sequences that are
                  > acceptable to strtod, strtol, etc., are unacceptable
                  > to fscanf.

                  Right, I cited this exact footnote at the beginning of my original
                  article; perhaps you missed it.

                  Robert Gamble


                  • Robert Gamble

                    #10
                    Re: Error in scanf implementation or error in example in standard?

                    Richard Heathfield wrote:
                    > Robert Gamble said:
                    >> Richard Heathfield wrote:
                    >>> Robert Gamble said:
                    >>>
                    >>> <snip>
                    >>>
                    >>>> Many implementations allow more than one character pushback and take
                    >>>> advantage of this fact in the fscanf function, hence the behavior you
                    >>>> have seen. Technically such implementations are in violation of the
                    >>>> Standard
                    >>>
                    >>> Why?
                    >>
                    >> Why what? Why such implementations aren't technically conforming?
                    >
                    > Yes.
                    >
                    >> Because implementations that push back more than one character in the
                    >> fscanf family of functions do not behave as mandated by the Standard.
                    >
                    > Why not?
                    >
                    >> I am not sure I understand your point, perhaps you could clarify with a
                    >> multi-word response.
                    >
                    > <grin> Okay, let me see if I can make it clearer. Maybe you're right that
                    > providing more than the minimum level of pushback is against the rules, and
                    > maybe you're not. I can see why an implementation *must* provide at least
                    > one character of pushback, but where is it *forbidden* from providing more?

                    First let me make clear that I am speaking only of the pushback
                    functionality used within the fscanf function itself, not the pushback
                    capability of a stream in general (which can provide pushback for as
                    many characters as it desires); at least one person seems to have been
                    confused by my original statement. The Standard makes it clear through
                    the discussed footnote and example that the behavior shall be as if a
                    maximum of one character of pushback was used within the fscanf
                    function ("fscanf pushes back at most one input character onto the
                    input stream"). Although footnotes and examples are non-normative, the
                    same meaning is supported by the normative changes that were provoked
                    by DR 022:

                    In subclause 7.9.6.2, page 135, lines 31-33, change:

                    "An input item is defined as the longest matching sequence of input
                    characters, unless that exceeds a specified field width, in which case
                    it is the initial subsequence of that length in the sequence."

                    to:

                    "An input item is defined as the longest sequence of input characters
                    which does not exceed any specified field width and which is, or is a
                    prefix of, a matching input sequence."
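
The effect of the revised wording can be made concrete with a small sketch. It rests on some assumptions: the "longest matching sequence" of the old text is taken to be what strtod consumes (one possible reading), only decimal forms are considered, and the input has no leading white space. The function names are hypothetical.

```c
#include <ctype.h>
#include <stdlib.h>

/* Length of the longest sequence that is itself a match, taken here to be
   whatever strtod consumes. */
size_t item_len_longest_match(const char *s)
{
    char *end;

    (void)strtod(s, &end);
    return (size_t)(end - s);
}

/* Length of the longest sequence that is, or is a prefix of, a match
   (decimal forms only in this sketch). */
size_t item_len_prefix_of_match(const char *s)
{
    size_t i = 0, digits = 0;

    if (s[i] == '+' || s[i] == '-')
        i++;
    while (isdigit((unsigned char)s[i])) { i++; digits++; }
    if (s[i] == '.') {
        i++;
        while (isdigit((unsigned char)s[i])) { i++; digits++; }
    }
    if (digits > 0 && (s[i] == 'e' || s[i] == 'E')) {
        i++;                         /* "100e" is a prefix of e.g. "100e5" */
        if (s[i] == '+' || s[i] == '-')
            i++;
        while (isdigit((unsigned char)s[i]))
            i++;
    }
    return i;
}

/* The directive succeeds only when the whole item is itself a match. */
int item_is_match(const char *s, size_t len)
{
    return len > 0 && item_len_longest_match(s) == len;
}
```

For "100ergs of energy" the first function gives 3 ("100") while the second gives 4 ("100e"), and item_is_match reports that the 4-character item is not a number. That is the difference between the two quoted sentences, and it is why the example ends with count = 0.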

                    Robert Gamble


                    • Richard Heathfield

                      #11
                      Re: Error in scanf implementation or error in example in standard?

                      Robert Gamble said:
                      > The Standard makes it clear through
                      > the discussed footnote and example that the behavior shall be as if a
                      > maximum of one character of pushback was used within the fscanf
                      > function ("fscanf pushes back at most one input character onto the
                      > input stream").

                      Thank you for clarifying. I know you know that footn...

                      > Although footnotes and examples are non-normative,

                      ...er, quite so.

                      > the
                      > same meaning is supported by the normative changes that were provoked
                      > by DR 022:

                      I've found DRs 200 through 294. I can't find DR 022.

                      --
                      Richard Heathfield
                      "Usenet is a strange place" - dmr 29/7/1999

                      email: rjh at the above domain, - www.


                      • Robert Gamble

                        #12
                        Re: Error in scanf implementation or error in example in standard?

                        Richard Heathfield wrote:
                        > Robert Gamble said:
                        >
                        >> The Standard makes it clear through
                        >> the discussed footnote and example that the behavior shall be as if a
                        >> maximum of one character of pushback was used within the fscanf
                        >> function ("fscanf pushes back at most one input character onto the
                        >> input stream").
                        >
                        > Thank you for clarifying. I know you know that footn...
                        >
                        >> Although footnotes and examples are non-normative,
                        >
                        > ...er, quite so.
                        >
                        >> the
                        >> same meaning is supported by the normative changes that were provoked
                        >> by DR 022:
                        >
                        > I've found DRs 200 through 294. I can't find DR 022.

                        The link was in my original response:
                        http://www.open-std.org/jtc1/sc22/wg...cs/dr_022.html.

                        Robert Gamble


                        • Ben Pfaff

                          #13
                          Re: Error in scanf implementation or error in example in standard?

                          "Robert Gamble" <rgamble99@gmail.com> writes:
                          >> On the other hand, perhaps you are talking about the following
                          >> text and footnote for the fscanf function; your article seems
                          >> ambiguous to me:
                          > [...]
                          > Right, I cited this exact footnote at the beginning of my original
                          > article; perhaps you missed it.

                          I did miss it, sorry.
                          --
                          Ben Pfaff
                          email: blp@cs.stanford.edu
                          web: http://benpfaff.org


                          • Richard Heathfield

                            #14
                            Re: Error in scanf implementation or error in example in standard?

                            Robert Gamble said:
                            > Richard Heathfield wrote:
                            > <snip>
                            >>
                            >> I've found DRs 200 through 294. I can't find DR 022.
                            >
                            > The link was in my original response:
                            > http://www.open-std.org/jtc1/sc22/wg...cs/dr_022.html.

                            My apologies for missing that. It does appear that the text under
                            consideration is still non-normative. (It's footnote 245 in n1124, for
                            those who don't know).

                            Having said that, I accept that the intent of footnotes, despite their
                            non-normative status, is to clarify the meaning of the Standard, so I'll
                            shut up now.

                            (Like I care ***so much*** about fscanf! :-) )

                            --
                            Richard Heathfield
                            "Usenet is a strange place" - dmr 29/7/1999

                            email: rjh at the above domain, - www.


                            • Simon Biber

                              #15
                              Re: Error in scanf implementation or error in example in standard?

                              Robert Gamble wrote:
                              > Simon Biber wrote:
                              >> I have tested several implementations and none of them get the last case
                              >> right. In no case does fscanf return 0 indicating failure to match
                              >> "100ergs of energy" with "%f".
                              >>
                              >> The actual behaviour varies. Some will match '100', leaving the 'e' unread:
                              >>
                              >> quant = 100; strcpy(units, "ergs"); strcpy(item, "energy");
                              >> count = 3;
                              >>
                              >> While others will match '100e', leaving the 'r' unread:
                              >>
                              >> quant = 100; strcpy(units, "rgs"); strcpy(item, "energy");
                              >> count = 3;
                              >>
                              >> But I am yet to come across an implementation that does what the example
                              >> in the Standard specifies. Is this a failure in the implementations or
                              >> in the Standard itself?
                              >
                              > Footnote 245 in n1124 states:
                              > "fscanf pushes back at most one input character onto the input stream.
                              > Therefore, some sequences that are acceptable to strtod, strtol, etc.,
                              > are unacceptable to fscanf."
                              >
                              > This was added in response to Defect Report #22:
                              > http://www.open-std.org/jtc1/sc22/wg...cs/dr_022.html.
                              >
                              > In the case of 100ergs, fscanf reads up to the r before realizing that
                              > the "e" is not part of the number but at that point, given the one
                              > character pushback limit, it can no longer push back both the r and the
                              > e so it has to return with a failure since 100e is not a valid number.

                              But none of the implementations I tested actually return with a failure!

                              Try it -- whether on Solaris, Linux, Cygwin, DJGPP, Microsoft VC++,
                              LCC-Win32 or Turbo C, none of them return with a failure. They interpret
                              100e as a valid number, with the value 100.

                              That's the real bug, not the quibble on how many characters are pushed back.

                              --
                              Simon.
