Correct behaviour of scanf and sscanf

  • Rob Thorpe

    Correct behaviour of scanf and sscanf

    Given the code:-

    r = sscanf (s, "%lf", x);

    What is the correct output if the string s is simply "-" ?

    If "-" is considered the beginning of a number, that has been
    cut-short then the correct output is that r = EOF. If it is taken to
    be a letter in the stream, then the output should be r = 0, as far as
    I can see. My compiler gives EOF.

    Does the standard specify which is correct?
  • Mac

    #2
    Re: Correct behaviour of scanf and sscanf

    On Mon, 14 Mar 2005 10:44:13 -0800, Rob Thorpe wrote:
    > Given the code:-
    >
    > r = sscanf (s, "%lf", x);
    >
    > What is the correct output if the string s is simply "-" ?
    >
    > If "-" is considered the beginning of a number, that has been
    > cut-short then the correct output is that r = EOF. If it is taken to
    > be a letter in the stream, then the output should be r = 0, as far as
    > I can see. My compiler gives EOF.
    >
    > Does the standard specify which is correct?

    From C-89 (or a reasonable facsimile thereof)

    ************* beginning of excerpt ***************

    4.9.6.6 The sscanf function

    [snip]

    Returns

    The sscanf function returns the value of the macro EOF if an input
    failure occurs before any conversion. Otherwise, the sscanf function
    returns the number of input items assigned, which can be fewer than
    provided for, or even zero, in the event of an early matching failure.

    *********** end of excerpt **************


    The behavior you describe sounds correct to me. sscanf() is supposed to
    return EOF if an input failure occurs. Elsewhere, it says that
    encountering the end of the string is identical to encountering the end
    of file in an fscanf() call. So this is just like calling fscanf() on a text
    file which contains the single character '-'.

    HTH

    --Mac


    • Eric Sosman

      #3
      Re: Correct behaviour of scanf and sscanf

      Rob Thorpe wrote:
      > Given the code:-
      >
      > r = sscanf (s, "%lf", x);
      >
      > What is the correct output if the string s is simply "-" ?
      >
      > If "-" is considered the beginning of a number, that has been
      > cut-short then the correct output is that r = EOF. If it is taken to
      > be a letter in the stream, then the output should be r = 0, as far as
      > I can see. My compiler gives EOF.
      >
      > Does the standard specify which is correct?

      Haven't seen a reply in the several hours since I first
      saw the message, so (fools rush in ...) I'll hazard a guess.

      The Standard speaks of two sorts of failure for *scanf()
      directives: "matching failure," which amounts to an input
      sequence that doesn't satisfy the syntax required by the
      directive, and "input failure," meaning that the source of
      input characters dried up -- for the stream-input versions
      this means EOF was sensed, and for sscanf() it means the
      scan reached the end of the string. On a matching failure,
      *scanf() stops operating and returns the number of items
      already matched and converted (0, in your example), while
      for an input failure *scanf() returns EOF.

      So the question boils down to this: When "%lf" processes
      "-", is the failure a matching failure or an input failure?

      One point of view considers it a matching failure. The
      characters for "%f" are supposed to be something strtod() would
      swallow: an optional all-whitespace prefix, an optional sign,
      and then a character string resembling a floating-point constant
      as written in C source code. The string "-" doesn't match this
      description (the floating-point constant is missing), so it could
      be called a matching failure.

      The other viewpoint holds that no "mismatch" was detected
      before end-of-string, so it's an input failure. The sequence
      of characters is perfectly good as the prefix of a valid match,
      and the only thing preventing a complete match is the fact that
      no more input was available. Hence (says this argument), the
      operation ends with an input failure rather than a matching
      failure, and EOF is the correct return value.

      IMHO the Standard is not entirely clear about which argument
      is correct: is an incomplete prefix a failure to match, or a
      failure of the input source? To me, the language of the Standard
      doesn't shine enough light into this dark corner -- but if anyone
      happens to have a torch to hand, I'd welcome illumination ...

      Trying to put myself in the place of an implementor, I'd
      imagine the input failure (EOF) outcome would be "more natural,"
      but I don't think the Standard's language actually says so in
      so many words.

      The fool has rushed in; tread, o ye angels!

      --
      Eric Sosman
      esosman@acm-dot-org.invalid


      • CBFalconer

        #4
        Re: Correct behaviour of scanf and sscanf

        Rob Thorpe wrote:
        >
        > Given the code:-
        >
        > r = sscanf (s, "%lf", x);
        >
        > What is the correct output if the string s is simply "-" ?
        >
        > If "-" is considered the beginning of a number, that has been
        > cut-short then the correct output is that r = EOF. If it is
        > taken to be a letter in the stream, then the output should be
        > r = 0, as far as I can see. My compiler gives EOF.
        >
        > Does the standard specify which is correct?

        The problem is that detection of such a lone '-' requires reading
        two characters, the second of which is not a digit. C only
        guarantees one level of pushback via ungetc, so whatever routine is
        doing the parsing (such as scanf) cannot leave the input stream
        unaltered and report 'No number available'. With string sources
        this obviously does not apply. So the question is "should the
        string and stream operations function in the same manner". A
        similar (but worse) problem arises after reading the e in floating
        point formats. "3.0e-x" should return 3.0 and have to push back
        three chars.

        I think the proper thing would be to guarantee three level
        pushback, maybe in C05. This requires defining what is to be done
        when an application attempts excess pushback :-)

        --
        "If you want to post a followup via groups.google.com, don't use
        the broken "Reply" link at the bottom of the article. Click on
        "show options" at the top of the article, then click on the
        "Reply" at the bottom of the article headers." - Keith Thompson



        • tigervamp

          #5
          Re: Correct behaviour of scanf and sscanf


          CBFalconer wrote:
          > Rob Thorpe wrote:
          > >
          > > Given the code:-
          > >
          > > r = sscanf (s, "%lf", x);
          > >
          > > What is the correct output if the string s is simply "-" ?
          > >
          > > If "-" is considered the beginning of a number, that has been
          > > cut-short then the correct output is that r = EOF. If it is
          > > taken to be a letter in the stream, then the output should be
          > > r = 0, as far as I can see. My compiler gives EOF.
          > >
          > > Does the standard specify which is correct?
          >
          > The problem is that detection of such a lone '-' requires reading
          > two characters, the second of which is not a digit. C only
          > guarantees one level of pushback via ungetc,

          Footnote 242 in section 7.19.6.2 (fscanf) indicates that a _maximum_ of
          one character can be pushed back; the standard does not say that sscanf
          behaves differently.

          > so whatever routine is
          > doing the parsing (such as scanf) cannot leave the input stream
          > unaltered and report 'No number available'. With string sources
          > this obviously does not apply. So the question is "should the
          > string and stream operations function in the same manner".

          According to the standard they should.

          > A similar (but worse) problem arises after reading the e in floating
          > point formats. "3.0e-x" should return 3.0 and have to push back
          > three chars.

          fscanf should consume the "3.0e-x", recognize a matching failure, push
          the "x" back onto the stream, and return 0. This is the behavior
          defined in example 3 of section 7.19.6.2p20 (fscanf), and again the
          standard specifies that sscanf should behave the same.

          I think that in the OP's case the behavior should be similar and the
          return value should be 0; glibc does this and I think they are right
          here. From what I can tell, EOF is never returned if a character was
          read (regardless of whether it matched or was pushed back), but I may
          well be wrong.

          > I think the proper thing would be to guarantee three level
          > pushback, maybe in C05. This requires defining what is to be done
          > when an application attempts excess pushback :-)

          I think the current behavior is pretty clear and well-defined, but
          notable implementations do not follow it: Solaris and glibc
          both push back multiple characters to achieve the output you described
          above. Details about the Solaris behavior can be found at
          http://iforce.sun.com/protected/sola...eral/scanf.txt;
          apparently there are instances that require at least 5 characters to be
          pushed back to follow the behavior you outlined.

          > --
          > "If you want to post a followup via groups.google.com, don't use
          > the broken "Reply" link at the bottom of the article. Click on
          > "show options" at the top of the article, then click on the
          > "Reply" at the bottom of the article headers." - Keith Thompson

          Rob Gamble


          • Rob Thorpe

            #6
            Re: Correct behaviour of scanf and sscanf

            CBFalconer <cbfalconer@yahoo.com> wrote in message news:<423671FC.CD3BFE56@yahoo.com>...
            > Rob Thorpe wrote:
            > >
            > > Given the code:-
            > >
            > > r = sscanf (s, "%lf", x);
            > >
            > > What is the correct output if the string s is simply "-" ?
            > >
            > > If "-" is considered the beginning of a number, that has been
            > > cut-short then the correct output is that r = EOF. If it is
            > > taken to be a letter in the stream, then the output should be
            > > r = 0, as far as I can see. My compiler gives EOF.
            > >
            > > Does the standard specify which is correct?
            >
            > The problem is that detection of such a lone '-' requires reading
            > two characters, the second of which is not a digit. C only
            > guarantees one level of pushback via ungetc, so whatever routine is
            > doing the parsing (such as scanf) cannot leave the input stream
            > unaltered and report 'No number available'. With string sources
            > this obviously does not apply. So the question is "should the
            > string and stream operations function in the same manner". A
            > similar (but worse) problem arises after reading the e in floating
            > point formats. "3.0e-x" should return 3.0 and have to push back
            > three chars.
            >
            > I think the proper thing would be to guarantee three level
            > pushback, maybe in C05. This requires defining what is to be done
            > when an application attempts excess pushback :-)

            Thanks, that explains it.

            I wondered whether testing for both 0 and EOF was over the top, but
            since it works as you describe, it's necessary in a great many situations.


            • Dan Pop

              #7
              Re: Correct behaviour of scanf and sscanf

              In <BtidnVDXwvdZ0qvfRVn-1w@comcast.com> Eric Sosman <esosman@acm-dot-org.invalid> writes:
              >Rob Thorpe wrote:
              >
              >> Given the code:-
              >>
              >> r = sscanf (s, "%lf", x);
              >>
              >> What is the correct output if the string s is simply "-" ?
              >>
              >> If "-" is considered the beginning of a number, that has been
              >> cut-short then the correct output is that r = EOF. If it is taken to
              >> be a letter in the stream, then the output should be r = 0, as far as
              >> I can see. My compiler gives EOF.
              >>
              >> Does the standard specify which is correct?
              >
              > Haven't seen a reply in the several hours since I first
              >saw the message, so (fools rush in ...) I'll hazard a guess.
              >
              > The Standard speaks of two sorts of failure for *scanf()
              >directives: "matching failure," which amounts to an input
              >sequence that doesn't satisfy the syntax required by the
              >directive, and "input failure," meaning that the source of
              >input characters dried up -- for the stream-input versions
              >this means EOF was sensed, and for sscanf() it means the
              >scan reached the end of the string. On a matching failure,
              >*scanf() stops operating and returns the number of items
              >already matched and converted (0, in your example), while
              >for an input failure *scanf() returns EOF.
              >
              > So the question boils down to this: When "%lf" processes
              >"-", is the failure a matching failure or an input failure?
              >
              > One point of view considers it a matching failure. The
              >characters for "%f" are supposed to be something strtod() would
              >swallow: an optional all-whitespace prefix, an optional sign,
              >and then a character string resembling a floating-point constant
              >as written in C source code. The string "-" doesn't match this
              >description (the floating-point constant is missing), so it could
              >be called a matching failure.
              >
              > The other viewpoint holds that no "mismatch" was detected
              >before end-of-string, so it's an input failure. The sequence
              >of characters is perfectly good as the prefix of a valid match,
              >and the only thing preventing a complete match is the fact that
              >no more input was available. Hence (says this argument), the
              >operation ends with an input failure rather than a matching
              >failure, and EOF is the correct return value.
              >
              > IMHO the Standard is not entirely clear about which argument
              >is correct: is an incomplete prefix a failure to match, or a
              >failure of the input source? To me, the language of the Standard
              >doesn't shine enough light into this dark corner -- but if anyone
              >happens to have a torch to hand, I'd welcome illumination ...

              An incomplete prefix followed by an end-of-file condition cannot be a
              matching failure the way "- " or "-foo" would be; we're clearly in the
              case where an input failure occurred before any conversion, just as if
              the input were an empty string.

              I agree that the text of the standard is less than crystal clear and I
              wouldn't be surprised to see different behaviours on different
              implementations. OTOH, as an implementation user, especially in the
              case of sscanf, I see no problem: if the function doesn't return 1, it is
              obvious that the input string doesn't contain a valid number.

              Dan
              --
              Dan Pop <Dan.Pop@ifh.de>
