Checking for excessive input when using fgets()

Collapse
X
 
  • Time
  • Show
Clear All
new posts
  • CaptainnFungi
    New Member
    • Oct 2007
    • 2

    Checking for excessive input when using fgets()

    Hi All,

    I am very new to C and have been working my way through a few C books with the aim of getting more knowledge in programming. However I have hit a wall and I am not sure how to get over it.

    I have the following code, it is quite simple.

    Code:
    #include <stdio.h>
    
    struct cat
    {
         int age;
         int height;
         char name[20];
         char father[20];
         char mother[20];
    };
    
    
    int main()
    {
    
       struct cat myCat[2];
       int i = 0;
    
       for(i = 0; i < 2 ; i++ )
       {
    	printf("What is your cats name?\n");
    	fgets(myCat[i].name, 20, stdin);
       }
       
      for(i = 0; i < 2 ; i++ )
       {
    	printf("%s", myCat[i].name);
       }
       
    return 0;
    }
    I understand how this code works without any problems. I do have issues regarding input though.

    If I enter more that the allowed characters using the fgets function then the first myCat.name[i] is terminated with \0, this is what I would expect. The problem is that the STDIO buffer retains my extra input and enters it into the next fgets() function. How can I stop this?

    Also how can I do some error checking on the fgets() functions to see if the user entered more than needed input and consequentaly reset the character array before asking them again?

    I was thinking of finding the character array length and if it is equal to 19 checking the last element for the \0 terminator? I think this would do it but I am unsure how to clear the array so that it is empty and does not contain "junk" data.

    Any help is greatly appreciated.
  • gpraghuram
    Recognized Expert Top Contributor
    • Mar 2007
    • 1275

    #2
    Originally posted by CaptainnFungi
    Hi All,

    I am very new to C and have been working my way through a few C books with the aim of getting more knowledge in programming. However I have hit a wall and I am not sure how to get over it.

    I have the following code, it is quite simple.

    Code:
    #include <stdio.h>
    
    struct cat
    {
         int age;
         int height;
         char name[20];
         char father[20];
         char mother[20];
    };
    
    
    int main()
    {
    
       struct cat myCat[2];
       int i = 0;
    
       for(i = 0; i < 2 ; i++ )
       {
    	printf("What is your cats name?\n");
    	fgets(myCat[i].name, 20, stdin);
       }
       
      for(i = 0; i < 2 ; i++ )
       {
    	printf("%s", myCat[i].name);
       }
       
    return 0;
    }
    I understand how this code works without any problems. I do have issues regarding input though.

    If I enter more that the allowed characters using the fgets function then the first myCat.name[i] is terminated with \0, this is what I would expect. The problem is that the STDIO buffer retains my extra input and enters it into the next fgets() function. How can I stop this?

    Also how can I do some error checking on the fgets() functions to see if the user entered more than needed input and consequentaly reset the character array before asking them again?

    I was thinking of finding the character array length and if it is equal to 19 checking the last element for the \0 terminator? I think this would do it but I am unsure how to clear the array so that it is empty and does not contain "junk" data.

    Any help is greatly appreciated.

    Hi,
    After doing read you can call gets to clear stdin from having previously entered characters. see this code
    [code=c]
    #include <stdio.h>

    struct cat
    {
    int age;
    int height;
    char name[20];
    char father[20];
    char mother[20];
    };

    void clear_stdin()
    {
    char read_rem[200];
    gets(read_rem);
    }
    int main()
    {

    struct cat myCat[2];
    int i = 0;

    for(i = 0; i < 2 ; i++ )
    {
    printf("What is your cats name?\n");
    fgets(myCat[i].name, 20, stdin);
    clear_stdin();/i have added this.
    }

    for(i = 0; i < 2 ; i++ )
    {
    printf("%s", myCat[i].name);
    }

    return 0;
    }
    [/code]

    I think this will solve the purpose
    Thanks
    Raghuram

    Comment

    • Banfa
      Recognized Expert Expert
      • Feb 2006
      • 9067

      #3
      Originally posted by CaptainnFungi
      I was thinking of finding the character array length and if it is equal to 19 checking the last element for the \0 terminator? I think this would do it but I am unsure how to clear the array so that it is empty and does not contain "junk" data.
      The is no need to clear the array. What you should do is check to see if the last character returned is a newline ('\n' ASCII 0x0A).

      The user will have had to press return and fgets returns this character, if you have read the '\n' then you have read the entire line. You have not read a '\n' then there is more data to read from the line. Additionally you should check the return value from fgets to catch any errors.

      Additionally you have a bug waiting to happen in you fgets line and loop conditions.

      You have [code=c] fgets(myCat[i].name, 20, stdin);[/code] the problem is the 20. You have hard coded the buffer length, if you change the size of cat::name this code will need changing as well, if you don't and the buffer size is reduced you run a real risk of a segmentation fault.

      This is better [code=c] fgets(myCat[i].name, sizeof myCat[i].name, stdin);[/code]the sizeof operator instructs the compiler to automatically insert the size of the cat::name array. If the buffer size is changed then the size passed to fgets is automatically changed.

      There are other ways of doing this (you could us a #define) but this is my prefered method.

      You loop end condition has the same problem, hardcoded to 2. Again would be better to have it change automatically with a change in array size and again this can be achieved either using the sizeof operator or using a #define.

      What you want to achieve is only defining any given number in 1 place in your code, then if you need to change it you only have to change a single place and the rest of the code is automatically recompiled to the new size. This makes you code much more robust.

      Comment

      • Banfa
        Recognized Expert Expert
        • Feb 2006
        • 9067

        #4
        Originally posted by gpraghuram
        [code=c]
        void clear_stdin()
        {
        char read_rem[200];
        gets(read_rem);
        }
        [/code]

        I think this will solve the purpose
        I do not think this is a very good solution, it uses gets instead of fgets risking a buffer overrun error in read_rem and if it used fgets then it would run into the same problem the original code did. All be it that the user has to enter > 220 characters in order to run into problems.

        Since this problem can be fixed absolutely so that it always works no matter how many charcters are input this "sort of works but not in every situation" approach is very poor design.

        It is this sort of sloppy programming that leaves applications open to expoits using buffer overrun errors.

        Comment

        • CaptainnFungi
          New Member
          • Oct 2007
          • 2

          #5
          I have used this function to clear stdio, anybody have any thoughts on if this is a good or bad solution?

          Code:
          /*	This function flushes the stdio input buffer */
          void flush_input ( void ) 
          { 
          	/* getchar returns an int */ 
          	int ch; 
          	/* Read characters until there are none left */ 
          	do 
          	ch = getchar(); 
          	while ( ch != EOF && ch != '\n' ); 
          	/* Clear EOF state */ 
          	clearerr ( stdin ); 
          }

          Comment

          • gpraghuram
            Recognized Expert Top Contributor
            • Mar 2007
            • 1275

            #6
            Originally posted by Banfa
            I do not think this is a very good solution, it uses gets instead of fgets risking a buffer overrun error in read_rem and if it used fgets then it would run into the same problem the original code did. All be it that the user has to enter > 220 characters in order to run into problems.

            Since this problem can be fixed absolutely so that it always works no matter how many charcters are input this "sort of works but not in every situation" approach is very poor design.

            It is this sort of sloppy programming that leaves applications open to expoits using buffer overrun errors.
            Hi Banfa,
            Thanks for pointing it out.
            Do u have a better solution for this as i think this is one of the common need in many programs.

            Raghuram

            Comment

            • Banfa
              Recognized Expert Expert
              • Feb 2006
              • 9067

              #7
              I would do it like this.

              The first thing to note is that it is nearly always necessary to encapsulate the input functions with your own functions. This is because verifying user input is actually a somewhat involved task and keeping the logic of doing this out of the main body of your code will make the rest of the code easier to maintain and therefore more robust.

              Also if you do this you can unit test your input functions to provide a high level on confidence in them, where as if you write it into your code every time you end up with a mess that is hard to test properly.

              So here is the function I would use to get a single string input (and this would then be the basis of functions to get integers and floats if the program required them)

              [code=c]
              int GetString(char *string, size_t size)
              {
              char *fgr;
              int result = 1; /* Assume OK */
              char tmp[10];

              fgr = fgets(string, size, stdin);

              if (fgr != NULL)
              {
              if (string[strlen(string)-1] != '\n')
              {
              do
              {
              fgr = fgets(tmp, sizeof tmp, stdin);
              }
              while (fgr != NULL && tmp[strlen(tmp)-1] != '\n');

              if (fgr == NULL)
              {
              result = 0;
              }
              }
              }
              else
              {
              result = 0; /* fgets returned error of some sort (could be eof) */
              }

              return result;
              }
              [/code]

              Comment

              • Banfa
                Recognized Expert Expert
                • Feb 2006
                • 9067

                #8
                Originally posted by CaptainnFungi
                I have used this function to clear stdio, anybody have any thoughts on if this is a good or bad solution?

                Code:
                /*	This function flushes the stdio input buffer */
                void flush_input ( void ) 
                { 
                	/* getchar returns an int */ 
                	int ch; 
                	/* Read characters until there are none left */ 
                	do 
                	ch = getchar(); 
                	while ( ch != EOF && ch != '\n' ); 
                	/* Clear EOF state */ 
                	clearerr ( stdin ); 
                }
                This solution seems fine however you have to be sure that stdin needs flushing before calling it, if you call it when there is no data on stding getchar waits for input.

                There is still a case for encapsulating the input command as I did but using your flush_input combined with my original function gives

                [code=c]
                int GetString(char *string, size_t size)
                {
                char *fgr;
                int result = 1; /* Assume OK */

                fgr = fgets(string, size, stdin);

                if (fgr != NULL)
                {
                if (string[strlen(string)-1] != '\n')
                {
                flush_input();
                }
                }
                else
                {
                result = 0; /* fgets returned error of some sort (could be eof) */
                }

                return result;
                }
                [/code]
                Which is better because it requires less stack data (although an extra stack frame).

                Comment

                Working...