When to clean input text

  • Craig Thomson

    When to clean input text

    I was wondering what people do with text provided by the user in a
    form. Some cleaning needs to be done at some stage if you are going to
    be putting it in a database or displaying it etc. But when is the time
    to do that?

    Do you clean it as soon as you get it?
    Do you pass around the original text and clean it when you use it?

    What about magic slashes? You need to addslashes before using in a db
    statement, but you need to strip them when displaying. When do you do
    that?

    TIA.

    Craig
  • Jimmy Jacobson

    #2
    Re: When to clean input text

    Great question. The answer is that you always clean it. OWASP
    (http://www.owasp.org) compiles a list of the top 10 most critical web
    application flaws every year, and every year, unvalidated input is at
    the top of the list. Here is what I do. I wrote a "datascrubber"
    class that I use on every page that accepts any kind of input, either
    from POST or GET variables. The datascrubber is very simple: it runs
    a series of tests on each variable passed to the page. The tests
    include:

    type (using is_int, is_float, etc.)
    minimum and maximum lengths
    minimum and maximum values
    regex - compare it to a regex to see if it matches the expected
    pattern (email address, URL, etc.)

    If the variable data passes all the tests, then I push it into an
    array called $clean[]; if not, it goes into $unclean[]. At this
    point I do the addslashes as well. Once this is done, I can call any
    variable from the $clean[] array and be sure that it passed the tests
    I set for it. I've encapsulated all this into an object for easy
    reuse, and I can provide that to you if you would like.
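
A minimal sketch of the kind of checks described above, not Jimmy's actual class. The function names `scrub_input` and `passes_rule` and the rule-array format are assumptions made for illustration. Form values arrive as strings, so the sketch uses `filter_var` for the type test; `is_int` would return false for the string '42'. Escaping is left out of the sketch on purpose.

```php
<?php
// Hypothetical sketch: scrub_input(), passes_rule() and the rule format are
// illustrative assumptions, not the class described in the post.

// Returns true only if $value passes every test listed in $rule.
function passes_rule(string $value, array $rule): bool
{
    // Type test: accept only strings that look like an integer.
    if (($rule['type'] ?? 'string') === 'int'
        && filter_var($value, FILTER_VALIDATE_INT) === false) {
        return false;
    }

    // Minimum and maximum length, in bytes.
    $length = strlen($value);
    if (isset($rule['min_length']) && $length < $rule['min_length']) {
        return false;
    }
    if (isset($rule['max_length']) && $length > $rule['max_length']) {
        return false;
    }

    // Minimum and maximum value, for numeric fields only.
    if (isset($rule['min']) || isset($rule['max'])) {
        if (!is_numeric($value)) {
            return false;
        }
        $number = (float) $value;
        if (isset($rule['min']) && $number < $rule['min']) {
            return false;
        }
        if (isset($rule['max']) && $number > $rule['max']) {
            return false;
        }
    }

    // Pattern test against the expected format.
    if (isset($rule['regex']) && preg_match($rule['regex'], $value) !== 1) {
        return false;
    }

    return true;
}

// Sorts each expected field into $clean or $unclean.
function scrub_input(array $input, array $rules): array
{
    $clean = [];
    $unclean = [];
    foreach ($rules as $name => $rule) {
        $value = $input[$name] ?? null;
        if (is_string($value) && passes_rule($value, $rule)) {
            $clean[$name] = $value;
        } else {
            $unclean[$name] = $value;
        }
    }
    return [$clean, $unclean];
}

// Example rules and input (made up for the sketch).
$rules = [
    'age'   => ['type' => 'int', 'min' => 0, 'max' => 150],
    'email' => ['max_length' => 254, 'regex' => '/^[^@\s]+@[^@\s]+\.[^@\s]+$/'],
];
[$clean, $unclean] = scrub_input(['age' => '42', 'email' => 'not an email'], $rules);
// $clean now holds 'age'; $unclean holds 'email'.
```

Only fields named in `$rules` are looked at, so anything unexpected in the request is simply ignored rather than passed along.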

    Jimmy
    Codingscape.com

    Craig Thomson <craig@spam.free> wrote in message news:<bpuu70dj6kpdt33reub1saqc7ucq325dnt@4ax.com>...
    > I was wondering what people do with text provided by the user in a
    > form. Some cleaning needs to be done at some stage if you are going to
    > be putting it in a database or displaying it etc. But when is the time
    > to do that?
    >
    > Do you clean it as soon as you get it?
    > Do you pass around the original text and clean it when you use it?
    >
    > What about magic slashes? You need to addslashes before using in a db
    > statement, but you need to strip them when displaying. When do you do
    > that?
    >
    > TIA.
    >
    > Craig


      • Chung Leong

        #4
        Re: When to clean input text

        "Craig Thomson" <craig@spam.free> wrote in message
        news:bpuu70dj6kpdt33reub1saqc7ucq325dnt@4ax.com...
        > I was wondering what people do with text provided by the user in a
        > form. Some cleaning needs to be done at some stage if you are going to
        > be putting it in a database or displaying it etc. But when is the time
        > to do that?
        >
        > Do you clean it as soon as you get it?
        > Do you pass around the original text and clean it when you use it?

        I use the latter approach, since you can only tell whether something is
        "clean" or not when it's used in a particular context. An example would be
        text with unescaped single quotes.
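
A small sketch of why "clean" depends on context: the same raw string needs different treatment depending on where it ends up. The sample value and variable names are made up for illustration. The database case is not shown as code, since it depends on the driver in use; there the usual route is the driver's own escaping or prepared statements.

```php
<?php
// Hypothetical sample input containing an unescaped single quote.
$raw = "O'Brien & Sons";

// HTML context: quotes and ampersands become entities.
$forHtml = htmlspecialchars($raw, ENT_QUOTES, 'UTF-8');

// URL context: the same characters become percent-encoded instead.
$forUrl = rawurlencode($raw);

echo $forHtml, "\n";
echo $forUrl, "\n";
```

Keeping `$raw` untouched until the point of use means neither output ends up double-escaped.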

        A good rule to go by, I think, is "functions should always validate
        parameters passed to them." For example, say I have the function
        GetUser($user_id). Since an integer is expected, the function should either
        fail immediately when a non-integer is passed or cast the parameter into an
        int.
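
A sketch of the "fail immediately" option, under stated assumptions: the body of `GetUser()` is a stub, and the exception type and returned array are illustrative choices, not anything from the thread.

```php
<?php
// Hypothetical stand-in for GetUser(); the actual lookup is stubbed out.
function GetUser($user_id): array
{
    // filter_var() returns the integer on success and false otherwise.
    $id = filter_var($user_id, FILTER_VALIDATE_INT);
    if ($id === false) {
        throw new InvalidArgumentException('GetUser() expects an integer ID.');
    }

    // A real implementation would query the database here using $id.
    return ['id' => $id];
}
```

The casting alternative would be `$id = (int) $user_id;`, which never fails but silently turns a value such as '12abc' into 12.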

        > What about magic slashes? You need to addslashes before using in a db
        > statement, but you need to strip them when displaying. When do you do
        > that?

        Magic quotes, IMHO, is the dumbest feature of PHP. Turn it off if you can.
        If not, use a statement in a header file to strip off slashes from all
        incoming data ($_GET, $_POST), and then escape quotes manually.



        • Craig Thomson

          #5
          Re: When to clean input text

          On 16 Apr 2004 10:05:15 -0700, jamesj@jamesj.zyx.net (Jimmy Jacobson)
          wrote:

          >I wrote a "datascrubber"
          >class that I use on every page that accepts any kind of input either
          >from POST or GET variables.
          [...]
          >I've encapsulated all this into an object for easy
          >reuse and I can provide that to you if you would like.

          Thanks, I would love to see it!

          Craig


          • Craig Thomson

            #6
            Re: When to clean input text

            On Fri, 16 Apr 2004 18:39:05 -0400, "Chung Leong"
            <chernyshevsky@hotmail.com> wrote:

            >> What about magic slashes? You need to addslashes before using in a db
            >> statement, but you need to strip them when displaying. When do you do
            >> that?
            >
            >Magic quotes, IMHO, is the dumbest feature of PHP. Turn it off if you can.
            >If not, use a statement in a header file to strip off slashes from all
            >incoming data ($_GET, $_POST), and then escape quotes manually.

            What do you mean by putting a statement in a header file? Do you mean
            turning it off using an option placed in a header file? Or do you mean
            checking if it is on and stripping the slashes as you read the $_GET
            and $_POST data?

            Craig


            • Chung Leong

              #7
              Re: When to clean input text

              "Craig Thomson" <craig@spam.free> wrote in message
              news:t076805scbilg4b2c2004vjtv46sa5v67b@4ax.com...
              > On Fri, 16 Apr 2004 18:39:05 -0400, "Chung Leong"
              > <chernyshevsky@hotmail.com> wrote:
              >
              > >> What about magic slashes? You need to addslashes before using in a db
              > >> statement, but you need to strip them when displaying. When do you do
              > >> that?
              > >
              > >Magic quotes, IMHO, is the dumbest feature of PHP. Turn it off if you can.
              > >If not, use a statement in a header file to strip off slashes from all
              > >incoming data ($_GET, $_POST), and then escape quotes manually.
              >
              > What do you mean by putting a statement in a header file? Do you mean
              > turning it off using an option placed in a header file? Or do you mean
              > checking if it is on and stripping the slashes as you read the $_GET
              > and $_POST data?

              The assumption is that you have a file which is included at the beginning of
              every script. Global.php or something like that. In this file, you place the
              slash stripping code, so that all your scripts will get data without
              slashes.

              Example:

              if (get_magic_quotes_gpc()) {
                  function __stripslashes(&$s) { $s = stripslashes($s); }

                  array_walk($_POST, '__stripslashes');
                  array_walk($_GET, '__stripslashes');
              }

              This is necessary if you can't change the setting in php.ini.



              • Craig Thomson

                #8
                Re: When to clean input text

                On Mon, 19 Apr 2004 18:31:13 -0400, "Chung Leong"
                <chernyshevsky@hotmail.com> wrote:

                >The assumption is that you have a file which is included at the beginning of
                >every script. Global.php or something like that. In this file, you place the
                >slash stripping code, so that all your scripts will get data without
                >slashes.
                >
                >Example:
                >
                >if (get_magic_quotes_gpc()) {
                >    function __stripslashes(&$s) { $s = stripslashes($s); }
                >
                >    array_walk($_POST, '__stripslashes');
                >    array_walk($_GET, '__stripslashes');
                >}
                >
                >This is necessary if you can't change the setting in php.ini.
                >

                Thanks Chuck.

                I was thinking about this and wondered if it would be possible for the
                GET or POST element to be an array itself? If so, wouldn't your user
                function be better as:

                function __stripslashes(&$s) {
                    if ( is_array($s) ) { __stripslashes($s); }
                    $s = stripslashes($s);
                }

                But I'm not sure under what circumstances it would be an array. A
                multi select box may do it. What do you think?

                Craig


                • Craig Thomson

                  #9
                  Re: When to clean input text

                  Actually it should be:
                  function __stripslashes(&$s) {
                      if ( is_array($s) ) { array_walk($s, '__stripslashes'); }
                      else { $s = stripslashes($s); }
                  }

                  I have a test case below.

                  Thanks, Chung, for the code.

                  Craig

                  PS: And Chung, sorry for calling you Chuck in a previous post. My
                  mistake.

                  -----------------------------------------------

                  <html><head><title> Test Forms </title></head><body>
                  <pre><?PHP print_r($_POST) ?></pre>
                  <HR>
                  <?PHP
                  if ( get_magic_quotes_gpc() ) {
                      function __stripslashes(&$s) {
                          if ( is_array($s) ) {
                              array_walk($s, '__stripslashes');
                          } else {
                              $s = stripslashes($s);
                          }
                      }

                      array_walk($_POST, '__stripslashes');
                  }
                  ?>

                  <pre><?PHP print_r($_POST) ?></pre>
                  <hr>

                  <form action="<?PHP echo $_SERVER['PHP_SELF']; ?>" method="post">
                  <select multiple name="snacks[]">
                  <option value='option "1"'>Option one</option>
                  <option value="option '2'">Option two</option>
                  <option value='option "3"'>Option three</option>
                  <option value="option '4'">Option four</option>
                  <option value='option "5"'>Option five</option>
                  <option value="option '6'">Option six</option>
                  </select>
                  <input type="submit" value="Submit" name="add">
                  </form>
                  </body>
                  </html>
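
A sketch of the same recursive idea that can be checked from the command line, without a form. Note that `get_magic_quotes_gpc()` was removed in PHP 8.0, so the test page above will not run unchanged on current PHP. The function name `strip_slashes_deep` and the sample array are assumptions for illustration; the sketch returns a new array via `array_map` instead of modifying `$_POST` in place.

```php
<?php
// Hypothetical helper: strips slashes from a value or, recursively,
// from every element of a (possibly nested) array.
function strip_slashes_deep($value)
{
    return is_array($value)
        ? array_map('strip_slashes_deep', $value)
        : stripslashes($value);
}

// Sample data shaped like a POST with a multi-select, escaped the way
// magic quotes would have escaped it.
$input = [
    'name'   => 'O\\\'Brien',
    'snacks' => ['option \\"1\\"', "option \\'2\\'"],
];

$stripped = strip_slashes_deep($input);
print_r($stripped);
```

With a single array argument, `array_map` keeps the original keys, so field names such as `snacks` survive the pass.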
