Tricky Error

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • dracolytch@gmail.com

    Tricky Error

    Good day all,
    Ok, I have a pretty tricky problem that I need some help with.

    I pass around search query information a fair amount (specifically
    WHERE statements). Normally, I just rawurlencode() the buggers, and
    pass them via the URL. I like having the where clauses in the URL,
    because then someone can just bookmark the URL, or send it to a friend,
    and I don't have to worry about a thing.

    If someone does a search that requires a LIKE statement, the system
    parses out the terms, and generates the LIKE statement, and passes the
    LIKE statement to the search results page. This works great.

    Almost all of the time.

    Here's where the problem occurs:
    If you search for a NUMBER, and the system uses a LIKE statement, the
    string looks like this:

    data LIKE('%415%')

    Which is great... When I rawurlencode it, I get:

    LIKE%28%27%2541 5%25%27%29

    Looking at it more closely:
    %25 = %
    %27 = '
    %28 = (
    %29 = )

    Which is GREAT... all that makes sense.

    Unfortunately, when I rawurldecode that URL, I get:
    LIKE('A5%')

    It appears to be interpreting the %25 as a %, and then using that % on
    the next two numbers... Ideas?

    ~D

  • J Wynia

    #2
    Re: Tricky Error

    dracolytch@gmai l.com wrote:[color=blue]
    > Good day all,
    > Ok, I have a pretty tricky problem that I need some help with.
    >
    > I pass around search query information a fair amount (specifically
    > WHERE statements). Normally, I just rawurlencode() the buggers, and
    > pass them via the URL. I like having the where clauses in the URL,
    > because then someone can just bookmark the URL, or send it to a friend,
    > and I don't have to worry about a thing.
    >[/color]
    1. You should probably do some reading on "SQL injection" and tread
    very carefully. Having SQL in your request (even if you're parsing it)
    is playing with fire.
    2. You can accomplish the whole "bookmarkab le URL" thing without
    putting your SQL right on the URL.

    Here's a way to do it. Cache the SQL statements that your script creates
    from the search query to the filesystem. An easy way to do that is to
    md5() the constructed SQL statement, create a file with that md5() value
    and put the SQL statement into the file. Then, put query=md5string on
    your URL's. When that parameter is present, check the filesystem for the
    appropriate cached query and pull it in, using it instead of your
    defaults. Quick sample snippets are below.

    --------------
    $sql = "SELECT * FROM table";
    $cached_filenam e = md5($sql);
    $filename = "cachedquer ies/$cached_filenam e";
    if (!$handle = fopen($filename , 'w')) {
    die("Cannot open file $filename");
    exit;
    }

    if (fwrite($handle , $sql) === FALSE) {
    die("Cannot write file ($filename");
    exit;
    }

    fclose($handle) ;
    }
    $url = "existingurl?qu ery=$cached_fil ename;
    -------------
    $query = $_GET['query'];
    $cached_query = "cachedquer ies/$query";
    if(file_exists( $cached_query)) {
    $sql = file_get_conten ts($cached_quer y);
    } else {
    $sql = "SELECT * FROM table";
    }
    --------------------------------
    J Wynia
    Myriad Intellect, Inc.
    "Web technology that earns its keep."

    Comment

    • J Wynia

      #3
      Re: Tricky Error

      dracolytch@gmai l.com wrote:[color=blue]
      > Good day all,
      > Ok, I have a pretty tricky problem that I need some help with.
      >
      > I pass around search query information a fair amount (specifically
      > WHERE statements). Normally, I just rawurlencode() the buggers, and
      > pass them via the URL. I like having the where clauses in the URL,
      > because then someone can just bookmark the URL, or send it to a friend,
      > and I don't have to worry about a thing.
      >[/color]
      1. You should probably do some reading on "SQL injection" and tread
      very carefully. Having SQL in your request (even if you're parsing it)
      is playing with fire.
      2. You can accomplish the whole "bookmarkab le URL" thing without
      putting your SQL right on the URL.

      Here's a way to do it. Cache the SQL statements that your script creates
      from the search query to the filesystem. An easy way to do that is to
      md5() the constructed SQL statement, create a file with that md5() value
      and put the SQL statement into the file. Then, put query=md5string on
      your URL's. When that parameter is present, check the filesystem for the
      appropriate cached query and pull it in, using it instead of your
      defaults. Quick sample snippets are below.

      --------------
      $sql = "SELECT * FROM table";
      $cached_filenam e = md5($sql);
      $filename = "cachedquer ies/$cached_filenam e";
      if (!$handle = fopen($filename , 'w')) {
      die("Cannot open file $filename");
      exit;
      }

      if (fwrite($handle , $sql) === FALSE) {
      die("Cannot write file ($filename");
      exit;
      }

      fclose($handle) ;
      }
      $url = "existingurl?qu ery=$cached_fil ename;
      -------------
      $query = $_GET['query'];
      $cached_query = "cachedquer ies/$query";
      if(file_exists( $cached_query)) {
      $sql = file_get_conten ts($cached_quer y);
      } else {
      $sql = "SELECT * FROM table";
      }
      --------------------------------
      J Wynia
      Myriad Intellect, Inc.
      "Web technology that earns its keep."

      Comment

      • Colin McKinnon

        #4
        Re: Tricky Error

        dracolytch@gmai l.com wrote:
        [color=blue]
        > Good day all,
        > Ok, I have a pretty tricky problem that I need some help with.
        >
        > I pass around search query information a fair amount (specifically
        > WHERE statements). Normally, I just rawurlencode() the buggers, and
        > pass them via the URL. I like having the where clauses in the URL,
        > because then someone can just bookmark the URL, or send it to a friend,
        > and I don't have to worry about a thing.
        >[/color]

        SQL injection?

        I use mcrypt to encrypt it using AES, then base64 encode it. The encryption
        prevents people modifying the code (I actually encode($check .
        $where_clause) so I can validate on $check). The base64 encode bit means I
        don't need to worry about whether it is urlencoded or not (although the
        '==' at the end gets stripped off if I include it in a URL without encoding
        it, base64decode iterprets it just the same.

        HTH

        C.

        Comment

        • Colin McKinnon

          #5
          Re: Tricky Error

          dracolytch@gmai l.com wrote:
          [color=blue]
          > Good day all,
          > Ok, I have a pretty tricky problem that I need some help with.
          >
          > I pass around search query information a fair amount (specifically
          > WHERE statements). Normally, I just rawurlencode() the buggers, and
          > pass them via the URL. I like having the where clauses in the URL,
          > because then someone can just bookmark the URL, or send it to a friend,
          > and I don't have to worry about a thing.
          >[/color]

          SQL injection?

          I use mcrypt to encrypt it using AES, then base64 encode it. The encryption
          prevents people modifying the code (I actually encode($check .
          $where_clause) so I can validate on $check). The base64 encode bit means I
          don't need to worry about whether it is urlencoded or not (although the
          '==' at the end gets stripped off if I include it in a URL without encoding
          it, base64decode iterprets it just the same.

          HTH

          C.

          Comment

          • dracolytch

            #6
            Re: Tricky Error

            To be honest, I've already done this analysis. The SQL in the URLs are
            only the WHERE clauses. So, someone could go ahead and inject their own
            SQL to create their own custom WHERE clause. All the more power to 'em.

            This is not the kind of system where I'm doing my "Trust no input from
            the user" diligence, I'm just organizing available data in an
            accessable format.

            True, your form of caching would solve the given problem, but I think
            it may be over-solving the problem. It could also incur additional
            maintenance issues once the cache reaches a couple thousand entries.

            ~D

            Comment

            • dracolytch

              #7
              Re: Tricky Error

              To be honest, I've already done this analysis. The SQL in the URLs are
              only the WHERE clauses. So, someone could go ahead and inject their own
              SQL to create their own custom WHERE clause. All the more power to 'em.

              This is not the kind of system where I'm doing my "Trust no input from
              the user" diligence, I'm just organizing available data in an
              accessable format.

              True, your form of caching would solve the given problem, but I think
              it may be over-solving the problem. It could also incur additional
              maintenance issues once the cache reaches a couple thousand entries.

              ~D

              Comment

              • dracolytch

                #8
                Re: Tricky Error

                The base 64 encoding did the trick. I don't need to encrypt the
                information at this point, but it does give me the additional
                flexibility for easily hardening this code if reused in another
                project.

                ~D

                Comment

                • dracolytch

                  #9
                  Re: Tricky Error

                  The base 64 encoding did the trick. I don't need to encrypt the
                  information at this point, but it does give me the additional
                  flexibility for easily hardening this code if reused in another
                  project.

                  ~D

                  Comment

                  • J Wynia

                    #10
                    Re: Tricky Error

                    dracolytch wrote:[color=blue]
                    > To be honest, I've already done this analysis. The SQL in the URLs are
                    > only the WHERE clauses. So, someone could go ahead and inject their own
                    > SQL to create their own custom WHERE clause. All the more power to 'em.[/color]

                    Most of the "classic" examples of SQL injection *are* in the WHERE
                    clause. Many are within a specific column name portion of the WHERE
                    clause. One of those classic examples is:

                    WHERE email = 'someone@exampl e.com';

                    which, when injected can become

                    WHERE email = 'someone@exampl e.com'; DROP TABLE members; --';
                    [color=blue]
                    >
                    > This is not the kind of system where I'm doing my "Trust no input from
                    > the user" diligence, I'm just organizing available data in an
                    > accessable format.[/color]

                    Given that the solution I mentioned takes less than 15 minutes, it's
                    hard to justify *not* taking that level of diligence. To me, not passing
                    anything important (like SQL, usernames, passwords, account information,
                    etc.) on the URL is my default level of diligence for any project I'm
                    putting on the web at large.
                    [color=blue]
                    >
                    > True, your form of caching would solve the given problem, but I think
                    > it may be over-solving the problem. It could also incur additional
                    > maintenance issues once the cache reaches a couple thousand entries.
                    >
                    > ~D
                    >[/color]
                    I actually use variations of the code I posted to do caching of a LOT of
                    things. In most instances, I also wrap a date check in there or a file
                    count that cleans up "all files created before $timestamp" or
                    "everything but the latest 100". If the cache is going to be big, I make
                    this a seperate script and schedule it to be called daily, etc. It's
                    such a simple solution, I don't see how it's "over-solving" the problem.

                    I actually also frequently do this sort of caching on result sets, RSS
                    and other remote XML, etc. as well. Many of my caches run well into the
                    10,000-50,000 file range and still perform quite well. In most cases, it
                    also sped up the application in the process as getting the contents of a
                    specific file is quicker than connecting to a database or making a
                    remote connection to fetch content. In most of my implementations , using
                    something similar to what I posted, I actually wrap the database fetch
                    itself in a cache check. It's still based on an md5() of the query, but
                    instead of storing the query, I store a serialized version of the
                    returned array from the database. I add a quick additional check next to
                    the file_exists to also check that it's new enough and delete as
                    appropriate.

                    As long as you either add cleanup code or know the churn rate of the
                    data is so low as to not be an issue, there is no maintenance. I see
                    literally hundreds of apps on Sourceforge, Hotscripts, etc. that are
                    making dozens of calls to databases on every page load for data that
                    changes twice a year. That's a complete waste of database time.

                    Comment

                    • J Wynia

                      #11
                      Re: Tricky Error

                      dracolytch wrote:[color=blue]
                      > To be honest, I've already done this analysis. The SQL in the URLs are
                      > only the WHERE clauses. So, someone could go ahead and inject their own
                      > SQL to create their own custom WHERE clause. All the more power to 'em.[/color]

                      Most of the "classic" examples of SQL injection *are* in the WHERE
                      clause. Many are within a specific column name portion of the WHERE
                      clause. One of those classic examples is:

                      WHERE email = 'someone@exampl e.com';

                      which, when injected can become

                      WHERE email = 'someone@exampl e.com'; DROP TABLE members; --';
                      [color=blue]
                      >
                      > This is not the kind of system where I'm doing my "Trust no input from
                      > the user" diligence, I'm just organizing available data in an
                      > accessable format.[/color]

                      Given that the solution I mentioned takes less than 15 minutes, it's
                      hard to justify *not* taking that level of diligence. To me, not passing
                      anything important (like SQL, usernames, passwords, account information,
                      etc.) on the URL is my default level of diligence for any project I'm
                      putting on the web at large.
                      [color=blue]
                      >
                      > True, your form of caching would solve the given problem, but I think
                      > it may be over-solving the problem. It could also incur additional
                      > maintenance issues once the cache reaches a couple thousand entries.
                      >
                      > ~D
                      >[/color]
                      I actually use variations of the code I posted to do caching of a LOT of
                      things. In most instances, I also wrap a date check in there or a file
                      count that cleans up "all files created before $timestamp" or
                      "everything but the latest 100". If the cache is going to be big, I make
                      this a seperate script and schedule it to be called daily, etc. It's
                      such a simple solution, I don't see how it's "over-solving" the problem.

                      I actually also frequently do this sort of caching on result sets, RSS
                      and other remote XML, etc. as well. Many of my caches run well into the
                      10,000-50,000 file range and still perform quite well. In most cases, it
                      also sped up the application in the process as getting the contents of a
                      specific file is quicker than connecting to a database or making a
                      remote connection to fetch content. In most of my implementations , using
                      something similar to what I posted, I actually wrap the database fetch
                      itself in a cache check. It's still based on an md5() of the query, but
                      instead of storing the query, I store a serialized version of the
                      returned array from the database. I add a quick additional check next to
                      the file_exists to also check that it's new enough and delete as
                      appropriate.

                      As long as you either add cleanup code or know the churn rate of the
                      data is so low as to not be an issue, there is no maintenance. I see
                      literally hundreds of apps on Sourceforge, Hotscripts, etc. that are
                      making dozens of calls to databases on every page load for data that
                      changes twice a year. That's a complete waste of database time.

                      Comment

                      • Mladen Gogala

                        #12
                        Re: Tricky Error

                        On Tue, 10 May 2005 07:10:07 -0700, dracolytch wrote:
                        [color=blue]
                        > If you search for a NUMBER, and the system uses a LIKE statement, the
                        > string looks like this:
                        >
                        > data LIKE('%415%')
                        >
                        > Which is great... When I rawurlencode it, I get:
                        >
                        > LIKE%28%27%2541 5%25%27%29
                        >
                        > Looking at it more closely:
                        > %25 = %
                        > %27 = '
                        > %28 = (
                        > %29 = )
                        >
                        > Which is GREAT... all that makes sense.
                        >
                        > Unfortunately, when I rawurldecode that URL, I get: LIKE('A5%')
                        >
                        > It appears to be interpreting the %25 as a %, and then using that % on
                        > the next two numbers... Ideas?[/color]

                        I read the whole debate and I agree with other people which fear the SQL
                        Injection. Also, when you assemble your SQL like that, you're not helping
                        your database, because it will have to perpetually parse the new
                        statement. With modern statistics based optimizers, it can become rather
                        expensive. My solution would be to use WHERE COLUMN LIKE '%:BVAR%' and
                        then bind the passed string to the variable :BVAR. The advantage of that
                        solution is that you will parse only once. Bind variables will also take
                        care of the SQL injection.

                        --
                        Demagogue: One who preaches a doctrine he knows to be untrue to
                        men he knows to be idiots.
                        H.L. Mencken

                        Comment

                        • Mladen Gogala

                          #13
                          Re: Tricky Error

                          On Tue, 10 May 2005 07:10:07 -0700, dracolytch wrote:
                          [color=blue]
                          > If you search for a NUMBER, and the system uses a LIKE statement, the
                          > string looks like this:
                          >
                          > data LIKE('%415%')
                          >
                          > Which is great... When I rawurlencode it, I get:
                          >
                          > LIKE%28%27%2541 5%25%27%29
                          >
                          > Looking at it more closely:
                          > %25 = %
                          > %27 = '
                          > %28 = (
                          > %29 = )
                          >
                          > Which is GREAT... all that makes sense.
                          >
                          > Unfortunately, when I rawurldecode that URL, I get: LIKE('A5%')
                          >
                          > It appears to be interpreting the %25 as a %, and then using that % on
                          > the next two numbers... Ideas?[/color]

                          I read the whole debate and I agree with other people which fear the SQL
                          Injection. Also, when you assemble your SQL like that, you're not helping
                          your database, because it will have to perpetually parse the new
                          statement. With modern statistics based optimizers, it can become rather
                          expensive. My solution would be to use WHERE COLUMN LIKE '%:BVAR%' and
                          then bind the passed string to the variable :BVAR. The advantage of that
                          solution is that you will parse only once. Bind variables will also take
                          care of the SQL injection.

                          --
                          Demagogue: One who preaches a doctrine he knows to be untrue to
                          men he knows to be idiots.
                          H.L. Mencken

                          Comment

                          • dracolytch

                            #14
                            Re: Tricky Error

                            Cleaning the cache then breaks the saved URLs, so getting the behavior
                            I want with a file-based cache is not as straightforward as you're
                            making it out to be. So, while your ideas are good, their place is not
                            in this application. Encrypting the clauses, as another submitter
                            suggested, appears to be a more appropriate solution.

                            As it is, since our projects are inernal to our organization via a
                            secured and encrypted network, the audience is implicitly trusted,
                            injection is a non-issue. Frankly, if someone outside of our audience
                            got in, they're not here to corrupt our phone list.

                            ~D

                            Comment

                            • BKDotCom

                              #15
                              Re: Tricky Error


                              Mladen Gogala wrote:[color=blue]
                              > I read the whole debate and I agree with other people which fear the[/color]
                              SQL[color=blue]
                              > Injection. Also, when you assemble your SQL like that, you're not[/color]
                              helping[color=blue]
                              > your database, because it will have to perpetually parse the new
                              > statement. With modern statistics based optimizers, it can become[/color]
                              rather[color=blue]
                              > expensive. My solution would be to use WHERE COLUMN LIKE '%:BVAR%'[/color]
                              and[color=blue]
                              > then bind the passed string to the variable :BVAR. The advantage of[/color]
                              that[color=blue]
                              > solution is that you will parse only once. Bind variables will also[/color]
                              take[color=blue]
                              > care of the SQL injection.[/color]

                              Could you elaborate on "bind variables"?
                              what is this technique, or construct, or whatever you speak of?

                              Comment

                              Working...