Regular Expression

Collapse
X
 
  • Time
  • Show
Clear All
new posts
  • Nicodemas
    Recognized Expert New Member
    • Nov 2007
    • 164

    Regular Expression

    Hello all, could not find a regular expression forum, so I thought I would post it to my favorite of the forums.

    I have a series of applications I've developed which all use a centralized function that creates a connection object and executes an SQL query, i.e. function Query().

    I always sanitize my SQL statements to buffer all apostrophes with two apostrophes ala function Buffer().

    However, I have long wondered if I could do away with Buffer(), and use a high-end regular expression to detect a string value in an SQL expression, and automatically replace any single apostrophes in the statement with doubles.

    Psuedocode example:

    Code:
    
    Query sSQL, rs
    
    ....
    
    (now inside [B]Query()[/B])
    
    ..
    ' create connection, or use already open connection
    ' create new recordset 
    ' regular expression analyzes SQL statement passed in thru arguments, detects the apostrophe in "o'toole" (but not the apostrophes surrounding "george o'toole") and VBScript replaces it with two apostrophes so that sSQL now looks like this:
    
    sSQL = "select id from users where name = 'george o''toole'"
    
    ' execute sql statement, return recordset to calling program
    I already have most of the logic done, but I am CRAP at regular expressions. Anyone think they've got the gusto to work with me on this one? I've tried a couple of times to write it myself, but keep giving up in frustration.
  • NeoPa
    Recognized Expert Moderator MVP
    • Oct 2006
    • 32645

    #2
    I've done some playing with regular expressions (REs), but I can't see why any special handling would be required here. The apostrophe (') is not treated specially so surely a simple change of {'} to {''} (ignore the {} - I just used them to delineate the two strings) would work for you.
    I should say I'm also no ASP user, but simply on the RE front I can't see why they would be required. If there's something I've overlooked I'm more than happy to lend what experience I have to try to help.
    BTW If you're using VBScript then I would expect the following to work :
    Code:
    sSQL = "SELECT [ID] " & _
           "FROM [Users] " & _
           "WHERE [Name] = '" & Replace(sName, "'", "''") & "'"

    Comment

    • Motoma
      Recognized Expert Specialist
      • Jan 2007
      • 3236

      #3
      As my compatriot NeoPa has already stated, there is no reason to use regular expressions to deal with apostrophes. Regex's are typically orders of magnitude more expensive than a simple linear search, so the Replace() function is your best and quickest solution.

      Comment

      • Nicodemas
        Recognized Expert New Member
        • Nov 2007
        • 164

        #4
        I'm afraid there is a little confusion as to my specific goal, as well as my specific situation.

        I have a global function (i.e. accessible by any script in my application) named Query(). It accepts two arguments, an SQL statement, and any variant which comes back as a recordset, i.e.

        Code:
        dim rs      '// passed back to script by Query as a recordset.
        dim sSQL    '// a string
        dim sName   '// a user submitted value
        
        sName = request.form("txtName")
        
        '//----------------------------------------------------
        '// assume the user entered: O'Toole
        '//----------------------------------------------------
        
        sSQL = "SELECT * FROM Administrators WHERE lastName = "& sName
        
        if Query(sSQL, rs) then
           'life goes on...
        else
           'hang yourself
        end if
        As you can see, Query can accept any type of SQL statement as its first argument, e.g. SELECTs, UPDATEs, DELETEs, etc...

        If I were to run the above SQL against a database, it would error because of the apostrophe in "O'Toole."

        Until present, I have have used a custom function to perform inline replace statements on content concatenated onto SQL statements, e.g.

        Code:
        '//----------------------------------------------------
        '// function Buffer just does a replace
        '// replace(randomString, "'", "''")
        '//----------------------------------------------------
        sSQL = "SELECT * FROM Administrators WHERE lastName = "& Buffer(sName)
        It is my desire to no longer do this! I want to pass an unsanitized string into Query(), and have it sanitized by a Regular Expression, then executed against the database.

        Such a Regular Expression would detect any apostrophes that were not a part of the SQL syntax, then it would replace just those apostrophes found with two apostrophes.

        A few examples:

        Code:
        SELECT * FROM Administrators WHERE lastName = 'O'Toole';
        '// would be changed to:
        SELECT * FROM Administrators WHERE lastName = 'O[B]''[/B]Toole';
        
        
        INSERT INTO Administrators (lastName, firstName, pword) VALUES ('O'Toole', 'Jacob', 'linux');
        '// would be changed to:
        INSERT INTO Administrators (lastName, firstName, pword) VALUES ('O[B]''[/B]Toole', 'Jacob', 'linux');
        
        
        UPDATE Administrators SET lastName = 'O'Toole' WHERE adminID = 4;
        '// would be changed to:
        UPDATE Administrators SET lastName = 'O[B]''[/B]Toole' WHERE adminID = 4;
        
        
        DELETE FROM Administrators WHERE lastName = 'O'Toole';
        '// would be changed to:
        DELETE FROM Administrators WHERE lastName = 'O[B]''[/B]Toole';
        All of this would happen without me ever writing another inline replace statement, or using Buffer() ever again.

        Comment

        • NeoPa
          Recognized Expert Moderator MVP
          • Oct 2006
          • 32645

          #5
          That's fair enough Nicodemas, but you appear to be expecting us to design a full SQL parser here for you, rather than some code to handle replacement of YOUR defined understanding of what's required.
          If you want to define what you understand to be the patterns to be recognised I can help. If you want me to look up all the syntax for you, and then build a routine, incorporating all possible situations, for you, then I'm afraid I won't do that.

          ** EDIT **
          This sounds a little harsh maybe, and rereading the OP I see you are asking for someone to work with you on this. I'm happy to do it that way, but I think the defining the parsing part is the difficult bit.

          Comment

          • NeoPa
            Recognized Expert Moderator MVP
            • Oct 2006
            • 32645

            #6
            I should add that it's good security practice among designer / developers of database front-ends for each string taken from the operator to pass through a "security" function, which disallows single quotes (apostrophes) from being passed back as they stand. This is to protect against SQL Insertion attacks.

            It may be appropriate for you to consider taking this (much more straightforward ) approach instead.
            This will not handle passing names taken from your database tables into any SQL though, so may not suit your needs fully.

            I should also add that your first post was actually expressed admirably clearly. Sometimes it's just hard to take it all in without missing some important details though I'm afraid.

            Comment

            • NeoPa
              Recognized Expert Moderator MVP
              • Oct 2006
              • 32645

              #7
              As a final comment, I would just say that whatever algorithm is used here, I don't thing REs (per-se) can be. If a concept can be conceived (a fairly big ask), then this would have to be implemented in standard code. The concept may end up as something similar to what is found when working with REs, but I can see no way to utilise an RE system within this sort of situation.
              The reason I mention this (it may seem obvious) is that it means RE experience is unlikely to be of help in this situation, although pattern matching skills may help to determine what the algorithm should be that we're looking for maybe.

              Comment

              • Motoma
                Recognized Expert Specialist
                • Jan 2007
                • 3236

                #8
                Originally posted by Nicodemas
                I'm afraid there is a little confusion as to my specific goal, as well as my specific situation.

                I have a global function (i.e. accessible by any script in my application) named Query(). It accepts two arguments, an SQL statement, and any variant which comes back as a recordset, i.e.

                Code:
                dim rs      '// passed back to script by Query as a recordset.
                dim sSQL    '// a string
                dim sName   '// a user submitted value
                
                sName = request.form("txtName")
                
                '//----------------------------------------------------
                '// assume the user entered: O'Toole
                '//----------------------------------------------------
                
                sSQL = "SELECT * FROM Administrators WHERE lastName = "& sName
                
                if Query(sSQL, rs) then
                   'life goes on...
                else
                   'hang yourself
                end if
                As you can see, Query can accept any type of SQL statement as its first argument, e.g. SELECTs, UPDATEs, DELETEs, etc...

                If I were to run the above SQL against a database, it would error because of the apostrophe in "O'Toole."

                Until present, I have have used a custom function to perform inline replace statements on content concatenated onto SQL statements, e.g.

                Code:
                '//----------------------------------------------------
                '// function Buffer just does a replace
                '// replace(randomString, "'", "''")
                '//----------------------------------------------------
                sSQL = "SELECT * FROM Administrators WHERE lastName = "& Buffer(sName)
                It is my desire to no longer do this! I want to pass an unsanitized string into Query(), and have it sanitized by a Regular Expression, then executed against the database.

                Such a Regular Expression would detect any apostrophes that were not a part of the SQL syntax, then it would replace just those apostrophes found with two apostrophes.

                A few examples:

                Code:
                SELECT * FROM Administrators WHERE lastName = 'O'Toole';
                '// would be changed to:
                SELECT * FROM Administrators WHERE lastName = 'O[B]''[/B]Toole';
                
                
                INSERT INTO Administrators (lastName, firstName, pword) VALUES ('O'Toole', 'Jacob', 'linux');
                '// would be changed to:
                INSERT INTO Administrators (lastName, firstName, pword) VALUES ('O[B]''[/B]Toole', 'Jacob', 'linux');
                
                
                UPDATE Administrators SET lastName = 'O'Toole' WHERE adminID = 4;
                '// would be changed to:
                UPDATE Administrators SET lastName = 'O[B]''[/B]Toole' WHERE adminID = 4;
                
                
                DELETE FROM Administrators WHERE lastName = 'O'Toole';
                '// would be changed to:
                DELETE FROM Administrators WHERE lastName = 'O[B]''[/B]Toole';
                All of this would happen without me ever writing another inline replace statement, or using Buffer() ever again.
                I am sorry Nicodemas, but this simply isn't how input sanitization works.
                Take the following instances, how would these be properly quoted?
                SELECT * FROM users WHERE username = 'Motoma'
                SELECT * FROM users WHERE username = 'Motoma' AND password = 'nothing'

                If you look at that and say "they are both correctly escaped" you could be wrong. While the first one is obviously correct, the second one could have have been correct, or it could have been the result of SQL injection with the input being "Motoma' AND password = 'nothing'"
                The problem is, there is _no_reasonable_ way_ to programmaticall y tell while keeping your Query function sufficiently powerful.

                Comment

                • Nicodemas
                  Recognized Expert New Member
                  • Nov 2007
                  • 164

                  #9
                  I concede. Exhaustive queries, searching, testing have rendered nothing. Thanks for your inputs.

                  Comment

                  Working...