strip html but keep ' & ""

  • dgriff80
    New Member
    • Jul 2009
    • 3

    strip html but keep ' & ""

    hello all!

    I am using a form that users fill out, but for security reasons I want HTML stripped out. If the user inputs HTML, I want the form to kick back with a message to the effect that HTML was removed. What I have works just fine with one exception: I want people to be able to use apostrophes and quotes.

    Code:
    <?php
    $RemarksPure = Trim(stripslashes($_POST['remarks']));
    $Remarks = addslashes(preg_replace('#</?\w[^>]*>#', '', $RemarksPure));
    $RemarksValidationOK = true;
    $ValidationOK = true;
    if ($RemarksPure !== $Remarks) {
        // breaks validation for the form thus returning user to page to re-edit content
        $RemarksValidationOK = false;
        $ValidationOK = false;  // Whole Form Validation
    }
    ?>
    Code:
    <?php if (!$RemarksValidationOK) { echo "No HTML please!"; } ?>
    (This is shortened code)
    While this works fine for stripping HTML, it also kicks back on apostrophes and quotes. I would like the user to be able to use them, but I'm not completely sure how to do that. I want to maintain security so that people can't put errant code in the input box, but at least it shouldn't kick back on an apostrophe. Quotes would be nice, but not necessary if I'm better off leaving it as is.

    I think I have the basic concepts of these PHP functions down well enough to have gotten it working this far, but I'm betting I can modify them to work better for me:
    • preg_replace - I'm a little flaky on the syntax and how it's used, but understand how it works
    • stripslashes & addslashes - not sure I fully understand this function to properly use it for what I need.



    Thanks in advance!

    Dan
  • dlite922
    Recognized Expert Top Contributor
    • Dec 2007
    • 1586

    #2
    There are many other functions in the manual for removing HTML code only from a string. (I can also give you a regexp)

    addslashes() is good to sanitize all types of quotations for a database insertion (to prevent SQL injection), but you don't need that otherwise.

    If you are putting it in the database, the slashes added before the quotes are not stored in the DB.

    If you want to accept just plain text from the user, you should:
    1. use preg_replace() to remove all HTML tags
    2. use mysql_real_escape_string() if you're using MySQL as your DBMS, or addslashes() otherwise.
    3. If you want to display the text again to the user, without having to recall it from the Database, then just put the version after step one into a variable so that you have a non-db-safe copy without any slashes.

    Remove HTML: preg_replace("/</?[a-z][a-z0-9]*[^<>]*>/i","",$input );

    // above removes all opening and closing (case insensitive) html tags.
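The three steps above can be sketched roughly as follows. This is only an illustration under some assumptions: `strip_tags_regex` and the sample string are made up for the example, `#` is used as the regex delimiter so the `/` inside the pattern needs no escaping, and `addslashes()` stands in for the database-specific escaping function from step 2.

```php
<?php
// Hypothetical helper: removes anything that looks like an HTML tag.
// '#' is the delimiter, so the '/' in the pattern needs no escaping.
function strip_tags_regex(string $input): string
{
    return preg_replace('#</?[a-z][a-z0-9]*[^<>]*>#i', '', $input);
}

$raw = 'He said "hi" <b>loudly</b>';  // stand-in for a submitted form value

// Step 1: remove the tags.
$clean = strip_tags_regex($raw);

// Step 2: escape quotes only on the copy headed for storage.
$dbSafe = addslashes($clean);

// Step 3: $clean stays slash-free and can be shown back to the user.
echo $clean, "\n";   // He said "hi" loudly
echo $dbSafe, "\n";  // He said \"hi\" loudly
```

Keeping `$clean` and `$dbSafe` as separate variables means the slashes never leak into what the user sees.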


    Hope that helps,






    Dan


    • dgriff80
      New Member
      • Jul 2009
      • 3

      #3
      I am just emailing the content of the input box; nothing really needs to be stored. I tried integrating your preg_replace() and I got an error saying that ? was an unknown variable... I don't really know how to troubleshoot it, though.

      You did, however, finally make stripslashes and addslashes make sense... so you add slashes for certain destinations, whereas for others you just want the raw data. So thanks!

      This is the line that was giving me an error...
      Code:
      $RemarksSlashed = addslashes(preg_replace("/</?[a-z][a-z0-9]*[^<>]*>/i","",$RemarksPure));


      • dlite922
        Recognized Expert Top Contributor
        • Dec 2007
        • 1586

        #4
        You probably don't need the addslashes() then.

        Check your variable name for misspellings.

        DAN
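For the email-only case, a minimal sketch of the validation without addslashes() might look like this. The function name `check_remarks` and the sample inputs are hypothetical, and the sketch assumes magic quotes are off, so no stripslashes() call is needed. Because the cleaned string is compared against the trimmed input directly, apostrophes and quotes no longer cause a mismatch.

```php
<?php
// Hypothetical helper: returns the cleaned text and a flag that is
// true only when no HTML tag had to be removed.
function check_remarks(string $submitted): array
{
    $pure  = trim($submitted);
    $clean = preg_replace('#</?\w[^>]*>#', '', $pure);  // same pattern as the first post
    return [$clean, $pure === $clean];
}

[$text, $ok] = check_remarks("It's a \"quoted\" remark");
var_dump($ok);     // bool(true): quotes alone pass validation

[$text, $ok] = check_remarks('Hi <script>alert(1)</script>');
var_dump($ok);     // bool(false): tags were found and stripped
echo $text, "\n";  // Hi alert(1)
```

The flag can then drive the "No HTML please!" message in the same way `$RemarksValidationOK` does in the original form code.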


        • dgriff80
          New Member
          • Jul 2009
          • 3

          #5
          Thanks, I'll give it a shot!


          • Dormilich
            Recognized Expert Expert
            • Aug 2008
            • 8694

            #6
            Code:
            preg_replace("/</?[a-z][a-z0-9]*[^<>]*>/i","",$input);
            should probably be
            Code:
            preg_replace("§</?[a-z][a-z0-9]*[^<>]*>§i","",$input);
            or even
            Code:
            preg_replace("@</?\w*[^<>]*>@i","",$input);
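The variants above change only the delimiter. The reported error is most likely PHP's "Unknown modifier '?'" warning: with `/` as the delimiter, the pattern ends at the `/` in `</?`, and everything after it is read as modifiers. A small sketch of the two usual ways around this, using a made-up sample string:

```php
<?php
$input = "<i>Dan's</i> remarks";  // illustrative input only

// Option 1: keep '/' as the delimiter and escape the '/' inside the pattern.
$a = preg_replace('/<\/?[a-z][a-z0-9]*[^<>]*>/i', '', $input);

// Option 2: use a delimiter that does not occur in the pattern.
$b = preg_replace('#</?[a-z][a-z0-9]*[^<>]*>#i', '', $input);

echo $a, "\n";        // Dan's remarks
var_dump($a === $b);  // bool(true)
```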
            Maybe htmlspecialchars() is also worth looking at.
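As a rough illustration of that alternative: instead of removing tags, htmlspecialchars() converts the characters that make them work. The sample string is made up, and the `ENT_NOQUOTES` flag is chosen here on the assumption that apostrophes and quotes should stay untouched, as the original question asks.

```php
<?php
$input = '<b>Dan\'s "note"</b>';  // illustrative input only

// ENT_NOQUOTES leaves ' and " alone; <, > and & are still converted.
$escaped = htmlspecialchars($input, ENT_NOQUOTES, 'UTF-8');

echo $escaped, "\n";  // &lt;b&gt;Dan's "note"&lt;/b&gt;
```

With this approach nothing is deleted from the user's text; the markup is simply rendered harmless when shown in an HTML page.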
