Remove all javascript code/content/references from the HTML code using Perl code lang

  • shivsa
    New Member
    • Apr 2010
    • 1

    Remove all javascript code/content/references from the HTML code using Perl code lang

    Hi Experts,

    I have a simple Perl script with a variable, say

    $original = '<html>........ ............... ............... ............... ............... ......</html>'

    Between the <html> tags there is a lot of JavaScript mixed in with the HTML: <script> ..... </script> blocks as well as inline JavaScript references such as <input type="text" name="ok" value="2" onclick="javascript:function('value')">.

    I want to remove everything JavaScript-related and keep only the HTML code, content, and tags.

    Can you please help me out?

    Thanks
    Shiv
  • chaarmann
    Recognized Expert Contributor
    • Nov 2007
    • 785

    #2
    Just use a non-greedy regular expression to delete the JavaScript:
    replace "<script>.*?</script>" with an empty string everywhere.
    Take care of upper/lowercase characters, either by transforming the string before replacement or by putting "(?i)" in front of your regular expression.
    Extend your regular expression to allow whitespace inside the tags and any attributes that follow inside the opening tag, for example "<script language=javascript color=red>" or "</script >".
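
The advice above could be sketched in Perl roughly as follows. This is a minimal illustration, not a complete sanitizer. The sample HTML, the `strip_javascript` helper name, and the two extra substitutions for inline `on...` handlers and `javascript:` URLs are assumptions added to cover the inline case from the question. Note that `.` does not match newlines in Perl by default, so the `/s` modifier is needed for multi-line script blocks.

```perl
use strict;
use warnings;

# Hypothetical sample input; the original post elides the real HTML.
my $original = <<'HTML';
<html><body>
<SCRIPT language="javascript">alert("hi");</SCRIPT >
<input type="text" name="ok" value="2" onclick="doSomething('value')">
<a href="javascript:void(0)">link</a>
</body></html>
HTML

# Hypothetical helper: strips the most common JavaScript constructs.
sub strip_javascript {
    my ($html) = @_;

    # Remove <script ...> ... </script > blocks.
    # /i ignores case, /s lets "." match newlines, ".*?" is non-greedy.
    $html =~ s{<script\b[^>]*>.*?</script\s*>}{}gis;

    # Remove inline event-handler attributes such as onclick="...".
    # Assumption: any attribute whose name starts with "on" is a handler.
    $html =~ s{\s+on\w+\s*=\s*(?:"[^"]*"|'[^']*'|[^\s>]+)}{}gi;

    # Replace javascript: URLs in quoted href/src attributes with "#".
    $html =~ s{\b(href|src)\s*=\s*(["'])\s*javascript\s*:[^"']*\2}{$1=$2#$2}gi;

    return $html;
}

my $clean = strip_javascript($original);
print $clean;
```

Regular expressions like these are fragile on real-world markup: the handler pattern can also match ordinary text that happens to look like `onfoo=bar`, and scripts hidden in comments or malformed tags are not handled. For untrusted input, a proper HTML parser such as the CPAN module HTML::Parser is likely a safer choice.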
