compress html output with php?

  • Ciaran

    compress html output with php?

    Hi, I know about
    ob_start( 'ob_gzhandler' );
    But I'm looking for something that removes all line breaks and extra
    whitespace in the html before sending it to the visitor's browser. Is
    this possible?

    Cheers,
    Ciarán

  • petersprc

    #2
    Re: compress html output with php?

    Hi,

    This comment has an example of stripping whitespace:



    On Mar 22, 9:48 pm, "Ciaran" <cronok...@hotmail.com> wrote:
    > Hi, I know about
    > ob_start( 'ob_gzhandler' );
    > But I'm looking for something that removes all line breaks and extra
    > whitespace in the html before sending it to the visitor's browser. Is
    > this possible?
    >
    > Cheers,
    > Ciarán


    • Ciaran

      #3
      Re: compress html output with php?

      Hmm, yeah looks good. Thanks a lot Peter.

      Does anyone know if it's worth it? I mean, is the time spent running
      the function less than the time it would take to download the longer
      html file? Any thoughts on this?

      Cheers,
      Ciarán


      • Ciaran

        #4
        Re: compress html output with php?

        OK, I ran a few tests on my slowest page. Here are my results:

        WITHOUT COMPRESSION FUNCTION:
        Page Size: 523.97 KB
        Load Time: 0.9995 seconds
        Load Time: 0.8 seconds
        Load Time: 0.8095 seconds
        Load Time: 0.7091 seconds
        Load Time: 0.7223 seconds

        WITH COMPRESSION FUNCTION:
        Page Size: 494.77 KB
        Load Time: 0.8448 seconds
        Load Time: 0.8307 seconds
        Load Time: 0.8307 seconds
        Load Time: 0.8444 seconds
        Load Time: 0.9014 seconds

        AVERAGE LOAD TIME WITHOUT COMPRESSION FUNCTION: 0.80808 seconds
        AVERAGE LOAD TIME WITH COMPRESSION FUNCTION: 0.8504 seconds
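
The averages above can be re-derived, and turned into relative changes, with a few lines of PHP. This is only a sketch of the arithmetic: the numbers are copied from this post, and the helper names `mean()` and `percent_change()` are made up for illustration. For this one page it works out to roughly 5% more time for roughly 5.6% fewer bytes.

```php
<?php
// Arithmetic mean of a list of samples.
function mean(array $samples): float
{
    return array_sum($samples) / count($samples);
}

// Relative change in percent, measured against the baseline value.
function percent_change(float $baseline, float $value): float
{
    return ($value - $baseline) / $baseline * 100;
}

// Load times (seconds) and page sizes (KB) copied from the post above.
$without     = [0.9995, 0.8, 0.8095, 0.7091, 0.7223];
$with        = [0.8448, 0.8307, 0.8307, 0.8444, 0.9014];
$sizeWithout = 523.97;
$sizeWith    = 494.77;

$avgWithout = mean($without);
$avgWith    = mean($with);

printf("Average without: %.5f s\n", $avgWithout);
printf("Average with:    %.5f s\n", $avgWith);
printf("Time change:     %+.1f%%\n", percent_change($avgWithout, $avgWith));
printf("Size change:     %+.1f%%\n", percent_change($sizeWithout, $sizeWith));
```

Note that the spread inside each group of five runs (about 0.71 to 1.00 seconds without the function) is much larger than the gap between the two averages, so five samples per case may not be enough to separate the two reliably.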

        Hope that helps someone!
        Cheers,
        Ciarán


        • Ciaran

          #5
          Re: compress html output with php?

          Dammit, I'm confused! I'm not sure how accurate this info is. When I
          downloaded the uncompressed version it was 613 KB, while the compressed
          version of the same page was a tiny 48 KB! Surely the download speed of
          the page has to be considered?
          Anyone?


          • Richard Formby

            #6
            Re: compress html output with php?

            Ciaran wrote:

            > Does anyone know if it's worth it? [compressing HTML] I mean, is the
            > time spent running the function less than the time it would take to
            > download the longer html file? Any thoughts on this?

            This comes up over at alt.html all the time. I have seen arguments put
            forward where compressing a 50K file to 30K is a good thing *but* that 50K
            file links to 500K of images. It's the images that cause the trouble.

            --
            Richard.



            • Toby A Inkster

              #7
              Re: compress html output with php?

              Richard Formby wrote:
              > Does anyone know if it's worth it? [compressing HTML] I mean, is the time
              > spent running the function less than the time it would take to download
              > the longer html file? Any thoughts on this?

              Not the way Ciaran's attempting to do it. Gzipping HTML as you send it
              will lead to a dramatic reduction in file size. (The resultant file will
              probably be half the size of the original, or even smaller.) What's more,
              the compression is done in well-optimised C code, so it takes very little
              time. You can enable it using particular settings in php.ini or
              Apache, so it doesn't require any modification to your PHP code base,
              making it very easy to toggle on or off as required.

              Stripping out redundant whitespace in a file leads to a small reduction in
              file size. Depending on how much whitespace there is in the first place,
              you might shave off 10% or so from the file size. The compression is
              typically done using PHP and regular expressions, which is slower than the
              method above. It generally requires you to make some modifications to your
              PHP code. What's more, it's error-prone. Whitespace is significant in some
              places (e.g. within PRE, TEXTAREA and SCRIPT elements). Most whitespace
              stripping scripts get this wrong in certain places -- getting it right
              requires even more careful effort parsing the HTML, and slows the script
              down even more.

              Zipping HTML content in transit can save significant bandwidth on
              mainly textual websites without using much extra CPU time.
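
To get a feel for the difference between the two approaches described above, something like the following could be run from the command line. It is a rough sketch under stated assumptions: the sample page is invented and very repetitive (so gzip does better here than it would on a real page), the zlib extension must be available, and the whitespace stripping is the naive kind that is not safe inside PRE, TEXTAREA or SCRIPT.

```php
<?php
// Hypothetical sample page: repetitive, indented markup standing in for
// real output. Real pages are less repetitive, so gzip will gain less.
$row  = "    <tr>\n        <td class=\"name\">Item</td>\n"
      . "        <td class=\"price\">9.99</td>\n    </tr>\n";
$html = "<html>\n<body>\n<table>\n" . str_repeat($row, 500)
      . "</table>\n</body>\n</html>\n";

// Naive whitespace stripping: collapse every run of whitespace to one space.
// Used here only to compare sizes, not as a recommended filter.
$stripped = preg_replace('/\s+/', ' ', $html);

// gzencode() emits gzip-format data, the encoding ob_gzhandler uses when
// the browser says it accepts gzip.
$gzipped         = gzencode($html);
$strippedGzipped = gzencode($stripped);

$sizes = [
    'original'           => strlen($html),
    'stripped only'      => strlen($stripped),
    'gzipped only'       => strlen($gzipped),
    'stripped + gzipped' => strlen($strippedGzipped),
];

foreach ($sizes as $label => $bytes) {
    printf("%-20s %7d bytes (%5.1f%% of original)\n",
        $label, $bytes, $bytes / $sizes['original'] * 100);
}
```

Swapping in the saved output of a real page for `$html` gives numbers that are more meaningful than this synthetic table.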

              --
              Toby A Inkster BSc (Hons) ARCS
              Contact Me ~ http://tobyinkster.co.uk/contact
              Geek of ~ HTML/SQL/Perl/PHP/Python*/Apache/Linux

              * = I'm getting there!


              • Ciaran

                #8
                Re: compress html output with php?

                Hi again fellas, thanks for the replies on this. I'm using a
                combination of ob_gzhandler and this php function to compress my
                pages. Depending on the page, I'm getting a small increase in server
                side time and a huge reduction in the size of the outputted html page
                (and bandwidth!). The problem is, as Toby mentioned, the compression
                function messes up some things. One thing I've noticed is some of my
                javascript functions are breaking because of it so I'm only adding it
                on select pages. I love the result so is there any way to stop it
                breaking things or is there a better way to get the same effect?

                Here's the function:

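                function compress($buffer){
                // 1: drop whitespace (other than plain spaces) after a tag
                // 2: drop whitespace (other than plain spaces) before a tag
                // 3: collapse each run of whitespace to its last character
                $search = array('/\>[^\S ]+/s','/[^\S ]+\</s','/(\s)+/s');
                $replace = array('>','<','\\1');
                $buffer = preg_replace($search, $replace, $buffer);
                return $buffer;
                }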

                Cheers,
                Ciarán


                • Colin McKinnon

                  #9
                  Re: compress html output with php?

                  Ciaran wrote:
                  > Dammit, I'm confused! I'm not sure how accurate this info is. When I
                  > downloaded the uncompressed version it was 613 KB, while the compressed
                  > version of the same page was a tiny 48 KB! Surely the download speed of
                  > the page has to be considered?
                  > Anyone?

                  Sorry Ciaran, but your metrics are meaningless unless you can tell us what
                  you were measuring (hardware at each end, intervening network hardware,
                  bandwidths, RTT, network latency, request latency...).

                  I find it very hard to believe that the gz handler would only reduce a 524 KB
                  HTML or Text file to 494 KB. I think your methodology is flawed.

                  C.



                  • Toby A Inkster

                    #10
                    Re: compress html output with php?

                    Ciaran wrote:
                    > The problem is, as Toby mentioned, the compression function messes up
                    > some things.

                    I can share a little code with you I suppose... I happen to do exactly the
                    opposite of what you're describing -- add *more* whitespace to some HTML,
                    in order to pretty-print it. Obviously, this screws up when you get inside
                    a PRE or TEXTAREA element, so I made my function smart enough to know when
                    it's inside one of those.

                    Download demiblog for free. DemiBlog is part blog engine, part content management system, part photo gallery and part message board. It supports MySQL and PostgreSQL backends, and outputs valid HTML 4 or XHTML 1.x.


                    It's the indent_html() function you're looking for. Obviously, you'll need
                    to work at it a bit to get it to do what you want, but you should see that
                    it fairly reliably knows at each point whether or not it's within a "safe
                    tag" or not.

                    That said, I'd still advise against your plan. Gzipping your files will be
                    far more effective, more reliable and easier.
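
For anyone who wants to try whitespace stripping anyway, the "safe tag" idea described above could be sketched roughly like this. This is not the indent_html() function from demiblog; the function name `compress_outside_protected()` is made up for illustration, and the code assumes that PRE, TEXTAREA, SCRIPT and STYLE elements are well-formed and never nested inside one another. It is a regex split, not a real HTML parser.

```php
<?php
// Collapse whitespace runs to a single space everywhere except inside
// PRE, TEXTAREA, SCRIPT and STYLE elements, which are passed through as-is.
function compress_outside_protected(string $html): string
{
    // One capture group, so preg_split() returns alternating chunks:
    // even indexes are ordinary markup, odd indexes are protected elements.
    $protected = '#(<pre\b.*?</pre>|<textarea\b.*?</textarea>'
               . '|<script\b.*?</script>|<style\b.*?</style>)#is';

    $parts = preg_split($protected, $html, -1, PREG_SPLIT_DELIM_CAPTURE);
    if ($parts === false) {
        // The regex engine gave up (e.g. backtrack limit): send the page unchanged.
        return $html;
    }

    foreach ($parts as $i => $part) {
        if ($i % 2 === 0) {
            // Keep one space so that words and inline elements stay separated.
            $parts[$i] = preg_replace('/\s+/', ' ', $part);
        }
    }

    return implode('', $parts);
}

$sample = "<div>\n   <p>a   b</p>\n<pre>  x\n  y </pre>\n"
        . "<script>\n// c\nf();\n</script>\n</div>";
$result = compress_outside_protected($sample);
echo $result, "\n";

// Possible use as an output filter, layered inside the gzip handler:
// ob_start('ob_gzhandler');
// ob_start('compress_outside_protected');
```

Because script blocks are left alone, `//` comments and line-break-sensitive JavaScript are not touched, which is the kind of breakage described earlier in the thread. Inline event handlers and conditional comments are not considered here.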

                    --
                    Toby A Inkster BSc (Hons) ARCS
                    Contact Me ~ http://tobyinkster.co.uk/contact
                    Geek of ~ HTML/SQL/Perl/PHP/Python*/Apache/Linux

                    * = I'm getting there!


                    • Ciaran

                      #11
                      Re: compress html output with php?

                      > Sorry Ciaran, but your metrics are meaningless unless you can tell us what
                      > you were measuring (hardware at each end, intervening network hardware,
                      > bandwidths, RTT, network latency, request latency...).

                      I don't see why that matters. All I'm measuring is the increase in
                      speed. The speed itself is not an issue.

                      > I find it very hard to believe that the gz handler would only reduce a 524 KB
                      > HTML or Text file to 494 KB. I think your methodology is flawed.

                      Sorry, I don't think I made this clear: I'm already using the gz
                      handler. The stats I posted are using the 'homemade' compression
                      function I posted that removes whitespace.



                      • Ciaran

                        #12
                        Re: compress html output with php?

                        Thanks for the info Toby - I'll check it out when I get a chance.

                        > That said, I'd still advise against your plan. Gzipping your files will be
                        > far more effective, more reliable and easier.

                        I was actually planning on doing both. I've always been using gzip
                        compression - I started this thread in the hope I could squeeze a bit
                        more compression in there.



                        THE BOTTOM LINE:
                        Using the (temperamental) compression function posted earlier I'm
                        getting a reduction of 4.7% in filesize but my server is 5% slower at
                        throwing the pages together. I guess that means using the function
                        will save a small amount of bandwidth at the expense of a tiny
                        increase in page access time. You can make up your own minds whether
                        that's worth it! ;)

                        Ciarán
