process a BIG string

  • quamis@gmail.com

    process a BIG string

    Hi,
    I need to process every character in a file, so I open the file, read it
    in buffers of about 8192 bytes, process each buffer, and then write
    the output to another file.

    The problem is that with large files (>8 MB) I get a script error (Fatal
    error: Maximum execution time of 30 seconds exceeded).
    I access every character in the buffer with
    $chr = ord($bufferIn{$i}); (where $i = 0...8191)
    It seems that almost all the time the script consumes is spent in the for
    loop and the chr/ord functions.

    Can I do something to speed things up?
    Is there any other way of accessing a single character except
    $bufferIn{$i}?
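
    For reference, the loop described above might look like the following
    minimal sketch (the file names and the pass-through transform are
    placeholder assumptions; on PHP 7.4+ the $str[$i] bracket syntax should
    be used instead of $str{$i}):

    ```php
    <?php
    // Create a small demo input so the sketch is self-contained.
    file_put_contents('input.dat', str_repeat("Hello, world!\n", 500));

    $in  = fopen('input.dat', 'rb');
    $out = fopen('output.dat', 'wb');

    while (!feof($in)) {
        $bufferIn  = fread($in, 8192);   // read ~8192-byte buffers
        $len       = strlen($bufferIn);
        $bufferOut = '';
        for ($i = 0; $i < $len; $i++) {
            $byte = ord($bufferIn[$i]);  // one ord() call per byte (the slow part)
            $bufferOut .= chr($byte);    // placeholder: transform the byte here
        }
        fwrite($out, $bufferOut);
    }
    fclose($in);
    fclose($out);

    // One way to avoid a function call per byte: unpack the whole buffer
    // into an array of byte values in a single call.
    $bytes = unpack('C*', "AB");         // 1-indexed: [1 => 65, 2 => 66]
    ```

    unpack('C*', $buffer) returns the buffer as a 1-indexed array of integer
    byte values, so the inner loop can index an array instead of calling ord()
    on every character; whether that is faster in practice is worth
    benchmarking.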

  • Erwin Moller

    #2
    Re: process a BIG string

    quamis@gmail.com wrote:
    <snip>
    Hi,

    I am not sure whether you can speed things up in your script, but why not
    simply increase max_execution_time?

    This can be done with
    ini_set("max_execution_time", 60);
    for 60 seconds.

    Have a look here for more options:
    http://nl2.php.net/manual/en/ini.php#ini.list

    You might also hit the ceiling of your memory usage if you handle and
    process very large files. In that case look at: memory_limit (defaults to 8MB)
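
    The two overrides mentioned can be set at the top of the script (a sketch;
    the values are illustrative, and some hosts lock these settings so the
    calls may have no effect there):

    ```php
    <?php
    // Raise the limits discussed above; the values are illustrative.
    ini_set('max_execution_time', 60);   // seconds, instead of the default 30
    ini_set('memory_limit', '64M');      // instead of the old 8MB default
    ```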

    Hope that helps.

    Regards,
    Erwin Moller


    • _q_u_a_m_i_s's

      #3
      Re: process a BIG string

      I won't hit the memory limit because I only read about 10 to 64 KB at a
      time from the file, process it, and output it to the other one.
      I'm not sure that increasing the execution time is the solution;
      this thing should be able to process files of about 100-200 MB, and if I
      need 30 seconds for a 2-3 MB file, I can't imagine how long it will take
      to process a 100 MB file. During this time the user won't get any
      feedback...
      I'm going to take your suggestion into consideration; a longer execution
      time and a FAST server might do the trick...

      Erwin Moller wrote:
      <snip>


      • CptDondo

        #4
        Re: process a BIG string

        quamis@gmail.com wrote:
        <snip>
        I know this may be sacrilege on this list :-), but perhaps you might
        consider something like C? If all you're doing is processing single
        chars, and you don't need a lot of the data-structure and management
        facilities PHP provides, C might be a much faster alternative....

        --Yan


        • Jerry Stuckle

          #5
          Re: process a BIG string

          _q_u_a_m_i_s's wrote:
          <snip>
          If you've got that big a file to process, you might need to think
          about a different approach. Even a faster server probably won't get a
          200 MB file done in a reasonable time. And if you extend your time
          limit, your browser might time out (although you could send data every
          once in a while to keep the connection active, this isn't foolproof
          either).

          In your situation I think I'd do it in a compiled language such as C
          and set it up as a batch job, or even a PHP extension. The C routine
          will run much faster than PHP and should solve a lot of your problems.
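
          One way to wire up the batch-job idea from PHP is to launch the
          compiled helper in the background (a sketch; ./process_file is a
          hypothetical binary and the file names are placeholders, not anything
          from this thread; nohup assumes a Unix-like host):

          ```php
          <?php
          // Hypothetical: hand the heavy lifting to a compiled program,
          // launched in the background so the web request returns immediately.
          $cmd = './process_file input.dat output.dat';  // placeholder binary + files
          exec('nohup ' . $cmd . ' > /dev/null 2>&1 &');
          ```

          The request then only needs to kick the job off; progress can be
          checked later, for example by polling the output file's size.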


          --
          ==================
          Remove the "x" from my email address
          Jerry Stuckle
          JDS Computer Training Corp.
          jstucklex@attglobal.net
          ==================


          • Colin McKinnon

            #6
            Re: process a BIG string

            quamis@gmail.com wrote:
            <snip>
            can I do something to speed things up?
            Not without resolving the problem of why you need to examine each
            character.

            I would suggest that before commencing the processing you put:

            set_time_limit(0);
            ignore_user_abort(); // in case your browser times out

            ...then, in the read loop but outside the char loop, provide a
            mechanism for stopping runaways:

            if ((rand(0, 20) == 10) && file_exists("break")) {
                break;
            }
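
            Put together, the read loop with the runaway brake might look like
            this self-contained sketch (the file names and the pass-through
            processing are placeholders):

            ```php
            <?php
            // Sketch of the advice above: no time limit, survive a browser
            // disconnect, and poll for a sentinel file named "break" so a
            // runaway job can be stopped from the shell.
            set_time_limit(0);
            ignore_user_abort(true);

            file_put_contents('input.dat', str_repeat('a', 20000));  // demo input

            $in  = fopen('input.dat', 'rb');
            $out = fopen('output.dat', 'wb');
            while (!feof($in)) {
                $buffer = fread($in, 8192);
                // ... per-character processing of $buffer goes here ...
                fwrite($out, $buffer);
                // Cheap runaway brake: stat the sentinel on ~1 in 21 passes.
                if ((rand(0, 20) == 10) && file_exists('break')) {
                    break;
                }
            }
            fclose($in);
            fclose($out);
            ```

            Checking file_exists() only on a random fraction of iterations keeps
            the extra stat() cost out of the hot loop.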

            C.


            • _q_u_a_m_i_s's

              #7
              Re: process a BIG string

              I will use that.

              A good piece of advice would also be

              so I can "resume" file processing from time to time... but I guess
              it would be better to just take the time and do the whole
              file processing at once.
              Also, this thing seems interesting:
              http://ro2.php.net/manual/en/functio...imit.php#54023 - I may do
              output to the browser, telling the user the progress with that file.
              Thanks a lot.

              PS: What do you mean by runaways? Do you mean that the script could
              enter an infinite loop? Or is it something I'm missing here?
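
              The progress-reporting idea mentioned above (periodic output to the
              browser) can be sketched like this; the input file and chunk size
              are assumptions, and flush() pushes each update down the open
              connection:

              ```php
              <?php
              // Sketch: echo a percentage after each chunk so the user sees progress.
              file_put_contents('input.dat', str_repeat('x', 100000));  // demo input

              $total = filesize('input.dat');
              $done  = 0;

              $in = fopen('input.dat', 'rb');
              while (!feof($in)) {
                  $buffer = fread($in, 8192);
                  $done  += strlen($buffer);
                  // ... process $buffer ...
                  printf("%d%%\n", (int)($done * 100 / $total));
                  flush();  // push the update to the browser instead of buffering it
              }
              fclose($in);
              ```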


              Colin McKinnon wrote:
              <snip>

