preg_match_all Maximum execution time of 60 seconds error

  • Shuan

    preg_match_all Maximum execution time of 60 seconds error

    I am getting this error. Can it be fixed by setting max_execution_time
    to more than 60 in the php.ini file?

    Fatal error: Maximum execution time of 60 seconds exceeded in
    categorycrawler.php on line 19

    On this line I have a regular expression:

    preg_match_all( ,,)


  • Rik

    #2
    Re: preg_match_all Maximum execution time of 60 seconds error

    Shuan wrote:
    > I am getting this error. Can it be fixed by setting more than 60 for
    > the max_execution_time in the php.ini file?
    >
    > Fatal error: Maximum execution time of 60 seconds exceeded in
    > categorycrawler.php on line 19
    >
    > on this line i have regular expression
    >
    > preg_match_all( ,,)

    Possibly. It's impossible to know without seeing your actual code. 60
    seconds is really quite long. What exactly are you trying to do?

    Grtz,
    --
    Rik Wasmus



    • Shuan

      #3
      Re: preg_match_all Maximum execution time of 60 seconds error

      I am trying to grab sites like craigslist, parse them with a regular expression,
      and put some of the content into a database.

      $request->fetch( $region_link );

      if( !$request->error ){
          $pageContent = $request->results;

          $regionpattern =
              "/<a[^>]*href=\"(\/s\/SL\/sg_maY.*)\".*>.*<img.*alt=\"(.*)\".*id=\"btn.*\">/siU";

          if( preg_match_all( $regionpattern, $pageContent, $categorylinks ) ){
              for( $y = 0; $y < count( $categorylinks[ 1 ] ); $y++ ){
                  $category_link = "http://www.mysite.com" . $categorylinks[ 1 ][ $y ];
                  include( "pagecrawler.php" );
              }
          }
      }



      • Alvaro G. Vicario

        #4
        Re: preg_match_all Maximum execution time of 60 seconds error

        *** Shuan wrote (Thu, 17 Aug 2006 21:22:04 GMT):
        > I am getting this error. Can it be fixed by setting more than 60 for the
        > max_execution_time in the php.ini file?

        It's okay if you're doing a job that really needs a long time to execute,
        such as retrieving a 50 MB file from the Internet, spidering a web site or
        synchronising two databases.

        Is that the case?



        --
        -+ http://alvaro.es - Álvaro G. Vicario - Burgos, Spain
        ++ My site about web programming: http://bits.demogracia.com
        +- My humor site with UVA rays: http://www.demogracia.com
        --


        • Alvaro G. Vicario

          #5
          Re: preg_match_all Maximum execution time of 60 seconds error

          *** Shuan wrote (Thu, 17 Aug 2006 21:35:19 GMT):
          > I am trying to grab sites like craigslist, parse with regular expression
          > and put some content into database.

          Try something like this:

          ini_set('max_execution_time', 3600); // 1 hour
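
A minimal sketch of a related approach, assuming the crawl handles one page per loop iteration: set_time_limit() restarts the timeout counter from zero each time it is called, so calling it once per page gives every page its own budget instead of one large budget for the whole run. The page list and the crawl_page() helper are hypothetical stand-ins, not code from this thread.

```php
<?php
// Hypothetical list of pages to crawl; not from the original thread.
$pages = ['/page1', '/page2', '/page3'];

// Hypothetical stand-in for the real fetch-and-parse work.
function crawl_page(string $path): string
{
    return 'crawled ' . $path;
}

$results = [];
foreach ($pages as $path) {
    // Restart the timeout counter: each page gets up to 30 seconds.
    set_time_limit(30);
    $results[] = crawl_page($path);
}
```

With this layout a single stuck page still hits the limit, while a long but healthy crawl does not.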




          • Rik

            #6
            Re: preg_match_all Maximum execution time of 60 seconds error

            Shuan wrote:
            > I am trying to grab sites like craigslist, parse with regular
            > expression and put some content into database.
            >
            > $request->fetch( $region_link );
            >
            > if( !$request->error ){
            > $pageContent = $request->results;
            >
            > $regionpattern =
            > "/<a[^>]*href=\"(\/s\/SL\/sg_maY.*)\".*>.*<img.*alt=\"(.*)\".*id=\"btn.*\">/siU";
            >
            > if(preg_match_all( $regionpattern, $pageContent, $categorylinks ))

            I was almost tempted to say it was a greediness issue, before I spotted the /U.
            Dodged a bullet there :-).

            If I interpret your regex correctly, try this rewrite. (I use dots very
            sparingly; I'm more a fan of negated character classes, where greedy
            matching is more useful.) I'm not really sure it will save much in
            resource consumption, but we can try:

            '|<a[^>]*?href="(/s/SL/sg_maY[^"]*)"[^>]*>.*?<img[^>]*?alt="([^"]*)"[^>]*?id="btn[^"]*"[^>]*>|si'

            I'd also suggest a foreach loop instead of your for loop:

            foreach($categorylinks[1] as $link){
                $category_link = "http://www.mysite.com" . $link;
                include( "pagecrawler.php" ); // I'm still curious what this does...
            }

            Or, if you do use capture 2:

            if(preg_match_all( $regionpattern, $pageContent, $categorylinks, PREG_SET_ORDER )){
                foreach($categorylinks as $link){
                    $category_link = "http://www.mysite.com" . $link[1];
                    include( "pagecrawler.php" ); // I'm still curious what this does...
                }
            }

            If you still have issues, I'd like to see the actual site you're leeching
            right now :-). (If you're trying to get a page all at once, be sure to unset()
            unused or stale variables.) I don't know what your actual pagecrawler.php does,
            but if it doesn't use capture 2, you might as well not capture it.
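
A minimal sketch of the PREG_SET_ORDER variant with the rewritten pattern. The sample markup is an assumption, since the real page never appears in this thread. With PREG_SET_ORDER each element of the result is one complete match, so the href (capture 1) and the alt text (capture 2) of the same link stay together.

```php
<?php
// Hypothetical sample markup; the real page was never shown in the thread.
$pageContent = '<a href="/s/SL/sg_maY/cars"><img src="c.gif" alt="Cars" id="btn1"></a>' . "\n"
             . '<a href="/s/SL/sg_maY/jobs"><img src="j.gif" alt="Jobs" id="btn2"></a>';

// The rewritten pattern from this post, using | as the delimiter.
$regionpattern = '|<a[^>]*?href="(/s/SL/sg_maY[^"]*)"[^>]*>.*?<img[^>]*?alt="([^"]*)"[^>]*?id="btn[^"]*"[^>]*>|si';

$links = [];
if (preg_match_all($regionpattern, $pageContent, $categorylinks, PREG_SET_ORDER)) {
    foreach ($categorylinks as $link) {
        // $link[0] is the full match, $link[1] the href, $link[2] the alt text.
        $links[$link[2]] = 'http://www.mysite.com' . $link[1];
    }
}
```

Without the flag, preg_match_all() uses PREG_PATTERN_ORDER, where $categorylinks[1] holds all hrefs and $categorylinks[2] all alt texts.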

            Grtz,
            --
            Rik Wasmus



            • Shuan

              #7
              Re: preg_match_all Maximum execution time of 60 seconds error

              > It's okay if you're doing a job that really needs a long time to execute,
              > such as retrieving a 50 MB file from the Internet, spidering a web site or
              > synchronising two databases.
              >
              > Is that the case?

              I don't have to get a big file, but I need to crawl whole websites
              (about 1000 pages), and that takes time.





              • Rik

                #8
                Re: preg_match_all Maximum execution time of 60 seconds error

                Shuan wrote:
                >> It's okay if you're doing a job that really needs a long time to
                >> execute, such as retrieving a 50 MB file from the Internet, spidering
                >> a web site or synchronising two databases.
                >>
                >> Is that the case?
                >
                > i don't have to get a big file but i need to crawl the whole
                > websites( about 1000 pages )
                > and that takes time;

                Well yeah, about 1000 pages will almost certainly need more execution time;
                forget what I said about the regex. As preg_match_all is likely the most
                time-consuming part of your script (albeit relatively fast), chances are quite
                high that the limit is passed during its nth execution.

                Do you know how many pages you parse in those 60 seconds, BTW?
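
One rough way to answer that question is to count pages and time the loop with microtime(). This is only a sketch: the in-memory page list and the simple href pattern are hypothetical placeholders for the real fetch and parse steps.

```php
<?php
// Hypothetical page bodies standing in for fetched HTML.
$pages = [
    '<a href="/a">A</a>',
    '<a href="/b">B</a><a href="/c">C</a>',
    '<p>no links here</p>',
];

$start = microtime(true);
$parsed = 0;
$linkCount = 0;

foreach ($pages as $html) {
    // preg_match_all() returns the number of full matches (0 if none).
    $linkCount += preg_match_all('|<a[^>]*href="([^"]*)"|i', $html, $m);
    $parsed++;
}

$elapsed = microtime(true) - $start;
printf("%d pages, %d links, %.4f seconds\n", $parsed, $linkCount, $elapsed);
```

Dividing the page count by the elapsed time gives a pages-per-second figure to compare against the 60-second limit.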

                Grtz,
                --
                Rik Wasmus



                • Chung Leong

                  #9
                  Re: preg_match_all Maximum execution time of 60 seconds error

                  Shuan wrote:
                  > I am trying to grab sites like craigslist, parse with regular expression
                  > and put some content into database.
                  >
                  > $request->fetch( $region_link );
                  >
                  > if( !$request->error ){
                  > $pageContent = $request->results;
                  >
                  > $regionpattern =
                  > "/<a[^>]*href=\"(\/s\/SL\/sg_maY.*)\".*>.*<img.*alt=\"(.*)\".*id=\"btn.*\">/siU";

                  There is a lot of backtracking in your pattern, even though you've
                  specified ungreedy behavior. If there are many instances matching the
                  <a[^>]*href=\"(\/s\/SL\/sg_maY part of the pattern but not the rest,
                  then the .* that follows would make the regexp engine continually scan
                  to the end of the file.

                  My suggestion is to do /<a\s+href=\"(\/s\/SL\/sg_maY.*)\">(.*)<\/a>/siU
                  first, then loop through the results and regexp for the img tag.
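
A minimal sketch of that two-step idea. The sample markup and the second-step img pattern are assumptions, since neither is given in this thread. The first pattern isolates each anchor and its body; the second runs only on that short body, so a failed img match cannot scan far.

```php
<?php
// Hypothetical sample markup; the real page was never shown in the thread.
$pageContent = '<a href="/s/SL/sg_maY/cars"><img src="c.gif" alt="Cars" id="btn1"></a>' . "\n"
             . '<a href="/s/SL/sg_maY/jobs"><img src="j.gif" alt="Jobs" id="btn2"></a>';

// Step 1: the anchor pattern suggested above (ungreedy via the U modifier).
$anchorPattern = '/<a\s+href="(\/s\/SL\/sg_maY.*)">(.*)<\/a>/siU';
// Step 2: hypothetical pattern for the img tag inside one anchor body.
$imgPattern = '/<img[^>]*alt="([^"]*)"[^>]*id="btn[^"]*"/i';

$found = [];
if (preg_match_all($anchorPattern, $pageContent, $anchors, PREG_SET_ORDER)) {
    foreach ($anchors as $anchor) {
        // $anchor[1] is the href, $anchor[2] the markup between <a> and </a>.
        if (preg_match($imgPattern, $anchor[2], $img)) {
            $found[] = [$anchor[1], $img[1]];
        }
    }
}
```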


                  • Shuan

                    #10
                    Re: preg_match_all Maximum execution time of 60 seconds error

                    Hi,

                    > ini_set('max_execution_time', 3600); // 1 hour

                    This solution worked, but I need to check the regular expression
                    to see whether I am doing it efficiently.

                    Thanks, all, for your support.

