Curl gives 403 forbidden

This topic is closed.
  • Basta

    Curl gives 403 forbidden

    I'm trying to retrieve information from a website using PHP and Curl.
    This is the code I use:

    <?php
    $tturl = "http://teletekst.nos.nl/";
    echo "opening $tturl ...\n";
    $ch = curl_init();
    if (! $ch) die("Cannot allocate a new PHP-CURL handle\n");
    // Write the response body straight to a local file.
    $fp = fopen("ttread.htm", "w");
    if (! $fp) die("Cannot open ttread.htm for writing\n");
    curl_setopt($ch, CURLOPT_FILE, $fp);
    curl_setopt($ch, CURLOPT_URL, $tturl);
    curl_exec($ch);
    curl_close($ch);
    fclose($fp);
    echo "finished\n";
    ?>

    This results in a 403 forbidden page. However, if I type the URL
    http://teletekst.nos.nl/ in my browser then it works fine (also with
    cookies disabled). If I change $tturl in the script to
    http://www.nos.nl/ it works. What is the difference between typing
    it in my browser and accessing it with curl? Is there a workaround for
    this?

    Greetingz Bas

  • Philip Ronan

    #2
    Re: Curl gives 403 forbidden

    "Basta" wrote:

    > I'm trying to retrieve information from a website using PHP and Curl.
    > This is the code I use:
    >
    (snip)
    >
    > This results in a 403 forbidden page. However, if I type the URL
    > http://teletekst.nos.nl/ in my browser then it works fine (also with
    > cookies disabled).

    That's probably because the owners of teletekst.nos.nl are fed up with
    having idiot robots crawling all over their site and stealing its content.

    If you had bothered to visit <http://teletekst.nos.nl/robots.txt> you might
    have noticed that robots are not permitted to access this website. You're
    getting a 403 response because their website has identified that you're
    accessing it improperly.
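
For anyone who wants to check a robots.txt file from a script before fetching pages, here is a rough sketch in PHP. The function name robots_disallows() and the sample file body are made up for illustration; the real file on the site may look different. The matching is deliberately simplified: it only reads User-agent and Disallow lines, does a plain prefix match, and ignores Allow lines and wildcards.

```php
<?php
// Hypothetical helper: does this robots.txt body disallow $path for $agent?
// Simplified sketch: prefix matching only, no Allow lines, no wildcards.
function robots_disallows(string $robotsTxt, string $path, string $agent = '*'): bool
{
    $applies = false;      // are we inside a group that applies to $agent?
    $inAgentLines = false; // true while reading consecutive User-agent lines

    foreach (preg_split('/\r\n|\r|\n/', $robotsTxt) as $line) {
        $line = trim(preg_replace('/#.*/', '', $line)); // strip comments
        if ($line === '' || strpos($line, ':') === false) {
            continue;
        }
        [$field, $value] = array_map('trim', explode(':', $line, 2));
        $field = strtolower($field);

        if ($field === 'user-agent') {
            if (!$inAgentLines) {
                $applies = false; // a new group starts here
            }
            $inAgentLines = true;
            if ($value === '*' || ($value !== '' && stripos($agent, $value) !== false)) {
                $applies = true;
            }
        } elseif ($field === 'disallow') {
            $inAgentLines = false;
            // An empty Disallow value means "nothing is disallowed".
            if ($applies && $value !== '' && strpos($path, $value) === 0) {
                return true;
            }
        } else {
            $inAgentLines = false;
        }
    }
    return false;
}

// Hypothetical sample body, not the actual file from the site.
$sample = "User-agent: *\nDisallow: /\n";
var_dump(robots_disallows($sample, '/')); // bool(true)
```

The file itself could be fetched with the same curl calls as in the original script, and its contents passed in as the first argument.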

    There are probably some things you could do to bypass the blocks on this
    website, but I'm not going to tell you what they are. Create your own
    content. Don't steal it from other websites.

    --
    phil [dot] ronan @ virgin [dot] net




    • Basta

      #3
      Re: Curl gives 403 forbidden

      > There are probably some things you could do to bypass the blocks on this
      > website, but I'm not going to tell you what they are. Create your own
      > content. Don't steal it from other websites.

      Thanx for your help. So I'm stealing content from a website? I can read
      it but then I have to forget it as soon as possible otherwise I'm a
      thief. Interesting thought. I'm surprised you didn't even bother to
      ask what I needed it for.


      • Jacob Atzen

        #4
        Re: Curl gives 403 forbidden

        On 2005-08-26, Basta <baspellis@gmail.com> wrote:
        > I'm trying to retrieve information from a website using PHP and Curl.
        > This is the code I use:
        >
        > <?
        > $tturl = "http://teletekst.nos.nl/";
        > echo "opening $tturl ...\n";
        > $ch = curl_init();
        > if (! $ch) die("Cannot allocate a new PHP-CURL handle\n");
        > $fp = fopen("ttread.htm", "w");
        > curl_setopt($ch, CURLOPT_FILE, $fp);
        > curl_setopt($ch, CURLOPT_URL, $tturl);
        > curl_exec($ch);
        > curl_close($ch);
        > fclose($fp);
        > echo "finished\n";
        > ?>
        >
        > This results in a 403 forbidden page. However, if I type the URL
        > http://teletekst.nos.nl/ in my browser then it works fine (also with
        > cookies disabled). If I change $tturl in the script to
        > http://www.nos.nl/ it works. What is the difference between typing
        > it in my browser and accessing it with curl? Is there a workaround for
        > this?

        Perhaps it checks on user-agent?

        --
        Cheers,
        - Jacob Atzen


        • Basta

          #5
          Re: Curl gives 403 forbidden

          > Perhaps it checks on user-agent?

          Setting the CURLOPT_USERAGENT to "Mozilla/5.0 (Windows; U; Windows NT
          5.1; en-US; rv:1.7.5) Gecko/20041107 Firefox/1.0" doesn't help.
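
A minimal sketch of how that option can be set, and how the status code can be read back so the 403 shows up in the script's output rather than only in the saved file. The helper names ua_options() and fetch_status() are hypothetical and not part of the original script; only the curl calls and constants are standard.

```php
<?php
// Hypothetical helper: build the cURL options for a request
// that sends a custom User-Agent header.
function ua_options(string $url, string $userAgent): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_USERAGENT      => $userAgent,
        CURLOPT_RETURNTRANSFER => true, // return the body instead of printing it
    ];
}

// Hypothetical helper: perform the request and return the HTTP status code
// (0 if no response was received at all).
function fetch_status(string $url, string $userAgent): int
{
    $ch = curl_init();
    if (!$ch) {
        throw new RuntimeException('Cannot allocate a new PHP-CURL handle');
    }
    curl_setopt_array($ch, ua_options($url, $userAgent));
    curl_exec($ch);
    return (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
}
```

Calling fetch_status("http://teletekst.nos.nl/", ...) with the Firefox string above and again with no browser-like string would show whether the user agent changes the response at all.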


          • R. Rajesh Jeba Anbiah

            #6
            Re: Curl gives 403 forbidden

            Basta wrote:
            > > Perhaps it checks on user-agent?
            >
            > Setting the CURLOPT_USERAGENT to "Mozilla/5.0 (Windows; U; Windows NT
            > 5.1; en-US; rv:1.7.5) Gecko/20041107 Firefox/1.0" doesn't help.

            Or it may be a referrer or cookie issue. Better to use verbose mode
            and post the log file here.

            Sample for verbose mode and logging:
            $fp_err = fopen('verbose_file.txt', 'ab+');
            fwrite($fp_err, date('Y-m-d H:i:s')."\n\n"); // add timestamp to the verbose log
            curl_setopt($ch, CURLOPT_VERBOSE, 1);
            curl_setopt($ch, CURLOPT_FAILONERROR, true);
            curl_setopt($ch, CURLOPT_STDERR, $fp_err);

            Also, check
            <http://curl.haxx.se/libcurl/php/examples/?ex=cookiejar.php> for cookie
            handling.
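
To make the referrer and cookie suggestions concrete, here is a small sketch of the options involved. The helper names session_options() and fetch_with_session() are hypothetical, as are the referrer value and cookie file path passed to them; whether the site actually checks either of these is not known from this thread.

```php
<?php
// Hypothetical helper: options for a request that sends a Referer header
// and keeps cookies in a file between requests.
function session_options(string $url, string $referer, string $cookieFile): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_REFERER        => $referer,    // sent as the "Referer:" header
        CURLOPT_COOKIEFILE     => $cookieFile, // cookies are read from this file
        CURLOPT_COOKIEJAR      => $cookieFile, // cookies are saved here when the handle is released
        CURLOPT_FOLLOWLOCATION => true,        // follow redirects
        CURLOPT_RETURNTRANSFER => true,        // return the body instead of printing it
    ];
}

// Hypothetical helper: fetch a page with the options above.
// Returns the response body, or false on failure.
function fetch_with_session(string $url, string $referer, string $cookieFile)
{
    $ch = curl_init();
    if (!$ch) {
        throw new RuntimeException('Cannot allocate a new PHP-CURL handle');
    }
    curl_setopt_array($ch, session_options($url, $referer, $cookieFile));
    return curl_exec($ch);
}
```

These options can be combined with the verbose logging lines above, so the log shows exactly which headers and cookies were sent.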

            --
            <?php echo 'Just another PHP saint'; ?>
            Email: rrjanbiah-at-Y!com Blog: http://rajeshanbiah.blogspot.com/
