Trying to extract a string from HTTP::Request object

  • nkg1234567
    New Member
    • Mar 2007
    • 1


    I'm trying to fetch the HTML of a web page as a string and then extract particular parts of that string with the substr function.
    Here is the sample code I have so far:

    Code:
    use HTTP::Request::Common;
    use LWP::UserAgent;
    use LWP::Simple;
    
    $ua = LWP::UserAgent->new;
    
    $request = HTTP::Request->new(GET => 'http://www.cnn.com');
    $response = $ua->request($request);
    $content = $response->content();
    
    my $result2 = substr $content, index($content, 'Headlines');
    So, the variable $content seems to be an HTML object or something that is NOT a string. How can I convert $content to a string, so that I can use the substr function?

    I have tried other methods including simpler code:

    Code:
    my $content = get('http://securities.stanford.edu//1014/TCHC00');
    However, I am not able to process $content as a string.

    I have even tried writing the content to a text file, but I am not able to extract a string from the text file either.

    Any help is appreciated!
    Last edited by miller; Mar 27 '07, 09:27 PM. Reason: Code tag
  • rickumali
    New Member
    • Dec 2006
    • 19

    #2
    I ran your program through the Perl debugger and confirmed that $response->content() definitely contains HTML. When I put the output into an editor, I found that the content does NOT contain "Headlines". Try another keyword, like "Weather".
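A missing keyword matters because index returns -1 when the substring is not found, and substr treats a negative offset as counting from the end of the string, so the original line would quietly return just the last character of the page. Here is a minimal sketch of guarding against that. The helper name text_from and the sample $content string are made up for illustration; they are not part of the original script.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original script): return everything in
# $text from the first occurrence of $needle onward, or undef if $needle
# does not occur at all.
sub text_from {
    my ($text, $needle) = @_;
    my $pos = index($text, $needle);   # -1 means "not found"
    return if $pos < 0;                # undef in scalar context
    return substr($text, $pos);
}

# Stand-in for the fetched page, so the sketch runs without network access.
my $content = '<html><body>News, Weather, Entertainment</body></html>';

for my $keyword ('Headlines', 'Weather') {
    my $result = text_from($content, $keyword);
    if (defined $result) {
        print "$keyword: found, ", length($result), " characters from the match onward\n";
    }
    else {
        print "$keyword: not found in the page\n";
    }
}
```

In the real script you would pass the string returned by $response->content() instead of the sample text.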

    If you want to examine variables without the debugger, use this code (provided your Perl has the Dumpvalue module):
    Code:
    use HTTP::Request::Common;
    use LWP::UserAgent;
    use LWP::Simple;
    use Dumpvalue;
    
    $dumper = Dumpvalue->new;
    
    $ua = LWP::UserAgent->new;
    
    $request = HTTP::Request->new(GET => 'http://www.cnn.com');
    $response = $ua->request($request);
    $content = $response->content();
    
    $dumper->dumpValue(\$response);
    
    my $result2 = substr $content, index($content, 'Headlines');
    Then when you run it, save the output to a text file. On my Windows box, with ActiveState Perl, I used this:
    Code:
    C:\cygwin\home\Rick\perl>perl getreq.pl > output.txt
    In the Perl debugger, this is what I saw when I used 'Weather':
    Code:
    main::(getreq.pl:15):   my $result2 = substr $content, index($content, 'Weather');
      DB<1>
    main::(getreq.pl:17):   print $result2;
      DB<1> print length($result2)
    105335
      DB<2> print substr $result2, 0, 20
    Weather, Entertainme
    You're on the right track. Prove what each line does, and learn the Perl debugger to get interactive.
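If you would rather prove it from a script than from the debugger, a rough sketch like the following shows that content() hands back an ordinary string. To keep it runnable offline, it builds an HTTP::Response by hand with a made-up body instead of fetching a page; the body text is an assumption for illustration only.

```perl
use strict;
use warnings;
use HTTP::Response;

# Build a response by hand (hypothetical body) instead of fetching a page.
my $response = HTTP::Response->new(
    200, 'OK',
    [ 'Content-Type' => 'text/html' ],
    '<html><body>News, Weather, Entertainment</body></html>',
);

my $content = $response->content;

# ref() returns an empty string for a plain scalar and the class name
# for an object, so it tells the two apart.
print "response is a ", ref($response), " object\n";
print "content is ", (ref($content) eq '' ? "a plain string" : "a reference"), "\n";

# Because $content is a plain string, index and substr apply to it directly.
my $pos = index($content, 'Weather');
print substr($content, $pos, 7), "\n" if $pos >= 0;
```

The same ref() check can be dropped into the original script right after the call to $response->content().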
