Hi,
I have a few web logs; an example is below:
65.96.112.177 - - [27/Jan/2007:00:02:10 -0500] "GET /~jking/images/ralph.jpg HTTP/1.1" 200 66196
65.214.44.44 - - [27/Jan/2007:00:02:27 -0500] "GET /~jkabara/Research.htm HTTP/1.0" 200 4696
207.46.98.52 - - [27/Jan/2007:00:02:29 -0500] "GET /~mweiss/new/background.htm HTTP/1.0" 200 3905
207.46.98.52 - - [27/Jan/2007:00:02:35 -0500] "GET /%7Epaws/project_pacer.htm HTTP/1.0" 200 15645
207.46.98.52 - - [27/Jan/2007:00:02:36 -0500] "GET /%7Espring/patterns/node15.html#SECTION00042300000000000000 HTTP/1.1" 200 2641
65.214.44.44 - - [27/Jan/2007:00:02:42 -0500] "GET /~jkling/2110/week04.ppt HTTP/1.0" 200 1914280
Basically, I want to find out which page each visitor started browsing on and which page they ended on.
The output should be something like:
Visitor 65.96.112.177 started browsing at page /~jking/images/ralph.jpg
Visitor 65.96.112.177 ended browsing at page /~jking/images/ralph.jpg
Visitor 65.214.44.44 started browsing at page /~jkabara/Research.htm
Visitor 65.214.44.44 ended browsing at page /~jkling/2110/week04.ppt
.
.
.
.
I wrote the code below on the assumption that once a visitor's IP has failed to match 20 log lines, the user has ended their session. But it isn't working. Could someone please help me with this? Any other solutions or ideas are welcome.
Code:
open (INF, $logfile);
read INF, $file, -s INF;
close INF;
@lines = split /\n/, $file;
foreach (@lines) {
    @values = split / /, $_;
    $visitors{$values[0]}++;
}
$maxcount=20;
$timecounter=250;
foreach $visitor (keys %visitors)
{
    $flag = 0 ;
    foreach $logentry (@lines)
    {
        @values = split / /, $logentry;
        if($visitor eq $values[0])
        {
            $currentpage = $values[6];
            if($flag==0){
                print "\n Visitor " . $visitor . " started checking out the page";
                print ": " .$values[6] . " ";
                $startpage = $values[6];
                $flag=1;
                $i=0;
                $endpage = $values[6];
            }
            else{
                $endpage = $values[6];
            }
        }#if
        else{
            $i++;
        }
        if($i > $maxcount) {print "\nLast Page Visited by " . $visitor . " was " .$endpage . " ";last;}
        print "\ni=" .$i;
    }#for each line
}#for each visitor
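One other idea, as a rough sketch only: instead of rescanning the whole log for every visitor and counting mismatches, a single pass could remember the first and the last page seen for each IP. The helper name first_last_pages is made up for this illustration, and the sketch assumes every line follows the Common Log Format shown above and that all hits from one IP count as a single session (it does not split sessions by idle time).

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the code above): takes log lines and
# returns the visitors in order of first appearance, plus each visitor's
# first and last requested page.
sub first_last_pages {
    my @lines = @_;
    my (@order, %first, %last);
    for my $line (@lines) {
        # Assumed layout: IP ident user [timestamp] "METHOD path PROTOCOL" status bytes
        my ($ip, $page) = $line =~ /^(\S+) \S+ \S+ \[[^\]]*\] "\S+ (\S+)/
            or next;    # skip lines that do not look like a request
        unless (exists $first{$ip}) {
            $first{$ip} = $page;    # first time this IP is seen
            push @order, $ip;
        }
        $last{$ip} = $page;    # overwritten on every hit, so it ends up as the last page
    }
    return (\@order, \%first, \%last);
}

# Two lines taken from the sample log above.
my @sample = (
    '65.214.44.44 - - [27/Jan/2007:00:02:27 -0500] "GET /~jkabara/Research.htm HTTP/1.0" 200 4696',
    '65.214.44.44 - - [27/Jan/2007:00:02:42 -0500] "GET /~jkling/2110/week04.ppt HTTP/1.0" 200 1914280',
);

my ($order, $first, $last) = first_last_pages(@sample);
for my $ip (@$order) {
    print "Visitor $ip started browsing at page $first->{$ip}\n";
    print "Visitor $ip ended browsing at page $last->{$ip}\n";
}
```

For a real log file, the lines could be read from the file handle and passed to the same helper. Splitting one IP's hits into separate sessions would additionally need the timestamps parsed and compared against some idle limit, which this sketch leaves out.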