I have a .txt log file, and most of it is junk. But some lines show when a user logs in and at what time they logged in. Below is a portion of the log file. For example, "mwoelk" is one user logging in and "dcurtin" is another. So far I have written a Python app that counts how many times each user logged in, but I'm a little clueless about how to pull out when each user logged in. Any help on what I could do would be much appreciated.
172.16.9.206 - mwoelk [01/Feb/2008:04:32:12 -0500] "GET /controller?method=getUser HTTP/1.0" 200 305
172.16.9.166 - - [01/Feb/2008:04:57:38 -0500] "HEAD /images/DCI.gif HTTP/1.1" 200 -
172.16.9.166 - - [01/Feb/2008:04:57:38 -0500] "HEAD /eagent.jnlp HTTP/1.1" 200 -
172.16.9.166 - - [01/Feb/2008:04:57:38 -0500] "HEAD /jh.jnlp HTTP/1.1" 200 -
172.16.9.166 - - [01/Feb/2008:04:57:38 -0500] "HEAD /smack.jar HTTP/1.1" 200 -
172.16.9.166 - - [01/Feb/2008:04:57:38 -0500] "HEAD /jh.jar HTTP/1.1" 200 -
172.16.9.166 - - [01/Feb/2008:04:57:39 -0500] "HEAD /images/DCI.gif HTTP/1.1" 200 -
172.16.9.166 - noone [01/Feb/2008:04:57:40 -0500] "GET /controller?method=getNode&name=S14000068 HTTP/1.0" 200 499
172.16.9.166 - - [01/Feb/2008:04:57:40 -0500] "GET /help/helpset.hs HTTP/1.1" 200 547
172.16.9.166 - - [01/Feb/2008:04:57:43 -0500] "GET /help/map.jhm HTTP/1.1" 200 59650
172.16.9.162 - dcurtin [01/Feb/2008:00:19:16 -0500] "GET /controller?method=getUser HTTP/1.0" 200 307
Here is what I have done so far to count the frequency of a user logging in.
Code:
# Read the whole log and split it into lowercase, whitespace-separated words.
with open("localhost_access_log.2008-02-01.txt", "r") as log_file:
    text = log_file.read()
word_list = text.lower().split()

# Count how often each word occurs.
word_freq = {}
for word in word_list:
    word_freq[word] = word_freq.get(word, 0) + 1

# Print the words alphabetically with their counts.
for word in sorted(word_freq):
    print("%-10s %d" % (word, word_freq[word]))
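One possible way to pull out the login times, offered only as a rough sketch: match the start of each line with a regular expression, take the third field as the user name, and parse the bracketed timestamp. The names `LINE_RE`, `login_times`, and `sample` are made up for illustration. It assumes that any line whose user field is not "-" counts as a login, that the month abbreviation is English (so `%b` parses it under the default locale), and Python 3.

```python
import re
from collections import defaultdict
from datetime import datetime

# Start of an access-log line: IP, ident, user, [timestamp].
# (Hypothetical pattern, based only on the sample lines in this post.)
LINE_RE = re.compile(r'^(\S+) (\S+) (\S+) \[([^\]]+)\]')


def login_times(lines):
    """Map each user name to the list of timestamps found for that user."""
    logins = defaultdict(list)
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # line does not look like a log entry
        user, stamp = match.group(3), match.group(4)
        if user == "-":
            continue  # assumption: "-" means no user name was recorded
        # Timestamp format as seen in the sample: 01/Feb/2008:04:32:12 -0500
        logins[user].append(datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z"))
    return logins


# Three lines taken from the log excerpt in this post.
sample = [
    '172.16.9.206 - mwoelk [01/Feb/2008:04:32:12 -0500] "GET /controller?method=getUser HTTP/1.0" 200 305',
    '172.16.9.166 - - [01/Feb/2008:04:57:38 -0500] "HEAD /images/DCI.gif HTTP/1.1" 200 -',
    '172.16.9.162 - dcurtin [01/Feb/2008:00:19:16 -0500] "GET /controller?method=getUser HTTP/1.0" 200 307',
]

for user, times in sorted(login_times(sample).items()):
    for when in times:
        print(user, when.isoformat())
```

To run this against the real file, the list could be swapped for the open file object, since `login_times` only needs something it can loop over line by line. If only the `getUser` requests should count as logins, an extra check on the request part of the line would be needed.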