Hello!
I have the following assignment:
Open the file mbox-short.txt and read it line by line. When you find a line that starts with 'From ' like the following line:
From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008
You will parse the From line using split() and print out the second word in the line (i.e. the entire address of the person who sent the message). Then print out a count at the end.
Hint: make sure not to include the lines that start with 'From:'. Also look at the last line of the sample output to see how to print the count.
You can download the sample data at http://www.py4e.com/code3/mbox-short.txt
The output should be:
stephen.marquard@uct.ac.za
louis@media.berkeley.edu
zqian@umich.edu
rjlowe@iupui.edu
zqian@umich.edu
rjlowe@iupui.edu
cwen@iupui.edu
cwen@iupui.edu
gsilver@umich.edu
gsilver@umich.edu
zqian@umich.edu
gsilver@umich.edu
wagnermr@iupui.edu
zqian@umich.edu
antranig@caret.cam.ac.uk
gopal.ramasammycook@gmail.com
david.horwitz@uct.ac.za
david.horwitz@uct.ac.za
david.horwitz@uct.ac.za
david.horwitz@uct.ac.za
stephen.marquard@uct.ac.za
louis@media.berkeley.edu
louis@media.berkeley.edu
ray@media.berkeley.edu
cwen@iupui.edu
cwen@iupui.edu
cwen@iupui.edu
There were 27 lines in the file with From as the first word
I tried it with the code below:
```
fname = input("Enter file name: ")
if len(fname) < 1:
    fname = "mbox-short.txt"
fh = open(fname)
count = 0
for line in fh:
    if line.startswith('From:'):
        pass
    elif line.startswith('From'):
        x = line.split('From') and line.rstrip(' SatFriThuJn0123456789:')
        print(x)
        count = count + 1
print("There were", count, "lines in the file with From as the first word")
```
The output I'm getting is the following:
From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008 ← Mismatch
From louis@media.berkeley.edu Fri Jan 4 18:10:48 2008
From zqian@umich.edu Fri Jan 4 16:10:39 2008
From rjlowe@iupui.edu Fri Jan 4 15:46:24 2008
From zqian@umich.edu Fri Jan 4 15:03:18 2008
From rjlowe@iupui.edu Fri Jan 4 14:50:18 2008
From cwen@iupui.edu Fri Jan 4 11:37:30 2008
From cwen@iupui.edu Fri Jan 4 11:35:08 2008
From gsilver@umich.edu Fri Jan 4 11:12:37 2008
From gsilver@umich.edu Fri Jan 4 11:11:52 2008
From zqian@umich.edu Fri Jan 4 11:11:03 2008
From gsilver@umich.edu Fri Jan 4 11:10:22 2008
From wagnermr@iupui.edu Fri Jan 4 10:38:42 2008
From zqian@umich.edu Fri Jan 4 10:17:43 2008
From antranig@caret.cam.ac.uk Fri Jan 4 10:04:14 2008
From gopal.ramasammycook@gmail.com Fri Jan 4 09:05:31 2008
From david.horwitz@uct.ac.za Fri Jan 4 07:02:32 2008
From david.horwitz@uct.ac.za Fri Jan 4 06:08:27 2008
From david.horwitz@uct.ac.za Fri Jan 4 04:49:08 2008
From david.horwitz@uct.ac.za Fri Jan 4 04:33:44 2008
From stephen.marquard@uct.ac.za Fri Jan 4 04:07:34 2008
From louis@media.berkeley.edu Thu Jan 3 19:51:21 2008
From louis@media.berkeley.edu Thu Jan 3 17:18:23 2008
From ray@media.berkeley.edu Thu Jan 3 17:07:00 2008
From cwen@iupui.edu Thu Jan 3 16:34:40 2008
From cwen@iupui.edu Thu Jan 3 16:29:07 2008
From cwen@iupui.edu Thu Jan 3 16:23:48 2008
There were 27 lines in the file with From as the first word
As you can see, the last line (the count) is correct and the email addresses are in the right order, but I'm not able to remove "From" at the beginning of the lines or the dates at the end of the lines. Another thing is that lines are being skipped in the output, and they shouldn't be.
Could someone help me with that task, please?
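For reference, here is a likely explanation and a minimal sketch of one way to approach it. In the attempt above, `a and b` evaluates to `b` whenever `a` is truthy, so the result of `split('From')` is discarded and `x` is only the result of `rstrip(...)`. That `rstrip` call probably removes nothing, because each line read from the file ends with a newline character, which is not in the set of characters passed to it. The leftover newline, plus the one `print` adds, would also explain extra blank lines. The sketch below follows the assignment's hint instead: it tests for `'From '` (with the trailing space, so `'From:'` lines are excluded) and takes the second item returned by `split()`. The names `extract_senders`, `print_report`, and `sample` are hypothetical, introduced only for illustration, and the second and third sample lines are made up.

```python
def extract_senders(lines):
    """Return the second word of every line that starts with 'From '."""
    senders = []
    for line in lines:
        # 'From ' (with the space) matches the envelope lines but not 'From:' headers.
        if not line.startswith("From "):
            continue
        # split() with no argument splits on any whitespace, so the
        # trailing newline never ends up in the result.
        words = line.split()
        if len(words) < 2:
            continue  # guard against a malformed 'From ' line with no address
        senders.append(words[1])
    return senders


def print_report(lines):
    """Print each sender, then the count line in the format the assignment shows."""
    senders = extract_senders(lines)
    for sender in senders:
        print(sender)
    print("There were", len(senders), "lines in the file with From as the first word")
    return senders


# Illustrative input only: the first line is from the assignment text,
# the second and third are invented to show which lines get skipped.
sample = [
    "From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008\n",
    "From: stephen.marquard@uct.ac.za\n",
    "X-Sample: not a From line\n",
    "From louis@media.berkeley.edu Fri Jan 4 18:10:48 2008\n",
]
print_report(sample)
```

To run it on the real data, pass an open file handle instead of the list, since a file handle yields lines in the same way: for example `print_report(open("mbox-short.txt"))`.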