Problem with split() and rstrip()

  • guiromero
    New Member
    • Nov 2021
    • 4


    Hello!

    I have the following assignment:

    Open the file mbox-short.txt and read it line by line. When you find a line that starts with 'From ' like the following line:

    From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008

    You will parse the From line using split() and print out the second word in the line (i.e. the entire address of the person who sent the message). Then print out a count at the end.
    Hint: make sure not to include the lines that start with 'From:'. Also look at the last line of the sample output to see how to print the count.

    You can download the sample data at http://www.py4e.com/code3/mbox-short.txt

    The output should be:

    stephen.marquard@uct.ac.za
    louis@media.berkeley.edu
    zqian@umich.edu
    rjlowe@iupui.edu
    zqian@umich.edu
    rjlowe@iupui.edu
    cwen@iupui.edu
    cwen@iupui.edu
    gsilver@umich.edu
    gsilver@umich.edu
    zqian@umich.edu
    gsilver@umich.edu
    wagnermr@iupui.edu
    zqian@umich.edu
    antranig@caret.cam.ac.uk
    gopal.ramasammycook@gmail.com
    david.horwitz@uct.ac.za
    david.horwitz@uct.ac.za
    david.horwitz@uct.ac.za
    david.horwitz@uct.ac.za
    stephen.marquard@uct.ac.za
    louis@media.berkeley.edu
    louis@media.berkeley.edu
    ray@media.berkeley.edu
    cwen@iupui.edu
    cwen@iupui.edu
    cwen@iupui.edu
    There were 27 lines in the file with From as the first word



    I tried it with the code below:

    ```
    fname = input("Enter file name: ")
    if len(fname) < 1:
        fname = "mbox-short.txt"

    fh = open(fname)
    count = 0
    for line in fh:
        if line.startswith('From:'):
            pass
        elif line.startswith('From'):
            x = line.split('From') and line.rstrip(' SatFriThuJn0123456789:')
            print(x)

            count = count + 1

    print("There were", count, "lines in the file with From as the first word")
    ```

    The output I'm getting is the following:


    From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008 ← Mismatch

    From louis@media.berkeley.edu Fri Jan 4 18:10:48 2008

    From zqian@umich.edu Fri Jan 4 16:10:39 2008

    From rjlowe@iupui.edu Fri Jan 4 15:46:24 2008

    From zqian@umich.edu Fri Jan 4 15:03:18 2008

    From rjlowe@iupui.edu Fri Jan 4 14:50:18 2008

    From cwen@iupui.edu Fri Jan 4 11:37:30 2008

    From cwen@iupui.edu Fri Jan 4 11:35:08 2008

    From gsilver@umich.edu Fri Jan 4 11:12:37 2008

    From gsilver@umich.edu Fri Jan 4 11:11:52 2008

    From zqian@umich.edu Fri Jan 4 11:11:03 2008

    From gsilver@umich.edu Fri Jan 4 11:10:22 2008

    From wagnermr@iupui.edu Fri Jan 4 10:38:42 2008

    From zqian@umich.edu Fri Jan 4 10:17:43 2008

    From antranig@caret.cam.ac.uk Fri Jan 4 10:04:14 2008

    From gopal.ramasammycook@gmail.com Fri Jan 4 09:05:31 2008

    From david.horwitz@uct.ac.za Fri Jan 4 07:02:32 2008

    From david.horwitz@uct.ac.za Fri Jan 4 06:08:27 2008

    From david.horwitz@uct.ac.za Fri Jan 4 04:49:08 2008

    From david.horwitz@uct.ac.za Fri Jan 4 04:33:44 2008

    From stephen.marquard@uct.ac.za Fri Jan 4 04:07:34 2008

    From louis@media.berkeley.edu Thu Jan 3 19:51:21 2008

    From louis@media.berkeley.edu Thu Jan 3 17:18:23 2008

    From ray@media.berkeley.edu Thu Jan 3 17:07:00 2008

    From cwen@iupui.edu Thu Jan 3 16:34:40 2008

    From cwen@iupui.edu Thu Jan 3 16:29:07 2008

    From cwen@iupui.edu Thu Jan 3 16:23:48 2008

    There were 27 lines in the file with From as the first word


    As you can see, the last line (the count) is correct and the email addresses are in the right order, but I can't remove "From" from the beginning of the lines or the dates from the end. Also, a blank line appears after each line of output, and it shouldn't.


    Could someone help me with that task, please?
  • dev7060
    Recognized Expert Contributor
    • Mar 2017
    • 655

    #2
    From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008

    You will parse the From line using split() and print out the second word in the line (i.e. the entire address of the person who sent the message).
    Code:
    x = line.split(' ')[1]
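To see what that indexing returns, here is a small sketch run on the sample line from the assignment. The double space before the day number is an assumption made for illustration (it shows where `split(' ')` and `split()` differ), and the variable names are hypothetical.

```python
# Sample line from the assignment; the two spaces before "5" are assumed for illustration
line = "From stephen.marquard@uct.ac.za Sat Jan  5 09:14:16 2008\n"

# Splitting on a single space: index 1 is still the address,
# but consecutive spaces produce an empty string and the newline stays on the last item
by_space = line.split(' ')
second_word = by_space[1]
print(second_word)

# split() with no argument splits on any run of whitespace,
# so there are no empty strings and no trailing newline
by_whitespace = line.split()
print(by_whitespace[1])
```

Either form gives the address at index 1 for this line; `split()` with no argument is simply more forgiving about spacing.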

    • pritikumari
      Banned
      New Member
      • Jan 2023
      • 23

      #3
      The rstrip() method removes trailing characters from the end of a string; if no argument is given, it removes trailing whitespace.

      I have looked at a lot of other people's questions/answers about split and strip, and I think that maybe some of my problem is that my initial code splits the text by line, thus giving me lists to work with, when in actuality, each of my lines is still a string that I need to break up and NOT a list, but because of python syntax I have to work with the string as if it were a list. Please, any advice you can give me that would help me understand what my problem is and how to fix it would be so appreciated.
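A short sketch of how rstrip() behaves on the sample line may help explain the output in the first post. It assumes the line read from the file ends with a newline, which is the usual case when iterating over a file; the variable names are hypothetical.

```python
line = "From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008\n"
chars = ' SatFriThuJn0123456789:'

# The argument is a set of characters, not a suffix. The line ends with "\n",
# which is not in the set, so nothing is stripped at all.
unchanged = line.rstrip(chars)
print(unchanged == line)

# "A and B" evaluates to B when A is truthy, so the split() result is thrown away
# and x is just the (unchanged) string.
x = line.split('From') and line.rstrip(chars)
print(type(x).__name__)

# Even with the newline removed first, the character set strips too much:
# the final "a" of the address is also in the set.
over_stripped = line.rstrip().rstrip(chars)
print(over_stripped)

# With no argument, rstrip() removes trailing whitespace, including the newline
# that produces the blank line after each print().
cleaned = line.rstrip()
print(cleaned)
```

This is why split() and indexing is the more reliable way to pick out the address here.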


      • Arushi
        New Member
        • Oct 2022
        • 7

        #4
        The correct code to achieve the desired output should be:

        Code:
        fname = input("Enter file name: ")
        if len(fname) < 1:
            fname = "mbox-short.txt"

        fh = open(fname)
        count = 0
        for line in fh:
            if line.startswith('From:'):
                # Skip the 'From:' header lines
                pass
            elif line.startswith('From'):
                # split() with no argument splits on whitespace; the address is the second word
                words = line.split()
                email = words[1]
                print(email)

                count = count + 1

        print("There were", count, "lines in the file with From as the first word")
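The same logic can also be tried without the data file. The sketch below is one possible variation, not the assignment's required form: `extract_senders` is a hypothetical helper name, and the sample list mixes two lines from the thread with made-up filler lines. It tests for 'From ' with a trailing space, which by itself excludes the 'From:' lines.

```python
def extract_senders(lines):
    """Return the second word of every line that starts with 'From ' (note the space)."""
    senders = []
    for line in lines:
        # 'From ' with a trailing space does not match 'From:' header lines
        if line.startswith('From '):
            words = line.split()
            # Guard against a malformed line that has no second word
            if len(words) > 1:
                senders.append(words[1])
    return senders


# Two 'From ' lines from the thread plus hypothetical filler lines
sample = [
    "From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008\n",
    "Subject: example\n",
    "From: stephen.marquard@uct.ac.za\n",
    "From louis@media.berkeley.edu Fri Jan 4 18:10:48 2008\n",
]

senders = extract_senders(sample)
for address in senders:
    print(address)
print("There were", len(senders), "lines in the file with From as the first word")
```

With a real file, the open file handle can be passed in place of the list, since iterating over it yields lines in the same way.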
