file to dictionary

  • v13tn1g
    New Member
    • Feb 2009
    • 31

    file to dictionary

    I am given an open file in which every line has the form "key separator value". There are many such lines, not just one. The objective is to convert the open file into a dictionary of the form {key: value}, where the separator marks the point at which each line is split into the key and the value.

    for example:

    File:
    abc def ghj
    jkl def jkk
    uio def asd

    The output should be: {"abc": "ghj", "jkl": "jkk", "uio": "asd"}

    and the parameters should be:

    def file_to_dict(file, separator)

    How would I write this? Any hints?
  • bvdet
    Recognized Expert Specialist
    • Oct 2006
    • 2851

    #2
    Have you tried to code this yourself? Maybe this will give you some hints:
    Code:
    s = '''
    abc def ghj
    jkl def jkk
    uio def asd
    '''
    
    sList = [item for item in s.split('\n') if item]
    
    sep = "def"
    dd = {}
    for item in sList:
        a,b = item.split(sep)
        dd[a.strip()] = b.strip()
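
The hint above works on a string. As a rough sketch only, the same idea could be wrapped in the requested file_to_dict(file, separator) function along these lines. It is written in Python 3 syntax, it assumes that lines lacking the separator should simply be skipped (the thread does not say), and io.StringIO merely stands in for an open file:

```python
import io

def file_to_dict(file, separator):
    """Map the text before the first separator to the text after it, per line."""
    result = {}
    for line in file:
        # Assumption: blank lines and lines without the separator are skipped.
        if separator not in line:
            continue
        # maxsplit=1 splits only at the first occurrence of the separator.
        key, value = line.split(separator, 1)
        result[key.strip()] = value.strip()
    return result

# io.StringIO stands in for an open file in this demo.
sample = io.StringIO("abc def ghj\njkl def jkk\nuio def asd\n")
print(file_to_dict(sample, "def"))  # {'abc': 'ghj', 'jkl': 'jkk', 'uio': 'asd'}
```

Iterating over the file object yields one line at a time, so the whole file never has to be joined into a single string first.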

    • v13tn1g
      New Member
      • Feb 2009
      • 31

      #3
      Yep, I've tried to code it myself. What I have so far reads the file into a string and then splits it using the separator. Now I'm stuck on making the word before the separator the key and the word after the separator the value.

      • v13tn1g
        New Member
        • Feb 2009
        • 31

        #4
        Code:
        import urllib
        
        def file_to_dict(a, s):
            '''Return a dictionary that contains the contents of each line in a given 
            open file as a key-value pair. The file contains lines of the form key 
            separator value, where separator is the second parameter of the function. 
            There may be more than one occurrence of the separator in the line; 
            the key is before the first one - all others are part of the value string.
            Both key and value should be without leading or trailing whitespace. '''
            
            f = urllib.urlopen(a)
            for c in f:
                g = c.rstrip()
                    
            sList = [item for item in g.split('\n') if item]
        
            sep = s
            dd = {}
            for item in sList:
                a,b = item.split(s)
                dd[a.strip()] = b.strip()
            print dd

        I tested it with:

        Code:
        file_to_dict("http://www.utsc.utoronto.ca/~szamosi/a20/lectures/w2/price_1.py", "price")
        but I get the error:

        Code:
        File "y:\<string>", line 1, in <module>
          File "y:\<string>", line 34, in file_to_dict
        ValueError: too many values to unpack
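
A note on this error: `a,b = item.split(s)` raises "too many values to unpack" whenever the separator occurs more than once in the line, because split then returns three or more pieces. Separately, the loop `for c in f: g = c.rstrip()` rebinds g on every pass, so only the last line of the file reaches the split. The sketch below (Python 3 syntax) shows the unpacking failure and one possible way around it with str.partition, which cuts only at the first occurrence. The helper name split_first and the sample line are made up for illustration:

```python
def split_first(line, separator):
    """Return (key, value) cut at the first separator, or None if it is absent."""
    key, found, value = line.partition(separator)
    if not found:
        # partition returns an empty middle element when the separator is missing.
        return None
    return key.strip(), value.strip()

line = "apple price 3 price 4"  # made-up line containing the separator twice

try:
    a, b = line.split("price")  # three pieces cannot be unpacked into two names
except ValueError as err:
    print("split failed:", err)

print(split_first(line, "price"))                 # ('apple', '3 price 4')
print(split_first("no separator here", "price"))  # None
```

This matches the docstring's rule that the key is whatever comes before the first separator and everything after it belongs to the value.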

        • v13tn1g
          New Member
          • Feb 2009
          • 31

          #5
          I think I know why I got that error: it's because I tested a file such as

          Code:
          s = '''
          dfs ghj hge
          abc def ghj
          jkl def jkk
          uio def asd
          '''
          The first line did not have the separator, hence the error.

          Another thing: if the test file were to be

          Code:
          s = '''
          dfs ghj hge
          abc def ghj
          jkl def jkk
          uio def asd
          dfs def qwe
          '''
          Since there are two occurrences of dfs, only one of them ends up in the dictionary. How should I write this program so that both "dfs" entries, key and value, are printed?

          • bvdet
            Recognized Expert Specialist
            • Oct 2006
            • 2851

            #6
            The first line has no separator, therefore the ValueError. We can trap that with a try/except block. Since we can have more than one data element with the same key, create a list for the values.
            Code:
            s = '''
            
            dfs ghj hge
            abc def ghj
            jkl def jkk
            uio def asd
            dfs def qwe
            dfs def ewq
            
            
            '''
            
            
            sList = [item for item in s.split('\n') if item.strip()]
            
            sep = "def"
            dd = {}
            for i, item in enumerate(sList):
                try:
                    # Exactly one separator gives two parts; anything else raises ValueError.
                    a, b = [part.strip() for part in item.split(sep)]
                    dd.setdefault(a, []).append(b)
                except ValueError:
                    print("Malformed data line number %d" % (i+1))
            
            for key in dd:
                for item in dd[key]:
                    print("%s: %s" % (key, item))
            Output:
            Code:
            >>> Malformed data line number 1
            jkl: jkk
            abc: ghj
            dfs: qwe
            dfs: ewq
            uio: asd
            >>>
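
For reference, the same approach could be packaged as a function that takes an open file and returns the result instead of printing it. This is only a sketch in Python 3 syntax, not the code from the post above: the name file_to_multidict is made up, collections.defaultdict replaces setdefault, and str.partition cuts at the first separator only, so a line with several separators is treated as valid rather than malformed. Reported line numbers count blank lines too, which differs from the numbering in the post.

```python
import io
from collections import defaultdict

def file_to_multidict(file, separator):
    """Return ({key: [values]}, [numbers of non-blank lines lacking the separator])."""
    values = defaultdict(list)
    malformed = []
    for number, line in enumerate(file, start=1):
        if not line.strip():
            continue  # blank lines are skipped but still counted in the numbering
        key, found, value = line.partition(separator)
        if not found:
            malformed.append(number)
            continue
        values[key.strip()].append(value.strip())
    return dict(values), malformed

# io.StringIO stands in for an open file in this demo.
sample = io.StringIO(
    "dfs ghj hge\n"
    "abc def ghj\n"
    "\n"
    "dfs def qwe\n"
    "dfs def ewq\n"
)
result, bad_lines = file_to_multidict(sample, "def")
print(result)     # {'abc': ['ghj'], 'dfs': ['qwe', 'ewq']}
print(bad_lines)  # [1]
```

Returning the malformed line numbers alongside the dictionary leaves it to the caller to decide whether to report them or ignore them.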
