Parse file into array

This topic is closed.
  • amfr

    Parse file into array

    I was wondering how I could parse the contents of a file into an array.
    The file would look something like this:

    gif:image/gif
    html:text/html
    jpg:image/jpeg
    ....

    As you can see, it contains the file extension and the MIME type
    separated by a colon, one pair per line. I was wondering if it was possible to
    create an array like this:

    (Pseudocode)
    mimetypearray[gif] = "image/gif"
    mimetypearray[html] = "text/html"
    mimetypearray[jpg] = "image/jpeg"
    ....

    I come from a PHP background where I know this is possible, but I am new
    to Python. Please disregard this if it is a stupid question.

  • Craig Marshall

    #2
    Re: Parse file into array

    > I was wondering how I could parse the contents of a file into an array.
    > The file would look something like this:
    >
    > gif:image/gif
    > html:text/html
    > jpg:image/jpeg

    Try something like this:

    d = {}
    with open("input.txt") as f:
        for line in f:
            # Split "ext:mime" on the first colon only
            ext, mime = line.strip().split(":", 1)
            d[ext] = mime
    print(d)

    Craig


    • Leif K-Brooks

      #3
      Re: Parse file into array

      amfr wrote:
      > I was wondering how I could parse the contents of a file into an array.
      > The file would look something like this:
      >
      > gif:image/gif
      > html:text/html
      > jpg:image/jpeg
      > ...
      >
      > As you can see, it contains the file extension and the MIME type
      > separated by a colon, one pair per line. I was wondering if it was possible to
      > create an array like this:
      >
      > (Pseudocode)
      > mimetypearray[gif] = "image/gif"
      > mimetypearray[html] = "text/html"
      > mimetypearray[jpg] = "image/jpeg"
      > ...

      You want a dictionary, not an array.

      mimetypedict = {}
      for line in mimetypefile:
          line = line.rsplit('\r\n')
          extension, mimetype = line.split(':')
          mimetypedict[extension] = mimetype

      Note that there's already a MIME type database in the standard mimetypes
      module: <http://python.org/doc/current/lib/module-mimetypes.html>.


      • amfr

        #4
        Re: Parse file into array

        Thanks a lot. The webserver I am writing works now :)


        • Leif K-Brooks

          #5
          Re: Parse file into array

          Leif K-Brooks wrote:
          > line = line.rsplit('\r\n')
          Er, that should be line.rstrip('\r\n'), not line.rsplit. rsplit returns a list, so the following line.split(':') would fail.
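
With that correction applied, the loop from #3 might look like the sketch below. The StringIO object is only a stand-in for the open file (an assumption for the sake of a self-contained example); in practice mimetypefile would come from open() on the real file.

```python
from io import StringIO

# Stand-in for an open file; in practice: mimetypefile = open("yourfile")
mimetypefile = StringIO("gif:image/gif\r\nhtml:text/html\njpg:image/jpeg\n")

mimetypedict = {}
for line in mimetypefile:
    # rstrip removes the trailing line ending and returns a string;
    # rsplit would return a list, which has no split method
    line = line.rstrip('\r\n')
    extension, mimetype = line.split(':')
    mimetypedict[extension] = mimetype

print(mimetypedict)
```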


          • Andrew Nelis

            #6
            Re: Parse file into array

            If it helps, there's a standard library module for figuring out MIME types:

            Source code: Lib/mimetypes.py The mimetypes module converts between a filename or URL and the MIME type associated with the filename extension. Conversions are provided from filename to MIME type a...

            >>> import mimetypes
            >>> mimetypes.guess_type('image.gif')
            ('image/gif', None)
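
For a web server, one possible way to combine the two ideas in this thread is to consult a custom table first and fall back to mimetypes.guess_type. This is only a sketch: custom_types, lookup_mime_type and the default value are hypothetical names chosen for illustration, not something from the thread or the module.

```python
import mimetypes

# Hypothetical overrides, in the same extension -> type shape as the thread's dict
custom_types = {"gif": "image/gif", "html": "text/html", "jpg": "image/jpeg"}

def lookup_mime_type(filename, default="application/octet-stream"):
    """Return a MIME type for filename, preferring the custom table."""
    # Take the text after the last dot as the extension, lowercased
    extension = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
    if extension in custom_types:
        return custom_types[extension]
    # guess_type returns a (type, encoding) pair; type is None if unknown
    guessed, _encoding = mimetypes.guess_type(filename)
    return guessed or default

print(lookup_mime_type("photo.JPG"))
print(lookup_mime_type("index.html"))
print(lookup_mime_type("README"))
```

Note that guess_type expects a filename or URL rather than a bare extension, which is why the helper passes the whole name through.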


            Cheers,

            Andy.


            • Bengt Richter

              #7
              Re: Parse file into array

              On 14 Nov 2005 13:48:45 -0800, "amfr" <amfr.org@gmail.com> wrote:

              >I was wondering how I could parse the contents of a file into an array.
              >The file would look something like this:
              >
              >gif:image/gif
              >html:text/html
              >jpg:image/jpeg
              >...
              >
              >As you can see, it contains the file extension and the MIME type
              >separated by a colon, one pair per line. I was wondering if it was possible to
              >create an array like this:
              >
              >(Pseudocode)
              >mimetypearray[gif] = "image/gif"
              >mimetypearray[html] = "text/html"
              >mimetypearray[jpg] = "image/jpeg"
              >...
              >
              >I come from a PHP background where I know this is possible, but I am new
              >to Python. Please disregard this if it is a stupid question.
              >
              Pretty much anything is possible in Python, if you can conceive it well enough ;-)

              Assuming f is from f = open(yourfile), simulated here with a StringIO file object,

              >>> from io import StringIO
              >>> f = StringIO("""\
              ... gif:image/gif
              ... html:text/html
              ... jpg:image/jpeg
              ... """)

              And assuming that there are no spaces around the ':' or in the two pieces,
              but maybe some optional whitespace at either end of a line and \n at the end
              (except maybe the last line), and no blank lines, you can get a dict mapping easily:

              >>> mimetypedict = dict(tuple(line.strip().split(':')) for line in f)

              This gives you:
              >>> mimetypedict
              {'gif': 'image/gif', 'html': 'text/html', 'jpg': 'image/jpeg'}

              Which you can access using the names as keys, e.g.,
              >>> mimetypedict['gif']
              'image/gif'
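
If the assumptions above do not hold (blank lines, spaces around the ':', or stray lines without a separator), a slightly more forgiving loop may be easier to live with than the one-liner. The following is a minimal sketch; parse_mime_lines is a hypothetical helper name, and silently skipping bad lines is a design choice you may not want.

```python
from io import StringIO

def parse_mime_lines(lines):
    """Build an extension -> MIME type dict, skipping unusable lines."""
    result = {}
    for line in lines:
        line = line.strip()
        if not line or ":" not in line:
            continue  # blank line, or no separator to split on
        # Split on the first colon only, then trim spaces around each piece
        extension, mimetype = line.split(":", 1)
        result[extension.strip()] = mimetype.strip()
    return result

f = StringIO("gif : image/gif\n\nhtml:text/html\nnot a mapping\njpg:image/jpeg")
print(parse_mime_lines(f))
```

Any iterable of lines works as the argument, so the same function accepts an open file object directly.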

              If you want to use bare names to access the info via an object, you can use the
              dict info to create a class or class instance and give it the named attributes, e.g.
              a class with the data as class variables is quick:

              >>> MTC = type('MTC',(), mimetypedict)
              >>> MTC.gif
              'image/gif'
              >>> MTC.jpg
              'image/jpeg'

              Or you could substitute the mimetypedict expression from above to make another one-liner ;-)
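
As a sketch of what that one-liner could look like, together with an instance-based variant: the use of types.SimpleNamespace is an assumption added here for illustration and is not part of the original post. Attribute access only works for extensions that are valid Python identifiers.

```python
from io import StringIO
from types import SimpleNamespace

# Stand-in for the open file, as in the post
f = StringIO("gif:image/gif\nhtml:text/html\njpg:image/jpeg\n")

# One-liner: build the class directly from the parsed lines
MTC = type('MTC', (), dict(tuple(line.strip().split(':')) for line in f))

# Instance variant: the same data as attributes on a plain object
mimetypedict = {'gif': 'image/gif', 'html': 'text/html', 'jpg': 'image/jpeg'}
mtc = SimpleNamespace(**mimetypedict)

print(MTC.gif, mtc.jpg)
```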

              Other ways of setting up your info are certainly possible, and may be more suitable,
              depending on how you intend to use the info. As mentioned, the mimetypes module
              may already have much of the data and/or functionality you want.

              Regards,
              Bengt Richter
