Reading a line from text, and separating it into variables/ Structure

  • tooswole23
    New Member
    • Feb 2012
    • 10

    Reading a line from text, and separating it into variables/ Structure

    So I was recently forced to switch from C to Python for numerical analysis work, and being new to Python/NumPy, I was wondering whether Python or NumPy has an equivalent of C's fscanf, or how I would go about reading in a line of data and storing the individual "strings" in variables.

    I figured my best bet was probably splitting the string into x pieces using split(), but I'm not entirely sure how to assign each individual piece of the string to a corresponding variable.

    I am also wondering whether there is an equivalent of a C structure in Python.

    Thank you for reading.
  • Smygis
    New Member
    • Jun 2007
    • 126

    #2
    Guessing wildly here, but I think what you are after is best accomplished with a dictionary.

    I'm not sure what you mean by "assign each individual piece of the string to a corresponding variable", though, or what exactly the data looks like, or how you want to access it later.

    Still, a dictionary is a great tool.

    Code:
    >>> dd = {}
    >>> dd["add"] = lambda x, y: x+y
    >>> dd["hello"] = "Hello world"
    >>> dd
    {'add': <function <lambda> at 0x0000000002C2CF98>, 'hello': 'Hello world'}
    >>> dd["add"](2,7)
    9
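
To tie the dictionary idea back to the original question, here is a minimal sketch of splitting one line and assigning the pieces to variables. It assumes the fields are separated by whitespace. The field names `name`, `x` and `y` are made up for illustration, since the thread does not say what the columns mean.

```python
# A shortened copy of the sample line posted later in the thread
line = "USB270.15385-29.63146 270.153847 -29.631455 2.966699e+03 -9.99"

# split() with no argument splits on any run of whitespace
parts = line.split()

# One variable per piece; numeric pieces are converted with float()
name, x, y = parts[0], float(parts[1]), float(parts[2])

# Or keep the pieces in a dictionary keyed by (hypothetical) field names.
# zip() stops at the shorter sequence, so only the first three pieces are kept.
field_names = ["name", "x", "y"]
record = dict(zip(field_names, parts))

print(name, x, y)
print(record["y"])  # still a string here; convert when needed
```

Unlike fscanf, split() does no type conversion, so each piece stays a string until it is passed through float() or int().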


    • tooswole23
      New Member
      • Feb 2012
      • 10

      #3
      Okay, so the data I am dealing with looks like a set of thousands of what is below:

      USB270.15385-29.63146 270.153847 -29.631455 2.966699e+03 -9.99 1.300391e+03 -9.99 -9.99 A-A-- 6.787463e+01 -9.99 1.555773e+02 -9.99 -9.99 10100 | ----- ------ ------ ------ | 0.373 13.554 12.928 12.670 AAA | ----- -------- - -------- - -------- - -------- - | --- ---------- - ---------- - --------- - --------- - --------- - ---------- -

      I want to take each segment of the line, for example "USB 270.15385-29.63146 270.153847", and store it in a two-dimensional array named star, at star[i][0].
      Then the second segment, -29.631455, would go in the same array at star[i][1], and so on for the first 14 segments.

      Using NumPy, I came across the function genfromtxt(), which lets me take a string of data and break it up based on delimiters, but I'm not entirely sure how to turn that into the two-dimensional array I want.


      • bvdet
        Recognized Expert Specialist
        • Oct 2006
        • 2851

        #4
        It appears the data you displayed is all on one line, which makes sense. You can build a two-dimensional list by iterating over the file object, something like:
        Code:
        fileObj = open(file_name)
        data = []
        for line in fileObj:
            s = line.split()
            data.append((s[0:2], s[2:14]))
        fileObj.close()
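
A variation on the loop above, as a sketch only: it builds one flat row per line (so `star[i][j]` works directly) and converts numeric fields to float. `io.StringIO` stands in for `open(file_name)` so the example runs without a data file. The helper `to_number` and the names `star` and `N_FIELDS` are hypothetical, and the sample rows are the posted line, shortened and repeated.

```python
import io

# Stand-in for open(file_name): the posted sample line, shortened, twice
file_obj = io.StringIO(
    "USB270.15385-29.63146 270.153847 -29.631455 2.966699e+03 -9.99\n"
    "USB270.15385-29.63146 270.153847 -29.631455 2.966699e+03 -9.99\n"
)

def to_number(token):
    """Return token as a float if it parses, otherwise leave it as a string."""
    try:
        return float(token)
    except ValueError:
        return token

N_FIELDS = 5  # the thread asks for 14; 5 matches the shortened sample

star = []
with file_obj:  # closes the file object when the block ends
    for line in file_obj:
        fields = line.split()
        if not fields:  # skip blank lines
            continue
        # append() adds a whole new row; indexing star[i] before the row
        # exists would fail, so the row is built first and appended once
        star.append([to_number(tok) for tok in fields[:N_FIELDS]])

print(star[0][0])
print(star[1][2])
```

With a real file, the `io.StringIO(...)` call would be replaced by `open(file_name)`; the rest of the loop stays the same.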


        • tooswole23
          New Member
          • Feb 2012
          • 10

          #5
          I am getting a
          TypeError: 'int' object is not subscriptable

          Let me see if I get the gist of how the for loop works...
          It's going to scan the file line by line.
          For each line, it will split it into however many elements it contains.

          I don't really understand the append line of the program. From what I understand, it inserts the first two segments of the string in the first dimension of the array, and the last 13 elements in the other part.

          I think what I want to do is the following, but I can't get around the int object error:

          Code:
          fileObj = open(file_name)
          data = []
          i=0
          for line in fileObj:
              s = line.split()
              for j in range(13):
                 data[i].append(s[j:j+1])
              i=i+1
          fileObj.close()
          The idea is that it puts all of the data from one line into a single dimension selected by the int i, which is incremented for every new line.


          • dwblas
            Recognized Expert Contributor
            • May 2008
            • 626

            #6
            If you want to insert the entire line, then it would just be data.append(s), which gives you a list of lists, i.e. two dimensions. The alternative is to append a subset:
            data.append([s[0], s[3], s[5]]) --> appends 3 fields only, 1st, 4th, and 6th
            Note the [] around the 3 fields, which creates a sub-list that can be treated as a second dimension. Unless your data set is in the gigabytes, it is just as easy to append the entire list and then use whatever part is relevant.

            You can always use a class as a replacement for a C structure, but since you know what each position holds, it is just as easy to write data[x][y] as some variable name.
            Code:
            test_data="""USB270.15385-29.63146 270.153847 -29.631455 2.966699e+03 -9.99 1.300391e+03 -9.99 -9.99 A-A-- 6.787463e+01 -9.99 1.555773e+02 -9.99 -9.99 10100 | 0.373 13.554 12.928 12.670 AAA"""
            ##fileObj = open(file_name)
            bogus_file_obj=test_data.split("|")
            data = []
            for line in bogus_file_obj:
                s = line.split()
                data.append(s)
            
            for rec in data:     ## print the results
                print("-"*30)
                for sub_rec in rec:
                    print(sub_rec)
            #
            #  print using "array" indexing
            print("\n==================================\n")
            titles=["Name", "Variance", "Third"]
            for x in range(len(data)):
                print("-"*30)
                for y in range(len(data[x])):
                    ## associate a name with the field location
                    if y < len(titles):
                        print(titles[y], end=" ")
                    print(data[x][y])
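
On the question of a C structure: besides a full class, `collections.namedtuple` from the standard library gives a lightweight record with named fields that can still be indexed by position. The sketch below is an illustration, not something posted in the thread, and the field names `name`, `x` and `y` are hypothetical.

```python
from collections import namedtuple

# Hypothetical field names; the thread does not say what the columns mean
Star = namedtuple("Star", ["name", "x", "y"])

# A shortened copy of the sample line from the thread
line = "USB270.15385-29.63146 270.153847 -29.631455 2.966699e+03"
s = line.split()

# Build one record, converting the numeric fields
star = Star(name=s[0], x=float(s[1]), y=float(s[2]))

print(star.name)        # access by field name, like a struct member
print(star.x, star[1])  # the same value by name and by position
```

A namedtuple is immutable, so a plain class or a dictionary fits better if fields need to change after the record is created.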


            • tooswole23
              New Member
              • Feb 2012
              • 10

              #7
              Thank you very much for your help!
