any other best way of reading the file

  • psbasha
    Contributor
    • Feb 2007
    • 440

    #16
    Originally posted by psbasha
    Code:
    Sample1.txt
     
    Sample.txt
    Pnt      100100123.      0.      0.
    Pnt      200200035.      0.      0.
    Pnt      3040000010.     0.      0.
    Pnt      4000000015.     0.      0.
    Pnt      5005000020.     0.      0.
    Pnt      600008000.      5.      0.
    Pnt      700000005.      5.      0.
    Pnt      8000900010.     5.      0.
    Pnt      9000900015.     5.      0.
    Code:
    Sample2.txt
    Pnt    *         3280311       0          1.36567432E+03 -3.71226532E+02
    *         2.01031464E+02       0
    Pnt	 *         3280502       0          1.25433850E+03 -1.42613068E+02
    *         1.80202667E+02       0
    Pnt	 *         3280503       0          1.27057288E+03 -1.75843582E+02
    *         1.84236084E+02       0
    Pnt    *         3280504       0          1.28286145E+03 -2.01004501E+02
    *         1.87218460E+02       0
    Code:
    Sample3.txt
    Pnt*     10260209                       1156.26599      313.992828
    *       155.018463
    Pnt*     10270106                       1097.15002      250.676315
    *       140.789337
    Pnt*     10270107                       1115.47864      271.83374
    *       144.698837
    I am getting inconsistent input data from different software packages, but I have to write generic Python code that can read any of the input formats shown in the examples above.


    • psbasha
      Contributor
      • Feb 2007
      • 440

      #17
      Could anybody help me resolve this issue of handling the generic data formats?

      Thanks in advance
      PSB


      • bvdet
        Recognized Expert Specialist
        • Oct 2006
        • 2851

        #18
        I think we have taken care of Sample1, have we not? Can you explain Sample2 and Sample3 format? Is the point data really on two separate lines? What is the significance of the asterisk? Why are there zeros mixed in with the numbers in scientific notation? Help us help you.


        • psbasha
          Contributor
          • Feb 2007
          • 440

          #19
          a) "I think we have taken care of Sample1, have we not?"

          Yes.

          b) "Can you explain Sample2 and Sample3 format?"

          This format is somewhat different from Sample1.

          The X, Y, Z coordinates are not written on a single line; they are split across two lines. Each string/number occupies a 16-character field.
          The maximum length of a line is 79 characters.

          c) "Is the point data really on two separate lines?"

          Yes.

          d) "What is the significance of the asterisk?"

          The "*" at the start of the second line may be used to mark the continuation of the fields.

          e) "Why are there zeros mixed in with the numbers in scientific notation?"
          Pnt * 3280504 0 1.28286145E+03 -2.01004501E+02
          * 1.87218460E+02 0
          Currently I don't need these zeros. Each one is also an ID, which may refer to some other number later.

          Thanks in advance,
          PSB
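
          The layout described in this post (16-character data fields, with "*" marking a continuation line) lends itself to plain string slicing. What follows is only a minimal sketch: it assumes the keyword always occupies the first 8 columns, which is not stated above, and the helper name `fixed_width_fields` and its `large` argument are made up for illustration. The record is built with `ljust` so that the column positions are explicit.

```python
def fixed_width_fields(line, large=False):
    """Split a fixed-width record into stripped fields.

    Assumption: an 8-column keyword field, followed by 8-column
    (small) or 16-column (large) data fields.
    """
    width = 16 if large else 8
    body = line.rstrip("\n")
    fields = [body[:8].strip()]
    for start in range(8, len(body), width):
        fields.append(body[start:start + width].strip())
    return fields


# A large-field record: keyword, ID, one blank field, then X and Y.
record = ("Pnt*".ljust(8) + "10260209".ljust(16) + "".ljust(16)
          + "1156.26599".ljust(16) + "313.992828".ljust(16))
continuation = "*".ljust(8) + "155.018463".ljust(16)

print(fixed_width_fields(record, large=True))
print(fixed_width_fields(continuation, large=True))
```

          Because blank fields come back as empty strings, the position of each value stays the same whether or not an optional field is filled in.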


          • bvdet
            Recognized Expert Specialist
            • Oct 2006
            • 2851

            #20
            Here's one way of adding the data in this format to your point dictionary:
            Code:
            >>> import re
            >>> patt = re.compile(r'''\d+\.\d+E\+\d+|
            ... -\d+\.\d+E\+\d+|
            ... -\d+\.\d+E-\d+|
            ... \d+\.\d+E-\d+|
            ... \d+\.\d+|
            ... -\d+\.\d+|
            ... \d+''', re.X
            ... )
            >>> patt
            <_sre.SRE_Pattern object at 0x00DE68D0>
            >>> s = 'Pnt    *         3280311       0          +1.36567432E+03 -3.71226532E+02'
            >>> re.findall(patt,s)
            ['3280311', '0', '1.36567432E+03', '-3.71226532E+02']
            >>> dd = {}
            >>> lst = re.findall(patt,s)
            >>> dd[int(lst[0])] = [float(i) for i in lst[1:] if i != '0']
            >>> dd
            {3280311: [1365.6743200000001, -371.22653200000002]}
            >>> s1 = '*       155.018463'
            >>> lst1 = re.findall(patt,s1)
            >>> dd[int(lst[0])] = dd[int(lst[0])]+[float(i) for i in lst1 if i != '0']
            >>> dd
            {3280311: [1365.6743200000001, -371.22653200000002, 155.018463]}
            >>>
            You can add an elif for the word 'pnt' in combination with '*'. Whoever designed the output for this data ought to be ..............
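
            The alternatives in the pattern above can arguably be folded into a single expression with an optional sign and an optional exponent. This is a sketch only, not the code used elsewhere in this thread: `NUMBER` and `numeric_tokens` are hypothetical names, and unlike the pattern above this one keeps a leading sign and a leading or trailing decimal point (`+1.5`, `.5`, `0.`) attached to the number.

```python
import re

# Optional sign, digits with an optional decimal point, optional exponent.
NUMBER = re.compile(r'[-+]?(?:\d+\.\d*|\.\d+|\d+)(?:[eE][-+]?\d+)?')


def numeric_tokens(line):
    """Return every number-like token in the line, as strings."""
    return NUMBER.findall(line)


first = 'Pnt    *         3280311       0          1.36567432E+03 -3.71226532E+02'
second = '*         2.01031464E+02       0'

tokens = numeric_tokens(first) + numeric_tokens(second)
print(tokens)
# ['3280311', '0', '1.36567432E+03', '-3.71226532E+02', '2.01031464E+02', '0']
```

            One caveat with any regex tokenizer: it cannot tell a blank field from a missing one, so the index of X, Y and Z in the token list differs between Sample2 (which has the extra `0` fields) and Sample3 (which leaves them blank).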


            • psbasha
              Contributor
              • Feb 2007
              • 440

              #21
              Hi BV,

              Is there any other, simpler approach available? It looks like we have to format the values in order to read them.

              Thanks
              PSB


              • bvdet
                Recognized Expert Specialist
                • Oct 2006
                • 2851

                #22
                The code I showed you works. I guess you could do splits, strips, slices, etc., but I don't think it would be simpler. After incorporating that code into the other code I showed you, you should get output like this:
                Code:
                >>> Point dictionary:
                30400000 = [10.0, 0.0, 0.0]
                40000000 = [15.0, 0.0, 0.0]
                2 = [2, 5.0, 0.0, 0.0]
                3 = [3, 10.0, 0.0, 0.0]
                4 = [4, 15.0, 0.0, 0.0]
                5 = [5, 20.0, 0.0, 0.0]
                6 = [6, 0.0, 5.0, 0.0]
                1 = [1, 0.0, 0.0, 0.0]
                8 = [8, 10.0, 5.0, 0.0]
                9 = [9, 15.0, 5.0, 0.0]
                10270106 = [1097.15002, 250.67631499999999, 140.78933699999999]
                10270107 = [1115.47864, 271.83373999999998, 144.698837]
                10010012 = [3.0, 0.0, 0.0]
                60000800 = [0.0, 5.0, 0.0]
                20020003 = [5.0, 0.0, 0.0]
                10260209 = [1156.2659900000001, 313.99282799999997, 155.018463]
                80009000 = [10.0, 5.0, 0.0]
                7 = [7, 5.0, 5.0, 0.0]
                3280311 = [1365.6743200000001, -371.22653200000002, 201.031464]
                50050000 = [20.0, 0.0, 0.0]
                90009000 = [15.0, 5.0, 0.0]
                70000000 = [5.0, 5.0, 0.0]
                3280502 = [1254.3385000000001, -142.613068, 180.20266699999999]
                3280503 = [1270.5728799999999, -175.843582, 184.23608400000001]
                3280504 = [1282.8614500000001, -201.004501, 187.21845999999999]
                
                Wire dictionary:
                10000000 = [10000000, 20000000, 70000000]
                20000000 = [20000000, 30000000, 80000000]
                30000000 = [30000000, 40000000, 90000000]
                10000071 = [10000101, 20000022, 70000000, 60000055]
                40000000 = [40000000, 50000000]
                30000088 = [30000208, 40000002, 90005000, 80003000]
                20000092 = [20000105, 30000004, 80004000, 71111167]
                40000094 = [40000304, 50000071, 90000600]
                from data like this:
                Code:
                Rect    1000007110000101200000227000000060000055
                Rect    2000009220000105300000048000400071111167
                Rect    3000008830000208400000029000500080003000
                Tria     40000094400003045000007190000600
                Pnt      100100123.      0.      0.
                Pnt      200200035.      0.      0.
                Pnt      3040000010.     0.      0.
                Pnt      4000000015.     0.      0.
                Pnt      5005000020.     0.      0.
                Pnt      600008000.      5.      0.
                Pnt      700000005.      5.      0.
                Pnt      8000900010.     5.      0.
                Pnt      9000900015.     5.      0.
                Pnt      100100123.      0.      0.
                Pnt      200200035.      0.      0.
                Pnt      3040000010.     0.      0.
                Pnt      4000000015.     0.      0.
                Pnt      5005000020.     0.      0.
                Pnt      600008000.      5.      0.
                Pnt      700000005.      5.      0.
                Pnt      8000900010.     5.      0.
                Pnt      9000900015.     5.      0.
                Rect    100000001000000020000000700000006
                Rect    200000002000000030000000800000007
                Rect    300000003000000040000000900000008
                Tria    4000000040000000500000009
                Pnt     1       0.      0.      0.
                Pnt     2       5.      0.      0.
                Pnt     3       10.     0.      0.
                Pnt     4       15.     0.      0.
                Pnt     5       20.     0.      0.
                Pnt     6       0.      5.      0.
                Pnt     7       5.      5.      0.
                Pnt     8       10.     5.      0.
                Pnt     9       15.     5.      0.
                
                
                
                
                Pnt    *         3280311       0          1.36567432E+03 -3.71226532E+02
                *         2.01031464E+02       0
                Pnt	 *         3280502       0          1.25433850E+03 -1.42613068E+02
                *         1.80202667E+02       0
                Pnt	 *         3280503       0          1.27057288E+03 -1.75843582E+02
                *         1.84236084E+02       0
                Pnt    *         3280504       0          1.28286145E+03 -2.01004501E+02
                *         1.87218460E+02       0
                
                Pnt*     10260209                       1156.26599      313.992828
                *       155.018463
                Pnt*     10270106                       1097.15002      250.676315
                *       140.789337
                Pnt*     10270107                       1115.47864      271.83374
                *       144.698837
                The data files were not formatted in the best manner for reading.


                • bvdet
                  Recognized Expert Specialist
                  • Oct 2006
                  • 2851

                  #23
                  Maybe this will be easier to follow:
                  Code:
                  import re

                  def read_file_data(f):
                      ptDict = {}
                      wireDict = {}
                      fList = open(f).readlines()
                      
                      in_pnt = False
                      patt = re.compile(r'''\d+\.\d+E\+\d+|           # engineering notation ++
                                            -\d+\.\d+E\+\d+|          # engineering notation -+
                                            -\d+\.\d+E-\d+|           # engineering notation --
                                            \d+\.\d+E-\d+|            # engineering notation +-
                                            \d+\.\d+|                 # positive float format
                                            -\d+\.\d+|                # negative float format
                                            \d+                       # positive integer
                                            ''', re.X
                                        )
                      
                      for line in fList:
                          lineList = [x.lower().strip() for x in line.strip().split(' ', 1) if x != '']


                  • psbasha
                    Contributor
                    • Feb 2007
                    • 440

                    #24
                    Code:
                    Sample.txt
                    $$$$$
                    START
                    COLOR RED
                    LINETYPE SOLID
                    END
                    $$$$$$$
                    PLine    1        6      1.5     9.375   .001    .001
                    $ Line Details
                    Line*    1               1                1              2
                    *        .002952         .992547         .121827
                    $
                    Rect     2        1       2       3       7       6
                    Rect     3        1       3       4       8       7
                    PRect*   4               11              15              16
                    *        10              11              0.3
                    Rect*    4               1               5               6
                    *        10              11              0.
                    Othr*    1               1               5               6
                    *        10              11              0.              0.
                    *        10              11              0.              1.0
                    $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
                    Tria     5        1       7       2       11
                    $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
                    Point    1               0.0     0.0     0.0
                    Point    2               1.0     0.0     0.0
                    Point    3               2.0     0.0     0.0
                    Point    4               3.0     0.0     0.0
                    Point    5               0.0     1.0     0.0
                    Point    6               1.0     1.0     0.0
                    Point    7               2.0     1.0     0.0
                    Point    8               4.0     1.0     0.0
                    Point*   9                              0.0             2.0
                    *          0.0
                    Point  *3280504         0               1.28286145E+03  1.28286145E+03
                    *       -2.01004501E+02
                    $
                    END


                    Code:
                    Sample.py
                    
                        
                    def read_file_data(strFile):
                        f = open(strFile,'r')
                        pointID = 0
                        curvetID = 0
                        pointIDDict = {}        
                        pointList = []             
                        coordList = []
                        attrdict=[]
                        
                        curveIDDict = {}
                        curveOneDimIDDict = {}
                        curveTwoDimIDPointIDDict = {}        
                        largeFieldFlag = False
                        
                        curveCardLargeFieldFlag = False
                        bTriaFlag = False
                        bRectFlag = False
                        bOnlyPointCoord = True
                        b1DCurveFlag = False
                        propDict={}
                    
                        strTemp = f.readlines()
                        for line in strTemp:
                            if(line.startswith('Point') or line.startswith('Point*') or line.startswith('Point  *') or line.startswith('*') and bOnlyPointCoord):
                                
                                if (line.startswith('Point') and (line[:8].strip().isalpha())):
                                    pointID = int(line[8:16])                        
                                    coordList.append((float(line[24:32])))
                                    coordList.append((float(line[32:40])))                
                                    coordList.append((float(line[40:48])))
                                    largeFieldFlag = False
                                elif (line.startswith('Point*') or line.startswith('Point  *')):
                                    pointID = int(line[8:24])                        
                                    coordList.append((float(line[40:56])))
                                    coordList.append((float(line[56:72])))
                                    largeFieldFlag = True
                                    bOnlyPointCoord = True
                                elif (line.startswith('*') and largeFieldFlag):                  
                                    coordList.append((float(line[8:24])))
                                    largeFieldFlag = False                    
                                if ( pointID and largeFieldFlag == False):
                                    pointIDDict[pointID]=coordList                    
                                    pointID =0   
                                    coordList = []
                                    
                                bOnlyPointCoord = True
                            elif (line.startswith('Rect') or line.startswith('Tria') or  \
                                  line.startswith('Line') and line[:8].strip().isalpha() or \
                                  line.startswith('Rect*') or line.startswith('Tria*') or\
                                  line.startswith('Line*')or line.startswith('*')):
                                  
                                if (line.startswith('Rect ') or line.startswith('Tria ') or \
                                    line.startswith('Line') and line[:8].strip().isalpha() ):
                    
                                    curvetID = int(line[8:16])                        
                                    pointList.append((int(line[24:32])))
                                    pointList.append((int(line[32:40])))
                                    b1DCurveFlag = True
                                    
                                    if (line[:4]=='Tria'or line[:4]=='Rect'):                        
                                        pointList.append((int(line[40:48])))
                                        b1DCurveFlag = False
                                                
                                        if (line[:4]=='Rect' ):
                                            pointList.append((int(line[48:56])))
                                            
                                    curveCardLargeFieldFlag = False
                                        
                                elif   (line.startswith('Rect*') or line.startswith('Tria*') or \
                                        line.startswith('Line*')):
                                    curvetID = int(line[8:24])                        
                                    pointList.append((int(line[40:56])))
                                    pointList.append((int(line[56:72])))
                                    curveCardLargeFieldFlag = True
                                    bOnlyPointCoord = False
                                    b1DCurveFlag = True
                                    if line.startswith('Rect*') :
                                        bRectFlag = True
                                        bTriaFlag = False
                                    elif line.startswith('Tria*'):
                                        bTriaFlag = True
                                        bRectFlag = False                        
                                    
                                elif line.startswith('*') and curveCardLargeFieldFlag:                    
                                    if (bTriaFlag or bRectFlag):
                                        pointList.append((int(line[8:24])))
                                        b1DCurveFlag = False                                
                                        if bRectFlag:
                                            pointList.append((int(line[24:40])))
                                            
                                    bTriaFlag = False
                                    bRectFlag = False
                                            
                                    curveCardLargeFieldFlag = False
                                            
                                if ( curvetID and curveCardLargeFieldFlag == False):                    
                                    # Map ElementID and Node ID's of that element
                                    curveIDDict[curvetID]=pointList
                                    if b1DCurveFlag:
                                        curveOneDimIDDict[curvetID]= pointList
                                        b1DCurveFlag = False
                                    else:
                                        curveTwoDimIDPointIDDict[curvetID]= pointList
                                        b1DCurveFlag = False                    
                    
                                    curveCardLargeFieldFlag = False
                                    bOnlyPointCoord = False
                                    curvetID = 0                    
                                    pointList = []          
                        
                        f.close()
                    
                        #Node
                        #For all Nodes
                        print(pointIDDict)
                    
                        print(curveIDDict)
                    
                        print(curveOneDimIDDict)
                    
                        print(curveTwoDimIDPointIDDict)
                    
                        
                    if __name__ == '__main__':
                        read_file_data("C:\\ReadFile\\SampleData.txt")
                    Above is the sample text file and the sample code for reading it. I would like to avoid using the flags and defining so many variables. Is it possible to use regular expressions to reduce the amount of code?

                    Thanks
                    PSB
                    Last edited by psbasha; Dec 25 '07, 06:23 AM. Reason: Remove the folder name
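
                    One possible way to drop the flags is to merge each record with its "*" continuation lines first, and only then interpret the fields. The sketch below is an illustration under assumptions, not a drop-in replacement: it uses fixed-width slicing rather than regular expressions, all function names are hypothetical, it handles only `Point` records, and it takes the column layout from the slices in the code above (8-column keyword; ID, one skipped field, then X, Y, Z). The demonstration lines are built with `ljust` so the columns are explicit.

```python
def split_fields(line, width, count):
    """Return (keyword, fields): columns 0-7, then `count` fields of `width` columns."""
    body = line.rstrip("\n")
    fields = [body[8 + i * width: 8 + (i + 1) * width].strip() for i in range(count)]
    return body[:8].strip(), fields


def read_records(lines):
    """Yield (keyword, fields) pairs with '*' continuation lines merged in."""
    current = None
    for line in lines:
        if not line.strip() or line.startswith("$"):
            continue                      # blank line or comment
        if line.startswith("*"):
            if current is not None:       # continuation: 16-column fields
                current[1].extend(split_fields(line, 16, 4)[1])
            continue
        if current is not None:
            yield current[0], current[1]
        is_large = "*" in line[:8]        # e.g. 'Point*' or 'Point  *'
        if is_large:
            keyword, fields = split_fields(line, 16, 4)
        else:
            keyword, fields = split_fields(line, 8, 8)
        current = [keyword.replace("*", "").strip().lower(), fields]
    if current is not None:
        yield current[0], current[1]


def read_points(lines):
    """Collect {point_id: [x, y, z]} from 'Point' records."""
    points = {}
    for keyword, fields in read_records(lines):
        if keyword == "point":
            # Field 0 is the ID, field 1 is skipped, fields 2-4 hold X, Y, Z.
            points[int(fields[0])] = [float(v) for v in fields[2:5]]
    return points


def small(*cells):
    """Build a small-field line (8-column cells) for the demonstration."""
    return "".join(str(c).ljust(8) for c in cells)


def large(first, *cells):
    """Build a large-field line (8-column keyword, 16-column cells)."""
    return str(first).ljust(8) + "".join(str(c).ljust(16) for c in cells)


sample = [
    "$ comment line",
    small("Point", 1, "", "0.0", "0.0", "0.0"),
    large("Point*", 9, "", "0.0", "2.0"),
    large("*", "0.0"),
    small("Rect", 2, 1, 2, 3, 7, 6),
    large("Point  *", 3280504, 0, "1.28286145E+03", "1.28286145E+03"),
    large("*", "-2.01004501E+02"),
]
print(read_points(sample))
```

                    Since every record arrives as one merged field list, no state has to be carried from line to line; supporting `Rect`, `Tria` or `Line` would mean adding another keyword branch alongside the `point` one.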


                    • psbasha
                      Contributor
                      • Feb 2007
                      • 440

                      #25
                      Originally posted by psbasha
                      Code:
                      Sample.txt
                      $$$$$
                      START
                      COLOR RED
                      LINETYPE SOLID
                      END
                      $$$$$$$
                      PLine    1        6      1.5     9.375   .001    .001
                      $ Line Details
                      Line*    1               1                1              2
                      *        .002952         .992547         .121827
                      $
                      Rect     2        1       2       3       7       6
                      Rect     3        1       3       4       8       7
                      PRect*   4               11              15              16
                      *        10              11              0.3
                      Rect*    4               1               5               6
                      *        10              11              0.
                      Othr*    1               1               5               6
                      *        10              11              0.              0.
                      *        10              11              0.              1.0
                      $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
                      Tria     5        1       7       2       11
                      $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
                      Point    1               0.0     0.0     0.0
                      Point    2               1.0     0.0     0.0
                      Point    3               2.0     0.0     0.0
                      Point    4               3.0     0.0     0.0
                      Point    5               0.0     1.0     0.0
                      Point    6               1.0     1.0     0.0
                      Point    7               2.0     1.0     0.0
                      Point    8               4.0     1.0     0.0
                      Point*   9                              0.0             2.0
                      *          0.0
                      Point  *3280504         0               1.28286145E+03  1.28286145E+03
                      *       -2.01004501E+02
                      $
                      END


                      Code:
                      Sample.py
                      
                          
                      def read_file_data(strFile):
                          f = open(strFile,'r')
                          pointID = 0
                          curvetID = 0
                          pointIDDict = {}        
                          pointList = []             
                          coordList = []
                          attrdict=[]
                          
                          curveIDDict = {}
                          curveOneDimIDDict = {}
                          curveTwoDimIDPointIDDict = {}        
                          largeFieldFlag = False
                          
                          curveCardLargeFieldFlag = False
                          bTriaFlag = False
                          bRectFlag = False
                          bOnlyPointCoord = True
                          b1DCurveFlag = False
                          propDict={}
                      
                          strTemp = f.readlines()
                          for line in strTemp:
                              if(line.startswith('Point') or line.startswith('Point*') or line.startswith('Point  *') or line.startswith('*') and bOnlyPointCoord):
                                  
                                  if (line.startswith('Point') and (line[:8].strip().isalpha())):
                                      pointID = int(line[8:16])                        
                                      coordList.append((float(line[24:32])))
                                      coordList.append((float(line[32:40])))                
                                      coordList.append((float(line[40:48])))
                                      largeFieldFlag = False
                                  elif (line.startswith('Point*') or line.startswith('Point  *')):
                                      pointID = int(line[8:24])                        
                                      coordList.append((float(line[40:56])))
                                      coordList.append((float(line[56:72])))
                                      largeFieldFlag = True
                                      bOnlyPointCoord = True
                                  elif (line.startswith('*') and largeFieldFlag):                  
                                      coordList.append((float(line[8:24])))
                                      largeFieldFlag = False                    
                                  if ( pointID and largeFieldFlag == False):
                                      pointIDDict[pointID]=coordList                    
                                      pointID =0   
                                      coordList = []
                                      
                                  bOnlyPointCoord = True
                              elif (line.startswith('Rect') or line.startswith('Tria') or  \
                                    line.startswith('Line') and line[:8].strip().isalpha() or \
                                    line.startswith('Rect*') or line.startswith('Tria*') or\
                                    line.startswith('Line*')or line.startswith('*')):
                                    
                                  if (line.startswith('Rect  ') or \
                                      line.startswith('Line') and line[:8].strip().isalpha() ):
                      
                                      curvetID = int(line[8:16])                        
                                      pointList.append((int(line[24:32])))
                                      pointList.append((int(line[32:40])))
                                      b1DCurveFlag = True
                                      
                                      if (line[:4]=='Tria'or line[:4]=='Rect'):                        
                                          pointList.append((int(line[40:48])))
                                          b1DCurveFlag = False
                                                  
                                          if (line[:4]=='Rect' ):
                                              pointList.append((int(line[48:56])))
                                              
                                      curveCardLargeFieldFlag = False
                                          
                                  elif   (line.startswith('Rect*') or line.startswith('Tria*') or \
                                          line.startswith('Line*')):
                                      curvetID = int(line[8:24])                        
                                      pointList.append((int(line[40:56])))
                                      pointList.append((int(line[56:72])))
                                      curveCardLargeFieldFlag = True
                                      bOnlyPointCoord = False
                                      b1DCurveFlag = True
                                      if line.startswith('Rect*') :
                                          bRectFlag = True
                                          bTriaFlag = False
                                      elif line.startswith('Tria*'):
                                          bTriaFlag = True
                                          bRectFlag = False                        
                                      
                                  elif line.startswith('*') and curveCardLargeFieldFlag:                    
                                      if (bTriaFlag or bRectFlag):
                                          pointList.append((int(line[8:24])))
                                          b1DCurveFlag = False                                
                                          if bRectFlag:
                                              pointList.append((int(line[24:40])))
                                              
                                      bTriaFlag = False
                                      bRectFlag = False
                                              
                                      curveCardLargeFieldFlag = False
                                              
                                  if ( curvetID and curveCardLargeFieldFlag == False):                    
                                      # Map ElementID and Node ID's of that element
                                      curveIDDict[curvetID]=pointList
                                      if b1DCurveFlag:
                                          curveOneDimIDDict[curvetID]= pointList
                                          b1DCurveFlag = False
                                      else:
                                          curveTwoDimIDPointIDDict[curvetID]= pointList
                                          b1DCurveFlag = False                    
                      
                                      curveCardLargeFieldFlag = False
                                      bOnlyPointCoord = False
                                      curvetID = 0                    
                                      pointList = []          
                          
                          f.close()
                      
                          #Node
                          #For all Nodes
                          print pointIDDict
                      
                          print curveIDDict
                      
                          print  curveOneDimIDDict
                      
                          print curveTwoDimIDPointIDDict  
                      
                          
                      if __name__ == '__main__':
                          read_file_data("C:\\Shakil\\ReadFile\\SampleData.txt")
                      Above are the sample text file and the sample code that reads it. I would like to avoid using the flags and defining so many variables. Is it possible to use regular expressions to shorten this code?

                      Thanks
                      PSB
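One way to cut down on the repeated slices such as `line[8:16]` and `line[40:56]` in the code above is to split each card into fixed-width fields once and then index the resulting list. A minimal sketch, assuming the layout implied by those slices: an 8-character keyword field followed by 8-character fields (small format) or 16-character fields (large format, marked by `*` in the keyword field). The name `split_fields` is hypothetical and not part of the posted code.

```python
def split_fields(line):
    """Split one card line into (keyword, fields) using fixed column widths.

    Assumed layout: columns 0-7 hold the keyword; the remaining columns are
    8 characters wide, or 16 wide when the keyword field contains '*'.
    """
    line = line.rstrip()
    head = line[:8]
    width = 16 if '*' in head else 8
    body = line[8:]
    # Slice the rest of the line into equal-width chunks and trim padding.
    fields = [body[i:i + width].strip() for i in range(0, len(body), width)]
    return head.strip(), fields


# Example lines built with explicit padding so the columns are unambiguous.
small = 'Line    ' + ''.join('%8d' % n for n in (1, 1, 1, 2))
large = 'Rect*   ' + ''.join('%16d' % n for n in (4, 1, 5, 6))
keyword, fields = split_fields(large)
point_ids = [int(f) for f in fields[2:]]
```

Because adjacent fixed-width fields may touch without a space between them, slicing by column is safer here than splitting on whitespace.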
                      In some scenarios I have to read the following data from the file:

                      PLine 1 6 1.5 9.375 .001 .001
                      PRect* 4 11 15 16
                      * 10 11 0.3
                      Othr* 1 1 5 6
                      * 10 11 0. 0.
                      * 10 11 0. 1.0

                      In some scenarios the Point data is defined as below:

                      Point *3280505 0 1.28286145+03 1.28286145-03
                      * -2.01004501+02

                      1.28286145+03 is the same as 1.28286145E+03
                      1.28286145-03 is the same as 1.28286145E-03

                      How can I handle the above scenarios while reading the file?

                      Thanks
                      PSB
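Python's `float()` rejects the compact form `1.28286145+03`, so the missing `E` has to be put back before conversion. A minimal sketch, assuming a sign that directly follows a digit or a decimal point always starts the exponent; `parse_real` is a hypothetical helper name.

```python
import re

# A '+' or '-' that directly follows a digit or '.' and precedes a digit is
# taken to be the start of an exponent whose 'E' was omitted (assumption).
_COMPACT_EXP = re.compile(r'(?<=[\d.])([+-])(?=\d)')


def parse_real(token):
    """Convert '1.28286145+03' or '1.28286145E+03' style text to a float."""
    return float(_COMPACT_EXP.sub(r'E\1', token.strip()))


x = parse_real('1.28286145+03')    # same value as float('1.28286145E+03')
y = parse_real('-2.01004501+02')   # the leading minus sign is left alone
```

Values that already contain `E`, and plain values such as `0.` or `-5.`, pass through unchanged because the sign in them does not follow a digit or a decimal point.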

                      Comment

                      • psbasha
                        Contributor
                        • Feb 2007
                        • 440

                        #26
                        PLine 1 6 1.5 9.375 .001 .001
                        PRect* 4 11 15 16
                        * 10 11 0.3
                        Othr* 1 1 5 6
                        * 10 11 0. 0.
                        * 10 11 0. 1.0

                        I have not yet written code for the above card lines, which store the properties of the curves.

                        In some cases the Point coordinates are represented as shown below:

                        Point *3280505 0 1.28286145+03 1.28286145-03
                        * -2.01004501+02

                        1.28286145+03 is the same as 1.28286145E+03
                        1.28286145-03 is the same as 1.28286145E-03

                        Can anybody suggest how to store and print this data?

                        Thanks
                        PSB

                        Comment

                        • psbasha
                          Contributor
                          • Feb 2007
                          • 440

                          #27
                          Any suggestions on the above queries?

                          Comment

                          • psbasha
                            Contributor
                            • Feb 2007
                            • 440

                            #28
                            Hi BV,

                            Any suggestions on the above code?

                            Thanks
                            PSB

                            Comment

                            • bvdet
                              Recognized Expert Specialist
                              • Oct 2006
                              • 2851

                              #29
                              Try this:[code=Python]import re

                              def convert_data(s):
                                  for func in (int, float):
                                      try:
                                          n = func(s)
                                          return n
                                      except:
                                          pass
                                  return s

                              pattnum = re.compile(r'''
                                  \d+\.\d+E\+\d+|     # engineering notation ++
                                  -\d+\.\d+E\+\d+|    # engineering notation -+
                                  -\d+\.\d+E-\d+|     # engineering notation --
                                  \d+\.\d+E-\d+|      # engineering notation +-
                                  \d+\.\d+|           # positive float format
                                  -\d+\.\d+|          # negative float format
                                  \d+\.|              # positive float format
                                  -\d+\.|             # negative float format
                                  \.\d+|              # positive float format
                                  -\.\d+|             # negative float format
                                  \d+                 # positive integer
                                  ''', re.X
                                  )

                              def parseData(fn, *kargs):
                                  fileList = [item.strip() for item in open(fn).readlines()\
                                              if not item.startswith('$')]
                                  pattkey = re.compile('|'.join([r'\b(%s)' % item for item in kargs]))
                                  '''
                                  print pattkey
                                  print pattkey.pattern
                                  '''
                                  # create dictionary with keys from kargs
                                  masterDict = dict(zip(kargs, [[] for _ in kargs]))
                                  inData = False
                                  for line in fileList:
                                      if inData and line.startswith('*'):
                                          data.extend(re.findall(pattnum, line))
                                      elif inData and not line.startswith('*'):
                                          masterDict[m.group(0)].append([convert_data(item)\
                                                                         for item in data])
                                          inData = False
                                          m = pattkey.match(line)
                                          if m:
                                              # m.group(0) is the current keyword
                                              if '*' in line.split()[0]:
                                                  inData = True
                                                  data = re.findall(pattnum, line)
                                              else:
                                                  data = re.findall(pattnum, line)
                                                  masterDict[m.group(0)].append([convert_data(item)\
                                                                                 for item in data])
                                      else:
                                          m = pattkey.match(line)
                                          if m:
                                              # m.group(0) is the current keyword
                                              if '*' in line.split()[0]:
                                                  inData = True
                                                  data = re.findall(pattnum, line)
                                              else:
                                                  data = re.findall(pattnum, line)
                                                  masterDict[m.group(0)].append([convert_data(item)\
                                                                                 for item in data])
                                  return masterDict

                              fn = 'H:\\TEMP\\temsys\\sample_points8.txt'
                              keywords = ['Point', 'Othr', 'Rect', 'PRect', 'PLine', 'Line', 'Tria']
                              dd = parseData(fn, *keywords)
                              for key in dd:
                                  print key
                                  for item in dd[key]:
                                      print ' %s' % item
                              [/code]Output:
                              [code=Python]>>> Point
                              [1, 0.0, 0.0, 0.0]
                              [2, 1.0, 0.0, 0.0]
                              [3, 2.0, 0.0, 0.0]
                              [4, 3.0, 0.0, 0.0]
                              [5, 0.0, 1.0, 0.0]
                              [6, 1.0, 1.0, 0.0]
                              [7, 2.0, 1.0, 0.0]
                              [8, 4.0, 1.0, 0.0]
                              [9, 0.0, 2.0, 0.0]
                              [3280504, 0, 1282.8614500000001, 1282.8614500000001]
                              PLine
                              [1, 6, 1.5, 9.375, 0.001, 0.001]
                              Tria
                              [5, 1, 7, 2, 11]
                              PRect
                              [4, 11, 15, 16, 10, 11, 0.29999999999999999]
                              Line
                              [1, 1, 1, 2, 0.0029520000000000002, 0.99254699999999996, 0.121827]
                              Rect
                              [2, 1, 2, 3, 7, 6]
                              [3, 1, 3, 4, 8, 7]
                              [4, 1, 5, 6, 10, 11, 0.0]
                              Othr
                              [1, 1, 5, 6, 10, 11, 0.0, 0.0, 10, 11, 0.0, 1.0]
                              [/code]

                              Comment

                              • bvdet
                                Recognized Expert Specialist
                                • Oct 2006
                                • 2851

                                #30
                                I made a few modifications so it would work properly. It probably needs some more work, but I will leave it up to you. Let us know how it turns out.[code=Python]import re

                                def convert_data(s):
                                    for func in (int, float):
                                        try:
                                            n = func(s)
                                            return n
                                        except:
                                            pass
                                    return s

                                pattnum = re.compile(r'''
                                    -\d+\.\d+E\+\d+|    # engineering notation -+
                                    \d+\.\d+E\+\d+|     # engineering notation ++
                                    -\d+\.\d+E-\d+|     # engineering notation --
                                    \d+\.\d+E-\d+|      # engineering notation +-
                                    -\d+\.\d+|          # negative float format
                                    \d+\.\d+|           # positive float format
                                    -\d+\.|             # negative float format
                                    \d+\.|              # positive float format
                                    -\.\d+|             # negative float format
                                    \.\d+|              # positive float format
                                    \d+                 # positive integer
                                    ''', re.X
                                    )

                                def parseData(fn, *kargs):
                                    fileList = [item.strip() for item in open(fn).readlines()\
                                                if not item.startswith('$')]
                                    pattkey = re.compile('|'.join([r'\b(%s)' % item for item in kargs]))
                                    '''
                                    print pattkey
                                    print pattkey.pattern
                                    '''
                                    # create dictionary with keys from kargs
                                    masterDict = dict(zip(kargs, [[] for _ in kargs]))
                                    inData = False
                                    for line in fileList:
                                        if inData and line.startswith('*'):
                                            data.extend(re.findall(pattnum, line))
                                        elif inData and not line.startswith('*'):
                                            masterDict[m.group(0)].append([convert_data(item)\
                                                                           for item in data])
                                            inData = False
                                            m = pattkey.match(line)
                                            if m:
                                                # m.group(0) is the current keyword
                                                if '*' in line:
                                                    inData = True
                                                    data = re.findall(pattnum, line)
                                                else:
                                                    data = re.findall(pattnum, line)
                                                    masterDict[m.group(0)].append([convert_data(item)\
                                                                                   for item in data])
                                        else:
                                            m = pattkey.match(line)
                                            if m:
                                                # m.group(0) is the current keyword
                                                if '*' in line:
                                                    inData = True
                                                    data = re.findall(pattnum, line)
                                                else:
                                                    data = re.findall(pattnum, line)
                                                    masterDict[m.group(0)].append([convert_data(item)\
                                                                                   for item in data])
                                    return masterDict

                                fn = 'sample.txt'
                                keywords = ['Point', 'Othr', 'Rect', 'PRect', 'PLine', 'Line', 'Tria']
                                dd = parseData(fn, *keywords)
                                for key in dd:
                                    print key
                                    for item in dd[key]:
                                        print ' %s' % item[/code]
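One way to drop the `inData` flag from the code above is to fold each `*` continuation line into the card it belongs to before any conversion happens. This also stores a card whose continuation line is the last line of the file. A minimal sketch, assuming fields are separated by whitespace (which does not hold for tightly packed fixed-width files) and that `*` appears only as the large-field marker. The names `to_number`, `iter_cards` and `parse_cards` are hypothetical and not part of the posted code.

```python
def to_number(token):
    """Return an int or float when the token parses as one, else the token."""
    for cast in (int, float):
        try:
            return cast(token)
        except ValueError:
            pass
    return token


def iter_cards(lines):
    """Yield one token list per card, with '*' continuation lines folded in."""
    card = None
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith('$'):
            continue                      # skip blanks and '$' comment lines
        tokens = line.replace('*', ' ').split()
        if line.startswith('*'):
            if card is not None:
                card.extend(tokens)       # continuation of the current card
            continue
        if card is not None:
            yield card                    # previous card is complete
        card = tokens
    if card is not None:
        yield card                        # flush the last card in the file


def parse_cards(lines):
    """Group converted card data by keyword (the first token of each card)."""
    result = {}
    for card in iter_cards(lines):
        result.setdefault(card[0], []).append([to_number(t) for t in card[1:]])
    return result


sample = [
    '$ comment line',
    'PLine 1 6 1.5 9.375 .001 .001',
    'PRect* 4 11 15 16',
    '* 10 11 0.3',
    'Othr* 1 1 5 6',
    '* 10 11 0. 0.',
    '* 10 11 0. 1.0',
]
cards = parse_cards(sample)
```

Tokens that do not parse as numbers, such as the compact `1.28286145+03` form, come back as unchanged strings, so they still need separate handling.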

                                Comment
