Imported tab delimited data into HTML using python

  • nitenmistry
    New Member
    • Feb 2007
    • 5

    Imported tab delimited data into HTML using python

    Hi,

    I'm creating an HTML report using Python and would like to import some tabular data, which is in tab-delimited format, into the HTML page.

    The data needs to be displayed as a table within the HTML page, but I'm not sure how to go about doing it.

    Thanks in advance,

    Niten
  • bvdet
    Recognized Expert Specialist
    • Oct 2006
    • 2851

    #2
    It seems this is a combined HTML and Python question. Using Python to insert formatted data into an existing file is straightforward if we know where to put it (line number, after a certain keyword or character sequence, etc.). Since I don't know HTML (and don't really want to know), I cannot help with that. We may have an expert here who knows both, or someone on the HTML forum can help.


    • Motoma
      Recognized Expert Specialist
      • Jan 2007
      • 3236

      #3

      Does the HTML page exist already, or are you creating an HTML page to display the data?
      The way I would do this is to parse through your tab-delimited data set and store the data as a list of lists.
      Then you can iterate through it with nested for loops, writing out HTML tags.
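
A minimal sketch of the nested-loop idea described above, assuming the data has already been parsed into a list of lists and that the first row holds the column headers. The sample rows and the function name rows_to_table are made up for illustration.

```python
# Hypothetical sample data; in practice these rows come from the parsed file.
rows = [
    ["name", "qty"],      # first row is treated as the header
    ["bolt", "12"],
    ["washer", "40"],
]

def rows_to_table(rows):
    """Build an HTML table string from a list of lists using nested loops."""
    parts = ["<table>"]
    for i, row in enumerate(rows):
        tag = "th" if i == 0 else "td"   # header cells for the first row only
        cells = ""
        for value in row:                # inner loop: one cell per value
            cells += "<%s>%s</%s>" % (tag, value, tag)
        parts.append("    <tr>" + cells + "</tr>")
    parts.append("</table>")
    return "\n".join(parts)

print(rows_to_table(rows))
```

The outer loop emits one table row per data row, and the inner loop wraps each value in a cell tag.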


      • nitenmistry
        New Member
        • Feb 2007
        • 5

        #4
        I'm creating an HTML page that needs to display data in tab-delimited format.

        I've only just started learning Python. How would I go about parsing the delimited data set?

        Thanks again,

        Niten


        • Motoma
          Recognized Expert Specialist
          • Jan 2007
          • 3236

          #5
          How much of the code have you done so far?


          • nitenmistry
            New Member
            • Feb 2007
            • 5

            #6
            Not much. I'm trying to build on something I did previously, where variables from a text file are passed into a script.
            Code:
            def openAndParse(inpName):
            
                inp=open(inpName)
                lines = inp.readlines()
                inp.close()
            
                params = lines[6:62]
                for p in params:
                    exec(p[6:])
            
                return h5
            I'm not sure how to iterate through the nested loops and attach the HTML tags.

            Regards,

            Niten
            Last edited by bartonc; Feb 22 '07, 07:40 PM. Reason: added [code][/code] tags


            • dshimer
              Recognized Expert New Member
              • Dec 2006
              • 136

              #7
              The parsing is pretty straightforward. Consider one line that has mixed text but is delimited by tabs.
              Code:
              >>> txt='this and\tthat\twith some\tbeside'
              >>> txt.split('\t')
              ['this and', 'that', 'with some', 'beside']
              Note that because the split is specified to occur only at the tabs, strings that contain a space stay together. split can take any delimiter; if none is specified, any run of whitespace causes a split.

              readlines returns the whole file as a list of individual lines, so for each line in that list, split at the tabs and append the result to a new list. In the end you will have a list with the same number of elements as the file had lines, and each element will be a list of parsed data. For example:
              Code:
              >>> file=open('/tmp/tmp.txt','r')
              >>> lines=file.readlines()
              >>> fulllist=[]
              >>> for line in lines:
              ... 	fulllist.append(line.split('\t'))
              ... 	
              >>> fulllist
              [['a', '3', 'b', '4\n'], ['c', '5', 'd', '6\n'], ['four', 'words', 'with', 'tabs\n']]
              >>>
              By changing line.split('\t') to line.replace('\n','').split('\t') you can also strip the newline character before the split.


              • bvdet
                Recognized Expert Specialist
                • Oct 2006
                • 2851

                #8
                This will remove leading and trailing whitespace characters:
                Code:
                s.strip().split('\t')
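
One caveat worth illustrating: tabs count as whitespace, so strip() also removes leading and trailing tabs, which drops empty fields at the start or end of a row. The sketch below uses a made-up row whose last two fields are empty to compare strip() with rstrip('\r\n'), which removes only the line ending.

```python
# Hypothetical row: two values followed by two empty fields.
line = "831\t90.0\t\t\n"

# strip() removes the trailing tabs as well as the newline,
# so the empty trailing fields disappear.
stripped = line.strip().split("\t")

# rstrip("\r\n") removes only carriage returns and newlines,
# so the empty trailing fields are preserved.
newline_only = line.rstrip("\r\n").split("\t")

print(stripped)       # ['831', '90.0']
print(newline_only)   # ['831', '90.0', '', '']
```

If every row must have the same number of columns in the table, the second form is probably the safer choice.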


                • bvdet
                  Recognized Expert Specialist
                  • Oct 2006
                  • 2851

                  #9
                  I needed to parse some tab delimited files recently. I hope this will help you. Keep in mind I do not know HTML.
                  Code:
                  import os
                  
                  fn = 'H:/TEMP/temsys/MemData.txt'
                  
                  f = open(fn, 'r')
                  labelLst = f.readline().strip().split('\t')
                  lineLst = []
                  
                  for line in f:
                      if not line.startswith('#'):
                          lineLst.append(line.strip().split('\t'))
                  
                  s1 = '%s\n%s\n' % ('<HTML tag>', ', '.join(labelLst))
                  sLst = []
                  for line in lineLst:
                      sLst.append(','.join(line)+'\n')
                      
                  s2 = ''.join(sLst)
                      
                  finished_string = '%s%s%s' % (s1, s2, '</HTML tag>')
                  print(finished_string)
                  
                  
                  """ file data
                  mem_no	mod_azimuth	vessel_OR	platform_IR	platform_OR	toe_dir	brkt_type	tos_el	hr_ext
                  ##############################################################################################################
                  831	90.0	109.0	120.0	216.0	In	F	1456.5	Yes
                  832	337.0	109.0	120.0	216.0	In	F1	1456.5	Yes
                  833	316.0	109.0	120.0	216.0	In	F2	1456.5	Yes
                  834	298.0	109.0	120.0	192.0	Out	F3	1456.5	Yes
                  836	277.0	109.0	120.0	192.0	Out	F4	1456.5	Yes
                  837	270.0	109.0	120.0	192.0	In	F	1468.5	Yes
                  838	256.0	109.0	120.0	192.0	In	F	1468.5	No
                  839	180.0	109.0	120.0	216.0	Out	F2	1456.5	Yes
                  840	59.0	109.0	120.0	216.0	In	F	1456.5	Yes
                  841	39.0	109.0	120.0	216.0	In	F	1456.5	Yes
                  842	17.0	109.0	120.0	216.0	Out	F	1456.5	Yes
                  849	356.0	109.0	120.0	216.0	Out	F	1456.5	Yes
                  """
                  """ finished_string
                  <HTML tag>
                  mem_no, mod_azimuth, vessel_OR, platform_IR, platform_OR, toe_dir, brkt_type, tos_el, hr_ext
                  831,90.0,109.0,120.0,216.0,In,F,1456.5,Yes
                  832,337.0,109.0,120.0,216.0,In,F1,1456.5,Yes
                  833,316.0,109.0,120.0,216.0,In,F2,1456.5,Yes
                  834,298.0,109.0,120.0,192.0,Out,F3,1456.5,Yes
                  836,277.0,109.0,120.0,192.0,Out,F4,1456.5,Yes
                  837,270.0,109.0,120.0,192.0,In,F,1468.5,Yes
                  838,256.0,109.0,120.0,192.0,In,F,1468.5,No
                  839,180.0,109.0,120.0,216.0,Out,F2,1456.5,Yes
                  840,59.0,109.0,120.0,216.0,In,F,1456.5,Yes
                  841,39.0,109.0,120.0,216.0,In,F,1456.5,Yes
                  842,17.0,109.0,120.0,216.0,Out,F,1456.5,Yes
                  849,356.0,109.0,120.0,216.0,Out,F,1456.5,Yes
                  </HTML tag>
                  >>>
                  """
                  You will need to format the data to suit the way you want to display the information - probably columnar - so instead of a comma, you would use a pad string that needs to be calculated. I recall that Barton posted a nifty function for creating pad strings. I will try to find it.
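
One possible way to calculate the pad width from the data itself, sketched with the built-in str.ljust. The sample rows, the function name format_columns, and the gap parameter are illustrative assumptions, and the sketch assumes every row has the same number of fields.

```python
# Hypothetical rows; the first row holds the column labels.
rows = [
    ["mem_no", "toe_dir", "hr_ext"],
    ["831", "In", "Yes"],
    ["834", "Out", "No"],
]

def format_columns(rows, gap=2):
    """Pad every cell to the width of the longest entry in its column."""
    # zip(*rows) regroups the data by column so each width can be measured.
    widths = [max(len(cell) for cell in column) for column in zip(*rows)]
    lines = []
    for row in rows:
        padded = [cell.ljust(width + gap) for cell, width in zip(row, widths)]
        lines.append("".join(padded).rstrip())   # drop padding after the last cell
    return "\n".join(lines)

print(format_columns(rows))
```

Each column ends up only as wide as its longest entry plus the gap, rather than a single fixed width for all columns.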


                  • bvdet
                    Recognized Expert Specialist
                    • Oct 2006
                    • 2851

                    #10
                    I found Barton's function and applied it to the example:
                    Code:
                    import os
                    
                    col_width = 14
                    
                    # Barton wrote this
                    def columnize(word, width):
                        nSpaces = width - len(word)
                        if nSpaces < 0:
                            nSpaces = 0
                        return word + (" " * nSpaces)
                    
                    fn = 'H:/TEMP/temsys/MemData.txt'
                    
                    f = open(fn, 'r')
                    labelLst = f.readline().strip().split('\t')
                    lineLst = []
                    
                    for line in f:
                        if not line.startswith('#'):
                            lineLst.append(line.strip().split('\t'))
                    
                    s1 = '<HTML tag>\n'
                    
                    for word in labelLst:
                        s1 += columnize(word, col_width)
                        
                    sLst = []
                    for line in lineLst:
                        line_of_words = ''
                        for word in line:
                            line_of_words += columnize(word, col_width)
                        sLst.append(line_of_words+'\n')
                            
                    s2 = ''.join(sLst)
                    
                    finished_string = '%s\n%s\n%s%s' % (s1, '='*col_width*len(labelLst), s2, '</HTML tag>')
                    print(finished_string)
                    
                    """ finished_string
                    >>> <HTML tag>
                    mem_no        mod_azimuth   vessel_OR     platform_IR   platform_OR   toe_dir       brkt_type     tos_el        hr_ext        
                    ==============================================================================================================================
                    831           90.0          109.0         120.0         216.0         In            F             1456.5        Yes           
                    832           337.0         109.0         120.0         216.0         In            F1            1456.5        Yes           
                    833           316.0         109.0         120.0         216.0         In            F2            1456.5        Yes           
                    834           298.0         109.0         120.0         192.0         Out           F3            1456.5        Yes           
                    836           277.0         109.0         120.0         192.0         Out           F4            1456.5        Yes           
                    837           270.0         109.0         120.0         192.0         In            F             1468.5        Yes           
                    838           256.0         109.0         120.0         192.0         In            F             1468.5        No            
                    839           180.0         109.0         120.0         216.0         Out           F2            1456.5        Yes           
                    840           59.0          109.0         120.0         216.0         In            F             1456.5        Yes           
                    841           39.0          109.0         120.0         216.0         In            F             1456.5        Yes           
                    842           17.0          109.0         120.0         216.0         Out           F             1456.5        Yes           
                    849           356.0         109.0         120.0         216.0         Out           F             1456.5        Yes           
                    </HTML tag>
                    >>>
                    """


                    • Motoma
                      Recognized Expert Specialist
                      • Jan 2007
                      • 3236

                      #11
                      The original problem was to format this into an HTML Table.

                      The general format for this is:
                      Code:
                      <table>
                          <tr><th>header column 1</th><th>header column 2</th><th>header column 3</th></tr>
                          <tr><td>Row 1 Col 1</td><td>Row 1 Col 2</td><td>Row 1 Col 3</td></tr>
                          <tr><td>Row 2 Col 1</td><td>Row 2 Col 2</td><td>Row 2 Col 3</td></tr>
                          <tr><td>Row 3 Col 1</td><td>Row 3 Col 2</td><td>Row 3 Col 3</td></tr>
                      </table>
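
When cell values are written between these tags, characters such as < and & in the data would be read as markup. A small sketch of building one row with Python 3's html.escape follows; the helper name make_row and the sample values are assumptions made for illustration.

```python
import html

def make_row(values, tag="td"):
    """Return one <tr> line, escaping characters that are special in HTML."""
    cells = "".join(
        "<%s>%s</%s>" % (tag, html.escape(value), tag) for value in values
    )
    return "    <tr>%s</tr>" % cells

print(make_row(["size", "note"], tag="th"))   # header row uses <th>
print(make_row(["<10", "nuts & bolts"]))      # data row uses <td>
```

Passing tag="th" produces a header row, and the default produces a data row.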


                      • bvdet
                        Recognized Expert Specialist
                        • Oct 2006
                        • 2851

                        #12
                        Thanks Motoma!
                        Code:
                        """
                        Read a tab delimited file and format for an HTML table
                        The general format for an HTML table:
                        <table>
                            <tr><th>header column 1</th><th>header column 2</th><th>header column 3</th></tr>
                            <tr><td>Row 1 Col 1</td><td>Row 1 Col 2</td><td>Row 1 Col 3</td></tr>
                            <tr><td>Row 2 Col 1</td><td>Row 2 Col 2</td><td>Row 2 Col 3</td></tr>
                            <tr><td>Row 3 Col 1</td><td>Row 3 Col 2</td><td>Row 3 Col 3</td></tr>
                        </table>
                        """
                        fn = 'H:/TEMP/temsys/MemData.txt'
                        
                        f = open(fn, 'r')
                        labelLst = f.readline().strip().split('\t')
                        lineLst = []
                        
                        for line in f:
                            if not line.startswith('#'):
                                lineLst.append(line.strip().split('\t'))
                        
                        s1 = '<table>\n'
                        # header cells use <th>, and each one is closed with </th>
                        s1a = '%s%s%s%s' % ('    <tr><th>','</th><th>'.join(labelLst),'</th></tr>','\n')
                            
                        sLst = []
                        for line in lineLst:
                            # data cells use <td>
                            line_of_words = '%s%s%s%s' % ('    <tr><td>','</td><td>'.join(line),'</td></tr>','\n')
                            sLst.append(line_of_words)
                                
                        s2 = ''.join(sLst)
                        
                        finished_string = '%s%s%s%s' % (s1, s1a, s2, '</table>')
                        print(finished_string)
                        
                        """ data file
                        mem_no	mod_azimuth	vessel_OR	platform_IR	platform_OR	toe_dir	brkt_type	tos_el	hr_ext
                        ##############################################################################################################
                        831	90.0	109.0	120.0	216.0	In	F	1456.5	Yes
                        832	337.0	109.0	120.0	216.0	In	F1	1456.5	Yes
                        833	316.0	109.0	120.0	216.0	In	F2	1456.5	Yes
                        834	298.0	109.0	120.0	192.0	Out	F3	1456.5	Yes
                        836	277.0	109.0	120.0	192.0	Out	F4	1456.5	Yes
                        837	270.0	109.0	120.0	192.0	In	F	1468.5	Yes
                        838	256.0	109.0	120.0	192.0	In	F	1468.5	No
                        839	180.0	109.0	120.0	216.0	Out	F2	1456.5	Yes
                        840	59.0	109.0	120.0	216.0	In	F	1456.5	Yes
                        841	39.0	109.0	120.0	216.0	In	F	1456.5	Yes
                        842	17.0	109.0	120.0	216.0	Out	F	1456.5	Yes
                        849	356.0	109.0	120.0	216.0	Out	F	1456.5	Yes
                        """
                        
                        """ finished_string
                        >>>
                        <table>
                            <tr><th>mem_no</th><th>mod_azimuth</th><th>vessel_OR</th><th>platform_IR</th><th>platform_OR</th><th>toe_dir</th><th>brkt_type</th><th>tos_el</th><th>hr_ext</th></tr>
                            <tr><td>831</td><td>90.0</td><td>109.0</td><td>120.0</td><td>216.0</td><td>In</td><td>F</td><td>1456.5</td><td>Yes</td></tr>
                            <tr><td>832</td><td>337.0</td><td>109.0</td><td>120.0</td><td>216.0</td><td>In</td><td>F1</td><td>1456.5</td><td>Yes</td></tr>
                            <tr><td>833</td><td>316.0</td><td>109.0</td><td>120.0</td><td>216.0</td><td>In</td><td>F2</td><td>1456.5</td><td>Yes</td></tr>
                            <tr><td>834</td><td>298.0</td><td>109.0</td><td>120.0</td><td>192.0</td><td>Out</td><td>F3</td><td>1456.5</td><td>Yes</td></tr>
                            <tr><td>836</td><td>277.0</td><td>109.0</td><td>120.0</td><td>192.0</td><td>Out</td><td>F4</td><td>1456.5</td><td>Yes</td></tr>
                            <tr><td>837</td><td>270.0</td><td>109.0</td><td>120.0</td><td>192.0</td><td>In</td><td>F</td><td>1468.5</td><td>Yes</td></tr>
                            <tr><td>838</td><td>256.0</td><td>109.0</td><td>120.0</td><td>192.0</td><td>In</td><td>F</td><td>1468.5</td><td>No</td></tr>
                            <tr><td>839</td><td>180.0</td><td>109.0</td><td>120.0</td><td>216.0</td><td>Out</td><td>F2</td><td>1456.5</td><td>Yes</td></tr>
                            <tr><td>840</td><td>59.0</td><td>109.0</td><td>120.0</td><td>216.0</td><td>In</td><td>F</td><td>1456.5</td><td>Yes</td></tr>
                            <tr><td>841</td><td>39.0</td><td>109.0</td><td>120.0</td><td>216.0</td><td>In</td><td>F</td><td>1456.5</td><td>Yes</td></tr>
                            <tr><td>842</td><td>17.0</td><td>109.0</td><td>120.0</td><td>216.0</td><td>Out</td><td>F</td><td>1456.5</td><td>Yes</td></tr>
                            <tr><td>849</td><td>356.0</td><td>109.0</td><td>120.0</td><td>216.0</td><td>Out</td><td>F</td><td>1456.5</td><td>Yes</td></tr>
                        </table>
                        >>>
                        """

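As an alternative to splitting each line by hand, the standard library's csv module can read tab-delimited data when given delimiter='\t'. The sketch below is one possible arrangement, not the method used in the posts above: the function name tsv_to_html_table and the sample text are assumptions, lines starting with '#' are skipped as in the script above, and the first remaining row is treated as the header.

```python
import csv
import html
import io

def tsv_to_html_table(lines):
    """Convert an iterable of tab-delimited lines into an HTML table string."""
    # Skip comment lines before handing the rest to the csv reader.
    data_lines = (ln for ln in lines if not ln.startswith("#"))
    reader = csv.reader(data_lines, delimiter="\t")
    out = ["<table>"]
    first = True
    for row in reader:
        if not row:                      # ignore blank lines
            continue
        tag = "th" if first else "td"    # first non-empty row is the header
        first = False
        cells = "".join(
            "<%s>%s</%s>" % (tag, html.escape(cell), tag) for cell in row
        )
        out.append("    <tr>%s</tr>" % cells)
    out.append("</table>")
    return "\n".join(out)

# Hypothetical in-memory sample standing in for an open file object.
sample = io.StringIO("mem_no\ttoe_dir\n####\n831\tIn\n834\tOut\n")
print(tsv_to_html_table(sample))
```

Because the function accepts any iterable of lines, an open file object can be passed in directly, and the returned string can then be written to the report file.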

                        • Motoma
                          Recognized Expert Specialist
                          • Jan 2007
                          • 3236

                          #13
                          Glad to help.


                          • nitenmistry
                            New Member
                            • Feb 2007
                            • 5

                            #14
                            Thanks everyone, you've been a great help.

                            Regards,

                            Niten
