newb comment request

  • Alexandre

    newb comment request

    Hi,

    I'm a newb to dev and Python... my first self-assigned mission was to read a
    pickled file containing a list with DB-like data and convert this to
    MySQL... So I wrote my first module, which reads this pickled file and writes
    an XML file with the list of tables and fields (... the next step will be the
    module that creates the tables according to the details found in the XML file).

    If anyone has a few minutes to spare, suggestions and comments would be very
    much appreciated to help me make my next modules better... and not start
    with bad habits :)

    Alexandre

    ######################## <EIPyFormatToXML> ########################
    # pickle.loads a file containing database-like data in a Python list ->
    # outputs an XML file with "tables" details which will be used in a future
    # module to build MySQL tables. See the comment blocks at the end of the
    # module for more details.

    import sys
    import pickle
    import types

    infile = open('cached-objects-Python-pickled-sample', 'rb')
    _data = pickle.load(infile)
    infile.close()

    def ExtractTablesFromData(data):
        """Extracts all the table names from the Dumped items data file and
        returns the list."""
        tablesR = []
        for tables in data:  # For each table found in 'data'
            tablesR.append([tables[0]])  # Appends the list with current table name
        return tablesR

    def ExtractFieldNamesFromData(data):
        """Extract all fields from data list (the calling function defines for
        which table in 'data' argument)."""
        fieldsR = []
        for fields in data:
            fieldsR.append([fields])
        return fieldsR

    def ExtractFieldValuesFromData(data, indexField):
        """Check each value of the field passed as argument to the function, it
        returns [valueType, maxLength, maxValue, minValue, floatPrecision,
        NoneValues(bool), sampleValue]."""
        valueType, maxLength, maxValue, minValue, values, floatPrecision, \
            NoneValues = None, None, None, 999999999999, [], None, False
        sampleValue = 'numeric value, check min and max values as sample'
        for valuesD in data:  # for each record...
            value = valuesD[indexField]  # focus value on required field
            if type(value) is not types.NoneType:  # if a value other than None is found
                valueType = type(value)  # valueType is given the type of the current value
            else:  # ... if the value is None
                NoneValues = True  # None values exist for this field in the record list
            if valueType is str:  # if type is string
                minValue = None  # set minValue to None (minValue and maxValue are only for numeric types)
                if len(value) > maxLength:  # if current string length is bigger than maxLength
                    maxLength = len(value)  # Set maxLength value to current string length
                    sampleValue = value  # Sets sampleValue with the longest string found
            else:  # ... if not string type
                if value > maxValue:  # if current value bigger than maxValue
                    maxValue = value  # Sets current value to maxValue
                if value and value < minValue:  # if value is not None AND smaller than minValue
                    minValue = value  # Sets new minValue with current value
                if valueType is float and value != 0:  # if value type is float and not 0
                    precisionTemp = len(str(value - int(value)))-2
                    if precisionTemp > floatPrecision:  # if the current length after decimal point is bigger than previous
                        floatPrecision = precisionTemp  # set current value to precision
        if valueType is float and floatPrecision == None:  # if float could not be determined because only 0.0 values were found
            floatPrecision = 1  # set precision to 1
        if valueType is not float and floatPrecision != None:  # if last value type was not float but some float values were found
            valueType = type(1.234)  # set valueType to float
        if valueType is str and maxLength == 0:  # if value type found only '' (empty) records
            NoneValues = True  # allow null values
        if minValue == 999999999999:  # if minValue was not set
            minValue = None  # then minValue is None
        values[:] = [valueType, maxLength, maxValue, minValue, floatPrecision,
                     NoneValues, sampleValue]
        return values

    def AddFieldsPerTable():
        """Appends field list to each table."""
        tables = ExtractTablesFromData(_data)  # First extract list of tables
        for i, table in enumerate(tables):  # Then for each table in the list
            # get field list ([i] as table index, [1][0] to reach field list)
            fields = ExtractFieldNamesFromData(_data[i][1][0])
            tables[i].append(fields)  # Appends the returned field list to current table
        return tables

    def AddFieldsDetailsPerField():
        """Extend field list with details for each field."""
        tables = AddFieldsPerTable()  # First get table list
        for iTable, table in enumerate(tables):  # Then for each table
            for iField, field in enumerate(table[1]):  # ...for each field in the current table
                # Get field's details ([iTable] as table index, [1][1] to reach
                # records list, iField to focus search on current field)
                values = ExtractFieldValuesFromData(_data[iTable][1][1], iField)
                field.extend(values)  # Extends the tables list with returned field details
        return tables

    def AddNbOfRecordsPerTable():  # Insert number of records per table.
        """Extend 'tables' details with number of records per table."""
        tables = AddFieldsDetailsPerField()  # get tables
        for i, table in enumerate(tables):  # for each table
            nbOfRecords = len(_data[i][1][1])  # get number of records ([i]=table index, [1][1] = record list)
            table.insert(1, nbOfRecords)  # inserts the number of records in tables list
        return tables

    def WriteFileTableFormat(fileName):  # Creates the XML with 'tables' list
        tables = AddNbOfRecordsPerTable()  # get tables detailed list
        f = open(fileName, 'w')
        f.write("""<?xml version="1.0" encoding="ISO-8859-1"?>\n""")
        f.write("<Root>\n")

        for table in tables:
            f.write("\t<table>\n")
            f.write("\t\t<name>%s</name>\n" % table[0])
            f.write("\t\t<nbOfRecords>%s</nbOfRecords>\n" % table[1])
            for field in table[2][:]:
                f.write("\t\t<field>\n")
                f.write("\t\t\t<name>%s</name>\n" % field[0])
                if str(field[1])[:7] == "<type '":
                    field[1] = str(field[1])[7:-2]
                f.write("\t\t\t<pythonType>%s</pythonType>\n" % str(field[1]))
                f.write("\t\t\t<maxLength>%s</maxLength>\n" % str(field[2]))
                f.write("\t\t\t<maxValue>%s</maxValue>\n" % str(field[3]))
                f.write("\t\t\t<minValue>%s</minValue>\n" % str(field[4]))
                f.write("\t\t\t<floatPrecision>%s</floatPrecision>\n" %
                        str(field[5]))
                f.write("\t\t\t<NoneValues>%s</NoneValues>\n" % str(field[6]))
                f.write("\t\t\t<sampleValue>%s</sampleValue>\n" % str(field[7]))
                f.write("\t\t\t<mysqlFieldType></mysqlFieldType>\n")
                f.write("\t\t</field>\n")
            f.write("\t</table>\n")

        f.write("</Root>")
        f.close()  # the call parentheses are required; a bare f.close does nothing

    WriteFileTableFormat('EITablesFormat.xml')

    ############ <Help to understand '_data' structure>
    #
    # [['FirstTableName', (['FirstFieldName', 'nFieldName'],
    #                      [['FirstFieldFirstValue', 'nFieldFirstValue'],
    #                       ['FirstFieldnValue', 'nFieldnValue']])], ['nTableName', (['etc..
    # print _data[0][0]           #[0]=FirstTable, [0]=TableName -> output: 'FirstTableName'
    # print len(_data)            #number of tables in 'data'
    # print _data[0][1]           #[0]=FirstTable, [1]=FieldList And Records
    # print _data[0][1][0]        #[0]=FirstTable, [1]=FieldList, [0]=FieldNames -> output: ['FirstFieldName', 'nFieldName']
    # print len(_data[0][1][0])   #number of fields in first table
    # print _data[0][1][1]        #[0]=FirstTable, [1]=FieldList, [1]=RecordList
    # print len(_data[0][1][1])   #number of records in first table
    # print _data[0][1][1][0][2]  #[0]=firstTable, [1]=FieldList, [1]=RecordList, [0]=First Record, [2]=Third Field Value
    #
    ######################## </Help to understand '_data' structure>


    ############ <Final 'tables' variable format>
    #
    # The final 'tables' format used to build the XML should look like:
    # ([tablename_1, nbOfRecords,
    #   [
    #     [fieldname_1, pythonType, maxLength, maxValue, minValue,
    #      floatPrecision, NoneValues, sampleValue],
    #     [fieldname_n, pythonType, maxLength, maxValue, minValue,
    #      floatPrecision, NoneValues, sampleValue]
    #   ]],
    #  [tablename_n, nbOfRecords,
    #   [
    #     [fieldname_1, ...]
    #   ]
    #  ])
    #
    ######################## </Final 'tables' variable format>
    ######################## </EIPyFormatToXML> ########################


  • Peter Otten

    #2
    Re: newb comment request

    Alexandre wrote:
    > Hi,
    >
    > I'm a newb to dev and Python... my first self-assigned mission was to read
    > a pickled file containing a list with DB-like data and convert this to
    > MySQL... So I wrote my first module, which reads this pickled file and
    > writes an XML file with the list of tables and fields (... the next step
    > will be the module that creates the tables according to the details found
    > in the XML file).
    >
    > If anyone has a few minutes to spare, suggestions and comments would be
    > very much appreciated to help me make my next modules better... and not
    > start with bad habits :)

    I would suggest that you repost the script without the excessive comments.
    Most programmers find Python very readable; your comments actually make it
    harder for the human eye to parse.
    Also, make it work if you can, or point to the actual errors that you cannot
    fix yourself. Provide actual test data in your post instead of the
    unpickling code, so that others can easily reproduce your errors.
    Only then should you ask for improvements.

    Random remarks:

    Object oriented programming is about programming against interfaces, so
    excessive type checking is a strong hint of design errors.

    Avoid using global variables in your functions; rather pass them explicitly
    as arguments.
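
A minimal sketch of the remark about globals, written for Python 3 and assuming the nested `[name, (field_names, records)]` layout described in the original post. The names `sample`, `table_names` and `count_records` are hypothetical and not part of Alexandre's module; the point is only that each function receives its data as an argument instead of reading a module-level `_data`.

```python
# Hypothetical sample in the same nested layout as the original '_data'.
sample = [
    ["Table1", (["Field01", "Field02"], [["a", 1], ["b", None]])],
    ["Table2", (["Field03"], [[0.5]])],
]


def table_names(data):
    """Return the table names found in data (passed in, not read from a global)."""
    return [table[0] for table in data]


def count_records(data):
    """Map each table name to its number of records."""
    return {name: len(records) for name, (_fields, records) in data}


print(table_names(sample))
print(count_records(sample))
```

Because nothing depends on a global, the same functions can be called on any list with this layout, which also makes them easy to test.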

    Where is valuesD introduced?

    "%s" % str(1.23)
    is the same as
    "%s" % 1.23

    type(value) is not types.NoneType
    is the same as
    value is not None
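
The two equivalences above can be checked with a short sketch. It is written for Python 3, where `type(None)` is used in place of `types.NoneType` (an assumption made for portability, since that name is not available in every Python 3 release). The helper names are hypothetical.

```python
def is_set_by_type(value):
    """The original test: compare the value's type with the type of None."""
    return type(value) is not type(None)


def is_set_by_identity(value):
    """The shorter equivalent: None is a singleton, so an identity test suffices."""
    return value is not None


# Both checks agree, including on falsy values such as 0 and ''.
for candidate in (None, 0, "", 3.5, []):
    assert is_set_by_type(candidate) == is_set_by_identity(candidate)

# The % operator already applies str() for %s, so wrapping the value is redundant.
assert "%s" % str(1.23) == "%s" % 1.23
```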

    Module level code is habitually wrapped like so:

    if __name__ == "__main__":
        infile = open('cached-objects-Python-pickled-sample', 'rb')
        _data = pickle.load(infile)
        infile.close()
        # process _data...

    That way, you can import the module as well as use it as a standalone
    script.
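
One possible way to lay out such a module, as a Python 3 sketch rather than the actual script: the work lives in functions, and only the guarded block at the bottom triggers it. The names `load_tables`, `summarize` and `main` are hypothetical, and an in-memory stream with made-up data stands in for the real pickle file so the example is self-contained.

```python
import io
import pickle


def load_tables(stream):
    """Unpickle and return the table list from an open binary stream."""
    return pickle.load(stream)


def summarize(data):
    """Return one 'name: N records' line per table."""
    return ["%s: %d records" % (name, len(records))
            for name, (_fields, records) in data]


def main():
    # Stand-in for the real pickle file: an in-memory stream holding made-up data.
    sample = [["Table1", (["Field01"], [["a"], ["b"]])]]
    stream = io.BytesIO(pickle.dumps(sample))
    for line in summarize(load_tables(stream)):
        print(line)


if __name__ == "__main__":
    # Runs only when the file is executed directly, not when it is imported.
    main()
```

Another module can then import `load_tables` and `summarize` without any file being opened as a side effect.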

    I'm sure there is more, but then again: clean up the comments, try to make
    it work, and then repost.

    Peter


    • Alexandre

      #3
      Re: newb comment request


      "Peter Otten" <__peter__@web.de> wrote in message news:bq4eo0$fho$03$1@news.t-online.com...
      > I would suggest that you repost the script without the excessive comments.
      > Most programmers find Python very readable; your comments actually make it
      > harder for the human eye to parse.

      Done, so your help will also make my next posts better :)

      > Also, make it work if you can, or point to the actual errors that you cannot
      > fix yourself. Provide actual test data in your post instead of the
      > unpickling code, so that others can easily reproduce your errors.
      > Only then should you ask for improvements.

      Also available in my new version :)

      > Random remarks:
      >
      > Object oriented programming is about programming against interfaces, so
      > excessive type checking is a strong hint of design errors.

      I'm not sure I understand... well, let's put it this way: I'm sure I don't understand :)
      I guess I'll have to read about OO design... this is my first program; although I think I understand what OO means, I'm
      not able to write OO yet :/

      > Avoid using global variables in your functions; rather pass them explicitly
      > as arguments.

      I'll try that !

      > Where is valuesD introduced?

      Not sure I understand the question... I used the variable name "valuesD" (meaning Values from Data) so as not to conflict
      with the other variable named "values" in the same function.

      > "%s" % str(1.23)
      > is the same as
      > "%s" % 1.23

      Ok

      > type(value) is not types.NoneType
      > is the same as
      > value is not None

      Shame on me :)

      > Module level code is habitually wrapped like so:
      >
      > if __name__ == "__main__":
      >     infile = open('cached-objects-Python-pickled-sample', 'rb')
      >     _data = pickle.load(infile)
      >     infile.close()
      >     # process _data...
      >
      > That way, you can import the module as well as use it as a standalone
      > script.

      Oh? But I can use it as a standalone script?!
      If I type myScriptName.py at my DOS prompt, the script works!

      > I'm sure there is more, but then again: clean up the comments, try to make
      > it work, and then repost.

      I'm sure there's more :)
      Thanks a lot for your comments Peter !
      Best regards,
      Alexandre



      • Alexandre

        #4
        Re: newb comment request (uncommented code + sample data)

        ######################## <EIPyFormatToXML> ########################
        import sys
        import types

        _data = [['Table1', (['Field01', 'Field02', 'Field03'],
                             [['a string', 12345, 1.000123],
                              ['a second string', None, 3406.3],
                              ['', 64654564, 35]])],
                 ['Table2', (['Field04', 'Field05'], [[None, -0.3]])],
                 ['Table3', (['Field06', 'Field07', 'Field08'],
                             [['', 0, 0.001],
                              ['', None, 646464.0],
                              ['', 6546, 0.1],
                              ['', -6444, 0.2],
                              ['', 0, 0.3]])]]

        def ExtractTablesFromData(data):
            """Extracts all the table names from the Dumped items data file and returns the list."""
            tablesR = []
            for tables in data:
                tablesR.append([tables[0]])
            return tablesR

        def ExtractFieldNamesFromData(data):
            """Extract all fields from data list (the calling function defines for which table in 'data' argument)."""
            fieldsR = []
            for fields in data:
                fieldsR.append([fields])
            return fieldsR

        def ExtractFieldValuesFromData(data, indexField):
            """Check each value of the field passed as argument to the function."""
            values, floatPrecision, NoneValues = [], None, False
            valueType, maxLength, maxValue, minValue = None, None, None, 999999999999
            sampleValue = 'numeric value, check min and max values as sample'
            for valuesD in data:
                value = valuesD[indexField]
                if type(value) is not types.NoneType:
                    valueType = type(value)
                else:
                    NoneValues = True
                if valueType is str:
                    minValue = None
                    if len(value) > maxLength:
                        maxLength = len(value)
                        sampleValue = value
                else:
                    if value > maxValue:
                        maxValue = value
                    if value and value < minValue:
                        minValue = value
                    if valueType is float and value != 0:
                        precisionTemp = len(str(value - int(value)))-2
                        if precisionTemp > floatPrecision:
                            floatPrecision = precisionTemp
            if valueType is float and floatPrecision == None:
                floatPrecision = 1
            if valueType is not float and floatPrecision != None:
                valueType = type(1.234)
            if valueType is str and maxLength == 0:
                NoneValues = True
            if minValue == 999999999999:
                minValue = None
            values[:] = [valueType, maxLength, maxValue, minValue, floatPrecision, NoneValues, sampleValue]
            return values

        def AddFieldsPerTable():
            """Appends field list to each table."""
            tables = ExtractTablesFromData(_data)
            for i, table in enumerate(tables):
                fields = ExtractFieldNamesFromData(_data[i][1][0])
                tables[i].append(fields)
            return tables

        def AddFieldsDetailsPerField():
            """Extend field list with details for each field."""
            tables = AddFieldsPerTable()
            for iTable, table in enumerate(tables):
                for iField, field in enumerate(table[1]):
                    values = ExtractFieldValuesFromData(_data[iTable][1][1], iField)
                    field.extend(values)
            return tables

        def AddNbOfRecordsPerTable():
            """Extend 'tables' details with number of records per table."""
            tables = AddFieldsDetailsPerField()
            for i, table in enumerate(tables):
                nbOfRecords = len(_data[i][1][1])
                table.insert(1, nbOfRecords)
            return tables

        def WriteFileTableFormat(fileName):
            tables = AddNbOfRecordsPerTable()
            f = open(fileName, 'w')
            f.write("""<?xml version="1.0" encoding="ISO-8859-1"?>\n""")
            f.write("<Root>\n")

            for table in tables:
                f.write("\t<table>\n")
                f.write("\t\t<name>%s</name>\n" % table[0])
                f.write("\t\t<nbOfRecords>%s</nbOfRecords>\n" % table[1])
                for field in table[2][:]:
                    f.write("\t\t<field>\n")
                    f.write("\t\t\t<name>%s</name>\n" % field[0])
                    if str(field[1])[:7] == "<type '":
                        field[1] = str(field[1])[7:-2]
                    f.write("\t\t\t<pythonType>%s</pythonType>\n" % str(field[1]))
                    f.write("\t\t\t<maxLength>%s</maxLength>\n" % str(field[2]))
                    f.write("\t\t\t<maxValue>%s</maxValue>\n" % str(field[3]))
                    f.write("\t\t\t<minValue>%s</minValue>\n" % str(field[4]))
                    f.write("\t\t\t<floatPrecision>%s</floatPrecision>\n" % str(field[5]))
                    f.write("\t\t\t<NoneValues>%s</NoneValues>\n" % str(field[6]))
                    f.write("\t\t\t<sampleValue>%s</sampleValue>\n" % str(field[7]))
                    f.write("\t\t\t<mysqlFieldType></mysqlFieldType>\n")
                    f.write("\t\t</field>\n")
                f.write("\t</table>\n")

            f.write("</Root>")
            f.close()  # the call parentheses are required; a bare f.close does nothing

        WriteFileTableFormat('EITablesFormat.xml')

        ######################## </EIPyFormatToXML> ########################



        -> result xml

        <?xml version="1.0" encoding="ISO-8859-1"?>
        <Root>
          <table>
            <name>Table1</name>
            <nbOfRecords>3</nbOfRecords>
            <field>
              <name>Field01</name>
              <pythonType>str</pythonType>
              <maxLength>15</maxLength>
              <maxValue>None</maxValue>
              <minValue>None</minValue>
              <floatPrecision>None</floatPrecision>
              <NoneValues>False</NoneValues>
              <sampleValue>a second string</sampleValue>
              <mysqlFieldType/>
            </field>
            <field>
              <name>Field02</name>
              <pythonType>int</pythonType>
              <maxLength>None</maxLength>
              <maxValue>64654564</maxValue>
              <minValue>12345</minValue>
              <floatPrecision>None</floatPrecision>
              <NoneValues>True</NoneValues>
              <sampleValue>numeric value, check min and max values as sample</sampleValue>
              <mysqlFieldType/>
            </field>
            <field>
              <name>Field03</name>
              <pythonType>float</pythonType>
              <maxLength>None</maxLength>
              <maxValue>3406.3</maxValue>
              <minValue>1.000123</minValue>
              <floatPrecision>6</floatPrecision>
              <NoneValues>False</NoneValues>
              <sampleValue>numeric value, check min and max values as sample</sampleValue>
              <mysqlFieldType/>
            </field>
          </table>
          <table>
            <name>Table2</name>
            <nbOfRecords>1</nbOfRecords>
            <field>
              <name>Field04</name>
              <pythonType>None</pythonType>
              <maxLength>None</maxLength>
              <maxValue>None</maxValue>
              <minValue>None</minValue>
              <floatPrecision>None</floatPrecision>
              <NoneValues>True</NoneValues>
              <sampleValue>numeric value, check min and max values as sample</sampleValue>
              <mysqlFieldType/>
            </field>
            <field>
              <name>Field05</name>
              <pythonType>float</pythonType>
              <maxLength>None</maxLength>
              <maxValue>-0.3</maxValue>
              <minValue>-0.3</minValue>
              <floatPrecision>2</floatPrecision>
              <NoneValues>False</NoneValues>
              <sampleValue>numeric value, check min and max values as sample</sampleValue>
              <mysqlFieldType/>
            </field>
          </table>
          <table>
            <name>Table3</name>
            <nbOfRecords>5</nbOfRecords>
            <field>
              <name>Field06</name>
              <pythonType>str</pythonType>
              <maxLength>0</maxLength>
              <maxValue>None</maxValue>
              <minValue>None</minValue>
              <floatPrecision>None</floatPrecision>
              <NoneValues>True</NoneValues>
              <sampleValue/>
              <mysqlFieldType/>
            </field>
            <field>
              <name>Field07</name>
              <pythonType>int</pythonType>
              <maxLength>None</maxLength>
              <maxValue>6546</maxValue>
              <minValue>-6444</minValue>
              <floatPrecision>None</floatPrecision>
              <NoneValues>True</NoneValues>
              <sampleValue>numeric value, check min and max values as sample</sampleValue>
              <mysqlFieldType/>
            </field>
            <field>
              <name>Field08</name>
              <pythonType>float</pythonType>
              <maxLength>None</maxLength>
              <maxValue>646464.0</maxValue>
              <minValue>0.001</minValue>
              <floatPrecision>3</floatPrecision>
              <NoneValues>False</NoneValues>
              <sampleValue>numeric value, check min and max values as sample</sampleValue>
              <mysqlFieldType/>
            </field>
          </table>
        </Root>



        • Alexandre

          #5
          Re: newb comment request

          > > Object oriented programming is about programming against interfaces, so
          > > excessive type checking is a strong hint of design errors.
          >
          > I'm not sure I understand... well, let's put it this way: I'm sure I don't understand :)
          > I guess I'll have to read about OO design... this is my first program; although I think I understand what OO means,
          > I'm not able to write OO yet :/

          OK... a little shower always helps; now I understand your remark :)
          The thing is, this first module is retrieving '_data' from another app.
          This other app is not written to share '_data' with other apps.

          So my first module should be the only one that deals with type checking, because the data will then be available to
          my app from a DB (... once I've written the second module :)

          Thx again and regards,
          Alexandre



          • Peter Otten

            #6
            Re: newb comment request

            Alexandre wrote:

            [Peter]
            >> Object oriented programming is about programming against interfaces, so
            >> excessive type checking is a strong hint of design errors.

            [Alexandre]
            > I'm not sure I understand... well, let's put it this way: I'm sure I
            > don't understand :) I guess I'll have to read about OO design... this is
            > my first program; although I think I understand what OO means, I'm not
            > able to write OO yet :/

            Type checking is generally best avoided, e.g. if you test for

            isinstance(mystream, file)

            this may fail on mystream objects that have all the methods needed to
            replace a file in subsequent code, and you thus unnecessarily limit its
            usage.
            As your script is explicitly in the "type checking business", my remark was
            a bit off, though.
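
A small Python 3 sketch of the `isinstance` point, under the assumption that the code only ever calls `write()` on the stream. `write_header` and `LineCollector` are hypothetical names invented for this illustration; they do not appear in the script under discussion.

```python
import io


def write_header(stream):
    """Write the XML prolog to any object that offers a write() method."""
    stream.write('<?xml version="1.0" encoding="ISO-8859-1"?>\n')
    stream.write("<Root>\n")


class LineCollector:
    """Hypothetical minimal file-like object: only write() is implemented."""

    def __init__(self):
        self.chunks = []

    def write(self, text):
        self.chunks.append(text)


# Neither object is a file on disk, yet both work because they have write().
buffer = io.StringIO()
write_header(buffer)

collector = LineCollector()
write_header(collector)

print(buffer.getvalue() == "".join(collector.chunks))
```

Had `write_header` started with an `isinstance` test against a concrete file type, both calls would have been rejected even though each object supports everything the function needs.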

            >> Where is valuesD introduced?
            >
            > Not sure I understand the question... I used the variable name "valuesD"
            > (meaning Values from Data) so as not to conflict with the other variable
            > named "values" in the same function.

            I spotted an error where there was none - nothing to understand here :-(

            >> Module level code is habitually wrapped like so:
            >>
            >> if __name__ == "__main__":
            >>     infile = open('cached-objects-Python-pickled-sample', 'rb')
            >>     _data = pickle.load(infile)
            >>     infile.close()
            >>     # process _data...
            >>
            >> That way, you can import the module as well as use it as a standalone
            >> script.
            >
            > Oh? But I can use it as a standalone script?!
            > If I type myScriptName.py at my DOS prompt, the script works!

            Yes, but you can *only* use it as a standalone script. If you import it into
            another module, it will try to read the pickled data from the hard-coded
            file before you can do anything else.
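
The effect described here can be observed with a throwaway module, sketched below for Python 3. The module name `demo_module` and its contents are made up for the demonstration: the file is written to a temporary directory and imported, at which point its unguarded top-level statements run while the guarded block does not.

```python
import importlib
import os
import sys
import tempfile

# Source of a hypothetical module with one unguarded and one guarded statement.
source = (
    "print('top-level code runs on import')\n"
    "loaded = True\n"
    "if __name__ == '__main__':\n"
    "    print('runs only when executed as a script')\n"
)

with tempfile.TemporaryDirectory() as folder:
    with open(os.path.join(folder, "demo_module.py"), "w") as handle:
        handle.write(source)
    importlib.invalidate_caches()
    sys.path.insert(0, folder)
    try:
        # Importing executes the module body; __name__ is 'demo_module' here,
        # so the guarded print is skipped.
        demo_module = importlib.import_module("demo_module")
    finally:
        sys.path.remove(folder)

print(demo_module.__name__)
```

In the original script the unguarded statement is the `open(...)` call on a hard-coded file name, so an import fails outright wherever that file is missing.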


            Peter

