How to link foreign keys & primary keys using python?

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • sonal

    How to link foreign keys & primary keys using python?

    Hi all,
    I hv started with python just recently... and have been assigned to
    make an utility which would be used for data validations...
    In short we take up various comma separated data files
    for eg: area.txt, school.txt, students.txt... . and so on (ok?!?)
    now,
    1. area code used in the school.txt must be defined in the area.txt
    (Primary key in area => area_code defined in area.txt
    & Foreign key on school => area_code defined in school.txt)

    i hv created primary key using the following piece of code:

    # primary key for area.txt (index created on the column AREACODE as per
    the
    # schema defined earlier )
    area_pk = PartitionedPK( name = 'Area PK ',
    save_to =
    '../Index/area_code.idx',
    fields = ['A_AREACODE'] )
    # col name in area schema

    # foreign key is defined as follows...
    school_fk = HashedFK( name = ' School code FK',
    load_from = '../Index/area_code.idx',
    fields = ['S_AREACODE'] )
    # col name in school schema

    Description for abv code:
    1. An index {area_code.idx } is formed on the field AREACODE in the
    area.txt
    (i.e.,A_AREACOD E)
    2. The data values in the S_AREACODE field in the school.txt are
    checked in the index {area_code.idx}
    If the area code given in school.txt is not present in the area
    code, then the record is not validated(as foreign key constraint
    fails.)

    Now if the Primary key is on mutiple columns...the foreign key, which
    is also definedon the same no. of columns works..
    for eg..

    # primary key for school.txt (index created on the columns AREACODE &
    SCHOOLCODE as per the
    # schema defined earlier )
    school_pk = PartitionedPK( name = 'School PK ',
    save_to = '../Index/school.idx',
    fields =
    ['S_AREACODE','S _SCHOOLCODE'] ) # col names in school schema

    # foreign key for students is defined as follows...
    student_fk = HashedFK( name = ' STUDENT code FK',
    load_from = '../Index/student.idx',
    fields =
    ['STUD_AREACODE' ,'STUD_SCHOOLCO DE'] ) # col name in
    student schema

    Now if I hv to define foreign key on student.txt but with only one
    field, school code, so as to make sure that the school name given in
    the student.txt exists in the school.txt whatever be the area code...
    I am unable to do this...

    Say, if the AREACODE field is not present in the student.txt file....
    then???
    I tried using the foll code

    student_fk = HashedFK( name = ' Student FK',
    load_from = '../Index/school.idx',
    fields = ['STUD_SCHOOLCOD E'] )

    Its showing an AttributeError : 'list' object has no attribute
    'has_key'

    I also tried making another index with only SCHOOLCODE field (another
    PK for school.txt) and the foreign key on student.txt to be loaded from
    this index...
    it still shows the same error
    That's may be becoz... one SCHOOLCODE may repeat often, with a
    different AREACODE... so may be we need to index on distinct
    SCHOOLCODES in school.txt...
    how do i do that??

    I hope i am not confusing the genious' ... plz help me...
    till get my hands on python... :)

  • MTD

    #2
    Re: How to link foreign keys & primary keys using python?

    Your post is confusing. Here is my advice: investigate the use of
    dictionaries. Dictionaries can allow you to define data in the form {
    key:data }, e.g.

    { area_code : area_data }

    { (area_code,scho ol_code) : school_data }

    { (school_code,st udent_code) : student_data }

    Comment

    • Harry George

      #3
      Re: How to link foreign keys & primary keys using python?

      "sonal" <sonaldgr8@gmai l.com> writes:
      [color=blue]
      > Hi all,
      > I hv started with python just recently... and have been assigned to
      > make an utility which would be used for data validations...[/color]
      ,snip][color=blue]
      > plz help me... till get my hands on python... :)
      >[/color]

      1. This sure looks like a school assignment.

      2. "till get my..." Do you actually not have python installed yet? It
      is hopeless to tackle this problem if you haven't done a few simple
      "hello, world" tasks.

      3. Do you have a data model? In this case, you need to think
      carefully about what a RDBMS, PK, and FK are. In otherwords, a
      metamodel. Generally, once you understand the data structures you
      should implement them pretty much verbatim in a "model" module.
      Then do reader/writer modules so you can load that model from your
      data sources and dump out to your data sinks.

      4. Do you have a testsuite and test harness? Put together your test
      harness, then develop for the simplest case, then add complexity.
      E.g., no FK, FK with 1 attr, FK with multiple attrs, FKs with
      shared attrs.


      --
      Harry George
      PLM Engineering Architecture

      Comment

      • sonal.patankar@3i-infotech.com

        #4
        Re: How to link foreign keys &amp; primary keys using python?

        Hi Mr. George,

        Let me try it again...

        I am not using any relational database to store the required tables
        with primary keys & foreign keys....

        When I say PRIMARY KEY =>
        1. It means an index is created on the specified fields
        (Out of various fields given in the comma separated txt file)
        FileFormat: CODE, FIRST_NAME, last_name, area_of_experti se, country
        Eg: A1,Harry,George , python, XYZCOUNTRY--------(1st record)

        2. The index can be formed on a single field or on multiple fields
        Eg: a. 'CODE' (single field ) {pk_code}
        b. 'CODE' & 'NAME' (multiple fields ) {pk_code_fname}

        Now when I say FOREIGN KEY =>
        1. If the foreign Key is formed on the field 'CODE' in another text
        file
        Format: subsriber_code, CODE,first_name , no_of_posts,act ive(Y/N)
        Eg: SUB_001, A1, Harry, 50, Y

        This means the CODE (A1) given here is checked in the index formed
        above
        with primary key: pk_code...

        2. If the foreign Key is formed on the fields 'CODE' & 'FIRST_NAME'
        Format: subsriber_code, CODE,FIRST_NAME , no_of_posts,act ive(Y/N)
        Eg: SUB_001, A1, Harry, 50, Y

        This means the CODE (A1) & FIRST_NAME (Harry) given here
        are checked in the index formed above with primary key:
        pk_code_fname.. .

        I am done till here.....

        The problem starts if I have defined an index on multiple fields
        (composite PK)
        say: CODE & FIRST_NAME (pk_code_fname)
        and if I need to define a FK on a single field out of these
        say: CODE

        I am unable to do that...
        Mr. George, I thought i must explain the code i am dealin with,
        for better understanding.. , but i am sorry i confused you all the more
        (incase, u want to view the code please refer to the code snippets in
        my first query posted)

        I hope you atleast get an idea of what i am hunting for.... :(

        Comment

        • sonal.patankar@3i-infotech.com

          #5
          Re: How to link foreign keys &amp; primary keys using python?


          MTD wrote:[color=blue]
          > Your post is confusing. Here is my advice: investigate the use of
          > dictionaries. Dictionaries can allow you to define data in the form {
          > key:data }, e.g.
          >
          > { area_code : area_data }
          >
          > { (area_code,scho ol_code) : school_data }
          >
          > { (school_code,st udent_code) : student_data }[/color]

          Thanx Mr. Marc...
          I am surely investigating the dictionaries... but the problem is that
          I have to use the existing code...
          and thas what is creating problems for me... :(

          Comment

          • sonal

            #6
            Re: How to link foreign keys &amp; primary keys using python?

            Hi Mr. George,
            Sorry for confusing u so much...
            Let me try it again...

            I am not using any relational database to store the required tables
            with primary keys & foreign keys....

            When I say PRIMARY KEY =>
            1. It means an index is created on the specified fields
            (Out of various fields given in the comma separated txt file)
            FileFormat: CODE, FIRST_NAME, last_name, area_of_experti se, country
            Eg: A1,Harry,George , python, XYZCOUNTRY--------(1st record)

            2. The index can be formed on a single field or on multiple fields
            Eg: a. 'CODE' (single field ) {pk_code}
            b. 'CODE' & 'NAME' (multiple fields ) {pk_code_fname}

            Now when I say FOREIGN KEY =>
            1. If the foreign Key is formed on the field 'CODE' in another text
            file
            Format: subsriber_code, CODE,first_name , no_of_posts,act ive(Y/N)
            Eg: SUB_001, A1, Harry, 50, Y

            This means the CODE (A1) given here is checked in the index formed
            above
            with primary key: pk_code...

            2. If the foreign Key is formed on the fields 'CODE' & 'FIRST_NAME'
            Format: subsriber_code, CODE,FIRST_NAME , no_of_posts,act ive(Y/N)
            Eg: SUB_001, A1, Harry, 50, Y

            This means the CODE (A1) & FIRST_NAME (Harry) given here
            are checked in the index formed above with primary key:
            pk_code_fname.. .

            I am done till here.....

            The problem starts if I have defined an index on multiple fields
            (composite PK)
            say: CODE & FIRST_NAME (pk_code_fname)
            and if I need to define a FK on a single field out of these
            say: CODE

            I am unable to do that...
            Mr. George, I thought i must explain the code i am dealin with,
            for better understanding.. , but i am sorry i confused you all the more
            (incase, u want to view the code please refer to the code snippets in
            my first query posted)

            Thanks & regards,
            sonal

            Comment

            • sonal

              #7
              Re: How to link foreign keys &amp; primary keys using python?


              MTD wrote:[color=blue]
              > Your post is confusing. Here is my advice: investigate the use of
              > dictionaries. Dictionaries can allow you to define data in the form {
              > key:data }, e.g.
              >
              > { area_code : area_data }
              >
              > { (area_code,scho ol_code) : school_data }
              >
              > { (school_code,st udent_code) : student_data }[/color]

              Thanx Mr. Marc...
              I am surely investigating the dictionaries... but the problem is that
              I have to use the existing code...
              and thats what is creating problems for me... :(

              Comment

              • Steve Holden

                #8
                Re: How to link foreign keys &amp; primary keys using python?

                sonal.patankar@ 3i-infotech.com wrote:[color=blue]
                > Hi Mr. George,
                >
                > Let me try it again...
                >
                > I am not using any relational database to store the required tables
                > with primary keys & foreign keys....
                >[/color]
                None the less, you are using relational database terminology. If you
                want people to understand you then you should use it correctly or
                explain yourself in other ways ...
                [color=blue]
                > When I say PRIMARY KEY =>
                > 1. It means an index is created on the specified fields
                > (Out of various fields given in the comma separated txt file)
                > FileFormat: CODE, FIRST_NAME, last_name, area_of_experti se, country
                > Eg: A1,Harry,George , python, XYZCOUNTRY--------(1st record)
                >
                > 2. The index can be formed on a single field or on multiple fields
                > Eg: a. 'CODE' (single field ) {pk_code}
                > b. 'CODE' & 'NAME' (multiple fields ) {pk_code_fname}
                >
                > Now when I say FOREIGN KEY =>
                > 1. If the foreign Key is formed on the field 'CODE' in another text
                > file
                > Format: subsriber_code, CODE,first_name , no_of_posts,act ive(Y/N)
                > Eg: SUB_001, A1, Harry, 50, Y
                >
                > This means the CODE (A1) given here is checked in the index formed
                > above
                > with primary key: pk_code...
                >
                > 2. If the foreign Key is formed on the fields 'CODE' & 'FIRST_NAME'
                > Format: subsriber_code, CODE,FIRST_NAME , no_of_posts,act ive(Y/N)
                > Eg: SUB_001, A1, Harry, 50, Y
                >
                > This means the CODE (A1) & FIRST_NAME (Harry) given here
                > are checked in the index formed above with primary key:
                > pk_code_fname.. .
                >
                > I am done till here.....
                >
                > The problem starts if I have defined an index on multiple fields
                > (composite PK)
                > say: CODE & FIRST_NAME (pk_code_fname)
                > and if I need to define a FK on a single field out of these
                > say: CODE
                >
                > I am unable to do that...
                > Mr. George, I thought i must explain the code i am dealin with,
                > for better understanding.. , but i am sorry i confused you all the more
                > (incase, u want to view the code please refer to the code snippets in
                > my first query posted)
                >
                > I hope you atleast get an idea of what i am hunting for.... :(
                >[/color]
                You say you have to use existing code, which is a pity because it seems
                that code is organised in such a way as to make your problem difficult.

                The only ways to locate information by a *portion* of the primary key,
                given that you appear to be using tuples as primary key values, is
                either to maintain a separate index on the partial key (in other words
                have a lookup table that holds the partial key values).

                Note, also, that you run a risk here: there is no guarantee that only
                one record will have any given partial key value.

                The whole point of foreign keys in databases is that they are *complete*
                primary key values, and can therefore refer at at most one occurrence of
                the referenced entity. It is not correct to refer to a reference to a
                partial primary key as a "foreign key", precisely because all guarantees
                of uniqueness are lost.

                It seems that you are not a native English speaker, so I am sorry if
                this is leading to further problems.

                The more simply you can explain yourself the easier it will be to help!

                regards
                Steve
                --
                Steve Holden +44 150 684 7255 +1 800 494 3119
                Holden Web LLC/Ltd http://www.holdenweb.com
                Love me, love my blog http://holdenweb.blogspot.com
                Recent Ramblings http://del.icio.us/steve.holden

                Comment

                • Steve Holden

                  #9
                  Re: How to link foreign keys &amp; primary keys using python?

                  sonal wrote:[color=blue]
                  > Hi Mr. George,
                  > Sorry for confusing u so much...
                  > Let me try it again...
                  >
                  > I am not using any relational database to store the required tables
                  > with primary keys & foreign keys....
                  >
                  > When I say PRIMARY KEY =>
                  > 1. It means an index is created on the specified fields
                  > (Out of various fields given in the comma separated txt file)
                  > FileFormat: CODE, FIRST_NAME, last_name, area_of_experti se, country
                  > Eg: A1,Harry,George , python, XYZCOUNTRY--------(1st record)
                  >
                  > 2. The index can be formed on a single field or on multiple fields
                  > Eg: a. 'CODE' (single field ) {pk_code}
                  > b. 'CODE' & 'NAME' (multiple fields ) {pk_code_fname}
                  >[/color]
                  What, in Python, *are* these indexes - lists, dictionaries, tuples or
                  something else?
                  [color=blue]
                  > Now when I say FOREIGN KEY =>
                  > 1. If the foreign Key is formed on the field 'CODE' in another text
                  > file
                  > Format: subsriber_code, CODE,first_name , no_of_posts,act ive(Y/N)
                  > Eg: SUB_001, A1, Harry, 50, Y
                  >
                  > This means the CODE (A1) given here is checked in the index formed
                  > above
                  > with primary key: pk_code...
                  >
                  > 2. If the foreign Key is formed on the fields 'CODE' & 'FIRST_NAME'
                  > Format: subsriber_code, CODE,FIRST_NAME , no_of_posts,act ive(Y/N)
                  > Eg: SUB_001, A1, Harry, 50, Y
                  >
                  > This means the CODE (A1) & FIRST_NAME (Harry) given here
                  > are checked in the index formed above with primary key:
                  > pk_code_fname.. .
                  >
                  > I am done till here.....
                  >
                  > The problem starts if I have defined an index on multiple fields
                  > (composite PK)
                  > say: CODE & FIRST_NAME (pk_code_fname)
                  > and if I need to define a FK on a single field out of these
                  > say: CODE
                  >
                  > I am unable to do that...
                  > Mr. George, I thought i must explain the code i am dealin with,
                  > for better understanding.. , but i am sorry i confused you all the more
                  > (incase, u want to view the code please refer to the code snippets in
                  > my first query posted)
                  >
                  > Thanks & regards,
                  > sonal
                  >[/color]
                  A lot depends on the data structure that the code uses to represent the
                  "indexes". Perhaps you could explain how things are looked up in them?

                  regards
                  Steve
                  --
                  Steve Holden +44 150 684 7255 +1 800 494 3119
                  Holden Web LLC/Ltd http://www.holdenweb.com
                  Love me, love my blog http://holdenweb.blogspot.com
                  Recent Ramblings http://del.icio.us/steve.holden

                  Comment

                  • sonal

                    #10
                    Re: How to link foreign keys &amp; primary keys using python?

                    Hi Mr. Steve,
                    The *indexes* i am using are lists...
                    The code for creation of the PartionedPK is given below...
                    *************** *************** *************** *************** *************** ****
                    class PartitionedPK(o bject):
                    def __init__(self, name, save_to, fields):
                    self.name = name
                    self.idx_name = save_to
                    self.part_name = save_to + PARTITION_SUFFI X
                    self.fields = fields
                    self.digester = sha.new

                    def setup(self, schema):
                    self.partitions = [[] for i in range(256)]
                    self.flush_pos = [[] for i in range(256)]
                    self.flush_coun t = 0
                    self.index = {}
                    self.offsets = field_offsets(s elf.fields, schema)
                    if not self.offsets:
                    raise ValueError('One or more index field names are
                    invalid')

                    self.idx_file = open(self.idx_n ame, 'wb+')
                    self.part_file = open(self.part_ name, 'wb+')

                    def save(self):
                    pickle.dump(sel f.flush_count, self.part_file, -1)
                    pickle.dump(sel f.flush_pos, self.part_file, -1)
                    self.idx_file.c lose()
                    self.part_file. close()

                    def flush(self):
                    self.flush_coun t += 1
                    for i in range(256):
                    self.flush_pos[i].append(self.id x_file.tell())
                    pickle.dump(sel f.partitions[i], self.idx_file, -1)
                    self.partitions[i] = []

                    def valid(self, record, data):
                    key = self.digester(' '.join( [data[i] for i in self.offsets]
                    )).digest()
                    self.partitions[ ord(key[0]) ].append( (key, record) )
                    # defer checking till later
                    return True

                    def finalize(self):
                    self.flush()
                    errors = []
                    for bin in range(256):
                    #show('Checking %s, bin %d/256 ... ' % (self.name, bin))
                    seen = {}
                    has = seen.has_key
                    for flush in range(self.flus h_count):
                    self.idx_file.s eek( self.flush_pos[bin][flush] )
                    records = pickle.load(sel f.idx_file)
                    for key, value in records:
                    if has(key):
                    errors.append(v alue)
                    else:
                    seen[key] = value
                    return errors
                    *************** *************** *************** *************** *************** ****
                    the PK definition is as follows:
                    vol_pk = PartitionedPK( name = 'VOL_PK',
                    save_to = '../Index/vol.idx',
                    fields = ['ID','Type','Cu rr_Code','Tenor '])

                    The code for the Foreign Key declaration (referencing to the indexes)
                    is as given below...
                    *************** *************** *************** *************** *************** ****
                    class HashedFK(object ):
                    def __init__(self, name, load_from, fields):
                    self.name = name
                    self.filename = load_from
                    self.fields = fields
                    self.digester = sha.new

                    def setup(self, schema):
                    self.index = {}
                    self.offsets = field_offsets(s elf.fields, schema)

                    if not self.offsets:
                    raise ValueError('One or more index field names are
                    invalid')

                    file = open(self.filen ame, 'rb+')
                    self.index = pickle.load(fil e)
                    file.close()

                    def valid(self, record, fields):
                    key = self.digester(' '.join( [fields[i] for i in self.offsets]
                    )).digest()

                    return self.index.has_ key(key)

                    def flush(self):
                    pass

                    def finalize(self):
                    return None
                    *************** *************** *************** *************** *************** ****
                    the FK definition is as follows:
                    vol_fk = HashedFK( name = ' VOL_FK,
                    load_from = '../Index/vol.idx',
                    fields= ['ID','Type','Cu rr_Code','Tenor '])

                    The code is working fine when the foreign key is referenced to the
                    complete primary key

                    But if the FK were to be on only 'ID'
                    the FK defn would have been like =>

                    vol_fk = HashedFK( name = ' VOL_FK,
                    load_from = '../Index/vol.idx',
                    fields = ['ID'] )
                    This is were the problem lies...
                    it shows AttributeError: 'list' object has no attribute 'has_key'

                    I have also tried defining another PK with a single field as follows =>
                    Tvol_pk = PartitionedPK( name = 'TVOL_PK',
                    save_to = '../Index/tvol.idx',
                    fields = ['ID'] )
                    The index ''tvol.idx'' is being created at the given location(path
                    specified)
                    but referencing to this index(i.e., tvol.idx) with the vol_fk given
                    above also gives
                    the same error.

                    Comment

                    Working...