optimum datatype for primary key column O9i

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • Sanjay Minni

    optimum datatype for primary key column O9i

    What is the datatype to be used
    for Primary Key columns for most
    optimised access given that

    - There will be a single column primary key only
    - The values will only be integers (but as
    strings) at least 12 digits (characters) long
    - all positions will be occupied (no leading 0's)

    - Tables may have upto 1m+ rows
    - There will be lot of foriegn key references
    and master-child access

    Please let me have any suggestions

    Sanjay Minni

  • hrishy

    #2
    Re: optimum datatype for primary key column O9i

    Hi

    You can use varchar2...but i suggest you dont code any intelligence in
    these keys...use dumb keys like this

    My_dumb_key My_key
    1 A12
    2 A2


    now make my_dumb_key as your primary which is a number populated by a
    sequence or something..

    I think this is the best.You can make My_key unique..

    Read books on data modelling they will have detailed explanation

    regards
    Hrishy

    Comment

    • Sanjay Minni

      #3
      Re: optimum datatype for primary key column O9i

      What I meant was that specifically to Oracle 8i/9i,
      internal data storage methods and data & index structures
      what will be the optimum datatype to be used
      [for 12 digit key lengths] to make best use
      of space [data & index] and speed of access [joins etc].

      The effect will cascade as most of the primary key columns
      will be duplicated as foriegn keys in other tables
      and invariably large joins will be used

      Sanjay Minni

      Comment

      • sybrandb@yahoo.com

        #4
        Re: optimum datatype for primary key column O9i

        sminni_SPAM_DIV ERTER@planage.c om (Sanjay Minni) wrote in message news:<4fe109d.0 310271823.58b3b 2c8@posting.goo gle.com>...
        What I meant was that specifically to Oracle 8i/9i,
        internal data storage methods and data & index structures
        what will be the optimum datatype to be used
        [for 12 digit key lengths] to make best use
        of space [data & index] and speed of access [joins etc].
        >
        The effect will cascade as most of the primary key columns
        will be duplicated as foriegn keys in other tables
        and invariably large joins will be used
        >
        Sanjay Minni
        www.planage.com
        If your key is numeric, one of the most stupid moves you can make is
        storing them in varchar2s. Numeric columns are packed, varchar2
        columns aren't. Also you are likely to end up with implicit conversion
        issues if someone mistakenly writes queries like this one
        key_column = 1234.
        This will be automatically converted by Oracle to
        to_number(key_c olumn) = 1234
        and consequently the primary key index will not be used.

        Regards
        Sybrand Bakker
        Senior Oracle DBA

        Comment

        • Sanjay Minni

          #5
          Re: optimum datatype for primary key column O9i

          sybrandb@yahoo. com wrote in message news:<a1d154f4. 0310280118.1f47 d7a2@posting.go ogle.com>...
          sminni_SPAM_DIV ERTER@planage.c om (Sanjay Minni) wrote in message news:<4fe109d.0 310271823.58b3b 2c8@posting.goo gle.com>...
          [snip...]what will be the optimum datatype to be used
          [for 12 digit key lengths] to make best use
          of space [data & index] and speed of access [joins etc].
          >
          If your key is numeric, one of the most stupid moves you can make is
          storing them in varchar2s. Numeric columns are packed, varchar2
          columns aren't. Also you are likely to end up with implicit conversion
          issues if someone mistakenly writes queries like this one
          key_column = 1234.
          This will be automatically converted by Oracle to
          to_number(key_c olumn) = 1234
          and consequently the primary key index will not be used.
          >
          good, so once again, assuming I need around 12 digit long values
          in primary keys what is the best datatype to use to optimise
          Primary keys, indexes, foriegn keys, joins and matches
          i.e. "=" , ">"...

          will I gain anything if I reduce length to
          (say minimum 10 digit positions) or instead
          use fixed length character datatypes on 12 positions

          Actually we are generating key values and padding
          with a leading (number) value to make up 12 positions
          We can go minimum 10 positions if it really leads
          to significant savings in any manner

          Sanjay

          Sanjay Minni

          Comment

          • mcstock

            #6
            Re: optimum datatype for primary key column O9i

            to illustrate, VARCHAR2 is variable length, so no padding is used in storage
            (although a length byte or 2 is required)

            however, most number in a VARCHAR2 requires more storage than the same value
            in a NUMBER column, since the NUMBER column stores the significant digits
            and precision, vs each of the literal digits:

            SQLcreate table tbl1 (
            2 as_varchar2 varchar2(12)
            3 , as_number number(12)
            4 );

            Table created.

            SQLinsert into tbl1 values( '20000000', 20000000 );
            SQLinsert into tbl1 values ('12345678', 12345678 );

            SQLselect as_number, vsize(as_varcha r2), vsize(as_number ) from tbl1;
            ....

            AS_NUMBER VSIZE(AS_VARCHA R2) VSIZE(AS_NUMBER )
            ---------- ------------------ ----------------
            20000000 8 2
            12345678 8 5



            so, real simple:

            make PK columns NUMBER whenever possible -- with our without a maximum
            precision

            always declare the FK column(s) with the exact same datatype and precision
            as the referenced PK column(s)

            assign the PK value with an Oracle SEQUENCE object


            --
            ----------------------------------------
            Mark C. Stock

            (888) 512-2048


            "Sanjay Minni" <sminni_SPAM_DI VERTER@planage. comwrote in message
            news:4fe109d.03 10280846.583313 5d@posting.goog le.com...
            sybrandb@yahoo. com wrote in message
            news:<a1d154f4. 0310280118.1f47 d7a2@posting.go ogle.com>...
            sminni_SPAM_DIV ERTER@planage.c om (Sanjay Minni) wrote in message
            news:<4fe109d.0 310271823.58b3b 2c8@posting.goo gle.com>...
            [snip...]what will be the optimum datatype to be used
            [for 12 digit key lengths] to make best use
            of space [data & index] and speed of access [joins etc].
            If your key is numeric, one of the most stupid moves you can make is
            storing them in varchar2s. Numeric columns are packed, varchar2
            columns aren't. Also you are likely to end up with implicit conversion
            issues if someone mistakenly writes queries like this one
            key_column = 1234.
            This will be automatically converted by Oracle to
            to_number(key_c olumn) = 1234
            and consequently the primary key index will not be used.
            >
            good, so once again, assuming I need around 12 digit long values
            in primary keys what is the best datatype to use to optimise
            Primary keys, indexes, foriegn keys, joins and matches
            i.e. "=" , ">"...
            >
            will I gain anything if I reduce length to
            (say minimum 10 digit positions) or instead
            use fixed length character datatypes on 12 positions
            >
            Actually we are generating key values and padding
            with a leading (number) value to make up 12 positions
            We can go minimum 10 positions if it really leads
            to significant savings in any manner
            >
            Sanjay
            >
            Sanjay Minni

            Comment

            • Sanjay Minni

              #7
              Re: optimum datatype for primary key column O9i

              make PK columns NUMBER whenever possible -- with our without a maximum
              precision
              >
              always declare the FK column(s) with the exact same datatype and precision
              as the referenced PK column(s)
              >
              assign the PK value with an Oracle SEQUENCE object
              Mark C. Stock
              www.enquery.com
              Thanks,
              Now as I need only
              - integer values and of
              - 12 digit positions all filled (i.e. maximum 1 or 2 leading zeroes)

              in an attempt to optimise
              I will declare as NUMBER(12,0)

              Q. Is the storage pattern same if the number is shorter say NUMBER(10,2)
              except off course the saving of say one byte per value
              specially in indexes (I can even go shorter if required)

              and ...

              Q is there any overheads in joins when using NUMBER datatype
              The highest use of the primary key will be in Joins
              and the implicit internal index accesses and comparisions
              in joins.

              There will hardly be any other use of the primary key values

              Regards
              Sanjay Minni

              Comment

              • sybrandb@yahoo.com

                #8
                Re: optimum datatype for primary key column O9i

                sminni_SPAM_DIV ERTER@planage.c om (Sanjay Minni) wrote in message news:<4fe109d.0 310282228.7358f 8c4@posting.goo gle.com>...
                make PK columns NUMBER whenever possible -- with our without a maximum
                precision

                always declare the FK column(s) with the exact same datatype and precision
                as the referenced PK column(s)

                assign the PK value with an Oracle SEQUENCE object
                >
                Mark C. Stock
                www.enquery.com
                >
                Thanks,
                Now as I need only
                - integer values and of
                - 12 digit positions all filled (i.e. maximum 1 or 2 leading zeroes)
                >
                in an attempt to optimise
                I will declare as NUMBER(12,0)
                >
                Q. Is the storage pattern same if the number is shorter say NUMBER(10,2)
                except off course the saving of say one byte per value
                specially in indexes (I can even go shorter if required)
                >
                and ...
                >
                Q is there any overheads in joins when using NUMBER datatype
                The highest use of the primary key will be in Joins
                and the implicit internal index accesses and comparisions
                in joins.
                >
                There will hardly be any other use of the primary key values
                >
                Regards
                Sanjay Minni
                You DON'T NEED leading zeroes, and you can't store them in a number
                datatype (which is good, as leading zeroes is a *display* property).
                Number(10,2) means you have 10 positions, of which 2 are used as
                fraction.
                There is NO overhead in using a NUMBER datatype in joins!!
                There is overhead in using VARCHAR2s (which you already demonstrate as
                you seem to feel compelled mistakenly to pad them)

                Could you please brush up your manual reading skills? It's all there.

                Sybrand Bakker
                Senior Oracle DBA

                Comment

                • Max Pieh

                  #9
                  Re: optimum datatype for primary key column O9i

                  Thanks,
                  Now as I need only
                  - integer values and of
                  - 12 digit positions all filled (i.e. maximum 1 or 2 leading zeroes)

                  in an attempt to optimise
                  I will declare as NUMBER(12,0)

                  Q. Is the storage pattern same if the number is shorter say NUMBER(10,2)
                  except off course the saving of say one byte per value
                  specially in indexes (I can even go shorter if required)

                  and ...

                  Q is there any overheads in joins when using NUMBER datatype
                  The highest use of the primary key will be in Joins
                  and the implicit internal index accesses and comparisions
                  in joins.

                  There will hardly be any other use of the primary key values

                  Regards
                  Sanjay Minni
                  >
                  You DON'T NEED leading zeroes, and you can't store them in a number
                  datatype (which is good, as leading zeroes is a *display* property).
                  Number(10,2) means you have 10 positions, of which 2 are used as
                  fraction.
                  There is NO overhead in using a NUMBER datatype in joins!!
                  There is overhead in using VARCHAR2s (which you already demonstrate as
                  you seem to feel compelled mistakenly to pad them)
                  >
                  Could you please brush up your manual reading skills? It's all there.
                  >
                  Sybrand Bakker
                  Senior Oracle DBA

                  How can anyone give such a rude and incompetent answer to a sensible
                  question?

                  Nobody claimed that leading zeroes can be stored in a number datatype.

                  The question is:
                  Does declaring a column as number(10,0) make it need less memory
                  storage than declaring it simply as number?

                  Are search operations which use the index faster when the index column
                  is declared as number(10,0) as compared to when it is declared as
                  number?

                  regards,
                  Max

                  Comment

                  • Frank

                    #10
                    Re: optimum datatype for primary key column O9i

                    Max Pieh wrote:
                    >>>Thanks,
                    >>>Now as I need only
                    >> - integer values and of
                    >> - 12 digit positions all filled (i.e. maximum 1 or 2 leading zeroes)
                    >>>
                    >>>in an attempt to optimise
                    >>>I will declare as NUMBER(12,0)
                    >>>
                    >>>Q. Is the storage pattern same if the number is shorter say NUMBER(10,2)
                    >> except off course the saving of say one byte per value
                    >> specially in indexes (I can even go shorter if required)
                    >>>
                    >>>and ...
                    >>>
                    >>>Q is there any overheads in joins when using NUMBER datatype
                    >> The highest use of the primary key will be in Joins
                    >> and the implicit internal index accesses and comparisions
                    >> in joins.
                    >>>
                    >> There will hardly be any other use of the primary key values
                    >>>
                    >>>Regards
                    >>>Sanjay Minni
                    >>
                    >>You DON'T NEED leading zeroes, and you can't store them in a number
                    >>datatype (which is good, as leading zeroes is a *display* property).
                    >>Number(10,2 ) means you have 10 positions, of which 2 are used as
                    >>fraction.
                    >>There is NO overhead in using a NUMBER datatype in joins!!
                    >>There is overhead in using VARCHAR2s (which you already demonstrate as
                    >>you seem to feel compelled mistakenly to pad them)
                    >>
                    >>Could you please brush up your manual reading skills? It's all there.
                    >>
                    >>Sybrand Bakker
                    >>Senior Oracle DBA
                    >
                    >
                    >
                    How can anyone give such a rude and incompetent answer to a sensible
                    question?
                    >
                    Nobody claimed that leading zeroes can be stored in a number datatype.
                    >
                    The question is:
                    Does declaring a column as number(10,0) make it need less memory
                    storage than declaring it simply as number?
                    >
                    Are search operations which use the index faster when the index column
                    is declared as number(10,0) as compared to when it is declared as
                    number?
                    >
                    regards,
                    Max
                    Because it's a non-issue. All the answers are there (OK, OK - some
                    coding was actually left to the OP).

                    Give a man something to eat, and he will be hungry tomorrow,
                    teach him how to fidh, and he'll never be hungry again.
                    Or words the like.

                    Generally, we're not here to hold hands...

                    --
                    Regards, Frank van Bortel

                    Comment

                    • Sanjay Minni

                      #11
                      Re: optimum datatype for primary key column O9i

                      How can anyone give such a rude and incompetent answer to a sensible
                      question?
                      >
                      Nobody claimed that leading zeroes can be stored in a number datatype.
                      >
                      The question is:
                      Does declaring a column as number(10,0) make it need less memory
                      storage than declaring it simply as number?
                      >
                      Are search operations which use the index faster when the index column
                      is declared as number(10,0) as compared to when it is declared as
                      number? [...<added>or as any other type whatsoever given you are using
                      only digits for values]
                      regards,
                      Max
                      Max you have summarised to the exact question which I was trying to
                      pose
                      all along, I have just added the note in [...] for further
                      clarification

                      Of the above, the second issue on joins is vital specially as there
                      are large tables and large joins. A typical SQL for a complex report
                      in a substantially normalised database can be a nightmare and
                      every bit of optimisation helps (sometimes we have over 10 joins
                      and at least 2-3 tables have over 1,000,000 rows, while other
                      hover around 100,000 rows). In fact that is where we use Oracle

                      Space is not the issue, only its implication in speed is, Infact all
                      focus is on optimisation in SQL for Selects with large joins, rows
                      sets
                      and possible group by clauses

                      In response to the observations in the other posts, i would like to
                      state that I have been thru the manuals but the could not find
                      sufficient material to the issues as summarised above, specially the
                      second.

                      I suppose "Senior Oracle DBA's" have a right to be ...

                      Comment

                      • Jug Ray

                        #12
                        Re: optimum datatype for primary key column O9i

                        Frank <fvanbortel@net scape.netwrote in message news:<bnpdbo$nd q$1@news1.tilbu 1.nb.home.nl>.. .
                        Because it's a non-issue. All the answers are there (OK, OK - some
                        coding was actually left to the OP).
                        >
                        Give a man something to eat, and he will be hungry tomorrow,
                        teach him how to fidh, and he'll never be hungry again.
                        Or words the like.
                        >
                        Generally, we're not here to hold hands...
                        is the datatype an issue in indexes or not
                        i am looking also for an answer to that one
                        for a long time, tell me where it is in the manual
                        chapter, verses, book of ... as you seem to speak
                        for the group at large

                        what are you trying to say in "Give a man..."
                        I know its all there but can make sense of what you say

                        Jug

                        Comment

                        • mcstock

                          #13
                          Re: optimum datatype for primary key column O9i

                          Sanjay,

                          You're focusing much to much attention on what really is a non-issue. For
                          many years, any type of NUMBER datatype has been the recommended datatype
                          for PKs. Overhead of NUMBER(10) vs NUMBER vs NUMBER(10,0) etc, etc. is in
                          reality not relevant.

                          What has a far greater impact on performance are 1) poorly written PL/SQL
                          (typically excessive loops that spawn excessive recursive SQL). 2)poorly
                          indexed tables and 3) poorly written SQL (partial joins, joins to multiple
                          tables rather than natural PK/FK joins, columns wrapped in expressions).

                          Use NUMBER datatypes, constrain them if you like, and move on to some more
                          important issues.

                          -- MCS

                          "Sanjay Minni" <sminni_SPAM_DI VERTER@planage. comwrote in message
                          news:4fe109d.03 10300422.4adf55 40@posting.goog le.com...
                          How can anyone give such a rude and incompetent answer to a sensible
                          question?

                          Nobody claimed that leading zeroes can be stored in a number datatype.

                          The question is:
                          Does declaring a column as number(10,0) make it need less memory
                          storage than declaring it simply as number?

                          Are search operations which use the index faster when the index column
                          is declared as number(10,0) as compared to when it is declared as
                          number? [...<added>or as any other type whatsoever given you are using
                          only digits for values]
                          regards,
                          Max
                          >
                          Max you have summarised to the exact question which I was trying to
                          pose
                          all along, I have just added the note in [...] for further
                          clarification
                          >
                          Of the above, the second issue on joins is vital specially as there
                          are large tables and large joins. A typical SQL for a complex report
                          in a substantially normalised database can be a nightmare and
                          every bit of optimisation helps (sometimes we have over 10 joins
                          and at least 2-3 tables have over 1,000,000 rows, while other
                          hover around 100,000 rows). In fact that is where we use Oracle
                          >
                          Space is not the issue, only its implication in speed is, Infact all
                          focus is on optimisation in SQL for Selects with large joins, rows
                          sets
                          and possible group by clauses
                          >
                          In response to the observations in the other posts, i would like to
                          state that I have been thru the manuals but the could not find
                          sufficient material to the issues as summarised above, specially the
                          second.
                          >
                          I suppose "Senior Oracle DBA's" have a right to be ...

                          Comment

                          • mcstock

                            #14
                            Re: optimum datatype for primary key column O9i

                            jug, no one speaks for 'the group', though some do like to speak louder.

                            datatype is not an issue in the strictest sense

                            using updatable values in primary keys, or values that have a meaning beyond
                            identification, are issues

                            creating an index on a DNAME column in a departments table (a real table,
                            not Scott's 4-row table) is not an issue if the index is being created to
                            support searches on DNAME or to support uniqueness

                            however, if i use DNAME as a PK (since my analysis tells me it has to be
                            unique) instead of a system-assigned, or even a user-assigned number i will
                            have the following issues:

                            [_] larger values in the index
                            [_] potential updates as department names change
                            [_] cascading updates to other tables with FK's referencing the updated PK
                            [_] potential table structure revisions as business rules change (if
                            department names are no longer unique, it's a lot easier to drop a unique
                            constraint than to restructure the table with a new PK -- and new FK's)
                            [_] etc., etc.

                            and even if i use a short user-assigned mnemonic (which i sometimes will
                            do), we run the risk of PK updates (to be religiously avoided) or of
                            imbedding intelligence in the PK. for example, why is one department called
                            S023 and another called M023? did we miss an attribute that should group the
                            departments by business unit or such, i.e., SALES and MANUFACTURING (or is
                            that M for MARKETING?)

                            some of these issues are important data modeling issues, others are
                            implementation and tuning issues

                            where to look in the manuals? read up in the Concepts, DBA, and Tuning
                            manuals. chapter and verse? interesting analogy. sometimes issues have to be
                            dealt with based on breadth and depth of experience and familiarity with a
                            number of concepts, principals, and scenarios -- when you've got serious
                            concerns you need to research relevant topics (in this case indexes,
                            datatypes, optimizer, constraints) and do some of your own testing (see
                            Jonathan Lewis' website www.jcomp.demon.co.uk for some excellent info on
                            research, testing, and logical analysis of Oracle issues). then you'll also
                            be able to benefit more from other people's observations, and filter our
                            what is relevant and what is not.

                            -- mcs

                            "Jug Ray" <jray_76@yahoo. comwrote in message
                            news:fd754911.0 310300430.43b5b 018@posting.goo gle.com...
                            Frank <fvanbortel@net scape.netwrote in message
                            news:<bnpdbo$nd q$1@news1.tilbu 1.nb.home.nl>.. .
                            Because it's a non-issue. All the answers are there (OK, OK - some
                            coding was actually left to the OP).

                            Give a man something to eat, and he will be hungry tomorrow,
                            teach him how to fidh, and he'll never be hungry again.
                            Or words the like.

                            Generally, we're not here to hold hands...
                            >
                            is the datatype an issue in indexes or not
                            i am looking also for an answer to that one
                            for a long time, tell me where it is in the manual
                            chapter, verses, book of ... as you seem to speak
                            for the group at large
                            >
                            what are you trying to say in "Give a man..."
                            I know its all there but can make sense of what you say
                            >
                            Jug

                            Comment

                            • Sanjay Minni

                              #15
                              Re: optimum datatype for primary key column O9i

                              "mcstock" <mcstock@enquer y.comwrote in message news:<v7ydncOSC srTvTyiRVn-vg@comcast.com> ...
                              Sanjay,
                              >
                              You're focusing much to much attention on what really is a non-issue. For
                              many years, any type of NUMBER datatype has been the recommended datatype
                              for PKs. Overhead of NUMBER(10) vs NUMBER vs NUMBER(10,0) etc, etc. is in
                              reality not relevant.
                              >
                              [...snip...]
                              -- MCS
                              Thanks MCS,
                              Thats exactly what I was looking for

                              Regards
                              Sanjay

                              Comment

                              Working...