What are cons and pros for using IDENTITY property as PK in SQL SERVER 2000?

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • Trey Walpole

    #31
    Re: What are cons and pros for using IDENTITY property as PK in SQL SERVER 2000?

    "Greg D. Moore (Strider)" <mooregr@greenm s.com> wrote in message
    news:ILazb.1621 27$ji3.79874@tw ister.nyroc.rr. com...[color=blue]
    >
    > "Trey Walpole" <treyNOpole@SPc omcastAM.net> wrote in message
    > news:uPV5k3SuDH A.1196@TK2MSFTN GP12.phx.gbl...[color=green][color=darkred]
    > > > And what happens if someone does a DBCC checkident ('FOO', RESEED)?
    > > >
    > > > Or you have to copy it into a new table and accidently set that table[/color][/color][/color]
    up[color=blue][color=green][color=darkred]
    > > > with an IDENTITY column and now all your rows get new IDs?[/color]
    > >
    > > Fire the DBA. If they've allowed unskilled people such access or do[/color]
    > anything[color=green]
    > > accidentally, no telling what other problems they'll cause ;)[/color]
    >
    > Wow. Can I get a job where you work where folks never make mistakes?
    >
    > Seriously, a DBCC checkident can be necessary in some recovery scenarios.
    >
    > Copying a table over into another one is often necessary in general
    > maintenance, schema changes, etc. It's pretty easy to forget to do it
    > right.
    >[/color]

    I was, of course, being facetious -- well, mostly. :*) Mistakes do happen,
    but in most situations in db management, like these mentioned, they are
    completely avoidable.

    The DBA should be the one doing the DBCC CHECKIDENT - not just "someone".
    And if it is an approveed someone else, the DBA better know about it.

    Copying a table's data over is often necessary, but you do need to be very
    careful and know all the things that might be affected. Again, since this is
    a DBA responsibility, he'd better know how to do it right and have some test
    db to work with.
    [And if that scenario happens, it's pretty easy to fix, although it does
    mean moving data all over again.]
    [color=blue]
    > Admittedly, they are pretty contrived examples, but the point is, the[/color]
    value[color=blue]
    > of the identity relies on some arbitrary state of the DB at the point in
    > time it is created.
    >
    > Now, in some cases that just might not matter, but in many cases it can be
    > an important factor.
    >[/color]

    Also very true. But it is the DBA that has [or should have] the control over
    any modifications that affect identity values, whereas any user has the
    ability to change natural keys [which was the point I was making :)]


    Comment

    • Greg D. Moore \(Strider\)

      #32
      Re: What are cons and pros for using IDENTITY property as PK in SQL SERVER 2000?


      "Trey Walpole" <treyNOpole@SPc omcastAM.net> wrote in message
      news:uNtoqBUuDH A.2316@TK2MSFTN GP10.phx.gbl...[color=blue]
      > "Greg D. Moore (Strider)" <mooregr@greenm s.com> wrote in message
      > news:ILazb.1621 27$ji3.79874@tw ister.nyroc.rr. com...[color=green]
      > >[/color]
      >
      > I was, of course, being facetious -- well, mostly. :*) Mistakes do happen,
      > but in most situations in db management, like these mentioned, they are
      > completely avoidable.[/color]

      Oh I know. :-)
      [color=blue]
      >
      > The DBA should be the one doing the DBCC CHECKIDENT - not just "someone".
      > And if it is an approveed someone else, the DBA better know about it.[/color]

      Keep in mind not all companies have that level of experience. I've
      consulted for a few.
      [color=blue]
      >
      > Copying a table's data over is often necessary, but you do need to be very
      > careful and know all the things that might be affected. Again, since this[/color]
      is[color=blue]
      > a DBA responsibility, he'd better know how to do it right and have some[/color]
      test[color=blue]
      > db to work with.
      > [And if that scenario happens, it's pretty easy to fix, although it does
      > mean moving data all over again.][/color]

      Actually it's impossible to fix if you've deleted the original table since
      you no longer have the original ID numbers.
      [color=blue]
      >[color=green]
      > > Admittedly, they are pretty contrived examples, but the point is, the[/color]
      > value[color=green]
      > > of the identity relies on some arbitrary state of the DB at the point in
      > > time it is created.
      > >
      > > Now, in some cases that just might not matter, but in many cases it can[/color][/color]
      be[color=blue][color=green]
      > > an important factor.
      > >[/color]
      >
      > Also very true. But it is the DBA that has [or should have] the control[/color]
      over[color=blue]
      > any modifications that affect identity values, whereas any user has the
      > ability to change natural keys [which was the point I was making :)]
      >[/color]

      In an ideal situation, I'd agree.

      [color=blue]
      >[/color]


      Comment

      • Aaron Bertrand [MVP]

        #33
        Re: What are cons and pros for using IDENTITY property as PK in SQL SERVER 2000?

        You have definitely hit a couple of the strong points in IDENTITY's favor.
        But there are cases where a "natural" kind of key can be small as well.
        Consider when eBay bought PayPal (and other than that, the rest of this
        paragraph is completely fictional), they likely had to merge some data...
        perhaps eBay used an IDENTITY to generate customer numbers, but they want to
        align those primary keys with the new data in the PayPal tables. So, the
        keys in the PayPal data become INTs, but not IDENTITY. They are kind of
        "natural" because they came to the PayPal from an external source, so to
        speak, rather than generated arbitrarily from within.

        Of course, completely fictional. But surely you can see that not all
        natural keys are going to be larger than an IDENTITY, or less efficient.
        There are other examples, too. In a small stats system, a SMALLDATETIME
        could be the primary key (perhaps several subrelated tables are organized by
        day). In fact, part of http://www.aspfaq.com/stats.asp (and plenty more
        that you can't see) is derived on a set of tables where SMALLDATETIME is the
        only key of relevance. Okay, so that's still 4 bytes, but you save 4 if
        your other alternative is to store an IDENTITY along with the SMALLDATETIME
        value. Consider:

        CREATE TABLE calendar
        (
        dateValue SMALLDATETIME PRIMARY KEY
        )

        vs.

        CREATE TABLE calendar
        (
        dateID INT IDENTITY PRIMARY KEY,
        dateValue SMALLDATETIME NOT NULL
        )

        Never mind my goofy naming scheme. :-)

        Now, Kass could probably show me some cool dateadd tricks that would allow
        me to store just an INT (or maybe even a SMALLINT, depending on the date
        range required), and determine what the date value is at runtime. Not that
        I think that's what his argument would be, but rather just to show that it
        is still possible to choose either route. I think the usability of the date
        value representing what it is, rather than having to derive its value from
        some formula, is a good thing.

        In cases like e-mail address and SSN (and in fact most cases), I still
        prefer your route, where there is a surrogate key (IDENTITY) that prevents
        me from having to cascade changes all over the place, and store larger
        foreign keys.

        Firstname + lastname is obviously a bad choice for a key of any kind,
        because I know more than one Aaron Bertrand. So then you bring middle name
        into the key, and it can still be repeated. Other things like getting
        adopted, re-married, legally changing their name, and other reasons why this
        "key" would change are minor; changes to the key can be dealt with in the
        database using DRI/CASCADE or, worst case scenario, through rigorous update
        code; it will be tougher to re-train users to look up all the tables
        containing "Carmen Bertrand" instead of "Carmen Electra." :-) However, I
        think the possibility of two people having the same key is a far more
        compelling argument for bypassing the natural key and placing some
        meaningless identifier, like IDENTITY, that the user doesn't care about and
        would never have to change.

        Now, you might think, "why not bring SSN into the FirstName + MiddleName +
        LastName key? That would make it unique." Yes, and hideously large. If
        SSN is unique, then why not just use SSN as the key? Again, it's large even
        on its own (CHAR(9) and surely to become CHAR(10) in our lifetimes), so I
        fail to see the benefit of repeating the value in every related table, DRI
        or not.

        Sorry about the earful, sometimes I get a little typographical diarrhea.
        Hopefully that was at least marginally intelligible.

        --
        Aaron Bertrand
        SQL Server MVP





        "Stijn Verrept" <sverrept@nospa n.vub.ac.be> wrote in message
        news:#G5HW$TuDH A.1512@TK2MSFTN GP10.phx.gbl...[color=blue]
        > I've read through this thread but I don't understand it. I always use an
        > int or smallint as primary key, with identity. I believe it would be a[/color]
        mess[color=blue]
        > otherwise.
        >
        > Example: I have a table with people, last name, first name, address, ...[/color]
        So[color=blue]
        > suppose you would make a natural key then you need at least the last name
        > and the first name. I have >25 other tables that reference that table.[/color]
        If[color=blue]
        > I get this right I will need to use the name and firstname field in all[/color]
        the[color=blue]
        > other tables as well to reference. Isn't that just a lot of data waste?[/color]
        If[color=blue]
        > I'm missing something, please tell me what because this seems a bit silly.
        >
        >
        > Stijn Verrept.
        >
        >[/color]


        Comment

        • Aaron Bertrand [MVP]

          #34
          Re: What are cons and pros for using IDENTITY property as PK in SQL SERVER 2000?

          > Actually it's impossible to fix if you've deleted the original table since[color=blue]
          > you no longer have the original ID numbers.[/color]

          Hopefully you're not starting a DTS task, never mind dropping a table before
          validating a successful transfer, without a decent backup in place. I think
          this is the kind of thing that Trey means when he uses the term
          "avoidable" ... I'll stretch it here to also mean "correctabl e."


          Comment

          • Louis Davidson

            #35
            Re: What are cons and pros for using IDENTITY property as PK in SQL SERVER 2000?

            "Steve Kass" <skass@drew.edu > wrote in message
            news:3FCCF5D9.9 070405@drew.edu ...
            [color=blue]
            > In fact none of the 257 different identity values assigned to me will ever
            > get changed - unless I'm organized and honest, and unless someone
            > checks some kind of natural key of mine, how will anyone know
            > it was me each of those 257 times I opened an account?[/color]

            Those values were not assigned to you, they were assigned to 257 different
            accounts. Those 257 accounts will also have 257 natural keys most likely,
            which will be an account number, likely taken from a paper form in the bank
            office.

            If you are talking about you giving 257 different identities (you tell them
            you are different people) then the fake people are getting new identity
            values, and pretty soon they will get a new artificial key printed on a
            fashionable orange jumpsuit :)

            --
            ----------------------------------------------------------------------------
            -----------
            Louis Davidson (drsql@hotmail. com)
            Compass Technology Management

            Pro SQL Server 2000 Database Design


            Note: Please reply to the newsgroups only unless you are
            interested in consulting services. All other replies will be ignored :)


            Comment

            • Stijn Verrept

              #36
              Re: What are cons and pros for using IDENTITY property as PK in SQL SERVER 2000?

              > In cases like e-mail address and SSN (and in fact most cases), I still[color=blue]
              > prefer your route, where there is a surrogate key (IDENTITY) that prevents
              > me from having to cascade changes all over the place, and store larger
              > foreign keys.[/color]

              Ok I agree that sometimes natural keys are good. But mind the word:
              sometimes :). After reading through this thread I got the impression that
              natural keys are used more than surrogate keys, while in practice I use
              identity with surrogate keys almost all the time.
              [color=blue]
              > Now, you might think, "why not bring SSN into the FirstName + MiddleName +
              > LastName key? That would make it unique." Yes, and hideously large. If
              > SSN is unique, then why not just use SSN as the key? Again, it's large[/color]
              even[color=blue]
              > on its own (CHAR(9) and surely to become CHAR(10) in our lifetimes), so I
              > fail to see the benefit of repeating the value in every related table, DRI
              > or not.[/color]

              Indeed, and I don't know about American legislation but maybe in the future
              (or even now) you can have people work for you who don't have a SSN (who
              work from a distance country for example) and then you'll get stuck again.
              [color=blue]
              > Sorry about the earful, sometimes I get a little typographical diarrhea.
              > Hopefully that was at least marginally intelligible.[/color]

              It was :)


              Stijn Verrept.


              Comment

              • Aaron Bertrand [MVP]

                #37
                Re: What are cons and pros for using IDENTITY property as PK in SQL SERVER 2000?

                > sometimes :). After reading through this thread I got the impression that[color=blue]
                > natural keys are used more than surrogate keys, while in practice I use
                > identity with surrogate keys almost all the time.[/color]

                I don't get that impression at all, and I don't believe it is true. Maybe
                that's what purists would *like* however...

                --
                Aaron Bertrand
                SQL Server MVP



                Comment

                • Bob Badour

                  #38
                  Re: What are cons and pros for using IDENTITY property as PK in SQL SERVER 2000?

                  "Louis Davidson" <dr_dontspamme_ sql@hotmail.com > wrote in message
                  news:ex84CxQuDH A.1788@tk2msftn gp13.phx.gbl...[color=blue]
                  > The problem is with how it is used. If you start giving users access to
                  > identity based values, you get into a bad spot where they want to make
                  > changes to the value (in my line of work, we don't like the numbers 666 in
                  > account numbers) so using identities for user values is a bad idea. I use
                  > them only for internal pointers that are never presented to users, since
                  > they are not modifiable. I could use guids, or characters, or whatever[/color]
                  for[color=blue]
                  > keys and no one would be the wiser.[/color]

                  Keys are logical identifiers. They identify data for the user as well as for
                  the dbms. Preventing the user from seeing the identifier is just stupid.

                  [color=blue][color=green]
                  > > Natural keys are nothing more than familiar surrogates.[/color]
                  >
                  > You are kind of right here, but it is generally true that natural keys can
                  > change, because in the world, things can change. Identities cannot[/color]
                  change.

                  It is not a "kind of" rightness. The statement is obviously and
                  self-evidently right to anyone with at least a minimal education in the
                  fundamentals of data management.

                  [color=blue]
                  > --
                  > --------------------------------------------------------------------------[/color]
                  --[color=blue]
                  > -----------
                  > Louis Davidson (drsql@hotmail. com)
                  > Compass Technology Management
                  >
                  > Pro SQL Server 2000 Database Design
                  > http://www.apress.com/book/bookDisplay.html?bID=266
                  >
                  > Note: Please reply to the newsgroups only unless you are
                  > interested in consulting services. All other replies will be ignored :)
                  >
                  > "Bob Badour" <bbadour@golden .net> wrote in message
                  > news:IOydndJOXv 5aR1GiRVn-hg@golden.net.. .[color=green]
                  > > "David Portas" <REMOVE_BEFORE_ REPLYING_dporta s@acm.org> wrote in message
                  > > news:e-qdnT5gCqKOTFGiR Vn-iQ@giganews.com ...[color=darkred]
                  > > > > living people than VendorID values. I don't think an identity[/color][/color]
                  > VendorID[color=green][color=darkred]
                  > > > > value is any more artificial than any other unique way of[/color][/color][/color]
                  identifying[color=blue]
                  > a[color=green][color=darkred]
                  > > > > vendor, so long as it is assigned the first time a vendor enters the
                  > > > > system and is never changed.
                  > > >
                  > > > But the point is that a "natural" key is verifiable outside of the[/color][/color]
                  > system.[color=green]
                  > >
                  > > As soon as one records the generated identity value outside the system,[/color]
                  > the[color=green]
                  > > key is verifiable outside of the system. All keys are surrogates or
                  > > artificial keys. Natural keys are nothing more than familiar surrogates.
                  > >
                  > >[/color]
                  >
                  >[/color]


                  Comment

                  • Bob Badour

                    #39
                    Re: What are cons and pros for using IDENTITY property as PK in SQL SERVER 2000?

                    "Trey Walpole" <treyNOpole@SPc omcastAM.net> wrote in message
                    news:#Gwam6QuDH A.2248@TK2MSFTN GP09.phx.gbl...[color=blue]
                    >
                    > "David Portas" <REMOVE_BEFORE_ REPLYING_dporta s@acm.org> wrote in message
                    > news:e-qdnT5gCqKOTFGiR Vn-iQ@giganews.com ...[color=green][color=darkred]
                    > > > living people than VendorID values. I don't think an identity[/color][/color][/color]
                    VendorID[color=blue][color=green][color=darkred]
                    > > > value is any more artificial than any other unique way of identifying[/color][/color][/color]
                    a[color=blue][color=green][color=darkred]
                    > > > vendor, so long as it is assigned the first time a vendor enters the
                    > > > system and is never changed.[/color]
                    > >
                    > > But the point is that a "natural" key is verifiable outside of the[/color][/color]
                    system.[color=blue][color=green]
                    > >
                    > > When I see a NG post from "skass[at]drew.edu" I don't care whether[/color][/color]
                    that's[color=blue][color=green]
                    > > based on your "real" name or even whether S.Kass is the same name as on[/color]
                    > your[color=green]
                    > > passport or driver's licence. What's important to me is that it's[/color]
                    > determined[color=green]
                    > > by a consistent method outside of the system which gives me some[/color]
                    > acceptable[color=green]
                    > > degree of confidence that you're the same person who posted here as
                    > > "skass[at]drew.edu" yesterday. Of course that validity is destroyed if[/color][/color]
                    you[color=blue][color=green]
                    > > change your email address or if someone spoofs your address. But it's[/color]
                    > still[color=green]
                    > > intrinsically better than an arbitrary ID allocated by the server.
                    > >[/color]
                    >
                    > I am a firm believer that natural keys should only be used to logically
                    > design/normalize the data. When it comes to the real reason for keys, data
                    > integrity, more often than not I have seen that natural keys are
                    > intrinsically not good physical primary keys.[/color]

                    You are evidently ignorant of the purpose of a key. A candidate key is a
                    logical identifier that identifies data. It is not a physical identifier. A
                    dbms can map the logical identifier to any physical identifier using any
                    method without altering the logical data model.

                    I suggest an elementary education in data management is in order before you
                    pontificate on the subject. Spreading your ignorance will only confuse the
                    unwary novice. Shame on you.


                    Comment

                    • Bob Badour

                      #40
                      Re: What are cons and pros for using IDENTITY property as PK in SQL SERVER 2000?

                      "Trey Walpole" <treyNOpole@SPc omcastAM.net> wrote in message
                      news:uPV5k3SuDH A.1196@TK2MSFTN GP12.phx.gbl...[color=blue][color=green]
                      > > And what happens if someone does a DBCC checkident ('FOO', RESEED)?
                      > >
                      > > Or you have to copy it into a new table and accidently set that table up
                      > > with an IDENTITY column and now all your rows get new IDs?[/color]
                      >
                      > Fire the DBA. If they've allowed unskilled people such access or do[/color]
                      anything[color=blue]
                      > accidentally, no telling what other problems they'll cause ;)[/color]

                      Pray your employer never employs the same standard.


                      Comment

                      • Bob Badour

                        #41
                        Re: What are cons and pros for using IDENTITY property as PK in SQL SERVER 2000?

                        "Trevor Best" <bouncer@localh ost> wrote in message
                        news:1rrpsvc8ib qr1fu8n475e59hu 2748ipvb4@4ax.c om...[color=blue]
                        > On Tue, 2 Dec 2003 09:04:35 -0500 in comp.databases, "Bob Badour"
                        > <bbadour@golden .net> wrote:
                        >[color=green]
                        > >
                        > >"Louis Davidson" <dr_dontspamme_ sql@hotmail.com > wrote in message
                        > >news:#rIbinJuD HA.3436@tk2msft ngp13.phx.gbl.. .[color=darkred]
                        > >> While it is true that the chosen primary key cannot contain any[/color][/color][/color]
                        optional[color=blue][color=green][color=darkred]
                        > >> values, it is more the praticioner (sp?) that disagrees with this[/color][/color][/color]
                        stance.[color=blue][color=green][color=darkred]
                        > >> Compound keys are unwieldy and bad for performance, but the theorist in[/color][/color][/color]
                        me[color=blue][color=green][color=darkred]
                        > >> says "who cares?" It is all about what is right/best, not what is
                        > >> fastest/easiest.[/color]
                        > >
                        > >Performance is determined by the physical structure and not by the[/color][/color]
                        logical[color=blue][color=green]
                        > >interface. Legitimate theorists have written ad nauseum on the severe
                        > >logical problems caused by using compound keys for references when data[/color][/color]
                        may[color=blue][color=green]
                        > >be missing.[/color]
                        >
                        > Are you talking about data being missing from a foreign key side of a
                        > relationship? I'm pro identity column myself but I don't see how that
                        > would help in this instance.[/color]

                        The specific issue is compound keys and missing information. A simple key
                        does not exhibit the same problems regardless whether it is an identity
                        column.

                        Consider a compound key with attributes A and B. What happens when the user
                        inserts a referencing row with a known A and an unknown B? Should the dbms
                        allow the insert? When should it allow the insert? Should the dbms verify
                        the A exists at least once in the referenced table? If the A value exists
                        only once in the referenced table, should the dbms substitute the only
                        corresponding B value that could be correct in the inserted row? Suppose the
                        user deletes all the rows from the referenced table that contain a specific
                        A value. What happens when the referencing table contains corresponding rows
                        with a known A and an unknown B?


                        Comment

                        • Bob Badour

                          #42
                          Re: What are cons and pros for using IDENTITY property as PK in SQL SERVER 2000?

                          "Trevor Best" <bouncer@localh ost> wrote in message
                          news:u8spsvc8s0 hfc741blgblj7rq 62u8nsp1k@4ax.c om...[color=blue]
                          > On Tue, 2 Dec 2003 17:53:44 -0000 in comp.databases, "David Portas"
                          > <REMOVE_BEFORE_ REPLYING_dporta s@acm.org> wrote:
                          >[color=green][color=darkred]
                          > >> living people than VendorID values. I don't think an identity VendorID
                          > >> value is any more artificial than any other unique way of identifying a
                          > >> vendor, so long as it is assigned the first time a vendor enters the
                          > >> system and is never changed.[/color]
                          > >
                          > >But the point is that a "natural" key is verifiable outside of the[/color][/color]
                          system.[color=blue][color=green]
                          > >
                          > >When I see a NG post from "skass[at]drew.edu" I don't care whether that's
                          > >based on your "real" name or even whether S.Kass is the same name as on[/color][/color]
                          your[color=blue][color=green]
                          > >passport or driver's licence. What's important to me is that it's[/color][/color]
                          determined[color=blue][color=green]
                          > >by a consistent method outside of the system which gives me some[/color][/color]
                          acceptable[color=blue][color=green]
                          > >degree of confidence that you're the same person who posted here as
                          > >"skass[at]drew.edu" yesterday. Of course that validity is destroyed if[/color][/color]
                          you[color=blue][color=green]
                          > >change your email address or if someone spoofs your address. But it's[/color][/color]
                          still[color=blue][color=green]
                          > >intrinsicall y better than an arbitrary ID allocated by the server.[/color]
                          >
                          > If Steve leaves that educational facility he's at now, his email
                          > address will surely change. If he goes into a witness protection
                          > scheme his name, address and SS number will change, he or someone else
                          > could change quiet a bit about him but if he's on a database with an
                          > identity column as his PK then it's more likely that it will *never*
                          > change.[/color]

                          With all due respect, the whole point of the witness protection programme is
                          to prevent people from associating the individual with their previous
                          identity.


                          Comment

                          • Bob Badour

                            #43
                            Re: What are cons and pros for using IDENTITY property as PK in SQL SERVER 2000?

                            "Stijn Verrept" <sverrept@nospa n.vub.ac.be> wrote in message
                            news:#G5HW$TuDH A.1512@TK2MSFTN GP10.phx.gbl...[color=blue]
                            > I've read through this thread but I don't understand it. I always use an
                            > int or smallint as primary key, with identity. I believe it would be a[/color]
                            mess[color=blue]
                            > otherwise.[/color]

                            Your belief does not alter the correct criteria for choosing a key:
                            simplicity, familiarity and stability.

                            [color=blue]
                            > Example: I have a table with people, last name, first name, address, ...[/color]
                            So[color=blue]
                            > suppose you would make a natural key then you need at least the last name
                            > and the first name. I have >25 other tables that reference that table.[/color]
                            If[color=blue]
                            > I get this right I will need to use the name and firstname field in all[/color]
                            the[color=blue]
                            > other tables as well to reference. Isn't that just a lot of data waste?[/color]
                            If[color=blue]
                            > I'm missing something, please tell me what because this seems a bit silly.[/color]

                            You have constructed a straw man. One anecdote does not demonstrate or
                            justify a general principle or rule.


                            Comment

                            • Bob Badour

                              #44
                              Re: What are cons and pros for using IDENTITY property as PK in SQL SERVER 2000?

                              I see why you were elected a most vociferous person. That's a very long way
                              of stating something as simple as the criteria for choosing a primary key:
                              simplicity, familiarity and stability.

                              "Aaron Bertrand [MVP]" <aaron@TRASHasp faq.com> wrote in message
                              news:entEu2UuDH A.1888@TK2MSFTN GP10.phx.gbl...[color=blue]
                              > You have definitely hit a couple of the strong points in IDENTITY's favor.
                              > But there are cases where a "natural" kind of key can be small as well.
                              > Consider when eBay bought PayPal (and other than that, the rest of this
                              > paragraph is completely fictional), they likely had to merge some data...
                              > perhaps eBay used an IDENTITY to generate customer numbers, but they want[/color]
                              to[color=blue]
                              > align those primary keys with the new data in the PayPal tables. So, the
                              > keys in the PayPal data become INTs, but not IDENTITY. They are kind of
                              > "natural" because they came to the PayPal from an external source, so to
                              > speak, rather than generated arbitrarily from within.
                              >
                              > Of course, completely fictional. But surely you can see that not all
                              > natural keys are going to be larger than an IDENTITY, or less efficient.
                              > There are other examples, too. In a small stats system, a SMALLDATETIME
                              > could be the primary key (perhaps several subrelated tables are organized[/color]
                              by[color=blue]
                              > day). In fact, part of http://www.aspfaq.com/stats.asp (and plenty more
                              > that you can't see) is derived on a set of tables where SMALLDATETIME is[/color]
                              the[color=blue]
                              > only key of relevance. Okay, so that's still 4 bytes, but you save 4 if
                              > your other alternative is to store an IDENTITY along with the[/color]
                              SMALLDATETIME[color=blue]
                              > value. Consider:
                              >
                              > CREATE TABLE calendar
                              > (
                              > dateValue SMALLDATETIME PRIMARY KEY
                              > )
                              >
                              > vs.
                              >
                              > CREATE TABLE calendar
                              > (
                              > dateID INT IDENTITY PRIMARY KEY,
                              > dateValue SMALLDATETIME NOT NULL
                              > )
                              >
                              > Never mind my goofy naming scheme. :-)
                              >
                              > Now, Kass could probably show me some cool dateadd tricks that would allow
                              > me to store just an INT (or maybe even a SMALLINT, depending on the date
                              > range required), and determine what the date value is at runtime. Not[/color]
                              that[color=blue]
                              > I think that's what his argument would be, but rather just to show that it
                              > is still possible to choose either route. I think the usability of the[/color]
                              date[color=blue]
                              > value representing what it is, rather than having to derive its value from
                              > some formula, is a good thing.
                              >
                              > In cases like e-mail address and SSN (and in fact most cases), I still
                              > prefer your route, where there is a surrogate key (IDENTITY) that prevents
                              > me from having to cascade changes all over the place, and store larger
                              > foreign keys.
                              >
                              > Firstname + lastname is obviously a bad choice for a key of any kind,
                              > because I know more than one Aaron Bertrand. So then you bring middle[/color]
                              name[color=blue]
                              > into the key, and it can still be repeated. Other things like getting
                              > adopted, re-married, legally changing their name, and other reasons why[/color]
                              this[color=blue]
                              > "key" would change are minor; changes to the key can be dealt with in the
                              > database using DRI/CASCADE or, worst case scenario, through rigorous[/color]
                              update[color=blue]
                              > code; it will be tougher to re-train users to look up all the tables
                              > containing "Carmen Bertrand" instead of "Carmen Electra." :-) However, I
                              > think the possibility of two people having the same key is a far more
                              > compelling argument for bypassing the natural key and placing some
                              > meaningless identifier, like IDENTITY, that the user doesn't care about[/color]
                              and[color=blue]
                              > would never have to change.
                              >
                              > Now, you might think, "why not bring SSN into the FirstName + MiddleName +
                              > LastName key? That would make it unique." Yes, and hideously large. If
                              > SSN is unique, then why not just use SSN as the key? Again, it's large[/color]
                              even[color=blue]
                              > on its own (CHAR(9) and surely to become CHAR(10) in our lifetimes), so I
                              > fail to see the benefit of repeating the value in every related table, DRI
                              > or not.
                              >
                              > Sorry about the earful, sometimes I get a little typographical diarrhea.
                              > Hopefully that was at least marginally intelligible.
                              >
                              > --
                              > Aaron Bertrand
                              > SQL Server MVP
                              > http://www.aspfaq.com/
                              >
                              >
                              >
                              >
                              > "Stijn Verrept" <sverrept@nospa n.vub.ac.be> wrote in message
                              > news:#G5HW$TuDH A.1512@TK2MSFTN GP10.phx.gbl...[color=green]
                              > > I've read through this thread but I don't understand it. I always use[/color][/color]
                              an[color=blue][color=green]
                              > > int or smallint as primary key, with identity. I believe it would be a[/color]
                              > mess[color=green]
                              > > otherwise.
                              > >
                              > > Example: I have a table with people, last name, first name, address, ...[/color]
                              > So[color=green]
                              > > suppose you would make a natural key then you need at least the last[/color][/color]
                              name[color=blue][color=green]
                              > > and the first name. I have >25 other tables that reference that table.[/color]
                              > If[color=green]
                              > > I get this right I will need to use the name and firstname field in all[/color]
                              > the[color=green]
                              > > other tables as well to reference. Isn't that just a lot of data waste?[/color]
                              > If[color=green]
                              > > I'm missing something, please tell me what because this seems a bit[/color][/color]
                              silly.[color=blue][color=green]
                              > >
                              > >
                              > > Stijn Verrept.
                              > >
                              > >[/color]
                              >
                              >[/color]


                              Comment

                              • Stijn Verrept

                                #45
                                Re: What are cons and pros for using IDENTITY property as PK in SQL SERVER 2000?

                                "Bob Badour" <bbadour@golden .net> wrote in message
                                news:OdKdnZZyy_ SlbFCiRVn-vg@golden.net.. .[color=blue]
                                > Your belief does not alter the correct criteria for choosing a key:
                                > simplicity, familiarity and stability.[/color]

                                Well non natural keys do meet the simplicity and stability criteria. Even
                                more than natural keys I believe.
                                [color=blue]
                                > You have constructed a straw man. One anecdote does not demonstrate or
                                > justify a general principle or rule.[/color]

                                Hmmm one anecdote? I have a database full of tables like this. List of
                                doctors, departments, users, contacts, medication, ... everywhere I used a
                                non natural key. The use of natural keys will only happen sometimes when
                                it's really appropriate, otherwise not.


                                Stijn Verrept.


                                Comment

                                Working...