What are cons and pros for using IDENTITY property as PK in SQL SERVER 2000?

**David Portas** · Jul 20 '05, 02:13 AM

Re: What are cons and pros for using IDENTITY property as PK in SQL SERVER 2000?

IDENTITY may be a *surrogate* key but it isn't a *natural* key because it
bears no relation to the entity that you are modelling in your table. A
natural primary key is a subset of the attributes of an entity which
uniquely identify that entity. IDENTITY clearly isn't an attribute of any
real entity - it's just an arbitrary number.

Taking your Vendors table as an example, a naive design might look like
this:

CREATE TABLE Vendors (vendor_id INTEGER IDENTITY PRIMARY KEY /* ?? */,
vendor_name VARCHAR(40) NOT NULL, vendor_tax_id VARCHAR(10) NOT NULL, ...)

But this table has no uniqueness or integrity because multiple vendors can
exist with different (arbitrary) vendor_ids. It may well work internally for
a particular application but will break when someone needs to do some real
analysis on your data. And what if you need to combine it with data from
another system that doesn't have that same magical Vendor_id column?

Here's a better alternative:

CREATE TABLE Vendors (vendor_id INTEGER NOT NULL UNIQUE /* surrogate */,
vendor_name VARCHAR(40) NOT NULL UNIQUE, vendor_tax_id VARCHAR(10) NOT NULL
PRIMARY KEY, ...)

By declaring UNIQUE / PK constraints on the correct attributes you can
ensure that you have verifiably unique data. Keep the surrogate key if you
like but make sure you declare the Natural key as well. (Moving the actual
PK declaration is essentially cosmetic - PK is equivalent to NOT NULL UNIQUE
and it's not unusual to have several NOT NULL UNIQUE keys in a table).

Here's Celko on keys:

Software and Services recent news | InformationWeek

http://www.intelligententerprise.com/030320/605celko1_1.shtml

Explore the latest news and expert commentary on software and services, brought to you by the editors of InformationWeek

--
David Portas
------------
Please reply only to the newsgroup
--

**Steve Kass** · Jul 20 '05, 02:13 AM

Re: What are cons and pros for using IDENTITY property as PK in SQLSERVER 2000?

Aaron Bertrand - MVP wrote:
[color=blue][color=green]
>>I am by no means a SQL expert, so forgive me if this seems ignorant...But
>>why can't the ID columm be a natural key? For example, I am working on a
>>project that has a vendors table. The list of vendors is used in
>>relationshi p to several other tables. We build this table with an identity
>>
>>[/color]
>
>An identity value that is generated by the system is not "natural".. . a
>natural key means that the key is, by nature, identifying a single row...
>not artificially because you generated some value for it. A natural key
>could be an e-mail address, or a social security number, or a license plate
>number, or a latitude and longitude -- something that is part of the data
>that also happens to uniquely identify it.
>[/color]
Quiz: Classify each key below as "natural" or "artificial ":

Northwind..Cust omers.CustomerI D
Northwind..Orde rs.OrderID
Northwind..Terr itories.Territo ryID

Most keys are fundamentally artificial, but somehow we only call them
artificial if we made them up, not if someone else made them up. Social
Security numbers are probably nothing different than identity values in
someone elses database, which doesn't make them any more intrinsic to
living people than VendorID values. I don't think an identity VendorID
value is any more artificial than any other unique way of identifying a
vendor, so long as it is assigned the first time a vendor enters the
system and is never changed.

I think this whole natural/artifical distinction is mostly quite silly.
In a well-designed database, entities can just as easily be identified
by an integer they are assigned when they enter the system as they can
by some set of attributes within the data they enter the system with.
The arguments against using identity values always seem to be arguments
against the ways people misuse identity values.

The advantage of keys like Northwind..Cust omers.CustomerI D is really an
error-correction issue. If carefully chosen, those 5-character keys can
be recovered if a single letter is mistyped. But this can be done with
artificial values also, if check digits or other error-correcting
schemes are used.

SK
[color=blue]
>
>Keep in mind that a primary key does not have to a natural key. I don't see
>any problems with using an IDENTITY as a primary key, such as in your case.
>But in your case it is not a natural key.
>
>
>[/color]

**David Portas** · Jul 20 '05, 02:13 AM

Re: What are cons and pros for using IDENTITY property as PK in SQL SERVER 2000?

> living people than VendorID values. I don't think an identity VendorID[color=blue]
> value is any more artificial than any other unique way of identifying a
> vendor, so long as it is assigned the first time a vendor enters the
> system and is never changed.[/color]

But the point is that a "natural" key is verifiable outside of the system.

When I see a NG post from "skass[at]drew.edu" I don't care whether that's
based on your "real" name or even whether S.Kass is the same name as on your
passport or driver's licence. What's important to me is that it's determined
by a consistent method outside of the system which gives me some acceptable
degree of confidence that you're the same person who posted here as
"skass[at]drew.edu" yesterday. Of course that validity is destroyed if you
change your email address or if someone spoofs your address. But it's still
intrinsically better than an arbitrary ID allocated by the server.

--
David Portas
------------
Please reply only to the newsgroup
--

**Aaron Bertrand - MVP** · Jul 20 '05, 02:13 AM

Re: What are cons and pros for using IDENTITY property as PK in SQL SERVER 2000?

> artificial if we made them up, not if someone else made them up. Social[color=blue]
> Security numbers are probably nothing different than identity values in
> someone elses database,[/color]

Right, but this is a centralized and controlled database. Nobody else has
my SSN (though there are exceptions, e.g. someday SSNs for deceased people
will have to be re-used); anybody else who uses it to identify themselves is
likely attempting fraud / identity theft. I consider it a "natural" key
because I supply it to the database, rather than the other way around.

Whereas my customerID according to Barnes & Noble is very unlikely to be the
same as my customerID at J.Crew.

In any case, I do agree that the distinction is largely silly, especially
when it erupts into arguments and "but Celko says..." nonsense. Like many
other things in the database world, the choice of a key is not dictated by
some higher power, but is rather situation-dependent.

--
Aaron Bertrand
SQL Server MVP

Expired DNS Hosting Services | DNS Made Easy

http://www.aspfaq.com/

Please contact this domain's administrator as their DNS Made Easy services have expired.

**Bob Badour** · Jul 20 '05, 02:13 AM

Re: What are cons and pros for using IDENTITY property as PK in SQL SERVER 2000?

"David Portas" <REMOVE_BEFORE_ REPLYING_dporta s@acm.org> wrote in message
news:e-qdnT5gCqKOTFGiR Vn-iQ@giganews.com ...[color=blue][color=green]
> > living people than VendorID values. I don't think an identity VendorID
> > value is any more artificial than any other unique way of identifying a
> > vendor, so long as it is assigned the first time a vendor enters the
> > system and is never changed.[/color]
>
> But the point is that a "natural" key is verifiable outside of the system.[/color]

As soon as one records the generated identity value outside the system, the
key is verifiable outside of the system. All keys are surrogates or
artificial keys. Natural keys are nothing more than familiar surrogates.

**Steve Kass** · Jul 20 '05, 02:13 AM

Re: What are cons and pros for using IDENTITY property as PK in SQLSERVER 2000?

David Portas wrote:
[color=blue][color=green]
>>living people than VendorID values. I don't think an identity VendorID
>>value is any more artificial than any other unique way of identifying a
>>vendor, so long as it is assigned the first time a vendor enters the
>>system and is never changed.
>>
>>[/color]
>
>But the point is that a "natural" key is verifiable outside of the system.
>[/color]
I guess it depends on where you draw the boundaries of "the system." If
you need something outside of "the system" to verify your key, aren't
you just working within a larger system (and one that is not entirely
under your control)?
[color=blue]
>
>When I see a NG post from "skass[at]drew.edu" I don't care whether that's
>based on your "real" name or even whether S.Kass is the same name as on your
>passport or driver's licence. What's important to me is that it's determined
>by a consistent method outside of the system which gives me some acceptable
>degree of confidence that you're the same person who posted here as
>"skass[at]drew.edu" yesterday. Of course that validity is destroyed if you
>change your email address or if someone spoofs your address. But it's still
>intrinsicall y better than an arbitrary ID allocated by the server.
>[/color]
This is a good point. If entities enter your system from time to time,
and you must determine whether they duplicate entities already in your
system or are new, then you need some "natural" method of
identification. If I apply for a Microsoft credit card, Microsoft will
likely generate an artifical credit card number for me. But that can't
be the only way in which Microsoft can identify me if a business rule
prohibits one person from possessing two Microsoft credit cards

A nice way to look at this is by recognizing when there is and when
there isn't some external entity that participates in an internal
business rule. The need to use a "natural" key such as my social
security number exists only because there is an entity outside
Microsoft's credit department (the
person-registered-with-the-social-security-administration entity) that
participates in a business rule: that the cardinality of the PRWTSSA <->
CCP relation is required to be 1<->{0,1}.

This is a useful way to look at things. Is an identity value VendorID a
good key for a vendor? It depends on whether there is some entity
"outside the system" with a certain relationship to the entity
identified by VendorID. If there is, then VendorID is not suitable as
the only key. If it is used, it must be a surrogate for a natural
external key also recorded within the system. Some businesses may allow
one corporation more than one VendorID, and others may not.

Is an identity value InvoiceID a good key for invoices (in the database
of the business generating the invoices)? It might well be, since
invoices can be internal to the system. While there might be a more
"natural" key, such as (CreationDateti me, IssuingEmployee _or_System),
the natural key might not provide any added value beyond its individual
attribute values if invoices are entirely internal.

In some cases, what appear to be external entities can use internal
(artificial) keys, because there is no business rule relating the
internal and external entities. An example might be a deli counter
customer who pulls an identity value from the "take a ticket" machine.
Here the only confusion is that there is no handy word for the
human-visit-to-deli-counter entity, and Customer might be a more
convenient name. Despite the fact that customers are people and people
can be identified uniquely outside the system, there's no need to worry
about that to manage visits to the deli counter.

I'm curious now to know whether this point of view helps make a little
more sense of the big debate. If not that, maybe it will at least help
me understand why I get anxious every time I need to use a Microsoft
Passport for identification.

SK
[color=blue]
>
>
>[/color]

**Louis Davidson** · Jul 20 '05, 02:13 AM

Re: What are cons and pros for using IDENTITY property as PK in SQL SERVER 2000?

The problem is with how it is used. If you start giving users access to
identity based values, you get into a bad spot where they want to make
changes to the value (in my line of work, we don't like the numbers 666 in
account numbers) so using identities for user values is a bad idea. I use
them only for internal pointers that are never presented to users, since
they are not modifiable. I could use guids, or characters, or whatever for
keys and no one would be the wiser.
[color=blue]
> Natural keys are nothing more than familiar surrogates.[/color]

You are kind of right here, but it is generally true that natural keys can
change, because in the world, things can change. Identities cannot change.

--
----------------------------------------------------------------------------
-----------
Louis Davidson (drsql@hotmail. com)
Compass Technology Management

Pro SQL Server 2000 Database Design

Page Not Found

http://www.apress.com/book/bookDisplay.html?bID=266

Note: Please reply to the newsgroups only unless you are
interested in consulting services. All other replies will be ignored :)

"Bob Badour" <bbadour@golden .net> wrote in message
news:IOydndJOXv 5aR1GiRVn-hg@golden.net.. .[color=blue]
> "David Portas" <REMOVE_BEFORE_ REPLYING_dporta s@acm.org> wrote in message
> news:e-qdnT5gCqKOTFGiR Vn-iQ@giganews.com ...[color=green][color=darkred]
> > > living people than VendorID values. I don't think an identity[/color][/color][/color]
VendorID[color=blue][color=green][color=darkred]
> > > value is any more artificial than any other unique way of identifying[/color][/color][/color]
a[color=blue][color=green][color=darkred]
> > > vendor, so long as it is assigned the first time a vendor enters the
> > > system and is never changed.[/color]
> >
> > But the point is that a "natural" key is verifiable outside of the[/color][/color]
system.[color=blue]
>
> As soon as one records the generated identity value outside the system,[/color]
the[color=blue]
> key is verifiable outside of the system. All keys are surrogates or
> artificial keys. Natural keys are nothing more than familiar surrogates.
>
>[/color]

**Trey Walpole** · Jul 20 '05, 02:13 AM

Re: What are cons and pros for using IDENTITY property as PK in SQL SERVER 2000?

"David Portas" <REMOVE_BEFORE_ REPLYING_dporta s@acm.org> wrote in message
news:e-qdnT5gCqKOTFGiR Vn-iQ@giganews.com ...[color=blue][color=green]
> > living people than VendorID values. I don't think an identity VendorID
> > value is any more artificial than any other unique way of identifying a
> > vendor, so long as it is assigned the first time a vendor enters the
> > system and is never changed.[/color]
>
> But the point is that a "natural" key is verifiable outside of the system.
>
> When I see a NG post from "skass[at]drew.edu" I don't care whether that's
> based on your "real" name or even whether S.Kass is the same name as on[/color]
your[color=blue]
> passport or driver's licence. What's important to me is that it's[/color]
determined[color=blue]
> by a consistent method outside of the system which gives me some[/color]
acceptable[color=blue]
> degree of confidence that you're the same person who posted here as
> "skass[at]drew.edu" yesterday. Of course that validity is destroyed if you
> change your email address or if someone spoofs your address. But it's[/color]
still[color=blue]
> intrinsically better than an arbitrary ID allocated by the server.
>[/color]

I am a firm believer that natural keys should only be used to logically
design/normalize the data. When it comes to the real reason for keys, data
integrity, more often than not I have seen that natural keys are
intrinsically not good physical primary keys.

1. Natural keys are, being natural and therefore user entered [i.e.,
provided to the database by external means], fungible. If a user enters
data, they must also be able to modify it. If data can be modified, then its
value as a systemic primary key is gone. Yes, you can cascade updates to
these, but why do it when it can be avoided to start with.

2. Natural keys are typically a composite of atomic attributes. If using a
composite, these must be propagated to referencing tables as foreign keys.
Your normalization drops below par, by having these [potentially] massively
duplicated columns.
Attributes that are single, [supposedly] unique attributes (e.g., SSN),
usually represent some official, governmentally recognized ID, and therefore
have legal issues with being propagated throughout a system.

Also, for amateurs and many professionals, natural keys are very often
chosen incorrectly. e.g., I believe some combination of Name and other info
has been used by my ISP as their primary key. My last name was entered into
their system incorrectly, but they cannot fix it because their system will
not allow it. Preposterous and poor design.

Surrogate keys generated by using the identity property are ideal for data
integrity, because
1. They are static values [i.e., once entered, it does not change] and the
DBA has control over allowing values in identity columns to be modified.
2. They are singleton row ids. The fact that they are sequential is
irrelevant. That is simply the most efficient means of generating new
numeric values.

Identity integers can be problematic in two-way replication, but proper
management of key ranges can alleviate these issues. GUIDs are the MS
recommended way to deal with distributed data and two-way replication, but
not as easy to deal with in unreplicated databases.

**Trevor Best** · Jul 20 '05, 02:13 AM

Re: What are cons and pros for using IDENTITY property as PK in SQL SERVER 2000?

On Tue, 2 Dec 2003 09:04:35 -0500 in comp.databases, "Bob Badour"
<bbadour@golden .net> wrote:
[color=blue]
>
>"Louis Davidson" <dr_dontspamme_ sql@hotmail.com > wrote in message
>news:#rIbinJuD HA.3436@tk2msft ngp13.phx.gbl.. .[color=green]
>> While it is true that the chosen primary key cannot contain any optional
>> values, it is more the praticioner (sp?) that disagrees with this stance.
>> Compound keys are unwieldy and bad for performance, but the theorist in me
>> says "who cares?" It is all about what is right/best, not what is
>> fastest/easiest.[/color]
>
>Performance is determined by the physical structure and not by the logical
>interface. Legitimate theorists have written ad nauseum on the severe
>logical problems caused by using compound keys for references when data may
>be missing.[/color]

Are you talking about data being missing from a foreign key side of a
relationship? I'm pro identity column myself but I don't see how that
would help in this instance.

--
A)bort, R)etry, I)nfluence with large hammer.

**Trevor Best** · Jul 20 '05, 02:13 AM

Re: What are cons and pros for using IDENTITY property as PK in SQL SERVER 2000?

On Tue, 2 Dec 2003 17:53:44 -0000 in comp.databases, "David Portas"
<REMOVE_BEFORE_ REPLYING_dporta s@acm.org> wrote:
[color=blue][color=green]
>> living people than VendorID values. I don't think an identity VendorID
>> value is any more artificial than any other unique way of identifying a
>> vendor, so long as it is assigned the first time a vendor enters the
>> system and is never changed.[/color]
>
>But the point is that a "natural" key is verifiable outside of the system.
>
>When I see a NG post from "skass[at]drew.edu" I don't care whether that's
>based on your "real" name or even whether S.Kass is the same name as on your
>passport or driver's licence. What's important to me is that it's determined
>by a consistent method outside of the system which gives me some acceptable
>degree of confidence that you're the same person who posted here as
>"skass[at]drew.edu" yesterday. Of course that validity is destroyed if you
>change your email address or if someone spoofs your address. But it's still
>intrinsicall y better than an arbitrary ID allocated by the server.[/color]

If Steve leaves that educational facility he's at now, his email
address will surely change. If he goes into a witness protection
scheme his name, address and SS number will change, he or someone else
could change quiet a bit about him but if he's on a database with an
identity column as his PK then it's more likely that it will *never*
change.

--
A)bort, R)etry, I)nfluence with large hammer.

**Steve Kass** · Jul 20 '05, 02:13 AM

Re: What are cons and pros for using IDENTITY property as PK in SQLSERVER 2000?

Trevor Best wrote:
[color=blue]
>On Tue, 2 Dec 2003 17:53:44 -0000 in comp.databases, "David Portas"
><REMOVE_BEFORE _REPLYING_dport as@acm.org> wrote:
>
>
>[color=green][color=darkred]
>>>living people than VendorID values. I don't think an identity VendorID
>>>value is any more artificial than any other unique way of identifying a
>>>vendor, so long as it is assigned the first time a vendor enters the
>>>system and is never changed.
>>>
>>>[/color]
>>But the point is that a "natural" key is verifiable outside of the system.
>>
>>When I see a NG post from "skass[at]drew.edu" I don't care whether that's
>>based on your "real" name or even whether S.Kass is the same name as on your
>>passport or driver's licence. What's important to me is that it's determined
>>by a consistent method outside of the system which gives me some acceptable
>>degree of confidence that you're the same person who posted here as
>>"skass[at]drew.edu" yesterday. Of course that validity is destroyed if you
>>change your email address or if someone spoofs your address. But it's still
>>intrinsical ly better than an arbitrary ID allocated by the server.
>>
>>[/color]
>
>If Steve leaves that educational facility he's at now, his email
>address will surely change. If he goes into a witness protection
>scheme his name, address and SS number will change, he or someone else
>could change quiet a bit about him but if he's on a database with an
>identity column as his PK then it's more likely that it will *never*
>change.
>[/color]
In fact none of the 257 different identity values assigned to me will ever
get changed - unless I'm organized and honest, and unless someone
checks some kind of natural key of mine, how will anyone know
it was me each of those 257 times I opened an account?

SK
[color=blue]
>
>
>[/color]

**Greg D. Moore \(Strider\)** · Jul 20 '05, 02:13 AM

Re: What are cons and pros for using IDENTITY property as PK in SQL SERVER 2000?

"BenignVani lla" <bv@tibetanbeef garden.com> wrote in message
news:O5OdnVtG1u WtN1GiRVn-hg@giganews.com ...[color=blue]
>
> I am by no means a SQL expert, so forgive me if this seems ignorant...But
> why can't the ID columm be a natural key? For example, I am working on a
> project that has a vendors table. The list of vendors is used in
> relationship to several other tables. We build this table with an identity
> column, and a column with the vendor's name. Now when a vendor is added to
> the table, they are assigned a unique ID that ties all other related data
> back to this vendor, and in the case of a vendor changing their name, or a
> typo, we can make updates without affecting data. Seems like a perfect use
> for an identity field, and it is our primary key.[/color]

And what happens if someone does a DBCC checkident ('FOO', RESEED)?

Or you have to copy it into a new table and accidently set that table up
with an IDENTITY column and now all your rows get new IDs?

[color=blue]
>
>
> --
> BV.
> WebPorgmaster - www.IHeartMyPond.com
> Work at Home, Save the Environment - www.amothersdream.com
>
>[/color]

**Trey Walpole** · Jul 20 '05, 02:13 AM

Re: What are cons and pros for using IDENTITY property as PK in SQL SERVER 2000?

> And what happens if someone does a DBCC checkident ('FOO', RESEED)?[color=blue]
>
> Or you have to copy it into a new table and accidently set that table up
> with an IDENTITY column and now all your rows get new IDs?[/color]

Fire the DBA. If they've allowed unskilled people such access or do anything
accidentally, no telling what other problems they'll cause ;)

**Greg D. Moore \(Strider\)** · Jul 20 '05, 02:13 AM

Re: What are cons and pros for using IDENTITY property as PK in SQL SERVER 2000?

"Trey Walpole" <treyNOpole@SPc omcastAM.net> wrote in message
news:uPV5k3SuDH A.1196@TK2MSFTN GP12.phx.gbl...[color=blue][color=green]
> > And what happens if someone does a DBCC checkident ('FOO', RESEED)?
> >
> > Or you have to copy it into a new table and accidently set that table up
> > with an IDENTITY column and now all your rows get new IDs?[/color]
>
> Fire the DBA. If they've allowed unskilled people such access or do[/color]
anything[color=blue]
> accidentally, no telling what other problems they'll cause ;)[/color]

Wow. Can I get a job where you work where folks never make mistakes?

Seriously, a DBCC checkident can be necessary in some recovery scenarios.

Copying a table over into another one is often necessary in general
maintenance, schema changes, etc. It's pretty easy to forget to do it
right.

Admittedly, they are pretty contrived examples, but the point is, the value
of the identity relies on some arbitrary state of the DB at the point in
time it is created.

Now, in some cases that just might not matter, but in many cases it can be
an important factor.

[color=blue]
>
>[/color]

**Stijn Verrept** · Jul 20 '05, 02:13 AM

Re: What are cons and pros for using IDENTITY property as PK in SQL SERVER 2000?

I've read through this thread but I don't understand it. I always use an
int or smallint as primary key, with identity. I believe it would be a mess
otherwise.

Example: I have a table with people, last name, first name, address, ... So
suppose you would make a natural key then you need at least the last name
and the first name. I have >25 other tables that reference that table. If
I get this right I will need to use the name and firstname field in all the
other tables as well to reference. Isn't that just a lot of data waste? If
I'm missing something, please tell me what because this seems a bit silly.

Stijn Verrept.

What are cons and pros for using IDENTITY property as PK in SQL SERVER 2000?

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment