Sequential Number in an Update

**--CELKO--** · Nov 1 '06, 10:55 PM

Re: Sequential Number in an Update

>So I'll join in the challenge - how would your constraint look like if it _has_ to work on MS SQL Server 2005? <<

This grep checks email format against RFC 3696 and was written by David
Thompson

[a-z0-9!$'*+\-_]+(\.[a-z0-9!$'*+\-_]+)*
@([a-z0-9]+(-+[a-z0-9]+)*\.)+
([a-z]{2}aero|arpa|bi z|cat|com|coop| edu|gov|info|in t|jobs|mil|mobi |museum|name|ne t|org|pro|trave l)

Let's translate it. The SQL Server LIKE predicate is missing the
Kleene plus + and Kleene star * repetition tokens. We can fake those
with a table of repetitions and a JOIN. The * table has an empty
string in it in addition to repetitions of the character search
pattern.

CREATE TABLE Symbols
(char_cnt INTEGER NOT NULL PRIMARY KEY,
token VARCHAR(255) NOT NULL);
INSERT INTO Symbols VALUES (1, '[a-z0-9!$'*+-_]');
INSERT INTO Symbols VALUES (2, '[a-z0-9!$'*+-_][a-z0-9!$'*+-_]');
Etc.

So to get a pattern for a string with a dot in it, you would use

SELECT 'Valid'
FROM Symbols AS A1, Symbols AS A2
WHERE @target LIKE A1.token + '.' + A2.token
AND LEN(@target) < char_cnt - 1;

The test for length can be improved by looking at substrings in front
of and behind the period and the suffix code, but you get the idea.

The | can be done same way

CREATE TABLE Suffixes (d VARCHAR(4) NOT NULL);
INSERT INTO Suffixes VALUES ('aero');
-- and so forth for 'arpa', 'biz', 'cat', 'com',
'coop', 'edu', 'gov', 'info', 'int', 'jobs',
'mil', 'mobi', 'museum', 'name', 'net', 'org',
'pro', 'travel'

Alternatively, use a chain of OR-equality tests on the suffix via a
substring. I would use a table of valid country codes in place of
[a-z]{2}

The final tuning trick is the CASE expression from Hell. Using the
order of execution of the WHEN clauses, test for common simple errors
and finish the monster like predicate.

SELECT ..
FROM ..
WHERE CASE
WHEN <@ count is not oneTHEN 'F'
WHEN <illegal char in targetTHEN 'F'
WHEN <invalid suffixTHEN 'F'
Etc.
WHEN <LIKE predicate from HellTHEN 'T'
ELSE 'F' END = 'T';

Is this ugly? Well, I have seen more tables with longer rows in a lot
of code posted here.

Is it easy to understand, if you read SQL? I think so.

Do I like it? No, I want my ANSI Standard SIMILAR TO predicate after
over a decade of waiting. The LIKE predicate is very weak and we are
using more complex encodings, like email addresses, in modern data. We
got OUTER JOINs, so I live in hope (but they did take MERGE out of 2005
after having it in a Beta).

I think this demonstates that it is possible and could be put into a
Standard SQL procedure.

What do you do when the DB you are DBA-ing has the grep you like in
your favorite CLR language, the grep fred likes in his favorite CLR
language, and so forth for a dozen different developers? What if one
grep is not like another grep? Look at how many versions of grep we
have in Unix alone.

The DBA group really cannot learn those languages and stay current in
them. We now need two application programmers per language (always
hire in pairs for the same reasons you make back ups). So we pick a
limited set of CLR languages for the DB shop. Sure hope that the
application users agree witht he choice. If they are on VB and we are
on C#, that +1, -1 thing in Booleans could be a problem ..

**--CELKO--** · Nov 1 '06, 11:15 PM

Re: Sequential Number in an Update

>Here you're just plain wrong Joe. <<

Mea culpa, mea culpa, mea maxima culpa!

**Tony Rogerson** · Nov 2 '06, 06:45 AM

Re: Sequential Number in an Update

And just how do you get that working in a CONSTRAINT .... CHECK( .... ) ?
You can't without using CLR - try again.

What do you do when the DB you are DBA-ing has the grep you like in
your favorite CLR language, the grep fred likes in his favorite CLR
language, and so forth for a dozen different developers? What if one
grep is not like another grep? Look at how many versions of grep we
have in Unix alone.

We use RegularExpressi ons in .NET, its one simple class and about 3 lines of
code - you pick a regular expression from the standard list of expressions
from google - simple.

That's how developers work, if you'd bother to go out and get some real
industrial experience you'd know that - being class room bound and
speculating how the current DBA shop works is very dangerous and cowboyish
at best.

Most code is written in C# or VB.NET and to my knowledge only a couple of
langauges will work in the CLR because of some restrictions required by SQL
Server.

>
The DBA group really cannot learn those languages and stay current in
them. We now need two application programmers per language (always
hire in pairs for the same reasons you make back ups). So we pick a
limited set of CLR languages for the DB shop. Sure hope that the
application users agree witht he choice. If they are on VB and we are
on C#, that +1, -1 thing in Booleans could be a problem ..
>

Wrong, wrong wrong; if you actually would bother to read and learn CLR
fundementals and how they work with SQL Server you'd know how the data types
really work and that the SQL types are exposed in the CLR.

SQL Server DBA's are required to know VB.NET becuase of Integration Services
"period".

C# and VB.NET look almost identical when you are developing, the syntax
varies but the logic and class names are all the same.

Will you PLEASE go and do some bloody research and stop guessing,
speculating on how things work.

You've STILL not shown how YOU would write this into a CHECK CONSTRAINT
without using CLR.

--
Tony Rogerson
SQL Server MVP
http://sqlblogcasts.com/blogs/tonyrogerson - technical commentary from a SQL
Server Consultant
http://sqlserverfaq.com - free video tutorials

"--CELKO--" <jcelko212@eart hlink.netwrote in message
news:1162422718 .509968.121450@ f16g2000cwb.goo glegroups.com.. .

>>So I'll join in the challenge - how would your constraint look like if
>>it _has_ to work on MS SQL Server 2005? <<

>
This grep checks email format against RFC 3696 and was written by David
Thompson
>
[a-z0-9!$'*+\-_]+(\.[a-z0-9!$'*+\-_]+)*
@([a-z0-9]+(-+[a-z0-9]+)*\.)+
([a-z]{2}aero|arpa|bi z|cat|com|coop| edu|gov|info|in t|jobs|mil|mobi |museum|name|ne t|org|pro|trave l)
>
Let's translate it. The SQL Server LIKE predicate is missing the
Kleene plus + and Kleene star * repetition tokens. We can fake those
with a table of repetitions and a JOIN. The * table has an empty
string in it in addition to repetitions of the character search
pattern.
>
CREATE TABLE Symbols
(char_cnt INTEGER NOT NULL PRIMARY KEY,
token VARCHAR(255) NOT NULL);
INSERT INTO Symbols VALUES (1, '[a-z0-9!$'*+-_]');
INSERT INTO Symbols VALUES (2, '[a-z0-9!$'*+-_][a-z0-9!$'*+-_]');
Etc.
>
So to get a pattern for a string with a dot in it, you would use
>
SELECT 'Valid'
FROM Symbols AS A1, Symbols AS A2
WHERE @target LIKE A1.token + '.' + A2.token
AND LEN(@target) < char_cnt - 1;
>
The test for length can be improved by looking at substrings in front
of and behind the period and the suffix code, but you get the idea.
>
The | can be done same way
>
CREATE TABLE Suffixes (d VARCHAR(4) NOT NULL);
INSERT INTO Suffixes VALUES ('aero');
-- and so forth for 'arpa', 'biz', 'cat', 'com',
'coop', 'edu', 'gov', 'info', 'int', 'jobs',
'mil', 'mobi', 'museum', 'name', 'net', 'org',
'pro', 'travel'
>
Alternatively, use a chain of OR-equality tests on the suffix via a
substring. I would use a table of valid country codes in place of
[a-z]{2}
>
The final tuning trick is the CASE expression from Hell. Using the
order of execution of the WHEN clauses, test for common simple errors
and finish the monster like predicate.
>
SELECT ..
FROM ..
WHERE CASE
WHEN <@ count is not oneTHEN 'F'
WHEN <illegal char in targetTHEN 'F'
WHEN <invalid suffixTHEN 'F'
Etc.
WHEN <LIKE predicate from HellTHEN 'T'
ELSE 'F' END = 'T';
>
Is this ugly? Well, I have seen more tables with longer rows in a lot
of code posted here.
>
Is it easy to understand, if you read SQL? I think so.
>
Do I like it? No, I want my ANSI Standard SIMILAR TO predicate after
over a decade of waiting. The LIKE predicate is very weak and we are
using more complex encodings, like email addresses, in modern data. We
got OUTER JOINs, so I live in hope (but they did take MERGE out of 2005
after having it in a Beta).
>
I think this demonstates that it is possible and could be put into a
Standard SQL procedure.
>
What do you do when the DB you are DBA-ing has the grep you like in
your favorite CLR language, the grep fred likes in his favorite CLR
language, and so forth for a dozen different developers? What if one
grep is not like another grep? Look at how many versions of grep we
have in Unix alone.
>
The DBA group really cannot learn those languages and stay current in
them. We now need two application programmers per language (always
hire in pairs for the same reasons you make back ups). So we pick a
limited set of CLR languages for the DB shop. Sure hope that the
application users agree witht he choice. If they are on VB and we are
on C#, that +1, -1 thing in Booleans could be a problem ..
>

**Damien** · Nov 2 '06, 08:55 AM

Re: Sequential Number in an Update

--CELKO-- wrote:
[snip]

>
The DBA group really cannot learn those languages and stay current in
them. We now need two application programmers per language (always
hire in pairs for the same reasons you make back ups). So we pick a
limited set of CLR languages for the DB shop. Sure hope that the
application users agree witht he choice. If they are on VB and we are
on C#, that +1, -1 thing in Booleans could be a problem ..

Please Joe,

spend a little time learning about the *COMMON* language runtime, which
also involves the *COMMON* type system - on which both VB.Net and C#
are built. That's right - for as long as VB.Net and C# have existed,
they've had no discrepancies on booleans (or any other built in types),
because they're using exactly the same implementation.

You've had a number of years to learn this Joe, please try to take it
in.

By the time the code is added into the database, it doesn't matter
whether it was written in VB.Net or C# (or any other CLR language).
Provided each language is equally powerful, the same algorithm will
have been compiled to the same IL code. And if you're not comfortable
with the language in which a piece of code is written (though any
decent .Net programmer should, IMO, at least be able to read/maintain
in either VB.Net or C#), there are plenty of tools to convert from one
language to another.

Damien

**Hugo Kornelis** · Nov 2 '06, 11:35 PM

Re: Sequential Number in an Update

On 1 Nov 2006 15:11:58 -0800, --CELKO-- wrote:

>>So I'll join in the challenge - how would your constraint look like if it _has_ to work on MS SQL Server 2005? <<

>
>This grep checks email format against RFC 3696 and was written by David
>Thompson
>
>[a-z0-9!$'*+\-_]+(\.[a-z0-9!$'*+\-_]+)*
>@([a-z0-9]+(-+[a-z0-9]+)*\.)+
>([a-z]{2}aero|arpa|bi z|cat|com|coop| edu|gov|info|in t|jobs|mil|mobi |museum|name|ne t|org|pro|trave l)

Hi Joe,

The requirement was actually to check a URL, not an email address. But
I'd be just as happy with a working CHECK constraint for well-formed
email address - the actual issue here is that I believe that there ARE
things that are better handled in the CLR than in ANSI SQL.

Oh, and if you ever have to build a REAL email address validator, please
find a better one than the one above - David Thompson appears to be
unaware that people outside of the US also have mail now, since he
disallows all non-US top-level domain names.

>Let's translate it. The SQL Server LIKE predicate is missing the
>Kleene plus + and Kleene star * repetition tokens. We can fake those
>with a table of repetitions and a JOIN. The * table has an empty
>string in it in addition to repetitions of the character search
>pattern.
>
>CREATE TABLE Symbols
>(char_cnt INTEGER NOT NULL PRIMARY KEY,
token VARCHAR(255) NOT NULL);
>INSERT INTO Symbols VALUES (1, '[a-z0-9!$'*+-_]');
>INSERT INTO Symbols VALUES (2, '[a-z0-9!$'*+-_][a-z0-9!$'*+-_]');
>Etc.
>
>So to get a pattern for a string with a dot in it, you would use
>
>SELECT 'Valid'
FROM Symbols AS A1, Symbols AS A2
WHERE @target LIKE A1.token + '.' + A2.token
AND LEN(@target) < char_cnt - 1;

So how would you incorporate this logic in a SQL Server CHECK constraint
which (as you undoubtebly know) is not allowed to access other tables or
even other rows from the same tables?

(snip)

>The final tuning trick is the CASE expression from Hell. Using the
>order of execution of the WHEN clauses, test for common simple errors
>and finish the monster like predicate.
>
>SELECT ..
FROM ..
WHERE CASE
WHEN <@ count is not oneTHEN 'F'
WHEN <illegal char in targetTHEN 'F'
WHEN <invalid suffixTHEN 'F'
Etc.
WHEN <LIKE predicate from HellTHEN 'T'
ELSE 'F' END = 'T';
>
>Is this ugly? Well, I have seen more tables with longer rows in a lot
>of code posted here.

I don't care if it's ugly - it's not a constraint! How is this SELECT
going to prevent someone entering data that is not well-formed?

>Is it easy to understand, if you read SQL? I think so.

It will probably no lonnger be easy to understand once it's finished (if
you are able to get it working in a CHECK constraint at all!!). The
first parts (<@ count is not oneand <illegal char in target>) will
probably be quite straightforward , The next (<invalid suffix>) less so.
And the <LIKE predicate from Hell>? Well, if that refers to the SELECT
'Valid' (...) query from the start of your post, it's definitely not
easy. I consider myself to be quite good at understanding SQL, and I had
to look hard before I understood the trick. A less experienced SQL coder
would probably not get it.

(snip)

>I think this demonstates that it is possible and could be put into a
>Standard SQL procedure.

If standard SQL allows a subquery in a CHECK constraint, then you have
indeed demonstrated that it is possible in standard SQL. But the
question was to implement the constraint in SQL Server, using only
standard SQL (or rather, the subset of standard SQL that SQL Server
implements). I maintain that it is not possible.

>What do you do when the DB you are DBA-ing has the grep you like in
>your favorite CLR language, the grep fred likes in his favorite CLR
>language, and so forth for a dozen different developers? What if one
>grep is not like another grep? Look at how many versions of grep we
>have in Unix alone.

In such a case, I'd do the same as what I'd do if the DB I am DBA-ing
has table and column names in PascalCase, Fred likes to code in
camelCase and so forth for a dozen different developers. I would force
all developers to follow one standard (set by me), making sure that
management backs me on this. Developers who don't follow the standard
will very soon realise that I am the one who decides what does and what
doesn't get deployed on the production server, and that following the
standards is a good way to improve the chance of getting the code
deployed.

With CLR, it's the same. I get to review the code before I install the
assembly on the production server, so it better be in a language I am
able to read. And if an email or URL validator is already installed,
there's no chance that any second such validator will be added - if the
first doesn't work satisfactory, just file a change request but don't
duplicate code.

I really don't see the big deal. If you put the code to validate in a
SQL stored procedure or user-defined function, you'd also have to deal
with various developers who might code competing versions of the same
code.

--
Hugo Kornelis, SQL Server MVP

Sequential Number in an Update

Comment

Comment

Comment

Comment

Comment