validate a text input field. (again)

**Douglas Crockford** · Jul 20 '05, 12:34 PM

Re: validate a text input field. (again)

> I need to validate a text input field.
>
> I just want to say if user enters
>
> 93101 or 93102 or 93103 or 93105 or 93106 or 93107 or 93108 or 93109
> or 93110 or 93111 or 93116 or 93117 or 93118 or 93120 or 93121 or
> 93130 or 93140 or 93150 or 93160 or 93190 or 93199 or 93199 or 93401
> or 93402 or 93403 or 93405 or 93406 or 93407 or 93408 or 93409 or
> 93410 or 93412
>
> he can not submit the form. (because we do not service that area)
>
> Any help would be greatly appreciated.

var zone = {
'93101': 1,
'93102': 1,
'93103': 1,
'93105': 1,
'93106': 1,
'93107': 1,
'93108': 1,
'93109': 1,
'93110': 1,
'93111': 1,
'93116': 1,
'93117': 1,
'93118': 1,
'93120': 1,
'93121': 1,
'93130': 1,
'93140': 1,
'93150': 1,
'93160': 1,
'93190': 1,
'93199': 1,
'93401': 1,
'93402': 1,
'93403': 1,
'93405': 1,
'93406': 1,
'93407': 1,
'93408': 1,
'93409': 1,
'93410': 1,
'93412': 1};

if (zone[input] == 1) {
// reject
} else {
// accept
}
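
For what it's worth, here is a minimal sketch of how the lookup above might be wrapped for use from a form handler. The helper name `isBlockedZip`, the whitespace trimming, and the abbreviated table are my assumptions, not part of the original suggestion. The own-property check is only there so that names inherited from Object.prototype can never be mistaken for table entries.

```javascript
// Abbreviated table for illustration; the full list is in the post above.
var zone = {
  '93101': 1,
  '93102': 1,
  '93412': 1
};

// Hypothetical helper: true when the entered code is in the no-service table.
function isBlockedZip(input) {
  // Strip surrounding whitespace so " 93101 " is treated like "93101".
  var code = String(input).replace(/^\s+|\s+$/g, '');
  // Own-property check, so inherited names such as "constructor" never count.
  return Object.prototype.hasOwnProperty.call(zone, code) && zone[code] === 1;
}
```

An onsubmit handler could then return `!isBlockedZip(field.value)`, where `field` stands for whatever input element the form actually uses.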

JSON

http://www.JSON.org/

**Thomas 'PointedEars' Lahn** · Jul 20 '05, 12:34 PM

Douglas Crockford wrote:

>> I just want to say if user enters
>>
>> 93101 or 93102 or 93103 or 93105 or 93106 or 93107 or 93108 or 93109
>> or 93110 or 93111 or 93116 or 93117 or 93118 or 93120 or 93121 or
>> 93130 or 93140 or 93150 or 93160 or 93190 or 93199 or 93199 or 93401
>> or 93402 or 93403 or 93405 or 93406 or 93407 or 93408 or 93409 or
>> 93410 or 93412
>>
>> he can not submit the form. (because we do not service that area)
>>
>> Any help would be greatly appreciated.
>
> [Lengthy object definition]
>
> if (zone[input] == 1) {
> // reject
> } else {
> // accept
> }

OMG. Have you just forgotten that there are RegExps?

function checkMe(o)
{

return(!/^93(1(0[1-35-9]|1[016-8]|2[01]|[3-69]0|99)|40[1-35-9]|41[02])$/.test(o.value)) ;
}

<form ... onsubmit="return checkMe(this.elements['bla'])">
<input name="bla">
</form>

PointedEars

**Lasse Reichstein Nielsen** · Jul 20 '05, 12:35 PM

Thomas 'PointedEars' Lahn <PointedEars@web.de> writes:

> OMG. Have you just forgotten that there are RegExps?

Most likely not. He gave a generic way to test for a finite number of
strings. It works whether there is structure to the strings or not.

Regexps take more work to make, and are harder to read. And *much*
harder to extend with new numbers, if it becomes necessary.

> return(!/^93(1(0[1-35-9]|1[016-8]|2[01]|[3-69]0|99)|40[1-35-9]|41[02])$/.test(o.value)) ;
> }

Are you sure that your regexp matches exactly the correct strings? :)
(It probably does, but comparing RegExps to string is ExpSpace complete
in general, so very hard to do).
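
For this particular case the question can be settled by brute force, since every alternative in the anchored RegExp is exactly five digits long. The sketch below (helper names `pad5` and `matchedCodes` are mine, purely for illustration) runs the RegExp over every five-digit string and compares what it accepts with the posted list, duplicate removed. It says nothing about the general problem, only about this finite one.

```javascript
// The posted list, with the duplicate 93199 removed.
var expected = [
  '93101', '93102', '93103', '93105', '93106', '93107', '93108', '93109',
  '93110', '93111', '93116', '93117', '93118', '93120', '93121', '93130',
  '93140', '93150', '93160', '93190', '93199', '93401', '93402', '93403',
  '93405', '93406', '93407', '93408', '93409', '93410', '93412'
];

var re = /^93(1(0[1-35-9]|1[016-8]|2[01]|[3-69]0|99)|40[1-35-9]|41[02])$/;

// Left-pad a number with zeros to five digits.
function pad5(n) {
  var s = String(n);
  while (s.length < 5) { s = '0' + s; }
  return s;
}

// Collect every five-digit string the given RegExp accepts, in ascending order.
function matchedCodes(regexp) {
  var out = [];
  for (var n = 0; n <= 99999; n++) {
    var s = pad5(n);
    if (regexp.test(s)) { out.push(s); }
  }
  return out;
}

var matched = matchedCodes(re);
// Equal-length digit strings sort the same way lexically and numerically.
var same = matched.join(',') === expected.slice().sort().join(',');
```

If `same` comes out false, comparing `matched` with `expected` entry by entry shows which codes were missed or wrongly accepted.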

/L
--
Lasse Reichstein Nielsen - lrn@hotpop.com
DHTML Death Colors: <URL:http://www.infimum.dk/HTML/rasterTriangleDOM.html>
'Faith without judgement merely degrades the spirit divine.'

**@SM** · Jul 20 '05, 12:35 PM

Eddie wrote:

> I need to validate a text input field.
>
> I just want to say if user enters
>
> 93101 or 93102 or 93103 or 93105 or 93106 or 93107 or 93108 or 93109
> or 93110 or 93111 or 93116 or 93117 or 93118 or 93120 or 93121 or
> 93130 or 93140 or 93150 or 93160 or 93190 or 93199 or 93199 or 93401
> or 93402 or 93403 or 93405 or 93406 or 93407 or 93408 or 93409 or
> 93410 or 93412
>
> he can not submit the form. (because we do not service that area)
>
> Any help would be greatly appreciated.

An easy way to keep this list of numbers up to date could be:

<script type="text/javascript"></script>

<form action="code.php"
onsubmit="if (ok == 1) { return true; } alert('incorrect code!'); return false;">
Enter your code here:
<input type="text" onchange="validTextField(this.value);">
<input type="submit" value="Validate">
</form>

**Dr John Stockton** · Jul 20 '05, 12:35 PM

JRS: In article <6a779a0d.0312051638.3a3546f0@posting.google.com>, seen
in news:comp.lang.javascript, Eddie <sales@detector.com> posted at Fri,
5 Dec 2003 16:38:46 :-

>I just want to say if user enters
>
>93101 or 93102 or 93103 or 93105 or 93106 or 93107 or 93108 or 93109
>or 93110 or 93111 or 93116 or 93117 or 93118 or 93120 or 93121 or
>93130 or 93140 or 93150 or 93160 or 93190 or 93199 or 93199 or 93401
>or 93402 or 93403 or 93405 or 93406 or 93407 or 93408 or 93409 or
>93410 or 93412
>
>he can not submit the form. (because we do not service that area)

It seems likely, from the above, that everything outside 93xxx will
remain serviceable; OTOH, the list may change.

To save repetitive typing and gain run-time efficiency, one can first
test for the 93; after that, it is well to minimise the size of the
code. Consider, but with the full test list,

S = '93103'

OK = S.substring(0, 2) != "93" ||
'101 102 103 105 106 107 108 109 110'.indexOf(S.substring(2))<0
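
A sketch of the same idea with the full list, under some assumptions of mine: the name `isServiceable` is hypothetical, the suffixes are wrapped in spaces so that a fragment such as "01 1" cannot match across two entries, and anything that is not "93" plus exactly three digits is treated as not blocked (checking that the input is well formed would be a separate test).

```javascript
// Last three digits of every blocked 93xxx code, space-delimited on both sides.
var BLOCKED_SUFFIXES = ' 101 102 103 105 106 107 108 109 110 111 116 117 118' +
  ' 120 121 130 140 150 160 190 199' +
  ' 401 402 403 405 406 407 408 409 410 412 ';

// Hypothetical helper: true when the code is NOT on the blocked list.
function isServiceable(s) {
  // Cheap first test: everything outside 93xxx is accepted at once.
  if (s.substring(0, 2) != '93') { return true; }
  var rest = s.substring(2);
  // Only a full three-digit suffix is looked up; other shapes are not blocked here.
  if (!/^\d{3}$/.test(rest)) { return true; }
  return BLOCKED_SUFFIXES.indexOf(' ' + rest + ' ') < 0;
}
```

Extending the list then means adding one more three-digit suffix to the string.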

If those are postal codes, what do you do if someone enters "SW1A 1AA" ?

--
© John Stockton, Surrey, UK. ?@merlyn.demon.co.uk Turnpike v4.00 IE 4 ©
<URL:http://jibbering.com/faq/> Jim Ley's FAQ for news:comp.lang.javascript
<URL:http://www.merlyn.demon.co.uk/js-index.htm> JS maths, dates, sources.
<URL:http://www.merlyn.demon.co.uk/> TP/BP/Delphi/JS/&c., FAQ topics, links.

**Thomas 'PointedEars' Lahn** · Jul 20 '05, 12:36 PM

Lasse Reichstein Nielsen wrote:

> Thomas 'PointedEars' Lahn <PointedEars@web.de> writes:
>> OMG. Have you just forgotten that there are RegExps?
>
> Most likely not. He gave a generic way to test for a finite number of
> strings. It works whether there is structure to the strings or not.

Undoubtedly. But his method consumes much more memory and computing
time than mine, no matter if the strings are structured or not. IOW:
Compared to my method, his is highly inefficient in *every* case.

> Regexps take more work to make, and are harder to read.

Not generally, no.

> And *much* harder to extend with new numbers, if it becomes necessary

No, see below.

>> return(!/^93(1(0[1-35-9]|1[016-8]|2[01]|[3-69]0|99)|40[1-35-9]|41[02])$/.test(o.value)) ;
>> }
>
> Are you sure that your regexp matches exactly the correct strings? :)

Pretty sure.

> (It probably does, but comparing RegExps to string is ExpSpace complete
                                                        ^^^^^^^^^^^^^^^^^
Define that.

> in general, so very hard to do).

Not at all. It is primarily a matter of structured building of the
RegExp, finding similarities first.

Look again at the numbers the RegExp should match. I removed the duplicate
93199 and grouped the numbers so one can clearly see what they have in common.

93101, 93102, 93103, 93105, 93106, 93107, 93108, 93109,
93110, 93111, 93116, 93117, 93118,
93120, 93121,
93130, 93140, 93150, 93160, 93190,
93199,

93401, 93402, 93403, 93405, 93406, 93407, 93408, 93409
93410, 93412

Obviously all numbers begin with 93:

/^93/

There are numbers continuing with 1 and with 4:

/^93(1|4)/

Numbers continuing with 1 continue with either 0 to 6, or 9:

/^93(1(0|1|2|3|4|5|6|9)|4)/

Numbers continuing from there with 0 continue with digits from 1 to 9,
except 4:

/^93(1(0[1-35-9]|1|2|3|4|5|6|9)|4)/

Numbers continuing from there with 1 continue with 0, 1, and 6 to 8:

/^93(1(0[1-35-9]|1[016-8]|2|3|4|5|6|9)|4)/

Numbers continuing from there with 2 continue with either 0 or 1:

/^93(1(0[1-35-9]|1[016-8]|2[01]|3|4|5|6|9)|4)/

Numbers continuing from there with 3 to 6 and 9 continue with 0:

/^93(1(0[1-35-9]|1[016-8]|2[01]|[3-69]0)|4)/

If the fourth digit is 9, a 9 can also follow:

/^93(1(0[1-35-9]|1[016-8]|2[01]|[3-69]0|99)|4)/

(One could have also grouped 93190 and 93199 together:
...|[3-6]0|9[09])...)

Numbers having a 4 as third digit continue with either 0 or 1:

/^93(1(0[1-35-9]|1[016-8]|2[01]|[3-69]0|99)|4[01])/

Because the fourth digit is followed by different sets of digits we write

/^93(1(0[1-35-9]|1[016-8]|2[01]|[3-69]0|99)|4(0|1))/

instead.

If the third digit is 4 and the fourth digit is 0, digits from 1 to 3
and 5 to 9 may follow:

/^93(1(0[1-35-9]|1[016-8]|2[01]|[3-69]0|99)|4(0[1-35-9]|1))/

If the third digit is 4 and the fourth digit is 1, the fifth may be
only 0 and 2:

/^93(1(0[1-35-9]|1[016-8]|2[01]|[3-69]0|99)|4(0[1-35-9]|1[02]))/

Since we match whole numbers, we finally add the end-of-text meta character:

/^93(1(0[1-35-9]|1[016-8]|2[01]|[3-69]0|99)|4(0[1-35-9]|1[02]))$/

Now compare that to my RegExp, which was built (but only in my head)
using the same procedure:

/^93(1(0[1-35-9]|1[016-8]|2[01]|[3-69]0|99)|40[1-35-9]|41[02])$/

The only difference is that I wrote `40[...]|41[...]' instead of
`4(0[...]|1[...])', which is semantically equivalent, though.

You will not tell me that the above was hard work, will you?

Because the RegExp was built *this* way, it is easy as well to find out
the strings it will match, going from left to right, creating a branch
in a build tree every time we find an alternative (including sets of
characters):

93
  931
    9310
      93101
      93102
      93103
      93105
      93106
      93107
      93108
      93109
    9311
      93110
      93111
      93116
      93117
      93118
    9312
      93120
      93121
    9313
      93130
    9314
      93140
    9315
      93150
    9316
      93160
    9319
      93190
      93199
  934
    9340
      93401
      93402
      93403
      93405
      93406
      93407
      93408
      93409
    9341
      93410
      93412

We take only the leaves of the build tree:

93101
93102
93103
93105
93106
93107
93108
93109
93110
93111
93116
93117
93118
93120
93121
93130
93140
93150
93160
93190
93199
93401
93402
93403
93405
93406
93407
93408
93409
93410
93412

Group them:

93101, 93102, 93103, 93105, 93106, 93107, 93108, 93109
93110, 93111, 93116, 93117, 93118,
93120, 93121
93130, 93140, 93150, 93160, 93190,
93199,

93401, 93402, 93403, 93405, 93406, 93407, 93408, 93409
93410, 93412

And compare with what was provided (already grouped here, with the
dupes removed):

93101, 93102, 93103, 93105, 93106, 93107, 93108, 93109,
93110, 93111, 93116, 93117, 93118,
93120, 93121,
93130, 93140, 93150, 93160, 93190,
93199,

93401, 93402, 93403, 93405, 93406, 93407, 93408, 93409
93410, 93412

q.e.d.

We have only five-digit numbers with few linear exceptions here, one
should manage it to see that the above RegExp matches without writing
the matches down, especially if one has built the RegExp by themselves
as described above.

If reading the entire RegExp is still too difficult, one can also split
it into several RegExps (say, one for each third or fourth digit) and
combine the tests with `&&'.

So new numbers are not a problem at all. If in doubt, one can simply
add another alternative: If 93429 should be forbidden, too, the RegExp
can be simply changed to

/^(93(1(0[1-35-9]|1[016-8]|2[01]|[3-69]0|99)|4(0[1-35-9]|1[02]))|93429)$/
  ^                                                             ^^^^^^^

which, of course, could (later) be optimized to

/^93(1(0[1-35-9]|1[016-8]|2[01]|[3-69]0|99)|4(0[1-35-9]|1[02]|29))$/

An additional test may be as well combined with `&&' without wasting too
much computing time.
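
A rough sketch of the `&&' variant, assuming the form may be submitted only when no test finds a forbidden number. The names `blockedBase`, `blockedExtra` and `maySubmit` are made up for illustration.

```javascript
// The RegExp built step by step above, left untouched.
var blockedBase = /^93(1(0[1-35-9]|1[016-8]|2[01]|[3-69]0|99)|4(0[1-35-9]|1[02]))$/;

// A later addition kept as its own small test instead of editing the RegExp.
var blockedExtra = /^93429$/;

// Hypothetical handler helper: true when the value is not forbidden by any test.
function maySubmit(value) {
  return !blockedBase.test(value) && !blockedExtra.test(value);
}
```

The second test only runs when the first one did not already reject the value, because `&&' short-circuits.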

PointedEars

**Lasse Reichstein Nielsen** · Jul 20 '05, 12:36 PM

Thomas 'PointedEars' Lahn <PointedEars@web.de> writes:

> Lasse Reichstein Nielsen wrote:

> But his method consumes much more memory and computing time than
> mine, no matter if the strings are structured or not. IOW: Compared
> to my method, his is highly inefficient in *every* case.

Have you tested it? It consumes much less computing time, and the
memory is constant. I think you underestimate the complexity of
interpreting a regular expression (or more likely: running the finite
state automaton it has been compiled into) on a string.

I created a test that made an array of 100000 random numbers in the
range 80000-99999. Then it tested both methods against that table, using
  result1[i] = !!table[data[i]];
and
  result2[i] = re.test(data[i]);
(with a base check of
  result0[i] = data[i],false;
to find the overhead of the other parts of the code not used in the
actual test).
The entire test is included below.

The results were (in milliseconds):

              base   table   regexp
IE 6:         1212    1733    23373
Opera 7.23:    601     681     2944
Moz FB 0.7:    471     540     2304

So, your method is, by far, less efficient than a table lookup, and
even in IE, which (IIRC) uses a linear lookup for object properties.

The only case where the table lookup loses is in size. It can be made
better by building the table dynamically:

var numbers = [101,102,103,105,106,107,108,109,110,111,116,117,118,120,
121,130,140,150,160,190,199,401,402,403,405,406,407,408,409,410,412];
var table = {};
for (var i = 0; i < numbers.length; i++) {table[93000+numbers[i]]=true;}

Still larger than a regular expression, but not significantly.

>> Regexps take more work to make, and are harder to read.
>
> Not generally, no.

Definitely yes.
I am very familiar with regular expressions, but I still have to think
to read and understand one. The table is obvious. And while the table
might take more space (that is a relevant parameter), it is easier to
write. Given the list of numbers, it won't take long in Emacs to turn
it into a table.

>> And *much* harder to extend with new numbers, if it becomes necessary
>
> No, see below.

To extend the regular expression, you have to either find the place in it
that requires changing, or rebuild it from scratch. In a (sorted) table,
you just have to find the correct place and add the line (or add it anywhere
if you don't sort the table).

It might not be a big difference, but it is definitely there.
Regular expressions require thought. The table can be automated.

>> (It probably does, but comparing RegExps to string is ExpSpace complete
>                                                        ^^^^^^^^^^^^^^^^^
> Define that.

It's a complexity class.

The *general* problem is this: given a regular expression and another
efficient description of a language (where language := set of strings,
not necessarily finite), decide whether the regular expression
recognizes exactly the strings of the language. In the worst case that
can require memory space that is exponential in the size of the regular
expression. I.e., it's bloody slow.

As a comparison, factorizing (large) numbers only requires polynomial
space and exponential time, and it's considered too inefficient to
use in practice.

> Not at all. It is primarily a matter of structured building of the
> RegExp, finding similarities first.

It takes thought and familiarity with regular expressions. You can do
it. I can do it. Many other people here can too, but there are lots of
people writing Javascript for web pages who consider regular expressions
black magic, and just use what they are given. If one of them is going
to maintain the page with your regular expression, he'll be back here
to ask for help in changing it when the numbers change.

> You will not tell me that the above was hard work, will you?

Hard, no. Work, yes. Building the table was *no* work at all.

> Because the RegExp was built *this* way, it is easy as well to find out
> the strings it will match,

This regular expression is also special in that it only recognizes a
finite number of strings. That makes it easier to handle than ones
with "*" or "+" in them. So, the general hardness of the problem
doesn't necessarily apply to this case.

> We have only five-digit numbers with few linear exceptions here, one
> should manage it to see that the above RegExp matches without writing
> the matches down, especially if one has built the RegExp by themselves
> as described above.

Yes. It's (fairly) easy.

> So new numbers are not a problem at all. If in doubt, one can simply
> add another alternative: If 93429 should be forbidden, too, the RegExp
> can be simply changed to
>
> /^(93(1(0[1-35-9]|1[016-8]|2[01]|[3-69]0|99)|4(0[1-35-9]|1[02]))|93429)$/
>   ^                                                             ^^^^^^^
>
> which, of course, could (later) be optimized to
>
> /^93(1(0[1-35-9]|1[016-8]|2[01]|[3-69]0|99)|4(0[1-35-9]|1[02]|29))$/

Yes. It's a relatively simple case.
But regular expressions are not as obvious to a lot of other people.

The test:
---
//<script>
function test(){
  var table = {
    93101 : true, 93102 : true, 93103 : true, 93105 : true, 93106 : true,
    93107 : true, 93108 : true, 93109 : true, 93110 : true, 93111 : true,
    93116 : true, 93117 : true, 93118 : true, 93120 : true, 93121 : true,
    93130 : true, 93140 : true, 93150 : true, 93160 : true, 93190 : true,
    93199 : true, 93401 : true, 93402 : true, 93403 : true, 93405 : true,
    93406 : true, 93407 : true, 93408 : true, 93409 : true, 93410 : true,
    93412 : true
  };
  var re = /^93(1(0[1-35-9]|1[016-8]|2[01]|[3-69]0|99)|40[1-35-9]|41[02])$/;
  var testsize = 100000;
  var testdata = [];
  for (var i = 0; i < testsize; i++) {
    testdata[i] = Math.floor(Math.random()*20000+80000);
  }

  // Baseline: loop and assignment overhead only.
  var result0 = new Array(testsize);
  var d1 = new Date();
  for (var i = 0; i < testsize; i++) {
    result0[i] = testdata[i],false;
  }
  var d2 = new Date();
  var timebase = d2-d1;

  // Table lookup.
  var result1 = new Array(testsize);
  var d1 = new Date();
  for (var i = 0; i < testsize; i++) {
    result1[i] = !!table[testdata[i]];
  }
  var d2 = new Date();
  var timetable = d2-d1;

  // Regular expression.
  var result2 = new Array(testsize);
  var d1 = new Date();
  for (var i = 0; i < testsize; i++) {
    result2[i] = re.test(testdata[i]);
  }
  var d2 = new Date();
  var timere = d2-d1;

  alert([timebase,timetable,timere]);
}
test();
//</script>
---

/L

**Thomas 'PointedEars' Lahn** · Jul 20 '05, 12:36 PM

Lasse Reichstein Nielsen wrote:

> Thomas 'PointedEars' Lahn <PointedEars@web.de> writes:
>> Lasse Reichstein Nielsen wrote:
>> But his method consumes much more memory and computing time than
>> mine, no matter if the strings are structured or not. IOW: Compared
>> to my method, his is highly inefficient in *every* case.
>
> Have you tested it?

I have not, with JavaScript, and I must admit that `*every* case'
was a bit exaggerated.

> It consumes much less computing time,

Well, apparently that depends on the implementation and on the
complexity of the RegExp. AFAIS Mozilla/5.0's engine is on the
average much faster on RegExp than other engines of ECMAScript
implementations.

> and the memory is constant. I think you underestimate the complexity of
> interpreting a regular expression (or more likely: running the finite
> state automaton it has been compiled into) on a string.
> [...]
> So, your method is, by far, less efficient than a table lookup, and
> even in IE, which (IIRC) uses a linear lookup for object properties.

It depends on how you build the RegExp, i.e. on how it is composed.

What you overlook here is that I used a RegExp optimized for
length. Of course, the matching can be also done with the longer
/^(93101|93102|93103|...)$/, respectively, where the RegExp *wins*
in matters of speed, size and amount of maintenance effort.

>>> Regexps take more work to make, and are harder to read.
>>
>> Not generally, no.
>
> Definitely yes.

Wrong, see above and below.

> I am very familiar with regular expressions, but I still have to think
> to read and understand one. The table is obvious.

A list of simple-formed alternatives separated by `|' is obvious, too,
if not even more than a table solution.

> And while the table might take more space (that is a relevant parameter),
> it is easier to write.

/(foo|bar)/ *is* easy to write.

> Given the list of numbers, it won't take long in Emacs to turn
> it into a table.

Although I'd prefer `vi', same goes for RegExps.

>>> And *much* harder to extend with new numbers, if it becomes necessary
>>
>> No, see below.
>
> To extend the regular expression, you have to either find the place in it
> that requires changing, or rebuild it from scratch.

You do not have to. As I wrote, when in doubt, simply add another
alternative expression at the lowest subexpression level. Since it
is not evaluated if the previous does match, it then only takes a
little bit more of memory, not really of computing or maintenance
time.

> In a (sorted) table, you just have to find the correct place and add the
> line (or add it anywhere if you don't sort the table).

In a RegExp, you just have to find the place where the `(' and `)'
for alternatives must be placed and add another alternative. Or for
testing matches you simply AND-combine the previous test with another
one testing the new number expression (or a subexpression of it).

> It might not be a big difference, but it is definitely there.
> Regular expressions require thought. The table can be automated.

You can do that with RegExps, too. Using the RegExp(...) constructor
function and a string argument, you can even accomplish that with
JavaScript.
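
A minimal sketch of that automation, under assumptions of mine: the list holds plain digit strings (nothing that would need escaping in a RegExp), the function name `buildBlockedRegExp` is hypothetical, and only a few of the codes are shown.

```javascript
// A few of the forbidden codes, kept as a plain list that is easy to edit.
var codes = ['93101', '93102', '93103', '93410', '93412'];

// Hypothetical helper: join the list into one anchored alternation.
// Assumes every entry consists of digits only, so no escaping is needed.
function buildBlockedRegExp(list) {
  return new RegExp('^(?:' + list.join('|') + ')$');
}

var blocked = buildBlockedRegExp(codes);

// Extending the set is then a matter of editing the list and rebuilding.
codes.push('93429');
var blockedExtended = buildBlockedRegExp(codes);
```

The generated RegExp is the long, unoptimized form; nothing here shortens it.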

>>> (It probably does, but comparing RegExps to string is ExpSpace complete
>>                                                        ^^^^^^^^^^^^^^^^^
>> Define that.
>
> It's a complexity class.
> [...]

Thanks.

>> Not at all. It is primarily a matter of structured building of the
>> RegExp, finding similarities first.
>
> It takes thought and familiarity with regular expressions.

It takes thought and at least average familiarity with JavaScript to
create an object/array (literal) from a given set of strings. Your
turn.

>> You will not tell me that the above was hard work, will you?
>
> [...] Building the table was *no* work at all.

I seriously doubt that ;-)

>> Because the RegExp was built *this* way, it is easy as well to find out
>> the strings it will match,
>
> This regular expression is also special in that it only recognizes a
> finite number of strings. That makes it easier to handle than ones
> with "*" or "+" in them. So, the general hardness of the problem
> doesn't necessarily apply to this case.

You can easily add alternatives or additional tests no matter how the
original RegExp was composed.

>> So new numbers are not a problem at all. If in doubt, one can simply
>> add another alternative: If 93429 should be forbidden, too, the RegExp
>> can be simply changed to
>>
>> /^(93(1(0[1-35-9]|1[016-8]|2[01]|[3-69]0|99)|4(0[1-35-9]|1[02]))|93429)$/
>>   ^                                                             ^^^^^^^
>>
>> which, of course, could (later) be optimized to
>>
>> /^93(1(0[1-35-9]|1[016-8]|2[01]|[3-69]0|99)|4(0[1-35-9]|1[02]|29))$/
>
> Yes. It's a relatively simple case.
> But regular expressions are not as obvious to a lot of other people.

It is all about how to add another alternative. The optimization
for length (which turned out as the opposite regarding computing
speed) that I performed here is _not_ required.

> The test:
> ---
> //<script>

Why have you added the `//' *in* *front* of the `<script>' tag?

> [...]

Thanks. I tried that and got about the same as you did in the
mentioned UAs.

Now guess what changed when using number atoms as alternative
expressions: The RegExp solution then proved to be about 4 to 6
(according to repeated tests) times faster than the table solution,
but (surprisingly) only with Mozilla/5.0. (Seems that IE's and
Opera's RegExp engines need a little bit of tuning :))

For Regular Expressions are widely known as *the* efficient method for
matching strings, the opposite would have been very surprising to me.

Note:
Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0; Q312461) requires
the property identifiers here to be quoted as they do not conform to
valid identifiers in the version of JScript it supports by default.
For Mozilla also accepts string literals, without having the access
method to differ from numeric ones, one should always quote the
identifier if in doubt.

PointedEars

**Lasse Reichstein Nielsen** · Jul 20 '05, 12:36 PM

Thomas 'PointedEars' Lahn <PointedEars@web.de> writes:

> Lasse Reichstein Nielsen wrote:

>> It consumes much less computing time,
>
> Well, apparently that depends on the implementation and on the
> complexity of the RegExp.

Obviously. But I still haven't seen a single example where the RegExp
was even close to being more efficient. More like an order of magnitude
slower.

> AFAIS Mozilla/5.0's engine is on the average much faster on RegExp
> than other engines of ECMAScript implementations.

It seems so from my results (actually Opera is faster on RegExps), but
it is also faster at property lookup.

> It depends on how you build the RegExp, i.e. on how it is composed.

Probably. But not trivially.

> What you overlook here is that I used a RegExp optimized for
> length. Of course, the matching can be also done with the longer
> /^(93101|93102|93103|...)$/, respectively, where the RegExp *wins*
> in matters of speed, size and amount of maintenance effort.

Test it. I did, and it was even slower than the "size optimized"
version. The RegExp:
var re = /^(?:93101|93102|93103|93105|93106|93107|93108|93109|93110|93111|93116|93117|93118|93120|93121|93130|93140|93150|93160|93190|93199|93401|93402|93403|93405|93406|93407|93408|93409|93410|93412)$/;

The results:

              base   table   short regexp   long regexp
IE6           1032    1482          20980         27239
O7.23          570     591           1853          2103
Moz FB 0.7     431     489           2053          2434

> A list of simple-formed alternatives separated by `|' is obvious, too,
> if not even more than a table solution.

If you build the table from an array of names, the array is simpler.

> Although I'd prefer `vi', same goes for RegExps.

If you use simple regular expressions, yes.

>> In a (sorted) table, you just have to find the correct place and add the
>> line (or add it anywhere if you don't sort the table).
>
> In a RegExp, you just have to find the place where the `(' and `)'
> for alternatives must be placed and add another alternative.

I.e., same complexity.

> You can do that with RegExps, too. Using the RegExp(...) constructor
> function and a string argument, you can even accomplish that with
> JavaScript.

Correct.

> It takes thought and at least average familiarity with JavaScript to
> create an object/array (literal) from a given set of strings. Your
> turn.

Ok. Let's call it a draw. If we use simple "|"-separated regular
expressions, writing them is equally simple.

>> [...] Building the table was *no* work at all.
>
> I seriously doubt that ;-)

It took time, not work :)

> It is all about how to add another alternative. The optimization
> for length (which turned out as the opposite regarding computing
> speed) that I performed here is _not_ required.

I find that the "long" regExp is slower than the size-optimized
version in all my browsers.

>> The test:
>> ---
>> //<script>
>
> Why have you added the `//' *in* *front* of the `<script>' tag?

I used the same code and either evaluated it with "eval" or inserted
it into a new page. This way, it's legal either way :)

> Thanks. I tried that and got about the same as you did in the
> mentioned UAs.

> Now guess what changed when using number atoms as alternative
> expressions: The RegExp solution then proved to be about 4 to 6
> (according to repeated tests) times faster than the table solution,
> but (surprisingly) only with Mozilla/5.0. (Seems that IE's and
> Opera's RegExp engines need a little bit of tuning :))

I don't get that. In Mozilla FB 0.7, using the above "long" regular
expression and the original "size-optimized" regexp, I find that the long
one is slower (200 ms on 100000 runs, but slower).

> For Regular Expressions are widely known as *the* efficient method for
> matching strings, the opposite would have been very surprising to me.

They are very efficient for *complex*

> Note:
> Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0; Q312461) requires
> the property identifiers here to be quoted as they do not conform to
> valid identifiers in the version of JScript it supports by default.
> For Mozilla also accepts string literals, without having the access
> method to differ from numeric ones, one should always quote the
> identifier if in doubt.

That led me to one (stupid) mistake I made in my test. I created the
test data as numbers, not strings, which forced a toString conversion
in some cases.

I changed the test data to be strings, saving later toString
conversions. It helped the time for the regular expression tests, but
not enough to be faster than the table lookup. To make it compatible
with Netscape 4, I built the table from an array (of strings, not
numbers). I also built the long regular expression from the same
data, using RegExp("^("+tableData.join("|")+")$").
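
Roughly, the revised setup looks like the sketch below. This is a reconstruction from the description above, not the uploaded page: the helper `time` and the variable names other than `tableData` are assumptions, and calling each test through a function adds overhead the original inline loops did not have, so the absolute numbers are not comparable with the tables in this thread.

```javascript
var tableData = [
  '93101', '93102', '93103', '93105', '93106', '93107', '93108', '93109',
  '93110', '93111', '93116', '93117', '93118', '93120', '93121', '93130',
  '93140', '93150', '93160', '93190', '93199', '93401', '93402', '93403',
  '93405', '93406', '93407', '93408', '93409', '93410', '93412'
];

// Table and long RegExp are both generated from the same array of strings.
var table = {};
for (var i = 0; i < tableData.length; i++) { table[tableData[i]] = true; }
var longRe = new RegExp('^(' + tableData.join('|') + ')$');
var shortRe = /^93(1(0[1-35-9]|1[016-8]|2[01]|[3-69]0|99)|40[1-35-9]|41[02])$/;

// Test data as strings, so no number-to-string conversion happens while timing.
var testData = [];
for (var j = 0; j < 100000; j++) {
  testData[j] = String(Math.floor(Math.random() * 20000 + 80000));
}

// Hypothetical helper: run one lookup method over all test data and time it.
function time(fn) {
  var results = new Array(testData.length);
  var start = new Date().getTime();
  for (var k = 0; k < testData.length; k++) { results[k] = fn(testData[k]); }
  return { ms: new Date().getTime() - start, results: results };
}

var byTable = time(function (s) { return !!table[s]; });
var byShort = time(function (s) { return shortRe.test(s); });
var byLong = time(function (s) { return longRe.test(s); });

// Sanity check: all three methods should classify every input identically.
var disagreements = 0;
for (var m = 0; m < testData.length; m++) {
  if (byTable.results[m] !== byShort.results[m] ||
      byTable.results[m] !== byLong.results[m]) {
    disagreements++;
  }
}
```

The `ms` fields hold the three timings; they vary from run to run and from engine to engine, so only the relative order is worth looking at.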

New results:

              base   table   short re   long re
IE 6          1612    1983       2694      3525
O7.23          591     902       1572      1772
Moz            511     641       1051      1342
NS 4*          831    1713       1762      1572

Much better performance for regular expressions (due to less toString
conversion). Still slower than table look up (but not as much), and
long RE still slower than short. Except for Netscape 4, where the
long re is the fastest.

(Instead of posting the code again, I have uploaded it to
<URL:http://www.infimum.dk/privat/numberLookup.html>)

My conclusion stands: Regular expressions are not more efficient than
table lookup. They might be as simple to write, but then they are not
as efficient as they can be.

/L

**Dr John Stockton** · Jul 20 '05, 12:36 PM

JRS: In article <d6b0ofgw.fsf@hotpop.com>, seen in
news:comp.lang.javascript, Lasse Reichstein Nielsen <lrn@hotpop.com>
posted at Sun, 7 Dec 2003 15:43:59 :-
>
>Definitely yes.
>I am very familiar with regular expressions, but I still have to think
>to read and understand one. The table is obvious. And while the table
>might take more space (that is a relevant parameter), it is easier to
>write. Given the list of numbers, it won't take long in Emacs to turn
>it into a table.

Indeed. While all available ability can be used to generate the initial
code, one should allow for a possible future change, and the need to
implement it with inferior staff. Table-based methods are easy to read,
and fairly easy to make minor modifications to. Complex RegExps are
not, and would need extra bolt-on tests or complete redesign.

--
© John Stockton, Surrey, UK. ?@merlyn.demon.co.uk Turnpike v4.00 MIME. ©
<URL:http://www.merlyn.demon.co.uk/> TP/BP/Delphi/&c., FAQqy topics & links;
<URL:http://www.merlyn.demon.co.uk/clpb-faq.txt> RAH Prins : c.l.p.b mFAQ;
<URL:ftp://garbo.uwasa.fi/pc/link/tsfaqp.zip> Timo Salmi's Turbo Pascal FAQ.

**Grant Wagner** · Jul 20 '05, 12:41 PM

Thomas 'PointedEars' Lahn wrote:

> Lasse Reichstein Nielsen wrote:
>
> > Thomas 'PointedEars' Lahn <PointedEars@web.de> writes:
> >> OMG. Have you just forgot that there are RegExp?
> >
> > Most likely not. He gave a generic way to test for a finite number of
> > strings. It works whether there is structure to the strings or not.
>
> Undoubtedly. But his method consumes much more memory and computing
> time than mine, no matter if the strings are structured or not. IOW:
> Compared to my method, his is highly inefficient in *every* case.
>
> > Regexps take more work to make, and are harder to read.
>
> Not generally, no.
>
> > And *much* harder to extend with new numbers, if it becomes necessary
>
> No, see below.
>
> >> return(!/^93(1(0[1-35-9]|1[016-8]|2[01]|[3-69]0|99)|40[1-35-9]|41[02])$/.test(o.value));
> >> }
> >
> > Are you sure that your regexp matches exactly the correct strings? :)
>
> Pretty sure.
>
> > (It probably does, but comparing RegExps to string is ExpSpace complete
>                                                         ^^^^^^^^^^^^^^^^^
> Define that.
>
> > in general, so very hard to do).
>
> Not at all. It is primarily a matter of structured building of the
> RegExp, finding similarities first.
>

You say, "Not at all" then proceed with a hundred line explanation of how to compose a RegExp
to match his set of numbers.

> An additional test may be as well combined with `&&' without wasting too
> much computing time.
>
> PointedEars

How about this:

function isValidZIP(theZIP) {
  // A form field's value is a string and switch compares strictly,
  // so convert to a number before matching the numeric labels.
  switch (Number(theZIP)) {
    case 93101: case 93102: case 93103: case 93105:
    case 93106: case 93107: case 93108: case 93109:
    case 93110: case 93111: case 93116: case 93117:
    case 93118: case 93120: case 93121: case 93130:
    case 93140: case 93150: case 93160: case 93190:
    case 93199: case 93401: case 93402: case 93403:
    case 93405: case 93406: case 93407: case 93408:
    case 93409: case 93410: case 93412:
      // ZIP is invalid
      return false;
    default:
      // ZIP is valid
      return true;
  }
}

Self-documenting, you can see AT A GLANCE which ZIP codes are valid, and you can easily add or
remove additional ZIP codes without having to reconstruct your RegExp.

With your RegExp, you'd have to add a comment similar to:

/*
matches 93101, or 93102 or 93103, or 93105 ....
or 93412
*/

because when you come back to work on it in 6 months, you won't remember what it does, and
you'll have to waste time decoding it to figure out which ZIP codes are valid.
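
One thing to watch with a switch over numeric labels: the case labels are compared strictly, so the string that a text field delivers does not match the number 93101 unless it is converted first. A minimal sketch of the effect, using a hypothetical `isUnserviced` helper with only two of the codes:

```javascript
// Hypothetical helper: true when zip is one of two unserviced codes.
// The labels are numbers, and switch compares them with strict equality.
function isUnserviced(zip) {
  switch (zip) {
    case 93101:
    case 93102:
      return true;
    default:
      return false;
  }
}

// A number matches its label.
console.log(isUnserviced(93101));
// A string with the same digits does not, because no conversion happens.
console.log(isUnserviced("93101"));
// Converting the field value first restores the match.
console.log(isUnserviced(Number("93101")));
```

Using string labels such as `case "93101":` would work just as well, as long as the labels and the tested value have the same type.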

--
| Grant Wagner <gwagner@agricoreunited.com>

* Client-side Javascript and Netscape 4 DOM Reference available at:
* http://devedge.netscape.com/library/...ce/frames.html
* Internet Explorer DOM Reference available at:
* http://msdn.microsoft.com/workshop/a...ence_entry.asp
* Netscape 6/7 DOM Reference available at:
* http://www.mozilla.org/docs/dom/domref/
* Tips for upgrading JavaScript for Netscape 7 / Mozilla
* http://www.mozilla.org/docs/web-deve...upgrade_2.html

**Thomas 'PointedEars' Lahn** · Jul 20 '05, 12:42 PM

Re: validate a text input field. (again)

Grant Wagner wrote:

> Thomas 'PointedEars' Lahn wrote:
>> Lasse Reichstein Nielsen wrote:
>>> Thomas 'PointedEars' Lahn <PointedEars@web.de> writes:
>>>> return(!/^93(1(0[1-35-9]|1[016-8]|2[01]|[3-69]0|99)|40[1-35-9]|41[02])$/.test(o.value));
>>>> }
>>>
>>> Are you sure that your regexp matches exactly the correct
>>> strings? :) (It probably does, but comparing RegExps to string is
>>> ExpSpace complete in general, so very hard to do).
>>
>> Not at all. It is primarily a matter of structured building of the
>> RegExp, finding similarities first.
>
> You say, "Not at all" then proceed with a hundred line explanation of
> how to compose a RegExp to match his set of numbers.

That was how I built mine and once you are used to it, RegExp are no
longer difficult (so I explained it in detail for others to learn).
You can build yours far simpler, and I explained that, too.

>> An additional test may be as well combined with `&&' without
>> wasting too much computing time.
>> [...]
>
> How about this:
> [switch-case-default-example]
> Self-documenting, you can see AT A GLANCE which
> ZIP codes are valid, and you can easily add or remove additional ZIP
> codes without having to reconstruct your RegExp.
>
> With your RegExp, you'd have to add a comment similar to:
>
> /* matches 93101, or 93102 or 93103, or 93105 .... or 93412 */
>
> because when you come back to work on it in 6 months, you won't
> remember what it does, and you'll have to waste time decoding it
> to figure out which ZIP codes are valid.

OK, you wanted it, you get it:

function isValidZIP(
  /** @argument number|string */ sInput,
  /** @argument Array of number|string */ aInvalidZIPs)
/**
 * @author (C) 2003 Thomas Lahn <zipcode.js@PointedEars.de>
 * @param sInput ZIP code to be checked.
 * @returns <code>true</code> if <code>sInput</code> is
 *   a valid ZIP code, <code>false</code> otherwise.
 */
{
  var rxInvalidZIPs =
    new RegExp("^(" + aInvalidZIPs.join("|") + ")$");
  return !rxInvalidZIPs.test(sInput);
}

// Array of invalid ZIP codes
var aInvZIPs =
  [93101, 93102, 93103, 93105, 93106, 93107, 93108, 93109,
   93110, 93111, 93116, 93117, 93118, 93120, 93121, 93130,
   93140, 93150, 93160, 93190, 93199, 93401, 93402, 93403,
   93405, 93406, 93407, 93408, 93409, 93410, 93412];

var r = String(Math.floor(Math.random() * 1000)); // integer 0..999
while (r.length < 3)                              // add leading zeroes
  r = "0" + r;
var z = "93" + r;                                 // add prefix
alert(
  z + " is "
  + (isValidZIP(z, aInvZIPs) ? "" : "NOT ")
  + "a valid ZIP code.");

Happy testing!
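
Since the question of whether a RegExp matches exactly the listed strings came up earlier in the thread, the following sketch checks it by brute force instead of by reading the pattern. It assumes the list from the original question and compares both the hand-built RegExp and one generated from the list against every five-digit string; the variable names are only illustrative.

```javascript
// Unserviced ZIP codes from the original question (duplicate 93199 dropped).
var codes = [
  "93101", "93102", "93103", "93105", "93106", "93107", "93108", "93109",
  "93110", "93111", "93116", "93117", "93118", "93120", "93121", "93130",
  "93140", "93150", "93160", "93190", "93199", "93401", "93402", "93403",
  "93405", "93406", "93407", "93408", "93409", "93410", "93412"
];

// Reference answer: plain table lookup.
var listed = {};
for (var i = 0; i < codes.length; i++) {
  listed[codes[i]] = true;
}

// The hand-built RegExp from earlier in the thread ...
var handWritten =
  /^93(1(0[1-35-9]|1[016-8]|2[01]|[3-69]0|99)|40[1-35-9]|41[02])$/;
// ... and one generated from the list by joining the codes with "|".
var generated = new RegExp("^(" + codes.join("|") + ")$");

// Compare both against the table for every string "00000" .. "99999".
var mismatches = [];
for (var n = 0; n <= 99999; n++) {
  var s = String(n);
  while (s.length < 5) {
    s = "0" + s;
  }
  var expected = listed[s] === true;
  if (handWritten.test(s) !== expected || generated.test(s) !== expected) {
    mismatches.push(s);
  }
}

console.log("mismatches: " + mismatches.length);
```

The check only covers five-digit strings; shorter, longer or non-numeric input is left to the `^` and `$` anchors in both expressions.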

> --

Your signature separator is borken, do not use Mozilla's HTML editor to
avoid that. Besides, your signature is far too long. Appropriate is
a signature of up to 4 lines with up to 80 characters.

80 characters per line is also the allowed maximum for Usenet messages
which your posting exceeds by far. Set your automagic linebreak
function to a recommended value between 72 to 76 characters per line so
that a few quoting levels do not extend the 80th.

And please trim your quotes to the absolute necessary. Especially, do
not quote signatures (names and so-called signatures as well) if you do
not refer to them.

PointedEars

**Dr John Stockton** · Jul 20 '05, 12:43 PM

Re: validate a text input field. (again)

JRS: In article <3FD7880B.1040008@PointedEars.de>, seen in
news:comp.lang.javascript, Thomas 'PointedEars' Lahn
<PointedEars@web.de> posted at Wed, 10 Dec 2003 21:54:35 :-
>
>80 characters per line is also the allowed maximum for Usenet messages
>which your posting exceeds by far. Set your automagic linebreak
>function to a recommended value between 72 to 76 characters per line so
>that a few quoting levels do not extend the 80th.

I know of no reference for an allowed maximum, except one of the order of
1000 characters. If you know of a lower one, in an authoritative
document which takes evident cognisance of posting non-text material,
then cite it.

There is a strong recommendation that paragraphs of text should be sent
properly wrapped with hard returns; figures vary from about 64 to 76
characters. But where a line which ought be long is to be sent, it
should not be arbitrarily broken.

Script for News, therefore, should be composed with that limit in mind;
anyhow, it seems more readable that way. But script which is longer
must not be machine-wrapped, unless the machine understands the wrapping
of indented script.

Material which is transmitted with lines longer than 70-80 characters
may be broken by displaying software, but it may be possible for the
reader to extend those margins, and it should be possible to copy the
material as transmitted into a file.

--
© John Stockton, Surrey, UK. ?@merlyn.demon.co.uk Turnpike v4.00 MIME ©
Web <URL:http://www.uwasa.fi/~ts/http/tsfaq.html> -> Timo Salmi: Usenet Q&A.
Web <URL:http://www.merlyn.demon.co.uk/news-use.htm> : about usage of News.
No Encoding. Quotes before replies. Snip well. Write clearly. Don't Mail News.
