Sorting alphanumeric

  • Bob

    Sorting alphanumeric

    Sorting the following alphanumerics using myArray.sort():

    04-273-0001
    04-272-0001
    04-272-0003
    04-272-0001
    04-273-0001

    Results in:

    04-272-0001
    04-272-0001
    04-273-0001
    04-273-0001
    04-272-0003 <--
    ^

    I cannot assume a standard format, nor can I assume which alphanumeric
    characters might be in the array. How do I sort this accurately when
    these values could contain any alphanumeric character? Is this
    possible without creating a crazy hierarchy of characters and exception
    rules?

    Thanks.
    Bob - Cerner Corp.

  • Evertjan.

    #2
    Re: Sorting alphanumeric

    Bob wrote on 08 dec 2004 in comp.lang.javascript:

    > Sorting the following alphanumerics using myArray.sort():
    >
    > 04-273-0001
    > 04-272-0001
    > 04-272-0003
    > 04-272-0001
    > 04-273-0001
    >
    > Results in:
    >
    > 04-272-0001
    > 04-272-0001
    > 04-273-0001
    > 04-273-0001
    > 04-272-0003 <--

    That is numeric sorting:

    4-272-1 = -269
    4-273-1 = -270
    4-272-3 = -271
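The arithmetic above can be reproduced with a minimal sketch. It assumes the values reached the array unquoted, so each one was evaluated as a subtraction; the variable names are made up for illustration, and plain 4 stands in for 04 so the snippet also runs in strict mode.

```javascript
// Assumption: the array was filled with unquoted expressions, not strings.
var asNumbers = [4 - 273 - 1, 4 - 272 - 1, 4 - 272 - 3, 4 - 272 - 1, 4 - 273 - 1];

// With no comparator, sort() compares the string form of each element,
// so "-269" sorts before "-270", which sorts before "-271".
var sortedNumbers = asNumbers.slice().sort();

console.log(sortedNumbers.join(","));
// -269,-269,-270,-270,-271
```

Mapped back to the original values, that is the order reported in the question: the two 272-0001 entries, then the two 273-0001 entries, then 272-0003.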

    --
    Evertjan.
    The Netherlands.
    (Replace all crosses with dots in my emailaddress)


    • Mick White

      #3
      Re: Sorting alphanumeric

      Bob wrote:
      > Sorting the following alphanumerics using myArray.sort():
      >
      > 04-273-0001
      > 04-272-0001
      > 04-272-0003
      > 04-272-0001
      > 04-273-0001
      >
      > Results in:
      >
      > 04-272-0001
      > 04-272-0001
      > 04-273-0001
      > 04-273-0001
      > 04-272-0003 <--
      > ^
      >
      > I cannot assume a standard format, nor can I assume which alphanumeric
      > characters might be in the array. How do I sort this accurately when
      > these values could contain any alphanumeric character? Is this
      > possible without creating a crazy hierarchy of characters and exception
      > rules?

      You'll have to roll your own sort function:
      A = ["04-273-0001","04-272-0001","04-272-0003","04-272-0001","04-273-0001"]

      function bobSort(a,b){
        // Compare the hyphen-separated fields as numbers, left to right,
        // moving to the next field only while the current ones are equal
        var c=Number(a.split("-")[0])
        var d=Number(b.split("-")[0])
        if (c==d){
          c=Number(a.split("-")[1])
          d=Number(b.split("-")[1])
        }
        if (c==d){
          c=Number(a.split("-")[2])
          d=Number(b.split("-")[2])
        }
        return c-d
      }

      alert(A.sort(bobSort))

      The following format may be superior, though:

      c=parseInt(a.split("-")[0],10)
      d=parseInt(b.split("-")[0],10)

      ....
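The post does not say why parseInt with a radix of 10 may be superior. One possible reason is that Number() yields NaN as soon as a field contains a non-digit, while parseInt() still returns the leading digits. The helper names below are hypothetical and only illustrate that difference.

```javascript
// Hypothetical helpers: read field i of a hyphen-separated string
// with each of the two conversions mentioned in the post.
function fieldWithNumber(s, i) {
  return Number(s.split("-")[i]);
}
function fieldWithParseInt(s, i) {
  return parseInt(s.split("-")[i], 10);
}

console.log(fieldWithNumber("04-272-0003", 2));          // 3
console.log(fieldWithParseInt("04-272-0003", 2));        // 3
console.log(isNaN(fieldWithNumber("04a-2y2-00c0", 0)));  // true
console.log(fieldWithParseInt("04a-2y2-00c0", 0));       // 4
```

A comparator that subtracts NaN always returns NaN, so fields with letters would not sort predictably under the Number() version.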

      Mick


      • RobG

        #4
        Re: Sorting alphanumeric

        Mick White wrote:
        [...]
        > You'll have to roll your own sort function:
        [...]

        A good start Mick that got me thinking. I can't believe JavaScript
        doesn't have a generic sort that works on alpha-numeric strings. So I
        had a hack at your code and came up with what's below. My contribution
        to the world is to kick off a generic sort function.

        Whether numbers sort ahead of non-numbers can be modified to suit by
        making all comparisons using ASCII codes, or by changing the charCodeAt
        lines slightly to add or subtract a constant, or multiply by -1.

        My modification of your script handles any format string. To handle
        numbers and non-numbers, I change non-numbers to their ASCII code and
        compare that to single digits. Not great, but it does sort OK - caveat
        below.

        If the sort runs out of characters to compare, it should put the
        shortest one ahead of the longest. Different browsers require a
        different return value: Safari needs -1, Firefox needs 0. I don't know
        how to discriminate using feature detection - or should I be returning
        something else?

        Also, this causes a difference in the sort order for different browsers
        (arggghh).

        I got it this far, over to the gurus. An improvement would be to have
        two sorts: sortAsNum() and sortAsChar().

        Test results (all on Mac):
        Safari: fine
        Camino: fine
        Firefox: need to change return -1 to return 0
        IE 5.2: fails
        Netscape: need to change return -1 to return 0
        Opera: Sometimes needs -1, sometimes 0 depending on whether some
        entries start with alphas or not.

        <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
        <HTML>
        <HEAD><title>Sort play</title>
        <script type="text/javascript">
        function shortestOf(a,b) {
          return (a.length <= b.length)? a.length:b.length;
        }
        function bobSort(a,b){
          var z = shortestOf(a,b);
          for (var i=0; i<shortestOf(a,b); i++) {
            var c = a.split('')[i];
            var d = b.split('')[i];

            if ( c != d ) {
              c = (isNaN(c))? c.charCodeAt(0):c;
              d = (isNaN(d))? d.charCodeAt(0):d;
              var x = c-d;
              return c - d;
            }

          }
          return -1;
        }


        function saySort(inp) {
          var p = inp.split('\n');
          alert(p);
          alert(p.sort(bobSort).join('\n'));
        }
        </script>

        </HEAD>
        <BODY>
        <form action="">
        <textarea cols="40" rows="20" name="stuff">04-273-0005
        040-272-0001
        040-272-0003
        040-272-0001
        04a-2y2-00c0
        04-273-0001
        04a-222-00a0
        04a-222-00b0
        04a-222-00b1
        04z-2x2-00a0
        04a-2x2-00a0
        04a-2y2-00a0
        04a-2y2-00c0
        04a-2y2-00b0
        04a
        0ba-cc2&&8##y2-00b0
        </textarea>
        <input type="button" value="saySort()" onclick="
        saySort(this.form.stuff.value);
        "> <input type="reset">
        </form>
        </BODY>
        </HTML>



        --
        Rob


        • Mick White

          #5
          Re: Sorting alphanumeric

          RobG wrote:
          > Mick White wrote:
          > [...]
          >
          >> You'll have to roll your own sort function:
          >
          > [...]
          >
          > [snip]

          > <script type="text/javascript">
          > function shortestOf(a,b) {
          > return (a.length <= b.length)? a.length:b.length;
          > }
          > function bobSort(a,b){
          > var z = shortestOf(a,b);
          > for (var i=0; i<shortestOf(a,b); i++) {
          > var c = a.split('')[i];
          > var d = b.split('')[i];
          >
          > if ( c != d ) {
          > c = (isNaN(c))? c.charCodeAt(0):c;
          > d = (isNaN(d))? d.charCodeAt(0):d;
          > var x = c-d;
          > return c - d;
          > }
          >
          > }
          > return -1;
          > }

          ....
          I didn't read the OP's post carefully, in which he did mention alpha
          characters. I'll run some tests on your code, but it seems as if I have
          the same battery of browsers as you do. What I've done in the past is to
          isolate alphas from numerics:
          http://www.mickweb.com/football/aleague/profiles.html (Power Ratings)
          I use sortNumerical2(), which assigns the variables text1 and text2 an
          arbitrary low number for non-numeric array entries (c & d in the OP's case).
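The sortNumerical2() function itself is not shown in the thread, so the following is only a hypothetical sketch of the idea described: non-numeric entries are mapped to an arbitrary low number before comparing. The constant, the function names, and the sample data are all assumptions.

```javascript
// Assumed placeholder value for entries that are not numeric.
var LOW_VALUE = -1e9;

// Convert an entry to a number, falling back to LOW_VALUE when it is not numeric.
function toSortableNumber(text) {
  var n = parseFloat(text);
  return isNaN(n) ? LOW_VALUE : n;
}

// Comparator for Array.prototype.sort: numeric order, non-numerics first.
function compareWithLowDefault(text1, text2) {
  return toSortableNumber(text1) - toSortableNumber(text2);
}

var ratings = ["12.5", "n/a", "7", "-", "30"];
var sortedRatings = ratings.slice().sort(compareWithLowDefault);
console.log(sortedRatings.join(","));
```

The numeric entries come out as 7, 12.5, 30 after the two non-numeric ones; the relative order of the non-numeric entries depends on whether the engine's sort is stable.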

          Mick






          • Dr John Stockton

            #6
            Re: Sorting alphanumeric

            JRS: In article <1102535558.104683.80430@f14g2000cwb.googlegroups.com>,
            dated Wed, 8 Dec 2004 11:52:38, seen in news:comp.lang.javascript, Bob
            <bob.hubbard@gmail.com> posted :
            >Sorting the following alphanumerics using myArray.sort():
            >
            >04-273-0001
            >04-272-0001
            >04-272-0003
            >04-272-0001
            >04-273-0001
            >
            >Results in:
            >
            >04-272-0001
            >04-272-0001
            >04-273-0001
            >04-273-0001
            >04-272-0003 <--
            >^
            >
            >I cannot assume a standard format, nor can I assume which alphanumeric
            >characters might be in the array. How do I sort this accurately when
            >these values could contain any alphanumeric character? Is this
            >possible without creating a crazy hierarchy of characters and exception
            >rules?

            You do not say whether your array has been loaded with strings or with
            numerical expressions; that is most important. Neither do you say what
            system(s) you are testing on.

            As strings, they should sort lexically to :
            04-272-0001,04-272-0001,04-272-0003,04-273-0001,04-273-0001
            As expressions, by value to :
            -269,-269,-270,-270,-271
            The result you show should not, IMHO, be obtained.
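The string case can be checked with a short sketch, assuming the values are stored as quoted strings (the variable names are illustrative):

```javascript
// Assumption: every entry is a string literal.
var asStrings = ["04-273-0001", "04-272-0001", "04-272-0003", "04-272-0001", "04-273-0001"];

// The default sort() compares strings code unit by code unit.
var sortedStrings = asStrings.slice().sort();

console.log(sortedStrings.join(","));
// 04-272-0001,04-272-0001,04-272-0003,04-273-0001,04-273-0001
```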

            --
            © John Stockton, Surrey, UK. ?@merlyn.demon.co.uk Turnpike v4.00 IE 4 ©
            <URL:http://www.jibbering.com/faq/> JL/RC: FAQ of news:comp.lang.javascript
            <URL:http://www.merlyn.demon.co.uk/js-index.htm> jscr maths, dates, sources.
            <URL:http://www.merlyn.demon.co.uk/> TP/BP/Delphi/jscr/&c, FAQ items, links.


            • RobG

              #7
              Re: Sorting alphanumeric

              Mick White wrote:
              > RobG wrote:
              >
              >> Mick White wrote:
              >> [...]
              >>
              >>> You'll have to roll your own sort function:
              >>
              >>
              >> [...]
              >>
              >> [snip]
              >
              >
              >
              >> <script type="text/javascript">
              >> function shortestOf(a,b) {
              >> return (a.length <= b.length)? a.length:b.length;
              >> }
              >> function bobSort(a,b){
              >> var z = shortestOf(a,b);
              >> for (var i=0; i<shortestOf(a,b); i++) {
              >> var c = a.split('')[i];
              >> var d = b.split('')[i];
              >>
              >> if ( c != d ) {
              >> c = (isNaN(c))? c.charCodeAt(0):c;
              >> d = (isNaN(d))? d.charCodeAt(0):d;
              >> var x = c-d;
              >> return c - d;
              >> }
              >>
              >> }
              >> return -1;
              >> }
              >>
              >
              > ....
              > I didn't read the OP's post carefully, in which he did mention alpha
              > characters. I'll run some tests on your code, but it seems as if I have
              > the same battery of browsers as you do. What I've done in the past is to
              > isolate alphas from numerics:
              > http://www.mickweb.com/football/aleague/profiles.html (Power Ratings)
              > I use sortNumerical2(), which assigns the variables text1 and text2 an
              > arbitrary low number for non-numeric array entries (c & d in the OP's case).

              An interesting approach. BTW, if you make your DOB ISO 8601, they will
              sort much better (e.g. 01/21/1980 becomes 1980-01-21, a fairly trivial
              conversion I think). The dates will then sort properly as either chars
              or numbers (noting that all single-digit numbers must be zero-padded to
              two digits). You could use the conversion just for the sort, then put
              them back as "US Dates" if that makes your users more comfortable.
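A minimal sketch of that round trip, assuming the dates arrive as MM/DD/YYYY strings. The helper names and sample dates are hypothetical, not taken from the page being discussed.

```javascript
// Left-pad a one-digit month or day to two digits.
function pad2(s) {
  return s.length < 2 ? "0" + s : s;
}

// "MM/DD/YYYY" -> "YYYY-MM-DD", which sorts correctly as plain text.
function usToIso(usDate) {
  var parts = usDate.split("/"); // [month, day, year]
  return parts[2] + "-" + pad2(parts[0]) + "-" + pad2(parts[1]);
}

// "YYYY-MM-DD" -> "MM/DD/YYYY" for display after sorting.
function isoToUs(isoDate) {
  var parts = isoDate.split("-"); // [year, month, day]
  return parts[1] + "/" + parts[2] + "/" + parts[0];
}

var dobs = ["01/21/1980", "12/3/1975", "7/04/1980"];
var sortedDobs = dobs.map(usToIso).sort().map(isoToUs);

console.log(sortedDobs.join(","));
// 12/03/1975,01/21/1980,07/04/1980
```

Note that the dates come back zero-padded; keeping the original strings alongside the ISO keys would avoid that if exact display text matters.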

              A small fix to my routine is to change:

              return -1;

              to

              return (a.length-b.length);

              The obvious error came to me whilst I was in the shower. It fixes the
              "should I use -1 or zero" problem - I was using completely the wrong
              logic. My only defence is that it was about midnight on a very long
              day.

              The lines:

              var z = shortestOf(a,b);
              ....
              var x = c-d;

              can both be ditched, they are remnants of development & debug.

              shortestOf is modified to return the shortest string, rather than its
              length (that's more logical and useful I think).

              Finally, the 'split character' can be passed to the function so that
              the calling function can say what the delimiter is (comma, return, ...)

              I have tested the new version in IE, Firefox and Netscape in Windows
              and all is fine. So if the OP is still watching this thread, here is
              a generic "I want to sort anything" routine:

              <script type="text/javascript">
              /* Returns the shorter of two strings */
              function shortestOf(a,b) {
                return (a.length <= b.length)? a:b;
              }

              function bobSort(a,b){
                // Only iterate for the length of the shortest string
                for (var i=0; i<shortestOf(a,b).length; i++) {
                  var c = a.split('')[i];
                  var d = b.split('')[i];
                  // When we get to the first non-identical character,
                  // sort on it
                  if ( c != d ) {
                    c = (isNaN(c))? c.charCodeAt(0):c;
                    d = (isNaN(d))? d.charCodeAt(0):d;
                    return c-d;
                  }
                }
                // If we get to the end of the shortest string
                // and all evaluated chars are the same...
                return (a.length-b.length);
              }

              /* inp is a string of values separated by splitChar */
              function saySort(inp,splitChar) {
                // splitChar is the array delimiter
                var p = inp.split(splitChar);
                alert(p.sort(bobSort).join('\n'));
              }
              </script>



              --
              Rob


              • RobG

                #8
                Re: Sorting alphanumeric

                The saga continues...

                I was a bit concerned over performance, 300 records takes about 4
                seconds to sort on my machine in Firefox, so I did a bit of
                optimisation and now 300 records take about 1 second.

                Change bobSort to:

                function bobSort(a,b){
                  var c = a.split('');
                  var d = b.split('');
                  // Only iterate for the length of the shortest string
                  for (var i=0; i<shortestOf(a, b).length; i++) {
                    // When we get to the first non-identical character,
                    // sort on it
                    if ( c[i] != d[i] ) {
                      c[i] = (isNaN(c[i]))? c[i].charCodeAt(0): c[i];
                      d[i] = (isNaN(d[i]))? d[i].charCodeAt(0): d[i];
                      return c[i]-d[i];
                    }
                  }
                  // If we get to the end of the shortest string
                  // and all evaluated chars are the same...
                  return (a.length-b.length);
                }

                Making all comparisons on charCode makes almost zero difference to the
                time taken (I thought it would be much quicker), but it does make the
                if statement really simple:

                if ( c[i] != d[i] ) {
                return c[i].charCodeAt(0) - d[i].charCodeAt(0);
                }

                Choose whatever suits. And I'm done. ;-)

                --
                Rob


                • Michael Winter

                  #9
                  Re: Sorting alphanumeric

                  On Thu, 09 Dec 2004 23:53:27 GMT, RobG <rgqld@iinet.net.auau> wrote:

                  I haven't been following this thread really - I've been kinda busy
                  elsewhere in this group - but I will contribute one thing...
                  > I was a bit concerned over performance

                  [snip]

                  Well, one way to improve performance is to only call shortestOf once. At
                  the moment, it's called on *every* iteration of the loop.
                  > for (var i=0; i<shortestOf(a, b).length; i++) {

                  for(var i = 0, n = shortestOf(a, b).length; i < n; ++i) {

                  or

                  for(var i = 0, n = Math.min(a.length, b.length); i < n; ++i) {

                  [snip]
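For illustration, here is one way the Math.min loop could be combined with the charCode-only comparison mentioned earlier in the thread. This is a sketch, not code anyone posted: it reads characters with charCodeAt directly on the strings instead of splitting them, and the function name and sample data are assumptions.

```javascript
// Compare two strings character by character using their char codes.
// The loop bound is computed once, as suggested above.
function compareByCharCode(a, b) {
  for (var i = 0, n = Math.min(a.length, b.length); i < n; ++i) {
    var diff = a.charCodeAt(i) - b.charCodeAt(i);
    if (diff !== 0) {
      return diff; // first differing character decides the order
    }
  }
  // All compared characters are equal: the shorter string sorts first.
  return a.length - b.length;
}

var sample = ["04-273-0005", "04a", "04-273-0001", "040-272-0001", "04"];
var sortedSample = sample.slice().sort(compareByCharCode);

console.log(sortedSample.join(","));
// 04,04-273-0001,04-273-0005,040-272-0001,04a
```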

                  Mike

                  --
                  Michael Winter
                  Replace ".invalid" with ".uk" to reply by e-mail.


                  • RobG

                    #10
                    Re: Sorting alphanumeric

                    Michael Winter wrote:
                    [...]
                    >
                    > for(var i = 0, n = shortestOf(a, b).length; i < n; ++i) {

                    About halved the execution time. My original code was actually almost
                    identical but I didn't realise how much it affects performance.
                    > for(var i = 0, n = Math.min(a.length, b.length); i < n; ++i) {

                    Shaved another 10%. Firefox takes about 2.8 seconds for 1,200 records,
                    IE about 1.8 seconds. The original takes 25 seconds (or so...).

                    Times are for comparative purposes only.
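The test harness behind these figures was not posted. A rough, hypothetical way to take comparable measurements is sketched below; the record format, the record count, and every function name are assumptions, and absolute timings will differ widely between machines and engines.

```javascript
// Comparator under test (a char-code comparison with a precomputed bound).
function charCodeCompare(a, b) {
  for (var i = 0, n = Math.min(a.length, b.length); i < n; ++i) {
    var diff = a.charCodeAt(i) - b.charCodeAt(i);
    if (diff !== 0) return diff;
  }
  return a.length - b.length;
}

// Left-pad a number with zeros to the given width.
function pad(n, width) {
  var s = String(n);
  while (s.length < width) s = "0" + s;
  return s;
}

// Build `count` random records shaped like "04-123-0456".
function makeRecords(count) {
  var records = [];
  for (var i = 0; i < count; ++i) {
    records.push("04-" + pad(Math.floor(Math.random() * 1000), 3) +
                 "-" + pad(Math.floor(Math.random() * 10000), 4));
  }
  return records;
}

// Sort a copy of the records and report the elapsed wall-clock time.
function timeSort(records, compare) {
  var copy = records.slice();
  var start = Date.now();
  copy.sort(compare);
  return { elapsedMs: Date.now() - start, sorted: copy };
}

var run = timeSort(makeRecords(1200), charCodeCompare);
console.log("1,200 records sorted in " + run.elapsedMs + " ms");
```

Running each comparator variant against the same generated array keeps the comparison fair.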

                    Thanks Mike.

                    --
                    Rob


                    • RobG

                      #11
                      Re: Sorting alphanumeric

                      Dr John Stockton wrote:
                      [...]
                      > Why are you people apparently assuming that the default string sort is
                      > not what is needed?

                      The OP, having posted, didn't bother to hang around long enough to let
                      on. Given that a routine was required that just sorted stuff without
                      regard for any patterns or whether it was alpha, numeric or whatever,
                      it became a bit of fun to write a generic "sort anything" routine.

                      Without knowing what the sorted list will be used for, or understanding
                      any patterns within the data that should be respected by the sort,
                      assumptions are all we have to go on.

                      Post away.

                      --
                      Rob
