CHECKSUM to determine record changes

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • rcamarda

    CHECKSUM to determine record changes

    I've searched the forum for uses of CHECKSUM and havent found a
    satisfactory answer.
    I need to know when a row changes and I dont care what it was or is, I
    just need to know it changed.
    Detail:
    Hourly I select rows where the record as a create date or change date
    of the last 3 days. I am only interested when the address or name
    changes, not other columns.
    Currently, I might select 30,000 rows that are new or changed, but
    turns out that only 100 have address changes and 500 are new. I want
    to process the 600, not the 30,000.
    My main concern is if some columns change, but the resulting checksum
    doesnt. Then I would have missed processing that record.
    The column types that I will be tracking are:
    "forenames" VARCHAR(50) NULL,
    "surname" VARCHAR(51) NULL,
    "ADDRESS1" VARCHAR(60) NULL,
    "ADDRESS2" VARCHAR(60) NULL,
    "ADDRESS4" VARCHAR(50) NULL,
    "STATE" VARCHAR(10) NULL,
    "ZIP" VARCHAR(20) NULL,
    "email" VARCHAR(50) NULL,
    "telephone" VARCHAR(10) NULL
    (I left off the keys, these are the only fields that I will be using
    for the checksum()).
    Am I safe? I read about 32 bit CRC and some data changes would go
    unnoticed, but not sure if this layout would qualifiy.

    TIA
    Rob
  • rcamarda

    #2
    Re: CHECKSUM to determine record changes

    Im using SQL Server 2005 currently patched.

    Comment

    • David Portas

      #3
      Re: CHECKSUM to determine record changes

      "rcamarda" <robert.a.camar da@gmail.comwro te in message
      news:2d88cd48-2853-4f01-b797-c5c166bd167d@k1 3g2000hse.googl egroups.com...
      Im using SQL Server 2005 currently patched.
      CHECKSUM isn't a reliable way to detect change because it's quite common to
      find different rows with the same CHECKSUM value. You could use a ROWVERSION
      column instead. ROWVERSION is guaranteed to increment when the row data is
      updated.

      Another alternative is to use a hash. The HashBytes function will return a
      secure hash of a binary value with a very high probability of uniqueness.
      Duplicate hashes are theoretically possible but are incredibly unlikely to
      occur unintentionally . If you are extremely paranoid then you can use two
      different hashes.

      --
      David Portas


      Comment

      • Gert-Jan Strik

        #4
        Re: CHECKSUM to determine record changes

        David Portas wrote:
        >
        "rcamarda" <robert.a.camar da@gmail.comwro te in message
        news:2d88cd48-2853-4f01-b797-c5c166bd167d@k1 3g2000hse.googl egroups.com...
        Im using SQL Server 2005 currently patched.
        >
        CHECKSUM isn't a reliable way to detect change because it's quite common to
        find different rows with the same CHECKSUM value. You could use a ROWVERSION
        column instead. ROWVERSION is guaranteed to increment when the row data is
        updated.
        >
        Another alternative is to use a hash. The HashBytes function will return a
        secure hash of a binary value with a very high probability of uniqueness.
        Duplicate hashes are theoretically possible but are incredibly unlikely to
        occur unintentionally . If you are extremely paranoid then you can use two
        different hashes.
        >
        --
        David Portas
        I agree about the advice for use ROWVERSION.

        However, CHECKSUM is also just a hash. Whether the chance of an
        unnotices change is lower if you use HashBytes function instead of
        CHECKSUM depends on your data.

        Although adding a second hash value will lower the chance of a missed
        change, it would be silly to do so. In the end you would need as many
        hash bytes as there are bytes in the data.

        --
        Gert-Jan

        Comment

        Working...