Which is better? Int or smalldatetime(MSSQL)

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • Bostonasian

    Which is better? Int or smalldatetime(MSSQL)

    I have a table that contains transactional data. Such as site view by
    whom, when, which template, etc, etc...
    Everytime when I pulled the report, hh:mm:ss never matters. Only
    breakdown by dates, not time.
    Having read some chapters of Ralph Kimball's book, I am inspired to
    build "date" table with integer as primary key.

    Here's what I have for schema of transactional table.

    - viewed_customer _id int (4bytes)
    - template_id uniqidentifier (16 bytes)
    - viewed_on datetime (8 bytes)

    And here's the version I am thinking of building

    - viewed_customer _id int (4bytes)
    - template_id uniqidentifier (16 bytes)
    - viewed_date_key int (4 bytes)
    - seconds int (4 bytes)

    * I put seconds just in case I need to retrieve hour based or minute
    based report.

    Here's my question. I've also noticed that smalldatetime is also 4bytes
    of memory but it consists of 2 sets of 2 bytes.

    When I index, would there be significant performance difference between
    indexing 4 byte of column and indexing 2bytes x 2 of column?

  • Erland Sommarskog

    #2
    Re: Which is better? Int or smalldatetime(M SSQL)

    Bostonasian (axkixx@gmail.c om) writes:[color=blue]
    > I have a table that contains transactional data. Such as site view by
    > whom, when, which template, etc, etc...
    > Everytime when I pulled the report, hh:mm:ss never matters. Only
    > breakdown by dates, not time.
    > Having read some chapters of Ralph Kimball's book, I am inspired to
    > build "date" table with integer as primary key.
    >
    > Here's what I have for schema of transactional table.
    >
    > - viewed_customer _id int (4bytes)
    > - template_id uniqidentifier (16 bytes)
    > - viewed_on datetime (8 bytes)
    >
    > And here's the version I am thinking of building
    >
    > - viewed_customer _id int (4bytes)
    > - template_id uniqidentifier (16 bytes)
    > - viewed_date_key int (4 bytes)
    > - seconds int (4 bytes)[/color]

    The problem with using int for dates is that you might run into confusion
    what your zero date is. In SQL Server the base date is 1900-01-01. In
    Visual Basic etc it's 1899-12-30. In Unix it's 1970-01-01.

    While datetime is not ideal for dates-only data, it's fairly easy
    to use. In our system we actually have a type for it:

    EXEC sp_addtype 'aba_date', 'datetime'
    go
    CREATE RULE aba_date_rule AS convert(char(8) , @x, 112) = @x
    go
    EXEC sp_bindrule 'aba_date_rule' , 'aba_date'
    go

    So it's not even possible to sneak in any seconds there. To strip
    Hours and seconds from a value, this is the deal:

    convert(char(8) , date_value, 112)

    (It's important to use format 112, as the resulting string will
    always be converted back to datetime correctly.)
    [color=blue]
    > Here's my question. I've also noticed that smalldatetime is also 4bytes
    > of memory but it consists of 2 sets of 2 bytes.
    >
    > When I index, would there be significant performance difference between
    > indexing 4 byte of column and indexing 2bytes x 2 of column?[/color]

    4 bytes is 4 bytes. While smalldatetime may be described as 2+2, it's
    nevertheless just another 32 bits when it comes to indexing.

    As for whether smalldatetime is an option to datetime, it depends on
    your business requirements. If all you use it for is tracking events,
    then it might do. We have abandoned smalldatetime in our system, save
    for some auditing columns, because we have encountered real-world
    data beyond 2076.


    --
    Erland Sommarskog, SQL Server MVP, esquel@sommarsk og.se

    Books Online for SQL Server SP3 at
    Get the flexibility you need to use integrated solutions, apps, and innovations in technology with your data, wherever it lives—in the cloud, on-premises, or at the edge.

    Comment

    • Bostonasian

      #3
      Re: Which is better? Int or smalldatetime(M SSQL)

      Thanks for the advice.
      I got that "date" table from Ralph Kimball's "Data Warehouse tookit",
      from "Retail" chapter.
      I guess in the retail. It's crucial to know sales date was weekday or
      weekend? Was is holiday? Which fiscal month/year it was, etc, etc...
      Beyond reporting by only dates, there are occasions where I need to
      come up with analysis(though it was not often).

      Thanks for the tips.

      Comment

      • Hugo Kornelis

        #4
        Re: Which is better? Int or smalldatetime(M SSQL)

        On 30 Mar 2005 07:54:35 -0800, Bostonasian wrote:
        [color=blue]
        >Thanks for the advice.
        >I got that "date" table from Ralph Kimball's "Data Warehouse tookit",
        >from "Retail" chapter.
        >I guess in the retail. It's crucial to know sales date was weekday or
        >weekend? Was is holiday? Which fiscal month/year it was, etc, etc...
        >Beyond reporting by only dates, there are occasions where I need to
        >come up with analysis(though it was not often).
        >
        >Thanks for the tips.[/color]

        Hi Bostonasian,

        Having a table with all dates and various properties of each date (such
        as weekend, public holiday, reporting period, etc.) is good. Using an
        integer to represent the data is not.

        To save myself the typing, I'll just refer you to www.aspfaq.com/2519,
        where it is all explained in full detail.

        Best, Hugo
        --

        (Remove _NO_ and _SPAM_ to get my e-mail address)

        Comment

        • David Rawheiser

          #5
          Re: Which is better? Int or smalldatetime(M SSQL)


          "Hugo Kornelis" <hugo@pe_NO_rFa ct.in_SPAM_fo> wrote in message
          news:jh3m41p8hh d191t6houg1qpf7 7025ficnm@4ax.c om...[color=blue]
          > On 30 Mar 2005 07:54:35 -0800, Bostonasian wrote:
          >[color=green]
          >>Thanks for the advice.
          >>I got that "date" table from Ralph Kimball's "Data Warehouse tookit",
          >>from "Retail" chapter.
          >>I guess in the retail. It's crucial to know sales date was weekday or
          >>weekend? Was is holiday? Which fiscal month/year it was, etc, etc...
          >>Beyond reporting by only dates, there are occasions where I need to
          >>come up with analysis(though it was not often).
          >>
          >>Thanks for the tips.[/color]
          >
          > Hi Bostonasian,
          >
          > Having a table with all dates and various properties of each date (such
          > as weekend, public holiday, reporting period, etc.) is good. Using an
          > integer to represent the data is not.
          >
          > To save myself the typing, I'll just refer you to www.aspfaq.com/2519,
          > where it is all explained in full detail.
          >
          > Best, Hugo
          > --
          >
          > (Remove _NO_ and _SPAM_ to get my e-mail address)[/color]

          While I agree that datetypes should be as close to what they represent,
          in the book he refers to the fact table with the detailed data, has an
          integer as a key ID to a seperate date table.

          The actual datetime value, and all the necesary flags and groupings ( fiscal
          quarter, date of week, holiday, etc..) is stored in that side car or
          outrigger table.

          Since current production versions of SQL don't have a seperate date
          datatype, I tend to use smalldatetime instead.

          With the side car table, a text representation can also be done to make
          reporting easier
          i.e. FullDateName column could be "Wednesday, February 31st 2007" , but
          sorting is done by actual datetime datatype column.



          Comment

          • Hugo Kornelis

            #6
            Re: Which is better? Int or smalldatetime(M SSQL)

            On Thu, 31 Mar 2005 04:34:02 GMT, David Rawheiser wrote:

            (snip quoteback)[color=blue]
            >While I agree that datetypes should be as close to what they represent,
            >in the book he refers to the fact table with the detailed data, has an
            >integer as a key ID to a seperate date table.
            >
            >The actual datetime value, and all the necesary flags and groupings ( fiscal
            >quarter, date of week, holiday, etc..) is stored in that side car or
            >outrigger table.[/color]
            (snip)

            Hi David,

            I don't have this book, but what I'd like to know is what advantage the
            integer ID key of this side car table has over a similar side table with
            a datetime (or smalldatetime) key, with a CHECK constraint to ensure
            that only the default time portion (midnight) is allowed.

            [color=blue]
            >Since current production versions of SQL don't have a seperate date
            >datatype, I tend to use smalldatetime instead.[/color]

            As long as you're aware that smalldatetime has a time portion as well
            (though with less precision) and that dates after June 6, 2079 can't be
            represented, this is fine.

            [color=blue]
            >With the side car table, a text representation can also be done to make
            >reporting easier
            >i.e. FullDateName column could be "Wednesday, February 31st 2007" , but
            >sorting is done by actual datetime datatype column.[/color]

            I agree - but that can equally well be done if the side car table uses a
            [small]datetime column os primary key.

            Best, Hugo
            --

            (Remove _NO_ and _SPAM_ to get my e-mail address)

            Comment

            • David Rawheiser

              #7
              Re: Which is better? Int or smalldatetime(M SSQL)


              "Hugo Kornelis" <hugo@pe_NO_rFa ct.in_SPAM_fo> wrote in message
              news:jehn41tih1 a8ss9pjucv6qpim sso3tdo6e@4ax.c om...[color=blue]
              > On Thu, 31 Mar 2005 04:34:02 GMT, David Rawheiser wrote:
              >
              > (snip quoteback)[color=green]
              >>While I agree that datetypes should be as close to what they represent,
              >>in the book he refers to the fact table with the detailed data, has an
              >>integer as a key ID to a seperate date table.
              >>
              >>The actual datetime value, and all the necesary flags and groupings (
              >>fiscal
              >>quarter, date of week, holiday, etc..) is stored in that side car or
              >>outrigger table.[/color]
              > (snip)
              >
              > Hi David,
              >
              > I don't have this book, but what I'd like to know is what advantage the
              > integer ID key of this side car table has over a similar side table with
              > a datetime (or smalldatetime) key, with a CHECK constraint to ensure
              > that only the default time portion (midnight) is allowed.
              >
              >[color=green]
              >>Since current production versions of SQL don't have a seperate date
              >>datatype, I tend to use smalldatetime instead.[/color]
              >
              > As long as you're aware that smalldatetime has a time portion as well
              > (though with less precision) and that dates after June 6, 2079 can't be
              > represented, this is fine.
              >
              >[color=green]
              >>With the side car table, a text representation can also be done to make
              >>reporting easier
              >>i.e. FullDateName column could be "Wednesday, February 31st 2007" , but
              >>sorting is done by actual datetime datatype column.[/color]
              >
              > I agree - but that can equally well be done if the side car table uses a
              > [small]datetime column os primary key.
              >
              > Best, Hugo
              > --
              >
              > (Remove _NO_ and _SPAM_ to get my e-mail address)[/color]

              I depends on how you are using the date.

              If you don't need to present the items by grouping by month, sales period,
              fiscal quarters, day of week, etc.
              or if you don't want to include weekends or holidays or full moons, etc...
              in your reports,
              you either need to:
              a) put a boat load of complex code in your procedures or functions to
              determine these things.
              b) create a date calendar table, add columns and set the flags on each of
              these atributes.

              Thus the resulting query is a simple join from the fact table to this
              calendar table and include/exclude in the where clause or groupby those you
              need.

              You can use the date datatype itself as the keys, and get the same results
              but in the book it tends to use abstract datatypes for the fact table
              attributes.




              Comment

              • Hugo Kornelis

                #8
                Re: Which is better? Int or smalldatetime(M SSQL)

                On Fri, 01 Apr 2005 04:30:36 GMT, David Rawheiser wrote:

                (snip)[color=blue]
                >I depends on how you are using the date.
                >
                >If you don't need to present the items by grouping by month, sales period,
                >fiscal quarters, day of week, etc.
                >or if you don't want to include weekends or holidays or full moons, etc...
                >in your reports,
                >you either need to:
                >a) put a boat load of complex code in your procedures or functions to
                >determine these things.
                >b) create a date calendar table, add columns and set the flags on each of
                >these atributes.[/color]

                Hi David,

                I fully agree - and in most cases, the calendar table is the better
                option.

                [color=blue]
                >Thus the resulting query is a simple join from the fact table to this
                >calendar table and include/exclude in the where clause or groupby those you
                >need.[/color]

                Yep.

                [color=blue]
                >You can use the date datatype itself as the keys, and get the same results
                >but in the book it tends to use abstract datatypes for the fact table
                >attributes.
                >
                >http://www.amazon.com/exec/obidos/tg...&s=ebooks&st=*[/color]

                And this is where I disagrgee with the advise given in the book. Sure,
                there are situations where a surrogate key is better than a natural key,
                but in this specific case, I fail to see any advantages.

                Like I said - I don't have this book (and I don't intend to buy it
                either :-P) Do the authors give any reasons for their choice to use an
                artificial key here? If so, I'd be interested to hear it!

                Best, Hugo
                --

                (Remove _NO_ and _SPAM_ to get my e-mail address)

                Comment

                Working...