Data modeling question

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • Diego Buendia

    Data modeling question

    I'm facing the next problem:

    I have a table with two columns (among others) modeling category and
    subcategory data for each row. I need to summarize info on this two
    columns, but with the next specs:

    1.- Some grouping is only on the category column.
    2.- Some other grouping consider the two columns.

    The values for the two columns come from external source, i.e. I have
    no means to know the precise universe of data (I suppose soon or later
    we'll have a sufficient sample of data, but for now it's not the
    case). So, I would like to have a grouping table so it's not necessary
    to insert a row for every pair of category and subcategory (although
    it would be the best approach for the sake of design's simplicity). As
    I don't know every possible combination, I would prefer something like
    'this category is a - no matter the subcategory', and 'this other
    category + subcategory is b'. Let's go with a sample:

    --------------------------------------------------------

    Create Table B ( -- groupings ----
    categ char(8),
    subcateg char(5),
    what_group char(10)
    )

    -- All rows with 432 code are cat. A ----
    Insert B ( '00000432', ' ', 'Category A' )

    -- All rows with 636 code are cat. C except when subcat is 8552 (cat.
    B) ----
    Insert B ( '00000636', '08552', 'Category B' )
    Insert B ( '00000636', ' ', 'Category C' )

    -- Some data ----

    Create Table A ( -- data ----
    categ char(8),
    subcateg char(5)
    )

    Insert A ( '00000432', '01322' )
    Insert A ( '00000432', '01222' )
    Insert A ( '00000432', '01100' )
    Insert A ( '00000432', ' ' )

    Insert A ( '00000636', '08552' )
    Insert A ( '00000636', '08552' )
    Insert A ( '00000636', '01100' )
    Insert A ( '00000636', ' ' )
    Insert A ( '00000636', '01111' )

    -- The query like:

    Select b.what_group, count(*) as cnt
    From a
    Left Join b
    On /* ? ? ? ? */

    -- Should give ---

    what_group cnt
    -------------- ----------
    Category A 4
    Category B 2
    Category C 3

    -------------------------------------------------------------------

    It would be easier knowing all the pairs categ - subcateg. If I don't
    know them, is a good idea to model the grouping table as I've done
    with rows in B?

    TIA,

    Diego
    Bcn, Spain
  • Erland Sommarskog

    #2
    Re: Data modeling question

    [posted and mailed, please reply in news]

    Diego Buendia (dbuendiab@yaho o.es) writes:[color=blue]
    > I'm facing the next problem:
    >
    > I have a table with two columns (among others) modeling category and
    > subcategory data for each row. I need to summarize info on this two
    > columns, but with the next specs:
    >
    > 1.- Some grouping is only on the category column.
    > 2.- Some other grouping consider the two columns.
    >
    > The values for the two columns come from external source, i.e. I have
    > no means to know the precise universe of data (I suppose soon or later
    > we'll have a sufficient sample of data, but for now it's not the
    > case). So, I would like to have a grouping table so it's not necessary
    > to insert a row for every pair of category and subcategory (although
    > it would be the best approach for the sake of design's simplicity). As
    > I don't know every possible combination, I would prefer something like
    > 'this category is a - no matter the subcategory', and 'this other
    > category + subcategory is b'. Let's go with a sample:[/color]

    I have done one change to your set up: rather than using space to
    mean "no subcategory", I'm using NULL. Here is a repro which appears
    to give the correct result:

    Create Table B ( -- groupings ----
    categ char(8) NOT NULL,
    subcateg char(5) NULL,
    what_group char(10) NOT NULL
    )

    -- All rows with 432 code are cat. A ----
    Insert B VALUES( '00000432', NULL, 'Category A' )

    -- All rows with 636 code are cat. C except when subcat is 8552 (cat. B)
    ----
    Insert B VALUES ( '00000636', '08552', 'Category B' )
    Insert B VALUES ( '00000636', NULL, 'Category C' )

    -- Some data ----

    Create Table A ( -- data ----
    categ char(8) NOT NULL,
    subcateg char(5) NULL
    )

    Insert A VALUES( '00000432', '01322' )
    Insert A VALUES( '00000432', '01222' )
    Insert A VALUES( '00000432', '01100' )
    Insert A VALUES( '00000432', NULL )

    Insert A VALUES( '00000636', '08552' )
    Insert A VALUES( '00000636', '08552' )
    Insert A VALUES( '00000636', '01100' )
    Insert A VALUES( '00000636', NULL )
    Insert A VALUES( '00000636', '01111' )
    go
    SELECT what_group , COUNT(*)
    FROM (
    SELECT B.what_group
    FROM A
    JOIN B ON A.categ = B.categ
    AND A.subcateg = B.subcateg
    UNION ALL
    SELECT B.what_group
    FROM A
    JOIN B ON A.categ = B.categ
    AND B.subcateg IS NULL
    WHERE NOT EXISTS (SELECT *
    FROM B b1
    WHERE A.categ = b1.categ
    AND A.subcateg = b1.subcateg)
    ) AS x
    GROUP BY what_group
    go
    DROP TABLE A, B


    --
    Erland Sommarskog, SQL Server MVP, esquel@sommarsk og.se

    Books Online for SQL Server SP3 at
    Accelerate your AI application's time to market by harnessing the power of your own data and the built-in AI capabilities of SQL Server 2025, the enterprise database with best-in-class security, performance and availability.

    Comment

    Working...