Questions regarding large MDC tables

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • Joachim Klassen

    Questions regarding large MDC tables

    Hi all,

    I'm currently investigating the use of MDC Tables for large data warehouse
    tables.
    My scenario:

    A fact table with 1000 Million Rows distributed on 12 Partitions (3 physical
    hosts with 4 logical partitions each).
    Overall size of table is 350 GB . Each night 1.5 Million new rows will be
    added
    and approx. the same amount of old records will be deleted (Roll in/Roll out
    with SQL INSERT/DELETE).
    The table is stored in SMS tablespace with 16K Pagesize and 64 Pages
    Extentsize.
    The tablespace has 6 containers on each partition. Each container is on a
    separate IBM ESS array. Prefetchsize is 384 (6 containers * 64 pages).
    Prefetch behaves very well
    with these settings (DB2_PARALLEL_I O is set). DB2 is V8.1 ESE (DPF) FP5 and
    runs on AIX 5.2.

    We figured out that for our choosen MDC dimensions we will have to use an
    extentsize of 2 pages otherwise we would waste too much space. This
    extentsize gives us headaches:

    - What is an optimal prefetchsize here?
    With prefetchsize of 12 (6 containers * extentsize 2) each prefetcher will
    only read 32 KB of data from one container.
    With a prefetchsize of 384 (which is optimal from a disk layout point of
    view) will DB2 start 192 prefetchers (that would be certainly overkill)?

    Further:
    Does anybody have experiences with MDC tables for large warehouse tables and
    is willing to share them?
    Especially performance experiences when inserting/deleting during the
    roll-in/roll-out of daily data ?

    Unfortunately I do not have an adequate environment to test all these
    issues - so any comments are highly appreciated.

    TIA
    Joachim

    PS: Feel free to send comments by email to joklassen at web dot de


  • Serge Rielau

    #2
    Re: Questions regarding large MDC tables

    Joachim Klassen wrote:[color=blue]
    > Hi all,
    >
    > I'm currently investigating the use of MDC Tables for large data warehouse
    > tables.
    > My scenario:
    >
    > A fact table with 1000 Million Rows distributed on 12 Partitions (3 physical
    > hosts with 4 logical partitions each).
    > Overall size of table is 350 GB . Each night 1.5 Million new rows will be
    > added
    > and approx. the same amount of old records will be deleted (Roll in/Roll out
    > with SQL INSERT/DELETE).
    > The table is stored in SMS tablespace with 16K Pagesize and 64 Pages
    > Extentsize.
    > The tablespace has 6 containers on each partition. Each container is on a
    > separate IBM ESS array. Prefetchsize is 384 (6 containers * 64 pages).
    > Prefetch behaves very well
    > with these settings (DB2_PARALLEL_I O is set). DB2 is V8.1 ESE (DPF) FP5 and
    > runs on AIX 5.2.
    >
    > We figured out that for our choosen MDC dimensions we will have to use an
    > extentsize of 2 pages otherwise we would waste too much space. This
    > extentsize gives us headaches:
    >
    > - What is an optimal prefetchsize here?
    > With prefetchsize of 12 (6 containers * extentsize 2) each prefetcher will
    > only read 32 KB of data from one container.
    > With a prefetchsize of 384 (which is optimal from a disk layout point of
    > view) will DB2 start 192 prefetchers (that would be certainly overkill)?
    >
    > Further:
    > Does anybody have experiences with MDC tables for large warehouse tables and
    > is willing to share them?
    > Especially performance experiences when inserting/deleting during the
    > roll-in/roll-out of daily data ?
    >
    > Unfortunately I do not have an adequate environment to test all these
    > issues - so any comments are highly appreciated.
    >
    > TIA
    > Joachim
    >
    > PS: Feel free to send comments by email to joklassen at web dot de
    >
    >[/color]
    Try the new advisor in V8.2. You can simply download DB2 V8.2 onto your
    laptop, mimic the stats and then apply the proposal to DB2 V8.1.5.

    Cheers
    Serge

    --
    Serge Rielau
    DB2 SQL Compiler Development
    IBM Toronto Lab

    Comment

    • Joachim Klassen

      #3
      Re: Questions regarding large MDC tables

      Serge,
      thanks for the quick reply.

      I already followed your suggestion but with no luck. Design advisor makes no
      recommendations for MDC.
      Does that mean that MDC is not recommended at all for my scenario or do I
      have to refine my workload input (which is at the moment very basic with 3
      Queries and 1 Insert statement).

      Thanks again
      Joachim

      "Serge Rielau" <srielau@ca.ibm .com> schrieb im Newsbeitrag
      news:35plh7F421 m8tU1@individua l.net...[color=blue]
      > Joachim Klassen wrote:[color=green]
      >> Hi all,
      >>
      >> I'm currently investigating the use of MDC Tables for large data
      >> warehouse
      >> tables.
      >> My scenario:
      >>
      >> A fact table with 1000 Million Rows distributed on 12 Partitions (3
      >> physical hosts with 4 logical partitions each).
      >> Overall size of table is 350 GB . Each night 1.5 Million new rows will be
      >> added
      >> and approx. the same amount of old records will be deleted (Roll in/Roll
      >> out with SQL INSERT/DELETE).
      >> The table is stored in SMS tablespace with 16K Pagesize and 64 Pages
      >> Extentsize.
      >> The tablespace has 6 containers on each partition. Each container is on a
      >> separate IBM ESS array. Prefetchsize is 384 (6 containers * 64 pages).
      >> Prefetch behaves very well
      >> with these settings (DB2_PARALLEL_I O is set). DB2 is V8.1 ESE (DPF) FP5
      >> and runs on AIX 5.2.
      >>
      >> We figured out that for our choosen MDC dimensions we will have to use an
      >> extentsize of 2 pages otherwise we would waste too much space. This
      >> extentsize gives us headaches:
      >>
      >> - What is an optimal prefetchsize here?
      >> With prefetchsize of 12 (6 containers * extentsize 2) each prefetcher
      >> will
      >> only read 32 KB of data from one container.
      >> With a prefetchsize of 384 (which is optimal from a disk layout point of
      >> view) will DB2 start 192 prefetchers (that would be certainly overkill)?
      >>
      >> Further:
      >> Does anybody have experiences with MDC tables for large warehouse tables
      >> and
      >> is willing to share them?
      >> Especially performance experiences when inserting/deleting during the
      >> roll-in/roll-out of daily data ?
      >>
      >> Unfortunately I do not have an adequate environment to test all these
      >> issues - so any comments are highly appreciated.
      >>
      >> TIA
      >> Joachim
      >>
      >> PS: Feel free to send comments by email to joklassen at web dot de
      >>
      >>[/color]
      > Try the new advisor in V8.2. You can simply download DB2 V8.2 onto your
      > laptop, mimic the stats and then apply the proposal to DB2 V8.1.5.
      >
      > Cheers
      > Serge
      >
      > --
      > Serge Rielau
      > DB2 SQL Compiler Development
      > IBM Toronto Lab[/color]


      Comment

      • Serge Rielau

        #4
        Re: Questions regarding large MDC tables

        Joachim,

        Sure it does. The V8.2 advisor handles Indexes, Partitioning, MQT, and
        MDC. You sure you tried V8.2?
        MDC will not improve your insert performance. It can help with delete
        (rollout) and query.

        Cheers
        Serge
        --
        Serge Rielau
        DB2 SQL Compiler Development
        IBM Toronto Lab

        Comment

        • Joachim Klassen

          #5
          Re: Questions regarding large MDC tables

          "Serge Rielau" <srielau@ca.ibm .com> schrieb im Newsbeitrag
          news:35q4toF4q4 g98U1@individua l.net...[color=blue]
          > Joachim,
          >
          > Sure it does. The V8.2 advisor handles Indexes, Partitioning, MQT, and
          > MDC. You sure you tried V8.2?
          > MDC will not improve your insert performance. It can help with delete
          > (rollout) and query.
          >
          > Cheers
          > Serge
          > --
          > Serge Rielau
          > DB2 SQL Compiler Development
          > IBM Toronto Lab[/color]
          Serge
          Yes, I tried V8.2 (V8.1 FP7). I will retry with FP8 and a refined workload.
          But maybe I use it wrong. Here is what I've done so far:
          - captured objects ddl (tables and indexes) and statistics from the original
          tables via db2look
          - recreated the objects on my laptop in a non-partitioned instance (maybe
          thats the problem)
          - inserted some thousand records and run RUNSTATS to get entries in
          SYSCOLDIST etc.. to be able to update them later
          - updated the stats with the original values
          - started Design advisor GUI and defined a workload consisting of 3 typical
          queries and 1 insert
          - started recommendations

          I thought that MDC would not help with insert performance (more likely the
          opposite ?) thats why we consider LOAD FROM CURSOR.
          Our main goal is improvement for roll-out. Queries already perform well.
          But the more informations I find about MDC in a partitioned environment the
          more I doubt that they will help us in this special scenario.

          Thanks for your comments
          Joachim


          Comment

          Working...