Performance problems when inserting into a large table

  • Joachim Klassen

    Performance problems when inserting into a large table

    Hi all,

    first, apologies if this question looks the same as another one I recently
    posted - it's a different thing, but for the same scenario :-).

    We are having performance problems when inserting rows into and deleting
    rows from a large table.
    My scenario:

    Table (let's call it FACT1) with 1000 million rows distributed over 12
    partitions (3 physical hosts with 4 logical partitions each).
    Overall size of the table is 350 GB. Each night 1.5 million new rows are
    added and approximately the same number of old records are deleted
    (roll-in/roll-out with SQL INSERT/DELETE).
    The table is stored in an SMS tablespace with a 16K page size and an extent
    size of 64 pages.
    The tablespace has 6 containers on each partition. Each container is on a
    separate IBM ESS array.
    Prefetch size is 384 pages (6 containers * 64 pages). Prefetch behaves very
    well with these settings (DB2_PARALLEL_IO is set).
    DB2 is V8.1 ESE (DPF) FP5 and runs on AIX.
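
The sizing figures above can be cross-checked with a little arithmetic. This is only a sketch: it assumes decimal gigabytes, an even spread of rows across the 12 partitions, and that the 350 GB figure refers to the table data alone, none of which the post states.

```python
# Figures quoted in the post.
TOTAL_ROWS = 1_000_000_000      # "1000 million rows"
TOTAL_BYTES = 350 * 10**9       # 350 GB (assumption: decimal gigabytes)
PARTITIONS = 12
NIGHTLY_ROWS = 1_500_000
PAGE_BYTES = 16 * 1024          # 16K page size
EXTENT_PAGES = 64
CONTAINERS = 6

# Derived values (assumption: rows are spread evenly over the partitions).
bytes_per_row = TOTAL_BYTES / TOTAL_ROWS
rows_per_partition = TOTAL_ROWS // PARTITIONS
nightly_fraction = NIGHTLY_ROWS / TOTAL_ROWS
prefetch_pages = CONTAINERS * EXTENT_PAGES
prefetch_bytes = prefetch_pages * PAGE_BYTES

print(f"average bytes per row : {bytes_per_row:.0f}")
print(f"rows per partition    : {rows_per_partition:,}")
print(f"nightly share of table: {nightly_fraction:.2%}")
print(f"prefetch size         : {prefetch_pages} pages = {prefetch_bytes // 2**20} MiB")
```

Under these assumptions the nightly batch touches well under one percent of the table, and one prefetch request spans exactly one extent on each of the 6 containers.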

    It takes 7 hours to insert 1.5 million rows into FACT1 and up to 7 hours to
    delete the same number.
    The insert is done via INSERT INTO FACT1 ... SELECT * FROM STAGING_TABLE.
    Both the fact table and the staging table are in tablespaces in the same
    nodegroup and have the same partitioning key.

    On a similar table (let's call it FACT2) with a comparable amount of
    data/rows and a nearly identical configuration the same process takes only 5
    minutes.
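
To put the two runtimes side by side, here is a rough throughput calculation. It is a sketch that assumes the FACT2 run also handles about 1.5 million rows (the post only says "comparable") and that the work is spread evenly over the 12 partitions.

```python
# Runtimes quoted in the post.
ROWS = 1_500_000             # rows inserted per nightly run
FACT1_SECONDS = 7 * 3600     # 7 hours
FACT2_SECONDS = 5 * 60       # 5 minutes (assumption: same row count as FACT1)
PARTITIONS = 12


def rows_per_second(rows: int, seconds: int) -> float:
    """Average insert rate over the whole run."""
    return rows / seconds


fact1_rate = rows_per_second(ROWS, FACT1_SECONDS)
fact2_rate = rows_per_second(ROWS, FACT2_SECONDS)
slowdown = FACT1_SECONDS / FACT2_SECONDS

print(f"FACT1: {fact1_rate:.1f} rows/s overall, "
      f"{fact1_rate / PARTITIONS:.1f} rows/s per partition")
print(f"FACT2: {fact2_rate:.1f} rows/s overall, "
      f"{fact2_rate / PARTITIONS:.1f} rows/s per partition")
print(f"FACT1 is slower by a factor of {slowdown:.0f}")
```

Going from 4 to 7 indexes is less than a doubling of the index work, while the runtime differs by a factor of 84, so the number of indexes on its own does not look like a proportional explanation.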

    The main difference between these two tables is that FACT1 has 7 indexes
    defined on it and FACT2 only 4.
    In each case one of the indexes is unique and the others are not (all are
    type 2).
    There is no clustering index and the APPEND attribute is set to ON.
    I'm aware of the pseudo-delete mechanism of type-2 indexes and the
    correspondingly longer search time for inserts in the index leaf pages.
    But taking an exclusive lock on the table before inserting/deleting does not
    change the runtime.
    (And the docs say that with an X lock on the table pseudo-deletes will not
    happen.)
    Also, after a reorg of the table and indexes the insert runtime is the same
    as before.

    Is it possible that the additional index maintenance for FACT1 leads to such
    a long runtime?
    What exactly happens internally during index maintenance (I searched the
    docs but did not find any internals)?
    Has anyone seen similar behaviour?

    I can post additional information if required (table and index definitions,
    statistics ...) - but I wanted to keep the posting small in the first place.

    TIA for any comments
    Joachim

    PS: Feel free to send comments by email to joklassen at web dot de
    PPS: In parallel, we are investigating MDC tables, using smaller tables
    (combined with a UNION ALL view), and using LOAD FROM CURSOR instead of
    INSERT.


  • Serge Rielau

    #2
    Re: Performance problems when inserting into a large table

    Joachim Klassen wrote:
    > Is it possible that the additional index maintenance for FACT1 leads to such
    > a long runtime?
    > What exactly happens internally during index maintenance (I searched the
    > docs but did not find any internals)?
    I'm not privy to index maintenance internals, but could it be that the 7
    indexes cause some heap to spill? Maybe the sort heap? Have you checked
    the snapshots?
    Have you verified that the plans are good? You shouldn't see any TQs
    (table queues).
    Also, are you sure you don't have any other complicating factors (SQL
    functions, triggers, check or RI constraints)? The plans will show them.
    > PPS: In parallel, we are investigating MDC tables, using smaller tables
    > (combined with a UNION ALL view), and using LOAD FROM CURSOR instead of
    > INSERT.
    Be careful with LOAD FROM CURSOR; the cursor is a bottleneck. To do
    that in a scalable fashion you would fire up concurrent LOADs on each
    node, filtering the source by DBPARTITION.
    You shouldn't need UNION ALL.

    Cheers
    Serge

    --
    Serge Rielau
    DB2 SQL Compiler Development
    IBM Toronto Lab

    • Joachim Klassen

      #3
      Re: Performance problems when inserting into a large table

      Serge,
      again thanks for your quick reply :-)

      I will try to get snapshot information in the next few days. (The problem
      is that "get snapshot for all" runs for 1 hour on production and once
      crashed the instance in the past :-). That problem is fixed in FP7, which
      will be applied soon.)

      > Have you verified that the plans are good? You shouldn't see any TQs.
      > Also are you sure you don't have any other complicating factors (SQL
      > Functions, Triggers, check or RI constraints) (The plans will show).
      The plan looks good (to me). Maybe you can comment on it:

      Section Code Page = 819

      Estimated Cost = 31926.718750
      Estimated Cardinality = 75608.000000

      Coordinator Subsection - Main Processing:
      (-----) Distribute Subsection #1
      | Broadcast to Node List
      | | Nodes = 1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
      | | 11, 12

      Subsection #1:
      ( 3) Access Table Name = DTMP1T.STAGING ID = 411,121
      | #Columns = 24
      | Volatile Cardinality
      | Relation Scan
      | | Prefetch: Eligible
      | Lock Intents
      | | Table: Intent Share
      | | Row : Next Key Share
      ( 2) Insert: Table Name = DPERMT.FACT1 ID = 1714,2

      End of section


      Optimizer Plan:

               INSERT
               (   2)
              /----/ \
          TBSCAN    Table:
          (   3)    DPERMT
            |       F7KB_F_A_T_Q_B_K
          Table:
          DTMP1T
          F7KB_F_A_T_Q_B_K

      > Be careful with LOAD FROM CURSOR, the cursor is a bottleneck. To do that
      > in a scalable fashion you would fire up concurrent LOADs on each node
      > filtering the source by DBPARTITION.

      Does that mean:
      DECLARE C1 CURSOR for select * from stage where dbpartitionnum(column) = 1
      LOAD FROM C1 OF CURSOR INSERT INTO FACT1 ... OUTPUT_DBPARTNUMS 1
      DECLARE C2 CURSOR for select * from stage where dbpartitionnum(column) = 2
      LOAD FROM C2 OF CURSOR INSERT INTO FACT1 ... OUTPUT_DBPARTNUMS 2
      and so on?

      Thanks
      Joachim

      • Serge Rielau

        #4
        Re: Performance problems when inserting into a large table

        Joachim Klassen wrote:
        > Optimizer Plan:
        >
        >          INSERT
        >          (   2)
        >         /----/ \
        >     TBSCAN    Table:
        >     (   3)    DPERMT
        >       |       F7KB_F_A_T_Q_B_K
        >     Table:
        >     DTMP1T
        >     F7KB_F_A_T_Q_B_K
        Doesn't get easier than that...
        >> Be careful with LOAD FROM CURSOR, the cursor is a bottleneck. To do that
        >> in a scalable fashion you would fire up concurrent LOADs on each node
        >> filtering the source by DBPARTITION.
        >
        > Does that mean
        Connect to node 1:
        > DECLARE C1 CURSOR for select * from stage where dbpartitionnum(column) = 1
        > LOAD FROM C1 OF CURSOR INSERT INTO FACT1 ... OUTPUT_DBPARTNUMS 1
        Connect to node 2:
        > DECLARE C2 CURSOR for select * from stage where dbpartitionnum(column) = 2
        > LOAD FROM C2 OF CURSOR INSERT INTO FACT1 ... OUTPUT_DBPARTNUMS 2
        Connect to node "and so on":
        > and so on

        Basically you are your own splitter.

        This, btw, is a great way to do batch processing with procedures.
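
The "be your own splitter" scheme above can be scripted. The following is a minimal sketch that only generates the statement text, one pair per partition, in the shape quoted in this thread. The helper name `splitter_commands` is hypothetical, `column` stands in for the real partitioning key column, and the `...` in the LOAD statement is left exactly as elided in the thread, so the output is a template and not a ready-to-run LOAD command.

```python
def splitter_commands(partitions, source="stage", target="FACT1", key="column"):
    """Return {partition number: [DECLARE statement, LOAD statement]}.

    The statement shapes are copied from the thread; the '...' in the LOAD
    statement is the thread's own elision and is deliberately not filled in.
    """
    scripts = {}
    for n in partitions:
        scripts[n] = [
            f"DECLARE C{n} CURSOR FOR SELECT * FROM {source} "
            f"WHERE DBPARTITIONNUM({key}) = {n}",
            f"LOAD FROM C{n} OF CURSOR INSERT INTO {target} ... "
            f"OUTPUT_DBPARTNUMS {n}",
        ]
    return scripts


# Partitions 1..12, matching the node list shown in the access plan.
scripts = splitter_commands(range(1, 13))

for node, statements in scripts.items():
    # Each pair is meant for its own session, connected to that node.
    print(f"-- connect to node {node}")
    for statement in statements:
        print(statement)
```

Each generated pair is intended to run in a separate session connected to the matching node, with all sessions started concurrently, so that no single cursor has to feed every partition.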

        Cheers
        Serge

        --
        Serge Rielau
        DB2 SQL Compiler Development
        IBM Toronto Lab
