Performance problems when inserting into a large table

**Serge Rielau** · Nov 12 '05, 10:05 AM

Re: Performance problems when inserting into a large table

Joachim Klassen wrote:
> Hi all,
>
> First, apologies if this question looks the same as another one I recently
> posted; it's a different issue, but for the same scenario :-).
>
> We are having performance problems when inserting/deleting rows from a large
> table.
> My scenario:
>
> Table (let's call it FACT1) with 1000 million rows distributed across 12
> partitions (3 physical hosts with 4 logical partitions each).
> Overall size of the table is 350 GB. Each night 1.5 million new rows are
> added and approximately the same number of old records are deleted
> (roll-in/roll-out with SQL INSERT/DELETE).
> The table is stored in an SMS tablespace with a 16K page size and an extent
> size of 64 pages.
> The tablespace has 6 containers on each partition. Each container is on a
> separate IBM ESS array.
> Prefetch size is 384 (6 containers * 64 pages). Prefetch behaves very well
> with these settings (DB2_PARALLEL_IO is set).
> DB2 is V8.1 ESE (DPF) FP5 and runs on AIX.
>
> It takes 7 hours to insert 1.5 million rows into FACT1 and up to 7 hours to
> delete the same number.
> The insert is done via INSERT INTO FACT1 ... SELECT * FROM STAGING_TABLE.
> Both the fact table and the staging table are in tablespaces in the same
> nodegroup and have the same partitioning key.
>
> On a similar table (let's call it FACT2) with a comparable amount of
> data/rows and a nearly identical configuration, the same process takes only
> 5 minutes.
>
> The main difference between these two tables is that FACT1 has 7 indexes
> defined on it and FACT2 only 4.
> One of the indexes in each case is unique, the others are not (all type 2).
> There is no clustering index and the APPEND attribute is set to ON.
> I'm aware of the pseudo-delete mechanism of type-2 indexes and the
> correspondingly longer search time for inserts in the index leaf pages.
> But taking an exclusive lock on the table before inserting/deleting does not
> change the runtime.
> (And the docs say that with an X-lock on the table pseudo-deletes will not
> happen.)
> Also, after a reorg of the table and indexes the insert runtime is the same
> as before.
>
> Is it possible that the additional index maintenance for FACT1 leads to such
> a longer runtime?
> What exactly happens internally during index maintenance? (I searched the
> docs but did not find any internals.)
I'm not privy to index maintenance internals, but could it be that the 7
indexes cause a spill of some heap? Maybe sort heap? Have you checked
the snapshots?
Have you verified that the plans are good? You shouldn't see any TQs.
Also, are you sure you don't have any other complicating factors (SQL
functions, triggers, check or RI constraints)? The plans will show them.
> PPS: In parallel we are investigating MDC tables, using smaller tables (and
> combining them with a UNION ALL view), and the use of LOAD FROM CURSOR
> instead of INSERT.
Be careful with LOAD FROM CURSOR: the cursor is a bottleneck. To do
that in a scalable fashion you would fire up concurrent LOADs on each
node, filtering the source by DBPARTITION.
You shouldn't need UNION ALL.
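
As a rough sanity check on the figures in the original question, here is a minimal Python sketch that turns them into rates. The numbers (1.5 million rows, 7 hours, 5 minutes, 12 partitions, 7 versus 4 indexes) come from the post; the even spread of rows over the partitions is an assumption, and the function names are made up for illustration.

```python
# Figures quoted in the original post.
ROWS = 1_500_000
PARTITIONS = 12
FACT1_SECONDS = 7 * 60 * 60   # 7 hours
FACT2_SECONDS = 5 * 60        # 5 minutes
FACT1_INDEXES = 7
FACT2_INDEXES = 4


def rows_per_second(rows: int, seconds: int) -> float:
    """Average insert rate over the whole run."""
    return rows / seconds


def summary() -> dict:
    """Derive comparison figures from the numbers in the post."""
    fact1_rate = rows_per_second(ROWS, FACT1_SECONDS)
    fact2_rate = rows_per_second(ROWS, FACT2_SECONDS)
    return {
        "fact1_rows_per_s": fact1_rate,
        "fact2_rows_per_s": fact2_rate,
        # Assumption: rows are spread evenly over the 12 partitions.
        "fact1_rows_per_s_per_partition": fact1_rate / PARTITIONS,
        "slowdown": FACT1_SECONDS / FACT2_SECONDS,
        "index_ratio": FACT1_INDEXES / FACT2_INDEXES,
        # One key insert per index per row.
        "fact1_index_key_inserts": ROWS * FACT1_INDEXES,
    }


if __name__ == "__main__":
    for name, value in summary().items():
        print(f"{name}: {value:,.2f}")
```

FACT1 has 1.75 times as many indexes as FACT2 but takes about 84 times as long, so the index count on its own does not account for the gap. That is consistent with looking for something that makes each insert much more expensive, such as a heap spill or one of the other complicating factors mentioned above.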

Cheers
Serge

--
Serge Rielau
DB2 SQL Compiler Development
IBM Toronto Lab

**Joachim Klassen** · Nov 12 '05, 10:05 AM

Re: Performance problems when inserting into a large table

Serge,
again thanks for your quick reply :-)

I will try to get snapshot information in the next few days. (The problem is
that "get snapshot for all" runs for 1 hour on production and once crashed the
instance :-) That problem is fixed in FP7, which will be applied soon.)

> Have you verified that the plans are good? You shouldn't see any TQs.
> Also are you sure you don't have any other complicating factors (SQL
> Functions, Triggers, check or RI constraints) (The plans will show).
The plan looks good to me. Maybe you can comment on it:

Section Code Page = 819

Estimated Cost = 31926.718750
Estimated Cardinality = 75608.000000

Coordinator Subsection - Main Processing:
   (-----) Distribute Subsection #1
   |  Broadcast to Node List
   |  |  Nodes = 1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
   |  |          11, 12

Subsection #1:
   (    3) Access Table Name = DTMP1T.STAGING  ID = 411,121
   |  #Columns = 24
   |  Volatile Cardinality
   |  Relation Scan
   |  |  Prefetch: Eligible
   |  Lock Intents
   |  |  Table: Intent Share
   |  |  Row  : Next Key Share
   (    2) Insert: Table Name = DPERMT.FACT1  ID = 1714,2

End of section

Optimizer Plan:

        INSERT
        (   2)
       /----/ \
   TBSCAN    Table:
   (   3)    DPERMT
     |       F7KB_F_A_T_Q_B_K
   Table:
   DTMP1T
   F7KB_F_A_T_Q_B_K

> Be careful with LOAD FROM CURSOR, the cursor is a bottleneck. To do that
> in a scalable fashion you would fire up concurrent LOADs on each node
> filtering the source by DBPARTITION.

Does that mean
DECLARE C1 CURSOR for select * from stage where dbpartitionnum(column) = 1
LOAD FROM C1 OF CURSOR INSERT INTO FACT1 ... OUTPUT_DBPARTNUMS 1
DECLARE C2 CURSOR for select * from stage where dbpartitionnum(column) = 2
LOAD FROM C2 OF CURSOR INSERT INTO FACT1 ... OUTPUT_DBPARTNUMS 2
and so on?

Thanks
Joachim

**Serge Rielau** · Nov 12 '05, 10:05 AM

Re: Performance problems when inserting into a large table

Joachim Klassen wrote:
> Optimizer Plan:
>
>         INSERT
>         (   2)
>        /----/ \
>    TBSCAN    Table:
>    (   3)    DPERMT
>      |       F7KB_F_A_T_Q_B_K
>    Table:
>    DTMP1T
>    F7KB_F_A_T_Q_B_K
Doesn't get easier than that...
>> Be careful with LOAD FROM CURSOR, the cursor is a bottleneck. To do that
>> in a scalable fashion you would fire up concurrent LOADs on each node
>> filtering the source by DBPARTITION.
>
> Does that mean
Connect to node 1:
> DECLARE C1 CURSOR for select * from stage where dbpartitionnum(column) = 1
> LOAD FROM C1 OF CURSOR INSERT INTO FACT1 ... OUTPUT_DBPARTNUMS 1
Connect to node 2:
> DECLARE C2 CURSOR for select * from stage where dbpartitionnum(column) = 2
> LOAD FROM C2 OF CURSOR INSERT INTO FACT1 ... OUTPUT_DBPARTNUMS 2
Connect to node "and so on":
> and so on

Basically you are your own splitter.

This, btw, is a great way to do batch processing with procedures.
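
To make the "be your own splitter" pattern concrete, here is a minimal Python sketch that writes out the per-partition statements from the exchange above, one set per partition number 1 to 12 (the node list in the plan). The statement text is copied from the thread, including the elided "..." in the LOAD command, so the output is a template rather than something that can run as-is. The helper names `load_commands` and `all_scripts` are hypothetical, and the way each session connects to its node is not shown in the thread, so it appears only as a comment.

```python
from typing import Dict, List


def load_commands(partition: int,
                  source: str = "stage",
                  target: str = "FACT1",
                  key_column: str = "column") -> List[str]:
    """Build the statements for one partition, as quoted in the thread.

    The '...' in the LOAD statement is elided in the thread and is kept
    verbatim; 'column' stands for the partitioning key column.
    """
    cursor = f"C{partition}"
    return [
        # How to connect to a specific node is not shown in the thread.
        f"-- connect to node {partition} first",
        f"DECLARE {cursor} CURSOR FOR SELECT * FROM {source} "
        f"WHERE DBPARTITIONNUM({key_column}) = {partition}",
        f"LOAD FROM {cursor} OF CURSOR INSERT INTO {target} ... "
        f"OUTPUT_DBPARTNUMS {partition}",
    ]


def all_scripts(partitions: range = range(1, 13)) -> Dict[int, List[str]]:
    """One list of statements per partition, meant to run concurrently."""
    return {p: load_commands(p) for p in partitions}


if __name__ == "__main__":
    for partition, lines in all_scripts().items():
        print("\n".join(lines))
        print()
```

Each generated set is intended for its own session, so that every LOAD reads only the staging rows that already live on its partition and no single cursor has to feed all twelve.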

Cheers
Serge

--
Serge Rielau
DB2 SQL Compiler Development
IBM Toronto Lab
