To insert 60 million rows to database

  • dusty
    New Member
    • Oct 2006
    • 12

    To insert 60 million rows to database

    Hi all,

    How do I insert 60 million rows into a Postgres database? I want to improve the performance, as it is currently taking 1 minute to insert every 1,000 rows. Is there an easier way to do it? Thanks in advance.
  • inychopra
    New Member
    • Jan 2007
    • 2

    #2
    Originally posted by dusty
    Hi all,

    How do I insert 60 million rows into a Postgres database? I want to improve the performance, as it is currently taking 1 minute to insert every 1,000 rows. Is there an easier way to do it? Thanks in advance.
    Hello dusty, I actually wanted to know whether you have found a solution to this problem or not. I am facing the same problem, so if you don't mind helping me with this, please do send me the solution. Thanks.


    • michaelb
      Recognized Expert Contributor
      • Nov 2006
      • 534

      #3
      Insert/delete/update performance depends on how many indexes you have on the table; make sure you have only as many as you really need.

      Perhaps even more importantly, when doing a bulk insert you should consider using the COPY command instead of SQL "INSERT INTO my_table ...". This should speed up the process very significantly.
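
      For example, something along these lines (just a sketch; my_table, the column list, and the file path are placeholders for your own schema and data file):

      -- Server-side COPY reads a file that is accessible to the PostgreSQL server process.
      COPY my_table (id, name, created_at)
      FROM '/tmp/my_table_data.csv'
      WITH CSV;

      -- From psql, \copy does the same thing but reads the file on the client machine:
      -- \copy my_table (id, name, created_at) FROM 'my_table_data.csv' WITH CSV

      A single COPY loads the whole file in one statement, so it avoids the per-statement overhead of millions of individual INSERTs.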


      • iburyak
        Recognized Expert Top Contributor
        • Nov 2006
        • 1016

        #4
        For bulk inserts I would suggest dropping the indexes entirely and then recreating them after the insert completes.
        When data is inserted with indexes on the table, especially a clustered index, the indexes are rebuilt after each insert. If it is a clustered index, the data is physically moved, which takes even more time.
        Rebuilding the indexes will take some time at the end, but it will be done once and not each time a record is inserted.
        Also check whether you have triggers/constraints that could be dropped and recreated if possible.
        Only triggers/constraints that check data integrity should stay in place.
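
        A rough sketch of that approach (the index name, table name, and file path here are placeholders):

        -- Drop non-essential indexes before the load...
        DROP INDEX my_table_name_idx;

        -- ...load the data in bulk (COPY, batched inserts, etc.)...
        COPY my_table FROM '/tmp/my_table_data.csv' WITH CSV;

        -- ...then rebuild each index once at the end.
        CREATE INDEX my_table_name_idx ON my_table (name);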

        A bulk insert should work faster because it doesn't use the transaction log, which speeds things up tremendously.
        If you still want to use straight inserts, commit the transaction after every 1,000 records.
        It will help clear the transaction log.
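
        If you stay with plain INSERTs, batching them in explicit transactions looks roughly like this (the table, columns, and values are placeholders):

        BEGIN;
        INSERT INTO my_table (id, name) VALUES (1, 'row 1');
        INSERT INTO my_table (id, name) VALUES (2, 'row 2');
        -- ...another 998 inserts...
        COMMIT;   -- one commit per 1,000 rows instead of one per row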

        Good Luck.


        • ahammond
          New Member
          • Jan 2007
          • 7

          #5
          Originally posted by iburyak
          For bulk inserts I would suggest dropping the indexes entirely and then recreating them after the insert completes.
          This assumes that you don't have concurrent queries trying to read from those tables. If you do, and they depend on the index, you won't like the effect.

          Originally posted by iburyak
          When data is inserted with indexes on the table, especially a clustered index, the indexes are rebuilt after each insert. If it is a clustered index, the data is physically moved, which takes even more time.
          This is incorrect. PostgreSQL does not dynamically maintain clustering. AFAIK there are no plans to support dynamically maintained clustering at any time in the near future.

          Originally posted by iburyak
          Rebuilding the indexes will take some time at the end, but it will be done once and not each time a record is inserted.
          Also check whether you have triggers/constraints that could be dropped and recreated if possible.
          Only triggers/constraints that check data integrity should stay in place.

          A bulk insert should work faster because it doesn't use the transaction log, which speeds things up tremendously.
          If you still want to use straight inserts, commit the transaction after every 1,000 records.
          It will help clear the transaction log.

          Good Luck.
          Nobody seems to have mentioned using prepared transactions yet, which should make a significant improvement.
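
          Assuming what's meant here is preparing the INSERT statement once and re-executing it for each row (rather than PREPARE TRANSACTION, which is two-phase commit), a sketch would be something like this (table, columns, and values are placeholders):

          -- Plan the INSERT once, then execute it repeatedly with new values.
          PREPARE bulk_ins (integer, text) AS
              INSERT INTO my_table (id, name) VALUES ($1, $2);

          EXECUTE bulk_ins (1, 'row 1');
          EXECUTE bulk_ins (2, 'row 2');
          -- ...repeat for each row, ideally inside one transaction...

          DEALLOCATE bulk_ins;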

