Tags: appreciate, concern, contextso, cursor, database, db2, federated, load, mysql, old, oracle, performance, quoting, reading, serge, someoneor, sql, unfounded

Performance of load from a cursor .... federated cursor?

On Database » DB2

13,240 words with 7 Comments; publish: Wed, 02 Jan 2008 21:39:00 GMT; (25060.55, « »)

Sorry for quoting an old post and probably I am reading out of context

so my concern is unfounded. But I would appreciate if I can get someone

or Serge to confirm. Also unlike the question asked in the post below,

my question involves non-partitioned table loads.

I want to know if, in general, loading from cursor is slower than

loading from a file? I was thinking cursor would normally be faster,

because of DB2's superior buffer/prefetch management than that of OS's

(AIX).

Now to my specific question. I need to transfer huge amount of data

from one DB to another DB, both are DB2 8.2.4 running on the same

server. Would I be better off exporting data to files and then loading

from the files, or, will it be faster to define nicknames and load from

federated cursors, since it will avoid having to create files?

Thanks

P Adhia

From:Serge Rielau - view profile

Date:Wed, Jan 26 2005 10:31 am

Email: Serge Rielau <srie....db2.todaysummary.com.ca.ibm.com>

Groups: comp.databases.ibm-db2

Not yet rated

Rating:

show options

Joachim Klassen wrote:

> Hi all,

> first apologies if this question looks the same as another one I recently

> posted - its a different thing but for the same szenario:-).

> We are having performance problems when inserting/deleting rows from a large

> table.

> My scenario:

> Table (lets call it FACT1) with 1000 million rows distributed on 12

> Partitions (3 physical hosts with 4 logical partitions each).

> Overall size of table is 350 GB. Each night 1.5 Million new rows will be

> added

--snipped--

I'm not privy of index maintenance internals, but could it be the 7

indexes cause a spill of some heap? Maybe sort heap? Have you checked

the snapshots?

Have you verified that the plans are good? You shouldn't see any TQs.

Also are you sure you don't have any other complicating factors (SQL

Functions, Triggers, check or RI constraints) (The plans will show).

> PPS: We are parallel investigating in MDC tables, using smaller tables (and

> combining them with a UNION ALL view) and the use of LOAD FROM CURSOR

> instead of INSERT

Be careful with LOAD FROM CURSOR, the cursor is a bottle neck. To do

that in a scalable fashion you would fire up concurrent LOADs on each

node filtering the source by DBPARTITION.

You shouldn't need UNION ALL.

Cheers

Serge

--

Serge Rielau

DB2 SQL Compiler Development

IBM Toronto Lab

All Comments

Leave a comment...

  • 7 Comments
    • P. Adhia wrote:

      > Sorry for quoting an old post and probably I am reading out of context

      > so my concern is unfounded. But I would appreciate if I can get someone

      > or Serge to confirm. Also unlike the question asked in the post below,

      > my question involves non-partitioned table loads.

      If I'm not someone. Does that make me a no-one? ;-)

      > I want to know if, in general, loading from cursor is slower than

      > loading from a file? I was thinking cursor would normally be faster,

      > because of DB2's superior buffer/prefetch management than that of OS's

      > (AIX).

      Loading from CURSOR is slower. I guess this has to do with the codepath

      involved.

      It's not about merely getting the data out of wherever it is and LOAD is

      darn efficient reading that flat file fast.

      In non DPF experiments (on DB2 V8 for LUW) indicate that cursor load is

      about half as fast as ascii delimited which in turn is about half as

      fast as IXF.

      (e.g. 14GB/hr/CPU ... 51GB/hr/CPU on the reference system)

      > Now to my specific question. I need to transfer huge amount of data

      > from one DB to another DB, both are DB2 8.2.4 running on the same

      > server. Would I be better off exporting data to files and then loading

      > from the files, or, will it be faster to define nicknames and load from

      > federated cursors, since it will avoid having to create files?

      Good question. I don't have data on EXPORT and that's the price you pay.

      (Or the "High Performance Unloader" tools which is quite literally a

      price you pay)

      One question to answer: Do you have enough disk to hold the IXF or the

      ASC file?

      <DB2 Viper Advertisement>

      You can use a new DATABASE option for the cursor load which improves

      usability and performance:

      If ABC.TABLE1 resides in a database different from the database

      ABC.TABLE2 is in, the DATABASE, USER, and USING options of the DECLARE

      CURSOR command can be used to perform the load. For example, if

      ABC.TABLE1 resides in database DB1, and the user ID and password for

      DB1 are user1 and pwd1 respectively, executing the following commands

      will load all the data from ABC.TABLE1 into ABC.TABLE2:

      db2 declare mycurs cursor database DB1 user user1 using pwd1 for

      select two,one,three from abc.table1

      db2 load from mycurs of cursor insert into abc.table2

      </DB2 Viper Advertisement

      --

      Serge Rielau

      DB2 Solutions Development

      IBM Toronto Lab

      #1; Wed, 02 Jan 2008 21:40:00 GMT
    • PS: Here is the link to the paper I'm parroting:

      ftp.software.ibm.com/software/data/pubs/papers/loaderperf.pdf

      --

      Serge Rielau

      DB2 Solutions Development

      IBM Toronto Lab

      #2; Wed, 02 Jan 2008 21:41:00 GMT
    • In article <4bd80qF112nkvU1.db2.todaysummary.com.individual.net>, srielau.db2.todaysummary.com.ca.ibm.com says...

      > db2 declare mycurs cursor database DB1 user user1 using pwd1 for

      > select two,one,three from abc.table1

      Hi Serge,

      I found some info about this way of defining a cursor at

      http://tinyurl.com/gk8v7 but not in the 'DECLARE CURSOR' docs. Should it

      be there and is this going to be updated or is it described somewhere

      else?

      It would be great if it would also be possible in a regular select

      statement.

      #3; Wed, 02 Jan 2008 21:42:00 GMT
    • Gert van der Kooij wrote:

      > In article <4bd80qF112nkvU1.db2.todaysummary.com.individual.net>, srielau.db2.todaysummary.com.ca.ibm.com says...

      >> db2 declare mycurs cursor database DB1 user user1 using pwd1 for

      >> select two,one,three from abc.table1

      >>

      > Hi Serge,

      > I found some info about this way of defining a cursor at

      > http://tinyurl.com/gk8v7 but not in the 'DECLARE CURSOR' docs. Should it

      > be there and is this going to be updated or is it described somewhere

      > else?

      > It would be great if it would also be possible in a regular select

      > statement.

      It has the smell of a CLP command This is not your regular cursor

      definition.

      Am I right that you are gearing towards 4 part names without need to

      create nicknames? _dbname_.schema.table.column

      Cheers

      Serge

      --

      Serge Rielau

      DB2 Solutions Development

      IBM Toronto Lab

      #4; Wed, 02 Jan 2008 21:43:00 GMT
    • In article <4bdd9tF10nf7gU1.db2.todaysummary.com.individual.net>, srielau.db2.todaysummary.com.ca.ibm.com says...

      > Gert van der Kooij wrote:

      > > In article <4bd80qF112nkvU1.db2.todaysummary.com.individual.net>, srielau.db2.todaysummary.com.ca.ibm.com says...

      > >> db2 declare mycurs cursor database DB1 user user1 using pwd1 for

      > >> select two,one,three from abc.table1

      > >>

      > > Hi Serge,

      > > I found some info about this way of defining a cursor at

      > > http://tinyurl.com/gk8v7 but not in the 'DECLARE CURSOR' docs. Should it

      > > be there and is this going to be updated or is it described somewhere

      > > else?

      > > It would be great if it would also be possible in a regular select

      > > statement.

      > It has the smell of a CLP command This is not your regular cursor

      > definition.

      > Am I right that you are gearing towards 4 part names without need to

      > create nicknames? _dbname_.schema.table.column

      > Cheers

      > Serge

      You're right, I found it at http://tinyurl.com/gbpeg.

      The 4 part name would be a great solution, but I guess authentication

      against the remote database could be an issue.

      #5; Wed, 02 Jan 2008 21:44:00 GMT
    • Gert van der Kooij wrote:

      > In article <4bdd9tF10nf7gU1.db2.todaysummary.com.individual.net>, srielau.db2.todaysummary.com.ca.ibm.com says...

      >> Gert van der Kooij wrote:

      >>> In article <4bd80qF112nkvU1.db2.todaysummary.com.individual.net>, srielau.db2.todaysummary.com.ca.ibm.com says...

      >>>

      >>>> db2 declare mycurs cursor database DB1 user user1 using pwd1 for

      >>>> select two,one,three from abc.table1

      >>>>

      >>> Hi Serge,

      >>> I found some info about this way of defining a cursor at

      >>> http://tinyurl.com/gk8v7 but not in the 'DECLARE CURSOR' docs. Should it

      >>> be there and is this going to be updated or is it described somewhere

      >>> else?

      >>> It would be great if it would also be possible in a regular select

      >>> statement.

      >> It has the smell of a CLP command This is not your regular cursor

      >> definition.

      >> Am I right that you are gearing towards 4 part names without need to

      >> create nicknames? _dbname_.schema.table.column

      >>

      >> Cheers

      >> Serge

      >>

      > You're right, I found it at http://tinyurl.com/gbpeg.

      > The 4 part name would be a great solution, but I guess authentication

      > against the remote database could be an issue.

      4 part name would provide the usability solution but apparently not the

      performance boost since the data would need to be selected from the

      remote DB, then pipe through DB2's runtime engine and then into the LOAD

      utility.

      In the DB2 Viper case the load utility connects directly to the remote DB.

      Cheers

      Serge

      PS: This doesn't mean that 4-part-names aren't good in their own right.

      DB2 Vipers main improvement for federation (as I see it) is full 2-phase

      commit support. That is you can finally write e.g. triggers which modify

      nicknames.

      --

      Serge Rielau

      DB2 Solutions Development

      IBM Toronto Lab

      #6; Wed, 02 Jan 2008 21:45:00 GMT
    • Serge Rielau wrote:

      > If I'm not someone. Does that make me a no-one? ;-)

      That makes you a quite-a-one :) (some<->quite -- I know, poor attempt at

      humor)

      > Loading from CURSOR is slower. I guess this has to do with the codepath

      > involved.

      OK. I am a bit surprised. I would have thought improved IO throughput at

      the cost of higher CPU utilization is a better trade-off (assuming

      system is not CPU constrained.)

      > It's not about merely getting the data out of wherever it is and LOAD is

      > darn efficient reading that flat file fast.

      > In non DPF experiments (on DB2 V8 for LUW) indicate that cursor load is

      > about half as fast as ascii delimited which in turn is about half as

      > fast as IXF.

      > (e.g. 14GB/hr/CPU ... 51GB/hr/CPU on the reference system)

      Thanks for saving me the trouble of benchmarking myself (which wouldn't

      be as accurate anyway).

      So can I interpret this as, cursor+load is still better than

      cursor(export)+write_file+read_file+load? That is, if the source data

      resides in a DB2 table and needs to be loaded to another table. Or is

      the combination cursor+load like a negative synergy?

      > Good question. I don't have data on EXPORT and that's the price you pay.

      > (Or the "High Performance Unloader" tools which is quite literally a

      > price you pay)

      > One question to answer: Do you have enough disk to hold the IXF or the

      > ASC file?

      Yes I do have enough space. Unloading to ixf files is what I have been

      doing.

      I don't have much hands-on experience with federation, so I wasn't sure

      if federation will add any significant overhead, to basic cursor+load

      process. For example, does federation use networking services even if

      federated DB resides on the same physical machine?

      Thanks for the prompt answers and the link for the load performance doc,

      it definitely looks to be very helpful for what I am doing.

      P Adhia

      PS: Just some non-scientific numbers: one iteration I ran, involved

      exporting 350G+ DB2 data, not counting indexes, to ixf and it took about

      6 hours with 6 processes running concurrently on 2 CPU PowerPC 1.65GHz

      64 bit machine. The load of the same data on same machine took 36 hours,

      with significant time spent in building indexes. Due to lack of temp

      space, I couldn't run multiple loads concurrently, but when I run next

      iteration, I'll add more temp space and run multiple loads concurrently.

      #7; Wed, 02 Jan 2008 21:46:00 GMT