Need to generate 9B rows in a single table




Post by sderry » Fri Jan 06, 2012 11:01 pm

As I haven't worked with the product before, I have some questions:


1) How is the "shuffle" option used? It seems to be related to null values.

2) To correlate column values, would it be best to use a CSV file in which the column values are defined?

3) Would each column generator use the same row from the CSV file?

4) As generation will be for a single table, are the data generation and inserts multithreaded?

5) From searching the forum I found that bulk insert is used. Is there a way to control the number of rows included in each bulk insert?

6) The row size of the table is ~100 bytes, it's a 64-way system, and the table is partitioned. Any thoughts on how long it might take to add 9B rows (no indexes)?

Thanks,
Stan

Post by sderry » Mon Jan 09, 2012 10:43 pm

With further research I was able to answer my questions 2, 3, and 5. From preliminary test runs, data generation seems to be single-threaded (question 4).

Post by peter.peart » Thu Jan 12, 2012 9:14 pm

Thanks for your post and sorry for the delay in our reply. Please allow me to answer your questions in the order you laid them out:

1) Sorry, but I'm not sure I understand this. What do you mean by "option shuffle"?

4) To the best of my knowledge, the application is multi-threaded.

6) There really isn't any way we can estimate that. There are so many variables involved, from where the product is installed relative to where you are generating the data, to the processor speed of the box and the speed of your disk drives.
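Although no real estimate is possible, the arithmetic behind question 6 can at least be laid out. The sketch below takes the 9B rows and ~100 bytes per row from the original question; the sustained insert rates are purely hypothetical placeholders, not measurements of SQL Data Generator or of any particular server.

```python
# Back-of-envelope load-time arithmetic.
# ROWS and ROW_BYTES come from the original question; the rates are assumptions.
ROWS = 9_000_000_000   # rows to generate
ROW_BYTES = 100        # approximate row size in bytes


def load_hours(rows: int, rows_per_second: float) -> float:
    """Hours needed to insert `rows` at a sustained rate of `rows_per_second`."""
    return rows / rows_per_second / 3600


# Raw data volume in GB (10^9 bytes), ignoring page and row storage overhead.
raw_gb = ROWS * ROW_BYTES / 1e9

if __name__ == "__main__":
    print(f"raw data: ~{raw_gb:.0f} GB")
    for rate in (10_000, 100_000, 1_000_000):  # hypothetical sustained rows/s
        print(f"{rate:>9,} rows/s -> {load_hours(ROWS, rate):7.1f} h")
```

Measuring the actual rate on a short test run and plugging it in gives a first approximation, though the rate may not stay constant over a load of this size.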

Pete
Peter Peart
Red Gate Software Ltd
+44 (0)870 160 0037 ext. 8569
1 866 RED GATE ext. 8569

Post by sderry » Thu Jan 12, 2012 9:24 pm

1) When you have associated a generator with a column, there is a selectable "shuffle" element. I was wondering how it is used.

4) Yes, it is probably multi-threaded when generating data for multiple tables. When generating data for a single table, a single thread appears to do the inserts. I've resorted to running multiple copies of SQLDataGenerator to work around this.

6) Yes, there are a number of variables; I was hoping someone had experience with such a large load and might have some guidance.

One thing I noticed is that, prior to doing the inserts, SQLDataGenerator executes a "select count_big(*) from <tablename>", which has a major impact on the overall performance of the product. I expect the count is run because the table is not set to be truncated before data generation starts. It would be nice to have an option to disable this select.
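For the multiple-copies workaround in point 4, the total row count has to be divided among the copies. A minimal sketch of that split follows; the function name and the choice of 64 workers (matching the 64-way system) are illustrative assumptions, and how each copy is then configured with its share is outside the sketch.

```python
def split_rows(total_rows: int, workers: int) -> list[tuple[int, int]]:
    """Return (start_offset, row_count) pairs that cover total_rows as evenly as possible."""
    if workers < 1:
        raise ValueError("workers must be at least 1")
    base, extra = divmod(total_rows, workers)
    ranges = []
    start = 0
    for i in range(workers):
        # The first `extra` workers take one additional row each.
        count = base + (1 if i < extra else 0)
        ranges.append((start, count))
        start += count
    return ranges


if __name__ == "__main__":
    # Hypothetical: one copy per processor on the 64-way system.
    for start, count in split_rows(9_000_000_000, 64)[:3]:
        print(f"start={start:,} count={count:,}")
```

If the column values must be unique across copies, each copy would also need a distinct starting value or seed; the start offsets above are one way to derive such values.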

