Should I Disable Auto Commit During Load?

Moderator: NorbertKrupa

Post Reply
User avatar
piglet
Beginner
Beginner
Posts: 45
Joined: Tue Feb 07, 2012 4:04 pm

Should I Disable Auto Commit During Load?

Post by piglet » Wed Mar 20, 2013 3:24 pm

Hi all,

One way we speed up our MySQL tables loads is to disable AUTOCOMMIT. Is this something recommended that we do in Vertica prior to loading data with the COPY command?

Thanks!!!

pborne
Newbie
Newbie
Posts: 20
Joined: Mon Feb 18, 2013 1:37 am

Re: Should I Disable Auto Commit During Load?

Post by pborne » Sun Mar 24, 2013 4:12 pm

Usually, when one turns off auto commit with the COPY command is because one needs to check the data before committing. For instance you could ask Vertica to check PKs and FKs before committing.

If you want your loads to be fast, there are a few key points to keep in mind:

- Load directly to ROS with the "DIRECT" keyword. If you use COPY, you probably want to load a lot of data in bulk, so bypass the WOS, you will flood it anyway.
- Load with gzipped files, using the "GZIP" keyword, as it will help with IO requirements, either from the local disks or from the network if loading from a file server.
- Load in parallel on all nodes so you can leverage the different servers in your cluster.
- Use multiple streams per node so you can leverage multiple CPU cores in each node. You can either run multiple COPY commands on each node (your need to write a script and have multiple files) or you can reference multiple files in the "FROM" clause of the COPY command.

Another way to easily get more parallelism is to use the "ON ANY NODE" extension that was added in version 6.x. You will have some constraints though as all nodes must be able to get to the files with the exact same path. This implies a shared file server of some sort with local mount points being exactly the same or you need to copy all files on all nodes locally. This is because you cannot control what node loads what file any more.

Once you have implemented all or most of the above, you will want to look into tuning the resource pools and the tuple mover so you don't get into the "ROS pushback".

Patrice

Post Reply

Return to “Vertica Performance Tuning”