Partition Swapping
Posted: Wed Jun 12, 2013 10:41 pm
Partition swapping is a common technique for loading data into tables. Typically you will load data into a stage table, then swap the stage table into a partition in the destination table. For large tables this is much faster than a merge, with downtime measured in seconds.
Here's what I learned:
1. A fast way to create a stage table identical to the destination table is:
2. If you move a partition into an existing table that already has data in that partition, the data will be added to what is already in the table. Vertica will not check to see if Primary Key constraints are violated, you will have to do that manually.
3. If you want to partition a table differently Vertica will take care of the repartitioning in the background. You can continue to query the table while it is being rebuilt.
SQL File: HowToDoPartitionSwappingInVertica.sql
Result
Here's what I learned:
1. A fast way to create a stage table identical to the destination table is:
Code: Select all
CREATE TABLE <StageTable> LIKE <DestinationTable> INCLUDING PROJECTIONS;
3. If you want to partition a table differently Vertica will take care of the repartitioning in the background. You can continue to query the table while it is being rebuilt.
Code: Select all
ALTER TABLE <TableName> PARTITION BY <PartitionFunctionGoesHere> REORGANIZE;
Example: ALTER TABLE MyExampleDestination PARTITION BY ( FLOOR(RowID/20)::INT ) REORGANIZE;
SQL File: HowToDoPartitionSwappingInVertica.sql
Code: Select all
----------------------------------------------------------------------------------------------
-- Create a table with a Primary Key Constraint
----------------------------------------------------------------------------------------------
CREATE TABLE MyExampleDestination
( RowID INT NOT NULL PRIMARY KEY
,VariableOne INT NOT NULL
,VariableTwo INT
)
ORDER BY RowID
PARTITION BY ( FLOOR(RowID/10)::INT );
----------------------------------------------------------------------------------------------
-- Populate table with 100 records of data
----------------------------------------------------------------------------------------------
INSERT INTO MyExampleDestination
(RowID, VariableOne, VariableTwo)
SELECT ROW_NUMBER() OVER() - 1 AS RowID
,(RANDOM()*100)::INT AS VariableOne
,(RANDOM()*100)::INT AS VariableTwo
FROM all_tables
limit 100;
----------------------------------------------------------------------------------------------
-- Should return 0 records if Primary Key Constraint is valid
----------------------------------------------------------------------------------------------
SELECT ANALYZE_CONSTRAINTS('MyExampleDestination');
----------------------------------------------------------------------------------------------
-- Show the distribution of records by the Partition Key in the Destination Table
----------------------------------------------------------------------------------------------
SELECT FLOOR(RowID/10)::INT AS Partition, COUNT(*) AS Quantity
FROM MyExampleDestination
GROUP BY 1
ORDER BY 1;
----------------------------------------------------------------------------------------------
-- Create a Source table with the same structure as the Destination Table
----------------------------------------------------------------------------------------------
CREATE TABLE MyExampleSource
LIKE MyExampleDestination INCLUDING PROJECTIONS;
----------------------------------------------------------------------------------------------
-- Populate the Source table with data for Projection Keys < 5
----------------------------------------------------------------------------------------------
INSERT /*+ direct */
INTO MyExampleSource
SELECT *
FROM MyExampleDestination
WHERE FLOOR(RowID/10)::INT < 5
AND VariableTwo%2=0;
----------------------------------------------------------------------------------------------
-- Show the distribution of records in the Source table
----------------------------------------------------------------------------------------------
SELECT FLOOR(RowID/10)::INT, COUNT(*) AS Quantity
FROM MyExampleSource
GROUP BY 1
ORDER BY 1;
----------------------------------------------------------------------------------------------
-- Drop Partitions 0-3. Leave 4 alone.
----------------------------------------------------------------------------------------------
SELECT DROP_PARTITION ('MyExampleDestination' , 0);
SELECT DROP_PARTITION ('MyExampleDestination' , 1);
SELECT DROP_PARTITION ('MyExampleDestination' , 2);
SELECT DROP_PARTITION ('MyExampleDestination' , 3);
----------------------------------------------------------------------------------------------
-- Show the distribution of records by the Partition Key in the Destination Table
----------------------------------------------------------------------------------------------
SELECT FLOOR(RowID/10)::INT AS Partition, COUNT(*) AS Quantity
FROM MyExampleDestination
GROUP BY 1
ORDER BY 1;
----------------------------------------------------------------------------------------------
-- Swap partitions 0-4 from Source table into Destination Table
----------------------------------------------------------------------------------------------
SELECT MOVE_PARTITIONS_TO_TABLE ( 'MyExampleSource',0, 4, 'MyExampleDestination' ) ;
SELECT FLOOR(RowID/10)::INT, COUNT(*) AS Quantity
FROM MyExampleDestination
GROUP BY 1
ORDER BY 1;
----------------------------------------------------------------------------------------------
-- Show the distribution of records by the Partition Key in the Destination Table
-- Note that Partition 4 has both the existing and the new records.
----------------------------------------------------------------------------------------------
SELECT FLOOR(RowID/10)::INT AS Partition, COUNT(*) AS Quantity
FROM MyExampleDestination
GROUP BY 1
ORDER BY 1;
----------------------------------------------------------------------------------------------
-- Should return records showing where Primary Key Constraint is NOT valid in Partition 4
----------------------------------------------------------------------------------------------
SELECT ANALYZE_CONSTRAINTS('MyExampleDestination');
----------------------------------------------------------------------------------------------
-- Delete bad Partition
----------------------------------------------------------------------------------------------
SELECT DROP_PARTITION ('MyExampleDestination' , 4);
----------------------------------------------------------------------------------------------
-- Change the way the table is partitioned. This runs in the background, you can continue
-- to query the table while it is being reorganized to use the new partitioning structure
----------------------------------------------------------------------------------------------
ALTER TABLE MyExampleDestination
PARTITION BY ( FLOOR(RowID/20)::INT ) REORGANIZE;
----------------------------------------------------------------------------------------------
-- Clean Up
----------------------------------------------------------------------------------------------
DROP TABLE MyExampleDestination, MyExampleSource;
Result
Code: Select all
dbadmin=> \i HowToDoPartitionSwappingInVertica.sql
----------------------------------------------------------------------------------------------
-- Create a table with a Primary Key Constraint
----------------------------------------------------------------------------------------------
CREATE TABLE MyExampleDestination
( RowID INT NOT NULL PRIMARY KEY
,VariableOne INT NOT NULL
,VariableTwo INT
)
ORDER BY RowID
PARTITION BY ( FLOOR(RowID/10)::INT );
CREATE TABLE
----------------------------------------------------------------------------------------------
-- Populate table with 100 records of data
----------------------------------------------------------------------------------------------
INSERT INTO MyExampleDestination
(RowID, VariableOne, VariableTwo)
SELECT ROW_NUMBER() OVER() - 1 AS RowID
,(RANDOM()*100)::INT AS VariableOne
,(RANDOM()*100)::INT AS VariableTwo
FROM all_tables
limit 100;
OUTPUT
--------
100
(1 row)
----------------------------------------------------------------------------------------------
-- Should return 0 records if Primary Key Constraint is valid
----------------------------------------------------------------------------------------------
SELECT ANALYZE_CONSTRAINTS('MyExampleDestination');
Schema Name | Table Name | Column Names | Constraint Name | Constraint Type | Column Values
-------------+------------+--------------+-----------------+-----------------+---------------
(0 rows)
----------------------------------------------------------------------------------------------
-- Show the distribution of records by the Partition Key in the Destination Table
----------------------------------------------------------------------------------------------
SELECT FLOOR(RowID/10)::INT AS Partition, COUNT(*) AS Quantity
FROM MyExampleDestination
GROUP BY 1
ORDER BY 1;
Partition | Quantity
-----------+----------
0 | 10
1 | 10
2 | 10
3 | 10
4 | 10
5 | 10
6 | 10
7 | 10
8 | 10
9 | 10
(10 rows)
----------------------------------------------------------------------------------------------
-- Create a Source table with the same structure as the Destination Table
----------------------------------------------------------------------------------------------
CREATE TABLE MyExampleSource
LIKE MyExampleDestination INCLUDING PROJECTIONS;
CREATE TABLE
----------------------------------------------------------------------------------------------
-- Populate the Source table with data for Projection Keys < 5
----------------------------------------------------------------------------------------------
INSERT /*+ direct */
INTO MyExampleSource
SELECT *
FROM MyExampleDestination
WHERE FLOOR(RowID/10)::INT < 5
AND VariableTwo%2=0;
OUTPUT
--------
24
(1 row)
----------------------------------------------------------------------------------------------
-- Show the distribution of records in the Source table
----------------------------------------------------------------------------------------------
SELECT FLOOR(RowID/10)::INT, COUNT(*) AS Quantity
FROM MyExampleSource
GROUP BY 1
ORDER BY 1;
FLOOR | Quantity
-------+----------
0 | 6
1 | 3
2 | 6
3 | 4
4 | 5
(5 rows)
----------------------------------------------------------------------------------------------
-- Drop Partitions 0-3. Leave 4 alone.
----------------------------------------------------------------------------------------------
SELECT DROP_PARTITION ('MyExampleDestination' , 0);
DROP_PARTITION
-------------------
Partition dropped
(1 row)
SELECT DROP_PARTITION ('MyExampleDestination' , 1);
DROP_PARTITION
-------------------
Partition dropped
(1 row)
SELECT DROP_PARTITION ('MyExampleDestination' , 2);
DROP_PARTITION
-------------------
Partition dropped
(1 row)
SELECT DROP_PARTITION ('MyExampleDestination' , 3);
DROP_PARTITION
-------------------
Partition dropped
(1 row)
----------------------------------------------------------------------------------------------
-- Show the distribution of records by the Partition Key in the Destination Table
----------------------------------------------------------------------------------------------
SELECT FLOOR(RowID/10)::INT AS Partition, COUNT(*) AS Quantity
FROM MyExampleDestination
GROUP BY 1
ORDER BY 1;
Partition | Quantity
-----------+----------
4 | 10
5 | 10
6 | 10
7 | 10
8 | 10
9 | 10
(6 rows)
----------------------------------------------------------------------------------------------
-- Swap partitions 0-4 from Source table into Destination Table
----------------------------------------------------------------------------------------------
SELECT MOVE_PARTITIONS_TO_TABLE ( 'MyExampleSource',0, 4, 'MyExampleDestination' ) ;
MOVE_PARTITIONS_TO_TABLE
-----------------------------------------------------------------------------------
5 distinct partition values moved at epoch 231938. Effective move epoch: 231937.
(1 row)
SELECT FLOOR(RowID/10)::INT, COUNT(*) AS Quantity
FROM MyExampleDestination
GROUP BY 1
ORDER BY 1;
FLOOR | Quantity
-------+----------
0 | 6
1 | 3
2 | 6
3 | 4
4 | 15
5 | 10
6 | 10
7 | 10
8 | 10
9 | 10
(10 rows)
----------------------------------------------------------------------------------------------
-- Show the distribution of records by the Partition Key in the Destination Table
-- Note that Partition 4 has both the existing and the new records.
----------------------------------------------------------------------------------------------
SELECT FLOOR(RowID/10)::INT AS Partition, COUNT(*) AS Quantity
FROM MyExampleDestination
GROUP BY 1
ORDER BY 1;
Partition | Quantity
-----------+----------
0 | 6
1 | 3
2 | 6
3 | 4
4 | 15
5 | 10
6 | 10
7 | 10
8 | 10
9 | 10
(10 rows)
----------------------------------------------------------------------------------------------
-- Should return records showing where Primary Key Constraint is NOT valid in Partition 4
----------------------------------------------------------------------------------------------
SELECT ANALYZE_CONSTRAINTS('MyExampleDestination');
Schema Name | Table Name | Column Names | Constraint Name | Constraint Type | Column Values
-------------+----------------------+--------------+-----------------+-----------------+---------------
public | MyExampleDestination | RowID | C_PRIMARY | PRIMARY | ('49')
public | MyExampleDestination | RowID | C_PRIMARY | PRIMARY | ('42')
public | MyExampleDestination | RowID | C_PRIMARY | PRIMARY | ('43')
public | MyExampleDestination | RowID | C_PRIMARY | PRIMARY | ('40')
public | MyExampleDestination | RowID | C_PRIMARY | PRIMARY | ('44')
(5 rows)
----------------------------------------------------------------------------------------------
-- Delete bad Partition
----------------------------------------------------------------------------------------------
SELECT DROP_PARTITION ('MyExampleDestination' , 4);
DROP_PARTITION
-------------------
Partition dropped
(1 row)
----------------------------------------------------------------------------------------------
-- Change the way the table is partitioned. This runs in the background, you can continue
-- to query the table while it is being reorganized to use the new partitioning structure
----------------------------------------------------------------------------------------------
ALTER TABLE MyExampleDestination
PARTITION BY ( FLOOR(RowID/20)::INT ) REORGANIZE;
vsql:HowToDoPartitionSwapVertica.sql:113: NOTICE 4954: The new partitioning scheme will produce 5 partitions
vsql:HowToDoPartitionSwapVertica.sql:113: NOTICE 4785: Started background repartition table task
ALTER TABLE
----------------------------------------------------------------------------------------------
-- Clean Up
----------------------------------------------------------------------------------------------
DROP TABLE MyExampleDestination, MyExampleSource;
DROP TABLE