Use MERGE to do whole table update

mattenx · Post by **mattenx** » Mon Sep 23, 2013 1:40 am

Hi Everyone,
I'm trying to do a whole table update using the MERGE statement. I have two identical tables and basically want to do a bulk upload of mostly new data, and perform updates to existing rows. The table has 18 columns. If I want to update everything WHERE MATCHED, and insert everything where NOT MATCHED, do I have to specify each column value, or is there an easier way to type this out?

Thank you!!

Post by **JimKnicely** » Mon Sep 23, 2013 7:12 pm

Hi,

Unfortunately I think you gotta specify all the columns.

You can use SQL to generate some of the MERGE command. For instance, if I am updating a table t1 using t2, I can generated the columns syntax for the update like this:

Code: Select all


dbadmin=> \d t1;
                                List of Fields by Tables
 Schema | Table | Column | Type | Size | Default | Not Null | Primary Key | Foreign Key 
--------+-------+--------+------+------+---------+----------+-------------+-------------
 public | t1    | c1     | int  |    8 |         | f        | f           | 
 public | t1    | c2     | int  |    8 |         | f        | f           | 
 public | t1    | c3     | int  |    8 |         | f        | f           | 
 public | t1    | c4     | int  |    8 |         | f        | f           | 
 public | t1    | c5     | int  |    8 |         | f        | f           | 
 public | t1    | c6     | int  |    8 |         | f        | f           | 
 public | t1    | c7     | int  |    8 |         | f        | f           | 
 public | t1    | c8     | int  |    8 |         | f        | f           | 
 public | t1    | c9     | int  |    8 |         | f        | f           | 
 public | t1    | c10    | int  |    8 |         | f        | f           | 
 public | t1    | c11    | int  |    8 |         | f        | f           | 
 public | t1    | c12    | int  |    8 |         | f        | f           | 
 public | t1    | c13    | int  |    8 |         | f        | f           | 
 public | t1    | c14    | int  |    8 |         | f        | f           | 
 public | t1    | c15    | int  |    8 |         | f        | f           | 
 public | t1    | c16    | int  |    8 |         | f        | f           | 
 public | t1    | c17    | int  |    8 |         | f        | f           | 
 public | t1    | c18    | int  |    8 |         | f        | f           | 
(18 rows)

dbadmin=> select column_name || ' = t2.' || column_name || ','
dbadmin->   from columns where table_name = 't1' order by ordinal_position;
   ?column?    
---------------
 c1 = t2.c1,
 c2 = t2.c2,
 c3 = t2.c3,
 c4 = t2.c4,
 c5 = t2.c5,
 c6 = t2.c6,
 c7 = t2.c7,
 c8 = t2.c8,
 c9 = t2.c9,
 c10 = t2.c10,
 c11 = t2.c11,
 c12 = t2.c12,
 c13 = t2.c13,
 c14 = t2.c14,
 c15 = t2.c15,
 c16 = t2.c16,
 c17 = t2.c17,
 c18 = t2.c18,
(18 rows)

id10t · Post by **id10t** » Mon Sep 23, 2013 7:46 pm

Hi!

>> do I have to specify each column value
Yes, you have to specify all columns (or surrogate key). Vertica doesn't have such concept like ROWID (because its columnar database).

PS
How many times I heard this question! I think that no one from architects/designers didn't read Ralph Kimball but everyone talk about denormalization, FACT/DIM, data warehouse and so on! Why you didn't define a surrogate key(or your architects)? It's very easy to do with Vertica(no schema modification is required), but now, to add a surrogate key, will be painful (for feature I still recommend to it).

mattenx · Post by **mattenx** » Mon Sep 23, 2013 10:37 pm

lol Thanks everyone. It works out fine, but just looks convoluted. Our company just switched from mysql to vertica, and I'm finding that vertica is awesome in terms of performance, but weak on the convenience scale.

SKWA, not sure what you are talking about in your post script note heh.

I HAVE ANOTHER QUESTION
i currently copy local a file, then i have to manually update the timestamp >> (i have a file, in which I add in an empty date column. once I copy it into a table, I update, set datetime to current_timestamp. ) Does vertica have an auto set timestamp function? Haven't quite been able to figure this out which explains my run around. (mysql has datetime data type which auto fills)

Thanks!
Matt

Post by **JimKnicely** » Mon Sep 23, 2013 10:53 pm

Hi mattenx,

Try using a column default and then not specifying the column in the copy command.

Example:

Code: Select all

dbadmin=> create table date_test (c1 int, my_date datetime default sysdate);
CREATE TABLE
dbadmin=> \!cat date.txt
1|
2|
3|
dbadmin=> copy date_test (c1) from local '/home/dbadmin/date.txt';
 Rows Loaded 
-------------
           3
(1 row)

dbadmin=> select * from date_test;
 c1 |          my_date          
----+---------------------------
  1 | 2013-09-23 14:51:46.57865
  2 | 2013-09-23 14:51:46.57865
  3 | 2013-09-23 14:51:46.57865
(3 rows)

BTW, can you please ask questions unrelated to the post subject in a separate post? Some users line to search on the subject line

id10t · Post by **id10t** » Mon Sep 23, 2013 11:35 pm

Hi!

>> Does vertica have an auto set timestamp function?
Kind of: epoch.
You can use it in a same way like Jim in previous post, but can't fetch any additional info except epoch number. Explicit SK definition (imho) preferable than epoch column.

Code: Select all

daniel=> create table mattenx (id int not null primary key, name varchar(16));
CREATE TABLE
daniel=> copy mattenx from stdin direct;
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>> 1|foo
>> 2|bar
>> 3|baz
>> 4|egg
>> \.
daniel=> select epoch, * from mattenx ;
 epoch | id | name 
-------+----+------
  1363 |  1 | foo
  1363 |  2 | bar
  1363 |  3 | baz
  1363 |  4 | egg
(4 rows)

Code: Select all

daniel=> copy mattenx from stdin direct;
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>> 3|buz
>> 1|fog
>> \.
daniel=> select epoch, * from mattenx ;
 epoch | id | name 
-------+----+------
  1363 |  1 | foo
  1363 |  2 | bar
  1363 |  3 | baz
  1363 |  4 | egg
  1364 |  1 | fog
  1364 |  3 | buz
(6 rows)

Code: Select all

daniel=> select analyze_constraints('');
 Schema Name | Table Name | Column Names | Constraint Name | Constraint Type | Column Values 
-------------+------------+--------------+-----------------+-----------------+---------------
 public      | mattenx    | id           | C_PRIMARY       | PRIMARY         | ('1')
 public      | mattenx    | id           | C_PRIMARY       | PRIMARY         | ('3')
(2 rows)

Code: Select all

daniel=> delete from mattenx where (epoch, id) not in (select max(epoch), id from mattenx group by id);
 OUTPUT 
--------
      2
(1 row)

daniel=> select epoch, * from mattenx ;
 epoch | id | name 
-------+----+------
  1363 |  2 | bar
  1363 |  4 | egg
  1364 |  1 | fog
  1364 |  3 | buz
(4 rows)

Code: Select all

daniel=> select analyze_constraints('');
 Schema Name | Table Name | Column Names | Constraint Name | Constraint Type | Column Values 
-------------+------------+--------------+-----------------+-----------------+---------------
(0 rows)

Use MERGE to do whole table update

Use MERGE to do whole table update

Re: Use MERGE to do whole table update

Re: Use MERGE to do whole table update

Re: Use MERGE to do whole table update

Re: Use MERGE to do whole table update

Re: Use MERGE to do whole table update