Issues in MERGE INTO Command

Moderator: NorbertKrupa

Post Reply
incrediblegiant
Newbie
Newbie
Posts: 3
Joined: Tue Feb 19, 2013 7:33 am

Issues in MERGE INTO Command

Post by incrediblegiant » Tue Feb 19, 2013 7:59 am

Hi,
Am new to Vertica and trying to get a hang of it. I use Vertica 6.0 and am trying to use MERGE INTO statement but face an issue. Below is the structure of my target table:

Table Name: expenses
ID calendar_month calendar_year Groceries_Expense Movie_Expense Shopping_Expense Travel_Expense Education_Expense
1 12 2012 343 60 890 780 900
1 11 2012 349 20 670 450 300

Each column is calculated from multiple joins across tables and I have seperate sql statements to calculate them. In order to have 1 row per ID with all expenses populated, I executed all SQL statements and populated the following temp table.

Table Name: tempexpenses
ID calendar_month calendar_year Groceries_Expense Movie_Expense Shopping_Expense Travel_Expense Education_Expense
1 12 2012 343
1 12 2012 60
1 12 2012 890
1 12 2012 780
1 12 2012 900
1 11 2012 349
1 11 2012 20
1 11 2012 670
1 11 2012 450
1 11 2012 300

Then I wrote a MERGE INTO statement to get my final record (eg., for groceries expense).

MERGE INTO expenses tgt
USING tempexpenses src
ON tgt.id = src.id
AND tgt.calendar_month = src.calendar_month
AND tgt.calendar_year = src.calendar_year
WHEN MATCHED THEN UPDATE SET
groceries_expense = CASE
WHEN tgt.groceries_expense IS NULL THEN src.groceries_expense
ELSE tgt.groceries_expense
END,
movie_expense = CASE
WHEN tgt.movie_expense IS NULL THEN src.movie_expense
ELSE tgt.movie_expense
END
WHEN NOT MATCHED THEN INSERT(id, calendar_month, calendar_year, groceries_expense, movie_expense) values (src.id,src.calendar_month,src.calendar_year,src.groceries_expense, src.movie_expense);

I will add other expense to the same merge statement. Initially I tried with just 2 expense to see whether it works.

But when I execute the above statement, the o/p looks the same as my temp table, 1 row for id, month, year, each expense.

Can someone point out what am I missing. I tried re-writing in other ways, nothing worked. Any help would be appreciated. Thanks!
Attachments
Query.docx
Same content as my post. Have attached this as spacing in table structure goes off in browser. Please see this.
(11.27 KiB) Downloaded 612 times

User avatar
JimKnicely
Site Admin
Site Admin
Posts: 1825
Joined: Sat Jan 21, 2012 4:58 am
Contact:

Re: Issues in MERGE INTO Command

Post by JimKnicely » Tue Feb 19, 2013 3:16 pm

Hi incrediblegiant,

Welcome to the forums!

Did you actually run that MERGE command? If you did you should have received an error similar to the following:

ERROR 3147: Duplicate MERGE key detected in join [(testdb.expenses x intersect_wh.tempexpenses) using expenses_b0 and subquery (PATH ID: 1)]; value [1,11,2012]

This error is occurring in the WHEN MATCHED THEN UPDATE SET section of the MERGE command... According to the Vertica documentation:
If Vertica finds more than one matching row in the source table for a row in the target table, you'll see a run-time error
Basically, you can't have duplicate values for the combination of the id, calendar_month and calendar_year columns in your tempexpenses table.
Jim Knicely

Image

Note: I work for Vertica. My views, opinions, and thoughts expressed here do not represent those of my employer.

incrediblegiant
Newbie
Newbie
Posts: 3
Joined: Tue Feb 19, 2013 7:33 am

Re: Issues in MERGE INTO Command

Post by incrediblegiant » Wed Feb 20, 2013 6:06 pm

Hi knicely87,

Now I see why my query did not work. Thanks a lot. I'll see how to re-modify this. I shall get back if I face any issues. Thanks again.

jagadeesh
Newbie
Newbie
Posts: 21
Joined: Tue Feb 05, 2013 9:36 am

Re: Issues in MERGE INTO Command

Post by jagadeesh » Thu Feb 21, 2013 7:50 am

I am getting the Duplicate Merge Key error but i checked my source and target table there are no duplicate records.

User avatar
JimKnicely
Site Admin
Site Admin
Posts: 1825
Joined: Sat Jan 21, 2012 4:58 am
Contact:

Re: Issues in MERGE INTO Command

Post by JimKnicely » Thu Feb 21, 2013 6:05 pm

Hi,

From your first post, is this is the data in your tempexpenses table?

Code: Select all

dbadmin=> SELECT * FROM tempexpenses;
 ID | calendar_month | calendar_year | Groceries_Expense | Movie_Expense | Shopping_Expense | Travel_Expense | Education_Expense
----+----------------+---------------+-------------------+---------------+------------------+----------------+-------------------
  1 |             12 |          2012 |               343 |               |                  |                |         
  1 |             11 |          2012 |               349 |               |                  |                |         
  1 |             11 |          2012 |               450 |               |                  |                |         
  1 |             12 |          2012 |               343 |               |                  |                |         
  1 |             11 |          2012 |               349 |               |                  |                |         
  1 |             11 |          2012 |               450 |               |                  |                |         
  1 |             12 |          2012 |                60 |               |                  |                |         
  1 |             11 |          2012 |                20 |               |                  |                |         
  1 |             11 |          2012 |               300 |               |                  |                |         
  1 |             12 |          2012 |                60 |               |                  |                |         
  1 |             11 |          2012 |                20 |               |                  |                |         
  1 |             11 |          2012 |               300 |               |                  |                |         
  1 |             12 |          2012 |               890 |               |                  |                |         
  1 |             12 |          2012 |               780 |               |                  |                |         
  1 |             12 |          2012 |               900 |               |                  |                |         
  1 |             11 |          2012 |               670 |               |                  |                |         
  1 |             12 |          2012 |               890 |               |                  |                |         
  1 |             12 |          2012 |               780 |               |                  |                |         
  1 |             12 |          2012 |               900 |               |                  |                |         
  1 |             11 |          2012 |               670 |               |                  |                |         
(20 rows)
If so the following query will show you the duplicate values:

Code: Select all

dbadmin=> SELECT src.id,
dbadmin->        src.calendar_month,
dbadmin->        src.calendar_year,
dbadmin->        COUNT(*)
dbadmin->   FROM tempexpenses src
dbadmin->  GROUP
dbadmin->     BY src.id,
dbadmin->        src.calendar_month,
dbadmin->        src.calendar_year
dbadmin-> HAVING COUNT(*) > 1;
 id | calendar_month | calendar_year | COUNT
----+----------------+---------------+-------
  1 |             12 |          2012 |    10
  1 |             11 |          2012 |    10
(2 rows)
Jim Knicely

Image

Note: I work for Vertica. My views, opinions, and thoughts expressed here do not represent those of my employer.

deepvoice
Newbie
Newbie
Posts: 1
Joined: Sat Aug 03, 2013 12:32 am

Re: Issues in MERGE INTO Command

Post by deepvoice » Mon Aug 05, 2013 7:16 pm

I am trying to do something similar but duplicate keys are unavoidable. I am trying to do a summarization of raw logging.

Code: Select all

Raw_table
______
EmpID | Enter/Exit | Zone1 | Zone2 
_____________________________________
bob       | Entered   | 2013-01-01 12:00:00 | null 
bob       | Entered   |  null | 2013-01-01 13:00:00 
The key is Bob

Code: Select all

MERGE INTO security.summary n USING security.raw_log m
	ON m.EmpID = n.EmpID

	WHEN MATCHED THEN UPDATE SET

  		n.last_entered_zone1 = m.zone1,
        n.last_entered_zone2 = m.zone2

 	WHEN NOT MATCHED THEN

	 INSERT (
             n.EmpID = m.EmpID
             n.first_time_entered_zone1 = m.zone1,
             n.first_time_entered_zone2 = m.zone2

);

I am getting an error because of the duplicate key in the raw_log (bob)
Not sure if this is happening on the Update part or the insert part. (I am new to Vertica and still getting up to speed)

I was using this page as a template: http://www.vertica.com/2012/08/27/load- ... more-10474

Is there a way around this in a single statement without having to summarize everything first (more coding) into a temp_table?

*Note: The example is very simplistic but I am inserting about 8K rows a second into a "merge" table (raw_logs) then trying to merge it into a summary table.
Bob does a lot of walking around. So much that it is entirely possible he will be in the temp_table twice or more between merge cycles.

User avatar
JimKnicely
Site Admin
Site Admin
Posts: 1825
Joined: Sat Jan 21, 2012 4:58 am
Contact:

Re: Issues in MERGE INTO Command

Post by JimKnicely » Tue Aug 06, 2013 1:08 pm

Hi,

If you have duplicate key values in the merge data you'll get the error with both the insert and the update parts of the merge command:

Code: Select all

dbadmin=> select * from jim;
 c1 |      c2       
----+---------------
  1 | Vertica Is Ok
(1 row)

dbadmin=> select * from raw_data;
 c1 |       c2        
----+-----------------
  1 | Vertica Rocks
  1 | Vertica is cool
(2 rows)
Merge with update:

Code: Select all

dbadmin=> merge into jim j using raw_data r
dbadmin->    on j.c1 = r.c1
dbadmin->    when matched then update set
dbadmin->      c2 = r.c2;
ERROR 3147:  Duplicate MERGE key detected in join [(public.jim x public.raw_data) using jim_b0 and subquery (PATH ID: 1)]; value [1]
Merge with insert:

Code: Select all

dbadmin=> merge into jim j using raw_data r
dbadmin->    on j.c1 = r.c1
dbadmin->    when not matched then insert values
dbadmin->      (r.c1, r.c2);
ERROR 3147:  Duplicate MERGE key detected in join [(public.jim x public.raw_data) using jim_b0 and subquery (PATH ID: 1)]; value [1]
The error occurs on the join.
Jim Knicely

Image

Note: I work for Vertica. My views, opinions, and thoughts expressed here do not represent those of my employer.

Post Reply

Return to “New to Vertica”