copy command execution in cluster.

Moderator: NorbertKrupa

Post Reply
haimhai
Newbie
Newbie
Posts: 2
Joined: Fri Oct 04, 2013 12:44 pm

copy command execution in cluster.

Post by haimhai » Fri Oct 04, 2013 3:11 pm

Hello guys,
I have a cluster with three servers and each server has a dedicated folder in which files should go.
I try to write one copy command you to run on one server and run the other servers.
I read in the documentation and saw that to do so from stage I should write '<path to file>' on <node name> like this:

FROM
'/data/files_to_load/*.gz' on v_test_node0001,
'/data/files_to_load/*.gz' on v_test_node0002,
'/data/files_to_load/*.gz' on v_test_node0003

But this is only the first server and the rest is not load the data
What should I do?

Thank you all.

id10t
GURU
GURU
Posts: 732
Joined: Mon Apr 16, 2012 2:44 pm

Re: copy command execution in cluster.

Post by id10t » Fri Oct 04, 2013 3:21 pm

Hi!
  • Example of data:

    Code: Select all

    $ pwd
    /tmp/data
    

    Code: Select all

    $ ls
    data_01.gz  data_02.gz  data_03.gz
    

    Code: Select all

    $ zcat data_01.gz 
    1|a
    2|b
    3|c
    

    Code: Select all

    $ zcat data_02.gz 
    4|d
    5|e
    6|f
    

    Code: Select all

    $ zcat data_03.gz 
    7|g
    8|h
    9|i
    
  • Example of copy:

    Code: Select all

    daniel=> create table haimhai (id int, c char);
    CREATE TABLE
    daniel=> copy haimhai from '/tmp/data/data_01.gz' GZIP, '/tmp/data/data_02.gz' GZIP, '/tmp/data/data_03.gz' GZIP direct;
     Rows Loaded 
    -------------
               9
    (1 row)
    

    Code: Select all

    daniel=> select * from haimhai ;
     id | c 
    ----+---
      1 | a
      2 | b
      3 | c
      4 | d
      5 | e
      6 | f
      7 | g
      8 | h
      9 | i
    (9 rows)
    
  • Copy with wild-card

    Code: Select all

    daniel=> copy haimhai from '/tmp/data/*.gz' GZIP direct;
     Rows Loaded 
    -------------
               9
    (1 row)
    

User avatar
JimKnicely
Site Admin
Site Admin
Posts: 1825
Joined: Sat Jan 21, 2012 4:58 am
Contact:

Re: copy command execution in cluster.

Post by JimKnicely » Fri Oct 04, 2013 3:53 pm

haimhai,

Try changing:

FROM
'/data/files_to_load/*.gz' on v_test_node0001,
'/data/files_to_load/*.gz' on v_test_node0002,
'/data/files_to_load/*.gz' on v_test_node0003

To:

FROM
'/data/files_to_load/*.gz' on v_test_node0001 gzip,
'/data/files_to_load/*.gz' on v_test_node0002 gzip,
'/data/files_to_load/*.gz' on v_test_node0003 gzip

Example:

I have a gz file on three nodes.

On node 1 the file contains the text "node1", node 2 the file contains the text "node2" and on node 3 the file contains the text "node3".

Code: Select all

dbadmin=> copy test from '/home/dbadmin/*.gz' on v_mydb_node0001 gzip, '/home/dbadmin/*.gz' on v_mydbl_node0002 gzip, '/home/dbadmin/*.gz' on v_mydb_node0003 gzip;
 Rows Loaded
-------------
           3
(1 row)

dbadmin=> select * from test;
 col1
-------
 node1
 node2
 node3
(3 rows)
Jim Knicely

Image

Note: I work for Vertica. My views, opinions, and thoughts expressed here do not represent those of my employer.

haimhai
Newbie
Newbie
Posts: 2
Joined: Fri Oct 04, 2013 12:44 pm

Re: copy command execution in cluster.

Post by haimhai » Fri Oct 04, 2013 5:54 pm

Thank you

muaythai_duan
Newbie
Newbie
Posts: 18
Joined: Sun Aug 25, 2013 11:43 am

Re: copy command execution in cluster.

Post by muaythai_duan » Mon Oct 28, 2013 7:18 am

In fact,vertica will find file on the first node,the second and the third node could not find file.you should put the files in the first node.then the count number will be 9.
knicely87 wrote:haimhai,

Try changing:

FROM
'/data/files_to_load/*.gz' on v_test_node0001,
'/data/files_to_load/*.gz' on v_test_node0002,
'/data/files_to_load/*.gz' on v_test_node0003

To:

FROM
'/data/files_to_load/*.gz' on v_test_node0001 gzip,
'/data/files_to_load/*.gz' on v_test_node0002 gzip,
'/data/files_to_load/*.gz' on v_test_node0003 gzip

Example:

I have a gz file on three nodes.

On node 1 the file contains the text "node1", node 2 the file contains the text "node2" and on node 3 the file contains the text "node3".

Code: Select all

dbadmin=> copy test from '/home/dbadmin/*.gz' on v_mydb_node0001 gzip, '/home/dbadmin/*.gz' on v_mydbl_node0002 gzip, '/home/dbadmin/*.gz' on v_mydb_node0003 gzip;
Rows Loaded
-------------
3
(1 row)

dbadmin=> select * from test;
col1
-------
node1
node2
node3
(3 rows)

Post Reply

Return to “Vertica Data Load”