Hi,
We have the case that Json is pushed from outer system to our Vertica database into tables with VARCHAR(65000) columns. We do not have any possibility to change this process currently.
So to parse the Json-s we load those to our ETL tool, do the parsing there and then write back to the same database. This is quite time consuming.
We thought that maybe we could utilize the possibility to let Vertica do the parsing. But as we understand this can only be done with Flex tables, we do not have the possibility to change our regular columnar tables to flex tables.
So, now we tried to do this:
a. Create temporary flex table.
b. CONNECT TO VERTICA from and to the same instance.
c. COPY from the VARCHAR(65000) table to the temp flex table.
d. parse the Json and load the results to where needed.
But with this we hit the problem, that the CONNECT TO VERTICA seems to be possible to be done only with the dbadmin user of the connect-to database.
So we have come quite far from the initial problem with additional problems
Any good ideas how to get the Json in VARCHAR(65000) parsed as optimal as possible. Maybe using the ETL tool is still the best option in the end?
Br,
pj
Parse Json from Vertica VARCHAR column
Moderator: NorbertKrupa
Re: Parse Json from Vertica VARCHAR column
Hi PJ,
You can parse json stored in a column without using a flex table. Here are two examples:
First storing the json in a varbinary column as a vmap. This would be ideal if queries are executed repeatedly on the same json so that you only generate the vmap once.
create table t(vmap long varbinary(10000));
insert into t values (mapjsonextractor('{ "name" : "sharon"}'));
insert into t values (mapjsonextractor('{ "name" : "david"}'));
commit;
select * from t;
select maplookup(vmap, 'name') from t;
You can also parse json stored as a varchar and generate the vmap on the fly.
create table t2(json varchar(10000));
insert into t2 values ('{ "name" : "sharon"}');
insert into t2 values ('{ "name" : "david"}');
commit;
select * from t2;
select maplookup(mapjsonextractor(json), 'name') from t2;
This is buried in the MAPJSONEXTRACTOR documentation - the coljson example.
https://my.vertica.com/docs/8.0.x/HTML/ ... RACTOR.htm
—Sharon
You can parse json stored in a column without using a flex table. Here are two examples:
First storing the json in a varbinary column as a vmap. This would be ideal if queries are executed repeatedly on the same json so that you only generate the vmap once.
create table t(vmap long varbinary(10000));
insert into t values (mapjsonextractor('{ "name" : "sharon"}'));
insert into t values (mapjsonextractor('{ "name" : "david"}'));
commit;
select * from t;
select maplookup(vmap, 'name') from t;
You can also parse json stored as a varchar and generate the vmap on the fly.
create table t2(json varchar(10000));
insert into t2 values ('{ "name" : "sharon"}');
insert into t2 values ('{ "name" : "david"}');
commit;
select * from t2;
select maplookup(mapjsonextractor(json), 'name') from t2;
This is buried in the MAPJSONEXTRACTOR documentation - the coljson example.
https://my.vertica.com/docs/8.0.x/HTML/ ... RACTOR.htm
—Sharon
Sharon Cutter
Vertica Consultant, Zazz Technologies LLC
Vertica Consultant, Zazz Technologies LLC
Re: Parse Json from Vertica VARCHAR column
Hi, Sharon
Great reply, thank you.
Seems to work from the functional point of view, from performance point of view it is still better to use ETL tool for the parsing as it seems to do it a lot faster. Not sure, but assume that it is related to the fact that we have the Json in Varchar(65000) column and Vertica just does not handle good such long defined strings (even though the Json text in it rarely exceeds 1000 characters). The ETL tool just sees those as strings and manages to out-perform Vertica in this parsing case.
Br,
_pj_
Great reply, thank you.
Seems to work from the functional point of view, from performance point of view it is still better to use ETL tool for the parsing as it seems to do it a lot faster. Not sure, but assume that it is related to the fact that we have the Json in Varchar(65000) column and Vertica just does not handle good such long defined strings (even though the Json text in it rarely exceeds 1000 characters). The ETL tool just sees those as strings and manages to out-perform Vertica in this parsing case.
Br,
_pj_
Re: Parse Json from Vertica VARCHAR column
Hi Sharon,
Thank you so much . By using this I am able to load json column into another table.Now I am facing one issue with bad json file.
Is there a way to have mapjsonextractor keep going after it hits an error in the input file?
Please let me know how can I skip the input file errors.
I appreciate any help here.
thanks
HK
Thank you so much . By using this I am able to load json column into another table.Now I am facing one issue with bad json file.
Is there a way to have mapjsonextractor keep going after it hits an error in the input file?
Please let me know how can I skip the input file errors.
I appreciate any help here.
thanks
HK