Vertica and Hadoop

Moderator: NorbertKrupa

stefen054
Newbie
Posts: 8
Joined: Wed Aug 27, 2014 5:23 am

Vertica and Hadoop

Post by stefen054 » Tue Jun 02, 2015 11:25 am

My assumptions about the differences and similarities between Vertica and Hadoop are:

1) Vertica handles structured data (I have heard Flex Zone can handle semi-structured data as well), while Hadoop can store and process any file format, especially unstructured data.
2) Real-time analytics is possible in Vertica with WOS and ROS, whereas Hadoop is batch oriented.
3) Hadoop is open source.
4) Both can run on commodity hardware.
5) Both can work on petabytes of data.

My question: with Apache Spark, Kafka, and similar tools, which are gaining traction now, it is possible to do real-time analytics in Hadoop, and it can be done on an open stack. I am not clear on which one to choose if unstructured data is not the focus of the requirement.

I know it is a vast topic. Please share your thoughts.

NorbertKrupa
GURU
Posts: 527
Joined: Tue Oct 22, 2013 9:36 pm
Location: Chicago, IL

Re: Vertica and Hadoop

Post by NorbertKrupa » Tue Jun 02, 2015 10:15 pm

stefen054 wrote: it is possible to do real-time analytics in Hadoop, and it can be done on an open stack. I am not clear on which one to choose if unstructured data is not the focus of the requirement.
I'm very curious how this is accomplished. Given the nature of Hadoop, real-time analytics is very difficult.
stefen054 wrote: 1) Vertica handles structured data (I have heard Flex Zone can handle semi-structured data as well), while Hadoop can store and process any file format, especially unstructured data.
Both Vertica and Hadoop handle structured data. However, if you need real-time analysis, structured data could (and should) go into Vertica first, then off to cold storage (Hadoop). Flex Zone can handle semi-structured and potentially unstructured data, although you wouldn't store a video file in Vertica. Vertica also has an On Hadoop offering, which enables it to sit directly on Hadoop nodes.
stefen054 wrote: 2) Real-time analytics is possible in Vertica with WOS and ROS, whereas Hadoop is batch oriented.
3) Hadoop is open source.
4) Both can run on commodity hardware.
Agreed.
stefen054 wrote: 5) Both can work on petabytes of data.
The question you should be asking is what you want to do with that petabyte of data. Facebook uses Vertica and turns over 3-4 PB of data every 2 days. It would be extremely difficult to perform real-time analytics on this amount of data in a Hadoop environment.
Check out vertica.tips for more Vertica resources.

stefen054
Newbie
Posts: 8
Joined: Wed Aug 27, 2014 5:23 am

Re: Vertica and Hadoop

Post by stefen054 » Fri Jun 05, 2015 7:56 am

I assume real-time processing/analytics in a Hadoop cluster is possible with YARN framework-based tools like Apache Kafka, Storm, and Spark in Hadoop 2.x.

Thanks for your reply. It really helped in certain areas where I was not very clear on using Vertica vs. Hadoop.
