
Question about Normalize function

Posted: Thu Jan 12, 2017 6:01 pm
by beth
Hi,

If I use the NORMALIZE function prior to creating a predictive model, is there a way to put a new set of data through the same normalization in order to make predictions?

Re: Question about Normalize function

Posted: Fri Jan 13, 2017 2:12 pm
by JimKnicely
Hi Beth,

In Vertica 8.0.1 there are three normalization methods available to the NORMALIZE function:
  • minmax
  • zscore
  • robust_zscore
The documentation explains the methodology behind each method:

https://my.vertica.com/docs/8.0.x/HTML/ ... ngData.htm

So I did some testing and here is what I found...

For each of the normalization methods, Vertica creates a database view that queries the base table containing your data and presents the normalized data back to you. For the minmax and zscore methods, when you insert data into the base table, the view returns updated normalized values that take the new rows into account. For the robust_zscore normalization method, however, you will need to re-run the NORMALIZE function after adding data to the base table in order to create a new database view.
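
One consequence of a view that recomputes over the whole base table is that the normalized values of existing rows can shift when new rows arrive. The following is a minimal Python sketch of that effect for minmax only. It is an illustration of the arithmetic, not Vertica's implementation, and the `minmax` helper and the sample values are made up for the example.

```python
def minmax(values):
    """Rescale values to [0, 1] using the min and max of the values passed in."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical contents of one column of the base table.
base_table = [10.0, 20.0, 30.0]
before = minmax(base_table)

# A new row arrives; recomputing over the whole table changes the
# normalized values of the rows that were already there.
base_table.append(50.0)
after = minmax(base_table)

print(before)  # [0.0, 0.5, 1.0]
print(after)   # [0.0, 0.25, 0.5, 1.0]
```

The value 20.0 maps to 0.5 before the insert and to 0.25 afterwards, so a model trained on the first set of normalized values would see the same raw input on a different scale.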

We are already working on enhancing the NORMALIZE function so that it stores the normalization parameters and lets the user apply them to new data sets, which could be very useful for streaming data. You can expect this functionality in one of the upcoming releases.
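
Until something like that is available, the general idea of "store the parameters, then apply them to new data" can be sketched outside the database. The Python below is a hedged illustration only: the `fit` and `apply_params` names are hypothetical, the use of the sample standard deviation for zscore is an assumption, and the robust_zscore scale of 1.4826 times the median absolute deviation is the common textbook definition, which I am assuming matches the documented method.

```python
import statistics

def fit(values, method):
    """Compute (center, scale) from the training values for one method."""
    if method == "minmax":
        lo, hi = min(values), max(values)
        center, scale = lo, hi - lo
    elif method == "zscore":
        # Assumption: sample standard deviation.
        center, scale = statistics.mean(values), statistics.stdev(values)
    elif method == "robust_zscore":
        # Assumption: median and 1.4826 * median absolute deviation.
        med = statistics.median(values)
        mad = statistics.median([abs(v - med) for v in values])
        center, scale = med, 1.4826 * mad
    else:
        raise ValueError(f"unknown method: {method}")
    if scale == 0:
        raise ValueError("scale is zero; the column has no spread")
    return center, scale

def apply_params(values, params):
    """Normalize values with previously stored (center, scale)."""
    center, scale = params
    return [(v - center) / scale for v in values]

# Fit once on the training column and keep the parameters.
train = [10.0, 20.0, 30.0]
params = fit(train, "minmax")

# Later, push new rows through the same normalization.
new_rows = [25.0, 40.0]
scored = apply_params(new_rows, params)
print(params)  # (10.0, 20.0)
print(scored)  # [0.75, 1.5]
```

Note that with stored minmax parameters a new value can land outside the 0 to 1 range (40.0 becomes 1.5 here), which is expected when the new data exceeds the range seen at training time.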

I hope this helps!