How to Group By -Runnig Total for Year to Date in VSQL

Moderator: NorbertKrupa

User avatar
JimKnicely
Site Admin
Site Admin
Posts: 1825
Joined: Sat Jan 21, 2012 4:58 am
Contact:

Re: How to Group By -Runnig Total for Year to Date in VSQL

Post by JimKnicely » Mon Aug 25, 2014 11:01 pm

Look up how the DECODE statement works here:

https://my.vertica.com/docs/7.0.x/HTML/ ... ons|_____9

For example, you could replace a 0 with 1:

Code: Select all

SELECT rpt_dt,
       demand,
       supply,
       supply/demand month_avg,
       SUM(supply) OVER (PARTITION BY EXTRACT(YEAR FROM rpt_dt ) ORDER BY rpt_dt) /
         DECODE(SUM(demand) OVER (PARTITION BY EXTRACT(YEAR FROM rpt_dt) ORDER BY rpt_dt), 0, 1, SUM(demand) OVER (PARTITION BY EXTRACT(YEAR FROM rpt_dt) ORDER BY rpt_dt)) ytd_avg,
       SUM(supply) OVER (ORDER BY rpt_dt ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) /
         DECODE(SUM(demand) OVER (ORDER BY rpt_dt ROWS BETWEEN 2 PRECEDING AND CURRENT ROW), 0, 1, SUM(demand) OVER (ORDER BY rpt_dt ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)) rolling3mnth,
       SUM(supply) OVER (ORDER BY rpt_dt ROWS BETWEEN 5 PRECEDING AND CURRENT ROW) /
         DECODE(SUM(demand) OVER (ORDER BY rpt_dt ROWS BETWEEN 5 PRECEDING AND CURRENT ROW), 0, 1, SUM(demand) OVER (ORDER BY rpt_dt ROWS BETWEEN 5 PRECEDING AND CURRENT ROW)) rolling6mnth
  FROM avg_test
 ORDER
    BY rpt_dt;
Jim Knicely

Image

Note: I work for Vertica. My views, opinions, and thoughts expressed here do not represent those of my employer.

Vertica4VK
Newbie
Newbie
Posts: 6
Joined: Fri Aug 22, 2014 2:10 am

Re: How to Group By -Runnig Total for Year to Date in VSQL

Post by Vertica4VK » Wed Aug 27, 2014 12:57 am

Hello again James

With regards to rolling ytd by month - ( ytd_avg2 below )- where you are using PARTITON...
it works fine when i tested for just one customer but if my data has mupltiple customers like this
(sorry i didn't provided you customer name before in my data example )

rpt_dt
supply
demand
cutomer_name - AAA

rpt_dt
supply
demand
customer_anme - BBB

what i get is repeting numbers for all customers for same month and when month changes numbers do change but it repetes with new numbers for that month again for all customer and so on...like this

AAA 2014-06-30 100.741
BBB 2014-06-30 100.741
CCC 2014-06-30 100.741
DDD 2014-06-30 100.741

AAA 2014-07-31 101.429
BBB 2014-07-31 101.429
CCC 2014-07-31 101.429
DDD 2014-07-31 101.429

Looks like it takes the number of first customer in the list and gets repeted for all other customers and so on..

How do i fix this ?? In partions clause below - i triedadd month+ cusomter name etc. ?? but got syntax errors etc.

Thanks for your help

VK

User avatar
JimKnicely
Site Admin
Site Admin
Posts: 1825
Joined: Sat Jan 21, 2012 4:58 am
Contact:

Re: How to Group By -Runnig Total for Year to Date in VSQL

Post by JimKnicely » Wed Aug 27, 2014 11:38 am

Try adding the cust_name to the PARTITION clause! From the docs:
... the window_partition_clause divides the rows in the input based on user-provided expressions, such as aggregation functions like SUM(x). Window partitioning is similar to the GROUP BY clause except that it returns only one result row per input row.
Example:

Code: Select all

dbadmin=> SELECT * FROM avg_test ORDER BY 1, 2;
 cust_name |   rpt_dt   | demand | supply
-----------+------------+--------+--------
 AAA       | 2014-01-01 |      5 |     10
 AAA       | 2014-02-01 |      4 |     20
 AAA       | 2014-03-01 |      5 |     15
 AAA       | 2014-04-01 |      1 |      5
 AAA       | 2014-05-01 |      2 |     10
 AAA       | 2014-06-01 |      6 |     12
 BBB       | 2014-02-01 |      5 |     20
 BBB       | 2014-03-01 |      3 |     15
 BBB       | 2014-04-01 |      2 |      4
 BBB       | 2014-05-01 |      2 |     10
 BBB       | 2014-06-01 |      3 |      9
 BBB       | 2014-07-01 |      3 |      9
(12 rows)

Code: Select all

dbadmin=> SELECT cust_name,
dbadmin->        rpt_dt,
dbadmin->        demand,
dbadmin->        supply,
dbadmin->        supply/demand month_avg,
dbadmin->        SUM(supply) OVER (PARTITION BY cust_name, EXTRACT(YEAR FROM rpt_dt ) ORDER BY rpt_dt) /
dbadmin->          DECODE(SUM(demand) OVER (PARTITION BY cust_name, EXTRACT(YEAR FROM rpt_dt) ORDER BY rpt_dt), 0, 1, SUM(demand) OVER (PARTITION BY cust_name, EXTRACT(YEAR FROM rpt_dt) ORDER BY rpt_dt)) ytd_avg,
dbadmin->        SUM(supply) OVER (ORDER BY cust_name, rpt_dt ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) /
dbadmin->          DECODE(SUM(demand) OVER (ORDER BY cust_name, rpt_dt ROWS BETWEEN 2 PRECEDING AND CURRENT ROW), 0, 1, SUM(demand) OVER (ORDER BY cust_name, rpt_dt ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)) rolling3mnth,
dbadmin->        SUM(supply) OVER (ORDER BY cust_name, rpt_dt ROWS BETWEEN 5 PRECEDING AND CURRENT ROW) /
dbadmin->          DECODE(SUM(demand) OVER (ORDER BY cust_name, rpt_dt ROWS BETWEEN 5 PRECEDING AND CURRENT ROW), 0, 1, SUM(demand) OVER (ORDER BY cust_name, rpt_dt ROWS BETWEEN 5 PRECEDING AND CURRENT ROW)) rolling6mnth
dbadmin->   FROM avg_test
dbadmin->  ORDER
dbadmin->     BY cust_name, rpt_dt;
 cust_name |   rpt_dt   | demand | supply |      month_avg       |       ytd_avg        |     rolling3mnth     |     rolling6mnth
-----------+------------+--------+--------+----------------------+----------------------+----------------------+----------------------
 AAA       | 2014-01-01 |      5 |     10 | 2.000000000000000000 | 2.000000000000000000 | 2.000000000000000000 | 2.000000000000000000
 AAA       | 2014-02-01 |      4 |     20 | 5.000000000000000000 | 3.333333333333333333 | 3.333333333333333333 | 3.333333333333333333
 AAA       | 2014-03-01 |      5 |     15 | 3.000000000000000000 | 3.214285714285714286 | 3.214285714285714286 | 3.214285714285714286
 AAA       | 2014-04-01 |      1 |      5 | 5.000000000000000000 | 3.333333333333333333 | 4.000000000000000000 | 3.333333333333333333
 AAA       | 2014-05-01 |      2 |     10 | 5.000000000000000000 | 3.529411764705882353 | 3.750000000000000000 | 3.529411764705882353
 AAA       | 2014-06-01 |      6 |     12 | 2.000000000000000000 | 3.130434782608695652 | 3.000000000000000000 | 3.130434782608695652
 BBB       | 2014-02-01 |      5 |     20 | 4.000000000000000000 | 4.000000000000000000 | 3.230769230769230769 | 3.565217391304347826
 BBB       | 2014-03-01 |      3 |     15 | 5.000000000000000000 | 4.375000000000000000 | 3.357142857142857143 | 3.500000000000000000
 BBB       | 2014-04-01 |      2 |      4 | 2.000000000000000000 | 3.900000000000000000 | 3.900000000000000000 | 3.473684210526315789
 BBB       | 2014-05-01 |      2 |     10 | 5.000000000000000000 | 4.083333333333333333 | 4.142857142857142857 | 3.550000000000000000
 BBB       | 2014-06-01 |      3 |      9 | 3.000000000000000000 | 3.866666666666666667 | 3.285714285714285714 | 3.333333333333333333
 BBB       | 2014-07-01 |      3 |      9 | 3.000000000000000000 | 3.722222222222222222 | 3.500000000000000000 | 3.722222222222222222
(12 rows)
Jim Knicely

Image

Note: I work for Vertica. My views, opinions, and thoughts expressed here do not represent those of my employer.

Vertica4VK
Newbie
Newbie
Posts: 6
Joined: Fri Aug 22, 2014 2:10 am

Re: How to Group By -Runnig Total for Year to Date in VSQL

Post by Vertica4VK » Thu Aug 28, 2014 12:37 am

Hello James

Thanks again - perfect :-)

One more question - this is not needed now as i acheived from seperate query but just wondering if you can suggest - if same can be achieved in query below using Partition -

1) How do i add any condition in Partition - for eg. - let's say for a customer when i calculate Rolling3 Months - if i don't have data for all 3 months for any reason then it should not calculate 3Month Rolling total .?

Again Thanks for your help with this

VK

User avatar
JimKnicely
Site Admin
Site Admin
Posts: 1825
Joined: Sat Jan 21, 2012 4:58 am
Contact:

Re: How to Group By -Runnig Total for Year to Date in VSQL

Post by JimKnicely » Thu Aug 28, 2014 12:40 pm

How do i add any condition in Partition - for eg. - let's say for a customer when i calculate Rolling3 Months - if i don't have data for all 3 months for any reason then it should not calculate 3Month Rolling total .?
Maybe make use of a CASE statement to determine the row counts prior to calculating the sum. Like this:

Code: Select all

dbadmin=> SELECT cust_name,
dbadmin->        rpt_dt,
dbadmin->        demand,
dbadmin->        supply,
dbadmin->        supply/demand month_avg,
dbadmin->        SUM(supply) OVER (PARTITION BY cust_name, EXTRACT(YEAR FROM rpt_dt ) ORDER BY rpt_dt) /
dbadmin->          DECODE(SUM(demand) OVER (PARTITION BY cust_name, EXTRACT(YEAR FROM rpt_dt) ORDER BY rpt_dt), 0, 1, SUM(demand) OVER (PARTITION BY cust_name, EXTRACT(YEAR FROM rpt_dt) ORDER BY rpt_dt)) ytd_avg,
dbadmin->        (CASE
dbadmin(>           WHEN COUNT(supply+demand) OVER (ORDER BY cust_name, rpt_dt ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) = 3 THEN
dbadmin(>             SUM(supply) OVER (ORDER BY cust_name, rpt_dt ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) /
dbadmin(>               DECODE(SUM(demand) OVER (ORDER BY cust_name, rpt_dt ROWS BETWEEN 2 PRECEDING AND CURRENT ROW), 0, 1, SUM(demand) OVER (ORDER BY cust_name, rpt_dt ROWS BETWEEN 2 PRECEDING AND CURRENT ROW))
dbadmin(>         ELSE
dbadmin(>           NULL
dbadmin(>         END) rolling3mnth,
dbadmin->        (CASE
dbadmin(>           WHEN COUNT(supply+demand) OVER (ORDER BY cust_name, rpt_dt ROWS BETWEEN 5 PRECEDING AND CURRENT ROW) = 6 THEN
dbadmin(>             SUM(supply) OVER (ORDER BY cust_name, rpt_dt ROWS BETWEEN 5 PRECEDING AND CURRENT ROW) /
dbadmin(>               DECODE(SUM(demand) OVER (ORDER BY cust_name, rpt_dt ROWS BETWEEN 5 PRECEDING AND CURRENT ROW), 0, 1, SUM(demand) OVER (ORDER BY cust_name, rpt_dt ROWS BETWEEN 5 PRECEDING AND CURRENT ROW))
dbadmin(>         ELSE
dbadmin(>           NULL
dbadmin(>         END) rolling6mnth
dbadmin->   FROM avg_test
dbadmin->  ORDER
dbadmin->     BY cust_name, rpt_dt;
 cust_name |   rpt_dt   | demand | supply |      month_avg       |       ytd_avg        |     rolling3mnth     |     rolling6mnth
-----------+------------+--------+--------+----------------------+----------------------+----------------------+----------------------
 AAA       | 2014-01-01 |      5 |     10 | 2.000000000000000000 | 2.000000000000000000 |                      |
 AAA       | 2014-02-01 |      4 |     20 | 5.000000000000000000 | 3.333333333333333333 |                      |
 AAA       | 2014-03-01 |      5 |     15 | 3.000000000000000000 | 3.214285714285714286 | 3.214285714285714286 |
 AAA       | 2014-04-01 |      1 |      5 | 5.000000000000000000 | 3.333333333333333333 | 4.000000000000000000 |
 AAA       | 2014-05-01 |      2 |     10 | 5.000000000000000000 | 3.529411764705882353 | 3.750000000000000000 |
 AAA       | 2014-06-01 |      6 |     12 | 2.000000000000000000 | 3.130434782608695652 | 3.000000000000000000 | 3.130434782608695652
 BBB       | 2014-02-01 |      5 |     20 | 4.000000000000000000 | 4.000000000000000000 | 3.230769230769230769 | 3.565217391304347826
 BBB       | 2014-03-01 |      3 |     15 | 5.000000000000000000 | 4.375000000000000000 | 3.357142857142857143 | 3.500000000000000000
 BBB       | 2014-04-01 |      2 |      4 | 2.000000000000000000 | 3.900000000000000000 | 3.900000000000000000 | 3.473684210526315789
 BBB       | 2014-05-01 |        |     10 |                      | 4.900000000000000000 |                      |
 BBB       | 2014-06-01 |      3 |      9 | 3.000000000000000000 | 4.461538461538461538 |                      |
 BBB       | 2014-07-01 |      3 |      9 | 3.000000000000000000 | 4.187500000000000000 |                      |
(12 rows)
Jim Knicely

Image

Note: I work for Vertica. My views, opinions, and thoughts expressed here do not represent those of my employer.

Post Reply

Return to “New to Vertica SQL”