DISTINCT vs. GROUP BY
Moderator: NorbertKrupa
-
- Newbie
- Posts: 21
- Joined: Mon Feb 13, 2012 9:44 pm
DISTINCT vs. GROUP BY
Is there a difference between DISTINCT and GROUP BY in queries where no aggregate functions are used? Does one perform better on very large data sets?
Re: DISTINCT vs. GROUP BY
Hi!
[DELETED]
[DELETED]
Last edited by id10t on Sat May 09, 2015 3:18 pm, edited 1 time in total.
-
- Newbie
- Posts: 21
- Joined: Mon Feb 13, 2012 9:44 pm
Re: DISTINCT vs. GROUP BY
Thanks. They seem to generate the same explain plan... However, in the example below, the GROUP BY has a lower associated cost?
----------
I wonder why?
Code: Select all
dbadmin=> select count(*) from abc;
count
-----------
365496300
(1 row)
Code: Select all
dbadmin=> explain select distinct a, b, c from abc;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------
------------------------------
QUERY PLAN DESCRIPTION:
------------------------------
explain select distinct a, b, c from abc;
Access Path:
+-GROUPBY HASH (LOCAL RESEGMENT GROUPS) [Cost: 5M, Rows: 10M] (PATH ID: 1)
| Group By: abc.a, abc.b, abc.c
| Execute on: All Nodes
| +---> STORAGE ACCESS for abc [Cost: 4M, Rows: 365M] (PATH ID: 2)
| | Projection: test.abc_b0
| | Materialize: abc.a, abc.b, abc.c
| | Execute on: All Nodes
Code: Select all
dbadmin=> explain select a, b, c from abc group by a, b, c;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------------
------------------------------
QUERY PLAN DESCRIPTION:
------------------------------
explain select a, b, c from abc group by a, b, c;
Access Path:
+-GROUPBY HASH (LOCAL RESEGMENT GROUPS) [Cost: 4M, Rows: 10M] (PATH ID: 1)
| Group By: abc.a, abc.b, abc.c
| Execute on: All Nodes
| +---> STORAGE ACCESS for abc [Cost: 3M, Rows: 365M] (PATH ID: 2)
| | Projection: test.abc_b0
| | Materialize: abc.a, abc.b, abc.c
| | Execute on: All Nodes