User 2 User Support! That's the goal!
How about this option to get rid of dups:
I created a table with 15,000,000 records having a lot of duplicates. I added a temporary column to the table to store an indicator that I want to keep the row. I then inserted all the non-duplicates into the same table giving the temporary column a value of 1 for these rows. Finally, I deleted all records where the keep column is null and then dropped the temporary column. My table was left with all unique rows.
Example:
Code: Select all
dbadmin=> create table test (col1 int, col2 int);
CREATE TABLE
Time: First fetch (0 rows): 16.203 ms. All rows formatted: 16.217 ms
dbadmin=> insert into test (select randomint(5), randomint(5) from tables cross join columns cross join tables t2 limit 15000000);
OUTPUT
----------
15000000
(1 row)
Time: First fetch (1 row): 8180.880 ms. All rows formatted: 8180.911 ms
dbadmin=> select col1, col2, count(*)
dbadmin-> from test
dbadmin-> group by col1, col2
dbadmin-> order by col1, col2;
col1 | col2 | count
------+------+--------
0 | 0 | 600398
0 | 1 | 599744
0 | 2 | 601643
0 | 3 | 600217
0 | 4 | 600686
1 | 0 | 599671
1 | 1 | 599769
1 | 2 | 599408
1 | 3 | 598887
1 | 4 | 599205
2 | 0 | 600508
2 | 1 | 599828
2 | 2 | 599882
2 | 3 | 599538
2 | 4 | 598461
3 | 0 | 599772
3 | 1 | 600063
3 | 2 | 599769
3 | 3 | 600085
3 | 4 | 599001
4 | 0 | 601377
4 | 1 | 600934
4 | 2 | 601486
4 | 3 | 599584
4 | 4 | 600084
(25 rows)
Time: First fetch (25 rows): 1028.894 ms. All rows formatted: 1028.966 ms
dbadmin=> alter table test add column keep_row int;
ALTER TABLE
Time: First fetch (0 rows): 3198.575 ms. All rows formatted: 3198.590 ms
dbadmin=> insert into test (select distinct col1, col2, 1 from test);
OUTPUT
--------
25
(1 row)
Time: First fetch (1 row): 315.839 ms. All rows formatted: 315.870 ms
dbadmin=> delete from test where keep_row is null;
OUTPUT
----------
15000000
(1 row)
Time: First fetch (1 row): 1214.642 ms. All rows formatted: 1214.727 ms
dbadmin=> alter table test drop column keep_row;
ALTER TABLE
Time: First fetch (0 rows): 2664.760 ms. All rows formatted: 2664.774 ms
dbadmin=> select * from test;
col1 | col2
------+------
0 | 4
1 | 0
1 | 1
1 | 2
2 | 1
3 | 0
3 | 1
4 | 0
4 | 2
4 | 4
1 | 4
2 | 2
3 | 3
3 | 4
4 | 1
0 | 0
0 | 1
0 | 2
0 | 3
1 | 3
2 | 0
2 | 3
2 | 4
3 | 2
4 | 3
(25 rows)
Time: First fetch (25 rows): 27.760 ms. All rows formatted: 27.851 ms