Page 1 of 1

Query_requests table

Posted: Thu Feb 04, 2016 9:40 pm
by Beg1nner
We were in the process of monitoring all the queries our users run in our platform. I found vertica logs these in query_requests table. I have some questions on the same.

1> Is there a limit to the amount of data/ how far back that Vertica keeps logs of queries in query_requests table?
If yes, can we configure the # of rows or the size of the data that Vertica should keep logging in this table?

2> Any other table that can be useful to track the activities of users on Vertica?

Any insight is greatly appreciated.

Re: Query_requests table

Posted: Fri Feb 05, 2016 3:43 am
by NorbertKrupa
query_requests is a system view. You have to first pull the underlying system tables for this view:

Code: Select all

dbadmin=> SELECT query_string
dbadmin-> FROM   v_internal.vs_system_views
dbadmin-> WHERE  view_schema = 'v_monitor'
dbadmin->        AND view_name = 'query_requests';
                                                        query_string                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             
------------------------------------------------------------------------------------------------------------------------
 create view "v_monitor"."query_requests" as select ri.node_name,
       ri.user_name,
       ri.session_id,
       ri.request_id,
       ri.transaction_id,
       ri.statement_id,
       ri.request_type,
       replace(replace(ri.request, E'\n', ' '), E'\t', ' ') as request,
       ri.label as request_label,
       ri.search_path,
       round(ra.memory_mb, 2) as memory_acquired_mb,
       rc.success,
       de.error_count,
       ri.time                                   as start_timestamp,
       rc.time                                   as end_timestamp,
       datediff('millisecond', ri.time, rc.time) as request_duration_ms,
       rc.time IS NULL                           as is_executing
from
     v_internal.dc_requests_issued ri
     LEFT OUTER JOIN v_internal.dc_requests_completed rc USING (node_name, session_id, request_id)
     LEFT OUTER JOIN (select node_name,
                       session_id,
                       request_id,
                       count(*) as error_count
                from v_internal.dc_errors
                where error_level >= 20
                group by 1,2,3) de USING (node_name, session_id, request_id)
     LEFT OUTER JOIN (select node_name,
                       transaction_id,
                       statement_id,
                       max(memory_kb)/1024::float as memory_mb
                from v_internal.dc_resource_acquisitions
                where result = 'Granted'
                group by 1,2,3) ra
                USING (node_name, transaction_id, statement_id)
order by is_executing desc, rc.time desc
;

(1 row)
This view involves:
  • v_internal.dc_requests_issued
  • v_internal.dc_requests_completed
  • v_internal.dc_errors
  • v_internal.dc_resource_acquisitions
If you try to find the oldest row for each data collector component, you'll see a lot of variance. That's because by default, data collector uses a disk based policy. Meaning, once the disk based retention limit is reached, it will purge those rows from the data collector component.

Since this view involves 4 system tables, you will need to set a retention policy for each of those to make up the query_requests view. However, you can set a time based collection policy for each component.

My recommendation would be to put together a more simple view; perhaps with just dc_requests_issued and dc_requests_completed so that you don't need to increase the retention policy for components such as dc_errors or dc_resource_acquisitions.

You can search v_internal.vs_system_tables for other user related system tables:

Code: Select all

SELECT table_schema,
       table_name,
       table_description,
       is_view
FROM   v_internal.vs_system_tables
WHERE  table_description ILIKE '%user%'
ORDER  BY table_schema,
          table_name;

Re: Query_requests table

Posted: Mon Feb 08, 2016 7:48 pm
by Beg1nner
Thanks! that was really helpful. I have one follow up question.

Is there a limit to the value we can enter in the 'interval' parameter when we are setting time specific limit? If I want table to store 1 year of data could I just enter the below?

SELECT SET_DATA_COLLECTOR_TIME_POLICY('OptimizerEvents', '365 day'::interval);

Re: Query_requests table

Posted: Mon Feb 08, 2016 8:40 pm
by JimKnicely
Hi,

I do not think there is a limit, other than the physical disk space needed to store that much data. Also, the system tables in Vertica are notoriously slow, so if you try to query those tables having 365 days worth of history, you are going to be disappointed.

Another option is to create your own table, i.e. my_query_requests. Then you simply run a cron job to insert data into it from the regular query_requets table periodically, i.e. once an hour, every 12 hours, etc. Depends on how you want to report off that table.

Good news is that your table will perform in orders of magnitude better than the Vertica system tables...