Update: nice query from Jonathan Lewis on AWR:
I found this while looking for how to eliminate reporting stats over database restarts. Apparently if using snap_id then there will be no snap_id-1 over restarts because the ids will jump more than one across restarts – have to check into this
Exploring ways to mine data in AWR. One situation I’m looking at now is a number of database that all use the same LUNs on the back end storage, so two metrics that would be interesting to look at are IOs from these databases and the IO latencies to answer the question, “does read and/or write I/O activity correlate to increased latency on the backend storage?”
I have multiple databases in one AWR repository.
I want to create the output in a format that I can read with something like Excel, so I make the values comma delimited, AWRs values for sysmetric are hourly, so I want to round the times to the nearest hour to make correlation easier and I want to translate the DBID into the database name to make reading the data easier. I could do something like:
select
n.db_name||'_'||n.host_name||'_'||
decode(metric_name,
'Physical Write Total Bytes Per Sec','write','read')||','||
to_char(trunc(begin_time)+
( ROUND ((begin_time - TRUNC (begin_time)) * 24) / 24),
'YYYY/MM/DD HH24:MI')||','||
average
from dba_hist_sysmetric_summary s,
(select distinct dbid, db_name, host_name
from dba_hist_database_instance) n
where
( s.metric_name= 'Physical Write Total Bytes Per Sec'
or
s.metric_name= 'Physical Read Total Bytes Per Sec'
) and n.dbid=s.dbid
order by begin_time;
For the I/O latencies it take some more work as I have to compute the deltas between snapshots getting the total elasped time and the count of waits to compute the average wait over the snapshot period for each I/O wait event. I also have to watch out for database bounces, which I have inelligantly addressed by simply filtering only for waits less than or equal to 0. I could do something like
select
btime||','||
n.db_name||'_'||n.host_name||'_'||
event_name||','||
nvl(round((time_ms_end-time_ms_beg)/nullif(count_end-count_beg,0),3),0) avg_ms
from (
select
e.event_name,
e.dbid,
to_char(trunc(cast(begin_interval_time as date ))+
(ROUND ((cast(begin_interval_time as date)- TRUNC (cast(begin_interval_time as date))) * 24) / 24),
'YYYY/MM/DD HH24:MI') btime,
Lag (e.total_waits) OVER( PARTITION BY e.event_name, e.dbid ORDER BY s.snap_id)
count_beg,
total_waits count_end,
Lag (e.time_waited_micro/1000) OVER( PARTITION BY e.event_name,e.dbid ORDER BY s.snap_id)
time_ms_beg,
time_waited_micro/1000 time_ms_end
from
DBA_HIST_SYSTEM_EVENT e,
DBA_HIST_SNAPSHOT s
where
s.snap_id=e.snap_id and
s.dbid=e.dbid and
(e.event_name= 'db file sequential read'
or e.event_name='db file scattered read' )
order by begin_interval_time
) s,
(select distinct dbid, db_name, host_name from dba_hist_database_instance) n
where
n.dbid=s.dbid
and
round((time_ms_end-time_ms_beg)/nullif(count_end-count_beg,0),3) >= 0
order by btime;
Correlating the read/write throughput along with single and multiblock read latencies across multiple database over multiple days can pose challenges. One strategy is to boil the data down to a smaller number of metrics like total read, total write to average single block I/O. For average single block I/O, I’d want to normalize the I/O latency relative to read I/O for that database. Maybe I could get an average of latencies weighted by throughput of that database Work in progress …
Comments