Performance

How to Read an AWR Report Without Drowning


An AWR report is a wall of numbers — dozens of sections, hundreds of rows, a thousand ways to get lost chasing a metric that doesn’t matter. Most people scroll to “Buffer Hit %”, see 99%, and conclude the database is healthy. That instinct is exactly backwards, and it’s why so much “tuning” optimizes the wrong thing.

There’s a faster, more reliable way to read one — and it fits in your head. It rests on a single idea (DB Time), follows a fixed five-section path, and tells you within about ninety seconds whether you’re CPU-bound, waiting on something, or chasing a ghost. This is that method. It targets Oracle 19c, the enterprise workhorse, with notes on 23ai/26ai, and it comes with a free lab so you can generate a real report and read it alongside the guide.

The short version. One number runs the whole report: DB Time — all the time the database spent in user calls. Divide it by elapsed time to get Average Active Sessions and compare that to your CPU count. Then read Top Timed Events: DB CPU on top means do less work (inefficient SQL); a wait on top means follow that wait. Let the top event pick which SQL ordered by… list to read. Ignore the hit ratios. Reach for ASH when it’s a short spike, not a steady state.

First, a licensing reality

AWR (and ASH, and ADDM, and the DBA_HIST_* views) are part of the Diagnostics Pack, a separately licensed option on Enterprise Edition. Querying them on a database you’re not licensed for is a compliance problem, not a free lunch. The governing switch is CONTROL_MANAGEMENT_PACK_ACCESS:

SELECT value FROM v$parameter WHERE name = 'control_management_pack_access';
-- DIAGNOSTIC+TUNING  -> AWR/ASH/ADDM available (Diagnostics + Tuning Pack licensed)
-- NONE               -> not available; use Statspack (free) instead

If you’re not licensed, Statspack is the free, built-in alternative — same ideas, fewer features. For the lab below we use Oracle Database Free, where the packs are enabled for development use (not a license waiver for production).

DB Time: the one metric everything hangs on

DB Time is the total time the database spent in user calls — CPU plus all non-idle waits — across all sessions. It is the master metric because tuning has exactly one goal: reduce DB Time. Every other number in the report is just a clue about where DB Time went.

The header gives you the two numbers that frame everything:

              Snap Id      Snap Time        Sessions Curs/Sess
            --------- ------------------- -------- ---------
Begin Snap:         1   13-Jun-26 10:00      42       3.1
  End Snap:         2   13-Jun-26 10:10      45       3.2
   Elapsed:               10.00 (mins)
   DB Time:               21.40 (mins)

Divide them: Average Active Sessions (AAS) = DB Time / Elapsed. Here that’s 21.4 / 10 ≈ 2.1. That single number tells you how much work the database was really doing on average. Compare it to your CPU count:

AAS vs CPU countWhat it means
AAS ≪ CPUsDatabase is mostly idle; if users complain, the problem is probably outside the DB
AAS ≈ CPUsRunning hot but not necessarily wrong — check whether it’s CPU or waits
AAS ≫ CPUsSessions are queuing — for CPU or for a wait; this is where real contention lives

Everything that follows is about explaining that DB Time — and DB Time only splits two ways, into CPU and waits:

flowchart TD
DBT[DB Time<br/>all user-call time] --> CPU[DB CPU<br/>on-CPU work]
DBT --> WAIT[Wait time<br/>non-idle waits]
WAIT --> IO[User I/O<br/>db file reads]
WAIT --> COMMIT[Commit<br/>log file sync]
WAIT --> CONC[Concurrency<br/>locks, latches]
WAIT --> CLU[Cluster<br/>gc events, RAC]
DB Time decomposes into DB CPU plus non-idle wait time, and waits group into classes. Every section of the report is just a finer breakdown of this one quantity.

The five sections that matter (in order)

You do not read an AWR report top to bottom. You read five sections, in this order, and you can stop as soon as the story is clear.

1. Load Profile — the shape of the workload

A quick orientation: is this an OLTP system doing lots of small transactions, or a few heavy queries? Scan for red flags — a hard-parse rate that isn’t near zero, enormous redo, or logical reads that dwarf physical reads (a sign of inefficient SQL doing far too many buffer gets).

Load Profile              Per Second    Per Transaction
~~~~~~~~~~~~~~~          -----------    ---------------
        DB Time(s):            2.1               0.8
         DB CPU(s):            2.1               0.8
  Logical read (blocks):   182,117             4,454
 Physical read (blocks):       210                 5
            Executes:         1,140                28
           Hard parses:           1               0.0

Logical reads massively exceeding physical reads, with DB CPU ≈ DB Time, is the classic fingerprint of CPU burned on inefficient SQL — lots of buffer gets, not lots of disk.

2. Top Timed Events — where the time actually went

This is the heart of the report. It ranks what consumed DB Time. The first question it answers: CPU or waits?

Top 10 Foreground Events by Total Wait Time
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Event                         Waits   Time (s)   Avg Wait   % DB time   Wait Class
----------------------------- -----   --------   --------   ---------   ----------
DB CPU                                    11.9                   99.1
enq: CR - block range reuse      12        0.1     6.35ms        0.6    Other
  • DB CPU at the top (like the 99.1% above) → you’re CPU-bound. The fix is to do less work: inefficient SQL, excessive logical reads, hard parsing. Adding CPU only buys time.
  • A wait event at the top → follow it. db file sequential read (single-block I/O, usually index lookups), db file scattered read (multi-block, full scans), log file sync (commit/redo), enq: TX - row lock contention (locking), gc events (RAC interconnect). Each points somewhere specific.

The number that matters is % DB time, not raw waits. A million waits that sum to 2% of DB time are noise; one event at 70% is your headline.

3. The dominant resource → the right Top SQL list

AWR gives you several “SQL ordered by …” lists, and reading the wrong one wastes time. Let section 2 choose the list:

If Top Timed Events says…Read “SQL ordered by…”
DB CPU dominatesCPU Time (and Gets — CPU usually tracks logical reads)
db file sequential/scattered readReads (physical I/O)
Lots of executions, high parseExecutions / Parse Calls
log file synclook at commit frequency, not a single SQL

In a CPU-bound report, the top of “SQL ordered by CPU Time” is your suspect. In the lab, that’s unmistakable:

SQL ordered by CPU Time
  CPU Time (s)   Elapsed (s)   Executions   SQL Text
  -----------   -----------   ----------   ---------------------------------------
        23.6          23.8            2     SELECT /*+ awr_demo */ SUM(SQRT(LEVEL)
                                            + LN(LEVEL + 1)) FROM DUAL CONNECT BY...

One statement, ~all the CPU. That’s the whole investigation: you now have a SQL_ID to tune.

4. Instance Efficiency — the ratios to distrust

This section is a trap. Buffer Hit %, Library Hit %, and friends are not health scores. A 99.99% Buffer Hit Ratio often means a query is doing millions of logical reads of cached blocks — burning CPU while looking “efficient.” Tuning to make a ratio go up is how you end up optimizing the wrong query.

A composite scenario (illustrative). A team is paged for a slow system. The AWR shows Buffer Hit % at 99.9%, so they rule out I/O and start adding memory. The real story was two lines down: DB CPU at 95%, driven by one report query doing a nested loop over millions of cached rows. The ratio said “healthy”; DB Time said “one query is eating the box.” Trust DB Time.

Glance at Instance Efficiency for anomalies (e.g. a low parse ratio, a soft-parse problem) — never as a pass/fail grade.

5. The supporting cast (only if you still need it)

If the first four didn’t close the case, drop into the detail: Time Model Statistics (where DB Time splits — SQL execute vs parse vs PL/SQL), Segments by Logical/Physical Reads (which table/index is hot), and SQL ordered by Reads for I/O-bound systems. Most reports are solved before you get here.

A second worked example: when it’s I/O, not CPU

The whole point of the method is that the top event routes you. Flip the workload from CPU to I/O and the report tells a different story — you read a different SQL list and land in a different section. The lab’s drill-io flushes the cache and full-scans a table larger than the cache; here’s what changes.

Load Profile — physical reads jump from a rounding error to the headline:

Load Profile              Per Second
  Logical read (blocks):    60,977
 Physical read (blocks):    47,886      <- now enormous

Top Timed Events — an I/O wait appears alongside the CPU:

Event                   Waits   Time (s)   % DB time   Wait Class
----------------------- -----   --------   ---------   ----------
DB CPU                            11.0        96.2
direct path read        4,299     0.2         2.1       User I/O

Now the routing changes. There’s an I/O event in play, so you read SQL ordered by Reads (not CPU) — and the fastest path to the culprit, Segments by Physical Reads, which names the object doing the I/O:

Segments by Physical Reads
  Owner     Object Name    Physical Reads   % Total
  LABUSER   BIGTAB              545,472       99.7

One table, 99.7% of the physical reads. You found the hot segment in a single line — now the real question is whether that full scan should exist at all (a missing index? an unselective predicate? a report running far too often?).

An honest note about fast storage. On the lab’s local NVMe, those ~48k reads/sec complete in microseconds, so direct path read is only 2% of DB time and DB CPU still tops the report — the scan burns more CPU than it waits on I/O. On production storage with real latency, the same physical-read volume becomes the headline:

[illustrative — the same workload on latency-bound storage]
Event                       Waits     Time (s)   % DB time
db file scattered read    420,118       980        71.4
DB CPU                                   210        15.3

The diagnosis path is identical — Load Profile shows the reads, Top Events names the wait, Segments names the object — but now I/O is unmistakably the bottleneck. Trust the read volume and the segment; the wait time scales with your storage.

The wait events you’ll actually meet

When a wait — not DB CPU — tops the report, the name tells you where to look. You don’t need to memorize Oracle’s hundreds of events; you need the dozen that show up in real reports and what each is really saying:

Wait eventClassWhat it usually meansFirst thing to check
db file sequential readUser I/OSingle-block reads — usually index lookupsThe SQL’s access path; is the index selective?
db file scattered readUser I/OMulti-block reads — full scansShould that scan be an index range scan?
direct path readUser I/OLarge scans bypassing the buffer cacheBig-table scans, parallel query, sort/hash spills
log file syncCommitSessions waiting for commit redo to flushCommit frequency (committing per row?), LGWR/redo storage
log file parallel writeSystem I/OLGWR writing redo to diskRedo log device latency
buffer busy waitsConcurrencyTwo sessions want the same blockHot block / right-hand index growth; consider partitioning
read by other sessionUser I/OWaiting for a block another session is reading inOften the flip side of a hot full scan
enq: TX - row lock contentionApplicationRow-level locking — app waiting on a lockThe blocking session/SQL; transaction design
latch: shared pool / library cacheConcurrencyParsing/sharing pressureHard parsing, lack of bind variables
gc cr / current blockClusterRAC interconnect — fetching a block from another nodeInterconnect health; block contention across nodes

That last one is your bridge to the other half of HA: gc events live on Real Application Clusters. The rest follow the same rule — the event names the resource; the SQL and segment sections name the culprit.

AWR vs ASH vs ADDM — pick the right tool

ToolGranularityUse it when
AWRAggregated over a snapshot interval (default 1h)“The database was slow between 2 and 3pm” — the steady-state picture
ASHPer-second active-session samples”It froze for 90 seconds at 2:14” — short spikes AWR averages away; drill into one session/SQL
ADDMAutomated analysis of an AWR intervalA fast first opinion and a starting hypothesis

Rule of thumb: AWR for the interval, ASH for the moment. A 10-minute stall inside a 1-hour AWR window is diluted to ~17% of the report; ASH shows it at full intensity.

Drilling into a moment with ASH

AWR aggregates an interval; ASH samples active sessions every second, so it shows what was running at a specific moment — the right tool for a spike. You generate an ASH report the same way you’d generate AWR (ashrpt.sql, OEM, or DBMS_WORKLOAD_REPOSITORY.ASH_REPORT_TEXT), but bounded by a time range instead of snapshots. The lab’s drill-ash runs a short burst and reports on it:

Top User Events            Avg Active Sessions   % Activity
-------------------------   -------------------   ----------
CPU + Wait for CPU                  1.0              98.5

Top SQL Statements                          % Activity   SQL Id
SELECT /*+ ash_demo */ SUM(SQRT(LEVEL)...      97.8      <sql_id>

ASH answers what AWR can’t: which session, which SQL, which wait — at 2:14:32, not “sometime in the last hour.” For a hands-on look you can also query the view directly:

SELECT sql_id, event, COUNT(*) AS samples
FROM   v$active_session_history
WHERE  sample_time > SYSTIMESTAMP - INTERVAL '5' MINUTE
GROUP  BY sql_id, event
ORDER  BY samples DESC FETCH FIRST 10 ROWS ONLY;

Each row of ASH is roughly one second of one active session, so those samples counts are effectively seconds of DB time broken down by SQL and wait — a histogram of where the database’s attention actually went. (A NULL event means the session was on CPU.)

The 90-second triage

flowchart TD
A([Compute AAS = DB Time / Elapsed]) --> B{AAS meaningful<br/>vs CPU count?}
B -- No, near 0 --> Z[Problem is likely outside the DB<br/>app, network, client]
B -- Yes --> C{Top Timed Event?}
C -- DB CPU --> D[CPU-bound: read SQL ordered by CPU/Gets<br/>find the inefficient SQL, do less work]
C -- db file read --> E[I/O-bound: read SQL ordered by Reads<br/>+ Segments by Physical Reads]
C -- log file sync --> F[Commit/redo: check commit frequency,<br/>redo, LGWR]
C -- enq / lock --> G[Contention: find the blocking SQL/object]
D --> Y([Get the SQL_ID, then tune that statement])
E --> Y
F --> Y
G --> Y
Reading an AWR report: let DB Time and the top event route you. Stop as soon as the story is clear.

What teams get wrong

  • Reading ratios instead of DB Time. Buffer Hit % is not a grade. Start at Top Timed Events.
  • Too-wide a window. A 6-hour AWR averages your 10-minute incident into invisibility. Pick the tightest snapshot pair that brackets the problem.
  • Spanning a restart. An AWR report across an instance bounce is meaningless — stats reset.
  • Tuning the top SQL by the wrong metric. If you’re I/O-bound, the biggest-CPU SQL may be irrelevant.
  • Confusing waits count with impact. Sort by % DB time, always.
  • Using AWR for a spike. Reach for ASH when the problem is short-lived.

Try it yourself: generate a real AWR report

The fastest way to internalize this is to make one. The awr/ lab in github.com/pyaroslav/oracle-labs spins up Oracle Database Free with Docker, runs a known CPU-bound workload between two snapshots, and hands you a real AWR report to read — no Oracle account, no license:

./run.sh up         # start Oracle Database Free
./run.sh all        # setup, then all three drills below
# or run them individually:
./run.sh drill      # CPU-bound  -> awr-report.txt  (DB CPU ~99%, the workload tops SQL by CPU)
./run.sh drill-io   # I/O        -> io-report.txt   (huge physical reads; BIGTAB tops Segments by Reads)
./run.sh drill-ash  # ASH        -> ash-report.txt  (Top User Events / Top SQL for the recent window)

Every excerpt in this post — the CPU-bound report, the I/O signature, the ASH output — was generated by these drills on Oracle Database Free 26ai. Change a workload, regenerate, and watch the report change with it; that feedback loop is what turns “reading AWR” from intimidating into routine.

What about 23ai and 26ai?

The method is release-stable: DB Time, Top Timed Events, and the SQL lists work the same on 19c, 23ai, and the current 26ai. Newer releases sharpen the tooling around AWR — Real-Time SQL Monitoring and a richer Active Session History make drilling into a single statement easier — but the reading order above doesn’t change. (On the Free image, AWR/ASH are available for learning, as the lab shows.)

Frequently asked questions

What is DB Time in an AWR report?

DB Time is the total time the database spent working in user calls — CPU time plus all non-idle wait time — summed across all sessions during the snapshot interval. It is the master metric of Oracle performance: the goal of tuning is to reduce DB Time. Dividing DB Time by elapsed time gives Average Active Sessions, a measure of how busy the database really was.

Which section of an AWR report should I read first?

After checking DB Time and Average Active Sessions in the header, read the Top Timed Events (Top 10 Foreground Events) section. It ranks what consumed DB Time and immediately tells you whether the database is CPU-bound (DB CPU at the top) or waiting on something specific. Do not start with Buffer Hit % or other ratios.

Is a high Buffer Cache Hit Ratio good?

Not necessarily. A very high Buffer Hit Ratio often means a query is performing millions of logical reads against cached blocks, burning CPU while appearing efficient. Hit ratios are not health scores. Diagnose performance from DB Time and wait events, not from cache ratios.

When should I use ASH instead of AWR?

Use AWR for the aggregate picture over a snapshot interval (for example, "the database was slow from 2 to 3pm"). Use ASH (Active Session History) for short-lived spikes that AWR averages away — for example a 90-second stall — and to drill into a specific session, SQL, or wait at a precise moment.

Do AWR and ASH require a license?

Yes. AWR, ASH, ADDM, and the DBA_HIST_* views are part of the Diagnostics Pack, a separately licensed option on Oracle Enterprise Edition. The CONTROL_MANAGEMENT_PACK_ACCESS parameter governs access. If you are not licensed, use Statspack, the free built-in alternative.

How long should the AWR snapshot window be?

Pick the tightest snapshot pair that brackets the problem — typically a single default (hourly) interval or a custom snapshot pair around the incident. Wide windows (several hours) average out short spikes and hide the issue. Never run a report across an instance restart, because statistics reset.

What is Average Active Sessions (AAS) in an AWR report?

Average Active Sessions is DB Time divided by elapsed time over the snapshot interval. It measures how many sessions, on average, were actively working in the database at once. Compared against the server CPU count it tells you the load: AAS well below the CPU count means the database is mostly idle; AAS near the CPU count means it is running hot; AAS well above the CPU count means sessions are queuing for CPU or a wait, which is where real contention shows up.

What does the db file sequential read wait event mean?

db file sequential read is the wait for a single-block read from disk into the buffer cache, most often an index block or a table block reached by rowid during an index lookup. It is normal in small amounts. When it dominates an AWR report, look at the SQL doing those reads and whether the index is selective — sometimes the fix is a better index, and sometimes it is avoiding an index in favor of a full scan. It is distinct from db file scattered read, which is the multi-block read used by full scans.

The one-paragraph version

Compute Average Active Sessions (DB Time ÷ Elapsed) and compare it to your CPU count. Read Top Timed Events to learn whether you’re CPU-bound or wait-bound. Let that pick which SQL ordered by … list to read, and pull the SQL_ID of the dominant statement. Ignore the hit ratios. Use ASH when the problem is a short spike rather than a steady state. Everything else in the report is a supporting detail you’ll usually never need.

Have a question or some feedback?

I write here in a personal capacity and enjoy comparing notes with other Oracle folks. Say hello.

Get in touch