Steps to Analyze AWR Report in Oracle

Steps to Analyze AWR Report in Oracle

AWR - Automatic workload repository is a collection of persistent system performance statistics owned by SYS. It resides in SYSAUX tablespace. Bydefault snapshot are generated once every 60 min and maintained for 7 days. Each snapshot has a unique ID know as "snap_id". Snapshot detail can be found in "dba_hist_snapshot" view. 

If we have Database performance issue and not the Database machine, then AWR Report is the place to look at. AWR is not used for real-time performance monitoring like the v$ tables. It is used for historical analysis of performance. AWR complements, but doesnot replace real-time monitoring.

Once AWR Report is generated in Oracle, the next task is to analyze it. By going through the AWR Report we can easily solve issues like slow database, high wait events, slow query and many more issues. Even though the report is lengthy, Analyzing or Reading relevant part of AWR Report can help to troubleshoot issues in easy and fast manner.


In case if a particular query is not performing well, look at execution plan of the query, stats of underlying table etc. In this case AWR won't help much.

General Tips before getting an AWR Report.

1. Collect Multiple AWR Reports: It's always suggested to have two AWR Reports, one for good time (when database was performing well), second when performance is poor. This way we can easily compare good and bad report to find out the culprit.

2. Stick to Particular Time: Always take the report during the time when the Database is performing slow. We have to have a specific time like Database was slow today at 10Am and continued till 2Pm. Here, we can get a report for these four hours. 

3. Split Large AWR Report into Smaller Reports: Instead of having one report for long time like one report for 4hrs. it's is better to have four reports each for one hour. This will help to isolate the problem.

Now, let us walkthrough the report. Since, AWR report is a huge report and area to look into AWR is also depends on problem to problem. Here, I am listing most common area for a Performance Engineer to look into which will give a clear picture of the issue. 


Steps to Analyze AWR Report:

Note: The Report is obtained using OEM with Oracle Release 12.1.0.1.0
1. Database Details:

After getting an AWR Report check the report generated is for the problematic DB. The details are shown in Top part of the report.
In this part cross check for database and instance and and database version with the Database having performance issue. This report shows RAC = NO if it is not a RAC DB.


2. Host Configuration:

This will give you name, platform CUP, socket and RAM etc. Important thing to notice is that the configuration like CPU and Memory has not changed when the performance is degraded.



3. Snap Shot Detail:

This are the detail about snap shot taken, Snap start time and end time. Difference between them is as "Elapsed".





DB Time= session time spent in database.

DB Time= CPU Time + Non IDLE wait time.


In Systems with multiple concurrent active sessions, DB time can be larger than elapsed time. We can see that DB time is very large as compared to Elapse time, which is not a concern. Check if you have taken a report for the time having performance problem. If yes fine, otherwise 
take a report for performance problem time. 


4. Load Profile:

Here are few important stats to look into. First is "DB CPU(s)" per second. Before that let's understand how DB CPU's work. Suppose you have 8 cores into the system. So, per wall clock second you have 8 seconds to work on CPU. 




 So, if "DB CPU(s)" per second in this report > cores in (Host Configuration (#2)) means env is CPU bound and either need more CPU's or need to further check is this happening all the time or just for a fraction of time.

In this case, machine has 8 cores and DB CPU(s) per second is 0.1. So, this is not a CPU bound case

Next look at Parses and Hard parses. If the ratio of Hard parses to Parses is high, this means Database is performing more Hard parses. In that case we needs to look at parameters like cursor_sharing and application level for bind variables etc.

In the above we have no issue with parses.
Parse issues usually occur as a result of:
           Bad Bind variables usage
           Insufficient memory
           Will also be co-indicated by low percentage of memory for multiple SQL execution.


5. Instance Efficiency Percentages:
Note: DB Tuning must never be driven by the Hit ratios. These only provide additional information for help to understand how the instance is operating.

Meanings of particular Hit Ratios:

a.       Buffer Nowait%: Shows the % of times when data buffers were accessed directly without any wait time.

b.       Buffer Hit Ratio : Measures how many times a  required block was found in memory rather than having to execute an expensive read operation on disk to get the block.

c.        Library Hit%: Shows the % of times when SQL statements and PL/SQL packages were found in the shared pool.

d.       Execute to Parse %: Shows how often parsed SQL statements are reused without re-parsing.

e.        Parse CPU to Parse Elapsed %: Gives the ratio of CPU time spent to parse SQL statements.

f.        Redo NoWait %: Shows whether the redo log buffer has sufficient size.

g.        In-memory Sort %: Shows the percentage of times when sorts are performed in memory instead of using temporary tablespaces.

h.       Soft Parse % : Shows how often sessions issued a SQL statements that is already in the shared pool and how it can use an existing version of the statement.

i.         Latch Hit %: Shows how often latches were acquired without having to wait.

j.         % Non-Parse CPU : Shows the percentage of how much CPU resources were spent on the actual SQL execution.

In these statistics, you have to look at

Soft Parse % ~ 100% indicates that the SQL statements are actively re-used.

"% Non-Parse CPU". ~ 100% means most of the CPU resources are used into operations other than parsing, which is good for database health.

Parse CPU to parse Elapsed % : It is very low, it reveals that oracle waits for some resources during parsing of SQL statements. To be investigated further.

Rule of thumb: Always minimize the number of Hard parses. This reduction yields the benefits of minimizing CPU overhead spent performing costly parse work.

6. Top 10 Foreground Events by Total Wait Time:

This is another most important stats to consider while looking at AWR Report for any database performance related issue.  This report is critical because it shows those database events that might constitute the bottleneck for the system. This has a list of top 10 foreground events arranged by Total Wait Time.



Here, first of all check for wait class if wait class is  User I/O , System I/O,  Others etc this could be fine but if wait class has value "Concurrency" then there could be some serious problem. Next to look at is Total Wait Time (sec) which show how many times DB was waiting in this class and then Wait Avg (ms). If Total Wait Time(sec) are high but Wait Avg(ms) is low then you can ignore this. If both are high or Wait Avg(ms) is high then this has to further investigate.

In the above screen shot, most of the resource are taken by Log file sync = 91.9% DB time and the wait class is “Commit”

In this report the event is "log file sync" which has high Waits, huge % DB time and large values in Total Wait Time (sec)  and Wait  Avg(ms) with wait class as commit. So, here you have to investigate further. 

Note : Use highest Total Wait Time(sec) to guide investigation.

Next are Wait Classes by Total Wait Time, Host CPU, Instance CPU, IO Profile, Memory Statistics, Cache Sizes are self-explanatory.
Point to remember while checking these results :
Ø  Watch for number of CPUs
Ø  Pay attention to changes in Memory size
Ø  An idle CPU can be a bad thing
Ø  Always look at Wait IO verses CPU usage
Ø  If the system is IO bound CPU will be idle!

7. Shared Pool Statistics:


In general, Memory usage % statistics should be ~70% after the DB has been running a long time. If its quite low, memory is being wasted. Here in our report we have Memory Usage % ~50% which is good.

In case if we have this to ~90% or above as shared pool consumed. This could indicate that the system experiences some overhead while aging out old shared memory structures like cursors, PL/SQL programs, and so on. This will place additional overhead on the CPU to perform reparsing aging-out. The size of the shared pool should be increased appropriately to eliminate such overhead.

The % SQL with executions >1 statistics indicate how many SQL statements are executed more than one time. This measures how well the applications are tuned and how well they make use of Bind variables.


8. Time Model Statistics:

This is a detailed explanations of system resource consumptions. Stats are order by Time (s) and % of DB Time.



Some times Sum of all  % of DB time may be > 100%. why is this ?

Because this is cumulative time i.e. SQL execute elapsed time may show more DB time, which includes it sub parts like parse time elapsed, hard parse elapsed time etc. So, if you find Hard parse time elapsed is taking more %. So investigate further so on and so forth.

We have to look for stat which is taking abnormal % of DB time.  

9. Operating System Statistics - Detail:

This is the information related to OS, what is the load status on System shown here.



This report shows, system is 97 to 98% idle at time of report taken, So, there is no resource crunch at system level. But if, you found very high busy, user or sys % and indeed this will led to low idle %. Investigate what is causing this.

Next, very crucial part of AWR report is SQL Statistics which has all sql query details executed during report time interval. 


We will explore few of them, To understand, how to analyzed these reports. Let's start with

10. SQL Ordered by Elapsed Time:

As explained by name itself, this lists SQL queries ordered by Elapsed time into reported time interval.



In this report, look for query which has low executions and high Elapsed time per Exec (s) and this query could be a candidate for troubleshooting or optimizations.

Important point: some times executions may show 0, it doesn't mean query is not executing, this might be the case when query was still executing and you took AWR report. That's why query completion was not covered in Report. 

11. SQL Ordered by CUP Time:

In this report, SQL queries are listed on the basis of CPU taken by the query i.e. queries causing high load on the system. The top few queries could be the candidate query for optimization.

From above stat, look for queries using highest CPU Times,

However, there are so many other stats in AWR Report which needs to be considered; here I have listed the most commonly used stats for any performance related information.


Comments

  1. Nice stuff. Thanks

    ReplyDelete
  2. Thanks for your valuable time for describe AWR.
    Seriously very nice.

    ReplyDelete
  3. Nice and very informative..

    ReplyDelete
  4. Lots of information and its nice.....

    ReplyDelete
  5. nicely presented. Thank you very much. I'm not a DBA but understand the explanation.

    ReplyDelete
  6. very Good explanation of each part, thanks

    ReplyDelete
  7. its very good explanation Thank you so much

    ReplyDelete
  8. Hey Lucky thanks for posting this up.

    ReplyDelete
  9. Thank you so much.

    ReplyDelete
  10. Nice article about understanding on how to analyze the AWR report. Appreciated.

    ReplyDelete
  11. best article I have found on AWR report analysis.

    ReplyDelete
  12. Perfect explanation and need to attach os level and OEM screenshots with steps generation of awr would be helpful deeply to all.,

    ReplyDelete
  13. hi..nice post,
    can you please explain thee same in more detail.
    or share your note about AWR report on sudhir.shilwant@yahoo.com

    Thannks

    ReplyDelete
  14. Nice article that explains more about AWR analysis from performance perspective.

    ReplyDelete
  15. Very informative

    ReplyDelete
  16. One of our application is hanging with large number of threads, where should we start from AWR report to investigate?

    ReplyDelete
  17. The design of your blog made this a chore to read.

    ReplyDelete
  18. Nice article. very goood explanaiton

    ReplyDelete
  19. I'm dazzled, I should say. Only sometimes do I go over a blog that is both educative and intriguing, and doubtlessly, you have nailed it. The issue is an issue that insufficient people are talking cleverly about. Presently i'm upbeat I unearthed this during my chase for something with respect to this.
    liveskye

    ReplyDelete
  20. 28.39-PM.png 1250w, https://www.laweekly.com/wp-content/uploads/2021/06/Screen-Shot-2021-06-01-at-1.28.

    ReplyDelete
  21. I never thought that I will be qualify for a loan that Officer Pedro and his loan company granted me which was very smooth and transparent in every conversation that we make through the loan process, I will once again thank him and his loan organization for a job well done by offering me a loan of 22 million Euro with the low rate of 2% annual return they are genuine and loan register company with simple terms and conditions.
    Contact the loan company through pedroloanss@gmail.com whatsapp: + 1-8632310632.

    ReplyDelete
  22. This comment has been removed by the author.

    ReplyDelete
  23. This comment has been removed by the author.

    ReplyDelete

Post a Comment

Popular posts from this blog

Vmstat Output explained

Verifications and Error Handling in LoadRunner *Web_reg_find and Web_reg_save_param*