Some Analysis of AWR Report

AWR is a feature introduced in Oracle 10g; its full name is Automatic Workload Repository. AWR generates report data by comparing the statistics collected at two snapshots, and the resulting report consists of multiple sections.
[Screenshot: AWR report header (snapshot summary)]
DB Time does not include the time consumed by Oracle background processes. If DB Time is far less than Elapsed time, the database is relatively idle.
DB Time = CPU time + wait time (excluding idle waits; non-background processes only)
Put simply, DB Time is the time the server spent on database operations (in non-background processes) and on non-idle waiting.
DB Time = CPU time + all non-idle wait event time
In the report above, over 79 minutes of elapsed time (during which 3 snapshots were collected), the database consumed 11 minutes of DB Time. The RDA data shows the system has 8 logical CPUs (4 physical CPUs), so each CPU spent 1.4 minutes on average, and CPU utilization was only about 2% (1.4/79). This indicates the system is under very little pressure.
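As a hedged sketch, the utilization arithmetic quoted above works out as follows (the figures are the ones from the text):

```python
# Rough CPU-utilization estimate from DB Time and elapsed time.
elapsed_min = 79      # wall-clock minutes covered by the snapshots
db_time_min = 11      # DB Time: non-background work + non-idle waits
logical_cpus = 8

# Average DB Time consumed per CPU over the interval.
per_cpu_min = db_time_min / logical_cpus        # ~1.4 minutes
utilization = per_cpu_min / elapsed_min * 100   # ~2%

print(f"per-CPU DB Time: {per_cpu_min:.1f} min, utilization: {utilization:.1f}%")
```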

Two reports are listed below for illustration:
Report A:
            Snap Id  Snap Time            Sessions  Curs/Sess
Begin Snap:    4610  24-Jul-08 22:00:54         68       19.1
  End Snap:    4612  24-Jul-08 23:00:25         17        1.7
   Elapsed:           59.51 (mins)
   DB Time:          466.37 (mins)

Report B:
            Snap Id  Snap Time            Sessions  Curs/Sess
Begin Snap:    3098  13-Nov-07 21:00:37         39       13.6
  End Snap:    3102  13-Nov-07 22:00:15         40       16.4
   Elapsed:           59.63 (mins)
   DB Time:           19.49 (mins)
The server is an AIX system with 4 dual-core CPUs, 8 cores in total:
/sbin> bindprocessor -q
The available processors are: 0 1 2 3 4 5 6 7
Consider Report A first. The snapshot interval is about 60 minutes, so the CPUs had 60 × 8 = 480 minutes available in total, and DB Time is 466.37 minutes. That means the CPUs spent 466.37 minutes processing Oracle non-idle waits and operations (such as logical reads); in other words, about 466.37/480 ≈ 97% of CPU capacity went to Oracle foreground work, excluding background processes.
Looking at Report B, over roughly the same 60 minutes, only 19.49/480 ≈ 4% of CPU capacity was spent on Oracle operations.
Clearly, the average load of the server in Report B is very low.
From the Elapsed time and DB Time of an AWR report, you can get a rough idea of the database's load.
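A minimal sketch comparing the two reports, using the exact elapsed times from their headers:

```python
# Share of total CPU capacity consumed by Oracle foreground work.
CORES = 8

def cpu_busy_pct(elapsed_min: float, db_time_min: float) -> float:
    """DB Time as a percentage of total available CPU minutes."""
    return db_time_min / (elapsed_min * CORES) * 100

report_a = cpu_busy_pct(59.51, 466.37)   # heavily loaded interval, ~98%
report_b = cpu_busy_pct(59.63, 19.49)    # nearly idle interval, ~4%

print(f"Report A: {report_a:.0f}%  Report B: {report_b:.0f}%")
```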

But for batch systems, the database workload is concentrated in a particular period. If the snapshot window does not cover that period, or the window spans too long and includes a lot of database idle time, the analysis results are meaningless. This shows that choosing the analysis window is critical: you must pick a window that actually represents the performance problem.
[Screenshot: Cache Sizes section]
This section displays the size of each SGA region (after AMM has resized them), which can be compared with the initial parameter values.
The shared pool mainly consists of the library cache and the dictionary cache. The library cache stores recently parsed (or compiled) SQL, PL/SQL, and Java classes; the dictionary cache stores recently referenced data dictionary entries. A cache miss in the library cache or dictionary cache costs far more than a miss in the buffer cache, so the shared pool should be sized to keep recently used data cached.

[Screenshot: Load Profile section]
The load profile is most meaningful when compared with baseline data. If the per-second or per-transaction load does not change much, the application is running relatively stably. A single report only describes the application's load, and most of these figures have no "correct" value. However, Logons above 1-2 per second, Hard parses above 100 per second, or total parses above 300 per second suggest there may be contention issues.
Redo size: bytes of redo generated per second/per transaction; indicates how frequently data changes and how heavy the database workload is.
Logical reads: blocks logically read per second/per transaction. Logical Reads = Consistent Gets + DB Block Gets
Block changes: Number of blocks modified per second/transaction
Physical reads: Number of blocks physically read per second/transaction
Physical writes: Number of blocks physically written per second/transaction
User calls: Number of user calls per second/transaction
Parses: the number of SQL parses per second, the sum of fast parses, soft parses, and hard parses. More than 300 soft parses per second suggests the application is inefficient; consider adjusting session_cached_cursors. Here, a fast parse is a hit directly in the PGA (with session_cached_cursors=n set); a soft parse is a hit in the shared pool; a hard parse is a miss.
Hard parses: the number of hard parses. Too many hard parses indicate a low SQL reuse rate. More than 100 hard parses per second may mean bind variables are not used well or the shared pool is poorly configured. In that case the parameter cursor_sharing=similar|force can be enabled (its default is exact); note, however, that cursor_sharing=similar has bugs that can lead to poor execution plans.
Sorts: number of sorts per second/per transaction
Logons: number of logons per second/per transaction
Executes: number of SQL executions per second/per transaction
Transactions: number of transactions per second; reflects how heavy the database workload is.
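As a sketch, each Load Profile figure is just a statistic's delta between the two snapshots divided by elapsed seconds (per second) or by the transaction count (per transaction). The deltas and transaction count below are hypothetical:

```python
# Load Profile arithmetic: per-second / per-transaction rates from snapshot deltas.
elapsed_sec = 59.63 * 60          # snapshot interval (Report B header)
transactions = 7_200              # hypothetical commits + rollbacks in the interval

deltas = {                        # hypothetical statistic deltas
    "redo size (bytes)": 12_500_000,
    "logical reads (blocks)": 4_200_000,
    "user calls": 90_000,
}

for name, delta in deltas.items():
    print(f"{name}: {delta / elapsed_sec:,.1f}/s  {delta / transactions:,.1f}/txn")
```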

Blocks changed per Read: the percentage of logical reads that were for modifying data blocks.
Recursive Call: the percentage of recursive calls among all calls. If there is a lot of PL/SQL, this value will be high.
Rollback per transaction: the rollback rate per transaction. Check whether this rate is high, because rollbacks consume resources. A high rollback rate may indicate that the database performs too many invalid operations, and rollbacks can also bring Undo block contention. The formula is: Round(user rollbacks / (user commits + user rollbacks), 4) * 100%.
Rows per Sort: the average number of rows per sort
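The rollback-rate formula quoted above can be sketched as follows (the commit/rollback counts are hypothetical):

```python
# Rollback per transaction = Round(user rollbacks / (user commits + user rollbacks), 4) * 100
def rollback_rate(user_commits: int, user_rollbacks: int) -> float:
    total = user_commits + user_rollbacks
    return round(user_rollbacks / total, 4) * 100

# Hypothetical deltas: 9,800 commits and 200 rollbacks in the interval.
print(f"{rollback_rate(9_800, 200):.2f}%")   # prints 2.00%
```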
Note:
Oracle hard parsing and soft parsing
  Speaking of soft parse and hard parse, we first have to describe how Oracle processes SQL. When you issue a SQL statement to Oracle, before it is executed and results are returned, Oracle processes it in several steps:
  1. Syntax check: checks whether the SQL statement is grammatically well formed.
  2. Semantic check: for example, checks whether the objects accessed in the statement exist and whether the user has the corresponding privileges.
  3. Parse: uses internal algorithms to parse the SQL, generating a parse tree and an execution plan.
  4. Execute and return results.
  Soft and hard parsing occur in step 3.
  Oracle computes a hash of the SQL text and checks whether that hash value exists in the library cache;
  if it does, it compares the statement with the cached one;
  if they are the same, it reuses the existing parse tree and execution plan and skips the optimizer's work. This is soft parsing.
  If either of the two checks above fails, the optimizer must build a parse tree and generate an execution plan; this is hard parsing.
  Building a parse tree and generating an execution plan are expensive steps in SQL execution, so hard parsing should be avoided and soft parsing used as much as possible.
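A toy sketch of that lookup logic, with a plain dict standing in for the library cache (the real mechanism is far more involved, and the hash here is illustrative only):

```python
# Toy model of the library-cache lookup that decides soft vs hard parse.
import hashlib

library_cache: dict[str, str] = {}   # sql-text hash -> cached "execution plan"

def parse(sql: str) -> str:
    """Return 'soft' when the statement's plan is already cached, else 'hard'."""
    key = hashlib.sha256(sql.encode()).hexdigest()
    if key in library_cache:
        return "soft"                          # reuse parse tree / plan
    library_cache[key] = f"plan for: {sql}"    # optimizer work happens here
    return "hard"

print(parse("SELECT * FROM emp WHERE id = :1"))   # → hard (first time seen)
print(parse("SELECT * FROM emp WHERE id = :1"))   # → soft (cached)
print(parse("SELECT * FROM emp WHERE id = 42"))   # → hard (literal changes the text)
```

This also illustrates why bind variables matter: with a literal, every distinct value produces a different SQL text and therefore a hard parse.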
[Screenshot: Instance Efficiency Percentages section]
This section contains key Oracle memory hit ratios and the efficiency of other instance operations. Buffer Hit Ratio is also called Cache Hit Ratio, and Library Hit Ratio is also called Library Cache Hit Ratio. As with the Load Profile section, there are no "correct" values here; they can only be judged against the characteristics of the application. In a DSS environment using direct reads for large parallel queries, a Buffer Hit Ratio of 20% may be acceptable, while that value would be completely unacceptable for an OLTP system; per Oracle's experience, the Buffer Hit Ratio for an OLTP system should ideally be above 90%.
Buffer Nowait: the proportion of buffer gets satisfied without waiting. This value should generally be above 99%; otherwise there may be contention, which can be confirmed in the wait events later in the report.
Buffer Hit: the rate at which processes find data blocks in memory. Watching this value for significant changes is more important than the value itself. For a typical OLTP system, if it is below 80%, more memory should be allocated to the database; it should usually be above 95%, and if it falls below 90% you may need to increase db_cache_size. A high hit ratio does not necessarily mean the system is performing optimally: frequent access to a large number of non-selective indexes, for example, can produce a falsely high hit ratio (many db file sequential reads). A low hit ratio, however, generally does affect performance and needs attention. A sudden change in hit ratio is often bad news. If it suddenly rises, check the top buffer-gets SQL for statements and indexes causing heavy logical reads; if it suddenly drops, check the top physical-reads SQL for statements generating heavy physical reads, typically ones that no longer use an index or whose index was dropped.
Redo NoWait: the proportion of redo buffer allocations satisfied without waiting. If it is too low (use 90% as a reference threshold), consider increasing LOG_BUFFER. The redo buffer is flushed to the redo log files once it reaches 1 MB, so with a redo buffer larger than 1 MB, waits for buffer space are unlikely. A redo buffer of around 2 MB is common today, which is not large relative to total memory.
Library Hit: the rate at which Oracle finds an already-parsed SQL or PL/SQL statement in the library cache. When an application issues a SQL statement or calls a stored procedure, Oracle checks the library cache for a parsed version. If one exists, Oracle executes the statement immediately; if not, Oracle parses the statement and allocates a shared SQL area for it in the library cache. A low library hit ratio causes excessive parsing, increases CPU consumption, and reduces performance. If it is below 90%, the shared pool may need to be enlarged. The hit rate for statements in the shared area should usually stay above 95%; otherwise, consider increasing the shared pool, using bind variables, or adjusting parameters such as cursor_sharing.
Latch Hit: a latch is a lock protecting memory structures; it can be thought of as a server process's permission to access an in-memory data structure. Latch Hit should be above 99%; otherwise there is latch contention (for example, on the shared pool latch), possibly due to unshared SQL or an undersized library cache, which can be addressed with bind variables or a larger shared pool. Anything below 99% indicates serious performance issues. When this value is low, use the wait-event and latch sections later in the report to locate and solve the problem.
Parse CPU to Parse Elapsd: Parse CPU to Parse Elapsd % = 100 * (parse time cpu / parse time elapsed), i.e. the CPU time actually spent parsing divided by (CPU time spent parsing + time spent waiting for resources during parsing). Higher is better; a ratio of 100% means there was no waiting during parsing.
Non-Parse CPU: % Non-Parse CPU = round(100 * (1 - PARSE_CPU/TOT_CPU), 2), i.e. the share of CPU time spent actually executing SQL rather than parsing it. If this value is low, parsing consumes too much CPU time. When TOT_CPU is high relative to PARSE_CPU, the ratio approaches 100%, which is good: most of the CPU work is executing queries, not parsing them.
Execute to Parse: the ratio of executions to parses, Execute to Parse = 100 * (1 - Parses/Executions). High SQL reuse drives this value up: the higher it is, the more executions each parse serves. In this example, a parse is needed roughly every 5 executions. If Parses > Executions, the ratio goes negative; a negative value usually indicates a problem with the shared pool configuration or statement efficiency that causes repeated parsing. Severe re-parsing may be occurring, or the figure may be an artifact of the snapshot boundaries; either way it usually points to a database performance problem.
In-memory Sort: the proportion of sorts done in memory. If it is too low, many sorts are spilling to the temporary tablespace; consider increasing the PGA (10g). If it is below 95%, appropriately increasing the initialization parameter PGA_AGGREGATE_TARGET or SORT_AREA_SIZE can help. Note that the scopes of these two parameters differ: SORT_AREA_SIZE is set per session, while PGA_AGGREGATE_TARGET covers all sessions.
Soft Parse: the percentage of soft parses, softs/(softs + hards), roughly the hit ratio of SQL in the shared area. If it is low, the application should be adjusted to use bind variables. Below 95%, consider bind variables; below 80%, SQL is essentially not being reused.
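The parse-efficiency formulas above can be sketched together; the statistic values below are hypothetical:

```python
# Instance-efficiency parse ratios from hypothetical statistic values.
parses, executions = 1_000, 5_000
soft_parses, hard_parses = 950, 50
parse_cpu, parse_elapsed = 8.0, 10.0   # seconds spent parsing
tot_cpu = 200.0                        # total CPU seconds

execute_to_parse  = 100 * (1 - parses / executions)                  # ~80: 5 executions per parse
soft_parse_pct    = 100 * soft_parses / (soft_parses + hard_parses)  # 95
parse_cpu_to_elap = 100 * parse_cpu / parse_elapsed                  # 80: 20% of parse time was waiting
non_parse_cpu     = round(100 * (1 - parse_cpu / tot_cpu), 2)        # 96: parsing is 4% of CPU

print(execute_to_parse, soft_parse_pct, parse_cpu_to_elap, non_parse_cpu)
```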
[Screenshot: Shared Pool Statistics section]
Memory Usage %: for a database that has been running for some time, shared pool memory usage should stabilize between 75% and 90%. If the percentage is too low, the shared pool is oversized and wasted; the extra management overhead can even degrade performance under some conditions. If it is above 90%, the shared pool is contended and memory is insufficient: components are aged out of the pool, and SQL statements must be hard parsed when they are executed again.
SQL with executions>1: the proportion of SQL statements executed more than once. If this value is too small, the application should use bind variables more to avoid excessive SQL parsing. In a system that runs in cycles, interpret this number carefully: a different set of SQL statements runs during one part of the day than another, so the shared pool may hold statements that simply did not run during the observation window. The number approaches 100% only when the system runs the same set of statements continuously.
Memory for SQL w/exec>1: the percentage of memory consumed by SQL executed more than once, i.e. how much memory frequently used statements consume compared with infrequently used ones. It will generally be close to % SQL with executions > 1, unless certain queries consume memory irregularly. In steady state you will typically see about 75% to 85% of the shared pool in use over time. If the report's time window is large enough to cover all workload cycles, the percentage of SQL executed more than once should approach 100%. This statistic is affected by the interval between snapshots and can be expected to rise as that interval grows.
Oracle's instance-efficiency statistics give a general impression, but they cannot by themselves pinpoint performance problems; for that we mainly rely on the wait events that follow. Think of the two sections this way: the hit-ratio statistics help us find and predict problems the system may develop, so we can plan ahead; the wait events tell us the database has a performance problem right now that needs fixing, so they are remedial in nature.
[Screenshot: Top 5 Timed Events section]
This is the focus of the report: the 5 worst wait events in the system, listed in descending order of their share of wait time. When tuning, we always want the biggest payoff, so this is where we decide what to do next. For example, if 'buffer busy waits' is a serious wait event, continue into the Buffer Wait and File/Tablespace IO sections of the report to identify which files are causing the problem. If the worst waits are I/O events, study the SQL section sorted by physical reads to find the statements doing heavy I/O, and the Tablespace and I/O sections to spot files with slow response times. If latch waits are high, look at the detailed latch statistics to identify which latch is the problem. For a well-performing system, CPU time should appear near the top of the Top 5; otherwise your system spends most of its time waiting. Here, log file parallel write is a relatively large wait, taking up about 7% of the time. In a database without problems, CPU time is usually listed first.
