Vishal Desai’s Oracle Blog

August 26, 2016

Parallel Statement Queuing and parallel_force_local

Filed under: Parallel, Troubleshooting — vishaldesai @ 4:42 pm

We had a batch system that starts many concurrent sessions, all with a parallel(8) hint. With every release more concurrent sessions are added to the job, and at times a few sessions get downgraded to serial, which in turn impacts the batch runtime. A downgraded SQL that used to take a few minutes now runs for a couple of hours and does not complete. Most of the concurrent sessions finish within 1-5 minutes, so it would be ideal to queue a session until it can get all of its parallel slave processes instead of downgrading it and missing the SLA. We did not want all the features of automatic parallel tuning, so I decided to quickly test only a subset of them: parallel statement queuing.
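For the tests below, statement queuing was enabled on its own, without switching on full Auto DOP. A minimal sketch of how that is commonly done (the hidden parameter and the STATEMENT_QUEUING hint are assumptions about the exact mechanism, not the commands from the tests; check with Oracle Support before setting underscore parameters):

-- Enable only statement queuing, leaving parallel_degree_policy at MANUAL:
alter session set "_parallel_statement_queuing" = true;

-- Or request queuing per statement via hint (table name is illustrative):
select /*+ parallel(8) statement_queuing */ count(*) from some_batch_table;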

The operational warehouse system is a 5-node RAC (non-Exadata), the batch is tied to a service running on a single node, and parallel_force_local was set to TRUE for known reasons.

Test 1: There are two instances with a total of 16 slave processes and parallel_force_local set to TRUE. When we submit 4 concurrent sessions with parallel(4), there are downgrades and no queuing. When we submit 4 concurrent sessions with parallel(8), one session is queued, two are downgraded and one gets all 8 slaves.
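The queued, downgraded and full-DOP outcomes in these tests can be observed while the sessions run with something along these lines (an illustrative query; gv$sql_monitor exposes the requested and allocated PX server counts):

select inst_id, sid, sql_id, status,
       px_servers_requested, px_servers_allocated
  from gv$sql_monitor
 where status in ('QUEUED', 'EXECUTING')
 order by status, sid;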


 

Test 2: Now I turn off parallel_force_local. We don’t get queuing with 4 concurrent sessions and parallel(4), as there are 16 slaves available across the nodes. When we submit 6 concurrent sessions with parallel(4), we do get queuing, eventually every session gets the requested number of slaves, and there are no downgrades.


Test 3: If I turn parallel_force_local back on (same as Test 1), we get downgrades again.


 

Test 4: Now I leave parallel_force_local turned on but set parallel_min_servers, parallel_max_servers and parallel_servers_target to 0 on the second instance, so there are only 8 slaves to serve requests across the cluster. Now I get queuing for 4 concurrent sessions with parallel(4) and with parallel(6).


Test 5: Now I tried setting the degree of parallelism through a parameter instead of a hint and got the same behavior.


Based on the above observations, I think the statement queuing logic does not take parallel_force_local = TRUE into consideration, or always assumes it is FALSE. When we submit concurrent requests, Oracle thinks there are slaves available (across the cluster), but when it actually executes the SQL, parallel_force_local = TRUE forces it to get all slaves from the same node; there are not enough slaves there, so the statement gets downgraded.

I will check with Oracle support to find out whether it is a bug or a design limitation and update this blog.

Update from Oracle support:

Bug 21438369 is fixed in 12.2

If the system is under a high load of PFL queries, then you may see that parallel_servers_target is not honored by Resource Manager, or you may even see downgrades.

Parallel statement queuing did not conform to parallel_servers_target when a lot of parallel_force_local (PFL) queries were run at the same time. This was happening due to a stale published load value: kxfpinaslv.

It looks like there are some ongoing problems with these settings.

Instead of the parallel_force_local = TRUE setting, development suggests the following: if you want to restrict certain jobs to certain nodes, then
– Create a service for each job type.
– Enable the service on the nodes where it should run.
– Create a consumer group for each service and map the service to the consumer group (a sketch of this approach follows below).
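A minimal sketch of that suggestion, assuming the service is created with DBMS_SERVICE and mapped with DBMS_RESOURCE_MANAGER (the service and consumer group names are illustrative; on RAC the service would normally be added and placed on its preferred nodes with srvctl):

-- Create a service for the batch job type (on RAC: srvctl add service -d <db> -s BATCH_SVC ...):
exec dbms_service.create_service(service_name => 'BATCH_SVC', network_name => 'BATCH_SVC');

-- Map the service to its own consumer group:
begin
  dbms_resource_manager.create_pending_area;
  dbms_resource_manager.create_consumer_group(
    consumer_group => 'BATCH_CG',
    comment        => 'Batch jobs connecting through BATCH_SVC');
  dbms_resource_manager.set_consumer_group_mapping(
    attribute      => dbms_resource_manager.service_name,
    value          => 'BATCH_SVC',
    consumer_group => 'BATCH_CG');
  dbms_resource_manager.submit_pending_area;
end;
/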

June 1, 2016

Load Balancing maintenance tasks on RAC nodes using dbms_scheduler

Filed under: Tools — vishaldesai @ 2:04 am

In a shared environment with instance caging turned on, I had to perform maintenance during a downtime window to reorganize a few tables that are populated by AQ. I wanted to load balance the sub-tasks, use all available resources, and complete the work quickly. Here is a small snippet of code I wrote to load balance tasks across multiple nodes in a RAC cluster.

Create the job table and populate it with sub-tasks as shown below. I didn’t have huge variation in table sizes, so order was not that important. But if there is a huge outlier table in the list, put it first; otherwise you may end up with one job that starts towards the end and runs for a long time.

drop table job_table purge;
create table job_table( task_no number primary key, task varchar2(4000),status char(1)) tablespace users;
--status null - work to do, I - in progress, S - Success, F - Failure

insert into job_table values(1   , 'begin execute immediate ''alter table  schema_name.table_name1  move parallel 8 compress for oltp''; update job_table set status=''S'' where task_no=1   ; commit; end;','');
insert into job_table values(2   , 'begin execute immediate ''alter table  schema_name.table_name2  move parallel 8 compress for oltp''; update job_table set status=''S'' where task_no=2   ; commit; end;','');
insert into job_table values(3   , 'begin execute immediate ''alter table  schema_name.table_name3  move parallel 8 compress for oltp''; update job_table set status=''S'' where task_no=3   ; commit; end;','');
insert into job_table values(4   , 'begin execute immediate ''alter table  schema_name.table_name4  move parallel 8 compress for oltp''; update job_table set status=''S'' where task_no=4   ; commit; end;','');
insert into job_table values(5   , 'begin execute immediate ''alter table  schema_name.table_name5  move parallel 8 compress for oltp''; update job_table set status=''S'' where task_no=5   ; commit; end;','');
insert into job_table values(6   , 'begin execute immediate ''alter table  schema_name.table_name6  move parallel 8 compress for oltp''; update job_table set status=''S'' where task_no=6   ; commit; end;','');
insert into job_table values(7   , 'begin execute immediate ''alter table  schema_name.table_name7  move parallel 8 compress for oltp''; update job_table set status=''S'' where task_no=7   ; commit; end;','');
insert into job_table values(8   , 'begin execute immediate ''alter table  schema_name.table_name8  move parallel 8 compress for oltp''; update job_table set status=''S'' where task_no=8   ; commit; end;','');
insert into job_table values(9   , 'begin execute immediate ''alter table  schema_name.table_name9  move parallel 8 compress for oltp''; update job_table set status=''S'' where task_no=9   ; commit; end;','');
insert into job_table values(10  , 'begin execute immediate ''alter table  schema_name.table_name10 move parallel 8 compress for oltp''; update job_table set status=''S'' where task_no=10  ; commit; end;','');
insert into job_table values(11  , 'begin execute immediate ''alter table  schema_name.table_name11 move parallel 8 compress for oltp''; update job_table set status=''S'' where task_no=11  ; commit; end;','');
insert into job_table values(12  , 'begin execute immediate ''alter table  schema_name.table_name12 move parallel 8 compress for oltp''; update job_table set status=''S'' where task_no=12  ; commit; end;','');
insert into job_table values(13  , 'begin execute immediate ''alter table  schema_name.table_name13 move parallel 8 compress for oltp''; update job_table set status=''S'' where task_no=13  ; commit; end;','');
insert into job_table values(14  , 'begin execute immediate ''alter table  schema_name.table_name14 move parallel 8 compress for oltp''; update job_table set status=''S'' where task_no=14  ; commit; end;','');
insert into job_table values(15  , 'begin execute immediate ''alter table  schema_name.table_name15 move parallel 8 compress for oltp''; update job_table set status=''S'' where task_no=15  ; commit; end;','');
insert into job_table values(16  , 'begin execute immediate ''alter table  schema_name.table_name16 move parallel 8 compress for oltp''; update job_table set status=''S'' where task_no=16  ; commit; end;','');
insert into job_table values(17  , 'begin execute immediate ''alter table  schema_name.table_name17 move parallel 8 compress for oltp''; update job_table set status=''S'' where task_no=17  ; commit; end;','');
insert into job_table values(18  , 'begin execute immediate ''alter table  schema_name.table_name18 move parallel 8 compress for oltp''; update job_table set status=''S'' where task_no=18  ; commit; end;','');
insert into job_table values(19  , 'begin execute immediate ''alter table  schema_name.table_name19 move parallel 8 compress for oltp''; update job_table set status=''S'' where task_no=19  ; commit; end;','');
insert into job_table values(20  , 'begin execute immediate ''alter table  schema_name.table_name20 move parallel 8 compress for oltp''; update job_table set status=''S'' where task_no=20  ; commit; end;','');
insert into job_table values(21  , 'begin execute immediate ''alter table  schema_name.table_name21 move parallel 8 compress for oltp''; update job_table set status=''S'' where task_no=21  ; commit; end;','');
insert into job_table values(22  , 'begin execute immediate ''alter table  schema_name.table_name22 move parallel 8 compress for oltp''; update job_table set status=''S'' where task_no=22  ; commit; end;','');
insert into job_table values(23  , 'begin execute immediate ''alter table  schema_name.table_name23 move parallel 8 compress for oltp''; update job_table set status=''S'' where task_no=23  ; commit; end;','');
insert into job_table values(24  , 'begin execute immediate ''alter table  schema_name.table_name24 move parallel 8 compress for oltp''; update job_table set status=''S'' where task_no=24  ; commit; end;','');

commit;
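Instead of hand-writing one INSERT per table, the task rows can also be generated from the dictionary, largest tables first, so the biggest jobs start early (a sketch; the schema name and the dba_segments join are illustrative):

insert into job_table (task_no, task, status)
select rownum,
       'begin execute immediate ''alter table ' || owner || '.' || table_name ||
       ' move parallel 8 compress for oltp''; update job_table set status=''S'' where task_no=' ||
       rownum || '; commit; end;',
       null
  from (select t.owner, t.table_name
          from dba_tables t, dba_segments s
         where t.owner = 'SCHEMA_NAME'
           and s.owner = t.owner
           and s.segment_name = t.table_name
           and s.segment_type = 'TABLE'
         order by s.bytes desc);

commit;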

 

Set the number of instances for your environment, the number of sub-tasks that you want to run simultaneously on each node, and the sleep interval, then execute the PL/SQL block.

DECLARE
   --how many instances you want to use
   v_instances            NUMBER := 3;
   --jobs on each instance. Each job is a task from job_table
   v_jobs                 NUMBER := 2;

   v_available_instance   NUMBER := NULL;
   v_available_jobs       NUMBER := NULL;
   v_job_name             varchar2(100);
   c                      number;

   CURSOR c1
   IS
      SELECT *
        FROM job_table
      ;
BEGIN
   FOR v1 IN c1
   LOOP
      --start find available instance

         DECLARE
            CURSOR c2
            IS
               SELECT running_instance, jobs running_jobs
                 FROM (SELECT a.running_instance, a.jobs - b.jobs jobs
                         FROM (    SELECT ROWNUM running_instance, v_jobs jobs
                                     FROM DUAL
                               CONNECT BY LEVEL <= v_instances) a,
                              (  SELECT running_instance,
                                        COUNT (running_instance) jobs
                                   FROM DBA_SCHEDULER_RUNNING_JOBS
                                  WHERE job_name LIKE '%MYJOB%'
                               GROUP BY running_instance) b
                        WHERE a.running_instance = b.running_instance(+))
                WHERE jobs IS NULL OR jobs > 0;

            v_running_instance   number;
            v_running_jobs       number;
         BEGIN
            -- Fetch only the first instance that still has a free job slot.
            OPEN c2;
            FETCH c2 INTO v_running_instance, v_running_jobs;

            v_available_instance := v_running_instance;
            v_available_jobs := v_running_jobs;

            -- Create a scheduler job for this task, pin it to the chosen
            -- instance, mark the task as in progress and give the job a
            -- moment to start before rechecking the load.
            v_job_name := DBMS_SCHEDULER.generate_job_name ('MYJOB_');

            DBMS_SCHEDULER.create_job (
               job_name              => v_job_name,
               job_type              => 'PLSQL_BLOCK',
               job_action            => v1.task,
               comments              =>    'Instance '
                                        || v_available_instance
                                        || ' task no '
                                        || v1.task_no,
               number_of_arguments   => 0,
               start_date            => SYSTIMESTAMP,
               repeat_interval       => NULL,
               end_date              => NULL,
               enabled               => FALSE,
               auto_drop             => TRUE);

            DBMS_SCHEDULER.set_attribute (
               name      => v_job_name,
               attribute => 'instance_id',
               value     => v_available_instance);

            DBMS_SCHEDULER.enable (name => v_job_name);

            UPDATE job_table
               SET status = 'I'
             WHERE task_no = v1.task_no;

            COMMIT;
            dbms_lock.sleep (2);

            CLOSE c2;
         END;

        -- Wait until at least one instance has a free job slot before picking
        -- up the next task from the cursor.
        LOOP
            dbms_lock.sleep (5);

            SELECT COUNT (1)
              INTO c
              FROM (SELECT a.running_instance, a.jobs - b.jobs jobs
                      FROM (    SELECT ROWNUM running_instance, v_jobs jobs
                                  FROM DUAL
                            CONNECT BY LEVEL <= v_instances) a,
                           (  SELECT running_instance,
                                     COUNT (running_instance) jobs
                                FROM DBA_SCHEDULER_RUNNING_JOBS
                               WHERE job_name LIKE '%MYJOB%'
                            GROUP BY running_instance) b
                     WHERE a.running_instance = b.running_instance(+))
             WHERE jobs IS NULL OR jobs > 0;

            EXIT WHEN c > 0;
        END LOOP;
            
        
   END LOOP;
END;
/

 

Now sit back, relax, and use your favorite monitoring tool along with the output from the following SQL statements until the work is complete.

select * from DBA_SCHEDULER_JOB_RUN_DETAILS where job_name like '%MYJOB%' order by actual_start_date desc;

select * from DBA_SCHEDULER_RUNNING_JOBS where job_name like '%MYJOB%';

select * from dba_scheduler_jobs where job_name like '%MYJOB%';
select * from job_table order by status;
select status,count(1) from job_table group by status;

-- Stop or drop an individual job if needed (job names below are examples):
exec dbms_scheduler.stop_job('MYJOB_11014');
exec dbms_scheduler.drop_job('MYJOB_10997');

 SELECT running_instance, jobs running_jobs
                 FROM (SELECT a.running_instance, a.jobs - b.jobs jobs
                         FROM (    SELECT ROWNUM running_instance, 1 jobs
                                     FROM DUAL
                               CONNECT BY LEVEL <= 3) a,
                              (  SELECT running_instance,
                                        COUNT (running_instance) jobs
                                   FROM DBA_SCHEDULER_RUNNING_JOBS
                                  WHERE job_name LIKE '%MYJOB%'
                               GROUP BY running_instance) b
                        WHERE a.running_instance = b.running_instance(+))
                WHERE jobs IS NULL OR jobs > 0;

I used a similar approach for other long-running tasks such as index rebuilds and gathering statistics on tables and indexes. Other post-tasks that do not take long, such as setting the table and index degree back to 1, were run from a regular sqlplus script.
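For example, an index rebuild task row would follow the same pattern (the index name and task number are illustrative):

insert into job_table values(25, 'begin execute immediate ''alter index schema_name.index_name1 rebuild parallel 8''; update job_table set status=''S'' where task_no=25; commit; end;','');
commit;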

November 6, 2015

SCHEMATRANDATA, NOVALIDATE Primary Key and Supplemental logging

Filed under: Golden Gate — vishaldesai @ 10:16 pm

We had SCHEMATRANDATA enabled in our Golden Gate configuration, and in a recent release a developer decided to re-enable the primary key constraint with NOVALIDATE: we were only converting a regular index to a hash partitioned index, so their thought was that there was no point in validating the existing records again. But because the constraint was enabled with the NOVALIDATE option, SCHEMATRANDATA caused supplemental logging to be enabled on all columns of some very active tables. As a result, an update of a single column captured all columns in the Golden Gate trail and caused lag on the replicat side, since single-column updates got converted into multi-column updates.
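A quick way to find primary or unique key constraints that were left enabled but not validated, and are therefore candidates for this behavior, is a dictionary query like the following (the schema name is illustrative):

select owner, table_name, constraint_name, status, validated
  from dba_constraints
 where owner = 'DEAL'
   and constraint_type in ('P', 'U')
   and validated = 'NOT VALIDATED';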

Golden Gate configuration:

GGSCI > info schematrandata DEAL

2015-11-06 20:01:56  INFO    OGG-06480  Schema level supplemental logging, excluding non-validated keys, is enabled on schema DEAL.

2015-11-06 20:01:56  INFO    OGG-01980  Schema level supplemental logging is enabled on schema DEAL for all scheduling columns.

Test Case:

SQL>DROP TABLE DEAL.t;

Table dropped.

SQL>
SQL>CREATE TABLE DEAL.t
    AS
       SELECT *
         FROM dba_objects
        WHERE object_id IS NOT NULL;

Table created.

SQL>
SQL>ALTER TABLE DEAL.t MODIFY object_id NOT NULL;

Table altered.

SQL>
SQL>SELECT * FROM TABLE (logmnr$always_suplog_columns ('DEAL', 'T'));

OWNER                          TABLE_NAME                     COLUMN_NAME
------------------------------ ------------------------------ ------------------------
DEAL                           T                              OWNER
DEAL                           T                              OBJECT_TYPE
DEAL                           T                              TEMPORARY
DEAL                           T                              SECONDARY
DEAL                           T                              OBJECT_NAME
DEAL                           T                              NAMESPACE
DEAL                           T                              OBJECT_ID
DEAL                           T                              DATA_OBJECT_ID
DEAL                           T                              LAST_DDL_TIME
DEAL                           T                              SUBOBJECT_NAME
DEAL                           T                              CREATED
DEAL                           T                              TIMESTAMP
DEAL                           T                              STATUS
DEAL                           T                              GENERATED
DEAL                           T                              EDITION_NAME

15 rows selected.

SQL>
SQL>CREATE UNIQUE INDEX DEAL.tpk  ON DEAL.t (object_id);

Index created.

SQL>ALTER TABLE DEAL.t ADD CONSTRAINT tpk PRIMARY KEY(object_id) USING INDEX;

Table altered.

SQL>
SQL>SELECT * FROM TABLE (logmnr$always_suplog_columns ('DEAL', 'T'));

OWNER                          TABLE_NAME                     COLUMN_NAME
------------------------------ ------------------------------ ------------------------
DEAL                           T                              OBJECT_ID

SQL>
SQL>ALTER TABLE DEAL.t DISABLE  CONSTRAINT tpk;

Table altered.

SQL>
SQL>SELECT * FROM TABLE (logmnr$always_suplog_columns ('DEAL', 'T'));

OWNER                          TABLE_NAME                     COLUMN_NAME
------------------------------ ------------------------------ ------------------------
DEAL                           T                              OBJECT_ID

SQL>DROP INDEX DEAL.tpk;

Index dropped.

SQL>
SQL>SELECT * FROM TABLE (logmnr$always_suplog_columns ('DEAL', 'T'));

OWNER                          TABLE_NAME                     COLUMN_NAME
------------------------------ ------------------------------ ------------------------
DEAL                           T                              OWNER
DEAL                           T                              OBJECT_TYPE
DEAL                           T                              TEMPORARY
DEAL                           T                              SECONDARY
DEAL                           T                              OBJECT_NAME
DEAL                           T                              NAMESPACE
DEAL                           T                              OBJECT_ID
DEAL                           T                              DATA_OBJECT_ID
DEAL                           T                              LAST_DDL_TIME
DEAL                           T                              SUBOBJECT_NAME
DEAL                           T                              CREATED
DEAL                           T                              TIMESTAMP
DEAL                           T                              STATUS
DEAL                           T                              GENERATED
DEAL                           T                              EDITION_NAME

15 rows selected.

SQL>CREATE UNIQUE INDEX DEAL.tpk  ON DEAL.t (object_id) global PARTITION BY HASH (object_id)
  2    (PARTITION p1  ,  PARTITION p2);

Index created.

SQL>SELECT * FROM TABLE (logmnr$always_suplog_columns ('DEAL', 'T'));

OWNER                          TABLE_NAME                     COLUMN_NAME
------------------------------ ------------------------------ ------------------------------
DEAL                           T                              OBJECT_ID

SQL>ALTER TABLE DEAL.t ENABLE NOVALIDATE CONSTRAINT tpk USING INDEX;

Table altered.

SQL>SELECT * FROM TABLE (logmnr$always_suplog_columns ('DEAL', 'T'));

OWNER                          TABLE_NAME                     COLUMN_NAME
------------------------------ ------------------------------ -------------------------
DEAL                           T                              OWNER
DEAL                           T                              OBJECT_TYPE
DEAL                           T                              TEMPORARY
DEAL                           T                              SECONDARY
DEAL                           T                              OBJECT_NAME
DEAL                           T                              NAMESPACE
DEAL                           T                              OBJECT_ID
DEAL                           T                              DATA_OBJECT_ID
DEAL                           T                              LAST_DDL_TIME
DEAL                           T                              SUBOBJECT_NAME
DEAL                           T                              CREATED
DEAL                           T                              TIMESTAMP
DEAL                           T                              STATUS
DEAL                           T                              GENERATED
DEAL                           T                              EDITION_NAME

15 rows selected.

GGSCI > info trandata deal.t

2015-11-06 20:12:13  INFO    OGG-06480  Schema level supplemental logging, excluding non-validated keys, is enabled on schema DEAL.

2015-11-06 20:12:13  INFO    OGG-01980  Schema level supplemental logging is enabled on schema DEAL for all scheduling columns.

Logging of supplemental redo log data is enabled for table DEAL.T.

Columns supplementally logged for table DEAL.T: CREATED, DATA_OBJECT_ID, EDITION_NAME, GENERATED, LAST_DDL_TIME, NAMESPACE, OBJECT_ID, OBJECT_NAME, OBJECT_TYPE, OWNER, SECONDARY, STATUS, SUBOBJECT_NAME, TEMPORARY, TIMESTAMP.


SQL>ALTER TABLE DEAL.T  ENABLE  CONSTRAINT tpk USING INDEX;

Table altered.

SQL>SELECT * FROM TABLE (logmnr$always_suplog_columns ('DEAL', 'T'));

OWNER                          TABLE_NAME                     COLUMN_NAME
------------------------------ ------------------------------ -------------------------
DEAL                           T                              OBJECT_ID

GGSCI > info trandata deal.t

2015-11-06 20:12:39  INFO    OGG-06480  Schema level supplemental logging, excluding non-validated keys, is enabled on schema DEAL.

2015-11-06 20:12:39  INFO    OGG-01980  Schema level supplemental logging is enabled on schema DEAL for all scheduling columns.

Logging of supplemental redo log data is enabled for table DEAL.T.

Columns supplementally logged for table DEAL.T: OBJECT_ID.

May 11, 2015

Filter predicates after MIN aggregation

Filed under: Performance Tuning, SQL — vishaldesai @ 8:41 pm

One OLTP SQL on Exadata was consuming a lot of CPU and was running for about 8 seconds per execution. Most of the CPU activity was against the bottom half of the plan, as the real time SQL monitoring report showed.

 


To demonstrate this, I created a simple test case using the following code. PCTFREE 50, the wide c3 column and multiple 90000-row inserts are used just to build a reasonably large table:

SQL> drop table tab purge;
drop table tab purge
           *
ERROR at line 1:
ORA-00942: table or view does not exist


SQL> create table tab (c1 number, c2 date, c3 char(2000)) pctfree 50;

Table created.

SQL> insert into tab select rownum, sysdate-rownum, 'X' from dual connect by level <=90000;

90000 rows created.

SQL> insert into tab select rownum, sysdate-rownum-3, 'X' from dual connect by level <=90000;

90000 rows created.

SQL> insert into tab select rownum, sysdate-rownum-2, 'X' from dual connect by level <=90000;

90000 rows created.

SQL> insert into tab select rownum, sysdate-rownum-1, 'X' from dual connect by level <=90000;

90000 rows created.

SQL> insert into tab select rownum, sysdate-rownum+2, 'X' from dual connect by level <=90000;

90000 rows created.

SQL> insert into tab select rownum, sysdate-rownum+2, 'X' from dual connect by level <=90000;

90000 rows created.

SQL> commit;

Commit complete.

SQL> create index i_c1 on tab(c1);

Index created.

SQL> create index i_c2 on tab(c2);

Index created.

SQL> exec dbms_stats.gather_table_stats(USER,'TAB',cascade=>true,estimate_percent=>100);

PL/SQL procedure successfully completed.

The original SQL calculated the MIN of the date column grouped by column c1 and then applied a filter against that MIN for a narrow date range. So the majority of rows were filtered out only after the full table scan and aggregation. As you can see, it did 540K buffer gets to return 5 records. As this database is on Exadata, it performs the 540K physical reads very quickly due to smart scan, but in the original SQL 43% of the CPU activity was against the full table scan operation.

 

SQL> alter session set statistics_level=all;
SQL>
SQL> select c1, c2 from
  2  (select c1,min(c2) c2
  3  from tab
  4  group by c1)
  5  where c2>=sysdate-15 and c2<=sysdate-10
  6  order by 1,2;

        C1 C2
---------- ---------
         7 01-MAY-15
         8 30-APR-15
         9 29-APR-15
        10 28-APR-15
        11 27-APR-15
SQL>
SQL> select * from table(dbms_xplan.display_cursor(format=>'allstats last'));

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  58qhpx48c18v9, child number 0
-------------------------------------
select c1, c2 from (select c1,min(c2) c2 from tab group by c1) where
c2>=sysdate-15 and c2<=sysdate-10 order by 1,2

Plan hash value: 443431404

-----------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |      |      1 |        |      5 |00:00:00.69 |     540K|    540K|       |       |          |
|   1 |  SORT ORDER BY               |      |      1 |   4500 |      5 |00:00:00.69 |     540K|    540K|  2048 |  2048 | 2048  (0)|
|*  2 |   FILTER                     |      |      1 |        |      5 |00:00:00.69 |     540K|    540K|       |       |          |
|   3 |    HASH GROUP BY             |      |      1 |   4500 |  90000 |00:00:00.68 |     540K|    540K|    31M|  8462K| 5717K (0)|
|   4 |     TABLE ACCESS STORAGE FULL| TAB  |      1 |    540K|    540K|00:00:00.47 |     540K|    540K|  1025K|  1025K| 7199K (0)|
-----------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter((MIN("C2")>=SYSDATE@!-15 AND MIN("C2")<=SYSDATE@!-10))

 

Column C1 is not unique, so we cannot simply apply the filter prior to the MIN. As you can see below, consistent gets have gone down significantly due to the index range scan, but the SQL is not returning the same output. If the filter predicate is applied first, it returns the records that we need plus records that we don’t (false positives).

SQL> select c1,min(c2)
  2  from tab
  3  where c2>=sysdate-15 and c2<=sysdate-10
  4  group by c1
  5  order by 1,2;

        C1 MIN(C2)
---------- -----------------
         7 05/01/15 18:42:40
         8 04/30/15 18:42:40
         9 04/29/15 18:42:40
        10 04/28/15 18:42:40
        11 04/27/15 18:42:40
        12 04/27/15 18:43:15
        13 04/27/15 18:43:50
        14 04/27/15 18:42:04
        15 04/28/15 18:44:25
        16 04/27/15 18:44:25
SQL> select * from table(dbms_xplan.display_cursor(format=>'allstats last'));

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------------------
SQL_ID  7fwwz4v1a9fx4, child number 0
-------------------------------------
select c1,min(c2) from tab where c2>=sysdate-15 and c2<=sysdate-10
group by c1 order by 1,2

Plan hash value: 1451778582

-------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |      |      1 |        |     10 |00:00:00.01 |      33 |
|   1 |  SORT ORDER BY                 |      |      1 |     32 |     10 |00:00:00.01 |      33 |
|   2 |   HASH GROUP BY                |      |      1 |     32 |     10 |00:00:00.01 |      33 |
|*  3 |    FILTER                      |      |      1 |        |     30 |00:00:00.01 |      33 |
|   4 |     TABLE ACCESS BY INDEX ROWID| TAB  |      1 |     32 |     30 |00:00:00.01 |      33 |
|*  5 |      INDEX RANGE SCAN          | I_C2 |      1 |     32 |     30 |00:00:00.01 |       3 |
-------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter(SYSDATE@!-10>=SYSDATE@!-15)
   5 - access("C2">=SYSDATE@!-15 AND "C2"<=SYSDATE@!-10)

For such a SQL construct, we can still apply the filter predicates first and then remove the records that should not be returned (false positives) using a subquery, as shown below. As you can see, consistent gets have gone down, the indexes are being used, the response time has improved, CPU activity has also gone down, and we are getting the same output.

 

SQL> select  t1.c1, t1.c2 from
  2  (select c1,min(c2) c2
  3  from tab
  4  where c2>=sysdate-15 and c2<=sysdate-10
  5  group by c1) t1
  6  where not exists (select c1 from tab t2 where t1.c1 = t2.c1 and t1.c2 > t2.c2 and t2.c2<=sysdate-10)
  7  order by 1,2;

        C1 C2
---------- -----------------
         7 05/01/15 18:42:40
         8 04/30/15 18:42:40
         9 04/29/15 18:42:40
        10 04/28/15 18:42:40
        11 04/27/15 18:42:40
SQL>
SQL> select * from table(dbms_xplan.display_cursor(format=>'allstats last'));

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------------------------------------
SQL_ID  g0agfd3mm02qk, child number 0
-------------------------------------
select  t1.c1, t1.c2 from (select c1,min(c2) c2 from tab where
c2>=sysdate-15 and c2<=sysdate-10 group by c1) t1 where not exists
(select c1 from tab t2 where t1.c1 = t2.c1 and t1.c2 > t2.c2 and
t2.c2<=sysdate-10) order by 1,2

Plan hash value: 327191261

-------------------------------------------------------------------------------------------------------------------
| Id  | Operation                        | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem |
-------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                 |      |      1 |        |      5 |00:00:00.01 |      84 |       |       |
|   1 |  SORT ORDER BY                   |      |      1 |     30 |      5 |00:00:00.01 |      84 |  2048 |  2048 |
|   2 |   NESTED LOOPS ANTI              |      |      1 |     30 |      5 |00:00:00.01 |      84 |       |       |
|   3 |    VIEW                          |      |      1 |     32 |     10 |00:00:00.01 |      33 |       |       |
|   4 |     HASH GROUP BY                |      |      1 |     32 |     10 |00:00:00.01 |      33 |  3331K|  3331K|
|*  5 |      FILTER                      |      |      1 |        |     30 |00:00:00.01 |      33 |       |       |
|   6 |       TABLE ACCESS BY INDEX ROWID| TAB  |      1 |     32 |     30 |00:00:00.01 |      33 |       |       |
|*  7 |        INDEX RANGE SCAN          | I_C2 |      1 |     32 |     30 |00:00:00.01 |       3 |  1025K|  1025K|
|*  8 |    TABLE ACCESS BY INDEX ROWID   | TAB  |     10 |  26996 |      5 |00:00:00.01 |      51 |       |       |
|*  9 |     INDEX RANGE SCAN             | I_C1 |     10 |      6 |     38 |00:00:00.01 |      13 |  1025K|  1025K|
-------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - filter(SYSDATE@!-10>=SYSDATE@!-15)
   7 - access("C2">=SYSDATE@!-15 AND "C2"<=SYSDATE@!-10)
   8 - filter(("T2"."C2"<=SYSDATE@!-10 AND "T1"."C2">"T2"."C2"))
   9 - access("T1"."C1"="T2"."C1")
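One way to gain confidence that the rewrite is equivalent is to MINUS the two result sets against each other (a sanity-check sketch built from the two queries above; run it in both directions and both should return no rows):

select c1, c2
  from (select c1, min(c2) c2 from tab group by c1)
 where c2 >= sysdate - 15 and c2 <= sysdate - 10
minus
select t1.c1, t1.c2
  from (select c1, min(c2) c2
          from tab
         where c2 >= sysdate - 15 and c2 <= sysdate - 10
         group by c1) t1
 where not exists (select null
                     from tab t2
                    where t1.c1 = t2.c1
                      and t1.c2 > t2.c2
                      and t2.c2 <= sysdate - 10);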

April 24, 2015

Wait Chain analysis using Tanel’s awesome ash_wait_chains script

Filed under: Oracle Database, Troubleshooting — Tags: — vishaldesai @ 7:14 pm

Our database performance engineering team was trying to troubleshoot high wait times on enq: SQ - contention and was working with Oracle support, starting from the Active Sessions chart and the AWR report.

 


I thought it could be a classic case of sequence concurrency in a RAC environment. I started looking at dba_hist_active_sess_history and found that all the sessions were waiting on the same sequence object id (p2=362). Object id 362 belongs to the sequence AUDSES$, which is used by AUD$, and the sequence cache size was already set to 10000. In previous releases there were a couple of bugs where the sequence cache for AUDSES$ was not high enough and the workaround was to increase the cache size, but this particular database is on 11.2.0.4 and the cache size is already 10000, so there was no reason to believe we were hitting those bugs.
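A query of this shape can be used to confirm which sequence the SQ enqueue waits point at (an illustrative sketch; for enq: SQ - contention the p2 column carries the sequence object id):

select o.owner, o.object_name, count(*) samples
  from dba_hist_active_sess_history h, dba_objects o
 where h.event = 'enq: SQ - contention'
   and o.object_id = h.p2
 group by o.owner, o.object_name
 order by samples desc;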


I then manually started mining ASH for a sample session, following the blocking session chain, which led me all the way from one instance to the other instance’s LGWR session.


So for session 1286 and a few others, the wait chain ended at the LGWR process (session 1262) on instance 2.

Then I realized that, instead of doing this manually, I could use Tanel’s ash_wait_chains.sql to see the overall impact. As you can see from the top of the output, more than 70% of the impact was along the same chain.

 

@dash_wait_chains username||':'||event  1=1 1596 1596

%This     SECONDS WAIT_CHAIN
------ ---------- ----------------------------------------------------------------------------------------------------------------------------------------------------
  70%   105023520 -> SYS:enq: SQ - contention -> SYS:row cache lock -> SYS:log file switch (archiving needed) -> SYS:control file sequential read
   9%    13698720 -> SYS:enq: SQ - contention -> SYS:row cache lock -> SYS:log file switch (archiving needed) -> SYS:Disk file Mirror Read
   6%     9803520 -> SYS:enq: SQ - contention -> SYS:log file switch (archiving needed) -> SYS:control file sequential read
   6%     9132480 -> SYS:enq: SQ - contention -> SYS:row cache lock -> SYS:log file switch (archiving needed) -> SYS:
   3%     4566240 -> SYS:enq: SQ - contention -> SYS:row cache lock -> SYS:log file switch (archiving needed) -> SYS:enq: CF - contention
   1%     1887840 -> SYS:buffer busy waits -> SYS:log file switch (archiving needed) -> SYS:control file sequential read
   1%     1278720 -> SYS:enq: SQ - contention -> SYS:log file switch (archiving needed) -> SYS:Disk file Mirror Read
   1%     1159430 -> DBFS_dbname01:buffer busy waits -> DBFS_dbname01:log file switch (archiving needed) -> SYS:control file sequential read
   1%      852480 -> SYS:enq: SQ - contention -> SYS:log file switch (archiving needed) -> SYS:
   0%      695520 -> DBSNMP:buffer busy waits -> ETL_USER:log file switch (archiving needed) -> SYS:control file sequential read
   0%      695520 -> SYS:row cache lock -> SYS:log file switch (archiving needed) -> SYS:control file sequential read
   0%      426240 -> SYS:enq: SQ - contention -> SYS:log file switch (archiving needed) -> SYS:enq: CF - contention
   0%      246240 -> SYS:buffer busy waits -> SYS:log file switch (archiving needed) -> SYS:Disk file Mirror Read
   0%      195500 -> SYS:log file switch (archiving needed) -> SYS:control file sequential read
   0%      182160 -> ETL_USER:log file switch (archiving needed) -> SYS:control file sequential read
   0%      164160 -> SYS:buffer busy waits -> SYS:log file switch (archiving needed) -> SYS:
   0%      151230 -> DBFS_dbname01:buffer busy waits -> DBFS_dbname01:log file switch (archiving needed) -> SYS:Disk file Mirror Read
   0%      100820 -> DBFS_dbname01:buffer busy waits -> DBFS_dbname01:log file switch (archiving needed) -> SYS:
   0%       90720 -> SYS:row cache lock -> SYS:log file switch (archiving needed) -> SYS:Disk file Mirror Read
   0%       90720 -> DBSNMP:buffer busy waits -> ETL_USER:log file switch (archiving needed) -> SYS:Disk file Mirror Read
   0%       82080 -> SYS:buffer busy waits -> SYS:log file switch (archiving needed) -> SYS:enq: CF - contention
   0%       60480 -> SYS:row cache lock -> SYS:log file switch (archiving needed) -> SYS:
   0%       60480 -> DBSNMP:buffer busy waits -> ETL_USER:log file switch (archiving needed) -> SYS:
   0%       50410 -> DBFS_dbname01:buffer busy waits -> DBFS_dbname01:log file switch (archiving needed) -> SYS:enq: CF - contention
   0%       32660 -> DBFS_dbname01:log file switch (archiving needed) -> SYS:control file sequential read
   0%       30240 -> SYS:row cache lock -> SYS:log file switch (archiving needed) -> SYS:enq: CF - contention
   0%       30240 -> DBSNMP:buffer busy waits -> ETL_USER:log file switch (archiving needed) -> SYS:enq: CF - contention
   0%       25500 -> SYS:log file switch (archiving needed) -> SYS:Disk file Mirror Read
   0%       23760 -> ETL_USER:log file switch (archiving needed) -> SYS:Disk file Mirror Read
   0%       17000 -> SYS:log file switch (archiving needed) -> SYS:
   0%       15840 -> ETL_USER:log file switch (archiving needed) -> SYS:
   0%       13590 -> SYS:enq: SQ - contention -> ETL_USER:
   0%        8500 -> SYS:log file switch (archiving needed) -> SYS:enq: CF - contention
   0%        7920 -> ETL_USER:log file switch (archiving needed) -> SYS:enq: CF - contention
   0%        5520 -> DBSNMP:log file switch (archiving needed) -> SYS:control file sequential read
   0%        4260 -> DBFS_dbname01:log file switch (archiving needed) -> SYS:Disk file Mirror Read
   0%        3600 -> ETL_USER:
   0%        3020 -> SYS:enq: SQ - contention -> ETL_USER:cell single block physical read
   0%        2840 -> DBFS_dbname01:log file switch (archiving needed) -> SYS:
   0%        1420 -> DBFS_dbname01:log file switch (archiving needed) -> SYS:enq: CF - contention
   0%         970 -> SYS:
   0%         870 -> GGADMIN:SQL*Net more data from client
   0%         720 -> DBSNMP:log file switch (archiving needed) -> SYS:Disk file Mirror Read
   0%         600 -> ETL_USER:cell single block physical read
   0%         480 -> DBSNMP:log file switch (archiving needed) -> SYS:
   0%         240 -> DBSNMP:log file switch (archiving needed) -> SYS:enq: CF - contention
   0%         230 -> SYS:control file sequential read
   0%         160 -> SYS:gc current request
   0%          90 -> SYS:flashback log file write
   0%          90 -> SYS:db file parallel write
   0%          80 -> ETL_USER:log file sync -> SYS:
   0%          80 -> ETL_USER:log file sync -> SYS:log file parallel write
   0%          80 -> GGADMIN:cell single block physical read
   0%          80 -> GGADMIN:
   0%          70 -> SYS:reliable message
   0%          70 -> SYS:log file sequential read
   0%          70 -> SYS:JS kgl get object wait
   0%          60 -> ETL_USER:resmgr:cpu quantum
   0%          40 -> SYS:log file parallel write
   0%          40 -> ETL_USER:cell multiblock physical read
   0%          30 -> SYS:ges lms sync during dynamic remastering and reconfig -> SYS:
   0%          30 -> GGADMIN:gcs drm freeze in enter server mode -> SYS:ges lms sync during dynamic remastering and reconfig -> SYS:
   0%          30 -> SYS:Disk file Mirror Read
   0%          20 -> SYS:enq: CF - contention
   0%          20 -> GGADMIN:Streams miscellaneous event
   0%          10 -> ETL_USER:direct path write temp
   0%          10 -> ETL_USER:gc cr grant 2-way
   0%          10 -> DBSNMP:log file sync -> SYS:log file parallel write
   0%          10 -> SYS:change tracking file synchronous read
   0%          10 -> U002585:
   0%          10 -> ETL_USER:direct path read temp
   0%          10 -> SYS:library cache lock
   0%          10 -> DBFS_dbname01:
   0%          10 -> ETL_USER:enq: FB - contention
   0%          10 -> ETL_USER:gc current block 3-way
   0%          10 -> ETL_USER:gc current block 2-way
   0%          10 -> SYS:ASM file metadata operation
   0%          10 -> DBFS_dbname01:direct path write
   0%          10 -> DBSNMP:log file sync -> SYS:

79 rows selected.

From the output below it is also clear that the ultimate blocker was session id 1262 on instance 2 (the LGWR process). The line with 3% impact matches my manual analysis of session 1286.

 

dash_wait_chains instance_number||':'||session_id  1=1 1596 1596

%This     SECONDS WAIT_CHAIN
------ ---------- -----------------------------------------------------
  24%    35955360 -> 1:3164 -> 1:654 -> 2:2540 -> 2:1262
  21%    32447520 -> 1:2533 -> 1:654 -> 2:2540 -> 2:1262
  17%    25431840 -> 1:1914 -> 1:654 -> 2:2540 -> 2:1262
  13%    19293120 -> 1:1913 -> 1:654 -> 2:2540 -> 2:1262
  10%    14908320 -> 1:6 -> 1:654 -> 2:2540 -> 2:1262
   3%     4384800 -> 1:1286 -> 1:654 -> 2:2540 -> 2:1262
   1%     1503360 -> 2:2542 -> 2:2540 -> 2:1262
   1%     1503360 -> 2:1276 -> 2:2540 -> 2:1262
   1%     1503360 -> 2:2520 -> 2:2538 -> 2:1262
   1%     1503360 -> 2:1907 -> 2:2540 -> 2:1262
   1%     1503360 -> 2:3170 -> 2:2540 -> 2:1262
   1%     1503360 -> 2:2533 -> 2:2540 -> 2:1262
   1%     1503360 -> 2:1282 -> 2:2540 -> 2:1262
   1%     1503360 -> 2:25 -> 2:2540 -> 2:1262
   1%     1503360 -> 2:3171 -> 2:2540 -> 2:1262
   1%     1461890 -> 2:1906 -> 2:1909 -> 2:1262
   1%      876960 -> 1:654 -> 2:2540 -> 2:1262
   1%      876960 -> 2:1263 -> 2:2538 -> 2:1262
   0%      709920 -> 2:650 -> 2:3169 -> 2:1262
   0%      229680 -> 2:650 -> 2:2540 -> 2:1262
   0%      167040 -> 2:645 -> 2:3169 -> 2:1262
   0%      104400 -> 2:3163 -> 2:2540 -> 2:1262
   0%       20880 -> 2:653 -> 2:1262
   0%       20880 -> 2:1908 -> 2:1262
   0%       20880 -> 2:20 -> 2:1262
   0%       20880 -> 2:2540 -> 2:1262
   0%       20880 -> 2:1281 -> 2:1262
   0%       20880 -> 2:1269 -> 2:1262
   0%       20880 -> 2:2536 -> 2:1262
   0%       20880 -> 2:649 -> 2:1262
   0%       20880 -> 2:6 -> 2:1262
   0%       20880 -> 2:3169 -> 2:1262
   0%       20880 -> 2:1284 -> 2:1262
   0%       20880 -> 2:1896 -> 2:1262
   0%       20880 -> 2:2538 -> 2:1262
   0%       20880 -> 2:17 -> 2:1262
   0%       20880 -> 2:655 -> 2:1262
   0%       20880 -> 2:1280 -> 2:1262
   0%       20880 -> 2:1895 -> 2:1262
   0%       20880 -> 2:1899 -> 2:1262
   0%       20880 -> 2:3158 -> 2:1262
   0%       20880 -> 2:3167 -> 2:1262
   0%       20880 -> 2:24 -> 2:1262
   0%       20880 -> 2:1288 -> 2:1262
   0%       20590 -> 2:1909 -> 2:1262
   0%       20590 -> 2:2541 -> 2:1262
   0%       16820 -> 2:1913 -> 2:1262
   0%        6960 -> 2:2523 -> 2:1262
   0%        4510 -> 1:3164 -> 1:654
   0%        4070 -> 1:2533 -> 1:654
   0%        3190 -> 1:1914 -> 1:654
   0%        2420 -> 1:1913 -> 1:654
   0%        1870 -> 1:6 -> 1:654
   0%         880 -> 3:3162
   0%         720 -> 2:632

Now the question is what the LGWR process was doing and why it was constantly trying to read the control file. If you run a 10046 trace on the LGWR process, you will notice that LGWR reads the control file while doing log switches.
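Tracing a background process such as LGWR is typically done by attaching to its OS pid with oradebug from a SYSDBA session; an illustrative sequence (the pid placeholder comes from v$process):

oradebug setospid <lgwr_ospid>
oradebug event 10046 trace name context forever, level 8
rem reproduce the log switch activity, then:
oradebug event 10046 trace name context off
oradebug tracefile_name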

WAIT #0: nam='control file sequential read' ela= 59 file#=0 block#=1 blocks=1 obj#=-1 tim=1429886768732384
WAIT #0: nam='control file sequential read' ela= 203 file#=0 block#=40 blocks=1 obj#=-1 tim=142988668732612
WAIT #0: nam='control file sequential read' ela= 197 file#=0 block#=42 blocks=1 obj#=-1 tim=1429886768732845
WAIT #0: nam='control file sequential read' ela= 216 file#=0 block#=1 blocks=1 obj#=-1 tim=1429886768733206
WAIT #0: nam='control file sequential read' ela= 192 file#=1 block#=1 blocks=1 obj#=-1 tim=1429886768733440
WAIT #0: nam='control file sequential read' ela= 200 file#=0 block#=40 blocks=1 obj#=-1 tim=1429886768733674
WAIT #0: nam='control file sequential read' ela= 198 file#=0 block#=42 blocks=1 obj#=-1 tim=1429886768733909

Then I checked the alert log on instance 2 and found a number of occurrences of the following errors:

Tue Apr 21 19:59:21 2015
ORA-19815: WARNING: db_recovery_file_dest_size of 858993459200 bytes is 100.00% used, and has 0 remaining bytes available.
ARC1: Error 19809 Creating archive log file to '+RECO1'
ARCH: Archival stopped, error occurred. Will continue retrying
ORACLE Instance dbname2 - Archival Error
ORA-16038: log 21 sequence# 2506 cannot be archived
ORA-19809: limit exceeded for recovery files

Session id 1262 on instance 2 was doing a large number of small control file sequential reads, and no single IO was stuck in a long wait.

@event_hist.sql control.*read session_id=1262 1596 1596
                                 Wait time    Num ASH   Estimated    Estimated    % Event  Estimated
Wait Event                      bucket ms+    Samples Total Waits    Total Sec       Time  Time Graph
------------------------------ ----------- ---------- ----------- ------------ ---------- ------------
control file sequential read           < 1         17     66149.3         66.1       88.3 |######### |
                                      < 16          2       171.8          2.7        3.6 |          |
                                      < 32          3       128.2          4.1        5.5 |#         |
                                     < 128          1        15.6          2.0        2.7 |          |

 


The majority of sessions were ultimately waiting for session id 1262 on instance 2 (the LGWR process). I think the wait counts for control file sequential read went up because LGWR was constantly trying to switch the log file but couldn’t, as the recovery destination was full. So enq: SQ - contention was the symptom of LGWR spinning on the log switch due to no space in the recovery destination (db_recovery_file_dest_size).
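The space pressure itself is easy to confirm from the recovery file destination view; an illustrative check:

select name,
       round(space_limit/1024/1024/1024, 1)       gb_limit,
       round(space_used/1024/1024/1024, 1)        gb_used,
       round(space_reclaimable/1024/1024/1024, 1) gb_reclaimable
  from v$recovery_file_dest;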

References:

Advanced Oracle Troubleshooting Guide – Part 11: Complex Wait Chain Signature Analysis

Diagnosing buffer busy waits with the ash_wait_chains.sql script (v0.2)

Update: 04/27/2015

I deliberately filled the recovery destination area and then tried to switch log files. The switch logfile command hung, and the LGWR 10046 trace showed a repeated loop of control file block reads. The first two reads of each cycle show it reading the control file header block from both multiplexed control files. Below is an excerpt from the LGWR trace file.

nam='control file sequential read' file#=0 block#=1 
nam='control file sequential read' file#=1 block#=1 
nam='control file sequential read' file#=0 block#=40 
nam='control file sequential read' file#=0 block#=42 
nam='control file sequential read' file#=0 block#=43 
nam='control file sequential read' file#=0 block#=815 
nam='control file sequential read' file#=0 block#=46 
nam='control file sequential read' file#=0 block#=1 
nam='control file sequential read' file#=1 block#=1 
nam='control file sequential read' file#=0 block#=40 
nam='control file sequential read' file#=0 block#=42 
nam='control file sequential read' file#=0 block#=43 
nam='control file sequential read' file#=0 block#=815 
nam='control file sequential read' file#=0 block#=46 
... the same seven-read cycle (block 1 of both control files, then blocks 40, 42, 43, 815 and 46) repeats for the remainder of the trace ...

March 10, 2015

SQL Plan Tree Visualization

Filed under: Performance Tuning, SQL, Tools, Troubleshooting — vishaldesai @ 8:22 pm

The Real Time SQL Monitor graphical plan from Cloud Control is at times not legible. Using Google Charts and visualization techniques that I learned from Accenture Enkitec’s and Iggy’s blogs, I have created three scripts to quickly display a plan in graphical B-tree format. The one based on real time SQL monitoring displays a few more metrics, such as cardinality, IO and time statistics, and I have added some color logic: if the time spent in one plan line id is more than 10% of the total elapsed time, the Timeline is displayed in red, otherwise in green.

@getplan_cursor_tree.sql – Display graphical btree plan from cache based on SQL ID and plan_hash_value

@getplan_awr_tree.sql – Display graphical btree plan from AWR metadata based on SQL ID and plan_hash_value

@getplan_realsqlmonitor_tree.sql – Display graphical btree plan from Real Time SQL monitoring based on SQL ID, sql_exec_id etc.

Modify the hard-coded lines in the scripts:

SPO logs\getplan_realsqlmonitor_tree_&tmenvs..html;

host start chrome C:\Users\U267399\Desktop\Tools\scripts\logs\getplan_realsqlmonitor_tree_&tmenvs..html

Sample output from OEM:


 

Sample html output using getplan_realsqlmonitor_tree.sql: 


Full html file https://dl.dropboxusercontent.com/u/48791336/getplan_realsqlmonitor_tree_031615_170343.html

Download

References:

1) http://mauro-pagano.com/2015/03/10/trouble-reading-an-execution-plan-maybe-sqld360-can-help/

2) https://iggyfernandez.wordpress.com/2010/11/26/explaining-the-explain-plan-using-pictures/

November 5, 2014

SQL Performance changing over time

Filed under: Performance Tuning, SQL, Tools — vishaldesai @ 9:43 pm

Here is another visualization, built in Excel, to analyze how SQL performance changes over a period of time.

Input Screen:


Enter the details such as username, password, TNS alias, Days, Interval and SQL ID, and click on the Run button.

Output Screen:


As you can see from the chart, IO wait times on 8/16 and 9/2 were higher compared to 8/19, 8/21 and 8/26, while the number of executions was more or less the same on those days. On 9/9 and 9/10, IO wait times were at their peak, but that could be due to a corresponding increase in the number of executions.


The Data 2 worksheet shows more details about other SQL statistics. The units for those statistics differ, so it is difficult to show them in a two-dimensional chart format.

For some Day/Interval combinations the Excel chart may not be formatted properly; you may have to adjust the chart manually to make it legible.

Download (Will be updated on 11/08)

October 3, 2014

Compare Oracle SQL Plan in Excel

Filed under: Performance Tuning, Tools — vishaldesai @ 2:36 pm

 

Input Screen:


 

 

Output Screen:


Download the xlsm file, enter the source and target details, double-click on the list box to populate the plan hash values, select one item, and click on the Run button to compare the plans. The output screen highlights the first line (only the first) where it finds a difference in operation.

Download

Compare Oracle Database Parameters in Excel

Filed under: Performance Tuning, Tools — vishaldesai @ 2:13 pm

Input Screen:


Output Screen:


It highlights differences in red. I have added logic to mark the difference in blue instead for the parameters instance_number, thread, undo_tablespace and local_listener, which are expected to differ between instances; you can add more parameters to that logic for further customization.

Download the xlsm file, add the TNS aliases to the tnsnames.ora file on your Windows client, and start comparing parameters.

Download

May 15, 2014

IO Waits, Average Wait Time and Histogram Heat maps using SQL Developer

Filed under: Tools, Troubleshooting — vishaldesai @ 11:00 pm

Jeff Smith pointed out David Mann’s blog in a recent session on SQL Developer reporting. Using David’s base code, I have developed SQL Developer heat map reports to visualize, troubleshoot and compare IO-related waits.


From the heat map we can clearly see that on 11-MAY-2013 between 2:00 and 3:00 pm the average wait time for the log file parallel write event was 27.3 ms. The number of waits during the same time slot was 10801, which is equivalent to about 3 waits per second, and that is not significant. If intermittent slowness were reported by the application team and it was related to the commit wait class, the DBAs would need to engage the SA/SAN team to investigate further.

Below are IO wait histogram reports for the log file parallel write event on the same database instance. For an OLTP workload you want to see the bright red boxes for log file parallel write in the 1 to 16 ms buckets, for both wait counts and average wait times. Such heat maps can also be used for relative performance comparison by day or by day/hour.


 


 

Download (I will upload SQL developer report definition files early next week.)

Another example, this time as a vertical stacked bar chart:

IO Wait Histogram by Count – db file sequential read


 

IO Wait Histogram by Avg Wait Time(s) – db file sequential read

