Vishal desai’s Oracle Blog

May 11, 2015

Filter predicates after MIN aggregation

Filed under: Performance Tuning, SQL — vishaldesai @ 8:41 pm

One of OLTP SQL on Exadata was consuming lot of CPU and was running for about 8 seconds per execution. Most of CPU activity was towards bottom half of the SQL as shown in real time SQL monitoring report below.

 

image 

image

To demonstrate this I will created simple test case using following code. PCTFREE, c3 and 90000 rows for multiple iterations are used to create large table:

SQL> drop table tab purge;
drop table tab purge
           *
ERROR at line 1:
ORA-00942: table or view does not exist


SQL> create table tab (c1 number, c2 date, c3 char(2000)) pctfree 50;

Table created.

SQL> insert into tab select rownum, sysdate-rownum, 'X' from dual connect by level <=90000;

90000 rows created.

SQL> insert into tab select rownum, sysdate-rownum-3, 'X' from dual connect by level <=90000;

90000 rows created.

SQL> insert into tab select rownum, sysdate-rownum-2, 'X' from dual connect by level <=90000;

90000 rows created.

SQL> insert into tab select rownum, sysdate-rownum-1, 'X' from dual connect by level <=90000;

90000 rows created.

SQL> insert into tab select rownum, sysdate-rownum+2, 'X' from dual connect by level <=90000;

90000 rows created.

SQL> insert into tab select rownum, sysdate-rownum+2, 'X' from dual connect by level <=90000;

90000 rows created.

SQL> commit;

Commit complete.

SQL> create index i_c1 on tab(c1);

Index created.

SQL> create index i_c2 on tab(c2);

Index created.

SQL> exec dbms_stats.gather_table_stats(USER,'TAB',cascade=>true,estimate_percent=>100);

PL/SQL procedure successfully completed.

Original SQL was calculating MIN of date column grouped by column c1 and then was applying filter against MIN column for narrow date range. So majority of rows were filtered out after full table scan and aggregation. As you can see it was doing 540k buffer gets to return 5 records. As this database is on Exadata, it does 540K physical reads very fast due to smart scan but in original SQL 43% of CPU activity was against full table scan operation.

 

SQL> alter session set statistics_level=all;
SQL>
SQL> select c1, c2 from
  2  (select c1,min(c2) c2
  3  from tab
  4  group by c1)
  5  where c2>=sysdate-15 and c2<=sysdate-10
  6  order by 1,2;

        C1 C2
---------- ---------
         7 01-MAY-15
         8 30-APR-15
         9 29-APR-15
        10 28-APR-15
        11 27-APR-15
SQL>
SQL> select * from table(dbms_xplan.display_cursor(format=>'allstats last'));

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  58qhpx48c18v9, child number 0
-------------------------------------
select c1, c2 from (select c1,min(c2) c2 from tab group by c1) where
c2>=sysdate-15 and c2<=sysdate-10 order by 1,2

Plan hash value: 443431404

-----------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |      |      1 |        |      5 |00:00:00.69 |     540K|    540K|       |       |          |
|   1 |  SORT ORDER BY               |      |      1 |   4500 |      5 |00:00:00.69 |     540K|    540K|  2048 |  2048 | 2048  (0)|
|*  2 |   FILTER                     |      |      1 |        |      5 |00:00:00.69 |     540K|    540K|       |       |          |
|   3 |    HASH GROUP BY             |      |      1 |   4500 |  90000 |00:00:00.68 |     540K|    540K|    31M|  8462K| 5717K (0)|
|   4 |     TABLE ACCESS STORAGE FULL| TAB  |      1 |    540K|    540K|00:00:00.47 |     540K|    540K|  1025K|  1025K| 7199K (0)|
-----------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter((MIN("C2")>=SYSDATE@!-15 AND MIN("C2")<=SYSDATE@!-10))

 

Column C1 was not unique so we cannot just apply filter prior to MIN. As you can see from below that consistent gets have gone down significantly due to index range scan but SQL is not returning same output. If filter predicate is applied first its returning records that we need and records that we don’t (false positives).

SQL> select c1,min(c2)
  2  from tab
  3  where c2>=sysdate-15 and c2<=sysdate-10
  4  group by c1
  5  order by 1,2;

        C1 MIN(C2)
---------- -----------------
         7 05/01/15 18:42:40
         8 04/30/15 18:42:40
         9 04/29/15 18:42:40
        10 04/28/15 18:42:40
        11 04/27/15 18:42:40
        12 04/27/15 18:43:15
        13 04/27/15 18:43:50
        14 04/27/15 18:42:04
        15 04/28/15 18:44:25
        16 04/27/15 18:44:25
SQL> select * from table(dbms_xplan.display_cursor(format=>'allstats l

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------------------
SQL_ID  7fwwz4v1a9fx4, child number 0
-------------------------------------
select c1,min(c2) from tab where c2>=sysdate-15 and c2<=sysdate-10
group by c1 order by 1,2

Plan hash value: 1451778582

-------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |      |      1 |        |     10 |00:00:00.01 |      33 |
|   1 |  SORT ORDER BY                 |      |      1 |     32 |     10 |00:00:00.01 |      33 |
|   2 |   HASH GROUP BY                |      |      1 |     32 |     10 |00:00:00.01 |      33 |
|*  3 |    FILTER                      |      |      1 |        |     30 |00:00:00.01 |      33 |
|   4 |     TABLE ACCESS BY INDEX ROWID| TAB  |      1 |     32 |     30 |00:00:00.01 |      33 |
|*  5 |      INDEX RANGE SCAN          | I_C2 |      1 |     32 |     30 |00:00:00.01 |       3 |
-------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter(SYSDATE@!-10>=SYSDATE@!-15)
   5 - access("C2">=SYSDATE@!-15 AND "C2"<=SYSDATE@!-10)

For such construct of SQL, we can still apply filter predicates first and remove records that should not be returned (false positives) using sub query as shown below. As you can see from below the consistent gets have gone down, indexes are being used, response time has improved, CPU activity has also gone down and we are getting same output.

 

SQL> select  t1.c1, t1.c2 from
  2  (select c1,min(c2) c2
  3  from tab
  4  where c2>=sysdate-15 and c2<=sysdate-10
  5  group by c1) t1
  6  where not exists (select c1 from tab t2 where t1.c1 = t2.c1 and t1.c2 > t2.c2 and t2.c2<=sysdate-10)
  7  order by 1,2;

        C1 C2
---------- -----------------
         7 05/01/15 18:42:40
         8 04/30/15 18:42:40
         9 04/29/15 18:42:40
        10 04/28/15 18:42:40
        11 04/27/15 18:42:40
SQL>
SQL> select * from table(dbms_xplan.display_cursor(format=>'allstats last'));

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------------------------------------
SQL_ID  g0agfd3mm02qk, child number 0
-------------------------------------
select  t1.c1, t1.c2 from (select c1,min(c2) c2 from tab where
c2>=sysdate-15 and c2<=sysdate-10 group by c1) t1 where not exists
(select c1 from tab t2 where t1.c1 = t2.c1 and t1.c2 > t2.c2 and
t2.c2<=sysdate-10) order by 1,2

Plan hash value: 327191261

-------------------------------------------------------------------------------------------------------------------
| Id  | Operation                        | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem |
-------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                 |      |      1 |        |      5 |00:00:00.01 |      84 |    	  |       |
|   1 |  SORT ORDER BY                   |      |      1 |     30 |      5 |00:00:00.01 |      84 |  2048 |  2048 |
|   2 |   NESTED LOOPS ANTI              |      |      1 |     30 |      5 |00:00:00.01 |      84 |       |       |
|   3 |    VIEW                          |      |      1 |     32 |     10 |00:00:00.01 |      33 |       |       |
|   4 |     HASH GROUP BY                |      |      1 |     32 |     10 |00:00:00.01 |      33 |  3331K|  3331K|
|*  5 |      FILTER                      |      |      1 |        |     30 |00:00:00.01 |      33 |       |       |
|   6 |       TABLE ACCESS BY INDEX ROWID| TAB  |      1 |     32 |     30 |00:00:00.01 |      33 |       |       |
|*  7 |        INDEX RANGE SCAN          | I_C2 |      1 |     32 |     30 |00:00:00.01 |       3 |  1025K|  1025K|
|*  8 |    TABLE ACCESS BY INDEX ROWID   | TAB  |     10 |  26996 |      5 |00:00:00.01 |      51 |       |       |
|*  9 |     INDEX RANGE SCAN             | I_C1 |     10 |      6 |     38 |00:00:00.01 |      13 |  1025K|  1025K|
-------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - filter(SYSDATE@!-10>=SYSDATE@!-15)
   7 - access("C2">=SYSDATE@!-15 AND "C2"<=SYSDATE@!-10)
   8 - filter(("T2"."C2"<=SYSDATE@!-10 AND "T1"."C2">"T2"."C2"))
   9 - access("T1"."C1"="T2"."C1")
Advertisements

Leave a Comment »

No comments yet.

RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Blog at WordPress.com.

%d bloggers like this: