Vishal desai’s Oracle Blog

January 24, 2014

Build Bind Variables from v$sql_monitor binds_xml column

Filed under: SQL, Troubleshooting — vishaldesai @ 7:41 pm

One of my client had interesting SQL performance issue. Whenever client executes query from ab initio it takes forever and same SQL code from sqlplus runs in less than a minute. Both tools were producing different plan_hash_value. I looked at v$ses_optimizer_env to ensure there were no session level parameters set for ab initio session.

I wanted to run exact SQL from sqlplus, create sql plan baseline so that when SQL is submitted from AI it can use good plan. AI was generating SQL dynamically so I decided get exact SQL text, bind variables and values from v$sql_monitor. I defined bind variables, assigned values and ran SQL from sqlplus and created profile which was eventually used by AI. To automate constructing bind variables and SQL Text from v$sql_monitor, I wrote script called build_bind_sqlmontior.sql.

Script will prompt for Key, SID and SQL_ID. Enter value for at least on prompt and later on enter key value to construct bind variables and SQL Text as shown below:

SQL> @build_bind_sqlmonitor.sql
Please enter the value for Key if known       :
Please enter the value for Sid if known       : 986
Please enter the value for sql_id if known    :

             KEY STATUS              USERNAME   MODULE                      SID SQL_ID        FIRST_REFRESH_TIME   PROGRAM         SQL_TEXT
---------------- ------------------- ---------- -------------------- ---------- ------------- -------------------- --------------- --------------------
   6996503228544 DONE (ALL ROWS)     SYS        sqlplus.exe                 986 8wy5mjmh53vxf 01/24/14 10:41:56    sqlplus.exe     select /*+ monitor f
   8362302828663 DONE (ERROR)        SYS        sqlplus.exe                 986 4bkxr36kasm7s 01/24/14 10:40:51    sqlplus.exe     select /*+ monitor f
  16471201083509 DONE (ERROR)        SYS        sqlplus.exe                 986 azb329svj1ajd 01/24/14 10:39:36    sqlplus.exe     select /*+ monitor f
   7082402574450 DONE (ERROR)        SYS        sqlplus.exe                 986 3036w6mx13rq5 01/24/14 10:38:36    sqlplus.exe     select /*+ monitor f
   5845451993201 DONE (ERROR)        SYS        sqlplus.exe                 986 20fmy44010wvw 01/24/14 10:38:00    sqlplus.exe     select /*+ monitor f
  11643657842205 DONE (ALL ROWS)                                            986 bfjf698jtp6n1 01/24/14 03:10:14
   7421704990227 DONE                                                       986 2vn8ztuzaq3k4 01/24/14 03:10:11
  14856293379585 DONE (ALL ROWS)                                            986 53afdbcv9637w 01/24/14 03:10:10
.....
.....
Please enter Key from above       : 6996503228544

variable SYS_B_0 VARCHAR2(32);
variable SYS_B_1 VARCHAR2(32);
variable SYS_B_2 NUMBER;

exec :SYS_B_0 := '31-OCT-12 00.00.00';
exec :SYS_B_1 := 'DD-MON-YY HH24:MI:SS';
exec :SYS_B_2 := 30205578;

select /*+ monitor full(ism) */ *
from ism.table_name ism
where col1=to_date(:"SYS_B_0",:"SYS_B_1")
and   col2=:"SYS_B_2";

SQL> variable SYS_B_0 VARCHAR2(32);
SQL> variable SYS_B_1 VARCHAR2(32);
SQL> variable SYS_B_2 NUMBER;
SQL>
SQL> exec :SYS_B_0 := '31-OCT-12 00.00.00';

PL/SQL procedure successfully completed.

SQL> exec :SYS_B_1 := 'DD-MON-YY HH24:MI:SS';

PL/SQL procedure successfully completed.

SQL> exec :SYS_B_2 := 30205578;

PL/SQL procedure successfully completed.

SQL> select /*+ monitor full(ism) */ *
  2  from ism.table_name ism
  3  where col1=to_date(:"SYS_B_0",:"SYS_B_1")
  4  and   col2=:"SYS_B_2";

.....

1 row selected.

Download Script

If there is any data type conversion script may not produce reliable output.

December 11, 2013

Bloom Filter and function in join condition

Filed under: Performance Tuning, SQL — vishaldesai @ 4:26 pm

One of my data warehouse user wanted to compare data between two Quarters. Original SQL is slightly complicated so I have created test case to demonstrate the issue.

DROP TABLE t1;

CREATE TABLE t1
  (
    id       NUMBER NOT NULL,
    datecol  DATE NOT NULL,
    comments VARCHAR2(500)
  )
  PARTITION BY RANGE
  (
    datecol
  )
  (
    PARTITION q3 VALUES LESS THAN (TO_DATE('10/10/2013', 'DD/MM/YYYY')) TABLESPACE users,
    PARTITION q4 VALUES LESS THAN (TO_DATE('01/01/2043', 'DD/MM/YYYY')) TABLESPACE users
  );
  
INSERT INTO t1
SELECT
  rownum,
  to_date('31/12/2013','DD/MM/YYYY'),
  dbms_random.string('X', '500')
FROM dual a
  CONNECT BY level <= 5000;
  
INSERT INTO t1
SELECT
  rownum,
  to_date('30/09/2013','DD/MM/YYYY'),
  dbms_random.string('X', '500')
FROM dual a
  CONNECT BY level <= 5000;

For sake of simplicity, I have added only two partitions here but original fact table had 10 years of data. Below is subset of SQL statement that user was running to compare data across quarters.

select /*+ leading(xa,xb) parallel(xa,4) parallel(xb,4) */ sum(xa.id),count( distinct xb.comments)
from t1 xa ,t1 xb
where xa.datecol = add_months(xb.datecol,-3)
and   xa.datecol = '30-SEP-13';
------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                        | Name     | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |    TQ  |IN-OUT| PQ Distrib |
------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                 |          |     1 |   265 |   367  (74)| 00:00:01 |       |       |        |      |            |
|   1 |  SORT AGGREGATE                  |          |     1 |   265 |            |          |       |       |        |      |            |
|   2 |   PX COORDINATOR                 |          |       |       |            |          |       |       |        |      |            |
|   3 |    PX SEND QC (RANDOM)           | :TQ10001 |     1 |   265 |            |          |       |       |  Q1,01 | P->S | QC (RAND)  |
|   4 |     SORT AGGREGATE               |          |     1 |   265 |            |          |       |       |  Q1,01 | PCWP |            |
|   5 |      VIEW                        | VW_DAG_0 |    34M|  8774M|   367  (74)| 00:00:01 |       |       |  Q1,01 | PCWP |            |
|   6 |       HASH GROUP BY              |          |    34M|  9370M|   367  (74)| 00:00:01 |       |       |  Q1,01 | PCWP |            |
|   7 |        PX RECEIVE                |          |    34M|  9370M|   367  (74)| 00:00:01 |       |       |  Q1,01 | PCWP |            |
|   8 |         PX SEND HASH             | :TQ10000 |    34M|  9370M|   367  (74)| 00:00:01 |       |       |  Q1,00 | P->P | HASH       |
|   9 |          HASH GROUP BY           |          |    34M|  9370M|   367  (74)| 00:00:01 |       |       |  Q1,00 | PCWP |            |
|* 10 |           HASH JOIN              |          |    34M|  9370M|   122  (20)| 00:00:01 |       |       |  Q1,00 | PCWP |            |
|  11 |            PARTITION RANGE SINGLE|          |  5501 |   118K|    33   (0)| 00:00:01 |   KEY |   KEY |  Q1,00 | PCWC |            |
|* 12 |             TABLE ACCESS FULL    | T1       |  5501 |   118K|    33   (0)| 00:00:01 |   KEY |   KEY |  Q1,00 | PCWP |            |
|  13 |            PX BLOCK ITERATOR     |          |  6311 |  1608K|    65   (0)| 00:00:01 |     1 |     2 |  Q1,00 | PCWC |            |
|* 14 |             TABLE ACCESS FULL    | T1       |  6311 |  1608K|    65   (0)| 00:00:01 |     1 |     2 |  Q1,00 | PCWP |            |
------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

  10 - access("XA"."DATECOL"=ADD_MONTHS(INTERNAL_FUNCTION("XB"."DATECOL"),-3))
  12 - filter("XA"."DATECOL"='30-SEP-13')
  14 - filter(ADD_MONTHS(INTERNAL_FUNCTION("XB"."DATECOL"),-3)='30-SEP-13')

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)

As you can see from line 14, table T1 (XB) was doing full table scan and was accessing both the partitions (pstart 1 and pstop 2). In original SQL it was accessing all partitions for last 10 years.

I asked user to add additional filter for T1 (XB) so that it will scan required partition as shown below. But user was not satisfied (as it would require calculating dates manually every time they run this SQL and sometimes they run this SQL for multiple Quarters). Other option was to pass filter as xb.datecol = (select add_months(‘30-SEP-13’,3) from dual) but I was curios to find out more about bloom filters.

select /*+ leading(xa,xb) parallel(xa,4) parallel(xb,4) */ sum(xa.id),count( distinct xb.comments)
from t1 xa ,t1 xb
where xa.datecol = add_months(xb.datecol,-3)
and   xa.datecol = '30-SEP-13'
and   xb.datecol = '31-DEC-13';

I was wondering whether Oracle would create bloom filter after scanning T1 (XA) and apply that to T2 (XB) and do partition scan on T2(XB) instead of scanning all partitions but the problem was add_months function in join condition.

Original Plan:

image

Recommended SQL and plan (bloom filter in action):

select /*+ leading(xa,xb) parallel(xa,4) parallel(xb,4) */ sum(xa.id),count( distinct xb.comments)
from t1 xa ,t1 xb
where add_months(xa.datecol,+3) = xb.datecol
and   xa.datecol = '30-SEP-13';
-------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                         | Name     | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |    TQ  |IN-OUT| PQ Distrib |
-------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                  |          |     1 |   265 |   315  (69)| 00:00:01 |       |       |        |      |            |
|   1 |  SORT AGGREGATE                   |          |     1 |   265 |            |          |       |       |        |      |            |
|   2 |   PX COORDINATOR                  |          |       |       |            |          |       |       |        |      |            |
|   3 |    PX SEND QC (RANDOM)            | :TQ10001 |     1 |   265 |            |          |       |       |  Q1,01 | P->S | QC (RAND)  |
|   4 |     SORT AGGREGATE                |          |     1 |   265 |            |          |       |       |  Q1,01 | PCWP |            |
|   5 |      VIEW                         | VW_DAG_0 |    28M|  7145M|   315  (69)| 00:00:01 |       |       |  Q1,01 | PCWP |            |
|   6 |       HASH GROUP BY               |          |    28M|  7630M|   315  (69)| 00:00:01 |       |       |  Q1,01 | PCWP |            |
|   7 |        PX RECEIVE                 |          |    28M|  7630M|   315  (69)| 00:00:01 |       |       |  Q1,01 | PCWP |            |
|   8 |         PX SEND HASH              | :TQ10000 |    28M|  7630M|   315  (69)| 00:00:01 |       |       |  Q1,00 | P->P | HASH       |
|   9 |          HASH GROUP BY            |          |    28M|  7630M|   315  (69)| 00:00:01 |       |       |  Q1,00 | PCWP |            |
|* 10 |           HASH JOIN               |          |    28M|  7630M|   118  (17)| 00:00:01 |       |       |  Q1,00 | PCWP |            |
|  11 |            PART JOIN FILTER CREATE| :BF0000  |  5501 |   118K|    33   (0)| 00:00:01 |       |       |  Q1,00 | PCWP |            |
|  12 |             PARTITION RANGE SINGLE|          |  5501 |   118K|    33   (0)| 00:00:01 |   KEY |   KEY |  Q1,00 | PCWC |            |
|* 13 |              TABLE ACCESS FULL    | T1       |  5501 |   118K|    33   (0)| 00:00:01 |   KEY |   KEY |  Q1,00 | PCWP |            |
|  14 |            PX BLOCK ITERATOR      |          | 10278 |  2619K|    65   (0)| 00:00:01 |:BF0000|:BF0000|  Q1,00 | PCWC |            |
|  15 |             TABLE ACCESS FULL     | T1       | 10278 |  2619K|    65   (0)| 00:00:01 |:BF0000|:BF0000|  Q1,00 | PCWP |            |
-------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

  10 - access("XB"."DATECOL"=ADD_MONTHS(INTERNAL_FUNCTION("XA"."DATECOL"),3))
  13 - filter("XA"."DATECOL"='30-SEP-13')

image

As you can see from above, just by flipping the add_months function Oracle now creates and uses bloom filter to scan required partition for T1 (XB). So from this behavior it looks like bloom filters cannot be used for join column surrounded by function. It sounds very similar to having index on column x, scanning table with function(x)=y will not use index on column x. May be in future Oracle will add function based bloom filters.

April 5, 2013

Outer Join, Ordered hint and SWAP_JOIN_INPUTS

Filed under: Performance Tuning, SQL — vishaldesai @ 6:33 pm

Outer joins access tables in following order:

from t1, t2 where t1.col1(+)=t2.col1 –> table will be accessed in order t2, t1

from t1, t2 where t1.col1=t2.col1(+) –> table will be accessed in order t1,t2

One of my developer had to do some data cleanup so he quickly staged data in table t1, t2 was our large fact table and ran following query (test case). Query ran for hours on Exadata and did not finish. Smile

select /*+ ordered full(t1) full(t2) parallel(t1,4) parallel(t2,4) */ t1.*, t2.* from t1,t2 where t1.col1(+)=t2.col1;

When I looked at SQL monitoring report driving table was t2 (large fact table). Below is the output from test case:

SQL> create table t1(c1, c2)
  2  as
  3  select level, rpad('x',10)
  4  from dual
  5  connect by level <= 200000
  6  ;

Table created.

SQL>
SQL> create table t2(c1, c2)
  2  as
  3  select level, rpad('x',10)
  4  from dual
  5  connect by level <= 100
  6  ;

Table created.

SQL>
SQL> exec dbms_stats.gather_table_stats(user, 'T1');

PL/SQL procedure successfully completed.

SQL>
SQL> exec dbms_stats.gather_table_stats(user, 'T2');

PL/SQL procedure successfully completed.


SQL> explain plan for
  2  select /*+ ordered */ t2.*,t1.*
  3  from t2, t1
  4  where t1.c1 = t2.c1(+);

Explained.

SQL>
SQL> @utlxpls.sql

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------
Plan hash value: 1823443478

-------------------------------------------------------------------------------------------
| Id  | Operation                  | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |      |   200K|  5859K|       |   436   (1)| 00:00:06 |
|*  1 |  HASH JOIN OUTER           |      |   200K|  5859K|  5472K|   436   (1)| 00:00:06 |
|   2 |   TABLE ACCESS STORAGE FULL| T1   |   200K|  3125K|       |   165   (1)| 00:00:02 |
|   3 |   TABLE ACCESS STORAGE FULL| T2   |   100 |  1400 |       |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("T1"."C1"="T2"."C1"(+))

22 rows selected.

SQL> explain plan for
  2  select t2.*,t1.*
  3  from t2, t1
  4  where t1.c1 = t2.c1(+);

Explained.

SQL>
SQL> @utlxpls.sql

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------
Plan hash value: 312430291

-----------------------------------------------------------------------------------
| Id  | Operation                  | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |      |   200K|  5859K|   169   (2)| 00:00:03 |
|*  1 |  HASH JOIN RIGHT OUTER     |      |   200K|  5859K|   169   (2)| 00:00:03 |
|   2 |   TABLE ACCESS STORAGE FULL| T2   |   100 |  1400 |     3   (0)| 00:00:01 |
|   3 |   TABLE ACCESS STORAGE FULL| T1   |   200K|  3125K|   165   (1)| 00:00:02 |
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("T1"."C1"="T2"."C1"(+))

When ORDERED hint was added it used t1 (large table as driving table) with hash join outer and when ORDERED hint was NOT used it used t2 (small table as driving table) with hash join right outer.

Output on left is without ORDERED hint and on right is with ORDERED hint.

image

image

image

 

So based on above output looks like that when ORDERED hint was used with OUTER JOIN, Oracle optimizer does not go through join permutations (in this particular case swapping join inputs) and hence choose T1 (larger table) as driving table. When ORDERED hint was removed, Oracle optimizer computes cost of swapping join inputs and choose T2 (smaller table) as driving table. Why Oracle considered join order as t1, t2 with ordered hint? I think because of outer join.

For my production issue, I asked ETL/development team to remove ORDERED hint and query ran in 12 minutes.

December 10, 2012

Calculate Median from Frequency in SQL

Filed under: SQL — vishaldesai @ 11:19 pm

If numbers are stored sequentially then median function can be used to calculate median. But what if you want to calculate median based on frequencies? Either you can convert frequencies into multiple values in rows format and then use median function or calculate median by using frequencies as described by http://www.onlinemathlearning.com/median-frequency-table.html

Here is how I put that into SQL format:

SQL> set feed on
SQL> set echo on
SQL> drop table v_median purge;

Table dropped.

SQL> create table v_median (id number, cnt number);

Table created.

SQL> delete from v_median;

0 rows deleted.

SQL> insert into v_median values (1,2);

1 row created.

SQL> insert into v_median values (2,5);

1 row created.

SQL> insert into v_median values (3,1);

1 row created.

SQL> insert into v_median values (4,3);

1 row created.

SQL> insert into v_median values (5,2);

1 row created.

SQL> commit;

Commit complete.

SQL> WITH qb1 AS
  2    (SELECT id,
  3      cnt,
  4      SUM(cnt) over (order by id rows unbounded preceding) tot
  5    FROM v_median
  6    ORDER BY id
  7    ),
  8    qb2 AS
  9    (SELECT MAX(tot) AS n ,
 10      CASE
 11        WHEN mod(MAX(tot),2) = 0
 12        THEN 'even'
 13        WHEN mod(MAX(tot),2) = 1
 14        THEN 'odd'
 15      END AS oe
 16    FROM
 17      (SELECT id,
 18        cnt,
 19        SUM(cnt) over (order by id rows unbounded preceding) tot
 20      FROM v_median
 21      ORDER BY id
 22      )
 23    )
 24  SELECT median
 25  FROM
 26    (SELECT DISTINCT
 27      CASE
 28        WHEN (qb2.oe    ='odd'
 29        AND (qb2.n+1)/2 = qb1.tot)
 30        THEN qb1.id
 31        WHEN (qb2.oe='even')
 32        THEN
 33          (SELECT median(id)
 34          FROM
 35            ( SELECT DISTINCT id
 36            FROM
 37              (SELECT id
 38              FROM
 39                (SELECT id FROM qb1,qb2 WHERE tot>((qb2.n/2)+1)
 40                )
 41              WHERE rownum=1
 42              UNION
 43              SELECT id
 44              FROM
 45                (SELECT id FROM qb1,qb2 WHERE tot<((qb2.n/2)+1)ORDER BY tot DESC
 46                )
 47              WHERE rownum=1
 48              )
 49            )
 50          )
 51      END AS median
 52    FROM qb1,
 53      qb2
 54    )
 55  WHERE median IS NOT NULL;

    MEDIAN                                                                                                                                                                                              
----------                                                                                                                                                                                              
         2                                                                                                                                                                                              

1 row selected.

SQL> delete from v_median;

5 rows deleted.

SQL> insert into v_median values (0,11);

1 row created.

SQL> insert into v_median values (1,9);

1 row created.

SQL> insert into v_median values (2,5);

1 row created.

SQL> insert into v_median values (3,10);

1 row created.

SQL> insert into v_median values (4,15);

1 row created.

SQL> commit;

Commit complete.

SQL> WITH qb1 AS
  2    (SELECT id,
  3      cnt,
  4      SUM(cnt) over (order by id rows unbounded preceding) tot
  5    FROM v_median
  6    ORDER BY id
  7    ),
  8    qb2 AS
  9    (SELECT MAX(tot) AS n ,
 10      CASE
 11        WHEN mod(MAX(tot),2) = 0
 12        THEN 'even'
 13        WHEN mod(MAX(tot),2) = 1
 14        THEN 'odd'
 15      END AS oe
 16    FROM
 17      (SELECT id,
 18        cnt,
 19        SUM(cnt) over (order by id rows unbounded preceding) tot
 20      FROM v_median
 21      ORDER BY id
 22      )
 23    )
 24  SELECT median
 25  FROM
 26    (SELECT DISTINCT
 27      CASE
 28        WHEN (qb2.oe    ='odd'
 29        AND (qb2.n+1)/2 = qb1.tot)
 30        THEN qb1.id
 31        WHEN (qb2.oe='even')
 32        THEN
 33          (SELECT median(id)
 34          FROM
 35            ( SELECT DISTINCT id
 36            FROM
 37              (SELECT id
 38              FROM
 39                (SELECT id FROM qb1,qb2 WHERE tot>((qb2.n/2)+1)
 40                )
 41              WHERE rownum=1
 42              UNION
 43              SELECT id
 44              FROM
 45                (SELECT id FROM qb1,qb2 WHERE tot<((qb2.n/2)+1)ORDER BY tot DESC
 46                )
 47              WHERE rownum=1
 48              )
 49            )
 50          )
 51      END AS median
 52    FROM qb1,
 53      qb2
 54    )
 55  WHERE median IS NOT NULL;

    MEDIAN                                                                                                                                                                                              
----------                                                                                                                                                                                              
       2.5                                                                                                                                                                                              

1 row selected.

SQL> spool off

March 27, 2012

Oracle CHAR data type comparisons

Filed under: Oracle Database, SQL, Troubleshooting — vishaldesai @ 12:54 am

I received below test case from my pl/sql developer.

dev@oradb > drop table test_3;

Table dropped.

dev@oradb > create table test_3  (col1 char(2));

Table created.
 
dev@oradb > insert into test_3 values('  ');

1 row created.

dev@oradb > insert into test_3 values('~ ');

1 row created.

dev@oradb > insert into test_3 values('~');

1 row created.

dev@oradb > insert into test_3 values(' ~');

1 row created.

dev@oradb > commit;

Commit complete.

dev@oradb > select * from test_3;

CO
--

~
~
 ~


4 rows selected.

Developer wanted to find record with column value = ‘~’ and he used below query:

Test 1: --I would expect no records since it is a char field
dev@oradb > select * from test_3 where col1='~';

CO
--
~
~

2 rows selected.

Developer was expecting that above query will return 0 rows as col1 is defined as char(2) so ‘~’ will be stored as ‘~ ‘ and ‘~’ <> ‘~ ‘ but query returned 2 rows  instead of 0 rows.

As per Oracle documentation http://docs.oracle.com/cd/B28359_01/server.111/b28286/sql_elements002.htm:

With blank-padded semantics, if the two values have different lengths, then Oracle first adds blanks to the end of the shorter one so their lengths are equal. Oracle then compares the values character by character up to the first character that differs. Oracle uses blank-padded comparison semantics only when both values in the comparison are either expressions of datatype CHAR, NCHAR, text literals, or values returned by the USER function.

So now we know how CHAR data types with different lengths are compared how do we know what is the data type of ‘~’ on right side of equality operator? It’s char but lets prove it.

dba@oradb >  create table datatype_test as select '~' as db from dual;

Table created.

dba@oradb > desc  datatype_test
 Name                                            Null?    Type
 ----------------------------------------------- -------- --------------------------------
 DB                                                       CHAR(1)

dba@oradb > select col1,ora_hash(col1),ora_hash('~'),ora_hash('~ ') as "ora_hash('~ ')" from test_3;

CO ORA_HASH(COL1) ORA_HASH('~') ora_hash('~ ')
-- -------------- ------------- --------------
        190699439     216329699     4123278633
~      4123278633     216329699     4123278633
~      4123278633     216329699     4123278633
 ~      999748150     216329699     4123278633
 
4 rows selected.

From above its clear that data type of ‘~’ on right side of equality operator is CHAR and hence select query return 2 records (record 2 and record 3) instead of 0 rows. Visual comparison will look something like following:

Expected Comparison Actual Comparison
‘  ‘ ‘~’ ‘  ‘ ‘~ ‘
‘~ ‘ ‘~’ ‘~ ‘ ‘~ ‘
‘~ ’ ‘~’ ‘~ ‘ ‘~ ‘
‘ ~’ ‘~’ ‘ ~’ ‘~ ‘

Hash value returned by ora_hash function also matches as shown above.

Developer tried to use different work around which produced different results and I used same queries to prove why similar workarounds are producing different results.

Test 2: --above query works then why not this query? all im doing is replacing a ~ with nvl(trim('  '),'~')

dev@oradb > select * from test_3 where col1 =nvl(trim('  '),'~');

no rows selected

dba@oradb > select col1,ora_hash(col1),ora_hash(nvl(trim('  '),'~')) as "ora_hash(nvl(trim('  '),'~'))" from test_3;

CO ORA_HASH(COL1) ora_hash(nvl(trim('  '),'~'))
-- -------------- -----------------------------
        190699439                     216329699
~      4123278633                     216329699
~      4123278633                     216329699
 ~      999748150                     216329699

dba@oradb > create table datatype_test as select nvl(trim('  '),'~') as db from dual;

Table created.

dba@oradb > desc  datatype_test
 Name                                            Null?    Type
 ----------------------------------------------- -------- --------------------------------
 DB                                                       VARCHAR2(1)

Data type of nvl(trim(‘  ‘),’~’) is VARCHAR2 so Oracle will not pad blanks. Hash values returned are also different hence query returned 0 records.

Test 3: --works user defined function

dev@oradb > CREATE OR REPLACE FUNCTION test_fn (v_col1 VARCHAR2) RETURN VARCHAR2
  2  AS
  3   col1 CHAR(2);
  4  BEGIN
  5  col1:=  nvl(trim('  '),'~');
  6    RETURN col1;
  7  END;
  8  /

Function created.

dev@oradb > select * from test_3 where col1 =test_fn('  ');

CO
--
~
~

2 rows selected.

dba@oradb > create table datatype_test as select test_fn('  ') as dt from dual;

Table created.

dba@oradb > desc  datatype_test
 Name                                            Null?    Type
 ----------------------------------------------- -------- --------------------------------
 DT                                                       VARCHAR2(4000)

dba@oradb >  select length(dt) from datatype_test;

LENGTH(DT)
----------
         2

dba@oradb > select dt from  datatype_test;

DT
---------------------------------------------------------------
~

dba@oradb > select col1,ora_hash(col1),ora_hash(test_fn('  ')) as "ora_hash(test_fn('  '))" from test_3;

CO ORA_HASH(COL1) ora_hash(test_fn('  '))
-- -------------- -----------------------
        190699439               216329699
~      4123278633              4123278633
~      4123278633              4123278633
 ~      999748150               216329699

Data type of return variable is VARCHAR but col1 is defined as CHAR in function hence during assignment col1 will store ‘~’ as ‘~ ‘ (return value) which can also be confirmed by length function. Hash values returned for record 2 and record 3 match hence query returns 2 records.

Test 4: case also works

dev@oradb > select * from test_3 where col1 = case when trim('  ') is null then '~'   else '1' end;

CO
--
~
~

2 rows selected.

dba@oradb > select col1,ora_hash(col1),ora_hash(case when trim('  ') is null then '~'   else '1' end) as "ora_hash_on_case" from test_3;

CO ORA_HASH(COL1) ora_hash_on_case
-- -------------- ----------------
        190699439        216329699
~      4123278633        216329699
~      4123278633        216329699
 ~      999748150        216329699

 
dba@oradb > create table datatype_test as select case when trim('  ') is null then '~'   else '1' end as dt from dual;

Table created.


dba@oradb > desc  datatype_test
 Name                                            Null?    Type
 ----------------------------------------------- -------- --------------------------------
 DT                                                       CHAR(1)

Hash value doesn’t match but as the data type of return variable of CASE is CHAR oracle will pad blank and hence return 2 records.

Developer finally asked is there any way to select only record 3 using where condition? Once the data is inserted I don’t think there is way to differentiate between record 2 (‘~ ‘) and record 3 (‘~’).

Below is the data block dump of record 2 and record 3:

tab 0, row 1, @0x3f8c
tl: 6 fb: --H-FL-- lb: 0x1  cc: 1
col  0: [ 2]  7e 20
tab 0, row 2, @0x3f86
tl: 6 fb: --H-FL-- lb: 0x1  cc: 1
col  0: [ 2]  7e 20


.

October 27, 2010

11g pivot unpivot and Database Design

Filed under: Design, SQL — vishaldesai @ 5:59 pm

Oracle 11g pivot and unpivot function can help if database tables are not in second normal form. I will try to unveil this using a simple example:

Table Structures

CREATE TABLE emp1
  (
    empno   NUMBER PRIMARY KEY,
    empname VARCHAR2(1000),
    job     VARCHAR2(100),
    hiredate DATE
  );

CREATE TABLE sal1
  (
    empno   NUMBER,
    sal2000 NUMBER,
    sal2001 NUMBER,
    sal2002 NUMBER,
    CONSTRAINT fk_sal1_emp1 FOREIGN KEY (empno) REFERENCES emp1(empno)
  );

--Inserted few records into both emp1 and sal1 table.

We have seen table structures like sal1 (sal2000,sal2001,sal2002 …….). If we have to write a salary report as shown below we have to join SAL1 table multiple times. I used union but there are different ways to write similar query by joining SAL1 multiple times.

10g SQL

SELECT a.empno,
  a.empname,
  a.job,
  'sal2000' AS SALYEAR ,
  b1.sal2000 sal2000to2002
FROM emp1 a,
  sal1 b1
WHERE a.empno=b1.empno
UNION
SELECT a.empno,
  a.empname,
  a.job,
  'sal2001' AS SALYEAR ,
  b1.sal2001 sal2000to2002
FROM emp1 a,
  sal1 b1
WHERE a.empno=b1.empno
UNION
SELECT a.empno,
  a.empname,
  a.job,
  'sal2002' AS SALYEAR ,
  b1.sal2002 sal2000to2002
FROM emp1 a,
  SAL1 B1
WHERE a.empno=b1.empno;

    EMPNO EMPNAME                        JOB                            SALYEAR SAL2000TO2002
---------- ------------------------------ ------------------------------ ------- -------------
      7369 SMITH                          CLERK                          sal2000            10
      7369 SMITH                          CLERK                          sal2001            11
      7369 SMITH                          CLERK                          sal2002            12
      7499 ALLEN                          SALESMAN                       sal2000            20
      7499 ALLEN                          SALESMAN                       sal2001            21
      7499 ALLEN                          SALESMAN                       sal2002            22
      7521 WARD                           SALESMAN                       sal2000            21
      7521 WARD                           SALESMAN                       sal2001            22
      7521 WARD                           SALESMAN                       sal2002            23
      7654 MARTIN                         SALESMAN                       sal2000            23
      7654 MARTIN                         SALESMAN                       sal2001            24
      7654 MARTIN                         SALESMAN                       sal2002            25
      7698 BLAKE                          MANAGER                        sal2000            31
      7698 BLAKE                          MANAGER                        sal2001            32
      7698 BLAKE                          MANAGER                        sal2002            33
      7782 CLARK                          MANAGER                        sal2000            31
      7782 CLARK                          MANAGER                        sal2001            32
      7782 CLARK                          MANAGER                        sal2002            33
      7788 SCOTT                          ANALYST                        sal2000            25
      7788 SCOTT                          ANALYST                        sal2001            25
      7788 SCOTT                          ANALYST                        sal2002            25
      7839 KING                           PRESIDENT                      sal2000           100
      7839 KING                           PRESIDENT                      sal2001           110
      7839 KING                           PRESIDENT                      sal2002           120
      7844 TURNER                         SALESMAN                       sal2000            10
      7844 TURNER                         SALESMAN                       sal2001            11
      7844 TURNER                         SALESMAN                       sal2002            12
      7876 ADAMS                          CLERK                          sal2000            10
      7876 ADAMS                          CLERK                          sal2001            11
      7876 ADAMS                          CLERK                          sal2002            12
      7900 JAMES                          CLERK                          sal2000            17
      7900 JAMES                          CLERK                          sal2001            18
      7900 JAMES                          CLERK                          sal2002            19
      7902 FORD                           ANALYST                        sal2000            25
      7902 FORD                           ANALYST                        sal2001            26
      7902 FORD                           ANALYST                        sal2002            27
      7934 MILLER                         CLERK                          sal2000            10
      7934 MILLER                         CLERK                          sal2001            11
      7934 MILLER                         CLERK                          sal2002            12

39 rows selected.

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=18 Card=39 Bytes=23361)
   1    0   SORT (UNIQUE) (Cost=18 Card=39 Bytes=23361)
   2    1     UNION-ALL
   3    2       HASH JOIN (Cost=7 Card=13 Bytes=7709)
   4    3         TABLE ACCESS (FULL) OF 'SAL1' (TABLE) (Cost=3 Card=13 Bytes=338)
   5    3         TABLE ACCESS (FULL) OF 'EMP1' (TABLE) (Cost=3 Card=14 Bytes=7938)
   6    2       HASH JOIN (Cost=10 Card=26 Bytes=15652)
   7    6         TABLE ACCESS (FULL) OF 'EMP1' (TABLE) (Cost=3 Card=14 Bytes=7938)
   8    6         VIEW OF 'VW_JF_SET$7FC4F042' (VIEW) (Cost=6 Card=26 Bytes=910)
   9    8           SORT (UNIQUE) (Cost=6 Card=26 Bytes=676)
  10    9             UNION-ALL
  11   10               TABLE ACCESS (FULL) OF 'SAL1' (TABLE) (Cost=3  Card=13 Bytes=338)
  12   10               TABLE ACCESS (FULL) OF 'SAL1' (TABLE) (Cost=3  Card=13 Bytes=338)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         35  consistent gets
          0  physical reads
          0  redo size
       1563  bytes sent via SQL*Net to client
        457  bytes received via SQL*Net from client
          4  SQL*Net roundtrips to/from client
          2  sorts (memory)
          0  sorts (disk)
         39  rows processed

As you can see from query and execution plan we scanned table SAL1 three times. If we have to get report for 10 years we will have to scan SAL1 for ten times and so on.

11g SQL

SELECT a.empno,
  a.empname,
  a.job,
  c.property SALYEAR,
  c.value SAL2000TO2002
FROM emp1 a,
  (SELECT empno,
    property,
    value
  FROM sal1 unpivot exclude nulls (value FOR property IN (sal2000,sal2001,sal2002))
  ) c
WHERE a.empno=c.empno;

      EMPNO EMPNAME                        JOB                            SALYEAR SAL2000TO2002
---------- ------------------------------ ------------------------------ ------- -------------
      7369 SMITH                          CLERK                          SAL2000            10
      7369 SMITH                          CLERK                          SAL2001            11
      7369 SMITH                          CLERK                          SAL2002            12
      7499 ALLEN                          SALESMAN                       SAL2000            20
      7499 ALLEN                          SALESMAN                       SAL2001            21
      7499 ALLEN                          SALESMAN                       SAL2002            22
      7521 WARD                           SALESMAN                       SAL2000            21
      7521 WARD                           SALESMAN                       SAL2001            22
      7521 WARD                           SALESMAN                       SAL2002            23
      7654 MARTIN                         SALESMAN                       SAL2000            23
      7654 MARTIN                         SALESMAN                       SAL2001            24
      7654 MARTIN                         SALESMAN                       SAL2002            25
      7698 BLAKE                          MANAGER                        SAL2000            31
      7698 BLAKE                          MANAGER                        SAL2001            32
      7698 BLAKE                          MANAGER                        SAL2002            33
      7782 CLARK                          MANAGER                        SAL2000            31
      7782 CLARK                          MANAGER                        SAL2001            32
      7782 CLARK                          MANAGER                        SAL2002            33
      7788 SCOTT                          ANALYST                        SAL2000            25
      7788 SCOTT                          ANALYST                        SAL2001            25
      7788 SCOTT                          ANALYST                        SAL2002            25
      7839 KING                           PRESIDENT                      SAL2000           100
      7839 KING                           PRESIDENT                      SAL2001           110
      7839 KING                           PRESIDENT                      SAL2002           120
      7844 TURNER                         SALESMAN                       SAL2000            10
      7844 TURNER                         SALESMAN                       SAL2001            11
      7844 TURNER                         SALESMAN                       SAL2002            12
      7876 ADAMS                          CLERK                          SAL2000            10
      7876 ADAMS                          CLERK                          SAL2001            11
      7876 ADAMS                          CLERK                          SAL2002            12
      7900 JAMES                          CLERK                          SAL2000            17
      7900 JAMES                          CLERK                          SAL2001            18
      7900 JAMES                          CLERK                          SAL2002            19
      7902 FORD                           ANALYST                        SAL2000            25
      7902 FORD                           ANALYST                        SAL2001            26
      7902 FORD                           ANALYST                        SAL2002            27
      7934 MILLER                         CLERK                          SAL2000            10
      7934 MILLER                         CLERK                          SAL2001            11
      7934 MILLER                         CLERK                          SAL2002            12

39 rows selected.

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=13 Card=39 Bytes=23478)
   1    0   HASH JOIN (Cost=13 Card=39 Bytes=23478)
   2    1     TABLE ACCESS (FULL) OF 'EMP1' (TABLE) (Cost=3 Card=14 Bytes=7938)
   3    1     VIEW (Cost=9 Card=39 Bytes=1365)
   4    3       UNPIVOT
   5    4         TABLE ACCESS (FULL) OF 'SAL1' (TABLE) (Cost=3 Card=13 Bytes=676)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         17  consistent gets
          0  physical reads
          0  redo size
       1654  bytes sent via SQL*Net to client
        457  bytes received via SQL*Net from client
          4  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         39  rows processed

With unpivot function we scanned SAL1 only once and that means less resource utilization and better performance. So can we just forget about design? No as we have to maintain code every time we add another column to table. For eg we have to change code if we add sal2003, sal2004 column and so on i.e. [FROM sal1 unpivot exclude nulls (value FOR property IN (sal2000,sal2001,sal2002,sal2003,sal2004…..))]

The Silver is the New Black Theme. Blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.