top of page
Search

Full Table Scan in Oracle

Updated: Nov 25, 2020






There is a common misbelief amongst us that FTS are evil, In one of my earlier blog i tried to give some insights on FTS and leverage a potential performance gain using a FTS. This post is dedicated on FTS and ways we can gain significant performance using a FTS.


Before we kickoff with use cases, lets first try to understand, why oracle uses full table scan (fts) sometimes instead of using index, where you have index.


A FTS perform sequential reads, many blocks simultaneously from disk while an index scan, which we always advocate for, reads single block random IO’s. Sequential reads is the fastest type of IO since it reads many blocks at a time. Please don’t misapprehend me, I am not trying to say that index scan is not fast but everything depends on the requirement. If we are searching for large data set from a table say for more than 25-30% of the records FTS would be fast.

Imagine a Table has 1000 blocks with 1000 values and we query to read 250 rows(25%) from that table using index. Let us suppose a buffer cache can hold 100 blocks. Index scan in this case might tell, go to block 1 get the row, go to block 2 get the row ….. and so, on go to block n get the row. Imagine buffer cache does not hold 1- 100 block instead it contains 101-200, we have still 50 rows to read. That’s pain it might tell to go back to block again and re-read the block again from the disk that we already read since we don’t have that in buffer cache. Oracle CBO is smart enough to use FTS instead of using index scan since FTS will take less IO more efficiently.


The Selectivity of index depends also depends on ratio of the (DISTINCT_KEYS/NUM_ROWS). The ideal selectivity is 1 in case of unique index on not null column.


Now lets now set up the tables for our use case, I am attaching the script for insert script. While the table DDL below.


-- Create table
create table CUSTOMERS
(
  cust_id                NUMBER,
  cust_first_name        VARCHAR2(20),
  cust_last_name         VARCHAR2(40),
  cust_gender            CHAR(1),
  cust_year_of_birth     NUMBER(4),
  cust_marital_status    VARCHAR2(20),
  cust_street_address    VARCHAR2(40),
  cust_postal_code       VARCHAR2(10),
  cust_city              VARCHAR2(30),
  cust_state_province    VARCHAR2(40),
  country_id             CHAR(2),
  cust_main_phone_number VARCHAR2(25),
  cust_income_level      VARCHAR2(30),
  cust_credit_limit      NUMBER,
  cust_email             VARCHAR2(30)
);
-- Create/Recreate indexes 
create index IDX_CTRY_ID on CUSTOMERS (COUNTRY_ID)
  tablespace SYSAUX
  pctfree 10
  initrans 2
  maxtrans 255
  storage
  (
    initial 64K
    next 1M
    minextents 1
    maxextents unlimited
  );
create index OBE_CUST_CRED_LMT_IND on CUSTOMERS(CUST_CREDIT_LIMIT)
  tablespace SYSAUX
  pctfree 10
  initrans 2
  maxtrans 255
  storage
  (
    initial 64K
    next 1M
    minextents 1
    maxextents unlimited
  );


1
.7z
Download 7Z • 26KB

the above script inserts 600+ records and we have index on country_id colum so when we run below sql we can expect index range scan to happen as a common understanding.


Note: If you don't gather the statistics dynamic sampling may show you a index range scan. It's recommended to gather statistics for the optimizer to run effectively




SQL> SELECT *
     FROM table(DBMS_XPLAN.DISPLAY (FORMAT=>'ALL +OUTLINE'));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 2008213504

-------------------------------------------------------------------------------
| Id  | Operation         | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |           |   165 | 22935 |     5   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| CUSTOMERS |   165 | 22935 |     5   (0)| 00:00:01 |
-------------------------------------------------------------------------------

You might get surprised when you see a FTS, it was a indexed column yet the optimizer uses a FTS. Optimizer are smart enough to pick the perfect plan for you.


Remember to gather the statistics if you want the optimizer to work efficiently.


BEGIN
dbms_stats.Gather_table_stats('HR', 'CUSTOMERS',
method_opt => 'for all columns size skewonly');
END;

Let's now run the same query on different country id, lets check which has the least no of record .



select count(*), country_id
  from customers
 group by country_id
having count(*) < 10;
---------------------------------------------------------------------------------------------------------

1	2	AR
2	4	ZA
3	7	MY
4	3	SA
5	6	JP
6	3	DK
7	6	BR
8	2	NZ
9	1	TR



Explain plan for
  2  select * from customers where country_id = 'TR';
SQL> 
SQL> SELECT *
  2       FROM table(DBMS_XPLAN.DISPLAY (FORMAT=>'ALL +OUTLINE'));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 3201264713
--------------------------------------------------------------------------------
| Id  | Operation                           | Name        | Rows  | Bytes | Cost
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |             |     1 |   139 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| CUSTOMERS   |     1 |   139 |
|*  2 |   INDEX RANGE SCAN                  | IDX_CTRY_ID |     1 |       |
--------------------------------------------------------------------------------


Great this time we can see the index being used as per our expectation, Its exactly what we started our blog with- When looking for large data set FTS is better in performance and CBO is smart enough to pick it even when we have a indexed column. In real world example while dealing with projects we have millions of data, FTS can be weary in such cases as an option to reduce the pain of performance while running a FTS on large data set Oracle has come up with a - Parallel hint-


Before we end this post lets see how fast can a parallel hint be, lets run the same query for country_id = 'US'



select /*+ Parallel (c,22) */ *
  from customers c
 where country_id = 'US';
 Plan Hash Value  : 2487033814 

-----------------------------------------------------------------------------
| Id  | Operation              | Name      | Rows | Bytes | Cost | Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |           |  165 | 22935 |    2 | 00:00:01 |
|   1 |   PX COORDINATOR       |           |      |       |      |          |
|   2 |    PX SEND QC (RANDOM) | :TQ10000  |  165 | 22935 |    2 | 00:00:01 |
|   3 |     PX BLOCK ITERATOR  |           |  165 | 22935 |    2 | 00:00:01 |
| * 4 |      TABLE ACCESS FULL | CUSTOMERS |  165 | 22935 |    2 | 00:00:01 |
-----------------------------------------------------------------------------

You can easily identify significant effect on the cost of the query.




24 views0 comments

Recent Posts

See All
Post: Blog2_Post
bottom of page