Understanding the results of Execute Explain Plan in Oracle (Part -I-)

The output of EXPLAIN PLAN is a debug output from Oracle’s query optimiser. The COST is the final output of the Cost-based optimiser (CBO), the purpose of which is to select which of the many different possible plans should be used to run the query. The CBO calculates a relative Cost for each plan, then picks the plan with the lowest cost.

(Note: in some cases the CBO does not have enough time to evaluate every possible plan; in these cases it just picks the plan with the lowest cost found so far)

In general, one of the biggest contributors to a slow query is the number of rows read to service the query (blocks, to be more precise), so the cost will be based in part on the number of rows the optimiser estimates will need to be read.

For example, let’s say you have the following query:

SELECT emp_id FROM employees WHERE months_of_service = 6;

(The months_of_service column has a NOT NULL constraint on it and an ordinary index on it.)

There are two basic plans the optimiser might choose here: Plan 1. Read all the rows from the “employees” table, for each, check if the predicate is true (months_of_service=6). Plan 2. Read the index where months_of_service=6 (this results in a set of ROWIDs), then access the table based on the ROWIDs returned.

Let’s imagine the “employees” table has 1,000,000 (1 million) rows. Let’s further imagine that the values for months_of_service range from 1 to 12 and are fairly evenly distributed for some reason.

The cost of Plan 1, which involves a FULL SCAN, will be the cost of reading all the rows in the employees table, which is approximately 1,000,000. But since Oracle can often read the blocks using multi-block reads, the actual cost will be lower (depending on how your database is set up). For example, if the multi-block read count is 10, the calculated cost of the full scan will be 1,000,000 / 10, for an overall cost of 100,000.

The cost of Plan 2, which involves an INDEX RANGE SCAN and a table lookup by ROWID, will be the cost of scanning the index plus the cost of accessing the table by ROWID. I won’t go into how index range scans are costed, but let’s imagine the cost of the index range scan is 1 per row. We expect to find a match in 1 out of 12 cases, so the cost of the index scan is 1,000,000 / 12 = 83,333; the cost of accessing the table (one single-block read per access – we can’t use multi-block reads here) adds another 83,333, for an overall cost of 166,666.
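The arithmetic above can be reproduced directly; the multi-block read count of 10 and the per-row index cost of 1 are the assumptions made in the example, not real optimizer internals:

```sql
-- Plan 1: full scan with an assumed multi-block read count of 10
-- Plan 2: index range scan (1,000,000 / 12 rows) + one table access per ROWID
SELECT 1000000 / 10            AS full_scan_cost    -- 100,000
     , TRUNC(1000000 / 12) * 2 AS index_plan_cost   -- 166,666
  FROM dual;
```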

As you can see, the cost of Plan 1 (full scan) is LESS than the cost of Plan 2 (index scan + access by rowid) – which means the CBO would choose the FULL scan.

If the assumptions made here by the optimiser are true, then in fact Plan 1 will be preferable and much more efficient than Plan 2 – which disproves the myth that FULL scans are “always bad”.

The results would be quite different if the optimiser goal was FIRST_ROWS(n) instead of ALL_ROWS – in which case the optimiser would favour Plan 2 because it will often return the first few rows quicker, at the cost of being less efficient for the entire query.

  • FIRST_ROWS(n) – This hint instructs Oracle to optimize an individual SQL statement with a goal of best response time to return the first n rows, where n is any positive integer. The hint uses a cost-based approach for the SQL statement, regardless of the presence of statistics.
  • ALL_ROWS – This hint explicitly chooses the cost-based approach to optimize a SQL statement with a goal of best throughput.
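To illustrate, the goal can be set per statement with a hint in the statement text; this reuses the employees example above, and the value 10 is arbitrary:

```sql
-- Favour a plan that returns the first 10 rows quickly
-- (likely the index range scan in the scenario above)
SELECT /*+ FIRST_ROWS(10) */ emp_id
  FROM employees
 WHERE months_of_service = 6;
```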
One of the Best Ways to Archive a Table with “RETURNING”

The RETURNING clause is used to return specific columns from rows manipulated by DML statements.  When DML statements manipulate multiple rows, the data returned can be loaded into a collection using a bulk operation.  The returning_bulk_collect.sql script provides an example of this functionality.

returning_bulk_collect.sql

SET SERVEROUTPUT ON
DECLARE
  TYPE t_object_id_tab IS TABLE OF bulk_collect_test.object_id%TYPE;

  l_tab  t_object_id_tab;
BEGIN
  DELETE FROM bulk_collect_test
  RETURNING object_id BULK COLLECT INTO l_tab;

  DBMS_OUTPUT.put_line('Deleted IDs : ' || l_tab.count || ' rows');

  ROLLBACK;
END;
/

The returning_bulk_collect.sql script defines a collection, deletes the data from the test table and returns the OBJECT_IDs of the deleted rows.  To preserve the data, the script performs a rollback, allowing for multiple runs.  The output from this script shows that the collection is populated as expected.

SQL> @returning_bulk_collect.sql
Deleted IDs : 61202 rows
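For a DML statement that touches a single row, RETURNING … INTO works without BULK COLLECT. A minimal sketch (the object_id value 100 is hypothetical):

```sql
SET SERVEROUTPUT ON
DECLARE
  l_id  bulk_collect_test.object_id%TYPE;
BEGIN
  -- Single-row delete: RETURNING INTO a scalar, no collection needed
  DELETE FROM bulk_collect_test
  WHERE  object_id = 100
  RETURNING object_id INTO l_id;

  DBMS_OUTPUT.put_line('Deleted ID : ' || l_id);
  ROLLBACK;
END;
/
```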

The FORALL command can be used in conjunction with a DML statement with a RETURNING clause.  In this situation, both the DML and the return data are bulk operations.  The forall_returning_bulk_collect.sql script provides an example of this functionality.

forall_returning_bulk_collect.sql

SET SERVEROUTPUT ON
DECLARE
  TYPE t_object_id_tab IS TABLE OF bulk_collect_test.object_id%TYPE;

  l_in_tab   t_object_id_tab;
  l_out_tab  t_object_id_tab;
BEGIN
  -- Populate the collection used in the FORALL.
  SELECT object_id
  BULK COLLECT INTO l_in_tab
  FROM   bulk_collect_test
  WHERE  rownum < 101;

  FORALL i IN l_in_tab.first .. l_in_tab.last
    DELETE FROM bulk_collect_test
    WHERE  object_id = l_in_tab(i)
    RETURNING object_id BULK COLLECT INTO l_out_tab;

  DBMS_OUTPUT.put_line('Starting IDs : ' || l_in_tab.count || ' rows');
  DBMS_OUTPUT.put_line('Deleted IDs  : ' || l_out_tab.count || ' rows');

  ROLLBACK;
END;
/

The forall_returning_bulk_collect.sql script defines two collections, one to support the FORALL statement and one to support the RETURNING clause.  The first collection is populated to provide IDs of rows to be deleted. Next the bulk delete is performed and the data returned into the second collection.  Then the size of both collections is reported.  The output from this script is displayed below.

SQL> @forall_returning_bulk_collect.sql
Starting IDs : 100 rows
Deleted IDs  : 100 rows

PL/SQL procedure successfully completed.

XML representation of data in your table using the “dbms_xmlgen” package

If you want to get an XML representation of the data in your table, take a look at the dbms_xmlgen package. For example, you can write a simple function like this one:

SQL> create or replace function GenXML(p_query in varchar2, p_RSetTag in varchar2)
  2  return clob
  3  is
  4    l_xmlcntx dbms_xmlgen.ctxHandle;
  5    l_resxml clob;
  6  begin
  7    l_xmlcntx := dbms_xmlgen.newContext(p_query);
  8    dbms_xmlgen.setRowSetTag(l_xmlcntx,  p_RSetTag);
  9    l_resxml := dbms_xmlgen.getXML(l_xmlcntx);
 10    dbms_xmlgen.closeContext(l_xmlcntx);
 11    return l_resxml;
 12  end;
 13  /

Function created

And then use it as follows, passing a query against your table and a rowset tag as parameters.

SQL> select genxml('select * from employees where rownum = 1','EMPLOYEES') as XmlData
  2    from dual
  3  ;

XMLDATA
--------------------------------------------------------------------------------
<?xml version="1.0"?>
<EMPLOYEES>
 <ROW>
  <EMPLOYEE_ID>100</EMPLOYEE_ID>
  <FIRST_NAME>Steven</FIRST_NAME>
  <LAST_NAME>King</LAST_NAME>
  <EMAIL>SKING</EMAIL>
  <PHONE_NUMBER>515.123.4567</PHONE_NUMBER>
  <HIRE_DATE>17-JUN-03</HIRE_DATE>
  <JOB_ID>AD_PRES</JOB_ID>
  <SALARY>24000</SALARY>
  <DEPARTMENT_ID>90</DEPARTMENT_ID>
 </ROW>
</EMPLOYEES>

SQL> 
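The package offers further setters beyond setRowSetTag; for instance, setRowTag replaces the default <ROW> element around each row. A sketch (the tag name EMP is arbitrary):

```sql
SET SERVEROUTPUT ON
DECLARE
  l_ctx  dbms_xmlgen.ctxHandle;
  l_xml  CLOB;
BEGIN
  l_ctx := dbms_xmlgen.newContext('select * from employees where rownum = 1');
  dbms_xmlgen.setRowSetTag(l_ctx, 'EMPLOYEES');
  dbms_xmlgen.setRowTag(l_ctx, 'EMP');   -- each row becomes <EMP>...</EMP>
  l_xml := dbms_xmlgen.getXML(l_ctx);
  dbms_xmlgen.closeContext(l_ctx);
  DBMS_OUTPUT.put_line(l_xml);
END;
/
```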

Further Information

To display a master-detail type of data structure, you can use a CURSOR expression. For example:

SQL> select genxml('select department_id
  2       , department_name
  3       , cursor(
  4                 select first_name
  5                   from employees t
  6                  where t.department_id = d.department_id
  7           ) employees
  8    from departments d
  9    where rownum = 1','DEPARTMENTS') xmldata
 10    from dual
 11  ;

XMLDATA
--------------------------------------------------------------------------------
<?xml version="1.0"?>
<DEPARTMENTS>
 <ROW>
  <DEPARTMENT_ID>10</DEPARTMENT_ID>
  <DEPARTMENT_NAME>Administration</DEPARTMENT_NAME>
  <EMPLOYEES>
   <EMPLOYEES_ROW>
    <FIRST_NAME>Jennifer</FIRST_NAME>
   </EMPLOYEES_ROW>
  </EMPLOYEES>
 </ROW>
</DEPARTMENTS>

To eliminate the remaining opening and closing tags when the “detail” (the cursor) is empty, and to have more control over the tags, let’s rewrite our query using the SQL/XML functions, for example:

SELECT XMLElement("DEPARTMENTS"
                 , XMLAgg( XMLElement( "ROW"
                                     , XMLForest( t.department_id
                                                , t.department_name
                                                )
                                     , (
                                         SELECT XMLAgg(XMLElement("EMPLOYEES"
                                                                 , XMLForest (q.first_name)
                                                                  )
                                                        )
                                           FROM employees q
                                          WHERE q.department_id = t.department_id
                                            --and 100=101
                                        )
                                     )
                          )
                 )
  FROM departments t
 WHERE rownum < 3;

To store the result of the above query in a file, a directory object must first be created: CREATE DIRECTORY <name> AS <path>. Put the name of your directory in place of XMLDIR in the code below.
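For instance (the path and the HR grantee below are just examples; use a directory the database server can write to and your own schema name):

```sql
-- Directory object pointing at an OS path on the database server
CREATE OR REPLACE DIRECTORY xmldir AS '/u01/app/oracle/xml_out';

-- The writing schema needs privileges on the directory
GRANT READ, WRITE ON DIRECTORY xmldir TO hr;
```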

declare
  l_xml clob;

begin
  SELECT XMLElement("DEPARTMENTS"
                 , XMLAgg( XMLElement( "ROW"
                                     , XMLForest( t.department_id
                                                , t.department_name
                                                )
                                     , (
                                         SELECT XMLAgg(XMLElement("EMPLOYEES"
                                                                 , XMLForest (q.first_name)
                                                                  )
                                                        )
                                           FROM employees q
                                          WHERE q.department_id = t.department_id
                                            --and 100=101
                                        )
                                     )
                          )
                 ).getclobval() into l_xml
  FROM departments t
  where rownum < 3;

  dbms_xslprocessor.clob2file(l_xml, 'XMLDIR', 'XmlFile.xml');

end;
/

Result#1: When sub-query returns data

 <DEPARTMENTS>
  <ROW>
    <DEPARTMENT_ID>10</DEPARTMENT_ID> 
    <DEPARTMENT_NAME>Administration</DEPARTMENT_NAME> 
    <EMPLOYEES>
      <FIRST_NAME>Jennifer</FIRST_NAME> 
    </EMPLOYEES>
  </ROW>
  <ROW>
    <DEPARTMENT_ID>20</DEPARTMENT_ID> 
    <DEPARTMENT_NAME>Marketing</DEPARTMENT_NAME> 
    <EMPLOYEES>
      <FIRST_NAME>Michael</FIRST_NAME> 
    </EMPLOYEES>
    <EMPLOYEES>
      <FIRST_NAME>Pat</FIRST_NAME> 
    </EMPLOYEES>
  </ROW>
 </DEPARTMENTS>

Result#2: When sub-query returns no data

<DEPARTMENTS>
  <ROW>
    <DEPARTMENT_ID>10</DEPARTMENT_ID> 
    <DEPARTMENT_NAME>Administration</DEPARTMENT_NAME> 
  </ROW>
  <ROW>
    <DEPARTMENT_ID>20</DEPARTMENT_ID> 
    <DEPARTMENT_NAME>Marketing</DEPARTMENT_NAME> 
  </ROW>
  </DEPARTMENTS>

RUN TOAD FOR ORACLE ON MAC OS X

Prerequisites

  • CrossOver for Mac
  • GDI+ library (for Toad version 11.5 only)
  • Toad for Oracle MSI installer
  • Oracle Instant Client

As excited as I am that Toad runs on Mac within CrossOver, I have to temper things a bit by admitting that it has not been fully tested. You may also find that this may or may not work with your specific OS/software/version combination. So, until this is fully tested, verified, and officially supported by Quest, please understand that your mileage may vary.

Configuration

My System:

  • 13” MacBook Pro (2.5GHz Intel i5, 8GB RAM)
    • OS X Mountain Lion 10.8.3
  • CrossOver for Mac 11.0.3
    • Toad for Oracle 11.0 (Professional, commercial)
    • GDI+ library
    • Oracle Instant Client 11.2.0.3 for Windows (32-bit)

Toad for Oracle MSI installer
https://support.quest.com/Search/SoftwareDownloads.aspx?ProductID=268435520&ProductName=Toad%20for%20Oracle

Oracle Instant Client
Download the 32-bit Instant Client – Basic and Instant Client – SQL*Plus from http://www.oracle.com/technetwork/topics/winsoft-085727.html

CrossOver for Mac (www.codeweavers.com)
It’s about $50 for a license, but you can start with a 14-day trial, which gives you enough time to try it out.

Step 1: Install CrossOver for Mac
This is really straightforward. If you get stuck here, stop; this article probably isn’t for you.

Step 2: Install Toad for Oracle
Installation via CrossOver is pretty straightforward:

  1. Go to Configure > Install Software… and select “Unsupported Application”
  2. Next, select your Toad for Oracle MSI installer by clicking on the “Select an installer” section. Use the first option: “Choose Installer File”
  3. In the next section, choose “New winxp Bottle…” and give it a name
  4. Click Install and step through the Toad for Oracle installer.
  5. CAUTION: If you installed Toad for Oracle v11.5, you will need to install the GDI+ library first. Otherwise, CrossOver will crash when you launch Toad. You can download and install it from here: http://www.codeweavers.com/compatibility/browse/name/?app_id=8227. Make sure you install it into the same Bottle as Toad for Oracle 11.5.
  6. Once everything is installed, you should have a shiny new Toad icon in your Launchpad!

Step 3: Connectivity and Environment Variables

Now that we have Toad installed, we need to configure our instant client. Assuming you’ve already downloaded the two packages referenced above, go ahead and unzip them. I recommend placing them all together into a single directory like /instantclient_11_2.

Now create a /network/admin folder inside it. This is where we’ll place the tnsnames.ora file (which we have to create). Open your favorite text editor (mine’s TextWrangler) and create your tnsnames.ora file using the structure below. Be sure to replace hostname and SID with the appropriate data for your Oracle instance.

ORCL =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = hostname)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = SID)
    )
  )

CAUTION: If you’re using TextEdit, stop. TextEdit likes to default to RTF formatting which will wreak havoc with Oracle’s OCI. Save yourself the headache of troubleshooting connection problems and avoid using RTF format…

Once you have your instant client folder arranged and your tnsnames.ora file created, you’ll need to move it over to the C:\ drive of the CrossOver emulator. From CrossOver, click Configure > Manage Bottles… and select your new bottle. Click on the “Advanced” tab and then the “Open C: Drive in Finder” button.

Copy your shiny new /instantclient_11_2 folder to the root of C:\ as you see here:

Next, we need to run regedit to add a few environment variables. Back over to CrossOver, click Configure > Manage Bottles… and select your bottle. Click on the “Control Panel” tab and choose “Task Manager”.

In the Task Manager, click File > New Task (run…) and enter “regedit”

Now let’s add some keys. First, navigate to \HKEY_CURRENT_USER\Environment and add a new String Value named “PATH” with data “c:\instantclient_11_2”. (NOTE: If PATH already exists, just append c:\instantclient_11_2 to the end, separated by a semi-colon.)

Next, navigate to \HKEY_LOCAL_MACHINE\SOFTWARE and add a new key named “Oracle.” Then add a new String Value as “ORACLE_HOME” with data “c:\instantclient_11_2”. Lastly, add one more String Value as “TNS_ADMIN” with data “c:\instantclient_11_2\network\admin”.
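Equivalently, the three values can be captured in a .reg file and imported via regedit’s File > Import, which avoids retyping them on a reinstall. A sketch, assuming the same paths as above (backslashes are doubled per the .reg format):

```reg
Windows Registry Editor Version 5.00

[HKEY_CURRENT_USER\Environment]
"PATH"="c:\\instantclient_11_2"

[HKEY_LOCAL_MACHINE\SOFTWARE\Oracle]
"ORACLE_HOME"="c:\\instantclient_11_2"
"TNS_ADMIN"="c:\\instantclient_11_2\\network\\admin"
```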

Let’s check our configuration and see if we’ve got everything set up properly. From the CrossOver menu, click Programs > Run Command… In the window which opens, use the dropdown to select the appropriate Bottle, then expand Debug Options and click “Open Shell.” This will open a terminal window on the Mac and set the environment variables for CrossOver’s Bottle. To invoke the Windows Command Prompt, just type wine cmd. If all goes well you should see a familiar C:\> prompt. Type “sqlplus” to step into SQL*Plus and connect using the Net Service Name defined in your tnsnames.ora file.

If you can’t connect at this point, I think it’s a good idea to fiddle with your tnsnames.ora and double check your network settings before moving on.

Now that we’ve installed everything and verified connectivity, let’s fire up Toad for Oracle! You can use that shiny new Toad icon in your Mac OS X Launchpad or run it from CrossOver’s menu by clicking Programs > Quest Software > Toad for Oracle > Toad for Oracle 11.

The program seems to run fine for most things, though the painting of windows seems a little slow.

When should I use BULK COLLECT?

The primary use case for BULK COLLECT is any kind of loop that results in fetching data one row at a time, which Tom Kyte refers to as “slow by slow” processing.

You might think, therefore, that a cursor FOR loop is an obvious opportunity for conversion to BULK COLLECT. If you are running Oracle Database 10g Release 2 or higher, however, that is not necessary – as long as you have turned on Oracle’s automatic PL/SQL optimization to at least level 2.

Oracle Database 10g introduced automatic optimization of PL/SQL code. The default optimization level is 2 – and you should just leave it that way. At this level, Oracle will take many careful steps to transform your code so that it runs substantially faster.

One of the most interesting and non-trivial optimizations is that of cursor FOR loops. Consider this code:

BEGIN
   FOR rec IN (  SELECT *
                   FROM employees
               ORDER BY last_name DESC)
   LOOP
      DBMS_OUTPUT.put_line (rec.last_name);
   END LOOP;
END;
/

This block seems to fetch data from the employees table one row at a time and then display the last name. If, however, optimization is set to level 2 or higher, Oracle will automatically rewrite this block so that it fetches data in bulk, just as with BULK COLLECT.

As a result, you do not need to explicitly convert your cursor FOR loops to use BULK COLLECT if:

  1. You are running 10.2 or higher.
  2. Your optimization level is set to 2 or higher.
  3. The cursor FOR loop does not contain any DML statements (insert, update, delete).
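You can verify condition 2 yourself; a quick sketch (the data-dictionary views below are standard, the ALTER SESSION only affects units compiled afterwards):

```sql
-- Instance-wide default optimization level
SELECT value FROM v$parameter WHERE name = 'plsql_optimize_level';

-- Level each of your stored units was actually compiled with
SELECT name, type, plsql_optimize_level
  FROM user_plsql_object_settings;

-- Raise it for units compiled later in this session
ALTER SESSION SET plsql_optimize_level = 2;
```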

In other words, you definitely should convert code like the following block either to a cursor FOR loop (to benefit from automatic optimization) or an explicit BULK COLLECT:

DECLARE
   CURSOR employees_cur
   IS
      SELECT *
        FROM employees
      ORDER BY last_name DESC;

   l_employee   employees_cur%ROWTYPE;
BEGIN
   OPEN employees_cur;

   LOOP
      FETCH employees_cur INTO l_employee;

      EXIT WHEN employees_cur%NOTFOUND;
      DBMS_OUTPUT.put_line (l_employee.last_name);
   END LOOP;

   CLOSE employees_cur;
END;
/

Here is the explicit BULK COLLECT version:

DECLARE
   TYPE employees_aat IS TABLE OF employees%ROWTYPE;

   l_employees   employees_aat;
BEGIN
   -- All rows at once...
   SELECT *
     BULK COLLECT INTO l_employees
     FROM employees
   ORDER BY last_name DESC;

   DBMS_OUTPUT.put_line (l_employees.COUNT);

   FOR indx IN 1 .. l_employees.COUNT
   LOOP
      DBMS_OUTPUT.put_line (l_employees (indx).last_name);
   END LOOP;
END;
/

and here is the cursor FOR loop version (same as shown earlier):

BEGIN
   FOR rec IN (  SELECT *
                   FROM employees
               ORDER BY last_name DESC)
   LOOP
      DBMS_OUTPUT.put_line (rec.last_name);
   END LOOP;
END;
/

Clearly, the cursor FOR loop offers the simplest solution. Use it! You will find at the end of this QuickTip a script you can run to prove to yourself the impact of the optimization level on cursor FOR loop optimization.

I noted earlier that you will need to explicitly convert to BULK COLLECT if your loop (even a cursor FOR loop) contains one or more DML statements. The reason for this is simple: Oracle does not automatically optimize the performance of DML statements inside loops to use FORALL. So even if the cursor FOR loop is automatically “converted” to BULK COLLECT-like levels of performance, the DML statements will still run for each row individually.

Since DML overhead is much higher than querying, you will still likely need to convert those DML statements to use FORALL. And that means you will need to populate collections to “feed” into the FORALL. Which means you will need (or should) use BULK COLLECT to fill up those collections.

Here is an example of a procedure that should be converted explicitly to use both BULK COLLECT and FORALL:

CREATE OR REPLACE PROCEDURE upd_for_dept (
   dept_in     IN   employees.department_id%TYPE
 , newsal_in   IN   employees.salary%TYPE
)
IS
   CURSOR emp_cur
   IS
      SELECT employee_id, salary, hire_date
        FROM employees
       WHERE department_id = dept_in;
BEGIN
   FOR rec IN emp_cur
   LOOP
      BEGIN
         INSERT INTO employee_history
                     (employee_id, salary, hire_date)
              VALUES (rec.employee_id, rec.salary, rec.hire_date);

         rec.salary := newsal_in;

         adjust_compensation (rec.employee_id, rec.salary);

         UPDATE employees
            SET salary = rec.salary
          WHERE employee_id = rec.employee_id;
      EXCEPTION
         WHEN OTHERS
         THEN
            log_error;
      END;
   END LOOP;
END upd_for_dept;
/

So be on the lookout for any kind of loop in your code that contains DML statements. They will offer the greatest “bang for the buck” on improving application performance with relatively little effort on your part.
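One possible shape for the converted procedure is sketched below. It assumes the same employee_history table and adjust_compensation procedure as above; the procedure name, the LIMIT of 100, and the batching structure are my choices, not a finished implementation. Note that the per-row exception logging of the original would need FORALL ... SAVE EXCEPTIONS, which is omitted here for brevity:

```sql
CREATE OR REPLACE PROCEDURE upd_for_dept_bulk (
   dept_in     IN   employees.department_id%TYPE
 , newsal_in   IN   employees.salary%TYPE
)
IS
   CURSOR emp_cur
   IS
      SELECT employee_id, salary, hire_date
        FROM employees
       WHERE department_id = dept_in;

   TYPE emp_aat IS TABLE OF emp_cur%ROWTYPE;

   l_emps   emp_aat;
BEGIN
   OPEN emp_cur;

   LOOP
      FETCH emp_cur BULK COLLECT INTO l_emps LIMIT 100;

      EXIT WHEN l_emps.COUNT = 0;

      -- Bulk insert into the history table
      FORALL i IN 1 .. l_emps.COUNT
         INSERT INTO employee_history (employee_id, salary, hire_date)
              VALUES (l_emps (i).employee_id, l_emps (i).salary, l_emps (i).hire_date);

      -- Per-row business logic still runs row by row
      FOR i IN 1 .. l_emps.COUNT
      LOOP
         l_emps (i).salary := newsal_in;
         adjust_compensation (l_emps (i).employee_id, l_emps (i).salary);
      END LOOP;

      -- Bulk update
      FORALL i IN 1 .. l_emps.COUNT
         UPDATE employees
            SET salary = l_emps (i).salary
          WHERE employee_id = l_emps (i).employee_id;
   END LOOP;

   CLOSE emp_cur;
END upd_for_dept_bulk;
/
```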

BULK COLLECT Good to Knows

Keep the following factoids in mind when using BULK COLLECT:

  • Prior to Oracle9i Database, you could use BULK COLLECT only with static SQL. Now you can use BULK COLLECT with both dynamic and static SQL.
  • You can use BULK COLLECT keywords in any of the following clauses: SELECT INTO, FETCH INTO, and RETURNING INTO.
  • A collection populated by BULK COLLECT is always filled sequentially from index value 1 (or is empty).
  • If you are BULK COLLECTing into a varray or a nested table, you do not need to initialize or extend prior to executing the BULK COLLECT query. Oracle will take care of all of that for you.
  • SELECT…BULK COLLECT will not raise NO_DATA_FOUND if no rows are found. Instead, you must check the contents of the collection to see if there is any data inside it.
  • If the query returns no rows, the collection’s COUNT method will return 0.
  • As with any operation involving collections, use of BULK COLLECT will likely increase the amount of PGA (Process Global Area) memory used by an application session.
  • Use the LIMIT clause with BULK COLLECT to put a cap on the amount of PGA memory used by your query.
  • You can specify the limit (example in next section) as a literal, expression or parameter. I suggest you avoid hard-coding the value.
  • 100 is a good default or starting value for LIMIT. Setting the value higher, to say 500 or 1000, will probably not improve performance significantly, but will increase PGA consumption. For very large volumes of data, such as millions of rows, you should probably experiment with higher limit values to see what kind of impact you see.
  • You can fetch into a single collection of records or a series of collections (one for each expression returned in the SELECT list).

Here are some examples:

  1. Select all the rows from the employees table in a single “round trip” to the SQL engine:

DECLARE
   TYPE employees_aat IS TABLE OF employees%ROWTYPE;

   l_employees   employees_aat;
BEGIN
   -- All rows at once...
   SELECT *
     BULK COLLECT INTO l_employees
     FROM employees
   ORDER BY last_name DESC;

   DBMS_OUTPUT.put_line (l_employees.COUNT);

   FOR indx IN 1 .. l_employees.COUNT
   LOOP
      DBMS_OUTPUT.put_line (l_employees (indx).last_name);
   END LOOP;
END;
/

  2. Now limit the fetching to 10 rows at a time.

DECLARE
   CURSOR employees_cur
   IS
      SELECT *
        FROM employees
      ORDER BY last_name DESC;

   TYPE employees_aat IS TABLE OF employees%ROWTYPE;

   l_employees   employees_aat;
BEGIN
   OPEN employees_cur;

   LOOP
      FETCH employees_cur
         BULK COLLECT INTO l_employees
         LIMIT 10;

      EXIT WHEN l_employees.COUNT = 0;

      FOR indx IN 1 .. l_employees.COUNT
      LOOP
         DBMS_OUTPUT.put_line (l_employees (indx).last_name);
      END LOOP;
   END LOOP;

   CLOSE employees_cur;
END;
/

  3. Use BULK COLLECT with a dynamic query.

CREATE OR REPLACE PROCEDURE show_names (query_in IN VARCHAR2)
IS
   l_names   DBMS_SQL.varchar2_table;
BEGIN
   EXECUTE IMMEDIATE query_in BULK COLLECT INTO l_names;

   FOR indx IN 1 .. l_names.COUNT
   LOOP
      DBMS_OUTPUT.put_line (l_names (indx));
   END LOOP;
END;
/

  4. Use BULK COLLECT with a cursor variable.

CREATE OR REPLACE PROCEDURE show_names (cv_io IN OUT SYS_REFCURSOR)
IS
   l_names   DBMS_SQL.varchar2_table;
BEGIN
   LOOP
      FETCH cv_io
         BULK COLLECT INTO l_names
         LIMIT 10;

      EXIT WHEN l_names.COUNT = 0;

      FOR indx IN 1 .. l_names.COUNT
      LOOP
         DBMS_OUTPUT.put_line (l_names (indx));
      END LOOP;
   END LOOP;

   CLOSE cv_io;
END;
/

DECLARE
   cv   SYS_REFCURSOR;
BEGIN
   OPEN cv FOR 'select last_name from employees';

   show_names (cv);
END;
/

Using LIMIT with BULK COLLECT

Oracle provides a LIMIT clause for BULK COLLECT that allows you to limit the number of rows fetched from the database. The syntax is:

FETCH cursor BULK COLLECT INTO … [LIMIT rows];

where rows can be any literal, variable, or expression that evaluates to an integer (otherwise, the database will raise a VALUE_ERROR exception).

LIMIT is very useful with BULK COLLECT, because it helps you manage how much memory your program will use to process data. Suppose, for example, that you need to query and process 10,000 rows of data. You could use BULK COLLECT to retrieve all those rows and populate a rather large collection. However, this approach will consume lots of memory in the PGA for that session. If this code is run by many separate Oracle sessions, your application performance may degrade because of PGA swapping.

The following block of code uses the LIMIT clause in a FETCH that is inside a simple loop.

DECLARE
   l_limit   PLS_INTEGER := 100;

   CURSOR allrows_cur IS SELECT * FROM employees;

   TYPE employees_aat IS TABLE OF allrows_cur%ROWTYPE
      INDEX BY BINARY_INTEGER;

   l_employees   employees_aat;
BEGIN
   OPEN allrows_cur;

   LOOP
      FETCH allrows_cur BULK COLLECT INTO l_employees LIMIT l_limit;

      /* Process the data by scanning through the collection. */
      FOR l_row IN 1 .. l_employees.COUNT
      LOOP
         upgrade_employee_status (l_employees (l_row).employee_id);
      END LOOP;

      EXIT WHEN allrows_cur%NOTFOUND;
   END LOOP;

   CLOSE allrows_cur;
END;
/

Notice that I terminate the loop by checking the value of allrows_cur%NOTFOUND at the bottom of the loop. When querying data one row at a time, we usually put this code immediately after the FETCH statement. You should not do that when using BULK COLLECT, because when the fetch retrieves the last set of rows, the cursor will be exhausted (and %NOTFOUND will return TRUE) but you will still have some elements in the collection to process.

So either check the %NOTFOUND attribute at the bottom of your loop, or check the contents of the collection immediately after the fetch:

LOOP
FETCH allrows_cur BULK COLLECT INTO l_employees LIMIT l_limit;
EXIT WHEN l_employees.COUNT = 0;

The disadvantage of this second approach is that you will perform an extra fetch that returns no rows, compared to checking %NOTFOUND at the bottom of the loop body. The performance impact of this extra fetch should, however, be minimal.