Collections, Cursors, Bulk Binds and FORALL

Once in a while, I read something about Oracle that stops me in my tracks and makes me really think about how I approach my job. Recent examples include starting to work with 9iAS, and slowly becoming aware of how much I'm going to have to get my head around the role Java and middleware is going to have in future Oracle applications. Another was when I began studying for my OCP, and began to understand how, for any system to be effective, you need to really have a good understanding of how Oracle works internally. The latest example came about from reading a recent thread started by Daniel Morgan on comp.databases.oracle.server.

The initial posting asked the question:

"At a class I taught this last weekend I brought up the fact that most PL/SQL programmers are still writing v7 code. I gave everyone there a challenge and thought I'd share it with the group for any of you looking for a challenge on which to sharpen your skills.

CREATE TABLE t1 AS
SELECT *
FROM all_objects
WHERE 1=0;

CREATE OR REPLACE PROCEDURE test_proc IS

BEGIN
    FOR x IN (SELECT * FROM all_objects)
    LOOP
        INSERT INTO t1
        (owner, object_name, subobject_name, object_id,
        data_object_id, object_type, created, last_ddl_time,
        timestamp, status, temporary, generated, secondary)
        VALUES
        (x.owner, x.object_name, x.subobject_name, x.object_id,
        x.data_object_id, x.object_type, x.created,
        x.last_ddl_time, x.timestamp, x.status, x.temporary,
        x.generated, x.secondary);
    END LOOP;
COMMIT;
END test_proc;
/

set timing on
exec test_proc;
set timing off

Everyone using 8i+ features should be able to improve the performance of
this by at least 5X.

I'll post a solution in a week or so."

The bit that hit home was the comment about most PL/SQL programmers still writing v7 code. Thinking about it, that's one I'm guilty of.

The sort of work I do involves knowing as much as possible about as many Oracle products as possible. One week I'm tuning up a Discoverer installation, next week I'm building a data model for a first-phase data warehouse. Large parts of my work involve working out which Oracle products are best suited to a potential application, and the nature of the job is that you thoroughly learn something for a particular project, then move on and rely on that knowledge for some time afterwards. On average, I usually know more about a particular Oracle product than most people, but I'm the first to admit that I'm no expert and there's always room to learn.

As Daniel Morgan points out, PL/SQL has come on considerably since version 2.3 that came with Oracle 7. One of the major areas of improvement has been in the area of arrays and PL/SQL Tables, and in fact this area is now referred to in Oracle 8i, 9i and now 10g as 'Collections'. Together with the way cursors are now handled, there's now much more efficient ways of bulk processing large arrays of data, and it's worth taking some time out to look at how things have developed.

Going back to the original thread, and discarding the approach of just using a straight insert (*/ append */)  into ... select ... from all_objects (which of course would be the fastest, as it's just doing the insert using a straight SQL set operation) , the answer as provided by Daniel was as follows;

"I was thinking in terms of some variation on the following:

CREATE OR REPLACE PROCEDURE fast_proc (p_array_size IN PLS_INTEGER DEFAULT 100)
IS

TYPE ARRAY IS TABLE OF all_objects%ROWTYPE;
l_data ARRAY;

CURSOR c IS
SELECT *
FROM all_objects;

BEGIN
    OPEN c;
    LOOP
    FETCH c BULK COLLECT INTO l_data LIMIT p_array_size;

    FORALL i IN 1..l_data.COUNT
    INSERT INTO t2 VALUES l_data(i);

    EXIT WHEN c%NOTFOUND;
    END LOOP;
    CLOSE c;
END fast_proc;
/

of which many possibilities exist. One of which Billy V posted. The point I would hope more junior developers take away from this is that while cursors definitely have their uses ... they should not be the first thing one thinks of any more. Ok ... shouldn't be the first thing after the 'obvious' DML statement."

Another solution proposed by Billy Verreynne was even more compact, and gave a threefold increase in performance.

"My attempt gives me a 3x improvement in performance....

SQL> create or replace procedure fast_proc is
2         type TObjectTable is table of ALL_OBJECTS%ROWTYPE;
3         ObjectTable$ TObjectTable;
4         begin
5         select
6                     * BULK COLLECT INTO ObjectTable$
7         from ALL_OBJECTS;
8
9         forall x in ObjectTable$.First..ObjectTable$.Last
10       insert into t1 values ObjectTable$(x) ;
11       end;
12 /

Procedure created."

So, what are the new features that the two solutions are using, and why do they give such an increase in performance? I decided to take a bit of time out and do some more studying.

The first place to look for details of current PL/SQL syntax is the online manuals available at docs.oracle.com. In this instance, the document we're after is the PL/SQL User's Guide and Reference Release 2 (9.2), and in particular, the sections on PL/SQL Collections and Records, Managing Cursors., Bulk Binds and the FORALL statement. I also took a look at a book we got hold of recently, "Oracle 9i PL/SQL Programming", by Scott Urman.

Over the next few days, I'll be looking at each of these areas in turn, starting tomorrow with PL/SQL Collections and Records.