Friday, January 24, 2025

QTEMP Performance Loses Again

In the past, I’ve written about the inefficiencies caused by referencing QTEMP tables in your query requests. Due to the performance overhead of this approach, our IBM Expert Labs team is often employed by clients to help them reduce the number of queries using QTEMP. 

While working with a customer recently, I was reminded of one reason where developers in the past were forced to use QTEMP queries with stored procedures. While Db2 for i has supported stored procedures which return result sets for a very long time, there were many releases where those result sets were not available to invokers using embedded SQL. If the invoking program wanted to consume a result set returned by a stored procedure, the program had to use an API-based interface like the SQL Call Level Interface (CLI) or JDBC.  So, if you were an RPG or COBOL programmer you could use embedded SQL to call a stored procedure, but you were out of luck when it came to using embedded SQL to consuming the contents of a result set returned by a stored procedure.

 

This limitation was what drove some developers to use QTEMP queries. The basic idea is that if the stored procedure copied the contents of its result set into a QTEMP table, then the invoker of the stored procedure just needed to know the name of the QTEMP table in order to be able to consume the result set with a simple query.

 

Here’s an example of a logic from a stored procedure using the QTEMP technique to return a result set to the invoker.


...
-- Create QTEMP result set table if it doesn't exist in job yet
DECLARE CONTINUE HANDLER FOR SQLSTATE '42704'
  CREATE TABLE qtemp.results
   AS (SELECT ordID, ordCustID FROM orders
           WHERE ordDate = inputDate) WITH NO DATA;

-- Clear our result set rows from prior call
DELETE FROM qtemp.results;

-- Populate with result set rows
INSERT INTO qtemp.results
 (SELECT ordID, ordCustID FROM orders 
           WHERE ordDate = inputDate);
...

 

 

The invoker would then declare a cursor over the QTEMP table to consume the result set with this simple query: SELECT ordID, ordCustID FROM qtemp.results. This query is one reason that the QTEMP solution has slower performance. The application needs to one run query to populate the QTEMP table and a second query to unload the QTEMP table.

 

The good news is that there’s a simpler and faster solution which has been available since the IBM i 7.1 release. This solution is made possible through Db2’s support for result set locator variables. Think of result set locator variables like pointers. These result set locators enable the stored procedure invoker to simply point at the result set returned by the procedure in order to consume it. This approach is more efficient since it doesn’t require a copy of the result set to be made like the QTEMP approach.

 

Using the result set locators support, a stored procedure uses an SQL cursor to make the result set contents available to the invoker. The following SQL demonstrates how that is done.

CREATE PROCEDURE GetOrders()
LANGUAGE SQL
DYNAMIC RESULT SETS 1
BEGIN 

DECLARE ordCursor CURSOR WITH RETURN FOR
 (SELECT ordID, ordCustID FROM orders 
           WHERE ordDate = inputDate);

OPEN ordCursor;

END

 

The consumption of this stored procedure’s result set is enabled by result set locators as shown in the following snippet of SQL code.

DECLARE rs1 DECLARERESULT_SET_LOCATOR VARYING;  

ASSOCIATE LOCATOR(rs1) WITH PROCEDURE GetOrders;

ALLOCATE rs1Cursor CURSOR FOR RESULT SET rs1;
   
FETCH rs1Cursor INTO ordIdVar, custIdVar;
...

 

After declaring variable rs1 as a result set locator, the first step is executing the ASSOCIATE LOCATOR statement to point the rs1 variable at the result set returned by the GetOrders stored procedure.  The ALLOCATE CURSOR statement is the final step that needs to be performed since a cursor is required to retrieve the rows contained in the result set. In this example, the a cursor named rs1Cursor is allocated to fetch the rows from the stored procedure result.

Now that the two different coding methodologies have been compared, let’s talk about performance. In my testing, I measured the performance of repeated calls to the stored procedure in a single job. With distributed applications, it’s very common for the same stored procedure to be called over and over again within a database connection or job. My performance tests showed the QTEMP-based solution to be 2-4.5x slower than the solution using result set locators.  The result set locator performance advantage grew as the number of stored procedure calls increased. When comparing the performance of 25 calls, the result set locator solution was 2x faster. The performance advantage of result set locators grew to 3.8x faster with 150 procedure calls and to 4.5x faster with 300 procedure calls.

 

In my performance tests, the stored procedure result set contained 25 rows. Thus, each call to the QTEMP-based stored procedure results in 25 rows being copied into the QTEMP table. The QTEMP solution performance would slow even further with result sets that are larger in size. In addition, if the stored procedures were being called in multiple jobs, the QTEMP solution would suffer from the Plan Cache inefficiencies that I highlighted in my original QTEMP article.

 

There are some cases where a QTEMP-based solution is the winning choice, but hopefully it’s now clear that using a QTEMP table to return a stored procedure result set is clearly not one of those situations.

2 comments:

  1. Where is this Cursor closed?

    ReplyDelete
    Replies
    1. After the stored procedure invoker is done fetching rows, that program would close the cursor with a CLOSE rs1Cursor statement.

      Delete