Monday, February 16, 2026

Finding Inline Candidates - Part II

Last month, I reviewed how to use a system catalog object to identify user-defined functions that are candidates for inlining because they are small and simple. Another method is to identify the user-defined functions that are used most frequently on your system. With this approach, you maximize the performance impact of any user-defined function (UDF) that can be made inline eligible, because that UDF is run so frequently.

This methodology requires a Plan Cache Snapshot. A Plan Cache Snapshot is a point-in-time copy of the data from the “live” Plan Cache. When a snapshot is created, the result is a Db2 table. The rows in this table contain information about the query implementation along with query execution statistics. The query execution statistics include a count of the number of UDF executions that were not inlined for each SQL statement. Given that this execution information is stored in a Db2 table, it is relatively simple to write a query to find these candidate UDFs.
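
If you'd rather create the snapshot with SQL instead of ACS, here's a minimal sketch using the QSYS2.DUMP_PLAN_CACHE procedure. The library and table names are hypothetical, chosen to match the query later in this post:

-- Dump the current Plan Cache into a snapshot table
-- (SNAPLIB/SNAPSHOT_TABLE are placeholder names)
CALL QSYS2.DUMP_PLAN_CACHE('SNAPLIB', 'SNAPSHOT_TABLE');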

The first piece of information you’ll need for your query is the name of the snapshot table. If the IBM i Access Client Solutions (ACS) SQL Performance Center was used to create a snapshot, then you’ll need to figure out the name it generated for the snapshot object. This task is easily accomplished by bringing up the ACS SQL Performance Center and navigating to the Plan Cache Snapshot tab as shown in the figure below.


Now that we have the name of the table, let’s look at the query it will be plugged into. The QQSMINT8 column is the key column for our inline UDF candidate analysis – this is the column that contains the number of times that an SQL statement invoked a UDF that is currently not being inlined by the Db2 engine. The query uses this column on the WHERE clause to return only those queries which used non-inlined UDFs, in addition to returning the non-inlined UDF count in the final result set.

SELECT qqsmint8 AS SQL_UDFs_Not_Inlined, qvp15f AS Times_Run,
       CASE
         WHEN LENGTH(RTRIM(qqc104)) = 0 THEN 'No Program'
         ELSE RTRIM(qqc104) CONCAT '/' CONCAT qqc103 END AS Pgm, 
        qq1000 AS sql_stmt
  FROM snaplib.snapshot_table
  WHERE qqrid = 1000 AND qqsmint8 > 0
  ORDER BY qvp15f DESC

 

The other column referenced on the WHERE clause, QQRID, is also critical because the 1000 record type marks the rows in the snapshot table which contain the query execution statistics. Most of the other record types contain data regarding the query implementation. The SELECT list also includes the QVP15F column, which contains the number of times the query was run while the query plan resided in the Plan Cache. This count helps you understand how many times the non-inlined UDFs are being invoked. The more times that these UDFs are run, the bigger the performance impact will be on your system if you’re able to make the UDF inline eligible.

 

Here's an example of the output this query can return using a snapshot on my system.  It only found one query that’s dependent on non-inlined UDFs.  However, that query has been run over 500 times to this point.

The QQ1000 column contains the query text, so you can review it and find the names of the non-inlined UDFs for further analysis. Hopefully, this method, used along with the previously discussed method, will help you identify user-defined functions that can be made inline eligible in order to improve the performance of your applications and servers.
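
If you already suspect a particular function, a hedged variation of the query narrows the search by scanning the statement text (the CALC1 function name here is just a placeholder):

SELECT qvp15f AS Times_Run, qq1000 AS sql_stmt
  FROM snaplib.snapshot_table
  WHERE qqrid = 1000 AND qqsmint8 > 0
    AND UPPER(qq1000) LIKE '%CALC1%'
  ORDER BY qvp15f DESC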


Tuesday, January 20, 2026

Finding Inline Candidates

 A couple of months ago, I highlighted the performance benefits of Db2’s ability to inline calls to a user-defined function (UDF). While I discussed how to determine whether or not a UDF is inline eligible, I didn’t cover methods to identify UDFs that are candidates to make inline eligible.

Db2’s main requirement for inlining UDFs is that the function logic must be simple. Inlining is not an option for UDFs with complex logic which requires many SQL statements. Therefore, it would make sense to find the UDFs with a small number of SQL statements to determine if they can be made inline eligible. The good news is that the catalogs and services provided by Db2 for i make this candidate UDF search quite easy.

 

The following query uses the SYSFUNCS catalog view, which contains a row for every UDF, to find the functions that are currently not being inlined. This data set is then joined to the SYSPROGRAMSTAT catalog view, which returns a row for every program object that contains an embedded SQL statement. One key thing to note is that this view uses an underlying service to dynamically generate this list of program objects – the SYSPROGRAMSTAT view is not just reading rows from a table like other Db2 catalog views in QSYS2. Thus, it’s critical for performance to limit the scope of the program objects by searching for SQL program objects in a specific library (i.e., system_program_schema='KMEXAMPLE') instead of having Db2 generate a row for every SQL program object on the system.  A couple years ago, I highlighted the importance of limiting scope for all IBM i services.


SELECT routine_schema, routine_name, number_statements
  FROM qsys2.sysfuncs f INNER JOIN
       qsys2.sysprogramstat p
    ON p.system_program_name = specific_name AND
       p.program_schema = routine_schema
  WHERE f.inline = 'NO' AND
        p.system_program_schema = 'KMEXAMPLE' AND
        p.number_statements <= 5 AND
        p.routine_type = 'FUNCTION' AND
        p.routine_body = 'SQL'
  ORDER BY number_statements

 

Also notice that the SYSPROGRAMSTAT view returns a count of the number of SQL statements embedded in the program object in the NUMBER_STATEMENTS column. This column makes it easy to find those UDFs whose logic is composed of a small number of SQL statements. On my system, this query returns the following result set, which can now be used to determine if any of these simpler UDFs can be modified to make them inline eligible.

The first step would be reviewing the source of the UDF to determine the feasibility of making the function inline eligible.  The following source code for the CALC1 UDF reveals that its logic contains one of the key inhibitors to inline eligibility – an intermediate result variable (i.e., OUT1).

CREATE FUNCTION  CALC1 ( P1 CHAR(4) , P2 CHAR(2) , P3 CHAR(2) ) 
RETURNS CHAR(10)   
LANGUAGE SQL 
DETERMINISTIC 
NOT FENCED
BEGIN 
  DECLARE OUT1 CHAR (10) ; 
  SET OUT1 = CHAR(SUBSTR(P1, 1, 2), 2) CONCAT P2 CONCAT P3; 
  SET OUT1 = REPLACE(OUT1, 'A', 'X') ; 
  RETURN OUT1 ; 
END 

 

An intermediate result variable requires an SQL statement to assign a value to it.  UDFs can only be inlined if they contain a single RETURN statement, so the assignment statement for the intermediate result is a showstopper.  That’s the bad news – the good news is that frequently these intermediate result statements aren’t needed.  Often programmers just utilize these intermediate results to simplify the coding logic without understanding the performance impacts. For this UDF, the intermediate result variable and assignments aren’t needed if the UDF logic is updated to the following:

 

CREATE FUNCTION  CALC1 ( P1 CHAR(4) , P2 CHAR(2) , P3 CHAR(2) ) 
RETURNS CHAR(10)   
LANGUAGE SQL 
DETERMINISTIC 
NOT FENCED
BEGIN 
  RETURN REPLACE( 
           (CHAR(SUBSTR(P1, 1, 2), 2) CONCAT P2 CONCAT P3), 
           'A', 'X') ; 
END 

 

An alternative route to identify which UDFs to target for inlining is to find the UDFs that are called most frequently on your system but are not being inlined by Db2. Plan Cache Snapshots can be leveraged for this type of analysis, but you’ll have to come back to the blog next month to learn about that approach. In the interim, you have a method for identifying the UDFs with simple logic on your system that are the easiest candidates for coding changes to make them eligible for inlining by Db2.


Tuesday, December 16, 2025

SQL FOR Statement Delivers

Christmas is near, which means it is gift-giving season, so I thought I’d highlight a powerful capability that the SQL procedural language delivers to programmers – the FOR statement.

The SQL procedural language (SQL PL) itself was a powerful addition to the SQL Standard because it enables developers to code both their data access and application logic entirely in SQL.  No need to learn the syntax for embedding SQL in a high-level programming language such as RPG or COBOL – all the coding can be done with SQL.

 

For the most part, the SQL PL support mirrors other programming languages with support for variables, assignment statements, conditional statements, and looping constructs. While other languages support a for loop, the SQL FOR statement is unique in that it streamlines the use of a cursor to process the rows returned by a SELECT statement.

 

In a high-level programming language, the use of a cursor requires a specific logic flow that looks like the following:

DECLARE pgm1cur CURSOR FOR
  SELECT colPK, col2 FROM table1 WHERE col3 > 100;

OPEN pgm1cur;

FETCH pgm1cur INTO :host1, :host2;

while SQLCODE = 0
   /* application logic using colPK and col2 values
      stored in host1 & host2 variables */
   FETCH pgm1cur INTO :host1, :host2;
end;

CLOSE pgm1cur;

The cursor needs to be declared and opened. Next, the FETCH statement is used with a looping construct to process each row returned by the cursor and copy the values needed for the application logic into local variables.

The coding for cursor processing is nothing complex, but it is somewhat tedious. The SQL FOR statement stands out because it reduces the amount of code that needs to be created and maintained. As the following example shows, the coding load is reduced because the FOR statement eliminates the need for opening, advancing, and closing the cursor along with the need to copy column values into host variables.

FOR myloop AS pgm1cur CURSOR FOR
  SELECT colPK, col2 FROM table1 WHERE col3>100
DO
     /* application logic using 
        colPK and col2 values */ 

END FOR;

The FOR statement makes cursor usage so simple that one can easily overlook that they’re even using a cursor – let’s look at a customer example from a recent IBM Expert Labs engagement which reinforces this.

A client needed help optimizing the performance of their SQL stored procedures and one of their stored procedures contained the following FOR statement:

FOR myloop AS proc1Cur CURSOR FOR
  SELECT colPK, col1, col2 FROM table1 WHERE rowProcessed ='N'
DO
    SET currentPK = colPK;

     /* application logic using col1 and col2 values */

     UPDATE table1 SET rowProcessed='Y' WHERE colPK = currentPK;
END FOR;


The client’s developer coded a separate UPDATE statement to update the row which the loop was currently processing because they lost track of the fact that the FOR statement provides a cursor that can be used for a positioned update.  While the client’s FOR statement worked functionally, performance was sub-optimal because the separate UPDATE statement must search for and find the row that needs to be updated.

 

The positioned update used in this optimized version of the FOR statement eliminates the search and find overhead by just updating the row that the cursor is  currently pointing to.

FOR myloop AS proc1Cur CURSOR FOR
  SELECT colPK, col1, col2 FROM table1 WHERE rowProcessed ='N'
   FOR UPDATE OF rowProcessed
DO
    SET currentPK = colPK;

     /* application logic using col1 and col2 values */

    UPDATE table1 SET rowProcessed='Y' WHERE CURRENT OF proc1Cur;
END FOR;


One performance test that I ran showed the positioned update to be two times faster. I’m guessing that a 2x performance increase would be pretty high on the Christmas wish list for most programmers!

 

That's a wrap for 2025 -  Merry Christmas and I look forward to interacting with all of you in the new year!

Tuesday, November 25, 2025

Thankful for UDF Inlining

 This week we celebrate the Thanksgiving holiday in the US, so I decided to write about an overlooked Db2 enhancement that I’m grateful to have as a tool in the toolbox when our IBM Expert Labs team helps clients optimize their SQL performance. 

One of the great things about the SQL user-defined function (UDF) support is that it allows programmers to use modular programming as they develop queries and reports. If there’s a calculation commonly used by your business, then good modular programming would dictate that the calculation is coded in a single program and reused as opposed to embedding that calculation in every application program that needs it.  UDFs enable this same type of code reuse, but at an SQL statement level.

 

Let’s use Cost of Goods Sold (COGS) as a common calculation or derivation that your reports frequently use.  COGS is a simple calculation that adds together beginning inventory and purchases during the period and subtracts the ending inventory. Note: You may be thinking that the conversion of a legacy date value to an SQL Date value would be a better example of a common derivation that queries need - however, remember that I've previously covered a Date lookup table as a better solution for that.


Here’s the SQL UDF definition to support the COGS calculation:


CREATE FUNCTION CostOfGoodsSold
           (beginInv DECIMAL(8,2), 
            endInv DECIMAL(8,2),
            purchasedInv DECIMAL(8,2))
 RETURNS DECIMAL(8,2) 
 LANGUAGE SQL
 NOT FENCED 
 GLOBAL DETERMINISTIC 
  RETURN( beginInv + purchasedInv - endInv)

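Here’s a hedged usage sketch of the new UDF, assuming a hypothetical INVENTORY_PERIODS table with one row per accounting period:

-- INVENTORY_PERIODS and its columns are placeholder names
SELECT period_id,
       CostOfGoodsSold(begin_inv, end_inv, purchases) AS cogs
  FROM inventory_periods;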

While using the UDF is easy from a programming perspective, there is a performance cost since each call to the CostOfGoodsSold UDF involves a call to an external program object. Invoking an external program takes much more time and resources than retrieving a column from a row in a table. Thus, the modular programming approach of using a UDF adds overhead due to the program call.

 

IBM knows that both performance and modular programming are important, so they delivered SQL UDF inlining to allow SQL developers to get the best of both worlds. Inlining is a technique that is used by program compilers to improve performance of the generated program object. The compiler scans the program source code and notices that the main body of a program calls the same procedure or routine multiple times to perform a specific task. The compiler could decide to replace each procedure call in the program’s main body with the procedure logic itself. This inline replacement is done to eliminate the procedure call overhead to improve the overall performance. 


Db2 for i takes a similar approach with the UDF logic. Instead of taking time to invoke the external program object to perform the UDF logic, Db2 moves the function inline to the query as demonstrated in the following graphic. The SELECT statement coded in the application at the top of the figure is automatically transformed into the bottom SELECT statement when it’s run by the application program.

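The transformation depicted in the graphic looks roughly like the following sketch (the SALES_HISTORY table and its columns are hypothetical):

-- What the application codes:
SELECT CostOfGoodsSold(begin_inv, end_inv, purchases)
  FROM sales_history;

-- What Db2 effectively runs after moving the UDF logic inline:
SELECT (begin_inv + purchases - end_inv)
  FROM sales_history;
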
First, the Db2 inline support enables application developers to employ modular programming because their SQL statements reuse the calculation by invoking the UDF. Second, the Db2 inline support improves performance because, when the SQL statement is run, the UDF calculation is moved inline to the SQL statement, which eliminates the overhead of the external program call.

A simple performance test of the inline and non-inline versions of a UDF against a table with a half million rows showed a 4x performance improvement for the query using the inline version of the UDF.  This performance result reinforces how expensive it is for the database engine to perform external program calls.  

 

When the function is created, Db2 for i reviews the function definition to determine if the function is inline eligible. The main requirement is that the function logic must be simple – the function body must contain a RETURN statement or a RETURN statement wrapped within a BEGIN … END  pairing.  If the calculation requires multiple assignments and SQL statements, then that logic cannot be moved into another SQL statement.  If the UDF is a scalar user-defined function, the function must be defined with the GLOBAL DETERMINISTIC setting.  A user-defined table function must be defined with the NO EXTERNAL ACTION setting.  These are the main requirements, but the complete list of the inline requirements is documented for both scalar UDFs and table UDFs. 

 

At this point, you’re probably wondering how to verify if your existing SQL UDFs are being inlined. The SYSFUNCS view in QSYS2 contains all the details about UDFs on your system, so it’s easy to determine the inline status by running a query like the following:

SELECT routine_schema, routine_name, inline
 FROM qsys2.sysfuncs 
WHERE external_language IS NULL

 

Here’s an example of the output that you’ll receive:

The YES and NO values returned as the inline status above are obvious, but what does it mean if the INLINE status is Null (i.e., -) like the INT_ARRAY function in this example?  The Null value means that Db2 for i hasn’t had a chance to determine if your function is inline eligible – the UDF was created before Db2 had the inline feature, which was first made available on the IBM i 7.3 release via a Database Group PTF.

 

One of the strengths of IBM i is that programs like an SQL UDF keep on running when you install a new release.  No program recompiles or recreations are needed. One downside of this run-forever approach is that your programs can miss out on new performance optimizations like UDF inlining.  Because of this, I recommend that clients recreate all their SQL functions, SQL procedures, and SQL triggers after installing a new release of the IBM i operating system.  With this approach, you guarantee that your SQL procedural objects are leveraging all the latest optimizations delivered by the Db2 team in Rochester.
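
If you want to script those recreations, one option is to retrieve the current source with the QSYS2.GENERATE_SQL procedure and then rerun it. A minimal sketch for a single function (the CALC1 and KMEXAMPLE names are placeholders):

-- Generate the DDL for one SQL function so it can be recreated;
-- the generated source lands in a source file member by default
CALL QSYS2.GENERATE_SQL(
   DATABASE_OBJECT_NAME         => 'CALC1',
   DATABASE_OBJECT_LIBRARY_NAME => 'KMEXAMPLE',
   DATABASE_OBJECT_TYPE         => 'FUNCTION');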

 

You should now have a firm understanding of how UDF inlining delivers the best of both worlds in terms of UDF performance and using UDFs for modular programming.  And hopefully, you're thankful for it like me!



Tuesday, October 21, 2025

Notable Db2 Usability Enhancements in Latest IBM i TRs

 Fall means it is the time of year that we’re treated to a beautiful set of colors outdoors and that another round of IBM i Technology Refreshes have been announced.  As is the norm, the latest round of technology refreshes includes additions that enhance my favorite database, Db2 for i.

 I believe that this most recent round of Db2 for i enhancements really improves usability. One of the more notable enhancements in the Db2 performance tooling is backing away from the use of scientific notation in the Visual Explain tooling. Our IBM Expert Labs team frequently helps clients use Visual Explain more effectively to analyze and tune the performance of slow-running queries. While helping clients become more proficient with Visual Explain, I often heard requests to make the statistical data easier to understand by eliminating the use of scientific notation as shown in the following figure.

Visual Explain returns a ton of information that can be challenging to digest, and some users (myself included) found that also being asked to perform mental math on the scientific notation (i.e., 1.231E7 means the table has 12.31 million rows) was too much to ask. Now, the Visual Explain interface does the math for you as demonstrated below.

This enhancement is only available when using Visual Explain with Db2 performance data (e.g., plan cache snapshot) captured after applying the Database Group PTF level associated with the IBM i 7.5 and 7.6  Technology Refreshes.

 

The latest round of Db2 enhancements also improves usability by eliminating work. Cleaning up old index advisor data is a best practice that I continually share with clients, but not everyone chooses to implement this recommendation. With these latest advancements, Db2 will automatically remove old index advice.  Any index advice that is older than 365 days (according to the LAST_ADVISED column) will be deleted from the SYSIXADV table in QSYS2 on a daily basis.
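
If you’d like to preview how much advice the daily cleanup will remove, here’s a minimal sketch against the advice table (assuming the LAST_ADVISED timestamp column described above):

SELECT COUNT(*) AS old_advice_rows
  FROM qsys2.sysixadv
  WHERE last_advised < CURRENT TIMESTAMP - 365 DAYS;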

 

You can change the definition of “old” index advice on your system with the IBM-provided global variable named QIBM_SYSIXADV_BY_DAYS. For example, if you wanted index advice greater than 6 months old to be automatically removed by Db2, then you would just run the following SQL statement to change the global variable default to 180.


CREATE OR REPLACE VARIABLE SYSIBMADM.QIBM_SYSIXADV_BY_DAYS
   INTEGER DEFAULT 180

 

The same automatic cleanup is also available for the error logging table used by the SQL Error Logging Facility (SELF) via the QIBM_SELF_BY_DAYS global variable.
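
Assuming the SELF variable follows the same pattern as the index advisor variable, a hedged sketch that shortens its retention to 180 days as well:

-- the SYSIBMADM schema is assumed to match QIBM_SYSIXADV_BY_DAYS
CREATE OR REPLACE VARIABLE SYSIBMADM.QIBM_SELF_BY_DAYS
   INTEGER DEFAULT 180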

 

On the performance front, parallel index creation - an overlooked capability provided by the Db2 Symmetric Multiprocessing (SMP) feature – has also been enhanced. The algorithm used to create an index with parallel processing was enhanced to use less temporary storage during the index creation process. Some internal tests showed a 99% reduction in the use of temporary storage! If you’re thinking of using Db2 SMP for parallel index creation, don’t forget to follow SMP best practices.

 

These are my favorite, notable Db2 enhancements in the newly announced IBM i Technology Refreshes. Visit this website for a more complete list of the enhancements for IBM i 7.5 & 7.6.

 



Friday, September 19, 2025

Watch Your Epoch

 I have to admit that I didn’t know what an epoch was until a recent customer engagement. This customer enlisted the assistance of our IBM Technology Expert Labs team to help build a solution that would continually stream IBM i system performance data to a Splunk server. The message sent to the Splunk server needed to include the epoch value associated with the time that system performance data was generated.

With this epoch requirement, our Expert Labs team was pleasantly surprised to find that the Db2 for i SQL support provides an EXTRACT function which can return the epoch value associated with a timestamp value. This built-in SQL function made it simple to return the epoch value of 1,757,182,530 seconds for the input timestamp of 2025-09-06 18:15:30, as demonstrated in the Run SQL Scripts screen capture below.

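A minimal sketch of the statement from that screen capture, using the same input timestamp:

SELECT EXTRACT(EPOCH FROM TIMESTAMP '2025-09-06 18:15:30')
         AS epoch_seconds
  FROM sysibm.sysdummy1;
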
A solution to generate a message to Splunk with the required epoch value was put together quickly using this EXTRACT function. As testing began, however, we discovered that the EXTRACT function generated an unexpected epoch value. The prior example was run on an IBM i server with the time zone set to US Central Time.  So, all of us working on the project assumed that the epoch value returned by the EXTRACT function would be the number of seconds that have elapsed since January 1, 1970 00:00:00 (US Central).

 

Closer examination of the epoch value returned by the SQL function showed that the time zone setting of your IBM i server does not matter. The EXTRACT function always returns an epoch value which is the number of seconds that have elapsed since January 1, 1970 00:00:00 (UTC).  Depending on your application requirements, this may be good or bad news. For our client, the epoch value needed to be based on the time zone of the server and not UTC.

 

The IBM Expert Labs team relied on another SQL capability - the CURRENT TIMEZONE special register – to address this requirement. As the following example shows, subtracting the CURRENT TIMEZONE value from the input timestamp converts the timestamp value from your system to the corresponding UTC value so that the generated epoch value is based off your system time.

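A minimal sketch of that adjusted statement, subtracting the special register before extracting the epoch (same assumed input timestamp):

SELECT EXTRACT(EPOCH FROM TIMESTAMP '2025-09-06 18:15:30'
                          - CURRENT TIMEZONE)
         AS local_epoch_seconds
  FROM sysibm.sysdummy1;
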
By sharing our adventures with the epoch values returned by the EXTRACT function, hopefully you can avoid your own missteps with epoch values. 

Thursday, August 14, 2025

ACS Streamlines Index Housekeeping

One of the most common questions that our Db2 team in IBM Technology Expert Labs hears on SQL performance engagements is, “Does my table have too many indexes?” Given the efficiency of the Db2 for i index maintenance algorithms, I often respond by asking, “When was the last time you did housekeeping on the indexes for the table?”

If a table has a large number of indexes (which also includes keyed logical files), then there’s a strong chance that some indexes are no longer used by the Db2 query optimizer or your applications.  That means clients can often make room for new indexes by removing indexes that are no longer providing value.

 

The good news is that IBM i Access Client Solutions (ACS) has for many years provided the ability to perform index housekeeping.  The bad news is that this ACS function was hard to find.  The even better news is that with the 1.1.9.9 version of ACS released this month, ACS provides an interface that makes it easier to evaluate the impact of your existing indexes.

 

Let’s start with the original ACS interface for index housekeeping, so that you can better appreciate the new streamlined interface delivered by ACS 1.1.9.9.  The biggest issue with the existing support is that the interface could not be accessed from the ACS SQL Performance Center - even though everyone agrees that indexes are a critical success factor for achieving good SQL performance.  The index usage feedback information was buried inside the ACS Schemas tool. 

 

The following figure shows the steps that were required to access the index usage feedback information. After launching the ACS Schemas tool, one needed to navigate to the table whose indexes they wanted to analyze and to select the Work With->Indexes option. 

After performing all these navigation steps, you were presented with a window like the following which displays the usage metrics needed to perform index housekeeping.
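
If you prefer SQL to a GUI for this analysis, similar usage metrics can be queried from the QSYS2.SYSINDEXSTAT catalog view. A hedged sketch follows – the schema and table names are placeholders, and you should verify the column names against the catalog documentation for your release:

SELECT index_schema, index_name,
       last_query_use, query_use_count
  FROM qsys2.sysindexstat
  WHERE table_schema = 'KMEXAMPLE'
    AND table_name   = 'TABLE1'
  ORDER BY query_use_count;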

The 1.1.9.9 version of ACS streamlines access to these same index usage metrics. This streamlining was made possible by leveraging the launchpad that was added to the SQL Performance Center at the end of 2024. The following figure shows how the latest version of ACS adds an Index Evaluator option right in the middle of the SQL Performance Center launchpad. 

Once you click on the Index Evaluator icon in the launchpad, you are prompted for the name of the schema and table that you want to perform index housekeeping on.  No need to launch the ACS Schemas tool – just a couple of clicks and you’re all set to start finding indexes which are just collecting dust and can be removed!