Thursday, March 21, 2024

Not All Db2 Functions Are Created Equal

If you’ve ever perused the list of built-in Db2 for i functions, you’ve probably noticed that there’s some overlap in the capabilities provided by a portion of the functions. In some cases, the functions have different names but result in the exact same processing being performed – the COALESCE and IFNULL functions are good examples. This situation arises because COALESCE is the name assigned to the function by the SQL Standards and IFNULL is a proprietary name used by other DBMS products. In this situation, the overlapping functions are created equal - meaning performance will be exactly the same regardless of which function name that your SQL request references.

There are other overlapping functions, however, where performance can be different depending on the function that you choose. This difference happens most often where one function provides a superset of capabilities of another function. For example, the VARCHAR and VARCHAR_FORMAT built-in functions both support the conversion of a datetime value into a character string. The VARCHAR function supports a much smaller list of string representations of a datetime value than VARCHAR_FORMAT.

The larger and richer set of string representations of datetime values supported by the VARCHAR_FORMAT function comes at a cost. That added performance cost is because Db2 implemented the VARCHAR_FORMAT built-in function as an internal user-defined function. The user-defined function implementation means that each invocation of the VARCHAR_FORMAT function involves a call to an external program to perform the specified processing.  Most Db2 for i built-in functions are implemented as inline functions which means that all of the processing is performed without the cost of an external program call. 

Each of the following two SQL statements will return the same string value, but the VARCHAR_FORMAT function is going to consume more system resources and time in the process. So, it’s helpful to understand there may be performance differences when choosing between built-in Db2 functions. 

VALUES VARCHAR(:HOST_VAR, ISO);

VALUES VARCHAR_FORMAT(:HOST_VAR, 'YYYY-MM-DD');

The next logical question is: How do I know if a Db2 built-in function is implemented as an internal user-defined function? The good news is that it is documented in the Db2 for i SQL Reference. The bad news is that this documentation is hard to find. The function implementation details are not covered anywhere in the Db2 built-in function documentation. The function implementation details are reported indirectly in the CREATE INDEX documentation?!?!?

Why on earth would function implementation details be covered with the CREATE INDEX statement? To answer that question, you need to remember that Db2 for i enables you to create derived key indexes like the following where derivations such as a built-in function can be specified as part of the key definition.

CREATE INDEX index1 ON sometable( DATE(TSTMPCOL) )

Derived key indexes, however, don’t support functions that are implemented as user-defined functions. Thus, the CREATE INDEX documentation needs to highlight those built-in functions that are not allowed in your key definition because they are implemented as internal user-defined functions.

When using this index-based function documentation, you need to be aware that not every name for a function is documented in this list. For example, the VARCHAR_FORMAT function name is in the list. However, the TO_CHAR function does not appear in this list even though its usage results in the VARCHAR_FORMAT function being invoked which is implemented as an internal user-defined function.

Hopefully, you've learned a new criteria that should be applied when selecting a Db2 built-in function to use in your SQL - now that you know all Db2 functions are not created equal.