If you’ve ever perused the list of built-in Db2 for i functions, you’ve probably noticed that there’s some overlap in the
capabilities provided by a portion of the functions. In some cases, the
functions have different names but result in the exact same processing being
performed – the COALESCE and IFNULL functions are good examples. This situation
arises because COALESCE is the name assigned to the function by the SQL Standards
and IFNULL is a proprietary name used by other DBMS products. In this situation,
the overlapping functions are created equal - meaning performance will be
exactly the same regardless of which function name that your SQL request references.
There are other overlapping
functions, however, where performance can be different depending on the
function that you choose. This difference happens most often where one function
provides a superset of capabilities of another function. For example, the VARCHAR and VARCHAR_FORMAT built-in functions both support the conversion of a datetime
value into a character string. The VARCHAR function supports a much smaller
list of string representations of a datetime value than VARCHAR_FORMAT.
The larger and richer set of string
representations of datetime values supported by the VARCHAR_FORMAT function
comes at a cost. That added performance cost is because Db2 implemented the
VARCHAR_FORMAT built-in function as an internal user-defined function. The
user-defined function implementation means that each invocation of the
VARCHAR_FORMAT function involves a call to an external program to perform the
specified processing. Most Db2 for i built-in functions
are implemented as inline functions which means that all of the processing is
performed without the cost of an external program call.
Each of
the following two SQL statements will return the same string value, but the
VARCHAR_FORMAT function is going to consume more system resources and time in
the process. So, it’s helpful to understand there may be performance
differences when choosing between built-in Db2 functions.
VALUES VARCHAR(:HOST_VAR, ISO); VALUES VARCHAR_FORMAT(:HOST_VAR, 'YYYY-MM-DD');
The next
logical question is: How do I know if a Db2 built-in function is implemented
as an internal user-defined function? The good news is that it is
documented in the Db2 for i SQL Reference. The bad news is that this
documentation is hard to find. The
function implementation details are not covered anywhere in the Db2
built-in function documentation. The function implementation details are
reported indirectly in the CREATE INDEX documentation?!?!?
Why on earth would function implementation details be covered with the CREATE INDEX statement? To answer that question, you need to remember that Db2 for i enables you to create derived key indexes like the following where derivations such as a built-in function can be specified as part of the key definition.
CREATE INDEX index1 ON sometable( DATE(TSTMPCOL) )
Derived key indexes, however, don’t support functions that are implemented as user-defined functions. Thus, the CREATE INDEX documentation needs to highlight those built-in functions that are not allowed in your key definition because they are implemented as internal user-defined functions.
When using
this index-based function documentation, you need to be aware that not every
name for a function is documented in this list. For example, the VARCHAR_FORMAT
function name is in the list. However, the TO_CHAR function does not appear in this
list even though its usage results in the VARCHAR_FORMAT function being invoked
which is implemented as an internal user-defined function.
Hopefully,
you've learned a new criteria that should be applied when selecting a Db2
built-in function to use in your SQL - now that you know all Db2 functions are not
created equal.
Wow, this is a real eye opener. In the case of TO_CHAR and similar, how can we tell? Do we need to test it with a CREATE INDEX to be sure? Can we rely on the column SYSFUNCS.ROUTINE_BODY to determine if EXTERNAL or not?
ReplyDeleteIf you look at the documentation for the TO_CHAR function, it points you to the doc for the VARCHAR_FORMAT function. The CREATE INDEX documentation does call out that VARCHAR_FORMAT can't be used. You could always try a CREATE INDEX to be be sure.
ReplyDeleteSYSFUNCS does not contain entries for internal user-defined functions. Another option would be using Visual Explain - a "Complex" icon is used in VE graphs to represent the UDF processing.