Given that I’ve been happily married for over 30 years, I’ll have to admit that I have no experience with speed dating. However, I do have lots of experience seeing the many ways in which clients store date values in legacy formats in their Db2 for i databases. These legacy date formats range from storing the entire date value in a single character or decimal column to storing the components of a date value (Year, Month, Day, Century) in separate columns. This blog topic focuses on the intersection of performance (i.e., speed) and legacy dates because our team has also seen many customer SQL requests that consume a fair amount of system resources while transforming these legacy date values into SQL date values.
IBM certainly contributed to this use of legacy date formats
since Db2 for i has not always supported a Date data type. The Date data type
was not available to use until the V2R1.1 release
of OS/400. While our Lab Services team certainly works with many clients now
using the Date data type, there are a significant number of clients still using
Db2 for i tables that were designed and created prior to that release thanks to
the strong backward compatibility of our great platform.
The large number of legacy date transformations on SQL requests are not a surprise due to the continued rise in SQL adoption by IBM i developers and SQL’s rich support for date processing. This support includes many SQL functions that make it easy to convert numeric and character date values into real SQL date values.
Here’s a sampling of some of the legacy date transformations that have shown up on Lab Services client engagements. Other clients have created their own user-defined functions(UDFs) to perform the legacy date conversions.
==> TIMESTAMP_FORMAT(CHAR(DateDec8) , 'YYYYMMDD') ==> DATE((DTYEAR+2000) || '-' || DTMNTH || '-' || DTDAY) ==> ((YRVAL*10000) + (MTHVAL*100) + (DAYVAL))
While the built-in functions and UDFs make it easy to transform the legacy date formats, these transformations are CPU intensive. If these transformations are performed on thousands of rows in your table, then the CPU costs start to add up. Especially, when you consider that it’s the same legacy date values being converted over and over again. And that’s exactly where speed dating of your legacy date values can help.
Speed dating of legacy date values involves pre-converting
your legacy date values and storing them in a Date dimension or lookup table.
The same SQL built-in and user-defined functions can be used to populate the
table. Once this lookup table has been
populated, your SQL queries can simply join to the table whenever you want to
use the equivalent SQL date value associated with your legacy date value. Joining to a dimension table is a low-cost
alternative to converting your existing Db2 tables to use the Date data type.
The following graphic visually depicts the join to a Dates
lookup table from a legacy Orders table which has date values stored in a
legacy character column.
The ORDERDATE column is storing a date value in an 8-byte
character field. The Dates lookup table
has that legacy date value stored in the DC_MDYY_CHAR column and the equivalent
SQL date value in the DC_DATE column. When an SQL query wants to display or process
the ORDERDATE column as an SQL date value, it can just join to the Dates
dimension table on the ORDERDATE and DC_MDYY_CHAR columns and then reference
the DC_DATE column. The SQL date value is
accessible with a simple lookup instead of a CPU intensive transformation. Assuming that the proper indexes are covering
the join columns, a join can be processed by Db2 very efficiently and quickly.
Even if you have 50 years of dates stored in the lookup table, the table is
only going to have just over 18,000 rows – a small table when it comes to join
performance.
Notice that in this example that the Dates dimension table also is being used to store contextual information such as Day of Week number and Day Name. These contextual date values were also pre-calculated so they are also ready to be used by your SQL request without any dynamic calculation. You could easily add other contextual date values such as business fiscal year or quarter to the lookup table. This Db2 Web Query Redbook has more details on creating and using a Date dimension table. Even better, Db2 Web Query also ships with a utility to create your own dimension table.
So
how much performance savings can be realized with speed dating? As with any
performance question the answer is “It Depends”. One client recently made the switch to using
a Date lookup table because they had an SQL query which performed many legacy date transformations using this pattern: DATE((DTYEAR+2000) ||'-'||DTMNTH||'-'||DTDAY). Obviously, your mileage may vary depending on the query complexity and the number of rows
being processed.
Feel free to reach out if you need assistance applying this
speed dating technique to the legacy dates stored in your Db2 for i databases.
"Given that I’ve been happily married for over 30 years, I’ll have to admit that I have no experience with speed dating."
ReplyDeleteBrilliant opener :-)
Hi, Kent. We are using Date Table y UDF's in our installation and, like you say, using one or other 'it depends'. But, if the perfomance is not a compromise, using a Date Table when your SQL have a lot of date fields is really annoying, what do you think?
ReplyDeleteIf top performance isn't a concern for your queries, then using UDFs is a reasonable tradeoff if you think the coding is simpler.
DeleteUsing a Date Table is a good practise that іт was discovered by your collegues Mike Cain and Hernando Bedoya in a 2017 workshop in our company.
ReplyDeleteI believe Dan Cruikshank was the first member of our Db2 for i team to promote the use of a date table.
DeleteI came across date tables in a seminar presented by Mike Cain years ago and it simplified my life!
ReplyDelete