Wednesday, August 11, 2021

Speed Dating Your Legacy Dates

Given that I’ve been happily married for over 30 years, I’ll have to admit that I have no experience with speed dating. However, I do have lots of experience seeing the many ways in which clients store date values in legacy formats in their Db2 for i databases.  These legacy date formats range from storing the entire date value in a single character or decimal column to storing the components of a date value (Year, Month, Day, Century) in separate columns.  This blog topic focuses on the intersection of performance (i.e., speed) and legacy dates because our team has also seen many customer SQL requests that consume a fair amount of system resources while transforming these legacy date values into SQL date values.

IBM certainly contributed to this use of legacy date formats since Db2 for i has not always supported a Date data type. The Date data type was not available to use until the V2R1.1 release of OS/400. While our Lab Services team certainly works with many clients now using the Date data type, there are a significant number of clients still using Db2 for i tables that were designed and created prior to that release thanks to the strong backward compatibility of our great platform.

The large number of legacy date transformations on SQL requests are not a surprise due to the continued rise in SQL adoption by IBM i developers and SQL’s rich support for date processing. This support includes many SQL functions that make it easy to convert numeric and character date values into real SQL date values.

Here’s a sampling of some of the legacy date transformations that have shown up on Lab Services client engagements. Other clients have created their own user-defined functions(UDFs) to perform the legacy date conversions. 

   ==> TIMESTAMP_FORMAT(CHAR(DateDec8) , 'YYYYMMDD')
   ==> DATE((DTYEAR+2000) || '-' || DTMNTH || '-' || DTDAY)
   ==> ((YRVAL*10000) + (MTHVAL*100) + (DAYVAL))

While the built-in functions and UDFs make it easy to transform the legacy date formats, these transformations are CPU intensive.  If these transformations are performed on thousands of rows in your table, then the CPU costs start to add up. Especially, when you consider that it’s the same legacy date values being converted over and over again. And that’s exactly where speed dating of your legacy date values can help.

Speed dating of legacy date values involves pre-converting your legacy date values and storing them in a Date dimension or lookup table. The same SQL built-in and user-defined functions can be used to populate the table.  Once this lookup table has been populated, your SQL queries can simply join to the table whenever you want to use the equivalent SQL date value associated with your legacy date value.  Joining to a dimension table is a low-cost alternative to converting your existing Db2 tables to use the Date data type.

The following graphic visually depicts the join to a Dates lookup table from a legacy Orders table which has date values stored in a legacy character column.

The ORDERDATE column is storing a date value in an 8-byte character field.  The Dates lookup table has that legacy date value stored in the DC_MDYY_CHAR column and the equivalent SQL date value in the DC_DATE column. When an SQL query wants to display or process the ORDERDATE column as an SQL date value, it can just join to the Dates dimension table on the ORDERDATE and DC_MDYY_CHAR columns and then reference the DC_DATE column.  The SQL date value is accessible with a simple lookup instead of a CPU intensive transformation.  Assuming that the proper indexes are covering the join columns, a join can be processed by Db2 very efficiently and quickly. Even if you have 50 years of dates stored in the lookup table, the table is only going to have just over 18,000 rows – a small table when it comes to join performance.

Notice that in this example that the Dates dimension table also is being used to store contextual information such as Day of Week number and Day Name.  These contextual date values were also pre-calculated so they are also ready to be used by your SQL request without any dynamic calculation.  You could easily add other contextual date values such as business fiscal year or quarter to the lookup table. This Db2 Web Query Redbook has more details on creating and using a Date dimension table. Even better, Db2 Web Query also ships with a utility to create your own dimension table.

So how much performance savings can be realized with speed dating? As with any performance question the answer is “It Depends”.  One client recently made the switch to using a Date lookup table because they had an SQL query which performed many legacy date transformations using this pattern: DATE((DTYEAR+2000) ||'-'||DTMNTH||'-'||DTDAY). Obviously, your mileage may vary depending on the query complexity and the number of rows being processed.  

Feel free to reach out if you need assistance applying this speed dating technique to the legacy dates stored in your Db2 for i databases.