This entry title harkens back to the Got Milk? advertising campaign. Who would have guessed that this campaign goes back almost 30 years?!? As they say, time really flies… now, back to the topic at hand.
If you’re reading
this entry, hopefully you already know that the CTE acronym in the SQL world
stands for Common Table Expression. Even
if you already knew what a CTE is, you may not understand what the term “shared
CTE” is referring to. A shared CTE is
the term used to describe any CTE that is referenced more than once on a query
definition.
Here’s an example
of a query that contains a shared CTE.
In this query, the staff CTE is referenced twice in the main query
definition which qualifies the staff CTE to be categorized as a shared CTE.
WITH staff (deptno, empcount) AS (SELECT deptno, COUNT(*) FROM employee GROUP BY deptno) SELECT deptno, empcount FROM staff WHERE empcount = (SELECT MAX(empcount) FROM staff)
In contrast, the
following query contains multiple CTES, but no shared CTEs. Each of the CTEs (top10_2020 &
top10_2021) are only referenced once in the main query, so they don’t meet the
criteria of a shared CTE.
WITH top10_2020 (customer_name, total_sales (SELECT customer_name, SUM(sales_amt) FROM sales WHERE year=2020 GROUP BY customer_name ORDER BY SUM(sales_amt) DESC FETCH FIRST 10 ROWS ONLY) , top10_2021 (customer_name, total_sales) AS (SELECT customer_name, SUM(sales_amt) FROM sales WHERE YEAR=2021 GROUP BY customer_name ORDER BY SUM(sales_amt) DESC FETCH FIRST 10 ROWS ONLY) SELECT Y1.customer_name, Y1.total_sales AS sales2020,
Y2.total_sales AS sales2021 FROM top10_2020 Y1 INNER JOIN top10_2021 Y2 ON Y1.customer_name = Y2.customer_name
Knowing whether or not a CTE is a shared CTE is significant because an SQL Standards compatibility fix was recently delivered in the IBM i 7.5 release for SQL statements with shared CTEs. This fix had to be made because there’s a possibility that some queries with shared CTEs may return incorrect results if the queries are run while the tables referenced by a shared CTE are being changed. The fix will guarantee as dictated by the SQL standards that each reference to the shared CTE generates the same result set. The fix delivered by IBM may cause some queries with shared CTE references to run slower and some queries with shared CTEs to run faster.
Now, you may be
thinking that it will be a long time before your company installs the IBM i 7.5
release, so why pay attention to this change. The reason that you should pay
attention is that the code fixes for shared CTEs that were made for the IBM i
7.5 release will eventually be delivered as PTFs for the IBM i 7.4 release (NOTE: Level 23 of the 7.4 Database Group PTF includes the shared CTE fix). IBM
recommends analyzing queries with shared CTEs before the PTFs are
delivered so that clients understand the possible impact of these PTFs and have
time to change their SQL statements, if needed.
To help with this
analysis effort, IBM in late 2021 delivered PTFs for IBM i 7.3 and 7.4 that
flag SQL statements with shared CTEs in both Plan Cache Snapshot and SQL
Performance Monitor collections. In addition, these flags classify whether or
not a Shared CTE is estimated to generate a large result set. Shared CTEs that
generate a large result set have a greater chance of having performance issues
after the fix is delivered as compared to CTEs with a smaller result set size.
However, all SQL statements using shared CTEs have the potential to
perform differently once the PTFs are applied.
It is possible to predict the possible performance impact of the future PTFs on SQL requests containing Shared CTEs with a simple coding change. This simple change involves adding the following predicate, AND RAND() IS NOT NULL, to the CTE that is shared (i.e., referenced multiple times) on the SQL request. This predicate forces the Db2 query optimizer to use the same CTE runtime implementation which is used by the fix in the IBM i 7.5 release.
The IBM development team has published a detailed writeup which includes details on the SQL statement flagging that will aid analysis on 7.3 & 7.4 along with possible coding changes that you may want to consider. Our Db2 team in IBM Systems Lab Services can also be engaged to provide additional assistance, so let me know if we can help.
Before I close, I also want to highlight that the licensing change that I highlighted in last month's entry is effective starting June 1, 2022. This change converts Db2 Add-On features such as Db2 SMP and Db2 Multisystem into no additional charge features for all of the IBM i 7.x releases (7.1, 7.2, 7.3, 7.4, 7.5)
No comments:
Post a Comment