Thursday, June 22, 2023

The Amazing & Disappearing Plan Cache

This month I decided to continue with a Plan Cache-related topic as a result of a recent discussion thread on IPL frequency on the IBM i Global TechXchange Community.  System IPLs are relevant to the Db2 for i Plan Cache used by the SQL Query Engine (SQE) because it’s a temporary object that gets recreated each time the system is restarted. Before diving into IPL considerations for the Plan Cache, let’s first focus on how SQE uses the Plan Cache and the benefits it provides. 

As I highlighted in my January entry, the Plan Cache is used by SQE to store the access plans (or query plans) created by the query optimizer. An access plan must be in place before a query can be run since it contains the details on what methods (e.g., table scan, index probe, etc.) and objects (e.g., index) will be used by Db2 during the runtime execution of a query. SQE’s usage of the Plan Cache as a single repository for all access plans on the system benefits IBM i clients in several different ways.

First, a centralized repository reduces the amount of system resources used to build query plans. If you had the same SQL request run by different programs, the Classic Query Engine (CQE) had to create and store a separate access plan for each program. With SQE, one copy of the access plan can be created and shared across multiple programs. 

Second, the Plan Cache enables a tremendous Db2 performance toolset with no overhead. Usage of plans from the Plan Cache is part of the normal SQE query execution process, so there’s no additional tool that must be run to collect data for detailed performance analysis with Visual Explain. The IBM i Access Client Solutions (ACS) SQL Performance Center interface shown below makes it fast and simple to perform analysis on a query plan. Just hit the highlighted “Show Statements” button and find the query that needs performance analysis.









Prior to the Plan Cache, Visual Explain analysis could only be performed if you had first collected database monitor data. This requirement was problematic because of the noticeable overhead caused by the database monitor collection process. This overhead meant that you couldn’t just keep a database monitor always running on your production system. When a query performance problem arose on your production system, Visual Explain could not be used to analyze the query because the required performance data wasn’t available. Then, when you went to collect the requisite database monitor data, you ran the risk of negatively impacting the performance of other workloads on your system. The Plan Cache automatically contains the data required to perform detailed analysis with Visual Explain without adding extra overhead to your system.

Lastly, the Plan Cache’s centralized plan repository provides a foundation for self-learning query optimization techniques. One of those self-learning query optimization techniques is Adaptive Query Processing or AQP for short. The ACS SQL Performance Center displays a Plan Cache property, Number of Plans Rebuilt due to AQP, where you can see how often this self-learning technology automatically has tuned the performance of queries on your system. 

Now that you know about the amazing benefits the Plan Cache provides, hopefully it’s also clearer how these great benefits are somewhat negated by frequent system IPLs.

The biggest issue is slower performance on Monday morning after a weekend featuring a system IPL. An empty Plan Cache has the potential to add to the Monday blues for your end users since all the query plans that we’re being reused by your applications and reports on Friday have to be rebuilt by the Db2 query optimizer on Monday morning when users start running your applications and reports. Building access plans takes time and system resources which can lead to sluggish system performance. 

This performance slowdown can be further compounded if some of your queries had been auto-tuned by SQE with the creation of maintained temporary indexes (MTIs). These temporary indexes obviously also pull a disappearing act when an IPL is performed. A query which previously relied on a temporary index will run slower post IPL until SQE makes the decision to auto-tune the query again. In addition, the recreation of these temporary indexes can cause additional overhead on your production system.

So, what actions can you take to minimize the performance impact of an IPL causing your Plan Cache to disappear?

If your system is frequently IPLed, then start by determining if frequent IPLs are truly required. If  the answer is we’ve always done it that way - that’s a good indicator you should keep investigating. When the AS/400 was first launched, weekly IPLs were common because it was possible for the system to run out of temporary addresses. However, that launch was 35 years ago! Even though IBM has delivered enhancements over the years including the switch to 64-bit RISC processors in 1995 which significantly expanded the temporary address support, our IBM Technology Expert Labs team encounters a significant number of clients still employing weekly IPLs when we’re helping them with SQL performance analysis and tuning. If your company is hesitant to change this old practice, make a small change in IPL frequency and evaluate the impact - for instance, move from weekly IPLs to biweekly or monthly IPLs. 

Regardless of the IPL frequency, proactively replacing the MTIs on your system can minimize the post IPL query performance blues. In 2022, the Db2 for i development team delivered the MTI_INFO service that makes it really easy to identify the temporary indexes currently on your system and  to replace them with permanent indexes. I’d recommend modifying your IPL preparation procedure to include using the MTI_INFO service to replace the temporary indexes with permanent indexes before they disappear at IPL time. The MTI_INFO service is supported back to the IBM i 7.3 release.

The final action you can take is to alter your post IPL procedures to include a warmup of key applications and reports. After an IPL on the weekend, set up a process to run some of your key applications and reports prior to Monday morning. This warmup action will pre-populate some of the Plan Cache to lessen the performance pain when end users ramp up system activity on Monday morning.

Hopefully, one of these actions or a combination of them will help you maximize the performance benefits that the Plan Cache can provide to your application and reports. Please reach out, if you need assistance learning how to leverage Visual Explain and the SQL Performance Center to manage and tune query performance on your systems.