Monday, November 13, 2023

Can SMP stand for System May Plod?!?!

 I'm guessing that you already know that the SMP abbreviation in Db2 for i SMP actually stands for Symmetric Multiprocessing and not System May Plod. However, I'm hoping a title that highlights a different meaning - especially one that implies slow performance - gets people's attention because the usage of Db2 SMP can have negative impacts.

One definition of plod that I found online is: walk doggedly and slowly with heavy steps. I'm not sure about you, but that's not how I'd want the performance of my IBM Power server described! How could slow ever be associated with the Db2 for i SMP feature?  When one thinks of parallel processing, it's easy to envision your queries receiving a performance boost like an F1 race car receives from its turbochargers. While this type of performance improvement can occur for your queries with Db2 SMP, it can also result in plodding system performance when parallel processing is not properly used.

The improper usage of Db2 SMP occurs when it's enabled systemwide instead of selectively applying parallel processing to long-running queries and database operations in specific jobs. A couple of years ago, I summarized parallel processing best practices for Db2 for i in an entry titled: The ABCs of Effective Db2 SMP Usage. The "C" stands for Controlled Usage - just another way of describing how parallel processing should only be applied to a subset of your workloads. Despite this strong recommendation from myself and other IBMers, too many IBM i clients are taking the easy road of activating Db2 SMP systemwide looking for a quick fix for their query performance challenges.

This "easy" approach of enabling database parallel processing systemwide may enhance query performance, but it can be at the expense of other workloads running on the system. Below you'll find an IBM iDoctor for IBM i graph of CPU utilization (green line) from a customer's system. Notice how the CPU utilization is holding steady around a 60% rate. That steady, healthy CPU usage rate is then interrupted by a sharp jump in CPU utilization for several minutes in the 85-90% range.













When our IBM Technology Expert Labs team was engaged by the client to determine the root cause of these system performance spikes, they found that the high CPU usage peaks were a result of multiple queries on the system using parallel processing. The query users were probably happy, but it's easy to see how other users and workloads on the system were negatively impacted. This client's experience was not just a one off - my colleagues in IBM Support and Technology Expert Labs have shared several stories me about clients engaging them to explain their system slowdowns.

This poor practice has been compounded by the fact that since June 2022, IBM i clients can try Db2 SMP without paying for a license. In prior years, IBM i customers had to make a monetary investment before they could take the easy road of enabling Db2 SMP systemwide. In fact, IBM Support's official stance is that clients should only enable parallel processing systemwide on IBM i 7.5.  This latest release of the IBM i contains two new features, PARALLEL_MAX_SYSTEM_CPU and PARALLEL_MIN_TIME, which are very useful in preventing a system from being overrun from systemwide usage of Db2 SMP.    

Some clients that have taken the easy path of enabling Db2 SMP for all queries run on the system have also discovered that some of their high-level language programs are not thread safe. Consider an RPG program which has been registered as an external user-defined function (UDF) with the SQL CREATE FUNCTION statement. If parallel processing is applied to a query that invokes this external UDF, then you need to remember that multiple parallel threads can now be invoking the RPG program.  If this RPG program is not thread safe, then issues can arise.  In this situation, the developer can compile their high-level language programs to be thread-safe or specify the DISALLOW PARALLEL option on the CREATE FUNCTION statement which will prevent the database engine from using parallel processing on any query that references the UDF. It's very easy to overlook program thread safety when turning on parallel processing for all the queries on your system.

Systemwide activation of Db2 SMP is also risky because it can mask the root cause of your query performance problems. With Db2 SMP, you're essentially trading system resources for faster response times. While your queries can return faster with Db2 SMP, sometimes there are more efficient methods to improve query performance - if you've enabled parallel processing systemwide, it can be harder to find the queries that would benefit from these more efficient methods. 

A couple of years ago, our Technology Expert Labs team was hired to help a client experiencing degraded system performance. As we investigated the problem, we discovered that Db2 SMP was enabled systemwide. This systemwide enablement of Db2 SMP was causing a frequently run query to be implemented with a parallel degree of 38. This meant that every time this query was run on the system - there were 38 threads running in parallel consuming CPU, memory, and disk I/O resources. Further analysis of this query and database identified that the creation of a new index would enable the query to run in the same amount of time but with less system resources (i.e., use a single thread). This client had used Db2 SMP systemwide to "fix" their query performance problems with hardware instead of addressing the root cause of a missing index - eventually they ran out of hardware resources.

Hopefully, you now have a better awareness of the risks of taking the easy road of activating Db2 SMP systemwide. Putting in the extra effort to selectively use parallel processing should result in a healthier system performance.  I stumbled upon the following quote from Robert Kiyosaki and think it summarizes things nicely: Remember the easy road often becomes hard, and the hard road often becomes easy.

I hope all my US readers have a great Thanksgiving holiday.