There’s no limit on the number of rows that an SQL SELECT statement can return to an application by default. There are a couple of different techniques, however, that programmers can employ to change this default behavior. One direct option is adding a Fetch or Limit clause to set the maximum numbers of rows the query can return. Usage of these clauses can prevent accidentally sending millions of rows across the network or ensuring a query only returns the number of rows that can be displayed on the end user interface.
Like many syntax options, there’s nothing that prevents these clauses from being used incorrectly. The following examples highlight a poor coding practice with these clauses that has recently emerged in SQL performance reviews performed by our Technology Expert Labs team. Both of these SQL examples would fail if the specified SELECT returns more than one row and they didn't contain a Fetch or Limit clause.
SELECT c1 INTO :hv1 FROM t1 WHERE c2=1 AND c3='A'
FETCH FIRST 1 ROWS ONLY SET :hv1 = (SELECT c1 FROM t1 WHERE c2=1 AND c3='A' LIMIT 1)
While these
limiting clauses prevent the SQL from failing with a More Than One Row
error, there are several issues with this coding approach.
One big issue
is data integrity. If these clauses are
being used just as a safety net – meaning the query shouldn’t return more than row but it’s
added just in case that happens, then these clauses will prevent the
application from ever detecting that the query is returning more than one row.
Multiple rows being returned could be caused by a data integrity problem, but
that potential problem is hidden from the application with this coding practice.
Furthermore, if it’s normal that the query could return more than one row from
time to time, these SQL requests have told Db2 it’s okay to randomly select
which row is returned to the program. That potential randomness exists because
these SELECT statements don’t have an ORDER BY clause. In this case, Db2 is
allowed to return the selected rows in any order that it wants. The chances of
the randomness goes up even higher if these queries are eligible for parallel
processing with Db2 SMP. I think
you’d agree it’s a much cleaner and safer approach to control which row is
assigned to the host variable instead of assigning a value at random.
The other
issue is performance. These clauses require Db2 to track how many rows a query
is returning and that tracking takes time and resources. Some internal tests
show these clauses adding 0.5% overhead to the execution of a query. That
overhead may seem like nothing to worry about, but just multiply that little
overhand times thousands or millions of statement executions.
The Sky’s the Limit when it comes to the scalability of your SQL applications…
assuming that you’re utilizing good SQL coding practices. If you’re company
needs help learning efficient SQL coding practices, then our IBM Technology Expert Labs
team is here to help.
Wait? Are you saying that "LIMIT 1" doesn't cause the query processor to stop after 1 row but rather continues for N rows and then throws away the N-1 rows?
ReplyDeleteNo, that's not what I'm saying. The query engine doesn't normally pay attention to how many rows are returned to by the query. So when you add LIMIT 1/FETCH FIRST 1 ROW ONLY to a query, the query engine has to count the number of rows returned. Counting rows take time & resources.
DeleteVery good point made about "hiding" the real issue. This is akin to fixing the symptom instead of the problem - but without visibility to even the symptom you can fix neither! Cheers :)
ReplyDeleteYou write "Both of these SQL examples will fail if the specified SELECT returns more than one row." but I don't see how the two examples can return more than one row.
ReplyDeleteGood catch - I meant to say they would fail if there was not a Fetch or Limit clause. I've updated the text.
DeleteHow would you rewrite these two statement then?
ReplyDeleteThe Fetch/Limit clause should only be used when the query can return more than one row and there's an Order By clause to control which row gets returned.
DeleteThanks for all your great content, much appreciated.
ReplyDeletedoes it mean that it is better not to use fetch or limit 1 and leave only the select * from table where field = value statement?
ReplyDeleteYes, if the Select statement is only supposed to return one row
Delete