SQL Server is a powerful database engine that is often used to handle large amounts of data. When it comes to optimizing performance, many people focus on tuning the server itself. However, the real key to improving performance lies in tuning the code.
In a recent comparison with Oracle, it was discovered that even the example code provided in the SQL Server documentation for the UPDATE statement was not optimized for performance. This is a common problem, even among experienced developers. One of the main issues is the use of correlated sub-queries instead of joins in the UPDATE statement.
Correlated sub-queries can be a major source of performance issues, as they can lead to row-by-agonizing-row (RBAR) processing. This can result in slow execution times and high CPU usage. To avoid this, it is recommended to use joins instead of correlated sub-queries in the UPDATE statement.
There are two basic types of UPDATE statements in SQL Server: “Direct Data” updates and “Joined” updates. The “Direct Data” update is straightforward and does not require optimization. The “Joined” update, on the other hand, involves updating the target table based on the content of other tables.
The correct syntax for a “Joined” update in SQL Server is as follows:
UPDATE sometable SET somecolumn = st2.somecolumn FROM sometable st1 INNER JOIN sometable2 st2 ON st1.somecolumn = st2.somecolumn WHERE some condition exists
It is important to note that the target table (sometable) must be the object of the UPDATE statement. This syntax ensures that the update is performed efficiently using joins instead of correlated sub-queries.
Another method that can be used for a “Joined” update is the undocumented “Aliased” method. This method uses table aliases instead of table names in the UPDATE statement, making it less confusing for some developers. The syntax for this method is as follows:
UPDATE st1 SET somecolumn = st2.somecolumn FROM sometable st1 INNER JOIN sometable2 st2 ON st1.somecolumn = st2.somecolumn WHERE some condition exists
It is important to avoid the “Death by SQL” update, which is a method that does not explicitly mention the target table in the FROM clause. This can lead to severe performance issues, as it causes a full recompile of the statement and a full table scan for each updated row.
When it comes to performance tuning, it is crucial to remember that the majority of the tuning should be done in the code itself. Tuning the server can only provide limited improvements, while rewriting the code can lead to significant performance gains.
Practical examples of performance tuning in SQL Server can be seen in the UPDATE statement. By analyzing and rewriting the code, it is possible to achieve substantial performance improvements. For example, a code snippet that originally took 313 milliseconds to update 17 rows can be rewritten to take only 15 milliseconds, resulting in a performance gain of more than 19 times.
It is important to recognize tuning opportunities in the code, such as the use of correlated sub-queries and other forms of RBAR. By rewriting the code to use joins and other set-based techniques, it is possible to achieve significant performance improvements.
In conclusion, when it comes to improving performance in SQL Server, the focus should be on tuning the code, not the server. By using joins instead of correlated sub-queries and other set-based techniques, it is possible to achieve substantial performance gains. Remember, “Tune the Code… that’s where the performance is!”
Thank you for reading!
— Your Name