Published on

January 27, 2020

How to Write a Complex SELECT Query in SQL Server

In the world of SQL Server, writing complex SELECT queries can be a daunting task. Many developers often wonder where to start and how to approach such queries. In this article, we will demystify the process of writing complex SELECT statements and provide a step-by-step guide to help you navigate through the complexity.

The Data Model

Before diving into writing complex queries, it is crucial to understand the data model you are working with. Familiarize yourself with the tables and their relationships. If documentation is not available, you can either ask the person who created the model or create the documentation yourself. Having a clear understanding of the data model will save you time in the long run.

Starting with a Complex Query

Let’s start by looking at an example of a complex SELECT query:

SELECT country.country_name_eng, 
       SUM(CASE WHEN call.id IS NOT NULL THEN 1 ELSE 0 END) AS calls, 
       AVG(ISNULL(DATEDIFF(SECOND, call.start_time, call.end_time), 0)) AS avg_difference 
FROM country 
LEFT JOIN city ON city.country_id = country.id 
LEFT JOIN customer ON city.id = customer.city_id 
LEFT JOIN call ON call.customer_id = customer.id 
GROUP BY country.id, country.country_name_eng 
HAVING AVG(ISNULL(DATEDIFF(SECOND, call.start_time, call.end_time), 0)) > (SELECT AVG(DATEDIFF(SECOND, call.start_time, call.end_time)) FROM call) 
ORDER BY calls DESC, country.id ASC;

As you can see, this query is quite complex and may be difficult to understand at first glance. Let’s break it down step by step.

Understanding the Query

The goal of this query is to return all countries along with the number of related calls and their average duration in seconds. We only want to display countries where the average call duration is greater than the average call duration of all calls.

To achieve this, we need to join multiple tables: country, city, customer, and call. By using the appropriate JOIN statements, we can connect these tables using their foreign keys.

Once the tables are joined, we can calculate the desired aggregated values using functions such as SUM, AVG, and CASE. In this example, we are summing the number of calls and calculating the average call duration.

Finally, we apply a HAVING clause to filter out only the countries with an average call duration greater than the overall average call duration.

Writing the Query Step by Step

When dealing with complex SELECT queries, it is best to write the query in parts and test each part along the way. This approach allows you to build the query gradually and ensure that each component is functioning correctly.

Start by writing the basic SELECT statement and JOIN clauses to connect the necessary tables:

SELECT ...
FROM country
LEFT JOIN city ON city.country_id = country.id
LEFT JOIN customer ON city.id = customer.city_id
LEFT JOIN call ON call.customer_id = customer.id
...

Next, test this part of the query to ensure that the tables are joined correctly and the desired data is being retrieved:

SELECT *
FROM country
LEFT JOIN city ON city.country_id = country.id
LEFT JOIN customer ON city.id = customer.city_id
LEFT JOIN call ON call.customer_id = customer.id;

Once you have confirmed that the tables are joined correctly, you can proceed to calculate the average call duration for all calls:

SELECT AVG(DATEDIFF(SECOND, call.start_time, call.end_time))
FROM call;

Now that you have the average call duration, you can incorporate it into the main query and add the necessary aggregation functions:

SELECT country.country_name_eng, 
       SUM(CASE WHEN call.id IS NOT NULL THEN 1 ELSE 0 END) AS calls, 
       AVG(ISNULL(DATEDIFF(SECOND, call.start_time, call.end_time), 0)) AS avg_difference 
FROM country
LEFT JOIN city ON city.country_id = country.id
LEFT JOIN customer ON city.id = customer.city_id
LEFT JOIN call ON call.customer_id = customer.id
GROUP BY country.id, country.country_name_eng
HAVING AVG(ISNULL(DATEDIFF(SECOND, call.start_time, call.end_time), 0)) > (SELECT AVG(DATEDIFF(SECOND, call.start_time, call.end_time)) FROM call)
ORDER BY calls DESC, country.id ASC;

By adding comments to the query, you can make it more readable and understandable for yourself and others:

-- the query returns a call summary for countries having average call duration > average call duration of all calls
SELECT country.country_name_eng, 
       SUM(CASE WHEN call.id IS NOT NULL THEN 1 ELSE 0 END) AS calls, 
       AVG(ISNULL(DATEDIFF(SECOND, call.start_time, call.end_time), 0)) AS avg_difference 
FROM country
LEFT JOIN city ON city.country_id = country.id
LEFT JOIN customer ON city.id = customer.city_id
LEFT JOIN call ON call.customer_id = customer.id
GROUP BY country.id, country.country_name_eng
-- filter out only countries having an average call duration > average call duration of all calls
HAVING AVG(ISNULL(DATEDIFF(SECOND, call.start_time, call.end_time), 0)) > (SELECT AVG(DATEDIFF(SECOND, call.start_time, call.end_time)) FROM call)
ORDER BY calls DESC, country.id ASC;

Conclusion

Writing complex SELECT queries in SQL Server may seem challenging, but by following a systematic approach, you can tackle even the most complex queries. Remember to break down the query into smaller parts, test each part, and add comments to make the code more readable. With practice and patience, you will become proficient in writing complex SELECT queries.

Click to rate this post!
[Total: 0 Average: 0]

Let's work together

Send us a message or book free introductory meeting with us using button below.