SQL Window Functions Problem Solution

Department Top Three Salaries via Leetcode

In extreme detail, in this tutorial I'll walk through a Leetcode Advanced SQL problem to arrive to the solution.

Problem Statement

The Employee table holds all employees. Every employee has an Id, and there is also a column for the department Id.

Id Name Salary DepartmentId
1 Joe 85000 1
2 Henry 80000 2
3 Sam 60000 2
4 Max 90000 1
5 Janet 69000 1
6 Randy 85000 1
7 Will 70000 1

The Department table holds all departments of the company.

Id Name
1 IT
2 Sales

Write a SQL query to find employees who earn the top three salaries in each of the department. For the above tables, your SQL query should return the following rows (order of rows does not matter).

Department Employee Salary
IT Max 90000
IT Randy 85000
IT Joe 85000
IT Will 70000
Sales Henry 80000
Sales Sam 60000

Explanation

In IT department, Max earns the highest salary, both Randy and Joe earn the second highest salary, and Will earns the third highest salary. There are only two employees in the Sales department, Henry earns the highest salary while Sam earns the second highest salary.

Dan's Step by Step Solution

Join Tables to Get Relevant Columns

Explanation of SQL Query Below:

We are provided two tables: Employee and Department. In our final result, we need three fields: the department name, the employee's name and the employee's salary. Since these three columns are in two different tables, we need to join the two tables.

We know this is a relational database in that the two tables "relate" to one another with common identifiers from each.

The term “relational database” refers to the fact that the tables within it “relate” to one another—they contain common identifiers that allow information from multiple tables to be combined easily.

The column Id in the Department table is a primary key. This key uniquely identifies each department in the company. We cannot have two Id values that are the exact same number in this table!

Similarly, the Employee table has a foreign key of DepartmentId that "relates" to the Id field in the Department table. Therefore, we can join the two tables together and retrieve the necessary columns to solve this problem.

We choose a INNER JOIN because we want to return all records from the Employee table in which there is a corresponding match for the department name that the employee works in via the Department table.

To succinctly write out the elements of the SQL query below:

  • Select the Name column from the Department table and alias it as "Department"
  • Select the name column from the Employee table and alias it as "Employee"
  • Select the Salary column from the Salary table and alias it as "Salary"
  • Select the columns from the Employee table
  • Inner join the Department table on records in which the Id field from the Department table is equal to the DepartmentId field from the Employee table

SQL Query:

SELECT Department.Name AS "Department",
       Employee.name AS "Employee",
       Employee.Salary AS "Salary"
FROM Employee
INNER JOIN Department
ON Department.Id = Employee.DepartmentId

Returned Output:

Department Employee Salary
IT Will 70000
IT Randy 85000
IT Janet 69000
IT Max 90000
IT Joe 85000
Sales Sam 60000
Sales Henry 80000

Create a New Column for Salary Rank In Descending Order of Employees by Department

The output above provides all relevant columns needed for the final solution. However, we need to rank employee salaries in descending order by their apartment. To do this, we'll use a window function.

Explanation of SQL Query Below:

Compared to the query above, there is just one new statement that starts with DENSE_RANK() and is a window function. You can learn more about window functions in detail from Mode's great tutorial

In the window function statement, there's 3 major elements.

We use the function DENSE_RANK() to provide a numerical ranking value based on the following conditions. We don't want to skip any numerical rank values so we use DENSE_RANK() instead of RANK().

We want a rank of salaries by department; therefore, the window will consider the group Department.Id. So salaries can be ranked from greatest to least, starting at 1, in the IT department and separately from greatest to least, starting at 1, in the Sales department.

We want the the DENSE_RANK function applied to rank the salaries from greatest to least. ORDER BY Employee.Salary DESC helps sort the salaries of employees from greatest to least - called descending order in SQL.

To succinctly write out the elements of the SQL query below:

  • Select the Name column from the Department table and alias it as "Department"
  • Select the name column from the Employee table and alias it as "Employee"
  • Select the Salary column from the Salary table and alias it as "Salary"
  • Create new column called salary_rank that provides a numerical value of the rank of each person's salary from greatest salary to smallest salary for each department.
  • Select the columns from the Employee table
  • Inner join the Department table on records in which the Id field from the Department table is equal to the DepartmentId field from the Employee table

SQL Query:

SELECT Department.Name AS "Department",
       Employee.name AS "Employee",
       Employee.Salary AS "Salary",
       DENSE_RANK() OVER (PARTITION BY Department.Id ORDER BY Employee.Salary DESC) AS salary_rank 
FROM Employee
INNER JOIN Department
ON Department.Id = Employee.DepartmentId

Returned Output:

Department Employee Salary salary_rank
IT Max 90000 1
IT Randy 85000 2
IT Joe 85000 2
IT Will 70000 3
IT Janet 69000 4
Sales Henry 80000 1
Sales Sam 60000 2

Return 3 Necessary Colums and Top 3 Ranked Salaries by Department

The above query isn't our final solution yet. We have an extra unecessary column to show in the final output, salary_rank, and a salary_rank value of 4 for Janet, who isn't a top three salary in the IT department.

Explanation of SQL Query Below:

Subquery and where clause explanation...

From the returned output above, we need to exclude the salary_rank column and the record for Janet.

To exclude the salary_rank colum, we'll use the subquery functionality in SQL. Learn more about subqueries via Mode's excellent article.

In the execution of the query below, first, the inner query (our query in the previous step) is executed and is the underlying table used for the subquery.

With the subquery, we have the first three new lines at the top of our query to just select the columns we want in the final output. Notice we're selecting these columns from a new table called sub. The underlying table (our query in the previous step) has an alias of sub that's short for subquery.

This would output a table with the same number of rows as the query in the previous step except it would not include the salary_column.

Initially, the question asked to return the people's records who are in the top three salaries of their department. We can use the WHERE clause in the final line to only select records from the underlying table in which the salary_rank value is less than or equal to 3.

SQL Query:

SELECT sub."Department", 
       sub."Employee", 
       sub."Salary" 
FROM (
SELECT Department.Name AS "Department",
       Employee.name AS "Employee",
       Employee.Salary AS "Salary",
       DENSE_RANK() OVER (PARTITION BY Department.Id ORDER BY Employee.Salary DESC) AS salary_rank 
FROM Employee
INNER JOIN Department
ON Department.Id = Employee.DepartmentId
) sub
WHERE sub.salary_rank <= 3

Returned Output:

Department Employee Salary salary_rank
IT Max 90000 1
IT Randy 85000 2
IT Joe 85000 2
IT Will 70000 3
Sales Henry 80000 1
Sales Sam 60000 2

We are done!!