SQL Window Functions Problem Solution

# Department Top Three Salaries via Leetcode

In extreme detail, in this tutorial I'll walk through a Leetcode Advanced SQL problem to arrive to the solution.

### Problem Statement

The Employee table holds all employees. Every employee has an Id, and there is also a column for the department Id.

Id Name Salary DepartmentId
1 Joe 85000 1
2 Henry 80000 2
3 Sam 60000 2
4 Max 90000 1
5 Janet 69000 1
6 Randy 85000 1
7 Will 70000 1

The Department table holds all departments of the company.

Id Name
1 IT
2 Sales

Write a SQL query to find employees who earn the top three salaries in each of the department. For the above tables, your SQL query should return the following rows (order of rows does not matter).

Department Employee Salary
IT Max 90000
IT Randy 85000
IT Joe 85000
IT Will 70000
Sales Henry 80000
Sales Sam 60000

### Explanation

In IT department, Max earns the highest salary, both Randy and Joe earn the second highest salary, and Will earns the third highest salary. There are only two employees in the Sales department, Henry earns the highest salary while Sam earns the second highest salary.

### Dan's Step by Step Solution

#### Join Tables to Get Relevant Columns

Explanation of SQL Query Below:

We are provided two tables: `Employee` and `Department`. In our final result, we need three fields: the department name, the employee's name and the employee's salary. Since these three columns are in two different tables, we need to join the two tables.

We know this is a relational database in that the two tables "relate" to one another with common identifiers from each.

The term “relational database” refers to the fact that the tables within it “relate” to one another—they contain common identifiers that allow information from multiple tables to be combined easily.

The column `Id` in the `Department` table is a primary key. This key uniquely identifies each department in the company. We cannot have two `Id` values that are the exact same number in this table!

Similarly, the `Employee` table has a foreign key of `DepartmentId` that "relates" to the `Id` field in the `Department` table. Therefore, we can join the two tables together and retrieve the necessary columns to solve this problem.

We choose a `INNER JOIN` because we want to return all records from the `Employee` table in which there is a corresponding match for the department name that the employee works in via the `Department` table.

To succinctly write out the elements of the SQL query below:

• Select the `Name` column from the `Department` table and alias it as `"Department"`
• Select the `name` column from the `Employee` table and alias it as `"Employee"`
• Select the `Salary` column from the `Salary` table and alias it as `"Salary"`
• Select the columns from the `Employee` table
• Inner join the `Department` table on records in which the `Id` field from the `Department` table is equal to the `DepartmentId` field from the `Employee` table

SQL Query:

```SELECT Department.Name AS "Department",
Employee.name AS "Employee",
Employee.Salary AS "Salary"
FROM Employee
INNER JOIN Department
ON Department.Id = Employee.DepartmentId
```

Returned Output:

Department Employee Salary
IT Will 70000
IT Randy 85000
IT Janet 69000
IT Max 90000
IT Joe 85000
Sales Sam 60000
Sales Henry 80000

#### Create a New Column for Salary Rank In Descending Order of Employees by Department

The output above provides all relevant columns needed for the final solution. However, we need to rank employee salaries in descending order by their apartment. To do this, we'll use a window function.

Explanation of SQL Query Below:

Compared to the query above, there is just one new statement that starts with `DENSE_RANK()` and is a window function. You can learn more about window functions in detail from Mode's great tutorial

In the window function statement, there's 3 major elements.

We use the function `DENSE_RANK()` to provide a numerical ranking value based on the following conditions. We don't want to skip any numerical rank values so we use `DENSE_RANK()` instead of `RANK()`.

We want a rank of salaries by department; therefore, the window will consider the group `Department.Id`. So salaries can be ranked from greatest to least, starting at `1`, in the `IT` department and separately from greatest to least, starting at `1`, in the `Sales` department.

We want the the `DENSE_RANK` function applied to rank the salaries from greatest to least. `ORDER BY Employee.Salary DESC` helps sort the salaries of employees from greatest to least - called descending order in SQL.

To succinctly write out the elements of the SQL query below:

• Select the `Name` column from the `Department` table and alias it as `"Department"`
• Select the `name` column from the `Employee` table and alias it as `"Employee"`
• Select the `Salary` column from the `Salary` table and alias it as `"Salary"`
• Create new column called `salary_rank` that provides a numerical value of the rank of each person's salary from greatest salary to smallest salary for each department.
• Select the columns from the `Employee` table
• Inner join the `Department` table on records in which the `Id` field from the `Department` table is equal to the `DepartmentId` field from the `Employee` table

SQL Query:

```SELECT Department.Name AS "Department",
Employee.name AS "Employee",
Employee.Salary AS "Salary",
DENSE_RANK() OVER (PARTITION BY Department.Id ORDER BY Employee.Salary DESC) AS salary_rank
FROM Employee
INNER JOIN Department
ON Department.Id = Employee.DepartmentId
```

Returned Output:

Department Employee Salary salary_rank
IT Max 90000 1
IT Randy 85000 2
IT Joe 85000 2
IT Will 70000 3
IT Janet 69000 4
Sales Henry 80000 1
Sales Sam 60000 2

#### Return 3 Necessary Colums and Top 3 Ranked Salaries by Department

The above query isn't our final solution yet. We have an extra unecessary column to show in the final output, `salary_rank`, and a `salary_rank` value of `4` for `Janet`, who isn't a top three salary in the `IT` department.

Explanation of SQL Query Below:

Subquery and where clause explanation...

From the returned output above, we need to exclude the `salary_rank` column and the record for `Janet`.

To exclude the `salary_rank` colum, we'll use the subquery functionality in SQL. Learn more about subqueries via Mode's excellent article.

In the execution of the query below, first, the inner query (our query in the previous step) is executed and is the underlying table used for the subquery.

With the subquery, we have the first three new lines at the top of our query to just select the columns we want in the final output. Notice we're selecting these columns from a new table called `sub`. The underlying table (our query in the previous step) has an alias of `sub` that's short for subquery.

This would output a table with the same number of rows as the query in the previous step except it would not include the `salary_column`.

Initially, the question asked to return the people's records who are in the top three salaries of their department. We can use the `WHERE` clause in the final line to only select records from the underlying table in which the `salary_rank` value is less than or equal to `3`.

SQL Query:

```SELECT sub."Department",
sub."Employee",
sub."Salary"
FROM (
SELECT Department.Name AS "Department",
Employee.name AS "Employee",
Employee.Salary AS "Salary",
DENSE_RANK() OVER (PARTITION BY Department.Id ORDER BY Employee.Salary DESC) AS salary_rank
FROM Employee
INNER JOIN Department
ON Department.Id = Employee.DepartmentId
) sub
WHERE sub.salary_rank <= 3
```

Returned Output:

Department Employee Salary salary_rank
IT Max 90000 1
IT Randy 85000 2
IT Joe 85000 2
IT Will 70000 3
Sales Henry 80000 1
Sales Sam 60000 2

We are done!!