nov
2
2015

Day 1 @ AdventureWorks, the Solutions

It’s been a while since I published Day 1 questions and I guess it’s time to post the answers I came up with.
That doesn’t mean my solutions are the only possibles, and in some cases I’ll provide more than one answer taking different approaches.

The Answers

The first question does not have much to talk about, it’s just a matter of locate the table we want to retrieve the data from and select the columns we’re asked to.

--01=================================================================================
-- Retrieve a list of the different types of contacts that can exist in the database 
-- Output: 
--			- ContactTypeID
--			- Name

SELECT ContactTypeID, Name 
	FROM Person.ContactType

The Second is based on the first and will use the WHERE clause to filter the results returned by the query.

--02=================================================================================
-- Retrieve (ContactTypeID, Name) for the contact type 'Marketing Manager'
-- Output: 
--			- ContactTypeID
--			- Name
SELECT ContactTypeID, Name 
	FROM Person.ContactType
	WHERE Name = 'Marketing Manager'

Third we’ll need to use the WHERE clause combined with the logical operator LIKE to filter the different names that contain the word

--03=================================================================================
-- Retrieve a list of the different types of contacts which are managers
-- Output: 
--			- ContactTypeID
--			- Name
SELECT ContactTypeID, Name 
	FROM Person.ContactType
	WHERE Name LIKE '%Manager%'

Next question adds another clause, this time is ORDER BY, which sorts the returned results.
It’s important to remind

The order in which rows are returned in a result set are not guaranteed unless an ORDER BY clause is specified.

Sounds pretty obvious but it’s awesome how many people will fail to know it, even experienced professionals.

--04=================================================================================
-- Retrieve a list of the different types of contacts which are managers
-- Output: 
--			- ContactTypeID
--			- Name
-- Sorted by
--			- Alphabetically Descending
SELECT ContactTypeID, Name 
	FROM Person.ContactType
	WHERE Name LIKE '%Manager%'
	ORDER BY Name DESC

Retrieving data from more than one table requires a more elaborated query, where we specify the tables involved and the equality columns for each of the JOIN operators.
In this SELECT, we need to retrieve only columns from one of the tables, [Person].[Person], but since we need to filter only those which are ‘Purchasing Manager’ we need somehow to involve [Person].[ContactType] in the query.

The table which we can use both is [Person].[BusinessEntityContact] which contains Foreign Keys referencing both tables.

--05=================================================================================
-- Retrieve a list of all contacts which are 'Purchasing Manager' 
-- Output: 
--			- BusinessEntityID
--			- LastName
--			- FirstName
--			- MiddleName
--
-- Sorted by:
--			- LastName		ascending
--			- FirstName		ascending
--			- MiddleName	ascending

SELECT p.BusinessEntityID, LastName, FirstName, MiddleName
	FROM Person.BusinessEntityContact AS bec
		INNER JOIN Person.ContactType AS c
			ON c.ContactTypeID = bec.ContactTypeID
		INNER JOIN Person.Person AS p
			ON p.BusinessEntityID = bec.PersonID
	WHERE c.Name = 'Purchasing Manager'
	ORDER BY LastName, FirstName, MiddleName

Next one is built on top of the previous again. You have seen that there’s a number of people which do not have MiddleName and displays as NULL.
To avoid this, we can use the TSQL function ISNULL(), to replace NULL values for the value we want, in this case an empty string ”.
Also we need to display the name of the column as MiddleName, hence we need to use an alias for it.

--06=================================================================================
-- Retrieve a list of all contacts which are 'Purchasing Manager' 
-- Output: 
--			- BusinessEntityID
--			- LastName
--			- FirstName
--			- MiddleName, if there is no MiddleName, to display '' (empty string) instead of NULL
--
-- Sorted by:
--			- LastName		ascending
--			- FirstName		ascending
--			- MiddleName	ascending

SELECT p.BusinessEntityID, LastName, FirstName, ISNULL(MiddleName,'') AS MiddleName
	FROM Person.BusinessEntityContact AS bec
		INNER JOIN Person.ContactType AS c
			ON c.ContactTypeID = bec.ContactTypeID
		INNER JOIN Person.Person AS p
			ON p.BusinessEntityID = bec.PersonID
	WHERE c.Name = 'Purchasing Manager'
	ORDER BY LastName, FirstName, MiddleName

We need now to count the number of contacts for each type and display it along the ContactTypeID and ContactTypeName
To do so, we need to use COUNT(), which is an aggregate function and in this case requires a GROUP BY clause to return one single row for each group.
Again we need to use an alias to name the column.

--07=================================================================================
-- Retrieve a list of the different types of contacts and how many of them exist in the database 
-- Output: 
--			- ContactTypeID
--			- ContactTypeName
--			- N_contacts
--

SELECT c.ContactTypeID, c.Name AS ContactTypeName, COUNT(*) AS N_contacts
	FROM Person.BusinessEntityContact AS bec
		INNER JOIN Person.ContactType AS c
			ON c.ContactTypeID = bec.ContactTypeID
	GROUP BY c.ContactTypeID, c.Name
	

Just add the ORDER BY clause to complete the next question

--08=================================================================================
-- Retrieve a list of the different types of contacts and how many of them exist in the database 
-- Output: 
--			- ContactTypeID
--			- ContactTypeName
--			- N_contacts
--
-- Sorted by:
--			- N_contacts descending

SELECT c.ContactTypeID, c.Name AS ContactTypeName, COUNT(*) AS N_contacts
	FROM Person.BusinessEntityContact AS bec
		INNER JOIN Person.ContactType AS c
			ON c.ContactTypeID = bec.ContactTypeID
	GROUP BY c.ContactTypeID, c.Name
	ORDER BY COUNT(*) DESC

We have seen before that to filter the returned rows we can use the WHERE clause, but to filter rows based on the aggregate function we must use the HAVING clause.

--09=================================================================================
-- Retrieve a list of the different types of contacts and how many of them exist in the database 
-- Output: 
--			- ContactTypeID
--			- ContactTypeName
--			- N_contacts
--
-- Sorted by:
--			- N_contacts descending
-- Filter:
--			- Only interested in ContactTypes that have 100 contacts or more.

SELECT c.ContactTypeID, c.Name AS ContactTypeName, COUNT(*) AS N_contacts
	FROM Person.BusinessEntityContact AS bec
		INNER JOIN Person.ContactType AS c
			ON c.ContactTypeID = bec.ContactTypeID
	GROUP BY c.ContactTypeID, c.Name
	HAVING COUNT(*) >= 100
	ORDER BY COUNT(*) DESC

More functions into play, in order to comply with the requirements, we will need to use a function to generate the column full name and another to format the date to the required, also some calculation must be done to calculate the weekly salary as in the database we only store the hourly rate.

  • For the name, I used CONCAT
  • To format the date, I used CONVERT, specifying the format, you can check all the possibilities by yourself
  • --10=================================================================================
    --
    -- The table [HumanResources].[EmployeePayHistory] holds employees current salary along with historical data
    --
    --
    -- List of employees and their Historical weekly salary (based on 40h a week) 
    -- Output: 
    --			- RateChangeDate from historical table aliased to DateFrom, in the format 'dd/mm/yyyy'
    --			- 1 column which includes 'LastName, FirstName MiddleName' called FullNamne 
    --			- WeeklySalary which must be calculated too
    --
    -- Sorted by 
    --			- FullName 
    
    SELECT CONVERT(VARCHAR, h.RateChangeDate, 103) AS DateFrom
    		, CONCAT(LastName, ', ', FirstName, ' ', MiddleName) AS FullName
    		, (40 * h.Rate) AS WeeklySalary
    	FROM Person.Person AS p
    		INNER JOIN HumanResources.EmployeePayHistory AS h
    			ON h.BusinessEntityID = p.BusinessEntityID		
    	ORDER BY FullName
    

    The Bonus question gets a bit more complicated as you’d expect. More concepts apply here depending on the approach we want to take.

    In words, we need to find the most recent date (MAX) each person has had his/her salary changed.

    The first solution I came up with is to use a correlated subquery which filters by the maximum date for each person.

    --===================================================================================
    -- BONUS
    --===================================================================================
    
    
    --===================================================================================
    -- List of employees and their current weekly salary (based on 40h a week) 
    -- Output: 
    --			- RateChangeDate from historical table aliased to DateFrom, in the format 'dd/mm/yyyy'
    --			- 1 column which includes 'LastName, FirstName MiddleName' called FullNamne 
    --			- WeeklySalary which must be calculated too
    --
    -- Sorted by 
    --			- FullName 
    
    --
    -- ONLY 1 line per Employee, the current one to be precise.
    
    -- Using subquery in the WHERE clause
    
    USE AdventureWorks2014
    
    SELECT CONVERT(VARCHAR,(h.RateChangeDate),103) AS DateFrom
    		, CONCAT(LastName, ', ', FirstName, ' ', MiddleName) AS FullName 
    		, (40 * h.Rate) AS WeeklySalary
    	FROM Person.Person AS p
    		INNER JOIN HumanResources.EmployeePayHistory AS h
    			ON h.BusinessEntityID = p.BusinessEntityID
    	WHERE h.RateChangeDate = (SELECT MAX(RateChangeDate) 
    								FROM HumanResources.EmployeePayHistory 
    								WHERE BusinessEntityID = h.BusinessEntityID)
    	ORDER BY FullName
    

    We can use a subquery and do INNER JOIN to get the desired rows. Here we also need to use the GROUP BY clause to get the MAX date for each employee, not just the maximum date in general.

    -- Using subquery in the INNER JOIN
    SELECT CONVERT(VARCHAR,(h.RateChangeDate),103) AS DateFrom
    		, CONCAT(LastName, ', ', FirstName, ' ', MiddleName) AS FullName 
    		, (40 * h.Rate) AS WeeklySalary
    	FROM Person.Person AS p
    		INNER JOIN HumanResources.EmployeePayHistory AS h
    			ON h.BusinessEntityID = p.BusinessEntityID
    		INNER JOIN (SELECT BusinessEntityID, MAX(RateChangeDate) AS RateChangeDate 
    						FROM HumanResources.EmployeePayHistory
    						GROUP BY BusinessEntityID) AS Cur
    			ON cur.BusinessEntityID = h.BusinessEntityID 
    				AND cur.RateChangeDate = h.RateChangeDate
    	ORDER BY FullName
    
    

    The inner join can be replaced with CROSS APPLY, but since it doesn’t have ON clause, we need to filter the subquery using values from the outer query

    -- Using subquery and CROSS APPLY
    SELECT CONVERT(VARCHAR,(h.RateChangeDate),103) AS DateFrom
    		, CONCAT(LastName, ', ', FirstName, ' ', MiddleName) AS FullName 
    		, (40 * h.Rate) AS WeeklySalary
    	FROM Person.Person AS p
    		INNER JOIN HumanResources.EmployeePayHistory AS h
    			ON h.BusinessEntityID = p.BusinessEntityID
    		CROSS APPLY	(SELECT BusinessEntityID, MAX(RateChangeDate) AS RateChangeDate 
    						FROM HumanResources.EmployeePayHistory
    						WHERE BusinessEntityID = h.BusinessEntityID							
    						GROUP BY BusinessEntityID
    						HAVING MAX(RateChangeDate) = h.RateChangeDate) AS Cur
    	ORDER BY FullName
    
    

    When queries get too big, sometimes for readability is better to split the logic in small parts to we can understand them better.
    Using Common Table Expressions is a good way of doing it. In this case we calculate the MAX date for each employee beforehand and then use the CTE like a regular table to JOIN it to the other tables.

    -- Using a Common table expression (to make the query below smaller and more readable)
    -- Then can be combined as INNER JOIN or CROSS APPLY 
    ;WITH cte AS (
    SELECT BusinessEntityID, MAX(RateChangeDate) AS RateChangeDate 
    	FROM HumanResources.EmployeePayHistory
    	GROUP BY BusinessEntityID 
    )
    
    SELECT CONVERT(VARCHAR,(h.RateChangeDate),103) AS DateFrom
    		, CONCAT(LastName, ', ', FirstName, ' ', MiddleName) AS FullName 
    		, (40 * h.Rate) AS WeeklySalary
    	FROM Person.Person AS p
    		INNER JOIN HumanResources.EmployeePayHistory AS h
    			ON h.BusinessEntityID = p.BusinessEntityID
    		INNER JOIN cte AS Cur
    			ON cur.BusinessEntityID = h.BusinessEntityID 
    				AND cur.RateChangeDate = h.RateChangeDate
    	ORDER BY FullName
    

     
    And that was all for this first day of training. There are a lot of concepts involved here, so further reading is well recommended.

     
    Please feel free to throw me your questions and other solutions you came up with and I’ll try to get back to you soon.

    See you for Day 2 @ AdventureWorks!

One comment

Deja un comentario

Tu dirección de correo electrónico no será publicada. Los campos obligatorios están marcados con *

Este sitio usa Akismet para reducir el spam. Aprende cómo se procesan los datos de tus comentarios.