While I am currently focusing on learning programming for
Data Analysis, I didn’t want to lose the grip on what I have learned so far in
SQL, so to avoid fading of SQL knowledge, I decided to do this challenge and
share my approach with others. I’ll be using MySQL to answer the case study
questions.
Problem Statement
Danny wants to use the data to answer a few simple questions
about his customers, especially about their visiting patterns, how much money
they’ve spent, and which menu items are their favorite. Having this deeper
connection with his customers will help him deliver a better and more
personalized experience for his loyal customers.
He plans on using these insights to help him decide whether
he should expand the existing customer loyalty program — additionally he needs
help to generate some basic datasets so his team can easily inspect the data
without needing to use SQL.
Danny has provided you with a sample of his overall customer
data due to privacy issues — but he hopes that these examples are enough for
you to write fully functioning SQL queries to help him answer his questions!
Danny has shared with you 3 key datasets for this case
study:
sales
menu
members
I copied & pasted the SQL script for tables into MySQL
workbench from the source mentioned at the end of the article.
Case Study Questions
Each of the following case study questions can be answered
using a single SQL statement:
1. What is the total amount each customer spent at the
restaurant?
select s.customer_id
,sum(m.price) from menu m
join sales s using(product_id)
group by s.customer_id;
2. How many days has each customer visited the restaurant?
select customer_id
,count(distinct order_date) num_days_visited
from sales
group by customer_id;
3. What was the first item from the menu purchased by each customer?
with cte as
(
select
s.customer_id
,m.product_name
,row_number() over(partition by s.customer_id order by order_date) rnk
from sales s
join menu m using(product_id)
)
select customer_id
,product_name
from cte where rnk =1;
4. What is the most purchased item on the menu and how many
times was it purchased by all customers?
select
product_name
,count(*) times_purchased
from menu m
join sales s using(product_id)
group by product_name
order by times_purchased desc;
5. Which item was the most popular for each customer?
with cte as
(
select
customer_id
,product_name
,row_number() over(partition by customer_id order by count(*) desc) rnk
from menu m
join sales s using(product_id)
group by customer_id, product_name
)
select *
/*customer_id
,product_name*/
from cte
where rnk = 1;
6. Which item was purchased first by the customer after they
became a member?
with cte as
(
select
s.customer_id
,m.product_name
, rank() over(partition by s.customer_id order by order_date) rnk
from sales s
join members mem using(customer_id)
join menu m using(product_id)
where order_date >= mem.join_date
)
select customer_id
,product_name
from cte where rnk =1;
7. Which item was purchased just before the customer became
a member?
with cte as
(
select
s.customer_id
,m.product_name
,row_number() over(partition by s.customer_id order by order_date desc) rnk
from sales s
join members mem using(customer_id)
join menu m using(product_id)
where order_date < mem.join_date
)
select customer_id
,product_name
from cte
where rnk=1;
8. What is the total items and amount spent for each member
before they became a member?
select
s.customer_id
,count(product_id) total_item
,sum(price)
from sales s
join members mem
on s.customer_id = mem.customer_id
and s.order_date < mem.join_date
join menu m using(product_id)
group by s.customer_id
order by s.customer_id;
9. If each $1 spent equates to 10 points and sushi has a 2x
points multiplier — how many points would each customer have?
with cte as(
select
product_id
,case
when product_id = 1 then price * 20
else price * 10
end as points
from menu
)
select
customer_id
,sum(points) points
from cte
join sales using(product_id)
group by customer_id;
10. In the first week after a customer joins the program
(including their join date) they earn 2x points on all items, not just sushi —
how many points do customers A and B have at the end of January?
In this case, if sushi was ordered before and
after a week of joining it’s point will be 2 times 10 and the rest will
be 10.
select
s.customer_id
,sum(case
when s.order_date between mem.join_date and
date_add(mem.join_date, interval 6 day) then price * 2 * 10
when m.product_id = 1 then price * 2 * 10
else price * 10
end) as points
from sales s
join members mem
on mem.customer_id = s.customer_id
join menu m using(product_id)
where order_date <= '2021-01-31'
group by s.customer_id;
Bonus Questions
1. Join All The Things
Recreate the following table output using the available
data:
select
s.customer_id
,order_date
,product_name
,price
,case
when s.order_date >= join_date
then 'Y'
else 'N'
end as member
from sales s
join menu using (product_id)
left join members mem using(customer_id);
2. Rank All The Things
Danny also requires further information about the ranking of
customer products, but he purposely does not need the ranking for non-member
purchases so he expects null ranking values for the records when
customers are not yet part of the loyalty program.
e.g table
with cte as (
select
s.customer_id
,order_date
,product_name
,price
,case
when s.order_date >= join_date
then 'Y'
else 'N'
end as member
from sales s
join menu using (product_id)
left join members mem using(customer_id)
)
select *
, case
when member = 'N'
then null
else rank() over(partition by customer_id, member order by order_date)
end as 'rank'
from cte;
This completes the first case study. I’ll be publishing the
rest soon!✌
Thank you for reading!🙏 Any thoughts or suggestions are
welcome in the comment or you can directly message and connect with me on Linkedin.
Reference: For source you can click here(ERD
and table image were sourced from the same link).
Comments
Post a Comment