Customer & Sales Analysis: MySQL and Power BI
Analyzing customer purchase behavior, sales, and revenue.
Final Dashboard after analysis in Power BI
Introduction
In this project, I’ll Identify the gender breakdown
of Superstore company’s customer base, the most
popular product categories for different age groups and genders, trends
in sales and transactions over time, products that are in high
demand, and product categories that are low in stock, and most
profitable product categories. The analysis can be used to target marketing
and sales efforts more effectively, improve product assortment and marketing
campaigns, forecast future sales and make informed decisions about marketing
and sales strategies, improve inventory management and prevent stockouts, and
focus marketing and sales efforts on the most profitable areas.
1. About Dataset
The source of the dataset is a subset of Sample
superstore. It has 4 tables; Customer, Inventory, Product, and Sales.
- A
glimpse of customer dataset 4 columns:
2. Glimpse of product dataset 4 columns:
3. Glimpse of sales dataset 6 columns:
4. Glimpse of inventory dataset:
2. Data Exploration and Cleaning
I’ve added the SQL script of this on GitHub. I’ll start with Number of Rows of each table:
Checking the unique values in each id column for each table:
A mismatch in rows for the product table, means there are
Duplicates in product table, then I identify the duplicates.
The same Products have different price values. Now in this
case I’ll keep the products having maximum price value. To delete the
duplicates, I’ll need a column that is unique so I’ll create a row index column
with the help of auto-increment.
With the help of Common Table Expression and ROW NUMBER
function, I deleted the duplicates and 8332 row(s) affected, only unique 1862
rows left. I verified if there are any duplicates with:
For the product id ‘FUR-BO-1000046’ there were different
prices 104,145 etc. Now only the maximum-price product is left.
Identifying Null or blank values: I found 15
records in stock with values zero, it is possible the stocks are zero, so no
need to remove them. Some product id and quantity are missing in this case I’ll
replace the quantity with mean and remove the product_id as there are only 3
missing records.
Missing values in the sales table
I checked the quantity column for any inconsistencies such
as alphabets or special characters, as it is a character data type. So, I used
the query “SELECT quantity FROM sales WHERE quantity NOT REGEXP
‘[^0-9]’;” and 15 rows blank rows returned, which means there are no
special characters or alphabets. Then I replaced the blank quantities with
average (4) and changed the data type to Integer.
3. Analyze and Share
Connecting MySQL with Power BI: To visualize the
data, I utilized Power BI by connecting it to MySQL. I created a VIEW to
represent the query results that were intended for visual representation.
Views and Tables in the Database
I first analyzed the data related to a particular table,
such as the number of customers by gender, age range, and time period. I then
analyzed the data based on the project objective.
3.1 Number of customers by gender: Female
673 & Male 131.
3.2 Age range: The customer’s Age
ranges from 18 to 80.
3.3 Order dates: From 2019–01–03 to
2022–12–30.
3.4 Transaction & Sales by Year:
Sales have increased from 2.6 million in 2019 to 4.3 million
in 2022, with the number of transactions increasing as well. 2022 had the
highest number of transactions, while 2019 had the lowest. The trend of
increasing sales and transactions is evident from year to year
3.5 Number of Transactions & Sales by Month:
December achieved 1.9 million in sales, while February
recorded approximately 0.35 million. November saw 755 transactions, whereas
February had 162. Notably, the last quarter of the year experienced higher
sales and transaction volumes.
3.6 Sales & Number of Transactions by Day of Week:
Sunday garnered 2.5 million in sales, whereas Tuesday
reached approximately 0.7 million. Additionally, Sunday recorded 956
transactions, and Tuesday had 274. Evidently, customers tend to engage in more
shopping activities during the latter part of the week.
3.7 Transaction and Sales by Age: Customers who
are 58 years old contribute the highest to sales and transactions, while those
who are 67 years old contribute the least. To better understand the
distribution of sales in respect of age, I created age bins of size 10. It
tells us that customers aged between 20 to 29 have made the most transactions.
3.8 Sales by Gender: As there are more female
customers, it was anticipated that the highest sales would come from females
compared to male customers.
3.9 Percentage of products with zero stocks
3.10 Total revenue generated by the company for each
product category
We can see that the most revenue is coming from Furniture
57% of the total, then office supplies then the least from Technology
contributing only 14% to total revenue.
3.11 Top 5 customers who have made the highest total
purchases, considering the customer’s age and gender.
The top 5 customers predominantly consist of females,
outnumbering males. There is a significant age variation among these customers,
ranging from 20 to 61, with similar patterns observed for both genders.
3.12 Most profitable product category by calculating the
average revenue per unit sold.
Furniture is the most profitable product category based on
the average revenue per unit sold, followed by Office Supplies and Technology.
Furniture accounts for 57% of the contribution, with revenue of 1.9M, while
Technology represents the least profitable category with a contribution of 15%.
3.13 The average age of customers for each product
category.
The Furniture category has an average age of 44, while both
Technology and Office Supplies categories have an average age of 43. This
suggests that there is a slight variation in the age of customers across these
product categories, with Furniture attracting slightly older customers on
average.
3.14 Top product category that has the highest average
transaction amount.
Technology has the highest average transaction amount
(2573). Office Supplies closely follows with a slightly lower average
transaction amount (2552). Furniture has a slightly lower average transaction
amount (2513).
This suggests that customers, on average, spend the most
per transaction in the Technology category, followed by Office Supplies and
then Furniture. It indicates that Technology products may have a higher price
point compared to the other categories.
3.15 Popular product category by Gender
Among both gender office Supplies is the popular product
category. However, the contribution of female customers is more than males in
every product category.
3.16 Popular product category by Age Group
Here also Office supplies are the most popular category
among all the age groups, followed by furniture then technology. And the
popularity of the category decreases as we move toward the younger age groups.
3.17 Product Category having Zero Stock in the inventory.
I checked the inventory for products with zero stock. Office
supplies have the highest number of zero stock, followed by furniture and
technology, which have the same percentage.
3.18 YoY% Revenue Growth
4. Conclusion
4.1 Target Audience: The company’s
customer base is predominantly female, so it makes sense to target more
female customers in its marketing and advertising campaigns. And in terms of
Age, target middle age customers as this age group contributes
more to sales.
4.2 Continue to focus on Furniture as
it is the company’s most profitable product category. This could be done by
expanding the company’s selection of office supplies, or by offering discounts
on office supplies. Market office supplies to older adults. Office supplies are
also popular among older adults, so the company could market its office
supplies to this group.
4.3 Consider expanding the Technology product
category it is currently the company’s least profitable product
category, but it has the potential to be more profitable. The company could
consider expanding this category by adding new furniture products, or by
offering discounts on furniture products.
4.4 Product Stock: It is important to
understand why there is zero stock for product categories. Also, monitor
inventory levels on a regular basis to ensure that the company has enough
products to meet customer demand, company can set up alerts for low inventory
levels so that it is notified when a product category is running low on stock.
4.5 Increase marketing and advertising
efforts during the last quarter of the year. This is when sales and
transactions are highest, so it makes sense to focus marketing efforts during
this time. Target more customers on Sunday. Offer discounts
or promotions during the off-peak seasons. This could help to attract
customers during times when sales are typically lower.
Any thoughts or suggestions are welcome in the comment or
you can directly message and connect with me on Linkedin.
Comments
Post a Comment