“ETL” stands for Extract, Transform, and Load. The ETL process extracts, transforms, and loads data from multiple sources to a data warehouse (single data source) where that data can be used in decision-making. ETL is the foundation for data analytics.
KAT Concession Supply is a fictitious company that supplies food, paper products, and other items to fairs, amusement parks, and other organizations. Information in the KAT Concession Supply data sets includes information about its sales over a three-year period. The data from KAT Concession Supply has errors and irregularities in it and it must be cleaned.
A Google Sheets interactive dashboard will be built after the data cleaning steps are completed. This dashboard contains two visualizations. The data needs to be cleaned before preparing the dashboard; otherwise, the dashboard information will be incorrect.
After completing Parts 1 and 2 for the project, answer the questions that follow the requirements.
Need to use Google Sheets for this project
Google Sheets must be used for this project. The project is not designed for use in Excel or any other spreadsheet program.
Complete step-by-step instructions are provided for this project. You have the choice of viewing the tutorial videos or downloading a pdf file of the annotated slides. The tutorials for this project have been split into two parts:
The data set for this project can be accessed at this link. Please make a copy of the dataset; you will not be able to edit the file if you do not make a copy.
Requirements for Part 1: Data cleaning
To follow are the requirements for cleaning the data records in the data set. The complete instructions for these requirements are contained in the tutorial videos linked in page 1 of this document.
- Open the Google Sheets data file provided.
- Widen all columns to fit the data.
- Find and remove duplicate records.
- Remove spaces and nonprinting characters from the Transaction year (original) column, placing the results in the Transaction year (edited) column.
- Replace all instances of “coffee” in Columns G and H with “cola.”
- Change the case of the contents of Warehouse state (original) to uppercase with a formula in the Warehouse state (edited) column.
- Format Columns AE through AH as currency with zero decimal places. Format Column AI as currency with two decimal places.
- Save the Google Sheets file with your last name in the beginning of the file name.
Requirements for Part 2: Dashboard
To follow are the requirements for constructing a two-visualization interactive dashboard with the cleaned data set that you saved upon completion of Part 1. The complete instructions for these requirements are contained in the tutorial videos linked in page 1 of this document.
Create a two-visualization dashboard that contains a slicer that is based on the field “Transaction year (edited).” Include the following two charts that are connected to the slicer:
- Total revenue for edible and nonedible products (column chart)
- Total revenue for the cotton candy inventory type (horizontal bar chart)
Using the dashboard that you just created, answer the following questions:
- What is the total revenue for nonedible products for one year prior?
- What is the total revenue for “Cotton Candy Cones, 80 pcs” for two years prior?
- What is the total revenue for edible products for the current year?
- What is the total revenue for “Cotton Candy Floss Sugar, 60 servings, pumpkin spice” for one year prior?
- What was the top revenue-generating Cotton Candy product for two years prior?
- Did edible products or nonedible products generate the most revenue in one year prior?
This project was developed by Dr. Wendy Tietz, CPA, CMA, CGMA, CSCA. All rights reserved.