Capstone Project Using a Real Dataset
A capstone project is a comprehensive, real-world assignment that allows learners to apply all the concepts and techniques covered throughout the course. It serves as a final project where students demonstrate their ability to work with real datasets, perform analysis, and present meaningful insights.
In this capstone project, a real dataset is used to perform the complete data analysis workflow. This includes data loading, cleaning, exploration, visualization, statistical analysis, and report generation.
The first step is to load the dataset into R. For example, a sales dataset stored in a CSV file can be imported as follows.
# Load dataset
data <- read.csv("sales_data.csv")
# View structure
str(data)
# Summary statistics
summary(data)
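Real CSV files often mark missing entries inconsistently, which can cause a numeric column to be read as text. The following is a minimal sketch of handling this at load time, assuming a small made-up file (its columns and values are hypothetical, not taken from sales_data.csv) in which empty fields and "N/A" both mean "missing".

```r
# Hypothetical sample file standing in for sales_data.csv
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "order_id,quantity,price",
  "1,2,9.99",
  "2,,4.50",
  "3,5,N/A"
), csv_path)

# Treat empty fields and "N/A" as missing, in addition to the default "NA"
sample_data <- read.csv(csv_path, na.strings = c("", "NA", "N/A"))

# Confirm that quantity and price were read as numeric columns
str(sample_data)

# Count missing values per column before deciding how to clean them
missing_per_column <- colSums(is.na(sample_data))
print(missing_per_column)
```

Without the na.strings argument, the "N/A" entry would force the price column to be read as character, and later arithmetic on it would fail.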
After loading the data, the next step is data cleaning and preparation. This may include handling missing values, converting data types, and creating new features.
library(dplyr)
# Remove rows that contain a missing value in any column
clean_data <- na.omit(data)
# Create new feature
clean_data <- clean_data %>%
  mutate(total_price = quantity * price)
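Converting data types and filling in missing values, rather than dropping every incomplete row, can be sketched as follows. The data frame, its column names (order_date, region) and the choice of median imputation are assumptions made for illustration; the example uses base R so that it runs without extra packages.

```r
# Hypothetical raw data: column names and values are illustrative only
raw <- data.frame(
  order_date = c("2024-01-05", "2024-01-06", "2024-01-07"),
  region     = c("North", "South", "North"),
  quantity   = c("2", "3", NA),
  price      = c(10, NA, 4),
  stringsAsFactors = FALSE
)

prepared <- raw
prepared$order_date <- as.Date(prepared$order_date)   # text -> Date
prepared$region     <- factor(prepared$region)        # text -> categorical
prepared$quantity   <- as.integer(prepared$quantity)  # text -> integer

# Alternative to dropping rows: fill missing prices with the column median
median_price <- median(prepared$price, na.rm = TRUE)
prepared$price[is.na(prepared$price)] <- median_price

# Rows that still lack a quantity cannot be priced, so they are dropped
prepared <- prepared[!is.na(prepared$quantity), ]
prepared$total_price <- prepared$quantity * prepared$price

str(prepared)
```

Whether to impute or drop depends on the dataset; imputation keeps more rows but changes the distribution of the filled column, so the choice is worth stating in the final report.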
Exploratory data analysis is then performed to understand patterns and relationships in the dataset.
library(ggplot2)
# Distribution of total price (bins set explicitly to avoid the default-bins message)
ggplot(clean_data, aes(x = total_price)) +
  geom_histogram(bins = 30) +
  theme_minimal()
# Relationship between quantity and total price
ggplot(clean_data, aes(x = quantity, y = total_price)) +
  geom_point() +
  theme_minimal()
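Plots can be backed up with numbers. As a rough sketch, the relationships between numeric columns can be quantified with a correlation matrix, and the spread of a variable with its quartiles. The values below are invented stand-ins for clean_data.

```r
# Hypothetical values standing in for clean_data
eda_data <- data.frame(
  quantity = c(1, 2, 3, 4, 5),
  price    = c(20, 18, 15, 12, 10)
)
eda_data$total_price <- eda_data$quantity * eda_data$price

# Pearson correlation between every pair of numeric columns
cor_matrix <- cor(eda_data)
print(round(cor_matrix, 2))

# Quartiles of total_price complement the histogram
total_price_quartiles <- quantile(eda_data$total_price)
print(total_price_quartiles)
```

Because total_price is computed from quantity and price, some correlation with quantity is expected by construction; the correlation between quantity and price is usually the more informative figure.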
Statistical summaries can be generated to understand key metrics.
clean_data %>%
  summarise(
    average_price = mean(price),
    total_revenue = sum(total_price)
  )
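Overall totals often hide differences between segments, so the same metrics are commonly computed per group. The sketch below assumes a hypothetical region column and made-up values, and uses base R so that it runs without extra packages.

```r
# Hypothetical cleaned data with a grouping column
sales <- data.frame(
  region      = c("North", "South", "North", "South"),
  price       = c(10, 20, 30, 40),
  total_price = c(20, 20, 90, 40)
)

# Average price and total revenue for each region
avg_price_by_region <- aggregate(price ~ region, data = sales, FUN = mean)
revenue_by_region   <- aggregate(total_price ~ region, data = sales, FUN = sum)

# Combine both metrics into one table keyed by region
region_summary <- merge(avg_price_by_region, revenue_by_region, by = "region")
names(region_summary) <- c("region", "average_price", "total_revenue")
print(region_summary)
```

With dplyr, an equivalent result comes from adding group_by(region) before the summarise() call.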
Finally, the results can be compiled into a report using R Markdown.
rmarkdown::render("capstone_report.Rmd")
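The render call expects an existing .Rmd file. A minimal sketch of such a file is shown below: it is written to a temporary directory and rendered only if the rmarkdown package and pandoc are available. The title, the section heading and the use of the built-in cars dataset are placeholders, not part of the original project.

```r
# The chunk delimiter is built at run time to avoid nesting code fences here
fence <- strrep("`", 3)

# Hypothetical minimal report source: YAML header, one heading, one R chunk
report_lines <- c(
  "---",
  'title: "Capstone Report"',
  "output: html_document",
  "---",
  "",
  "## Summary statistics",
  "",
  paste0(fence, "{r}"),
  "summary(cars)  # built-in dataset used as a stand-in",
  fence
)

rmd_path <- file.path(tempdir(), "capstone_report.Rmd")
writeLines(report_lines, rmd_path)

# Render only when rmarkdown and pandoc are available on this machine
if (requireNamespace("rmarkdown", quietly = TRUE) &&
    rmarkdown::pandoc_available()) {
  output_path <- rmarkdown::render(rmd_path, quiet = TRUE)
  print(output_path)
}
```

In an actual capstone report, the chunk would contain the loading, cleaning, plotting and summary code from the earlier steps, so that the report can be regenerated whenever the data changes.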
The capstone project helps learners gain practical experience with real data. It strengthens problem-solving skills and demonstrates readiness for real-world data analysis tasks, which is valuable for interviews and job roles.
