portfolio

A set of industry-oriented examples showcasing the tools I have mastered.

My full resume can be found here.

The topic of this portfolio comes from the final capstone project of my Google Data Analytics Professional Certificate. The project had specific goals, but I went beyond them to demonstrate my analytical skills and the tools I have mastered.

The case study

Bellabeat is a high-tech manufacturer of health-focused products for women, with the potential to become a more prominent player in the global smart device market. The Chief Creative Officer of Bellabeat believes that analyzing smart device fitness data could help unlock new growth opportunities for the company. The goal is to focus on one of Bellabeat’s products and analyze smart device data to understand how consumers use their smart devices. These insights will then help guide the company's marketing strategy.

Data Analytics

The data analytics process that I followed, as suggested by Google, is described step-by-step in this set of log files: Ask, Prepare, Process, Analyze, Share, Act.

In short, I use three datasets, which I named Fitabase (Fitbit information from 30 individuals)[1], AppleWatchFitbit (information from two smart devices worn by 23 men and 26 women)[2], and FitbitGrades (Fitbit information from 400 college students, including grades)[3]. The following table lists the topics covered, the notebooks, and the tools used for each.

| Tools used | Goals | Code/Notebooks/Links |
| --- | --- | --- |
| BigQuery, SQL, Python, Pandas, Matplotlib, Seaborn | Study the overall behavior of Fitbit consumers using the Fitabase dataset. | |
| R, Tidyverse, ggplot2 | Study the behavior of Apple Watch consumers using the AppleWatchFitbit dataset. Study differences between female and male consumers. | |
| Spreadsheets, Pivot tables | Study the behavior of female and male Fitbit customers in relation to their academic performance. | |
| Tableau, dashboards | Summarize and emphasize the previous findings using BI tools. | |

Finally, the entire project is stored in this GitHub repository.

Results and recommendations

The outcomes of this study are:

A set of slides highlighting the results and recommendations can be found here.

Data Science

After completing the capstone project and taking a deeper look at the datasets, I had some further ideas to explore with other tools I have mastered. In this part of my portfolio, I apply machine learning techniques to answer questions that the datasets raised:

| Tools used | Goals | Code/Notebooks/Links |
| --- | --- | --- |
| statsmodels regressions, scikit-learn, XGBoost, Feature importance, ML Optimizations | Can I infer the number of calories burned from other variables collected by the Apple Watch? Answering this question lets me show different regression techniques. | Link to the notebook |
| sklearn, classification, LogisticRegression, RandomForest, XGBoost, Exploratory Data Analysis, matplotlib, seaborn | The famous Titanic competition. Here I test many different ML algorithms. | Link to the notebook |
| pyspark, binary classification, SQL, Feature engineering | I use pyspark in one of the Kaggle Monthly Challenges. | Link to the notebook |
| TensorFlow, Keras, sklearn, multilabel classification, deep neural network | Classification problem using one of the CERN LHC datasets. | Link to the notebook |

More soon

References

[1] Furberg, R., Brinton, J., Keating, M., & Ortiz, A. (2016). Crowd-sourced Fitbit datasets 03.12.2016-05.12.2016 [Data set]. Zenodo. https://doi.org/10.5281/zenodo.53894

[2] Fuller, D. (2020). Replication Data for: Using machine learning methods to predict physical activity types with Apple Watch and Fitbit data using indirect calorimetry as the criterion [Data set]. Harvard Dataverse, V1. https://doi.org/10.7910/DVN/ZS2Z2J

[3] Broaddus, A., Jaquis, B., Jones, C., Jost, S., Lang, A., Li, A., et al. (2018). Dataset: Fitbits, field-tests, and grades. The effects of a healthy and physically active lifestyle on the academic performance of first year college students [Data set]. figshare. https://doi.org/10.6084/m9.figshare.7218497.v1