Month: September 2023

Today’s Exploration: Deep Dives into Random Forest and Geographic Heat Maps

Today was a day of deep dives. I got my hands dirty implementing a Random Forest regression model on our U.S. county health metrics data. The objective was clear: to make sense of the complex relationships between health indicators such as diabetes rates, obesity levels, and physical inactivity. But…

The Hidden Patterns of Health: A Cluster Analysis of CDC Data

The analysis begins with data cleaning and standardization, followed by k-means clustering to group counties based on their health metrics. The optimal number of clusters was determined to be four. Various visualizations were created to explore these clusters. Key Findings: 1. Cluster-wise Box Plots: Box plots were used to visualize…
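
The standardize-then-cluster pipeline described in this excerpt can be sketched in plain Python. This is only an illustration under stated assumptions, not the post's actual code: the excerpt does not say which library or initialization was used, so the sketch uses a basic Lloyd's-algorithm k-means seeded with the first k rows. The helper names (`standardize`, `kmeans`) and the sample values are hypothetical, and the numbers are made up rather than taken from the CDC data. Only the choice of four clusters comes from the post.

```python
import math

def standardize(rows):
    """Rescale each column to zero mean and unit (population) standard deviation."""
    n = len(rows)
    columns = list(zip(*rows))
    means = [sum(col) / n for col in columns]
    stds = [
        math.sqrt(sum((v - m) ** 2 for v in col) / n)
        for col, m in zip(columns, means)
    ]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, max_iterations=100):
    """Basic Lloyd's algorithm; the first k points seed the centroids (an assumption)."""
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the centroids are stable
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Made-up rows of (diabetes %, obesity %, inactivity %); not real county data.
rows = [
    [6.0, 15.0, 12.0],
    [8.0, 18.0, 16.0],
    [10.0, 21.0, 20.0],
    [12.0, 24.0, 24.0],
    [6.2, 15.3, 12.2],
    [8.1, 18.2, 16.1],
    [10.2, 21.1, 20.3],
    [12.1, 24.2, 24.1],
]

scaled = standardize(rows)
labels, centroids = kmeans(scaled, k=4)  # four clusters, as reported in the post
print(labels)
```

Standardizing first keeps any one metric from dominating the distance calculation just because it is measured on a wider scale. In practice a library implementation with a better initialization scheme would likely be preferable to this naive seeding.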

Clusters Unveiled: Grouping States by Health Metrics Reveals Surprising Patterns

After exploring the states with the highest and lowest rates of diabetes, obesity, and physical inactivity, I delved deeper to understand how states could be grouped based on these health metrics. Through clustering analysis, I found some intriguing patterns that might offer valuable insights for public health interventions. The Power…

The Geographic Divide: States with the Highest and Lowest Rates of Diabetes

I analyzed data from the Centers for Disease Control and Prevention (CDC) for the year 2018 to identify states that are most affected by these conditions. My Findings on Geographic Variations: Top 5 States with the Highest Rates of Diabetes: Through my analysis, I found that the states most affected…

Understanding the Landscape of Diabetes in U.S. Counties: A Statistical Overview

This post aims to provide a statistical overview of that dataset, offering insights into the average rate of diabetes, its variability, and the range of percentages across counties. Key Statistical Findings: Count: A total of 3,142 counties were surveyed in 2018. Mean: The average rate of diabetes was approximately 8.72%…
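
The kind of summary listed in this excerpt (count, mean, variability, range) can be reproduced with Python's standard library. A minimal sketch, assuming the county percentages are already loaded into a list: `summarize` is a hypothetical helper name, and the sample values are placeholders for illustration, not figures from the CDC dataset.

```python
import statistics

def summarize(rates):
    """Return count, mean, sample standard deviation, minimum, and maximum."""
    return {
        "count": len(rates),
        "mean": statistics.mean(rates),
        "std": statistics.stdev(rates),  # sample standard deviation (n - 1 denominator)
        "min": min(rates),
        "max": max(rates),
    }

# Placeholder percentages for illustration only; not real county data.
sample_rates = [7.0, 8.0, 9.0, 10.0, 11.0]

summary = summarize(sample_rates)
for name, value in summary.items():
    print(f"{name}: {value}")
```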

Exploratory Data Analysis: Scatter Plots and Correlation Matrix for Diabetes, Obesity, and Inactivity Metrics

In today’s analysis, I begin with an introduction and then focus on exploring relationships between diabetes, obesity, and inactivity rates. The data is loaded from an Excel file, and the necessary libraries are imported. Step 1: Uploading the Excel File. The analysis starts by prompting the user to upload…
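
To make the correlation-matrix step concrete, here is a small sketch of a Pearson correlation matrix in plain Python. It assumes the three metrics are already available as equal-length lists, whereas the post itself loads them from an Excel file. The function names (`pearson`, `correlation_matrix`) are hypothetical and the values are made up for illustration, so the resulting coefficients say nothing about the real data.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def correlation_matrix(columns):
    """Build a nested dict of pairwise correlations from a dict of named columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}

# Made-up percentages for illustration only; not real county data.
toy = {
    "diabetes": [7.0, 8.0, 9.0, 10.0],
    "obesity": [14.0, 16.0, 18.0, 20.0],
    "inactivity": [10.0, 12.0, 11.0, 13.0],
}

matrix = correlation_matrix(toy)
for name, row in matrix.items():
    print(name, {other: round(r, 2) for other, r in row.items()})
```

Each diagonal entry is 1 by construction, and the matrix is symmetric, so only the off-diagonal pairs carry information about how the metrics move together.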