A Data Science Story: Data Analysis in Education and the Digital Divide
Here is our analysis:
Step 1: Data Wrangling
Using US Census data (2019 estimates), we wrangled the data and created a Utility Matrix that we will use for the calculations. Since we are focused on the Digital Divide and how it affects education, we used the following data fields:
- Total Households
- Persons per Household
- Percent of Households with Computers
- Percent of Households with Broadband Internet Access
Step 2: EDA & Hypothesis
Let's look at the bar chart of the percentages.
Next, we compute the number of households and the number of persons affected by the digital divide in each of these villages. We take the complement of each percentage (1 minus the share of households with computers, and 1 minus the share with internet access) and use it in our computation.
- Total households without computers = Total households x (1 - PCT with computers)
- Total households without internet = Total households x (1 - PCT with internet)
- Total persons without computers = (Total households without computers) x (Persons per household)
- Total persons without internet = (Total households without internet) x (Persons per household)
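The four formulas above can be sketched in a few lines of Python. This is only an illustration, not the code used for our analysis: the function name `digital_divide_counts` and the input values in the example are made up, and we assume the percentages are expressed as fractions between 0 and 1 (for example, 0.90 for 90%).

```python
def digital_divide_counts(total_households, persons_per_household,
                          pct_with_computers, pct_with_internet):
    """Apply the four formulas to one community.

    Assumes the two pct_* arguments are fractions between 0 and 1.
    """
    # Households affected: total households times the complement of each share.
    households_without_computers = total_households * (1 - pct_with_computers)
    households_without_internet = total_households * (1 - pct_with_internet)

    # Persons affected: affected households times average household size.
    return {
        "households_without_computers": households_without_computers,
        "households_without_internet": households_without_internet,
        "persons_without_computers": households_without_computers * persons_per_household,
        "persons_without_internet": households_without_internet * persons_per_household,
    }


# Hypothetical inputs for illustration only (not Census figures).
example = digital_divide_counts(
    total_households=1000,
    persons_per_household=3.0,
    pct_with_computers=0.90,
    pct_with_internet=0.80,
)
for name, value in example.items():
    print(f"{name}: {round(value)}")
```

If the source data stores the percentages as values between 0 and 100, divide them by 100 before passing them in.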
The results are shown in the table below:
Step 3: Conclusion & Impact
Converting the percentages into numbers shows the real impact the “Digital Divide” has on communities in our area. Sometimes showing impact as a percentage does not bring to light the seriousness of the problem. In the case of Hempstead, we can see that the lack of access to computers and internet affects 5,010 + 10,891 persons. A total of over 15,000 persons are impacted by the digital divide.
Now let’s take a look at the visualization, not as percentages but as the actual number of persons affected.