🔍 How I Calculated the High-Risk Areas (density and provider characteristics)

kelper/addis_care_real_data_analysis.py

Step 1: Data Aggregation by ZIP Code

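# Group by ZIP and count providers
zip_analysis = df.groupby('zip').agg(
    total_providers=('provider_type', 'count'),  # Total providers per ZIP
    state=('state', 'first'),                    # State for each ZIP
).reset_index()

# Separate ALF and HCBS counts by ZIP
alf_by_zip = alf_providers.groupby('zip').agg(
    alf_count=('provider_type', 'count')         # ALF count per ZIP
).reset_index()

hcbs_by_zip = hcbs_providers.groupby('zip').agg(
    hcbs_count=('provider_type', 'count')        # HCBS count per ZIP
).reset_index()

# Merge the per-type counts so Step 2 can use alf_count / hcbs_count;
# ZIPs with no providers of a given type get a count of 0
zip_analysis = (
    zip_analysis
    .merge(alf_by_zip, on='zip', how='left')
    .merge(hcbs_by_zip, on='zip', how='left')
    .fillna({'alf_count': 0, 'hcbs_count': 0})
)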

Step 2: Calculate Provider Percentages

# Calculate what percentage of providers are ALF vs HCBS
zip_analysis['alf_percentage'] = zip_analysis['alf_count'] / zip_analysis['total_providers'] * 100
zip_analysis['hcbs_percentage'] = zip_analysis['hcbs_count'] / zip_analysis['total_providers'] * 100
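
To make Steps 1 and 2 easier to check, here is a minimal self-contained sketch that runs both on a toy table. The sample rows, the `provider_mix_by_zip` helper, and the `'ALF'` / `'HCBS'` labels in `provider_type` are assumptions for illustration, not taken from the real dataset or script. It counts both provider types in a single groupby instead of three separate ones.

```python
import pandas as pd

# Hypothetical toy data; the real `df` is loaded by the analysis script.
df = pd.DataFrame({
    "zip": ["10001", "10001", "10001", "20002", "20002"],
    "state": ["NY", "NY", "NY", "DC", "DC"],
    "provider_type": ["ALF", "HCBS", "HCBS", "ALF", "ALF"],  # assumed labels
})


def provider_mix_by_zip(df: pd.DataFrame) -> pd.DataFrame:
    """Count providers per ZIP and compute ALF/HCBS shares in percent."""
    out = (
        df.assign(
            is_alf=df["provider_type"].eq("ALF"),
            is_hcbs=df["provider_type"].eq("HCBS"),
        )
        .groupby("zip")
        .agg(
            total_providers=("provider_type", "count"),
            state=("state", "first"),
            alf_count=("is_alf", "sum"),    # summing booleans counts the True rows
            hcbs_count=("is_hcbs", "sum"),
        )
        .reset_index()
    )
    # Every ZIP in the result has at least one provider, so the divisor is never 0.
    out["alf_percentage"] = out["alf_count"] / out["total_providers"] * 100
    out["hcbs_percentage"] = out["hcbs_count"] / out["total_providers"] * 100
    return out


mix = provider_mix_by_zip(df)
print(mix)
```

In this toy table, ZIP 20002 has only ALF providers, so its HCBS share comes out as 0 rather than missing.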

Step 3: Define Risk Factors (Based on Industry Knowledge)

Since we don't have real Medicaid data, I used provider characteristics as proxies for Medicaid dependency. The thresholds below come from industry knowledge, not from measured funding data:

Risk Factor 1: HCBS-Dominant Areas (>70% HCBS)

Rationale: HCBS agencies rely heavily on Medicaid funding
Risk Level: CRITICAL
Impact: Policy changes directly affect service delivery

Risk Factor 2: High Provider Density (>100 total providers)

Rationale: More providers competing for limited Medicaid dollars
Risk Level: CRITICAL
Impact: Higher operational costs and vulnerability to funding cuts

Risk Factor 3: ALF-Heavy Areas (>50% ALF)

Rationale: Mixed funding models create uncertainty
Risk Level: HIGH
Impact: Medicaid-dependent residents at risk
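
The three thresholds above can be expressed as boolean flags per ZIP. This is a rough sketch: the `flag_risk_factors` helper, the flag column names, and the sample rows are hypothetical, and it assumes a table that already has the `total_providers`, `alf_percentage`, and `hcbs_percentage` columns from Step 2. It only marks which factors apply and does not combine them into a score.

```python
import pandas as pd

# Thresholds copied from the three risk factors described above.
HCBS_DOMINANT_PCT = 70        # Risk Factor 1: >70% HCBS
HIGH_DENSITY_PROVIDERS = 100  # Risk Factor 2: >100 total providers
ALF_HEAVY_PCT = 50            # Risk Factor 3: >50% ALF


def flag_risk_factors(zip_analysis: pd.DataFrame) -> pd.DataFrame:
    """Add one boolean column per risk factor (column names are illustrative)."""
    out = zip_analysis.copy()
    out["hcbs_dominant"] = out["hcbs_percentage"] > HCBS_DOMINANT_PCT
    out["high_density"] = out["total_providers"] > HIGH_DENSITY_PROVIDERS
    out["alf_heavy"] = out["alf_percentage"] > ALF_HEAVY_PCT
    return out


# Hypothetical rows, one per interesting case.
example = pd.DataFrame({
    "zip": ["11111", "22222", "33333"],
    "total_providers": [120, 40, 10],
    "alf_percentage": [20.0, 60.0, 50.0],
    "hcbs_percentage": [80.0, 40.0, 50.0],
})

flags = flag_risk_factors(example)
print(flags[["zip", "hcbs_dominant", "high_density", "alf_heavy"]])
```

The comparisons are strict, so a ZIP sitting exactly on a threshold (like the 50% ALF row here) is not flagged.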

Step 4: Calculate Risk Scores