Project
3
Members

Status: Active/Ongoing
Project maturity: Proof of concept
Linked to group(s)/challenge(s):

Community Vulnerability Index

About reviewed project
The Community Vulnerability Index is an open-source, science-backed needs assessment tool. Using data-driven metrics and AI, CVI evaluates a community’s well-being and delivers actionable information to those in a position to effect change.




1.0 Introduction 


1.1 Problem and Background


Community-focused organizations and nonprofit groups often face the seemingly impossible task of doing more with less. They are expected to transform limited resources into a notable impact for their communities. In addition to illness, the COVID-19 pandemic brought financial burdens that many were unprepared to face. The resulting economic harm forced many nonprofits and other community organizations to make budget cuts. Consequently, the communities these groups serve sometimes go without much-needed resources.


As community needs vastly increase due to the pandemic, efficient and effective resource planning and distribution become crucial. However, this process is not straightforward as the effects of COVID can interact in complicated ways. For example, two communities with similar infection rates may experience different hospitalization rates due to a higher prevalence of other complicating factors such as obesity or diabetes.


Untangling how the effects of COVID are interacting in different communities has become a priority as we begin the process of recovering from the pandemic. There is no avoiding the economic harm resulting from budget cuts and their increased burden on nonprofits. However, we can help mitigate the consequences by making smarter, data-driven decisions about resource distribution.


1.2 Solution summary in simple terms


We propose a solution that leverages data science and machine learning to help nonprofits and other organizations obtain insight into their communities.


The Community Vulnerability Index (CVI) is a science-backed, open-source needs assessment tool intended mainly for nonprofits. Using data-driven metrics and machine learning, CVI evaluates a community’s well-being and needs. It then delivers actionable information that nonprofits and community-focused organizations can use to target their resources effectively. Through a web application, users can access several indicators representing health, socioeconomics, policy, and more. They can also view the potential vulnerabilities within their communities, as predicted by our machine learning models.


1.3 Solution summary in technical terms


In technical terms, CVI can be described by its front-end and back-end components. The front-end is a dashboard built in Python with the Plotly Dash framework. We are developing the app using Flask, a lightweight web application framework that is widely used in Python. The dashboard is styled with HTML, CSS, Bootstrap, and JavaScript for a clean presentation.


CVI’s back-end is where data is processed and machine learning models are built to identify and project vulnerabilities within communities. To quantify community vulnerability, we derived a series of indices that measure community needs along different axes. Predictive modeling is then used to determine where resources are best used.


1.4 State of advancement of the project


CVI has undergone user testing with various nonprofit organizations. Early feedback has indicated they find the product useful and wish to collaborate in developing customized vulnerability indices, including an Access to Sexual Health Education Score for Planned Parenthood.


1.5 Project Timeline


The Community Vulnerability Index was started in 2020 and is an ongoing project. Short-term milestones are listed below:


February 2021

  • Definition of new Economic Harm vulnerability metric
  • Evaluation of clustering schemes

March 2021

  • Data cleaning and merging for Economic Harm metric
  • Assessment of predictive power of current COVID severity metric

April 2021

  • Refinement of COVID severity metric based on results
  • Implementation and analyses of clustering schemes

May 2021

  • Analysis of model accuracies based on a retrospective longitudinal study
  • Finalization of Economic Harm metric

June 2021

  • Refinement of Machine Learning models based on results
  • Incorporation of Economic Harm vulnerability metric into the dashboard

July 2021

  • Beginning of user testing outside of partner organizations
  • Intensification of public outreach efforts through non-academic publications and increased promotion


2.0 Project Implementation


2.1 Solution


This project is taking place in the United States and will be made freely accessible to any community-focused organization. We expect its stakeholders to include healthcare professionals, policymakers, researchers, nonprofit groups, philanthropists, and anyone working toward social good.


The main deliverable of this project is an interactive web application. Using the dashboard, stakeholders can visualize data on health, infrastructure, policy, demographics and more, for each county in the United States. The platform also provides a variety of derived metrics that quantify community vulnerabilities across different axes. To date, users can access eight vulnerability indices:


  • risk of severe case complications
  • risk of economic harm
  • need for mobile health resources
  • need for food resources
  • need for mental health resources
  • risk of overwhelming health care infrastructure
  • community connectedness
  • information deserts


Figure 1: Community Vulnerability Index Dashboard


A mock-up of the application is shown in Figure 1. Upon accessing the web application, users are presented with a full map of the United States as well as national statistics regarding COVID-19. To obtain statistics relevant to their county, they can interact with the map by zooming in on the area, by clicking the Magnifying Glass icon and searching for the location, or by selecting their state from the dropdown menu.


Users can choose what data is presented to them by adding layers to the map using the Layers icon in the top right corner of the map. Each layer represents a variable (e.g., % adults 65 or older, % smokers) or a calculated vulnerability index. In the module to the right of the map, users can select a state to view county vulnerability scores and their rankings within the state. The arrows at the bottom toggle between the scores for different vulnerability indices.
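The within-state ranking shown in the right-hand module could be computed along the following lines. This is a minimal sketch, not the dashboard's actual code: the function name `rank_counties_within_state`, the tuple layout, and the convention that the highest score receives rank 1 are assumptions, and the sample scores are made up for illustration.

```python
def rank_counties_within_state(scores):
    """Group (state, county, score) rows by state and rank counties.

    Assumption: a higher score means higher vulnerability, so it gets rank 1.
    Returns a dict mapping state -> list of (rank, county, score).
    """
    by_state = {}
    for state, county, score in scores:
        by_state.setdefault(state, []).append((county, score))

    rankings = {}
    for state, rows in by_state.items():
        # Sort descending by score; ties keep their input order.
        ordered = sorted(rows, key=lambda row: row[1], reverse=True)
        rankings[state] = [
            (rank, county, score)
            for rank, (county, score) in enumerate(ordered, start=1)
        ]
    return rankings


# Made-up scores for placeholder counties.
example_scores = [
    ("NJ", "County A", 40.0),
    ("NJ", "County B", 75.5),
    ("NY", "County C", 10.0),
]
example_rankings = rank_counties_within_state(example_scores)
```

Switching between indices with the arrows would then amount to calling the same function on a different score column.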


Although this project was conceived to help our stakeholders address the impacts of COVID-19 in their communities, we expect its utility to outlive the pandemic. Our vulnerability indices do not pertain exclusively to COVID-19, and we are planning to add more metrics, translating stakeholders’ insights into community needs.


2.2 Methodology


Data


The Community Vulnerability Index leverages data from the following open-access sources:


  • The New York Times COVID-19 Data Repository
  • The New York City Health Department COVID-19 Data Repository
  • United States Census Bureau County Population by Characteristics: 2010-2019, American Community Survey, and Small Area Health Insurance Estimates
  • United States Diabetes Surveillance System (2017)
  • Centers for Disease Control and Prevention Interactive Atlas of Heart Disease and Stroke and Wonder Multiple Cause of Death database
  • Institute for Health Metrics and Evaluation Chronic Respiratory Disease Mortality Rates by County 1980-2014
  • County Health Rankings Model Adult smoking
  • Centers for Medicare & Medicaid Services COVID-19 Nursing Home Dataset
  • United States Department of Agriculture Economic Research Service
  • The Robert Wood Johnson Foundation County Health Rankings and Roadmaps Program
  • Homeland Infrastructure Foundation-Level Data Open Data
  • National Center for Education Statistics Number and percentage of households with computer and internet access, by state: 2015


Adding new variables to the CVI dataset follows a carefully defined pipeline. The process starts with the identification of a new vulnerability index through an extensive literature review. Once the index is identified, we locate open-access data sources for each of the variables that make it up. To be included, the data must be race- and age-adjusted, cover the entire adult population, and be reported at the county level. We then clean the data and address any missing values. Finally, the new variables are merged into the full dataset.
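The last three steps of this pipeline (checking the inclusion criteria, handling missing values, and merging) can be sketched as below. Everything specific here is an assumption made for illustration: the metadata flag names, the county keys, the variable names, and in particular the use of median imputation, since the text does not say how missing values are addressed. The sample values are made up.

```python
from statistics import median

# Hypothetical metadata flags mirroring the three inclusion criteria.
REQUIRED_FLAGS = ("race_and_age_adjusted", "covers_all_adults", "county_level")


def meets_inclusion_criteria(metadata):
    """Return True only if the source satisfies every inclusion criterion."""
    return all(metadata.get(flag, False) for flag in REQUIRED_FLAGS)


def clean_variable(raw, counties):
    """Return a value for every county, filling gaps with the median.

    Median imputation is an assumption, not the project's stated method.
    """
    observed = [raw[c] for c in counties if raw.get(c) is not None]
    if not observed:
        raise ValueError("no observed values to impute from")
    fill = median(observed)
    return {c: raw[c] if raw.get(c) is not None else fill for c in counties}


def merge_variable(dataset, name, cleaned):
    """Return a copy of the dataset with the new variable added per county."""
    return {county: {**row, name: cleaned[county]} for county, row in dataset.items()}


# Made-up example: three placeholder counties and one new variable.
dataset = {
    "county_1": {"pct_smokers": 18.0},
    "county_2": {"pct_smokers": 21.5},
    "county_3": {"pct_smokers": 15.2},
}
raw_food_insecurity = {"county_1": 10.0, "county_2": None, "county_3": 30.0}
source_metadata = {
    "race_and_age_adjusted": True,
    "covers_all_adults": True,
    "county_level": True,
}

if meets_inclusion_criteria(source_metadata):
    cleaned = clean_variable(raw_food_insecurity, list(dataset))
    merged = merge_variable(dataset, "food_insecurity", cleaned)
```

The merge returns a new dictionary rather than modifying the existing one, so a rejected or faulty variable never touches the full dataset.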


Construction of the dataset is an ongoing activity as vulnerability indices are continuously reviewed and added. The full dataset, along with definitions for all included variables, is available for download on our GitHub.


Metrics and Modeling


Vulnerability indices are calculated by quantile normalizing each comprising variable to a Gaussian distribution based on the full set of U.S. counties. The scaled variables are added using a weighted linear combination and the result is divided by the number of factors multiplied by weights to get a value between 0 and 100. Weights for each variable are determined through a review of published CDC information and other peer-reviewed literature.
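One possible reading of this calculation is sketched below. It makes several assumptions that the text does not confirm: the Gaussian quantile normalization is done with a rank-based inverse normal transform, "divided by the number of factors multiplied by weights" is interpreted as dividing by the sum of the weights, and the final 0 to 100 range is obtained by min-max rescaling across counties. The variable names, weights, and values are made up.

```python
from statistics import NormalDist


def quantile_normalize_to_gaussian(values):
    """Map values to standard-normal scores by rank (ties share a rank)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    normal = NormalDist()
    # (rank - 0.5) / n keeps each probability strictly between 0 and 1.
    return [normal.inv_cdf((rank - 0.5) / n) for rank in ranks]


def vulnerability_index(variables, weights):
    """Combine variables into one score per county on a 0-100 scale.

    variables: dict of name -> list of county values (same county order).
    weights:   dict of name -> weight.
    """
    names = list(variables)
    n = len(variables[names[0]])
    scaled = {name: quantile_normalize_to_gaussian(variables[name]) for name in names}
    total_weight = sum(weights[name] for name in names)
    combined = [
        sum(weights[name] * scaled[name][c] for name in names) / total_weight
        for c in range(n)
    ]
    low, high = min(combined), max(combined)
    if high == low:
        return [50.0] * n  # all counties identical: no spread to rescale
    # Assumption: min-max rescaling produces the 0-100 range.
    return [100 * (x - low) / (high - low) for x in combined]


# Made-up values for four placeholder counties.
example_variables = {
    "pct_smokers": [10.0, 20.0, 30.0, 40.0],
    "pct_diabetes": [5.0, 6.0, 7.0, 8.0],
}
example_weights = {"pct_smokers": 2.0, "pct_diabetes": 1.0}
example_index = vulnerability_index(example_variables, example_weights)
```

Because the transform depends only on ranks, a single extreme county cannot stretch the scale for everyone else, which is one reason to normalize by quantiles before combining variables.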


We are currently exploring several strategies to improve our vulnerability scoring. To better understand the similarities between communities, we are examining the use of unsupervised learning methods such as clustering. We expect the resulting clusters to provide insight into additional axes on which we can evaluate communities.
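As an illustration of what such a clustering step might look like, here is a small k-means sketch. The text does not name a specific algorithm, so k-means is an assumption, as are the function name, the fixed starting centroids, and the made-up two-dimensional points standing in for county feature vectors.

```python
import math


def kmeans(points, initial_centroids, max_iter=100):
    """Plain k-means: alternate assignment and centroid update until stable.

    points:            list of equal-length numeric tuples (one per county).
    initial_centroids: starting centroids, passed in to keep runs repeatable.
    Returns (labels, centroids).
    """
    centroids = [tuple(c) for c in initial_centroids]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, label in zip(points, new_labels) if label == k]
            if members:
                new_centroids.append(
                    tuple(sum(m[d] for m in members) / len(members) for d in range(len(old)))
                )
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid
        converged = new_labels == labels and new_centroids == centroids
        labels, centroids = new_labels, new_centroids
        if converged:
            break
    return labels, centroids


# Made-up points forming two obvious groups.
example_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
example_labels, example_centroids = kmeans(example_points, [(0, 0), (10, 10)])
```

In practice the inputs would be the normalized county variables, and the number of clusters would itself need to be evaluated.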


We will also use time series forecasting, through LSTM (long short-term memory) or similar models, to improve the accuracy of scores that rely on COVID-19 case counts. We expect this to yield better results than simply using the current case count.
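Whether a forecasting model actually beats the current case count can be checked with a simple backtest. The sketch below is not an LSTM: it uses a linear trend extrapolation as a stand-in forecaster so the example stays self-contained, and any model exposing the same `(history, horizon)` interface could be substituted. The function names, window size, and synthetic series are all assumptions.

```python
def persistence_forecast(history, horizon):
    """Baseline from the text: assume the current count stays unchanged."""
    return float(history[-1])


def linear_trend_forecast(history, horizon, window=7):
    """Stand-in forecaster (not an LSTM): extend the recent linear trend."""
    recent = history[-window:]
    if len(recent) < 2:
        return float(history[-1])
    slope = (recent[-1] - recent[0]) / (len(recent) - 1)
    return recent[-1] + slope * horizon


def backtest_mae(series, forecaster, horizon, min_history=7):
    """Mean absolute error of horizon-step-ahead forecasts over the series."""
    errors = []
    for t in range(min_history, len(series) - horizon + 1):
        history = series[:t]               # data available at forecast time
        actual = series[t + horizon - 1]   # value observed `horizon` steps later
        errors.append(abs(forecaster(history, horizon) - actual))
    if not errors:
        raise ValueError("series too short for this horizon")
    return sum(errors) / len(errors)


# Synthetic, steadily rising counts (not real data).
synthetic_cases = [10 * day for day in range(30)]
baseline_error = backtest_mae(synthetic_cases, persistence_forecast, horizon=7)
trend_error = backtest_mae(synthetic_cases, linear_trend_forecast, horizon=7)
```

On this synthetic series the trend forecaster is exact while the baseline lags by a week of growth; real case counts would show a much smaller gap, which is exactly what the backtest is meant to measure.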


Finally, we are researching the use of a neural network or other supervised learning methods to predict severe COVID-19 cases. Interpretability methods applied to the trained model will allow us to refine the weights associated with the current metric, thus improving accuracy.


2.3 Results/Expected results


At the end of this project, we expect to deliver a tool that can assist nonprofits and other organizations with resource planning and distribution. Early feedback has indicated that stakeholders find CVI useful, and some have expressed interest in collaborating on developing customized vulnerability indices. Another expectation is that users of CVI will be able to leverage the insights obtained from the product to bolster their grant applications. As such, we believe CVI could also help secure funding.


3.0 Safety, quality assurance and regulation


3.1 What steps have you taken to ensure your solution’s safety? How advanced are you in this process (if applicable)? Please check the Biosafety and Biosecurity guideline of OpenCovid19


N/A - We do not collect any clinical samples. The data we use has been aggregated at a county level, and as such contains no identifying information for individuals.


3.2 Have you planned the conduct of your manufacturing process that ensures quality, what are the steps you have taken? How advanced are you in this (if applicable)?


N/A


3.3 Will you need assistance with the regulation system? If not, which regulatory system do you plan on using to distribute the product? Please elaborate (please see: Regulatory-Strategies) (if applicable)


N/A


3.4 Have you talked to medical staff about the feasibility of your project? What did they say? 


N/A


3.5 Have you planned the testing, verification and validation of your solution? How advanced are you? (if applicable)


The accuracy of the vulnerability indices will be validated through a retrospective longitudinal study. We will start by identifying a list of past events that may have affected a community’s vulnerability (e.g., policy changes). We will then plot the vulnerability scores over time and check whether any significant deviations coincide with these events.
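A simple way to quantify a "significant deviation" around an event is to compare the scores just before and just after it. The sketch below is only one possible approach and is not taken from the project: the window length, the threshold of 3 standard deviations, the function name, and the sample scores are all assumptions.

```python
import math
from statistics import mean, stdev


def event_deviation(scores, event_index, window=4, threshold=3.0):
    """Compare mean scores before and after an event.

    scores:      vulnerability scores ordered in time for one community.
    event_index: position of the first observation after the event.
    Returns the shift in means, the shift in units of pre-event standard
    deviation, and whether it exceeds the (assumed) threshold.
    """
    if window < 2:
        raise ValueError("window must be at least 2")
    if event_index < window or event_index + window > len(scores):
        raise ValueError("not enough observations around the event")
    before = scores[event_index - window:event_index]
    after = scores[event_index:event_index + window]
    shift = mean(after) - mean(before)
    spread = stdev(before)
    if spread == 0:
        z = 0.0 if shift == 0 else math.copysign(math.inf, shift)
    else:
        z = shift / spread
    return {"shift": shift, "z": z, "flagged": abs(z) >= threshold}


# Made-up score histories with an event at position 4.
jump = event_deviation([50, 52, 50, 52, 60, 62, 60, 62], event_index=4)
flat = event_deviation([50, 52, 50, 52, 51, 51, 51, 51], event_index=4)
```

A flagged deviation only shows that the score moved around the time of the event; attributing the change to the event would still require the kind of case-by-case review the study describes.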


4.0 Impact, issues and risks


4.1 What impact do you feel your project could have?


There is growing evidence showing that Black and Latinx communities have been disproportionately affected by the pandemic, adding COVID-19 to a long list of health disparities suffered more acutely by minorities. In the United States, President Joe Biden and his administration have made equitable pandemic recovery a top priority. CVI can help governments and nonprofits with this task by ensuring that aid is going where it would be most effective. We can help navigate the long road to recovery by giving a clear and detailed picture of a community’s needs, allowing stakeholders to make smarter, more effective data-driven decisions.


4.2 What do you think would make your project a success?


We measure CVI’s success mainly by two criteria:

  1. The accuracy of the models
  2. User adoption


Having only one is not sufficient. For this project to be considered successful, both need to be attained. After all, basing decisions on the results of an inaccurate model would be no better (and potentially worse) than using nothing at all. By the same logic, even the most accurate model will be ineffective if no one is willing to use it.


4.3 Please list the known issues, potential risks, grey-areas, etc in your project


Reluctance to Adopt

We recognize that, for some industries, incorporating data science and machine learning into decision-making represents a major paradigm shift. As such, there is a risk that users will be hesitant to use CVI. We can mitigate this risk by providing users with convincing demos that showcase what CVI can help them achieve, and by delivering a clear, easy-to-use UX.


Hidden Variables, Insufficient Data

Model accuracy depends heavily on having enough data to learn patterns. Likewise, any variable missing from the dataset is information the models cannot use, and if that variable matters to the process being modeled, performance will suffer. We rely mostly on open-access data sources to build our models, and we enforce strict guidelines for including variables: data must be race- and age-adjusted, cover the entire adult population, and be reported at the county level. As such, access to the data necessary to build accurate and reliable models represents another potential risk.


5.0 Originality


5.1 What other projects on JOGL are like yours? Search for them and Link them!


As of March 10, 2021, there are no projects for performing needs assessments. EPI-CENTER is similar in that it is focused on building epidemiological forecasting models.


5.2 Is this an innovative project? What makes this project different if it’s unique on JOGL?


To our knowledge, there are no other projects listed on JOGL for performing community needs assessments. What makes our project unique compared to others is that CVI is meant to help communities recover from the consequences of the pandemic by quantifying their needs across multiple axes and identifying areas of vulnerability.


5.3 Is there already an open-source version of this project?


We have not come across another open-source project like CVI.


6.0 Team experience


6.1 Please cite your team members and their roles in the project. 


Savannah Thais - Project Founder and Machine Learning Team Lead

Shaine Leibowitz - Machine Learning Co-Lead

Stephanie Santo - Data Lead

Diep Hoang - Dashboard Lead

Alexandra Passarelli - Literature Review

Alex Rios - Literature Review

Lindsey Fiedler - Funding Lead

Annina Christensen - Design Strategist

Sahil Saxena - Project Manager


7.0 Funding and Costs


7.1 How is your project being funded so far?


We are currently applying for other grants and setting up crowdfunding.


7.2 How much funding do you need and how do you plan to use that funding?


We have included an itemized budget in the Documents section. Although we recognize that this award would not cover all of our expenses, it would be of immeasurable value in helping us grow the Community Vulnerability Index.


Edit: It seems the budget document we uploaded is not showing to other users. We include an image of the budget below.


Additional information
  • Short Name: #CVI
  • Created on: March 3, 2021
  • Last update: July 12, 2021
  • Looking for collaborators: ✅
  • Grant information: Received $1,600.00€ from the OpenCOVID19 Grant Round 5
Keywords
Data science
Public health
machine learning
Artificial intelligence
Front-end development
SDG 3: Good Health and Well-being
SDG 10: Reduced Inequalities