Grantee Research Project Results
2018 Progress Report: Untapping the Crowd: Consumer Detection and Control of Lead in Drinking Water
EPA Grant Number: CR839375
Title: Untapping the Crowd: Consumer Detection and Control of Lead in Drinking Water
Investigators: Edwards, Marc; Katner, Adrienne; Cooper, Caren; Berglund, Emily; Pieper, Kelsey
Current Investigators: Edwards, Marc; Berglund, Emily; Pieper, Kelsey; Katner, Adrienne; Cooper, Caren; Roy, Siddhartha; Kriss, Rebecca; Scherer, Michelle
Institution: Virginia Tech, Texas A&M University, North Carolina State University, University of Iowa, Louisiana State University
Current Institution: Virginia Tech, University of Iowa, Louisiana State University, North Carolina State University, Texas A&M University
EPA Project Officer: Hahn, Intaek
Project Period: April 1, 2018 through March 31, 2021 (Extended to March 31, 2023)
Project Period Covered by this Report: April 1, 2018 through March 31, 2019
Project Amount: $1,981,500
RFA: National Priorities: Transdisciplinary Research into Detecting and Controlling Lead in Drinking Water (2017)
Research Category: Drinking Water, Water
Objective:
We are developing a consumer-centric framework to detect and control water lead risks by
achieving the following objectives: 1) inventory infrastructure and analytical data, 2) predict risks
with quantitative models, 3) evaluate models through citizen science, 4) intervene with site-tailored
strategies to avoid water lead exposure, and 5) scale deliverables to a national level.
OBJ. 1. INVENTORY.
We are collecting and reviewing existing data and literature (Obj. O1.1), collecting data from
communities (Obj. O1.2), and crowdsourcing knowledge of lead in water levels, lead bearing
plumbing, patterns of water use and treatment systems, and perceptions of water quality (Obj.
O1.3).
OBJ. 2. PREDICT.
We are developing a Bayesian Belief Network (BBN) model of household-level citizen risk
considering the three key variables controlling lead in drinking water (Obj. 2.1); will expand the
household-level BBN to a GIS-enabled BBN model to simulate risk for neighborhoods and
communities (Obj. 2.2); will conduct scenario analysis to explore how a range of other variables
such as chlorine residuals, treatment options, source water composition, and demographics affect
community-level risk (Obj. 2.3); and are integrating the models and inventory data into a scalable
information technology (IT) platform (Obj. 2.4).
OBJ. 3. EVALUATE.
We are evaluating the accuracy of existing low-cost citizen science lead in water testing
technologies to quantify the range of particulate lead forms that can be encountered in practice,
and compare results to EPA standard method sample preparation and analysis (Obj. 3.1); and will
utilize appropriate testing technologies based on the prevalence and type of lead particulates
encountered in a community to evaluate BBN models (Obj. 3.2).
OBJ. 4. INTERVENE.
We are evaluating environmental (Obj. 4.1) and educational (Obj. 4.2) interventions within the
consumer-centric framework.
OBJ. 5. SCALE.
We will beta test work products (frameworks and models) in other states concerned with lead in
water (Obj. 5.1) and are engaging with stakeholders in order to enhance the capabilities of the
consumer-centric framework (Obj. 5.2).
Progress Summary:
Progress Summary/Accomplishments
OBJ. 1. INVENTORY.
O1.2: Collect data from communities
Well Water Testing in North Carolina
In February 2019, Virginia Tech and the UNC Institute for the Environment partnered with the
Iredell Health Department (North Carolina) to provide free well water testing to private well users.
A total of 786 well water sampling kits were returned for analysis (out of 931 that were distributed;
84% return rate). The goal was to measure lead in drinking water from private wells where well
users are solely responsible for detecting and controlling water lead risks, and to examine well
water quality and recovery after Hurricanes Florence and Michael. Of the 786 homes sampled,
7.5% of first draw samples exceeded the US EPA lead action level of 15 μg/L, which applies to
regulated public water supplies. Elevated levels of copper above the US EPA copper action level
of 1.3 mg/L were also observed in 10.8% of first draw samples. Although the US EPA Lead and
Copper Rule does not apply to private wells, some of the higher levels of both lead and copper are
concerning. Information about contaminants from household plumbing corrosion was assessed by
letting the water sit stagnant for at least six hours, and then collecting the ‘first draw’ sample from
the kitchen tap. After flushing water from the pipes for 5 minutes, less than 1% of homes had lead
and copper concentrations above the action levels—this further confirms these contaminants were
from plumbing. Consumers were provided information about use of low cost faucet and lead filters
at a community meeting attended by 250 participants in our survey.
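The first-draw versus flushed comparison described above can be sketched in a few lines. This is a minimal illustration, not the project's analysis: the function name and the sample concentrations are hypothetical, and only the 15 μg/L lead action level comes from the text. A large drop in the exceedance share after flushing is what points to household plumbing as the source.

```python
LEAD_ACTION_LEVEL_UG_L = 15.0  # US EPA lead action level cited in the report


def share_exceeding(concentrations, action_level):
    """Return the fraction of samples strictly above the action level."""
    if not concentrations:
        raise ValueError("no samples")
    above = sum(1 for c in concentrations if c > action_level)
    return above / len(concentrations)


# Hypothetical lead results (ug/L) for the same five homes, not project data.
first_draw = [2.0, 18.5, 0.5, 40.0, 7.2]
flushed = [0.5, 1.1, 0.2, 3.0, 0.9]

first_share = share_exceeding(first_draw, LEAD_ACTION_LEVEL_UG_L)
flushed_share = share_exceeding(flushed, LEAD_ACTION_LEVEL_UG_L)
print(f"first draw: {first_share:.0%}, flushed: {flushed_share:.0%}")
```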
In March and April 2019, Virginia Tech and the UNC Institute for the Environment
provided free well water testing to private well users in Chatham County. This testing is
currently underway, and 235 well users have agreed to participate in the program so far.
Well Water Testing in New York
Increasing chloride concentrations in community water supplies across the United States have
damaged premise plumbing, triggered sudden water lead contamination events, and even exceeded
aesthetic standards for chloride. In the Northeast and Midwest, this rise in chloride levels has been
attributed to an increased use of road salt, but rising chloride might also come from other sources
including saltwater intrusion, hydrofracking, and septic effluent. To identify appropriate
mitigation and management strategies, a method to identify the source of chloride in drinking water
is needed. Studies have applied fingerprinting techniques to trace chloride to road salt in surface
waters and groundwaters, but these techniques have not yet been applied to drinking water systems.
The objectives of this study are to: 1) summarize fingerprinting techniques to identify chloride
sourced from road salt, 2) apply these techniques to determine whether they work in drinking water
supplied by private wells, 3) compare these results to an analysis of spatial and temporal chloride
trends, and 4) examine the relationship between chloride and the metals in drinking water
infrastructure that are vulnerable to corrosion.
Data for this study were collected during a citizen science sampling campaign. Between
January and July 2018, residents in Orleans, New York collected a weekly first draw water sample
from their kitchen tap. A total of 240 first draw samples were collected and analyzed via
ICP-MS. More than a third of the samples exceeded the 250 mg/L aesthetic standard for chloride
throughout the study. Seasonal variations in chloride levels were evaluated and a median chloride
level of 180 mg/L (maximum of 808 mg/L) was observed during the winter months while a median
of 191 mg/L (maximum of 832 mg/L) was observed during the summer months. Although not
statistically significant (one-tailed t-test, p=0.23), chloride levels decreased by an average of 5.4
mg/L from winter to summer months, a trend that is also consistent with road salt as a source. To
further distinguish whether road salt is the source of chloride, 12 different fingerprinting
techniques are being evaluated. Temporal relationships between lead and copper levels and
chloride will be explored. Developing a technique to identify road salt contamination in drinking
water is critical, because road salt levels are increasing nationally and becoming an emerging
problem. We are also documenting a case where rising chloride triggered a sudden increase in
water lead levels, without any other change in source water treatment.
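The seasonal comparison above can be sketched with the Python standard library. The concentrations below are hypothetical, not the Orleans data, and the report does not say whether its t-test was paired or unpaired, so the Welch (unpaired, unequal variance) statistic here is an assumption. Turning the statistic into a p-value needs a t-distribution CDF, which the standard library does not provide.

```python
import math
import statistics

CHLORIDE_AESTHETIC_STANDARD_MG_L = 250.0  # aesthetic standard cited in the report


def summarize(values):
    """Median, maximum, and share of samples above the aesthetic standard."""
    above = sum(1 for v in values if v > CHLORIDE_AESTHETIC_STANDARD_MG_L)
    return {
        "median": statistics.median(values),
        "max": max(values),
        "share_above": above / len(values),
    }


def welch_t(a, b):
    """Welch's t statistic for mean(a) - mean(b), allowing unequal variances."""
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    standard_error = math.sqrt(var_a / len(a) + var_b / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / standard_error


# Hypothetical chloride concentrations (mg/L), for illustration only.
winter = [120.0, 180.0, 260.0, 300.0, 150.0]
summer = [110.0, 190.0, 240.0, 310.0, 140.0]

winter_summary = summarize(winter)
t_stat = welch_t(winter, summer)  # positive when the winter mean is higher
```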
Municipal System Testing
In June 2018, our team conducted sequential sampling in Berwyn (3 single-family homes) and
Cicero (3 single-family homes and 1 church) alongside the environmental justice
organization Ixchel. We collected several liters of water per house after a 6+ hour stagnation period.
Specifically, we collected ten 1-L samples at a low flow rate (~2 L/min) and ten 1-L samples at full
flow rate. When possible, we also ascertained the service line material type with permission from
the homeowner. The sampling showed all but one Berwyn home with persistently high levels of
lead above 15 ppb over several minutes, and most of the lead was particulate. The highest lead
detected was 141 ppb.
In August 2018, we conducted a citywide sampling using the Flint 3-bottle protocol (first
draw, “45-second flush” second draw and “2-min flush” third draw). We sampled 83 homes and 2
churches, with 25 Berwyn homes and 51 Cicero homes. The 90th percentile first draw lead for
Berwyn and Cicero was 10.2 ppb and 12.4 ppb, respectively. The 90th percentile second draw lead
was about the same in Berwyn (11.1 ppb) but increased slightly in Cicero (15.5 ppb).
According to Illinois EPA, 85% of Berwyn homes and 98% of Cicero homes have lead service
lines.
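A 90th percentile like the ones reported above can be computed as in the sketch below. It assumes the nearest-rank convention (the value at rank ceil(0.9 × n) in the sorted results); regulatory calculations may round or interpolate differently, and the sample values are hypothetical.

```python
def percentile_90_nearest_rank(values):
    """90th percentile by the nearest-rank rule: value at rank ceil(0.9 * n)."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    n = len(ordered)
    rank = (9 * n + 9) // 10  # integer form of ceil(0.9 * n), 1-based
    return ordered[rank - 1]


# Hypothetical first-draw lead results (ppb) for ten homes.
lead_ppb = [1.0, 2.5, 0.8, 12.0, 4.1, 16.3, 3.3, 7.9, 5.0, 9.4]
p90 = percentile_90_nearest_rank(lead_ppb)
```

With ten samples the rule picks the ninth-lowest result, so a single very high home does not by itself push the 90th percentile over 15 ppb.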
On the basis of our analysis, and documentation of EPA LCR sampling deficiencies,
Illinois EPA issued violations to the City of Cicero. In an early 2019 meeting with Ixchel, U.S.
EPA Region V and Illinois EPA, plans by the City of Chicago to optimize corrosion control were
discussed. This could potentially reduce lead levels throughout the City of Chicago and for its
100+ suburban wholesale customers.
O1.3: Crowdsource knowledge of lead in water levels, lead bearing plumbing, patterns of water
use and treatment systems, and perceptions of water quality
Website Development
A website (http://crowdthetap.org/ or https://scistarter.org/form/crowd-the-tap) was developed
where citizen scientists can input data about their tap water pipes, connect with resources about
finding their pipes, and communicate in a forum.
Postcard Development
Crowd the Tap pipe identification kits were created, which contain postcards that have descriptions
of where to find tap water pipes along with materials used to test tap water pipes (i.e. penny and
magnet). These kits will be used by citizen scientists to find and test their tap water pipes, and then
report the results to an online website and database. An estimated 200 pipe identification kits
were distributed to Citizen Science Festival visitors.
OBJ. 2. PREDICT.
O2.1: Develop a BBN model of household-level citizen risk considering the three key variables controlling lead in drinking water
Methodology: Bayesian Belief Network
The BBN is at the core of our machine learning analytics. It is a probabilistic directed acyclic
graphical model, which represents the dependencies among the subset of attributes via directed
arcs. A joint probability table is associated with each arc to explain the probabilistic relationship
of the connected attributes. A BBN model is constructed from two components: 1) Directed
Acyclic Graph (DAG), which is the structure of BBN and shows the topology of network, and 2)
Conditional Probability Table (CPT), which is the parameter set of a BBN and is learned from a
specific DAG. In this research, we seek a BBN model that fits water quality data to classify positive
lead samples with the highest accuracy. BBN is designed to work with discrete variables, and the
raw data, which included some continuous variables, were converted into discrete variables using
discretization approaches. We compared “bins” for variables elucidated through expert knowledge
and data-driven approaches.
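The two kinds of discretization mentioned above can be sketched as follows. The pH cut points, labels, and function names are hypothetical and only illustrate the idea: fixed bins chosen by an expert versus cut points derived from the data (here, equal-frequency binning, which is one of several data-driven options and not necessarily the one the team used).

```python
from bisect import bisect_right


def discretize(value, edges, labels):
    """Map a continuous value to a bin label.

    edges are ascending cut points and labels has one more entry than edges.
    A value equal to an edge falls into the higher bin.
    """
    if len(labels) != len(edges) + 1:
        raise ValueError("need one more label than edges")
    return labels[bisect_right(edges, value)]


def equal_frequency_edges(values, n_bins):
    """Data-driven cut points that put roughly equal counts in each bin."""
    ordered = sorted(values)
    return [ordered[len(ordered) * i // n_bins] for i in range(1, n_bins)]


# Hypothetical expert-knowledge bins for pH; the cut points are illustrative only.
PH_EDGES = [6.5, 8.5]
PH_LABELS = ["low", "neutral", "high"]

ph_values = [5.9, 6.4, 6.8, 7.1, 7.4, 7.9, 8.2, 8.8]
expert_bins = [discretize(v, PH_EDGES, PH_LABELS) for v in ph_values]
data_edges = equal_frequency_edges(ph_values, 2)
```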
Finding DAG Structures: Different classifiers were explored to construct the structure of the BBN,
as follows.
Naive-Bayes is the simplest classifier. It is constructed from one parent node which is the
class attribute and all other attributes are considered as the children of the class node. It
assumes that the children are mutually independent, and no other connections between
nodes are allowed.
Tree Augmented Naive-Bayes is similar to the Naive-Bayes but it relaxes the independence
assumption. Attributes may be dependent, and more complex tree structures are formed, in
which children nodes are connected via arcs.
The General Bayesian Network does not force the classification attribute (e.g., presence of
lead) to be the parent of all other attributes. Instead, arcs are constructed among all potential
pairs of nodes to identify relationships among attributes.
An Expert-Knowledge Approach builds a classifier based on insight that experts in the
domain provide about relationships among variables.
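To make the simplest of these structures concrete, the sketch below implements a Naive-Bayes classifier over discrete attributes. It is an illustration only, not the project's model: the class name, the attributes, the training rows, and the use of add-one (Laplace) smoothing are all assumptions.

```python
import math
from collections import Counter, defaultdict


class DiscreteNaiveBayes:
    """Naive-Bayes over discrete attributes with add-one (Laplace) smoothing."""

    def fit(self, rows, labels):
        self.class_counts = Counter(labels)
        # value_counts[(attribute index, class)][value] -> count
        self.value_counts = defaultdict(Counter)
        # distinct values seen per attribute, used in the smoothing denominator
        self.attr_values = [set() for _ in rows[0]]
        for row, label in zip(rows, labels):
            for i, value in enumerate(row):
                self.value_counts[(i, label)][value] += 1
                self.attr_values[i].add(value)
        return self

    def log_posteriors(self, row):
        """Unnormalized log P(class | row) for each class."""
        total = sum(self.class_counts.values())
        scores = {}
        for label, count in self.class_counts.items():
            score = math.log(count / total)  # class prior
            for i, value in enumerate(row):
                seen = self.value_counts[(i, label)][value]
                score += math.log((seen + 1) / (count + len(self.attr_values[i])))
            scores[label] = score
        return scores

    def predict(self, row):
        scores = self.log_posteriors(row)
        return max(scores, key=scores.get)


# Hypothetical training rows: (plumbing, fixture staining, pH bin) -> lead present?
rows = [
    ("copper", "blue-green", "low"),
    ("copper", "none", "neutral"),
    ("plastic", "none", "neutral"),
    ("lead-solder", "blue-green", "low"),
    ("plastic", "none", "high"),
    ("copper", "blue-green", "low"),
]
labels = [True, False, False, True, False, True]
model = DiscreteNaiveBayes().fit(rows, labels)
prediction = model.predict(("copper", "blue-green", "low"))
```

The class node is the single parent of every attribute, which is why each attribute contributes an independent factor to the score. The Tree Augmented and General Bayesian Network structures relax exactly that assumption.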
Methodology: Ensemble of Decision Trees
A decision tree is an expanding structure of nodes with the application of binary splits. Each node
represents a predictor variable, and splits are formed by using an inequality condition. The
performance of a split is evaluated through the Gini index, which measures the diversity of the
data until a terminal node is reached. An Ensemble of Decision Trees (EDT) can be built using
boosting or bootstrap aggregation (bagging) techniques. In this research, the EDT was built using
a boosting algorithm called Random Undersampling Boosting (RUS). RUS is an algorithm
designed to cope with the problem of imbalanced data, where one class size outnumbers the rest
of classes.
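Two of the ingredients described above, the Gini index of a split and random undersampling of the majority class, can be sketched as below. This is not an implementation of the boosting algorithm itself, which repeats the undersampling inside each boosting round; the function names and the data are hypothetical.

```python
import random
from collections import Counter


def gini_impurity(labels):
    """Gini impurity: 1 minus the sum of squared class proportions (0 is pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def split_impurity(values, labels, threshold):
    """Size-weighted Gini impurity of the binary split value < threshold."""
    left = [y for x, y in zip(values, labels) if x < threshold]
    right = [y for x, y in zip(values, labels) if x >= threshold]
    n = len(labels)
    return len(left) / n * gini_impurity(left) + len(right) / n * gini_impurity(right)


def random_undersample(rows, labels, rng):
    """Randomly drop samples from larger classes until all match the smallest."""
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = min(len(group) for group in by_class.values())
    balanced = []
    for label, group in by_class.items():
        balanced.extend((row, label) for row in rng.sample(group, target))
    return balanced


# Hypothetical pH readings and lead flags (2 positives out of 8 samples).
ph = [5.8, 6.1, 6.4, 7.0, 7.3, 7.8, 8.0, 8.1]
lead = [True, True, False, False, False, False, False, False]

parent = gini_impurity(lead)                # impurity before splitting
after_split = split_impurity(ph, lead, 6.3)  # impurity after splitting at pH 6.3
balanced = random_undersample(ph, lead, random.Random(0))
```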
The number of trees, the depth of each tree, and the learning rate are among the required
settings to define an EDT. The number of trees expands the ensemble horizontally and the higher
this number is, the more computational time is required. Similarly, the depth of each tree is
controlled by the maximum number of splits and high values add complexity to the model. Finally,
the learning rate refers to the step size of each iteration during the learning phase. Due to the
infinite number of settings combinations, a Bayesian optimization process was performed to find
the most suitable combination of settings and maximize the accuracy of the model.
Application of Methodologies
Dataset: Approximately 1,800 drinking water tap samples were collected from private
water systems (e.g., wells, springs) by the Virginia Household Water Quality Program (VAHWQP) between 2012 and 2014 in Virginia. Aesthetic
properties and household characteristics were collected through a survey. We worked with domain
experts on the research team to narrow the attributes that should be used in the models to a set of
42 potential features. Data types are listed as follows.
Observations, including water taste, color, odor, and staining on fixtures
Well type and depth
Water treatment in the home and plumbing type
Location and physiographic province
Water quality data, collected in a laboratory, including lead concentration data
Data Cleaning: Some samples reported incomplete data. For incomplete data points, we removed
attributes from the dataset if more than 5% of the samples did not report a value for that attribute.
Otherwise, we removed incomplete samples from the dataset.
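The two-step cleaning rule can be sketched as follows, assuming the samples are held as a list of dictionaries with None marking a missing value. The function name, attribute names, and data are hypothetical; only the 5% threshold comes from the text.

```python
def clean(rows, attributes, max_missing_share=0.05):
    """Drop sparse attributes first, then drop samples that are still incomplete.

    rows is a list of dicts; a missing value is represented as None.
    Returns (kept_attributes, complete_rows).
    """
    n = len(rows)
    kept = []
    for attr in attributes:
        missing = sum(1 for row in rows if row.get(attr) is None)
        if missing / n <= max_missing_share:  # keep unless MORE than 5% missing
            kept.append(attr)
    complete = [
        {attr: row[attr] for attr in kept}
        for row in rows
        if all(row.get(attr) is not None for attr in kept)
    ]
    return kept, complete


# Hypothetical data: 20 samples, well_depth missing in 25%, ph missing in 5%.
rows = [{"ph": 7.0, "well_depth": 100.0, "odor": "none"} for _ in range(20)]
rows[0]["ph"] = None
for i in range(5):
    rows[i]["well_depth"] = None

kept, complete = clean(rows, ["ph", "well_depth", "odor"])
```

Removing sparse attributes before incomplete samples matters: had samples been dropped first, the five rows lacking a well depth would have been lost even though that attribute is discarded anyway.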
Data Division: 80% of the dataset was used to train models and 20% of the dataset was used to
test models. A 10-fold cross-validation approach was used in application of the BBN and EDT
approaches.
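A minimal sketch of this data division follows. The report does not say how the 80/20 split and the 10-fold cross-validation were combined, so running the folds inside the training portion is an assumption, as are the function names, the seed, and the round figure of 1,800 samples.

```python
import random


def train_test_split(indices, test_share, rng):
    """Shuffle indices and hold out test_share of them for testing."""
    shuffled = list(indices)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_share)
    return shuffled[n_test:], shuffled[:n_test]


def k_fold(indices, k):
    """Yield (train, validation) index lists for k roughly equal folds."""
    folds = [indices[i::k] for i in range(k)]
    for i, validation in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


rng = random.Random(42)  # fixed seed so the sketch is repeatable
train_idx, test_idx = train_test_split(range(1800), 0.20, rng)
folds = list(k_fold(train_idx, 10))
```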
The performance of the best models found using the BBN and EDT approaches is
reported in Table 1. The precision, recall, and accuracy of the two approaches are reported. The
models classify lead as present (value = TRUE) at a concentration at or above 15 ppb, and as absent
(value = FALSE) otherwise. Precision is the ratio of true positives to the total number of samples
predicted as true, which is the sum of true positives and false positives. Recall is the ratio of true
positives to the total number of true samples. Accuracy is the ratio of accurately predicted samples
to the total number of samples.
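The three metrics defined above can be computed from paired actual and predicted labels, as in this sketch. The concentrations and predictions are hypothetical; only the 15 ppb threshold and the "at or above" rule come from the text.

```python
def lead_present(concentration_ppb, threshold_ppb=15.0):
    """Label a sample TRUE when lead is at or above the threshold."""
    return concentration_ppb >= threshold_ppb


def classification_metrics(actual, predicted):
    """Precision, recall, and accuracy for boolean labels."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a and p)          # true positives
    fp = sum(1 for a, p in pairs if not a and p)      # false positives
    fn = sum(1 for a, p in pairs if a and not p)      # false negatives
    tn = sum(1 for a, p in pairs if not a and not p)  # true negatives
    return {
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "accuracy": (tp + tn) / len(pairs),
    }


# Hypothetical measured lead (ppb) and model predictions for ten samples.
measured_ppb = [2.0, 15.0, 30.0, 4.0, 22.0, 1.0, 8.0, 16.0, 3.0, 0.5]
actual = [lead_present(c) for c in measured_ppb]
predicted = [False, True, True, True, False, False, False, True, False, False]

metrics = classification_metrics(actual, predicted)
```

A missed positive (a false negative) lowers recall but not precision, which is why recall is the metric that tracks how often a home with elevated lead goes undetected.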
In this application, we focus on recall, because the models should accurately predict the
presence of lead. The datasets are imbalanced, with only about 20% of the data reporting true
(concentration greater than or equal to 15 ppb) and focusing on recall is an appropriate approach
to address imbalanced datasets. The EDT dominates the BBN across the three metrics.
Table 1. Performance of BBN and EDT Approaches
| Modeling Approach | Precision | Recall | Accuracy |
|---|---|---|---|
| Bayesian Belief Networks | 29% | 59% | 61% |
| Ensemble of Decision Trees | 72% | 77% | 84% |
O2.4: Develop a scalable information technology (IT) platform
Planning and development of our team’s main website, including functionality needs and security
requirements, has been completed. Documents presenting website layout and preliminary content,
metadata templates for documents and data, and surveys have been completed. A preliminary
website has been developed and IT contractors are working to train the Virginia Tech and LSU
Health teams on website content editing.
OBJ. 3. EVALUATE.
O3.1: Evaluate low-cost citizen science lead in water testing technologies
The utility and accuracy of off-the-shelf lead in water test kits as screening tools for lead in water
are being evaluated. Specifically, we are determining if these test kits can (1) detect high soluble
lead in water, (2) accurately measure 15 ppb soluble lead in water, (3) detect particulate lead in
water, and (4) are subject to artifacts from co-contaminants such as iron.
Test kits were selected based on an Amazon search for “water lead test kit” performed on
May 9, 2018. This search yielded 347 results, 122 of which were products related to lead testing.
Within these 122 results, there were 34 unique products with costs from $0.10 to $39.95 per test,
including: 1 colorimetric vial test, 1 spectrophotometric test, 8 mail away tests, 8 positive/negative
test strip type tests, and 16 color change test strip type tests. This effort focuses primarily on at-home
kits and, therefore, excluded the spectrophotometric test and mail away tests from testing.
Four types of at-home test kits were considered for this experiment including a colorimetric vial
test, color change test strip type tests, a positive/negative color change test, and positive/negative
line tests analogous to at-home pregnancy tests.
A 4-tiered sampling approach is being used to determine test kit effectiveness. Tiers 1 and
2 are aimed at testing the overall effectiveness of tests using low and high drinking water
concentrations of lead. Tiers 3 and 4 investigate the potential for false negative and false positive
results using various types of particulate lead and potential test interferences.
Four test kits accurately detected the 150 ppb dissolved lead in Tier 1 including three
positive/negative line tests and one color change vial test. All other test kits failed to measure lead
in the drinking water, either by detecting no lead or measuring more lead than was present.
Therefore, the three positive/negative line tests and the color change vial test will proceed to Tier
2a (low dissolved), while other kits will proceed to Tier 2b (extreme particulate).
OBJ. 4. INTERVENE.
O4.1: Evaluate environmental interventions
Environmental interventions are being evaluated through reviews of the literature, technical
documents, and case studies. Interventions are assessed in terms of water lead level (WLL) reduction; short- and
long-term costs and sustainability; barriers and impediments to adoption and use; conditions under
which interventions are expected to be effective and ineffective; limitations and strengths; and
short- and long-term expected environmental impacts and consequences. Interventions evaluated
include: 1) low-cost point of use (POU) systems; 2) corrosion control treatment (including those
in homes such as limestone contactors); 3) flushing; 4) full and partial service line replacements;
and 5) provision of other water sources like bottled water and water buffalos. Prior research on the
effectiveness of flushing has been published through a grant funded by the Louisiana State Board
of Regents. Data collected to assess water lead levels prior to and after partial lead service
line replacements are being prepared for publication. NSF-certified filters are currently being evaluated at
Virginia Tech, and the most cost-effective filters will be field-tested this summer in
different towns and under varying water quality conditions. A journal paper has been peer-reviewed
and accepted for publication (see Publications section).
O4.2: Evaluate educational interventions
A project-based educational curriculum for elementary and grade schools, which was developed
through a grant from the US EPA Environmental Education Program, is currently being evaluated
and updated with funds from this US EPA grant. To date this curriculum has been tested in two
inner-city high schools and an inner-city summer program, and is currently being evaluated by an
inner-city elementary school. Curriculum feedback has been obtained from teachers and students
from each school and results are enabling lesson plan refinement. These lesson plans, which
include project-based and multi-disciplinary exercises focused on addressing lead in drinking
water issues, are expected to be available to the public at the end of Summer 2019.
OBJ. 5. SCALE.
O5.2: Engage with stakeholders to enhance the capabilities of the consumer-centric framework
Opportunities to participate in educational programs and seek technical assistance from experts
are necessary to ensure well stewardship, which legally remains the sole responsibility of well
users. In states with robust Cooperative Extension Service support, Extension has proven to be one
of the most successful platforms for engaging, empowering, and educating private well users.
Cooperative Extension Service personnel are trusted sources of science-based information in their
communities, including guidance on identifying resources for protecting family health and
groundwater quality. However, there is no established network for these Extension platforms to
engage and share knowledge and resources.
In October 2018, Extension programs from 12 states participated in a 2-day meeting at NC
State. The states included: Arizona, Florida, Georgia, Illinois, Maryland, Mississippi, Missouri,
Montana, North Carolina, Pennsylvania, Texas, and Virginia. Participants from Maine, Rhode
Island, and Vermont were not able to attend. The purpose of this meeting was to (1) understand
function and capacity of existing/successful programs; (2) identify barriers to and conditions for
success/best practices for new programs; (3) develop resources and guidelines to address common
well water problems; and (4) design metrics to evaluate Extension programming. Results of this
meeting are currently being developed into a white paper.
Future Activities:
Future Activities Plans
Objective 1.1: Collect and review existing data and literature
We have learned that the LSLR Collaborative is currently conducting a review of all service line
data sources. We will work with them to link that review with our team’s database to gain a more comprehensive
understanding. Other data sources such as USGS groundwater data are being evaluated for
integration in the BBN as well as for better characterizing corrosion in small systems with limited
treatment.
Objective 1.3
In Year 2, we will collect, analyze, and publish mental models study data. We will also examine
links between demographics and pipe materials and accessibility of mitigation options using data
from citizen scientists collected via Crowd the Tap’s page. To launch Crowd the Tap across all
counties in North Carolina, we will build a community and support network.
Objectives 2.1-2.3
Future activities will explore the BBN and EDT approaches with application for new datasets,
including additional data from private wells collected in Virginia and North Carolina and data from
municipal systems, including New Orleans, LA and Flint, MI. Model capabilities will be explored
relative to private and municipal systems. We will also explore alternative sets of predictor
variables, or attributes, and how they affect the classification of risk of lead. We will explore how
models perform for survey data alone and for increasing levels of information provided through
pH, hardness, and commonly available water quality parameters. Models applied for municipal
systems will be further extended to predict community levels of lead exposure.
Objective 2.4: Develop a scalable information technology (IT) platform
Website content is expected to be complete by the Fall of 2019, which will include BBN
integration. Work to develop a database to house water measurement data and to construct the
back end and front end of the model interface is in progress. In Spring 2020, the BBN
testing and website evaluation will begin. Website editing to respond to public comments is
expected to be completed by the end of Fall 2020. The public version of the website with full BBN
model functionality, citizen-centric content and functionality, is expected to roll out for public
Beta-testing by the Fall of 2020. The final website is scheduled to be completed by Summer of
2021.
Objective 3.1: Evaluate low-cost citizen science lead in water testing technologies
Particulates from communities in the US struggling with lead in water problems are being
harvested to complete Tier 3. Afterward, Tier 4 will evaluate test kit interferences such as high iron in
water. Once laboratory testing has been completed in Spring 2019, field testing of the top at-home
test kit will begin. During Summer 2019, water samples from Robeson and New Hanover (North
Carolina) will be analyzed with the at-home test kit, for dissolved lead via ICP-MS, and for total lead via ICP-MS.
Objective 3.2: Utilize testing technologies to evaluate BBN models
Validation of the BBN model with field data will begin once the internal testing of BBN model
with the partners’ existing database is completed. After the household-level citizen risk model has
been constructed, we will recruit participants to test the model by entering their relevant data and
information, running the model, and documenting their risk output. Volunteers will also be asked
to provide feedback on the user interface, resources provided on the website, risk messaging, ease
of data input, and model runs, and to identify how the system can be made into a more valuable
resource for impacted communities. This information will be used to improve the user interface
and content.
Objective 4.1-4.2: Evaluate environmental and educational interventions
Documents presenting population-specific educational needs, intervention strategies, and
consumer guidance materials for addressing knowledge gaps, motivating exposure reduction
behaviors, and enabling informed decision-making are being developed. These documents are
anticipated to be one of the final outputs of the project. Educational interventions to be evaluated
include those that are being conducted in response to LCR requirements in St. Joseph and Flint as
well as voluntary measures in New Orleans and in Virginia. Researchers will evaluate educational
campaign strategies being used by the utility, state, or community groups based on their
availability, literacy level, appropriateness of risk reduction messages, and impact. Information
will also be sought from communities using surveys, focus groups, and semi-structured interviews
to enable characterization of factors such as educational needs and information gaps (e.g.,
knowledge of information request procedures, POU filter selection, understanding of
responsibilities of both consumers and utilities).
Objective 5.1: Beta test work products (frameworks and models) in other states
Community partners in Iowa and Texas will apply our framework and work products, recruiting
consumers to participate in crowdsourced inventories, water testing, and use of the model,
providing a feedback loop to those efforts. This beta testing will also identify barriers to community outreach,
motivation, and product dissemination when the methods are applied beyond those involved on
the core research team. After further model and website optimization, we will launch a national
citizen science and community crowdsourcing campaign. This will occur in Year 3.
Objective 5.2: Engage with stakeholders in order to enhance the capabilities of the consumer-centric
framework.
LSUHSC will conduct key stakeholder interviews with public advocates involved in the water
struggle of St. Joseph, LA and Enterprise, LA to document the challenges, barriers, and needs of
the community during their struggle to get their voices heard and their water problems addressed.
The results from the Extension well water workshop will be developed into a white paper
and a follow-up workshop will be held in Year 3 to evaluate progress toward individual and group
goals.
Journal Articles on this Report: 1 Displayed
Roy S, Mosteller K, Mosteller M, Webber K, Webber V, Webber S, Reid L, Walter L, Edwards MA. Citizen science chlorine surveillance during the Flint, Michigan federal water emergency. Water Research 2021:117304.
CR839375 (2018) CR839375 (2020) CR839375 (2021)
Supplemental Keywords:
Lead, drinking water, corrosion, citizen science, environmental justice
Relevant Websites:
Our website is still in development and will be ready for previewing in Fall 2019.
Progress and Final Reports:
Original Abstract
The perspectives, information and conclusions conveyed in research project abstracts, progress reports, final reports, journal abstracts and journal publications convey the viewpoints of the principal investigator and may not represent the views and policies of ORD and EPA. Conclusions drawn by the principal investigators have not been reviewed by the Agency.