Natural Background Stream Conductivity Model and Estimates
The Natural Background Stream Conductivity (NBSC) Model predicts background specific conductivity (SC) for stream segments in the contiguous United States to enable comparison with measured in-stream conductivity. This random forest model was developed using geology, soil, vegetation, climate, and other empirically measured predictors. It was developed for streams with natural background SC < 2000 µS/cm. Above this level, inland water is considered brackish and the estimates from the NBSC Model may be less reliable. Data for some parameters that affect background SC were not readily available and were not included in the model. These include freshwater and marine interfaces, natural mineral springs, salt deposits that may affect groundwater and streams, and other natural sources of salts. In such areas the model is likely to underestimate SC. Local knowledge is necessary when assessing differences between predicted and measured background SC.
The Validation Model view on the Freshwater Explorer shows the predictive performance of the empirical NBSC Model at reference sites that were used to develop the model. In the wetter and more forested areas, reference sites were more abundant and predictions were more precise. In the central U.S., reference sites were less available and predictions were more uncertain. In the grass and shrublands east of the continental divide, the model over-predicted measured SC by more than 100 µS/cm at 5% of sites (yellow dots) and under-predicted it by more than 100 µS/cm at 3% of sites (red dots). Calculated differences between measured and predicted SC are reported as residuals in pop-up boxes on this view. There are many potential causes for these differences, including data reporting errors and reference site reliability.
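The over- and under-prediction categories described above can be illustrated with a minimal sketch. The function name, the sample values, and the sign convention (residual = measured minus predicted) are assumptions made for illustration; the source states only that residuals are differences between measured and predicted SC.

```python
def classify_residual(measured_sc, predicted_sc, threshold=100.0):
    """Label a site by how far the prediction misses the measurement (µS/cm)."""
    # Assumed sign convention: residual = measured - predicted.
    residual = measured_sc - predicted_sc
    if residual < -threshold:
        return "over-predicted"    # model value exceeds measurement by > threshold
    if residual > threshold:
        return "under-predicted"   # measurement exceeds model value by > threshold
    return "within threshold"


# Hypothetical (measured, predicted) pairs in µS/cm.
sites = [(250.0, 400.0), (600.0, 450.0), (120.0, 150.0)]
labels = [classify_residual(measured, predicted) for measured, predicted in sites]
```

Here the first site is over-predicted by 150 µS/cm, the second is under-predicted by 150 µS/cm, and the third falls within the 100 µS/cm threshold.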
Overall, the model explained most of the variation in SC and produced reasonably accurate predictions for both training data (assessed with out-of-bag predictions, MAE = 22 µS/cm, NSE = 0.92, and R² = 0.92) and external validation data (MAE = 29 µS/cm, NSE = 0.87, and R² = 0.87). For more detail, see Olson and Cormier (2019). Values reported as background only apply to streams and have not been validated for lakes or wetlands.
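For readers unfamiliar with the performance statistics, the sketch below computes the mean absolute error (MAE) and the Nash-Sutcliffe efficiency (NSE) using their standard definitions. The function names and the sample values are hypothetical and are not taken from the model data. R² is omitted because the source does not state which definition was used.

```python
def mean_absolute_error(observed, predicted):
    """Average magnitude of the prediction errors, in the units of the data."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def nash_sutcliffe_efficiency(observed, predicted):
    """1 minus the ratio of squared error to the variance of the observations."""
    mean_observed = sum(observed) / len(observed)
    squared_error = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    total_variation = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - squared_error / total_variation


# Hypothetical measured and predicted SC values in µS/cm.
observed = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 320.0, 380.0]

mae = mean_absolute_error(observed, predicted)
nse = nash_sutcliffe_efficiency(observed, predicted)
```

An NSE of 1 indicates a perfect match, while an NSE of 0 indicates that the predictions perform no better than the mean of the observations.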
Natural Background Stream Conductivity Model Data Sets
The StreamCat data set and process were used to develop the NBSC Model and make stream-specific predictions across the contiguous U.S. (Hill et al. 2016; https://github.com/USEPA/StreamCat). The StreamCat data set is based on a network of stream segments from NHD+2 (McKay et al. 2012). These stream segments drain an average area of 3.1 km², which characterizes the spatial grain size of this data set. Natural background SC was not estimated for streams shown as gray lines.
The data set consists of minimally disturbed sites. More than 2.4 million SC observations were obtained from WQP (USEPA 2016b), state natural resource agencies, the U.S. Geological Survey (USGS) National Water Information System (NWIS) (USGS 2016), and data used in Olson and Hawkins (2012). Data obtained from WQP were downloaded from the WQP website using the following query criteria: Country: United States; Sample Media: Water; Characteristics: Conductivity, Specific Conductivity, Specific Conductance, Specific Conductance, Calculated/Measured Ratio; Date range: up to 31 December 2015. Final data used in the model include observations made between 1 January 2001 and 31 December 2015, and are thus coincident with Moderate Resolution Imaging Spectroradiometer (MODIS) satellite data (https://modis.gsfc.nasa.gov/data/). During development, 56 potential explanatory parameters were evaluated. The original source data and final data sets are available at https://doi.org/10.23719/1500945. Predicted Background Conductivity metadata and data are also available on the Geoplatform.
Each observation was related to the nearest stream segment in the NHD+. Data were limited to one observation per stream segment per month. SC observations with ambiguous locations and repeat measurements along a stream segment in the same month were discarded. Using estimates of anthropogenic stress derived from the StreamCat database (Hill et al. 2016), segments were selected with minimal amounts of human activity (Stoddard et al. 2006) using criteria developed for each Level II Ecoregion (Omernik and Griffith 2014). Segments were assessed as potentially minimally stressed where watersheds had 0 – 0.5% impervious surface, 0 – 5% urban, 0 – 10% agriculture, and population densities of 0.8 – 30 people/km². Watersheds whose observations had large residuals in initial models were identified and inspected for evidence of other human activities not represented in StreamCat (e.g., mining, logging, grazing, or oil/gas extraction). Observations were removed from watersheds that were disturbed, had a tidal influence, or had unusual geologic conditions such as hot springs. Some sites with high levels remain in the data set, e.g., sites influenced by mining for which there is no easily accessible evidence or national record. About 5% of SC observations in each National Rivers and Streams Assessment (NRSA) region were then randomly selected as independent validation data. The remaining observations became the large training data set for model calibration.
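The screening, thinning, and hold-out steps described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' workflow: all field and function names are hypothetical, the upper ends of the reported ranges are used as a single set of thresholds (the actual criteria varied by Level II Ecoregion), the first observation in each segment and calendar month is the one kept, and the random seed is arbitrary.

```python
import random
from collections import defaultdict
from datetime import date


def is_minimally_stressed(watershed):
    """Screen a watershed using the upper ends of the reported ranges (assumed)."""
    return (watershed["pct_impervious"] <= 0.5
            and watershed["pct_urban"] <= 5.0
            and watershed["pct_agriculture"] <= 10.0
            and watershed["pop_density"] <= 30.0)   # people per km²


def one_per_segment_month(observations):
    """Keep the first observation for each segment and calendar month (assumed rule)."""
    seen, kept = set(), []
    for obs in observations:
        key = (obs["segment_id"], obs["date"].year, obs["date"].month)
        if key not in seen:
            seen.add(key)
            kept.append(obs)
    return kept


def split_by_region(observations, fraction=0.05, seed=0):
    """Randomly hold out about `fraction` of the observations in each region."""
    rng = random.Random(seed)
    by_region = defaultdict(list)
    for obs in observations:
        by_region[obs["region"]].append(obs)
    training, validation = [], []
    for region_obs in by_region.values():
        n_held_out = round(fraction * len(region_obs))
        held_out = set(rng.sample(range(len(region_obs)), n_held_out))
        for index, obs in enumerate(region_obs):
            (validation if index in held_out else training).append(obs)
    return training, validation


# Hypothetical watershed that fails the agriculture criterion.
watershed = {"pct_impervious": 0.3, "pct_urban": 2.0,
             "pct_agriculture": 12.0, "pop_density": 5.0}
passes_screen = is_minimally_stressed(watershed)

# Hypothetical observations: 20 segments in each of two regions.
observations = [
    {"segment_id": i, "region": "A" if i < 20 else "B",
     "date": date(2010, 6, 1), "sc": 100.0 + i}
    for i in range(40)
]
# A repeat measurement on segment 0 in the same month, which is discarded.
observations.append({"segment_id": 0, "region": "A",
                     "date": date(2010, 6, 15), "sc": 105.0})

unique_obs = one_per_segment_month(observations)
training, validation = split_by_region(unique_obs)
```

Splitting within each region, rather than across the pooled data, keeps every region represented in the validation set in proportion to its number of observations.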