Draft
41 commits
1295e91
start adding functions
ehinman Aug 8, 2025
c4b0b9a
start adding documentation and going through functions
ehinman Aug 27, 2025
c32ded5
adjust date function
ehinman Aug 28, 2025
99e949c
fix dates function
ehinman Aug 29, 2025
1641e85
keep working out issues with api calls
ehinman Aug 29, 2025
7bc6c6f
add documentation
ehinman Aug 29, 2025
1b29d6a
adjust how response is handled and edit walk pages, fix API limit print
ehinman Sep 19, 2025
3289982
add documentation
ehinman Sep 19, 2025
867d728
add more documentation, correct waterdata module
ehinman Sep 19, 2025
44213b5
allow post and get calls in recursive walk pages, fix typo where firs…
ehinman Sep 19, 2025
4affa2f
add in all possible arguments
ehinman Sep 19, 2025
21691d0
trying to get cql2 query correct, will keep at it
ehinman Sep 19, 2025
4c2a3ee
correct cql2 queries
ehinman Sep 22, 2025
14f2830
simplify syntax, remove unneeded dependencies
ehinman Sep 22, 2025
d25f854
start adding function documentation
ehinman Sep 24, 2025
7fe486a
add link urls
ehinman Sep 25, 2025
fad9ce0
fix date formatting function
ehinman Sep 25, 2025
a33d201
make waterdata outputs geopandas if geometry included
ehinman Sep 25, 2025
bd82c49
make gpd an optional dependency and change returns accordingly
ehinman Sep 25, 2025
06b0e69
incorporate geopandas boolean into function arguments and ensure user…
ehinman Sep 25, 2025
253da79
clean up some documentation and comments
ehinman Sep 25, 2025
f5cca07
add optional dependency to pyproject.toml
ehinman Sep 25, 2025
5c546e7
set convertType to default or user specification
ehinman Sep 25, 2025
e9221ac
start unit tests on new functions
ehinman Sep 25, 2025
b1436db
update README and add a NEWS markdown in which to place past updates
ehinman Sep 26, 2025
dc24658
make a few small changes to names and documentation
ehinman Sep 26, 2025
89b960c
define max_results when it is an input
ehinman Sep 26, 2025
1237777
comment out code that wasn't doing the correct thing with max_results
ehinman Sep 26, 2025
e84984a
Revert waterdata to requrests
thodson-usgs Sep 29, 2025
4c84fc0
Review waterdata module
Oct 2, 2025
f4693b6
Update README.md
Oct 2, 2025
0d06672
Add deprecation warning for nwis
Oct 22, 2025
96a4356
Update dataretrieval/waterdata/api.py
ehinman Nov 21, 2025
7f7f184
Update dataretrieval/waterdata/api.py
ehinman Nov 21, 2025
c14e00b
Update dataretrieval/waterdata/api.py
ehinman Nov 21, 2025
dcc7a1a
Apply suggestions from code review
ehinman Nov 21, 2025
370f9a5
Merge pull request #5 from nodohs/waterdata
ehinman Nov 21, 2025
4482751
add back in documentation and make formatting changes
ehinman Nov 21, 2025
37063b9
add metadata to api.py and testing
ehinman Nov 21, 2025
8bb2de8
small changes to remove unnecessary imports and add more documentation
ehinman Nov 21, 2025
2f6af7d
remove some redundant testing, make next url be an info log, not debug
ehinman Nov 21, 2025
4 changes: 1 addition & 3 deletions .github/workflows/python-package.yml
Expand Up @@ -36,7 +36,5 @@ jobs:
flake8 . --count --exit-zero --max-complexity=10 --max-line-length=127 --statistics
- name: Test with pytest and report coverage
run: |
cd tests
coverage run -m pytest
coverage run -m pytest tests/
coverage report -m
cd ..
7 changes: 7 additions & 0 deletions NEWS.md
@@ -0,0 +1,7 @@
**10/01/2025:** `dataretrieval` is pleased to offer a new module, `waterdata`, which gives users access to USGS's modernized [Water Data APIs](https://api.waterdata.usgs.gov/). The Water Data API endpoints include daily values, instantaneous values, field measurements (modernized groundwater levels service), time series metadata, and discrete water quality data from the Samples database. Though there will be a period of overlap, the functions within `waterdata` will eventually replace the `nwis` module, which currently provides access to the legacy [NWIS Water Services](https://waterservices.usgs.gov/). More example workflows and functions coming soon. Check `help(waterdata)` for more information.

**09/03/2024:** The groundwater levels service has switched endpoints, and `dataretrieval` was updated accordingly in [`v1.0.10`](https://github.com/DOI-USGS/dataretrieval-python/releases/tag/v1.0.10). Older versions using the discontinued endpoint will return 503 errors for `nwis.get_gwlevels` or the `service='gwlevels'` argument. Visit [Water Data For the Nation](https://waterdata.usgs.gov/blog/wdfn-waterservices-2024/) for more information.

**03/01/2024:** USGS data availability and format have changed on Water Quality Portal (WQP). Since March 2024, data obtained from WQP legacy profiles will not include new USGS data or recent updates to existing data. All USGS data (up to and beyond March 2024) are available using the new WQP beta services. You can access the beta services by setting `legacy=False` in the functions in the `wqp` module.

To view the status of changes in data availability and code functionality, visit: https://doi-usgs.github.io/dataRetrieval/articles/Status.html
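
The 09/03/2024 entry above implies a minimum release of 1.0.10 for the groundwater levels service. A minimal sketch of checking that at runtime with the standard-library `importlib.metadata` follows. The helper names `parse_version` and `is_supported` are hypothetical and not part of `dataretrieval`, and the parser assumes versions begin with a plain numeric `X.Y.Z` prefix (any suffix is ignored).

```python
import re
from importlib.metadata import PackageNotFoundError, version

MIN_VERSION = (1, 0, 10)  # release named in the NEWS entry


def parse_version(text):
    """Turn '1.0.10' into (1, 0, 10); missing parts count as 0, suffixes are dropped."""
    match = re.match(r"(\d+)(?:\.(\d+))?(?:\.(\d+))?", text)
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return tuple(int(part or 0) for part in match.groups())


def is_supported(installed):
    """True if the installed version is at least MIN_VERSION."""
    return parse_version(installed) >= MIN_VERSION


try:
    installed = version("dataretrieval")
    print(installed, "supported" if is_supported(installed) else "too old")
except (PackageNotFoundError, ValueError):
    print("could not determine the installed dataretrieval version")
```

Comparing integer tuples rather than raw strings matters here: as strings, `"1.0.10"` sorts before `"1.0.9"`.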
254 changes: 176 additions & 78 deletions README.md
Expand Up @@ -4,123 +4,221 @@
![Conda Version](https://img.shields.io/conda/v/conda-forge/dataretrieval)
![Downloads](https://static.pepy.tech/badge/dataretrieval)

:warning: USGS data availability and format have changed on Water Quality Portal (WQP). Since March 2024, data obtained from WQP legacy profiles will not include new USGS data or recent updates to existing data. All USGS data (up to and beyond March 2024) are available using the new WQP beta services. You can access the beta services by setting `legacy=False` in the functions in the `wqp` module.
## Latest Announcements

To view the status of changes in data availability and code functionality, visit: https://doi-usgs.github.io/dataRetrieval/articles/Status.html
:mega: **10/01/2025:** `dataretrieval` now features the new `waterdata` module,
which provides access to USGS's modernized [Water Data
APIs](https://api.waterdata.usgs.gov/). The Water Data API endpoints include
daily values, instantaneous values, field measurements, time series metadata,
and discrete water quality data from the Samples database. This new module will
eventually replace the `nwis` module, which provides access to the legacy [NWIS
Water Services](https://waterservices.usgs.gov/).

:mega: **09/03/2024:** The groundwater levels service has switched endpoints, and `dataretrieval` was updated accordingly in [`v1.0.10`](https://github.com/DOI-USGS/dataretrieval-python/releases/tag/v1.0.10). Older versions using the discontinued endpoint will return 503 errors for `nwis.get_gwlevels` or the `service='gwlevels'` argument. Visit [Water Data For the Nation](https://waterdata.usgs.gov/blog/wdfn-waterservices-2024/) for more information.
**Important:** Users of the Water Data APIs are strongly encouraged to obtain an
API key for higher rate limits and greater access to USGS data. [Register for
an API key](https://api.waterdata.usgs.gov/signup/) and set it as an
environment variable:

## What is dataretrieval?
`dataretrieval` was created to simplify the process of loading hydrologic data into the Python environment.
Like the original R version [`dataRetrieval`](https://github.com/DOI-USGS/dataRetrieval),
it is designed to retrieve the major data types of U.S. Geological Survey (USGS) hydrology
data that are available on the Web, as well as data from the Water
Quality Portal (WQP), which currently houses water quality data from the
Environmental Protection Agency (EPA), U.S. Department of Agriculture
(USDA), and USGS. Direct USGS data is obtained from a service called the
National Water Information System (NWIS).
```python
import os
os.environ["API_USGS_PAT"] = "your_api_key_here"
```

Note that the python version is not a direct port of the original: it attempts to reproduce the functionality of the R package,
though its organization and interface often differ.
Check out the [NEWS](NEWS.md) file for all updates and announcements.

If there's a hydrologic or environmental data portal that you'd like dataretrieval to
work with, raise it as an [issue](https://github.com/USGS-python/dataretrieval/issues).
## What is dataretrieval?

Here's an example using `dataretrieval` to retrieve data from the National Water Information System (NWIS).
`dataretrieval` simplifies the process of loading hydrologic data into Python.
Like the original R version
[`dataRetrieval`](https://github.com/DOI-USGS/dataRetrieval), it retrieves major
U.S. Geological Survey (USGS) hydrology data types available on the Web, as well
as data from the Water Quality Portal (WQP) and Network Linked Data Index
(NLDI).

```python
# first import the functions for downloading data from NWIS
import dataretrieval.nwis as nwis
## Usage Examples

# specify the USGS site code for which we want data.
site = '03339000'
### Water Data API (Recommended - Modern USGS Data)

# get instantaneous values (iv)
df = nwis.get_record(sites=site, service='iv', start='2017-12-31', end='2018-01-01')
The `waterdata` module provides access to modern USGS Water Data APIs:

# get basic info about the site
df2 = nwis.get_record(sites=site, service='site')
```python
import dataretrieval.waterdata as waterdata

# Get daily streamflow data (returns DataFrame and metadata)
df, metadata = waterdata.get_daily(
    monitoring_location_id='USGS-01646500',
    parameter_code='00060',  # Discharge
    time='2024-10-01/2024-10-02'
)

print(f"Retrieved {len(df)} records")
print(f"Site: {df['monitoring_location_id'].iloc[0]}")
print(f"Mean discharge: {df['value'].mean():.2f} {df['unit_of_measure'].iloc[0]}")
```
Services available from NWIS include:
- instantaneous values (iv)
- daily values (dv)
- statistics (stat)
- site info (site)
- discharge peaks (peaks)
- discharge measurements (measurements)

Water quality data are available from:
- [Samples](https://waterdata.usgs.gov/download-samples/#dataProfile=site) - Discrete USGS water quality data only
- [Water Quality Portal](https://www.waterqualitydata.us/) - Discrete water quality data from USGS and EPA. Older data are available in the legacy WQX version 2 format; all data are available in the beta WQX3.0 format.

To access the full functionality available from NWIS web services, `nwis.get_record` appends any additional kwargs into the REST request. For example, this function call:

```python
nwis.get_record(sites='03339000', service='dv', start='2017-12-31', parameterCd='00060')
# Get monitoring location information
locations, metadata = waterdata.get_monitoring_locations(
    state_name='Maryland',
    site_type_code='ST'  # Stream sites
)

print(f"Found {len(locations)} stream monitoring locations in Maryland")
```
...will download daily data with the parameter code 00060 (discharge).

## Accessing the "Internal" NWIS
If you're connected to the USGS network, dataretrieval can pull from the internal (non-public) NWIS interface.
Most dataretrieval functions pass kwargs directly to NWIS's REST API, which provides simple access to internal data; simply specify "access='3'".
For example
### NWIS Legacy Services (Deprecated but still functional)

The `nwis` module accesses legacy NWIS Water Services:

```python
nwis.get_record(sites='05404147',service='iv', start='2021-01-01', end='2021-3-01', access='3')
import dataretrieval.nwis as nwis

# Get site information
info, metadata = nwis.get_info(sites='01646500')

print(f"Site name: {info['station_nm'].iloc[0]}")

# Get daily values
dv, metadata = nwis.get_dv(
    sites='01646500',
    start='2024-10-01',
    end='2024-10-02',
    parameterCd='00060',
)

print(f"Retrieved {len(dv)} daily values")
```

More services and documentation to come!
### Water Quality Portal (WQP)

## Quick start
Access water quality data from multiple agencies:

dataretrieval can be installed using pip:

$ python3 -m pip install -U dataretrieval
```python
import dataretrieval.wqp as wqp

or conda:
# Find water quality monitoring sites
sites = wqp.what_sites(
    statecode='US:55',  # Wisconsin
    siteType='Stream'
)

$ conda install -c conda-forge dataretrieval
print(f"Found {len(sites)} stream monitoring sites in Wisconsin")

More examples of use are included in [`demos`](https://github.com/USGS-python/dataretrieval/tree/main/demos).
# Get water quality results
results = wqp.get_results(
    siteid='USGS-05427718',
    characteristicName='Temperature, water'
)

## Issue tracker
print(f"Retrieved {len(results)} temperature measurements")
```

Please report any bugs and enhancement ideas using the dataretrieval issue
tracker:
### Network Linked Data Index (NLDI)

https://github.com/USGS-python/dataretrieval/issues
Discover and navigate hydrologic networks:

Feel free to also ask questions on the tracker.
```python
import dataretrieval.nldi as nldi

# Get watershed basin for a stream reach
basin = nldi.get_basin(
    feature_source='comid',
    feature_id='13293474'  # NHD reach identifier
)

## Contributing
print(f"Basin contains {len(basin)} feature(s)")

Any help in testing, development, documentation and other tasks is welcome.
For more details, see the file [CONTRIBUTING.md](CONTRIBUTING.md).
# Find upstream flowlines
flowlines = nldi.get_flowlines(
    feature_source='comid',
    feature_id='13293474',
    navigation_mode='UT',  # Upstream tributaries
    distance=50  # km
)

print(f"Found {len(flowlines)} upstream tributaries within 50km")
```

## Need help?
## Available Data Services

### Modern USGS Water Data APIs (Recommended)
- **Daily values**: Daily statistical summaries (mean, min, max)
- **Instantaneous values**: High-frequency continuous data
- **Field measurements**: Discrete measurements from field visits
- **Monitoring locations**: Site information and metadata
- **Time series metadata**: Information about available data parameters

### Legacy NWIS Services (Deprecated)
- **Daily values (dv)**: Legacy daily statistical data
- **Instantaneous values (iv)**: Legacy continuous data
- **Site info (site)**: Basic site information
- **Statistics (stat)**: Statistical summaries
- **Discharge peaks (peaks)**: Annual peak discharge events
- **Discharge measurements (measurements)**: Direct flow measurements

### Water Quality Portal
- **Results**: Water quality analytical results from USGS, EPA, and other agencies
- **Sites**: Monitoring location information
- **Organizations**: Data provider information
- **Projects**: Sampling project details

### Network Linked Data Index (NLDI)
- **Basin delineation**: Watershed boundaries for any point
- **Flow navigation**: Upstream/downstream network traversal
- **Feature discovery**: Find monitoring sites, dams, and other features
- **Hydrologic connectivity**: Link data across the stream network

## Installation

Install dataretrieval using pip:

```bash
pip install dataretrieval
```

Or using conda:

The Water Mission Area of the USGS supports the development and maintenance of `dataretrieval`. Any questions can be directed to the Computational Tools team at
[email protected].
```bash
conda install -c conda-forge dataretrieval
```

Resources are available primarily for maintenance and responding to user questions.
Priorities on the development of new features are determined by the `dataretrieval` development team.
## More Examples

Explore additional examples in the
[`demos`](https://github.com/USGS-python/dataretrieval/tree/main/demos)
directory, including Jupyter notebooks demonstrating advanced usage patterns.

## Getting Help

- **Issue tracker**: Report bugs and request features at https://github.com/USGS-python/dataretrieval/issues
- **Documentation**: Full API documentation available in the source code docstrings

## Contributing

Contributions are welcome! Please see [CONTRIBUTING.md](CONTRIBUTING.md) for
development guidelines.

## Acknowledgments
This material is partially based upon work supported by the National Science Foundation (NSF) under award 1931297.
Any opinions, findings, conclusions, or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the NSF.

This material is partially based upon work supported by the National Science
Foundation (NSF) under award 1931297. Any opinions, findings, conclusions, or
recommendations expressed in this material are those of the authors and do not
necessarily reflect the views of the NSF.

## Disclaimer

This software is preliminary or provisional and is subject to revision.
It is being provided to meet the need for timely best science.
The software has not received final approval by the U.S. Geological Survey (USGS).
No warranty, expressed or implied, is made by the USGS or the U.S. Government as to the functionality of the software and related material nor shall the fact of release constitute any such warranty.
The software is provided on the condition that neither the USGS nor the U.S. Government shall be held liable for any damages resulting from the authorized or unauthorized use of the software.
This software is preliminary or provisional and is subject to revision. It is
being provided to meet the need for timely best science. The software has not
received final approval by the U.S. Geological Survey (USGS). No warranty,
expressed or implied, is made by the USGS or the U.S. Government as to the
functionality of the software and related material nor shall the fact of release
constitute any such warranty. The software is provided on the condition that
neither the USGS nor the U.S. Government shall be held liable for any damages
resulting from the authorized or unauthorized use of the software.

## Citation

Hodson, T.O., Hariharan, J.A., Black, S., and Horsburgh, J.S., 2023, dataretrieval (Python): a Python package for discovering
and retrieving water data available from U.S. federal hydrologic web services:
U.S. Geological Survey software release,
https://doi.org/10.5066/P94I5TX3.
Hodson, T.O., Hariharan, J.A., Black, S., and Horsburgh, J.S., 2023,
dataretrieval (Python): a Python package for discovering and retrieving water
data available from U.S. federal hydrologic web services: U.S. Geological Survey
software release, https://doi.org/10.5066/P94I5TX3.
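
The README announcement in this diff sets `API_USGS_PAT` by assigning to `os.environ` inside the script, which makes it easy to commit a key by accident. A minimal sketch of the alternative follows, assuming the key has been exported in the shell beforehand and is only read in Python. The helper `find_api_key` is hypothetical and not part of `dataretrieval`; only the variable name comes from the README.

```python
import os

API_KEY_VAR = "API_USGS_PAT"  # variable name taken from the README snippet


def find_api_key(env=None):
    """Return the API key from the environment, or None if it is unset or blank.

    Hypothetical helper for illustration; dataretrieval does not provide it.
    """
    env = os.environ if env is None else env
    key = env.get(API_KEY_VAR, "").strip()
    return key or None


# An explicit mapping is passed here so the example runs without a real key.
if find_api_key({"API_USGS_PAT": "example-key"}) is None:
    print(f"{API_KEY_VAR} is not set; requests may be subject to lower rate limits")
else:
    print("API key found")
```

Returning `None` instead of raising matches the README's wording, which encourages a key but does not state that one is required.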
17 changes: 9 additions & 8 deletions dataretrieval/nwis.py
Expand Up @@ -2,13 +2,6 @@

.. _National Water Information System (NWIS): https://waterdata.usgs.gov/nwis


.. todo::

* Create a test to check whether functions pull multiple sites
* Work on multi-index capabilities.
* Check that all timezones are handled properly for each service.

"""

import re
Expand All @@ -19,7 +12,7 @@
import pandas as pd
import requests

from dataretrieval.utils import BaseMetadata, format_datetime, to_str
from dataretrieval.utils import BaseMetadata, format_datetime

from .utils import query

Expand All @@ -28,6 +21,14 @@
except ImportError:
    gpd = None

# Issue deprecation warning upon import
warnings.warn(
    "The 'nwis' services are deprecated and being decommissioned. "
    "Please use the 'waterdata' module to access the new services.",
    DeprecationWarning,
    stacklevel=2
)

WATERDATA_BASE_URL = "https://nwis.waterdata.usgs.gov/"
WATERDATA_URL = WATERDATA_BASE_URL + "nwis/"
WATERSERVICE_URL = "https://waterservices.usgs.gov/nwis/"
Expand Down
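
A note on the import-time warning added to `nwis.py`: Python's default filters hide `DeprecationWarning` unless it is attributed to code running in `__main__`, so users who import `nwis` indirectly through another package may never see it. A minimal sketch of how a caller or test could surface the message explicitly follows. Here `legacy_entry_point` is a hypothetical stand-in for the module import and `collect_deprecations` is an illustrative helper; neither exists in `dataretrieval`.

```python
import warnings


def legacy_entry_point():
    """Hypothetical stand-in for importing the deprecated module."""
    warnings.warn(
        "The 'nwis' services are deprecated and being decommissioned. "
        "Please use the 'waterdata' module to access the new services.",
        DeprecationWarning,
        stacklevel=2,
    )


def collect_deprecations(func):
    """Call func and return the text of every DeprecationWarning it emits."""
    with warnings.catch_warnings(record=True) as caught:
        # Override the default filters so the warning is always recorded.
        warnings.simplefilter("always", DeprecationWarning)
        func()
    return [
        str(item.message)
        for item in caught
        if issubclass(item.category, DeprecationWarning)
    ]


messages = collect_deprecations(legacy_entry_point)
print(messages)
```

Running a script with `python -W default::DeprecationWarning` is another way to make such warnings visible without changing code.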
12 changes: 6 additions & 6 deletions dataretrieval/samples.py
Expand Up @@ -11,18 +11,17 @@
import pandas as pd
import warnings

from dataretrieval.utils import BaseMetadata, to_str
from dataretrieval.waterdata import get_samples
from dataretrieval.utils import BaseMetadata

if TYPE_CHECKING:
    from typing import Optional, Tuple, Union
    from dataretrieval.waterdata import _SERVICES, _PROFILES
    from dataretrieval.waterdata import SERVICES, PROFILES
    from pandas import DataFrame

def get_usgs_samples(
    ssl_check: bool = True,
    service: _SERVICES = "results",
    profile: _PROFILES = "fullphyschem",
    service: SERVICES = "results",
    profile: PROFILES = "fullphyschem",
    activityMediaName: Optional[Union[str, list[str]]] = None,
    activityStartDateLower: Optional[str] = None,
    activityStartDateUpper: Optional[str] = None,
Expand Down Expand Up @@ -212,7 +211,8 @@ def get_usgs_samples(
        DeprecationWarning,
        stacklevel=2,
    )


    from dataretrieval.waterdata import get_samples
    result = get_samples(
        ssl_check=ssl_check,
        service=service,
Expand Down
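
The `samples.py` hunks combine two import patterns: type-only names imported under `if TYPE_CHECKING:` and the `get_samples` import moved from module level into the function body. The diff does not state the motivation; avoiding a circular import between `samples` and `waterdata` is one plausible reason, but that is an assumption. A minimal, self-contained sketch of both patterns follows, with `legacy_mean` and `statistics.fmean` as hypothetical stand-ins for `get_usgs_samples` and `get_samples`.

```python
import warnings
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Seen only by static type checkers; never imported at runtime.
    from collections.abc import Sequence


def legacy_mean(values: "Sequence[float]") -> float:
    """Hypothetical deprecated wrapper, standing in for get_usgs_samples."""
    warnings.warn(
        "legacy_mean is deprecated; call statistics.fmean directly.",
        DeprecationWarning,
        stacklevel=2,
    )
    # Deferred import: resolved on the first call, not when this module loads.
    # statistics.fmean stands in for dataretrieval.waterdata.get_samples.
    from statistics import fmean

    return fmean(values)


with warnings.catch_warnings():
    warnings.simplefilter("ignore", DeprecationWarning)
    print(legacy_mean([1.0, 2.0, 3.0]))
```

Names imported only under `TYPE_CHECKING` must appear in annotations as strings, as above, or the module needs `from __future__ import annotations`. Whether `samples.py` already does this is not visible in the hunks shown.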