Add waterdata infrastructure #183

ehinman · 2025-09-19T22:01:58Z

This PR will add access to the new water data APIs via the waterdata module.

9/26/25: Added some updates to the README.md about the new module and API keys. Ready for testing and review.

EOD 9/25/25: qualifier is a tricky argument, and the product owner advises against filtering on it unless you are confident and restrictive about what you want: a row's qualifier can be a list of multiple values, and if you pass a single qualifier value, it will only match rows with JUST that one. By default, the functions return a geopandas GeoDataFrame when geometry is returned, but because geopandas is an optional dependency, they return pandas DataFrames if geopandas is not available. Unit tests have been created, with opportunities for more. I'd say the functions are ready for testing. I still need to add some info on the new functions to the README, etc.
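
To illustrate the qualifier caveat, here is a toy sketch in plain Python. The rows and qualifier codes are made up, and the exact-match rule is only my reading of the behavior described above, not the API's implementation.

```python
# Hypothetical rows: each carries a list of qualifiers (codes are made up).
rows = [
    {"daily_id": "a", "qualifier": ["ICE"]},
    {"daily_id": "b", "qualifier": ["ICE", "ESTIMATED"]},
    {"daily_id": "c", "qualifier": []},
]


def exact_match(rows, qualifier):
    """Keep rows whose qualifier list is exactly [qualifier]."""
    return [r for r in rows if r["qualifier"] == [qualifier]]


def contains(rows, qualifier):
    """Keep rows whose qualifier list includes qualifier."""
    return [r for r in rows if qualifier in r["qualifier"]]


# Exact matching drops row "b" even though it carries the ICE qualifier.
print([r["daily_id"] for r in exact_match(rows, "ICE")])
print([r["daily_id"] for r in contains(rows, "ICE")])
```

If the service behaves like `exact_match`, a user expecting `contains` would silently lose rows, which is why filtering on qualifier is discouraged.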

9/25/25: POST calls using the CQL2 query language appear to be working, and documentation for the functions has been added. I'm noticing some inconsistencies in some of the input parameters like qualifier that still need to be addressed/parsed. I also need to create unit tests and I'd like to have the functions return a geopandas dataframe when skipGeometry=False.

9/19/25: It is currently a work in progress that appears to work for GET calls in which the user requests one parameter (e.g. one site, one pcode, etc.) at a time. Still working out the POST calls in which a user may request multiple parameters (e.g. data from multiple sites, with multiple pcodes), which requires the use of the CQL2 query language. Stay tuned.
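
For readers unfamiliar with CQL2, here is a minimal sketch of how a multi-value request could be expressed as a CQL2 JSON filter. `build_cql2_filter` is a hypothetical name, not a function in this PR, and the site IDs are only illustrative; the `in`, `=`, and `and` operators are standard CQL2 JSON.

```python
import json
from typing import Dict, List, Union


def build_cql2_filter(params: Dict[str, Union[str, List[str]]]) -> dict:
    """Turn {property: value-or-list} into a CQL2 JSON filter.

    Lists become "in" predicates, scalars become "=" predicates, and
    multiple predicates are combined with "and".
    """
    predicates = []
    for prop, value in params.items():
        if isinstance(value, (list, tuple)):
            predicates.append(
                {"op": "in", "args": [{"property": prop}, list(value)]}
            )
        else:
            predicates.append({"op": "=", "args": [{"property": prop}, value]})
    if not predicates:
        raise ValueError("at least one parameter is required")
    if len(predicates) == 1:
        return predicates[0]
    return {"op": "and", "args": predicates}


# Illustrative request: two sites, one parameter code.
body = build_cql2_filter(
    {
        "monitoring_location_id": ["USGS-05427718", "USGS-05427930"],
        "parameter_code": "00060",
    }
)
print(json.dumps(body, indent=2))
```

The resulting dict would be serialized as the JSON body of the POST request. The endpoint and headers are left out because this sketch does not reproduce the helper's actual request code.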

agilmore2 · 2025-09-25T05:13:43Z

dataretrieval/waterdata.py

+        bbox: Optional[List[float]] = None,
+        limit: Optional[int] = None,
+        max_results: Optional[int] = None,
+        convertType: bool = True


How do we pass the api_key API parameter here?

I will add documentation about this (still learning the details myself), but your API key should be passed as a header if it exists as an environment variable. This is the line used to grab the api key in one of the helper functions:
token = os.getenv("API_USGS_PAT")

So you'll want to get your API key, and then set it using:

os.environ["API_USGS_PAT"] = "<your key>"

You may need to restart your session to get it to "register".

And to be clear: all you need to do is have your key in your environment, you don't need to "set it" in the functions anywhere.
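
Roughly, the pattern looks like the sketch below. This is a simplified stand-in, not the helper's actual code: `default_headers` and the `X-Api-Key` header name are assumptions for illustration, and only the `API_USGS_PAT` environment variable comes from the comment above.

```python
import os
from typing import Dict, Optional


def default_headers(env_var: str = "API_USGS_PAT") -> Dict[str, str]:
    """Build request headers, adding the API key only if it is set."""
    headers = {"Accept": "application/json"}
    token: Optional[str] = os.getenv(env_var)
    if token:
        # Header name is an assumption made for this sketch.
        headers["X-Api-Key"] = token
    return headers


# The key is read from the environment at call time, so callers never
# pass it to the retrieval functions themselves.
os.environ["API_USGS_PAT"] = "<your key>"
print(sorted(default_headers()))
```

Because the lookup happens inside the header helper, requests made without a key still go out, just without the extra header.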

agilmore2 · 2025-09-25T14:32:45Z

Understood, thanks! I was pulling it out of an environment variable myself and expecting to set it in the retrieval functions, but the functions pulling it themselves also works. I appreciate your hard work on these!

agilmore2 · 2025-09-25T16:18:33Z

I see that in _default_headers(). Thanks!

thodson-usgs · 2025-09-25T18:12:38Z

dataretrieval/waterdata_helpers.py

 from datetime import datetime
 import pandas as pd
 import json
+import geopandas as gpd


We want to keep geopandas as an optional dependency, I think. See the nldi module for example.
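
A common way to do this is a guarded import plus a fallback return type. The sketch below is my own illustration, not the nldi module's code; `GEOPANDAS`, `feature_properties`, and `features_to_table` are hypothetical names.

```python
from typing import Any, Dict, List

try:
    import geopandas as gpd

    GEOPANDAS = True
except ImportError:
    # geopandas is optional: remember that it is missing and carry on.
    gpd = None
    GEOPANDAS = False


def feature_properties(features: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Extract the attribute dict from each GeoJSON feature."""
    return [dict(f.get("properties") or {}) for f in features]


def features_to_table(features: List[Dict[str, Any]], skip_geometry: bool = False):
    """Return a GeoDataFrame when possible, otherwise a pandas DataFrame."""
    if GEOPANDAS and not skip_geometry:
        return gpd.GeoDataFrame.from_features(features)
    # Imported here so the module still loads without geopandas.
    import pandas as pd

    return pd.DataFrame(feature_properties(features))
```

With this shape, the top-level `import geopandas as gpd` in the diff above would no longer fail for users who have not installed the optional dependency.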

jzemmels

I've tested out all of the read_waterdata_ functions in various ways and didn't run into errors. Just one comment on the usage of limit and max_results that confuses me. I've also read through the helper functions and everything looks good. I know there are other PRs proposed for the ehinman:add-waterdata-infrastructure branch, so let me know if you'd like me to look again after those changes are resolved.

jzemmels · 2025-11-21T17:37:28Z

dataretrieval/waterdata.py

+    output_id = "daily_id"
+
+    # Build argument dictionary, omitting None values
+    args = { 


jzemmels · 2025-11-21T17:41:06Z

dataretrieval/waterdata.py

+    return waterdata_helpers.get_ogc_data(args, output_id, service)
+
+def get_monitoring_locations(
+        monitoring_location_id: Optional[List[str]] = None,


Do we want to add some argument checking statements, perhaps somewhere in the helpers? Making sure monitoring_location_id is a string, etc. Not sure how difficult this would be to implement, but I'm able to pass seemingly whatever I want to these arguments and a request is still made.
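
As a starting point, a small helper along these lines could live in the helpers module. This is only a sketch: `check_str_or_list` is a hypothetical name, and normalizing a single string to a one-element list is my assumption about what downstream code would want.

```python
from typing import Any, List, Optional


def check_str_or_list(name: str, value: Any) -> Optional[List[str]]:
    """Validate that value is None, a string, or a list of strings.

    Returns None unchanged and normalizes everything else to a list.
    """
    if value is None:
        return None
    if isinstance(value, str):
        return [value]
    if isinstance(value, (list, tuple)) and all(isinstance(v, str) for v in value):
        return list(value)
    raise TypeError(
        f"{name} must be a string or a list of strings, got {value!r}"
    )


print(check_str_or_list("monitoring_location_id", "USGS-05427718"))
```

Raising before the request is built would stop malformed arguments from ever reaching the API.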

jzemmels · 2025-11-21T19:33:52Z

dataretrieval/waterdata.py

+        time: Optional[Union[str, List[str]]] = None,
+        bbox: Optional[List[float]] = None,
+        limit: Optional[int] = None,
+        max_results: Optional[int] = None,


Similar to a question I brought up in the dataRetrieval PR, I'm either not understanding how limit and max_results are supposed to work or they're not working as intended. The max_results argument doesn't seem to impact the number of rows returned in the output. Examples:

# wd.get_monitoring_locations(site_type_code="GW")  # fetches all GW ML ids
wd.get_monitoring_locations(site_type_code="GW", max_results=10)  # fetches 10,000 GW ML ids
wd.get_monitoring_locations(site_type_code="GW", max_results=5, limit=10)  # fetches 10 GW ML ids
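
For comparison, here is a sketch of the semantics I would have expected, assuming limit is the page size and max_results caps the total number of rows across pages. This is not how the module currently behaves, and `walk_pages` and `fake_fetch` are hypothetical stand-ins that use offsets in place of real paginated requests.

```python
from typing import Callable, List, Optional


def walk_pages(
    fetch_page: Callable[[int, int], List[dict]],
    limit: int = 10000,
    max_results: Optional[int] = None,
) -> List[dict]:
    """Collect pages of at most `limit` rows, stopping at `max_results`."""
    rows: List[dict] = []
    offset = 0
    while True:
        page_size = limit
        if max_results is not None:
            remaining = max_results - len(rows)
            if remaining <= 0:
                break
            # Never request more rows than are still wanted.
            page_size = min(limit, remaining)
        page = fetch_page(offset, page_size)
        rows.extend(page)
        offset += len(page)
        if len(page) < page_size:
            break  # a short page means the data is exhausted
    return rows


# Stand-in for the API: 25 rows served by offset and page size.
DATA = [{"id": i} for i in range(25)]


def fake_fetch(offset: int, size: int) -> List[dict]:
    return DATA[offset:offset + size]


print(len(walk_pages(fake_fetch, limit=10)))
print(len(walk_pages(fake_fetch, limit=10, max_results=5)))
```

Under this reading, the third call above (max_results=5, limit=10) would return 5 rows rather than 10.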

jzemmels · 2025-11-21T19:58:38Z

tests/waterdata_test.py

Switching up the monitoring location IDs and parameter codes fetched across these tests would be good. For a future PR: consider making lists of potential argument values and randomly selecting one for each of the test runs.
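
One way this could look, as a rough sketch: `pick_test_args` is a hypothetical helper, and the ID and parameter code lists are illustrative only (not every site necessarily reports every parameter, so real lists would need vetting).

```python
import random
from typing import Dict, Optional

# Illustrative candidates; a real test suite would curate these.
SITE_IDS = ["USGS-05427718", "USGS-01646500", "USGS-09380000"]
PARAMETER_CODES = ["00060", "00065", "00010"]


def pick_test_args(seed: Optional[int] = None) -> Dict[str, str]:
    """Pick one site and one parameter code; a seed makes the choice repeatable."""
    rng = random.Random(seed)
    return {
        "monitoring_location_id": rng.choice(SITE_IDS),
        "parameter_code": rng.choice(PARAMETER_CODES),
    }


print(pick_test_args(seed=42))
```

Logging the seed with each run would keep a randomly chosen failing combination reproducible.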

ehinman added 17 commits August 8, 2025 17:19

start adding functions

1295e91

start adding documentation and going through functions

c4b0b9a

adjust date function

c32ded5

fix dates function

99e949c

keep working out issues with api calls

1641e85

add documentation

7bc6c6f

adjust how response is handled and edit walk pages, fix API limit print

1b29d6a

add documentation

3289982

add more documentation, correct waterdata module

867d728

allow post and get calls in recursive walk pages, fix typo where first page not downloading, start to add more function outlines

44213b5

add in all possible arguments

4affa2f

trying to get cql2 query correct, will keep at it

21691d0

correct cql2 queries

4c2a3ee

simplify syntax, remove unneeded dependencies

14f2830

start adding function documentation

d25f854

add link urls

7fe486a

fix date formatting function

fad9ce0

agilmore2 reviewed Sep 25, 2025

make waterdata outputs geopandas if geometry included

a33d201

thodson-usgs reviewed Sep 25, 2025

ehinman added 7 commits September 25, 2025 13:25

make gpd an optional dependency and change returns accordingly

bd82c49

incorporate geopandas boolean into function arguments and ensure user knows when they will receive a pandas df

06b0e69

clean up some documentation and comments

253da79

add optional dependency to pyproject.toml

f5cca07

set convertType to default or user specification

5c546e7

start unit tests on new functions

e9221ac

update README and add a NEWS markdown in which to place past updates

b1436db

ehinman requested a review from thodson-usgs September 26, 2025 15:22

ehinman requested review from jzemmels and ldecicco-USGS September 26, 2025 15:22

ehinman added 3 commits September 26, 2025 10:57

make a few small changes to names and documentation

dc24658

define max_results when it is an input

89b960c

comment out code that wasn't doing the correct thing with max_results

1237777

ehinman requested a review from mikemahoney218-usgs September 26, 2025 18:36

thodson-usgs and others added 4 commits September 29, 2025 09:11

Revert waterdata to requests

e84984a

Review waterdata module

4c84fc0

Update README.md

f4693b6

Add deprecation warning for nwis

0d06672

jzemmels reviewed Nov 21, 2025

ehinman and others added 9 commits November 21, 2025 15:14

Update dataretrieval/waterdata/api.py

96a4356

Update dataretrieval/waterdata/api.py

7f7f184

Update dataretrieval/waterdata/api.py

c14e00b

Apply suggestions from code review

dcc7a1a

Merge pull request #5 from nodohs/waterdata

370f9a5

Waterdata revisions

add back in documentation and make formatting changes

4482751

add metadata to api.py and testing

37063b9

small changes to remove unnecessary imports and add more documentation

8bb2de8

remove some redundant testing, make next url be an info log, not debug

2f6af7d
