Rate Limiting + Performance for CWMS Data API #1722

The national CDA endpoint has become unresponsive at times, so we should discuss adding broader rate limiting. This is both a **security** concern and a **performance/reliability** concern.

From the security side, rate limiting helps reduce the blast radius of abusive clients, credential stuffing, accidental request storms, scraping, and basic DoS behavior. From the performance side, it protects shared Tomcat threads, Oracle connections, expensive CWMS package calls, and large response serialization from being consumed by a small number of clients or malformed query patterns.

## Current State

Based on the current source (as of 05/07/2026), CDA has a very small amount of application-level rate limiting.

Currently rate limited:

- `POST /ratings/rate-ts/{office}/{rating-id}`
- `POST /ratings/reverse-rate-ts/{office}/{rating-id}`
- `POST /ratings/reverse-rate-values/{office}/{rating-id}`
- `POST /ratings/rate-values/{office}/{rating-id}`

Current behavior (of `/ratings` above):

- Uses Javalin `NaiveRateLimit`.
- Defaults to `100` requests per minute.
- Configured by JVM system property: `cwms.dataapi.request.limit`. (The name is confusing if the limit is not global.)
- Uses in-memory counters, so limits are per JVM/pod.
- Uses client IP/method/path as the effective key unless Javalin’s key function is changed.
- Authorized users with the required route roles can bypass the current limiter.
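
To make the per-JVM behavior concrete, here is a minimal fixed-window counter keyed by client IP, method, and path. It is only a sketch of the general technique, not Javalin's `NaiveRateLimit` source; the class `FixedWindowLimiter`, its `key` helper, and the `|` key format are hypothetical names chosen for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory fixed-window limiter (illustration only, not Javalin code). */
class FixedWindowLimiter {
    private final int limitPerWindow;
    private final long windowMillis;
    // key -> request count for the current window only; lives in this JVM's heap
    private final Map<String, Integer> counts = new HashMap<>();
    private long currentWindow = -1;

    FixedWindowLimiter(int limitPerWindow, long windowMillis) {
        this.limitPerWindow = limitPerWindow;
        this.windowMillis = windowMillis;
    }

    /** Builds the effective key from client IP, HTTP method, and matched path. */
    static String key(String ip, String method, String path) {
        return ip + "|" + method + "|" + path;
    }

    /** Returns true if this request is within the limit for the window containing nowMillis. */
    synchronized boolean tryAcquire(String key, long nowMillis) {
        long window = nowMillis / windowMillis;
        if (window != currentWindow) {
            counts.clear();          // a new window started: forget all previous counts
            currentWindow = window;
        }
        return counts.merge(key, 1, Integer::sum) <= limitPerWindow;
    }
}
```

Because the map lives in JVM memory, N pods each running a limiter like this would allow up to N times the configured limit in aggregate, which is the main reason the in-memory approach is a weak fit for the national deployment.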

Important gap: most high-traffic public `GET` endpoints are not currently rate limited, including `/timeseries`, `/catalog`, location/catalog endpoints, and other unauthenticated read paths.

## Assumptions 

- We should rate limit **anonymous users**, **API key users**, and **OIDC/CAC users** differently.
- Limits should be configurable without code changes.
- Limits should support per-endpoint or endpoint-class policies, not just one global number.
- In production, limits probably need to be **distributed** across nodes/pods.
- The limiter key should use a trusted client IP from the edge/proxy, not unsanitized `X-Forwarded-For`.
- Expensive endpoints should have stricter limits than cheap metadata endpoints.
- Rate limiting should return `429 Too Many Requests`, ideally with `Retry-After` and/or standard rate limit headers.
- Rate limiting should be paired with max `page-size`, max time-window, max request body, max response size, and metrics/alerting.
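
For the trusted-client-IP assumption, one possible approach is to honor `X-Forwarded-For` only when the direct peer is a known proxy, and then walk the header from right to left until the first address that is not a known proxy. The sketch below assumes exact-match proxy addresses (no CIDR ranges); `ClientIpResolver` and its methods are hypothetical names. Tomcat's `RemoteIpValve` is an existing component that covers similar ground and may be preferable to custom code.

```java
import java.util.Set;

/** Hypothetical resolver for a limiter key; assumes proxies are listed by exact IP. */
class ClientIpResolver {
    private final Set<String> trustedProxies;

    ClientIpResolver(Set<String> trustedProxies) {
        this.trustedProxies = Set.copyOf(trustedProxies);
    }

    /**
     * @param remoteAddr    address of the direct TCP peer
     * @param xForwardedFor raw X-Forwarded-For header value, or null if absent
     */
    String resolve(String remoteAddr, String xForwardedFor) {
        // A header sent by an untrusted peer is client-controlled, so ignore it.
        if (!trustedProxies.contains(remoteAddr)
                || xForwardedFor == null || xForwardedFor.isBlank()) {
            return remoteAddr;
        }
        // Rightmost entries were appended by our own proxies; anything further left
        // than the first untrusted hop could have been forged by the client.
        String[] hops = xForwardedFor.split(",");
        for (int i = hops.length - 1; i >= 0; i--) {
            String hop = hops[i].trim();
            if (!hop.isEmpty() && !trustedProxies.contains(hop)) {
                return hop;
            }
        }
        return remoteAddr; // header contained only trusted proxies
    }
}
```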

## Candidate Approaches / Libraries

| Option | Fits Current Javalin/Tomcat Structure | Per Route | Per IP | Per User/API Key | Distributed | 429 Handling | Headers | Metrics Friendly | Notes |
|---|---:|---:|---:|---:|---:|---:|---:|---:|---|
| Javalin `NaiveRateLimit` / `RateLimitPlugin` | [x] | [x] | [x] | [x] with custom key | [ ] | [x] | [ ] manual | [ ] manual | Easiest incremental option, but explicitly basic and in-memory. Better for local/simple protection than national production. |
| Apache Tomcat `RateLimitFilter` | [x] | [~] via filter mapping | [x] | [ ] | [ ] | [x] | [x] | [ ] manual | Very low app-code impact. Good baseline IP protection at servlet layer. Needs trusted proxy/IP handling. |
| Bucket4j | [x] | [x] | [x] | [x] | [x] with Redis/JCache/JDBC/etc. | [x] manual | [x] manual | [x] manual/Micrometer possible | Strong Java option for token-bucket policies. Probably the best app-level fit if we want distributed limits and custom keys. |
| Resilience4j RateLimiter | [x] | [x] | [ ] needs wrapping/key map | [ ] needs wrapping/key map | [ ] | [x] manual | [ ] manual | [x] Micrometer support | Good resilience library, but its limiter is more naturally per protected operation than per arbitrary HTTP client key. |
| Guava `RateLimiter` | [x] | [x] | [ ] needs custom map | [ ] needs custom map | [ ] | [x] manual | [ ] manual | [ ] manual | Simple local throttling primitive. Useful internally, less suitable as the main public API limiter. |
| Redisson `RRateLimiter` | [x] | [x] | [x] | [x] | [x] Redis-backed | [x] manual | [x] manual | [ ] manual | Good if Redis is acceptable infrastructure. More direct Redis dependency than Bucket4j. |
| Traefik RateLimit middleware | [x] at edge/proxy | [x] | [x] | [~] header-based | [~] depends deployment/features | [x] | [~] | [x] via proxy metrics | Good first line of defense before traffic reaches Tomcat. Should not be the only application-aware limiter. |

References:
- Javalin rate limiting docs: https://javalin.io/documentation
- Bucket4j: https://github.com/bucket4j/bucket4j
- Resilience4j RateLimiter: https://resilience4j.readme.io/docs/ratelimiter
- Tomcat Rate Limit Filter: https://tomcat.apache.org/tomcat-9.0-doc/config/filter.html
- Guava RateLimiter: https://guava.dev/releases/snapshot-jre/api/docs/com/google/common/util/concurrent/RateLimiter.html
- Redisson `RRateLimiter`: https://javadoc.io/doc/org.redisson/redisson/latest/org/redisson/api/RRateLimiter.html
- Traefik RateLimit middleware: https://doc.traefik.io/traefik/middlewares/http/ratelimit/

## Possible Policy Model

A starting policy could look something like this:

| Client Type | Suggested Starting Limit | Notes |
|---|---:|---|
| Anonymous public users | 60 requests/minute/IP, burst 120 | Good baseline for browsers, scripts, and casual users. Forces high-volume users toward API keys. |
| API key users | 300 requests/minute/key, burst 600 | Supports legitimate automated use while making ownership and abuse tracing easier. |
| Authenticated OIDC/CAC users | 600 requests/minute/user, burst 1,200 | Higher trust, but still not unlimited because DB/API capacity is shared. |
| Internal/trusted service accounts | 1,500 requests/minute/account, burst 3,000 | Only for explicitly approved service identities with monitoring and contact owner. OPTIONALLY: setup via CWMS roles/auth? |

*Burst*: Helps avoid punishing normal behavior like a web page loading several resources at once, while still limiting sustained high-volume traffic. For example, a client that has been idle can make up to 120 requests quickly, but over time it still averages around 60 requests per minute.
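
The burst behavior described here is what a token bucket provides: the bucket capacity is the burst size and the refill rate is the sustained limit. The following is a minimal single-JVM sketch using the anonymous numbers from the table (60 per minute, burst 120); the `TokenBucket` class is hypothetical and is not the API of Bucket4j or any other library listed above. Time is passed in explicitly so the behavior is easy to test.

```java
/** Hypothetical token bucket: capacity = burst size, refill rate = sustained limit. */
class TokenBucket {
    private final double capacity;
    private final double refillPerMilli;
    private double tokens;
    private long lastRefillMillis;

    TokenBucket(long capacity, long refillPerMinute, long nowMillis) {
        this.capacity = capacity;
        this.refillPerMilli = refillPerMinute / 60_000.0;
        this.tokens = capacity;            // an idle client starts with a full burst allowance
        this.lastRefillMillis = nowMillis;
    }

    /** Consumes one token if available; returns false when the caller should get a 429. */
    synchronized boolean tryConsume(long nowMillis) {
        long elapsed = Math.max(0, nowMillis - lastRefillMillis);
        // Refill in proportion to elapsed time, never above the burst capacity.
        tokens = Math.min(capacity, tokens + elapsed * refillPerMilli);
        lastRefillMillis = Math.max(lastRefillMillis, nowMillis);
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }
}
```

With `new TokenBucket(120, 60, now)`, an idle client can issue 120 requests back to back, after which requests succeed at roughly one per second until the client pauses long enough for the bucket to refill.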

Endpoint classes could also have different weights. Some examples:

| Endpoint Class | Example | Suggested Starting Limit |
|---|---|---:|
| Cheap metadata | `/offices`, `/parameters`, static lookup data | 300 requests/minute/IP anonymous; 1,000 requests/minute authenticated |
| Catalog queries | `/catalog`, location catalog, TS catalog | 60 requests/minute anonymous; 300 requests/minute API key/authenticated |
| Time series reads | `/timeseries` | 30 requests/minute anonymous; 180 requests/minute API key; 300 requests/minute authenticated |
| Rating calculations | `/ratings/rate-*`, `/ratings/reverse-rate-*` | Keep current 100 requests/minute anonymous; 300 requests/minute authenticated/API key |
| Writes | `POST`, `PATCH`, `DELETE` | 60 requests/minute authenticated user/API key |
| Auth/key endpoints | `/auth/*`, API key creation/deletion | 10 requests/minute/IP and 30 requests/hour/user |
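
One way to express the read-side rows of this table in code is a lookup keyed by endpoint class and client type. The sketch below is an assumption-laden illustration: the enums, the `RatePolicy` class, and the prefix-based `classify` method are hypothetical, paths are taken relative to the API root, and the write and auth rows are left out for brevity. Combinations the table does not specify (for example API key users on cheap metadata) are deliberately left empty rather than guessed.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.Optional;
import java.util.OptionalInt;

enum ClientType { ANONYMOUS, API_KEY, AUTHENTICATED }

enum EndpointClass { METADATA, CATALOG, TIMESERIES, RATING }

/** Hypothetical requests-per-minute policy table mirroring the suggested starting limits. */
class RatePolicy {
    private final Map<EndpointClass, Map<ClientType, Integer>> perMinute =
            new EnumMap<>(EndpointClass.class);

    RatePolicy set(EndpointClass ec, ClientType ct, int limit) {
        perMinute.computeIfAbsent(ec, k -> new EnumMap<>(ClientType.class)).put(ct, limit);
        return this;
    }

    /** Empty when the table above does not define a limit for this combination. */
    OptionalInt limitFor(EndpointClass ec, ClientType ct) {
        Integer limit = perMinute.getOrDefault(ec, Map.of()).get(ct);
        return limit == null ? OptionalInt.empty() : OptionalInt.of(limit);
    }

    static RatePolicy startingPoint() {
        return new RatePolicy()
                .set(EndpointClass.METADATA, ClientType.ANONYMOUS, 300)
                .set(EndpointClass.METADATA, ClientType.AUTHENTICATED, 1_000)
                .set(EndpointClass.CATALOG, ClientType.ANONYMOUS, 60)
                .set(EndpointClass.CATALOG, ClientType.API_KEY, 300)
                .set(EndpointClass.CATALOG, ClientType.AUTHENTICATED, 300)
                .set(EndpointClass.TIMESERIES, ClientType.ANONYMOUS, 30)
                .set(EndpointClass.TIMESERIES, ClientType.API_KEY, 180)
                .set(EndpointClass.TIMESERIES, ClientType.AUTHENTICATED, 300)
                .set(EndpointClass.RATING, ClientType.ANONYMOUS, 100)
                .set(EndpointClass.RATING, ClientType.API_KEY, 300)
                .set(EndpointClass.RATING, ClientType.AUTHENTICATED, 300);
    }

    /** Naive prefix matching (an assumption); real routing would use the matched route. */
    static Optional<EndpointClass> classify(String path) {
        if (path.startsWith("/offices") || path.startsWith("/parameters")) {
            return Optional.of(EndpointClass.METADATA);
        }
        if (path.startsWith("/catalog")) return Optional.of(EndpointClass.CATALOG);
        if (path.startsWith("/timeseries")) return Optional.of(EndpointClass.TIMESERIES);
        if (path.startsWith("/ratings/rate-") || path.startsWith("/ratings/reverse-rate-")) {
            return Optional.of(EndpointClass.RATING);
        }
        return Optional.empty();
    }
}
```

In a real deployment the numbers would come from configuration rather than code, in line with the assumption that limits should be changeable without a release.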


## Other Mitigations To Discuss

Rate limiting alone probably will not be enough. Related work items:

- Add global max `page-size` validation.
- Reject or cap `page-size=0` where it means unlimited.
- Add maximum time-window limits for time series requests.
  - What about cached (eventually) TS and POR or Composite TS?
  - Should 5min data allow for unlimited time window?
- Add maximum number of time series names per request.
- Add maximum request body sizes by endpoint, not only Tomcat `maxPostSize`.
- Ensure response size or row-count caps for expensive formats.
- [Optional] Add request cost metrics by endpoint, office, authenticated principal/API key, response status, and duration.
- Add `429` response body and headers consistently.
- [Optional] Add dashboards for top clients, top endpoints, 429 counts, DB pool wait time, and slow queries.
- [Optional] Edge/WAF limits as the first layer and app-level limits as the second layer.
- Document rate limits publicly so legitimate users know how to behave.
  - Should the `429` response also include a URL to the CDA Read the Docs page on rate limiting, plus the point of contact below?
- Provide a contact/escalation path for users who need higher limits.
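
As an illustration of the first two items, `page-size` handling could be reduced to a single clamping function. This is a sketch under stated assumptions: `PageSizePolicy` is a hypothetical name, the default and maximum values are placeholders passed in by the caller, and treating `0` as a request for "unlimited" only applies to endpoints where that is the current meaning.

```java
/** Hypothetical page-size clamp; default and max are supplied by configuration. */
class PageSizePolicy {
    /**
     * @param requested value of the page-size query parameter, or null if absent
     * @return the page size the server will actually use, never above maxSize
     */
    static int effectivePageSize(Integer requested, int defaultSize, int maxSize) {
        if (requested == null) {
            return Math.min(defaultSize, maxSize);
        }
        if (requested < 0) {
            throw new IllegalArgumentException("page-size must not be negative");
        }
        if (requested == 0) {
            return maxSize; // "unlimited" is capped rather than honored
        }
        return Math.min(requested, maxSize);
    }
}
```

Capping silently keeps existing clients working, while rejecting with a `400` makes the limit more visible; which is better is part of the discussion above.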

## Open Questions

- Should rate limits be enforced at the edge, in CDA, or both?
- What should be the initial limits for anonymous, API key, and authenticated users? (See the examples above.)
- Should authenticated users ever fully bypass limits?
- Which endpoints are most expensive today based on production metrics?
- Do we have or want Redis/JCache/JDBC infrastructure for distributed counters?
- Should limits be configured in properties, feature flags, database tables, or deployment config?
- What headers should CDA return for rate-limited responses?
- How do we avoid breaking legitimate automated users of the national API?
- Should we introduce user-facing API tiers or require API keys for high-volume use?

## TODO

- Initial endpoint scope.
- Selected library/architecture.
- Rate-limit key strategy.
- Default anonymous/authenticated/API-key limits.
- Required metrics and dashboards (in Grafana?).
- Rollout plan with logging-only/dry-run mode before enforcement.
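
For the dry-run item, one possible shape is a small gate between the limiter's decision and the HTTP response: in dry-run mode it records what would have been blocked but lets the request through. The `RolloutGate` class and its `Mode` values are hypothetical, and the in-memory list stands in for whatever logging or metrics backend is chosen.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical rollout gate separating "limiter said no" from "client gets a 429". */
class RolloutGate {
    enum Mode { OFF, DRY_RUN, ENFORCE }

    private final Mode mode;
    // Stand-in for a logger or metrics counter of over-limit requests.
    private final List<String> overLimitKeys = new ArrayList<>();

    RolloutGate(Mode mode) {
        this.mode = mode;
    }

    /** Returns true only when the request should actually be rejected with a 429. */
    synchronized boolean shouldReject(String key, boolean withinLimit) {
        if (mode == Mode.OFF || withinLimit) {
            return false;
        }
        overLimitKeys.add(key);          // recorded in both DRY_RUN and ENFORCE
        return mode == Mode.ENFORCE;
    }

    synchronized List<String> overLimitKeys() {
        return List.copyOf(overLimitKeys);
    }
}
```

Running in `DRY_RUN` against production traffic for a while would show which clients the proposed limits would have affected before anything is enforced.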

