Caches data in an LRU cache and refreshes it at a specified interval.
A data cache that keeps the logic to fetch, replace, and clear items in one place, so it is simple to use.
The API is kept minimal; methods are added only when there is a clear use case.
lru-cache's fetchMethod + allowStale is pull-based: an entry is only refreshed when a request happens to hit it after it goes stale. The first requester after expiry still pays the backend latency (stale-while-revalidate serves them stale data, not fresh). Refresh work is coupled to, and triggered by, request traffic.
refreshed-cache is push-based: asyncRefresh() re-fetches the entire working set on a timer (refreshAge) or at a wall-clock time (refreshAt), independent of request traffic. Consequences that raw lru-cache cannot replicate without you building it:
- Zero cache-miss penalty on hot data. The working set is already fresh before requests arrive, so reads stay <0.1 ms with no per-key revalidation stall.
- Bounded, predictable backend load. With passRecentKeysOnRefresh + "active-only" refresh, the entire hot set is refreshed in a handful of queries per interval (e.g. ~601 queries vs. ~51,000 for lazy), regardless of read QPS. Backend load is a function of cache size and interval, not traffic.
- Time-aligned freshness. refreshAt: {days, at} supports "rebuild the cache at 02:00 daily", a first-class need for reference data, pricing tables, feature flags, and config that lru-cache has no concept of.
- Encapsulated data provider. Fetch logic (fetch, fetchByKey, fetchByKeys) lives with the cache config, not scattered across call sites.
| | raw lru-cache (fetchMethod + allowStale) | refreshed-cache |
|---|---|---|
| Refresh model | Pull — on access, after stale | Push — on timer / wall-clock, ahead of access |
| Who pays refresh latency | First requester after expiry | Nobody (background loop does it) |
| Backend load scales with | Read traffic (QPS) | Cache size × refresh interval |
| Refresh whole hot set in one query | ✗ (per-key) | ✓ (fetch / fetchByKeys) |
| "Rebuild at 02:00 daily" | ✗ | ✓ (refreshAt: {days, at}) |
| Single-flight coalescing | ✓ (built in) | ✓ (getOrFetch) |
| Negative caching of misses | ✗ (you build it) | ✓ (maxMiss / maxAgeMiss) |
refreshed-cache is for read-heavy workloads over a bounded, slowly-changing dataset (reference/config/catalog data) where you want the hot set kept fresh proactively on a schedule, so that no request ever pays refresh latency, rather than lazily revalidated on access as with raw lru-cache.
Raw lru-cache is the better choice when:
- Your dataset is unbounded or high-cardinality and you only ever want lazy, per-key population → use lru-cache.fetch().
- You're happy serving stale-while-revalidate and don't need scheduled/time-aligned freshness.
- You need only coalescing or batching: lru-cache's own fetch() already coalesces in-flight requests, and dataloader already batches. refreshed-cache's edge is that coalescing, batching, negative caching, and scheduled push refresh share one store and one config, not that any single one of those is novel.
In short: pick refreshed-cache when "keep the hot set warm on a schedule" is the requirement; pick raw lru-cache when "populate lazily on demand" is enough.
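The push model can be pictured with a minimal, self-contained sketch. It is not refreshed-cache's implementation: the PushRefreshedMap class, its loader argument, and the synchronous loader are assumptions made for brevity (the real fetch function may be async).

```javascript
// Hypothetical sketch of push-based refresh: the whole working set is
// rebuilt on a timer, independent of read traffic. Not the library's code.
class PushRefreshedMap {
  constructor(loader) {
    this.loader = loader; // returns an iterable of [key, value] pairs
    this.store = new Map();
    this.timer = null;
  }

  // Rebuild the working set from the loader in one pass.
  refresh() {
    this.store = new Map(this.loader());
  }

  // Load once, then refresh every intervalMs regardless of reads.
  start(intervalMs) {
    this.refresh();
    this.timer = setInterval(() => this.refresh(), intervalMs);
    this.timer.unref(); // do not keep the process alive just for the timer
  }

  stop() {
    clearInterval(this.timer);
    this.timer = null;
  }

  // Reads never trigger a backend call; they only see the last refresh.
  get(key) {
    return this.store.get(key);
  }
}

let version = 0;
const pushed = new PushRefreshedMap(() => [["price", ++version]]);
pushed.refresh();
const firstRead = pushed.get("price"); // value from the first refresh
pushed.refresh(); // what the timer would do in the background
const secondRead = pushed.get("price"); // value from the second refresh
```

In this sketch the reader never waits on the loader: the second read already sees the refreshed value, which is the property the comparison above attributes to the push model.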
npm install refreshed-cache --save

const Cache = require("refreshed-cache");
const options = { max: 500, maxAge: 1200, refreshAge: 600 };
// fetch returns an iterable of [key, value] pairs
const fetch = () => Object.entries({ a: 1, b: 2, c: 3 });
const cache = new Cache(fetch, options);
// The await calls below must run inside an async function (or an ES module).
await cache.init();
cache.get("a"); // 1
await cache.close(); // clear the cache and stop refreshing

The cache constructor requires a fetch function and accepts an optional options object.
fetch is a function or async function that returns an iterator or async iterator of [key, value] pairs, one per item, e.g. Map.entries(), Object.entries(object), or the result of an async generator function.
If options.passRecentKeysOnRefresh is true, the recently used keys (an Array) are passed to this function on each refresh.
Note: the order of the entries matters. If max is smaller than the number of entries, only the first max items are loaded into the cache. Items must therefore be sorted by priority, with the most important one first.
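The ordering rule can be illustrated without the library. In the sketch below, loadFirst is a hypothetical stand-in for the cache's loading step, and the priority field is an assumption made for the example.

```javascript
// Hypothetical illustration: when max is smaller than the number of
// entries, only the first `max` pairs yielded by fetch end up cached,
// so fetch should yield the most important items first.
const items = [
  { key: "c", value: 3, priority: 1 },
  { key: "a", value: 1, priority: 9 },
  { key: "b", value: 2, priority: 5 },
];

// A fetch function that yields [key, value] pairs, highest priority first.
function* fetchByPriority() {
  const sorted = [...items].sort((x, y) => y.priority - x.priority);
  for (const item of sorted) {
    yield [item.key, item.value];
  }
}

// Stand-in for the cache's loading step: keep at most `max` entries.
function loadFirst(entries, max) {
  const loaded = new Map();
  for (const [key, value] of entries) {
    if (loaded.size >= max) break;
    loaded.set(key, value);
  }
  return loaded;
}

const loaded = loadFirst(fetchByPriority(), 2);
const loadedKeys = [...loaded.keys()]; // "a" and "b"; "c" is left out
```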
- max: The maximum size of the cache. Setting it to 0 means no data is cached. Default is 10000.
- maxAge: Maximum age in seconds. Expired items are removed every refreshAge. Setting it to 0 disables TTL expiry (items live until evicted by LRU pressure). Default is 600 seconds.
- refreshAge: Refresh interval in seconds. New data is fetched on each refresh, and expired items are pruned on each refresh, but only once the first item of that refresh has been retrieved successfully. Default is maxAge. If refreshAt is also specified, refreshAt is used and refreshAge is ignored.
- refreshAt: Refresh at a specific time of day every x days, given as an object in the format {days, at}, e.g. {days: 2, at: "10:00:00"}. days: x refreshes every x days (x must be 1-14); at: "HH:mm:ss" is the time of day to refresh the data.
- passRecentKeysOnRefresh: If true, the recent keys (an Array) that have not expired are passed to the fetch function on refresh. This is useful when you want to refresh only the recently used keys. Default is false.
- resetOnRefresh: If true, the cache is reset on every refresh, once the first item of that refresh has been retrieved successfully, so that only the newly fetched data is cached. Default is true.
- fetchByKey: A function or async function used to fetch a value by key and store it in the cache. fetchByKey must return the value (null counts as a value) and return undefined when no data is found.
- maxMiss: If fetchByKey or fetchByKeys is set, this is the maximum size of the miss-cache (a bounded sidecar LRU for non-existent keys). Setting it to 0 disables the miss-cache entirely: repeated lookups for non-existent keys will always call the fetch function. Default is 2000.
- maxAgeMiss: If fetchByKey or fetchByKeys is set, this is the maximum age of a miss-cache entry in seconds. Setting it to 0 means miss entries never expire by age (they are only evicted by LRU pressure when the miss-cache reaches maxMiss). Default is refreshAge.
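To make the refreshAt format concrete, here is a small validation sketch based only on the rules stated above (days from 1 to 14, at in "HH:mm:ss"). The function name, the error messages, and the requirement that days be an integer are assumptions; the library's own validation may differ.

```javascript
// Hypothetical validator for a refreshAt option of the form { days, at }.
// Rules taken from the option description: days is 1-14, at is "HH:mm:ss".
// Requiring an integer for days is an assumption of this sketch.
function parseRefreshAt(refreshAt) {
  const { days, at } = refreshAt;
  if (!Number.isInteger(days) || days < 1 || days > 14) {
    throw new RangeError("refreshAt.days must be an integer from 1 to 14");
  }
  const match = /^([01]\d|2[0-3]):([0-5]\d):([0-5]\d)$/.exec(at);
  if (!match) {
    throw new RangeError('refreshAt.at must be a time in "HH:mm:ss" format');
  }
  const [hours, minutes, seconds] = match.slice(1).map(Number);
  return { days, hours, minutes, seconds };
}

const parsed = parseRefreshAt({ days: 2, at: "10:00:00" });
// parsed is { days: 2, hours: 10, minutes: 0, seconds: 0 }
```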
- async init(): Call this function to initialize the cache with the data from the fetch function and start the refresh cycle. It throws if fetch throws.
- get(key) => value: Get the cached value for key. Returns undefined if the key is not cached. This updates the "recently used"-ness of the key. The key and value can be of any type, but an object used as a key matches only that same object.
- set(key, value): Set the cached value for key. This updates the "recently used"-ness of the key. The key and value can be of any type, but an object used as a key matches only that same object.
- delete(key): Delete the cached value for key.
- clear(): Clear all cached data.
- entries(): Return a generator yielding [key, value] pairs.
- asyncRefresh(): Asynchronously refresh the data using the fetch function. The cache is reset if and only if the resetOnRefresh option is true; otherwise unexpired values are kept.
- async getOrFetch(key) => value: Get the cached value for key; if it is not found, try to fetch the item using fetchByKey. Returns undefined if the item is not found. If fetchByKey throws, this throws as well.
- has(key) => boolean: Check whether the key is in the cache; returns true if it is. This does not update the "recently used"-ness of the key and does not remove an expired key.
- async close(): Clear the cache entirely, throwing away all values, and stop refreshing.
- size: Return the total number of items currently in the cache. Note that expired items are included in this count.
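The note about object keys matches ordinary reference equality. The sketch below uses a native Map as a stand-in, on the assumption that the cache compares keys the same way, as the description of get and set suggests.

```javascript
// Stand-in using a native Map (an assumption: the cache is described as
// requiring "the same object", which matches Map's reference equality).
const store = new Map();

const keyObject = { id: 1 };
store.set(keyObject, "value for id 1");

const sameReference = store.get(keyObject); // found: the very same object
const lookAlike = store.get({ id: 1 }); // undefined: a different object

// One possible workaround: derive a stable string key from the object.
const stringKeyed = new Map();
stringKeyed.set(JSON.stringify({ id: 1 }), "value for id 1");
const byString = stringKeyed.get(JSON.stringify({ id: 1 })); // found
```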
const fs = require('fs');
const parse = require('csv-parse');
async function* readCSVByLine() {
const readFileStream = fs.createReadStream(__dirname + "/keyword.csv");
const csvParser = parse({});
readFileStream.pipe(csvParser)
for await (const record of csvParser) {
yield record;
}
await readFileStream.destroy(); // destroy the read stream once iteration ends
}
const cache = new (require("refreshed-cache"))(readCSVByLine);
await cache.init();
cache.get("aa");//
await cache.close();

The code above reads the content of the CSV file into the cache: the first column supplies the keys and the second column the values. The cache is refreshed with the updated content of the CSV file every 600 seconds (the default).
This example reads only the first 4 lines of a large CSV file, because the cache max is only 4.
const fs = require('fs');
const parse = require('csv-parse');
async function* readCSV4Lines() {
const readFileStream = fs.createReadStream(__dirname + "/large.csv");
const csvParser = parse({});
readFileStream.pipe(csvParser)
let i = 0;
for await (const record of csvParser) {
yield record;
i++;
if (i >= 4) break;
}
await readFileStream.destroy();
}
const cache = new (require("refreshed-cache"))(readCSV4Lines,{max:4});
await cache.init();
cache.get("aa");//
await cache.close();

const got = require('got');
const parse = require('csv-parse');
const max = 10;
async function* readCSVMaxLinesOnWeb() {
const csvWebStream = got.stream("https://raw.githubusercontent.com/songpr/refreshed-cache/main/test/1000000.csv");
const csvParser = parse({});
csvWebStream.pipe(csvParser)
let i = 0;
for await (const record of csvParser) {
yield record;
i++;
if (i == max) break
}
await csvWebStream.destroy();
}
const cache = new (require("refreshed-cache"))(readCSVMaxLinesOnWeb,{max});
await cache.init();
console.log(cache.get("cpPG"))//"MnelEaBbPP"
console.log(cache.get("HClmlnlM"))//"I"
console.log(cache.get("IFOBOfEOpLcJKnH"))//'PNaj'
await cache.close();

Latest coverage from npm test -- --coverage:
- Statements: 99.18%
- Branches: 100.00%
- Functions: 95.34%
- Lines: 100.00%
Core file coverage (index.js) matches the overall values above.
Latest full test run (npm test) results:
- Test suites: 16 passed, 1 skipped, 17 total
- Tests: 79 passed, 1 skipped, 80 total
Roadmap tests are intentionally skipped by default and can be enabled via environment variable.
For detailed setups, scripts, and cost analyses, please refer to the Benchmark README.
Comparing Direct Prepared Statements (No Cache) against the Cache across different sizes:
| Scenario | Cache Size | Avg DB Ops/sec | Avg Cache Ops/sec | DB Queries (Cache) | Speedup | Correctness |
|---|---|---|---|---|---|---|
| Small Cache (1% coverage) | 10,000 | ~17,600 | ~48,600 | 20,800 | 2.76x | ✅ PASSED |
| Medium Cache (10% coverage) | 100,000 | ~15,600 | ~42,800 | 16,200 | 2.74x | ✅ PASSED |
| Large Cache (50% coverage) | 500,000 | ~15,100 | ~37,600 | 15,300 | 2.49x | ✅ PASSED |
Under shifting workloads (sliding window pool of 120,000 keys) with a strict limit of max: 100000 keys:
- Active-Only Refresh Cache (Strategy C): Achieved the same 95% hit rate as standard caching strategies, but reduced database query traffic by over 90x (from 50,000+ lookups to under 601), keeping heap growth minimal (~4 MB).
Under high concurrent single-key miss storms, standard cache-miss strategies run into database connection pool bottlenecks:
- Connection Pool Saturation: Firing individual single-key fetches (cache.getOrFetch(key)) for cache misses under high concurrency saturates the Postgres client connection pool.
- Latency Alignment: Due to socket queueing delays, standard caching latencies align with the direct prepared statement baseline (~240 ms p99).
By implementing Single-flight Promise Coalescing and Bulk Batch Loading, the queueing bottleneck is completely resolved:
- Throughput Boost: Scales throughput by over 4x (from ~5,500 rps with old caching architecture to ~24,500 rps).
- Latency ROI: Drops tail latency (p99) from ~285 ms to ~31 ms under high stress.
- DB Query Reduction: Cuts total database queries triggered in half (e.g., from 103,837 queries to 51,514), protecting the database from thundering herd storms.
- Peak Memory Optimization: Reduces peak heap memory usage by over 4x (from ~330 MB to ~75 MB) by eliminating microtask delay wrappers (replacing deferred fetches with direct async IIFEs) and switching from async for await iteration to standard synchronous for loops for synchronous database result arrays.
- Why C aligns with DB baseline: In load test C, the active sliding window (120,000 keys) is wider than the cache capacity (100,000 keys). This forces constant evictions and triggers 56,000 to 65,000 individual DB queries. Because these are executed key-by-key, they saturate the Postgres client pool, causing queueing delays that affect both cache misses and direct prepared statements.
- How D resolves the bottleneck: Single-flight Promise Coalescing merges concurrent duplicate reads targeting the same hot keys into a single database query. Meanwhile, Bulk Batch Loading groups batch requests into a single SQL statement (WHERE uuid IN (...)). By eliminating redundant database roundtrips, it prevents connection pool saturation, dropping p99 tail latency by 90% (to ~31 ms) and scaling throughput to ~24,500 rps.
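Single-flight promise coalescing, as described above, can be sketched in a few lines. This is a generic illustration of the technique, not the library's code: createSingleFlight and loadByKey are hypothetical names. The loader is invoked through a direct async IIFE, in the spirit of the memory note above.

```javascript
// Hypothetical sketch of single-flight promise coalescing; the names
// createSingleFlight and loadByKey are illustrative, not library API.
function createSingleFlight(loadByKey) {
  const inFlight = new Map(); // key -> pending Promise

  return function get(key) {
    const pending = inFlight.get(key);
    if (pending) return pending; // join the request already in progress

    // Direct async IIFE: the loader starts immediately, and a synchronous
    // throw inside it becomes a rejected promise.
    const promise = (async () => loadByKey(key))()
      .finally(() => inFlight.delete(key)); // allow a fresh load afterwards
    inFlight.set(key, promise);
    return promise;
  };
}

let dbCalls = 0;
const getProduct = createSingleFlight(async (id) => {
  dbCalls += 1; // stands in for one database query
  return { id };
});

const first = getProduct("sku-1");
const second = getProduct("sku-1"); // coalesced with the first call
const other = getProduct("sku-2"); // different key, separate load
// dbCalls is 2 here: one load for sku-1 and one for sku-2
```

Callers that arrive while a load is pending receive the same promise, so a burst of identical misses costs one backend query per key.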
Run the standard suite:
npm test

Run with coverage report:
npm test -- --coverage

Run open-handle diagnostics (serial mode):
npm test -- --detectOpenHandles --runInBand

Run roadmap/future-feature tests explicitly:
RUN_ROADMAP_TESTS=true npm test -- test/tdd_roadmap.test.js

- Low Memory Footprint: Evaluated at ~305.5 bytes per cache item (storing realistic string values), keeping RAM usage highly predictable.
- High-Load Stability: Successfully soak tested for over 2.5 million operations in a 5-minute high load sequence (concurrent reads, writes, manual evictions, and background refresh intervals) with 0% error rate and stable heap growth.
To clean up deprecated, duplicate, and sub-optimal methods in the cache API, version 1.8.0 removes the following methods:
- del(key) (Alias Removed): Use the standard delete(key) method instead.
- find(findFunction) (Linear Lookup Removed): Linear $O(N)$ searches over in-memory caches bypass the speed advantages of LRU maps and introduce performance overhead. If you need to search cached items, iterate via the native generator cache.entries() instead.
While caching strategies (like Promise Coalescing and Batching) are not unique to refreshed-cache and exist in other tools (e.g. lru-cache's native .fetch() API, or dataloader), version 1.8.0 wraps them natively within its scheduled refresh and miss-cache structures to make them easy to use.
These patterns demonstrate how to configure and utilize these features effectively:
If your app experiences spikes of duplicate requests targeting the same hot keys (e.g., flash sales, breaking news), configuring fetchByKey automatically coalesces concurrent misses into a single database query.
const Cache = require("refreshed-cache");
const cache = new Cache(
async () => [], // Base loader (optional for purely lazy setups)
{
max: 100000,
maxAge: 300,
fetchByKey: async (id) => {
// Multiple concurrent calls for the same ID will coalesce here.
// Only ONE database query is executed; others share the same returned Promise.
return await db.query("SELECT * FROM products WHERE id = $1", [id]);
}
}
);
// Usage in express router
app.get("/product/:id", async (req, res) => {
const product = await cache.getOrFetch(req.params.id);
res.json(product);
});

When loading dashboard widgets, lists, or feeds that query multiple related entities, use fetchByKeys and cache.getOrFetchMany(keys). This groups all missing keys and fetches them in a single batch statement (e.g. WHERE id IN (...)) rather than iterating key-by-key.
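The grouping step can be pictured with a generic sketch on top of a plain Map. createBatchGetter and loadByKeys are hypothetical names, and the array returned by getMany is an assumption of the sketch, since the return shape of getOrFetchMany is not described here.

```javascript
// Hypothetical sketch of bulk batch loading on top of a plain Map.
// createBatchGetter and loadByKeys are illustrative names, not library API.
function createBatchGetter(store, loadByKeys) {
  return async function getMany(keys) {
    // Collect the keys that are not cached yet, without duplicates.
    const missing = [...new Set(keys)].filter((key) => !store.has(key));

    if (missing.length > 0) {
      // One backend round trip for all missing keys.
      for (const [key, value] of await loadByKeys(missing)) {
        store.set(key, value);
      }
    }
    // Keys the backend did not return map to undefined.
    return keys.map((key) => store.get(key));
  };
}

const userStore = new Map([[1, { id: 1, name: "Ann" }]]);
const batches = []; // records the key list of every backend call
const getUsers = createBatchGetter(userStore, async (ids) => {
  batches.push(ids);
  return ids.map((id) => [id, { id, name: `user-${id}` }]);
});

const usersPromise = getUsers([1, 5, 8, 5]);
// batches now holds a single call, for the missing keys 5 and 8
```

With refreshed-cache itself, the corresponding configuration is: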
const cache = new Cache(
async () => [],
{
max: 100000,
maxAge: 300,
// Batch fetcher for missing keys
fetchByKeys: async (ids) => {
// Query database once for all missing keys
const rows = await db.query("SELECT id, name FROM users WHERE id = ANY($1)", [ids]);
return rows.map(r => [r.id, r]); // Return iterable [key, value] pairs
}
}
);
// Usage in express router
app.get("/users/bulk", async (req, res) => {
const userIds = req.query.ids.split(","); // e.g. ["1", "5", "8", "12"] (strings, not numbers)
const users = await cache.getOrFetchMany(userIds);
res.json(users);
});

When your database contains millions of records (e.g., 10M or 100M rows), caching the entire dataset in-process is impossible. Use the Active-Only Refresh strategy. It regularly refreshes only the keys that have been read since the last refresh interval, keeping the hot set warm while bounding memory usage.
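The idea behind active-only refresh can be sketched independently of the library: remember which keys were read during the current interval and reload only those. ActiveOnlyCache and loadKeys are hypothetical names, and the loader is synchronous for brevity; this is not a description of the library's internals.

```javascript
// Hypothetical sketch of "active-only" refresh: track the keys read since
// the last refresh and reload only those. Synchronous loader for brevity.
class ActiveOnlyCache {
  constructor(loadKeys) {
    this.loadKeys = loadKeys; // (keys: Array) => iterable of [key, value]
    this.store = new Map();
    this.recentKeys = new Set(); // keys read since the last refresh
  }

  set(key, value) {
    this.store.set(key, value);
  }

  get(key) {
    if (this.store.has(key)) this.recentKeys.add(key);
    return this.store.get(key);
  }

  // Reload only the keys that were read, then start a new tracking window.
  refresh() {
    const keys = [...this.recentKeys];
    this.recentKeys.clear();
    if (keys.length === 0) return keys;
    for (const [key, value] of this.loadKeys(keys)) {
      this.store.set(key, value);
    }
    return keys;
  }
}

const profiles = new ActiveOnlyCache((keys) => keys.map((k) => [k, `fresh-${k}`]));
profiles.set("u1", "old-u1");
profiles.set("u2", "old-u2");
profiles.get("u1"); // only u1 is read during this interval
const refreshedKeys = profiles.refresh(); // only u1 is reloaded
```

With refreshed-cache itself, the corresponding configuration is: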
const cache = new Cache(
async (recentKeys) => {
// recentKeys lists only keys accessed since the last refresh cycle
if (!recentKeys || recentKeys.length === 0) return [];
const rows = await db.query("SELECT id, data FROM profiles WHERE id = ANY($1)", [recentKeys]);
return rows.map(r => [r.id, r.data]);
},
{
max: 100000,
maxAge: 600,
refreshAge: 300,
resetOnRefresh: false, // Keep existing unexpired items
passRecentKeysOnRefresh: true // Pass active keys list to the loader function
}
);

When clients query non-existent keys (e.g. product-non-existent-999), a cache miss normally forces a database query. A flood of non-existent queries can take down your database (Cache Penetration Attack).
Configure maxMiss and maxAgeMiss to track non-existent keys in a separate bounded miss-cache, preventing database lookup spam.
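A bounded negative cache can be sketched as follows. maxMiss and maxAgeMiss mirror the option names, but the MissCache class, the injectable clock, and eviction by insertion order (a simplification of LRU) are assumptions made for illustration.

```javascript
// Hypothetical sketch of a bounded negative cache ("miss-cache").
// Eviction is by insertion order here, a simplification of LRU.
class MissCache {
  constructor({ maxMiss, maxAgeMiss, now = () => Date.now() }) {
    this.maxMiss = maxMiss; // maximum number of remembered misses
    this.maxAgeMs = maxAgeMiss * 1000; // option is in seconds; 0 = no expiry
    this.now = now;
    this.misses = new Map(); // key -> time recorded, in insertion order
  }

  add(key) {
    if (this.maxMiss === 0) return; // miss-cache disabled
    this.misses.delete(key); // re-insert so the key becomes the newest
    this.misses.set(key, this.now());
    if (this.misses.size > this.maxMiss) {
      // Evict the oldest entry (first key in insertion order).
      this.misses.delete(this.misses.keys().next().value);
    }
  }

  // True when the key is a known miss that has not aged out.
  has(key) {
    const recordedAt = this.misses.get(key);
    if (recordedAt === undefined) return false;
    if (this.maxAgeMs > 0 && this.now() - recordedAt >= this.maxAgeMs) {
      this.misses.delete(key);
      return false;
    }
    return true;
  }
}

let clock = 0; // fake clock in milliseconds
const missCache = new MissCache({ maxMiss: 2, maxAgeMiss: 60, now: () => clock });
missCache.add("sku-404");
const knownMiss = missCache.has("sku-404"); // true: skip the database
clock = 61000; // 61 seconds later
const afterExpiry = missCache.has("sku-404"); // false: the entry aged out
```

With refreshed-cache itself, the corresponding configuration is: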
const cache = new Cache(
async () => [],
{
max: 100000,
fetchByKey: async (sku) => {
const item = await db.query("SELECT * FROM items WHERE sku = $1", [sku]);
return item || undefined; // Returning undefined puts the key into the miss cache
},
maxMiss: 10000, // Bounded tracking for non-existent SKUs
maxAgeMiss: 60 // Lock out non-existent keys for 60 seconds
}
);

For detailed performance comparison benchmarks, database prepared statements analysis, and the future development plan, please refer to DEVELOPMENT_PLAN.md.