Hi @ltwlf,
while getting to know this library, I thought of a feature that could be useful for large documents and might also improve performance.
Summary
If possible please consider adding a keysToInclude option to diffAtom that acts as the whitelist complement of the existing keysToSkip blacklist. When keysToInclude is provided, only paths that match (or are descendants of) an entry in the list would be diffed; everything else would be silently ignored.
Motivation / user story - performance and flexibility
We are diffing large, deeply nested order documents. A typical order object contains dozens of fields per line item (material descriptions, material numbers, pricing structures, sometimes internal processing flags), the vast majority of which we do not care about for change-detection purposes. In addition, our business logic can often tell in advance that only specific paths inside the large document have changed.
Today we could work around this with keysToSkip, but maintaining an ever-growing exclusion list is error-prone: every time a new field is added to the document schema we have to remember to add it to the skip list or it leaks unwanted noise into the diff output.
What we really want is to say "only these five paths and their children matter; skip the rest". That is the classic whitelist / include-only pattern, and it is the natural inverse of what keysToSkip already does.
For large documents this could also have a performance benefit: the diff engine can stop recursing as soon as it knows a subtree can never produce an included path, rather than walking the entire object tree just to discard results at the end.
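To illustrate the pruning idea, here is a minimal, self-contained sketch. It does not use json-diff-ts internals: `walk`, `isIncluded`, and `mayContainIncluded` are hypothetical names, and the traversal only collects leaf paths instead of computing a real diff. It assumes the dot-notation described in this issue, where array elements share the path of the array itself.

```typescript
// Hypothetical sketch, not json-diff-ts code.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// True when `path` equals an include entry or lies underneath one.
function isIncluded(path: string, includes: readonly string[]): boolean {
  return includes.some((inc) => path === inc || path.startsWith(inc + "."));
}

// True when some include entry lies underneath `path`, so recursion must continue.
function mayContainIncluded(path: string, includes: readonly string[]): boolean {
  return includes.some((inc) => inc.startsWith(path + "."));
}

// Visits only subtrees that can produce an included path and records included leaves.
function walk(
  node: Json,
  path: string,
  includes: readonly string[],
  stats: { visited: number },
  out: string[],
): void {
  stats.visited++;
  if (Array.isArray(node)) {
    // Array elements do not add a path segment.
    for (const item of node) walk(item, path, includes, stats, out);
    return;
  }
  if (node !== null && typeof node === "object") {
    for (const [key, child] of Object.entries(node)) {
      const childPath = path === "" ? key : `${path}.${key}`;
      if (isIncluded(childPath, includes) || mayContainIncluded(childPath, includes)) {
        walk(child, childPath, includes, stats, out);
      }
      // Otherwise the whole subtree is pruned without being visited.
    }
    return;
  }
  if (isIncluded(path, includes)) out.push(path);
}

const order: Json = {
  header: { customer: "ACME", notes: ["a", "b", "c"] },
  positions: [
    { positionNumber: "10", quantity: 100, material: { number: "M-1", text: "Bolt" } },
  ],
};
const stats = { visited: 0 };
const leaves: string[] = [];
walk(order, "", ["positions.quantity"], stats, leaves);
console.log(stats.visited, leaves);
```

In this sketch the `header` and `material` subtrees are never entered, which is where the potential saving on large documents would come from.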
Proposed API
Same dot-notation as keysToSkip. The option would sit alongside it in Options / AtomOptions:
interface Options {
  arrayIdentityKeys?: EmbeddedObjKeysType | EmbeddedObjKeysMapType;
  keysToSkip?: readonly string[];
  keysToInclude?: readonly string[]; // ← new
  treatTypeChangeAsReplace?: boolean;
}
Path notation (identical to keysToSkip)
Dot-separated property names from the root of the document. Array element keys are not part of the path — a single entry automatically covers every element in the array at that level, just like keysToSkip already does.
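As a rough illustration of this notation, the following self-contained sketch lists the distinct leaf paths of a small document while ignoring array positions. `leafPaths` is a hypothetical helper written only for this issue; it is not a json-diff-ts function.

```typescript
// Hypothetical sketch, not json-diff-ts code.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Collects distinct dot-notation leaf paths; array elements add no path segment.
function leafPaths(node: Json, path = "", out = new Set<string>()): Set<string> {
  if (Array.isArray(node)) {
    for (const item of node) leafPaths(item, path, out);
  } else if (node !== null && typeof node === "object") {
    for (const [key, child] of Object.entries(node)) {
      leafPaths(child, path === "" ? key : `${path}.${key}`, out);
    }
  } else {
    out.add(path);
  }
  return out;
}

const sample: Json = {
  orderId: "A-1",
  positions: [
    { quantity: 100, deliveries: [{ quantity: 50 }, { quantity: 50 }] },
    { quantity: 20, deliveries: [{ quantity: 20 }] },
  ],
};
const paths = Array.from(leafPaths(sample));
console.log(paths);
```

However many positions or deliveries the document holds, each field yields a single path such as `positions.deliveries.quantity`, so one entry in the list covers all of them.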
Example usage
Assume positions is an array field and deliveries a nested array property for each position.
1 — Include only a handful of fields in a large order document
{
  // ... many other properties
  "positions": [
    {
      "positionNumber": { "value": "10" },
      "quantity": 100,
      "netPrice": 50.0,
      "deliveryDateRequested": "2023-12-01",
      // ... many other properties
      "deliveries": [
        {
          "deliveryPositionNumber": "10-1",
          "deliveryDateStart": "2023-12-01",
          "deliveryDateEnd": "2023-12-05",
          "quantity": 50
        },
        {
          "deliveryPositionNumber": "10-2",
          "deliveryDateStart": "2023-12-06",
          "deliveryDateEnd": "2023-12-10",
          "quantity": 50
        }
      ]
    }
  ]
}
With the following call, only the six listed leaf paths would appear in delta.operations. All other fields would be silently ignored, which could also improve performance.
const delta = diffAtom(oldOrder, newOrder, {
  keysToInclude: [
    // item-level fields we care about
    "positions.quantity",
    "positions.netPrice",
    "positions.deliveryDateRequested",
    // delivery-level fields we care about
    "positions.deliveries.deliveryDateStart",
    "positions.deliveries.deliveryDateEnd",
    "positions.deliveries.quantity",
  ],
  arrayIdentityKeys: {
    // positionNumber is an object in the sample above, so its value is used as the identity
    positions: (obj, isKeyName) =>
      isKeyName ? "positionNumber" : obj.positionNumber.value,
    "positions.deliveries": (obj, isKeyName) =>
      isKeyName ? "deliveryPositionNumber" : obj.deliveryPositionNumber,
  },
});
2 — Include entire subtrees
Assume positions is still an array of objects and goodsRecipientAddress is an object with name, city, zipcode etc. on each position level.
2a
In the following snippet, changes are expected to appear only for position elements where something in goodsRecipientAddress has changed. Everything outside "positions.goodsRecipientAddress" is ignored.
const delta = diffAtom(oldOrder, newOrder, {
  arrayIdentityKeys: { positions: "positionNumber" },
  keysToInclude: [
    // include the whole goods recipient address subtree for every position
    "positions.goodsRecipientAddress",
  ],
});
2b
The same works for nested array elements. Given a document like this:
{
  positions: [
    {
      positionNumber: "001",
      deliveries: [
        { deliveryPositionNumber: "d001", quantity: 1, deliveryDateStart: "2026-03-05" }
      ]
    }
  ]
}
the following would only track changes to the deliveries inside each position:
const delta = diffAtom(oldOrder, newOrder, {
  arrayIdentityKeys: {
    positions: "positionNumber",
    "positions.deliveries": "deliveryPositionNumber",
  },
  keysToInclude: [
    // include the whole deliveries subtree for every position
    "positions.deliveries",
  ],
});
2c
Alternatively, the following would only track changes to deliveryDateStart inside the deliveries of each position:
const delta = diffAtom(oldOrder, newOrder, {
  arrayIdentityKeys: {
    positions: "positionNumber",
    "positions.deliveries": "deliveryPositionNumber",
  },
  keysToInclude: [
    // include only deliveryDateStart of every delivery in every position
    "positions.deliveries.deliveryDateStart",
  ],
});
3 — Mix keysToInclude with keysToSkip
Useful when you want to include a broad subtree but still exclude a noisy child within it:
const delta = diffAtom(oldOrder, newOrder, {
  arrayIdentityKeys: { positions: "positionNumber" },
  keysToInclude: ["positions"],
  keysToSkip: ["positions.noisyChildProperty"],
});
Relationship to existing keysToSkip logic
The matching semantics I would expect are the exact mirror of keysToSkip:
- A path in keysToInclude includes that exact path and all its descendants.
- If both keysToInclude and keysToSkip are provided, keysToSkip takes precedence for any path that matches both (consistent with "skip wins" semantics).
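The two rules above could be expressed as a single predicate. This is only a sketch of the semantics I have in mind, not the library's implementation: `PathFilter`, `matches`, and `isPathDiffed` are hypothetical names, and the matching is assumed to work on whole dot-separated segments.

```typescript
// Hypothetical sketch of the proposed semantics, not json-diff-ts code.
interface PathFilter {
  keysToInclude?: readonly string[];
  keysToSkip?: readonly string[];
}

// True when `path` equals `entry` or is a descendant of it (segment-wise prefix match).
function matches(path: string, entry: string): boolean {
  return path === entry || path.startsWith(entry + ".");
}

function isPathDiffed(path: string, filter: PathFilter): boolean {
  // "Skip wins": a skipped path is excluded even if it is also included.
  if (filter.keysToSkip?.some((entry) => matches(path, entry))) return false;
  // Without keysToInclude, every remaining path is diffed.
  if (filter.keysToInclude === undefined) return true;
  return filter.keysToInclude.some((entry) => matches(path, entry));
}

const filter: PathFilter = {
  keysToInclude: ["positions"],
  keysToSkip: ["positions.noisyChildProperty"],
};
console.log(isPathDiffed("positions.quantity", filter));
console.log(isPathDiffed("positions.noisyChildProperty", filter));
console.log(isPathDiffed("header.customer", filter));
```

Appending the dot before the prefix check keeps "positions.quantityUnit" from matching an entry "positions.quantity". How an empty keysToInclude array should behave is left open here; in this sketch it would exclude everything.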
Given that keysToSkip is already implemented with prefix-matching on the internal keyPath, would it be feasible to add keysToInclude as the symmetric counterpart using the same infrastructure?
Environment
- Package: json-diff-ts
- Version tested: 5.0.0-alpha.8