Summary
Add metadata tags to pattern definitions and support filtering by tag at scan time. This enables jurisdiction-specific and domain-specific pattern sets.
Proposed tags
- Country/region: us, eu, uk, de, fr, au, ca, etc.
- Jurisdiction: gdpr, hipaa, ccpa, pci-dss, sox
- Domain: healthcare, financial, legal, government, education
- Data category: pii, phi, financial, credentials, biometric
Design
- Add tags: Vec<String> to pattern JSON definitions
- Add tags: Vec<String> filter to PatternRecognition workflow config
- PatternEngineBuilder::with_tags(&[&str]) restricts to patterns matching any of the given tags
- Tags are additive: a pattern tagged ["us", "pii"] matches filter ["us"] or ["pii"]
Examples
{
"name": "ssn",
"tags": ["us", "pii", "government_id"],
"category": "personal_identity",
"entity_type": "government_id",
"pattern": { ... }
}
{
"name": "nhs_number",
"tags": ["uk", "healthcare", "phi"],
"category": "personal_identity",
"entity_type": "government_id",
"pattern": { ... }
}
Config: { "tags": ["eu", "gdpr"] } would match patterns tagged with either eu or gdpr.
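To make the any-of semantics concrete, the check below runs this config against the tag lists of the two example patterns. It is a sketch under assumptions: matches_any is a hypothetical helper, not part of any existing API, and tags are compared as exact, case-sensitive strings.

```rust
/// Returns true if the pattern carries at least one tag from the filter.
/// Hypothetical helper, shown only to illustrate the matching rule.
fn matches_any(pattern_tags: &[&str], filter: &[&str]) -> bool {
    pattern_tags.iter().any(|t| filter.contains(t))
}

fn main() {
    // Tag lists copied from the two example patterns.
    let ssn = ["us", "pii", "government_id"];
    let nhs_number = ["uk", "healthcare", "phi"];

    // Filter from the example config.
    let config_tags = ["eu", "gdpr"];

    println!("ssn selected: {}", matches_any(&ssn, &config_tags));
    println!("nhs_number selected: {}", matches_any(&nhs_number, &config_tags));
}
```

With the tags as written, neither example pattern carries eu or gdpr, so this config would select neither of them. A pattern is only picked up once it lists at least one of the filter's tags.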
Future work
- Built-in pattern packs per jurisdiction (eu-gdpr, us-hipaa, etc.)
- Mutually exclusive tag groups (e.g. a pattern can't be both us and uk)
- Tag-based confidence adjustments