Merged
65 commits
4ffaf67
Remove all validation
honnibal Mar 22, 2026
093dd87
Fix broken imports and undefined variables in confection refactor
honnibal Mar 22, 2026
b88e49d
Tmp
honnibal Mar 22, 2026
64b9a26
Add pyrightconfig.json pointing to monorepo venv
honnibal Mar 22, 2026
7a999ac
Use pythonPath in pyrightconfig for monorepo venv
honnibal Mar 22, 2026
0e7ba09
Add catalogue to pyright extraPaths
honnibal Mar 22, 2026
96937d4
Fix pyright errors: use .name not ._name, suppress configparser inher…
honnibal Mar 22, 2026
392e40d
Remove tests for now (will replace)
honnibal Mar 22, 2026
886dc8e
Implement hypothesis strategy for (config_str, dict) pairs
honnibal Mar 22, 2026
837d79a
Simplify strategies to just generate dicts, move serialization to tests
honnibal Mar 22, 2026
a3dac09
Fix interpret_config bugs and add basic roundtrip test
honnibal Mar 22, 2026
a24e7f2
Move interpret_config to _parser as a pure function
honnibal Mar 22, 2026
dced2a4
Encapsulate ConfigParser in _parser: add parse_config_string, simplif…
honnibal Mar 22, 2026
8132ac5
Move to_str logic into _parser.config_to_str
honnibal Mar 22, 2026
dba7689
Update _config.py to use renamed parse_config and serialize_config
honnibal Mar 22, 2026
7737797
Add type checker
honnibal Mar 22, 2026
7a21ff7
Add JSON leaf strategy and test for inline JSON values in configs
honnibal Mar 22, 2026
d40b3f1
Add variable interpolation tests: happy path and circular reference d…
honnibal Mar 22, 2026
c494d01
Fix slow strategy generation: replace permutations with sampling, red…
honnibal Mar 22, 2026
3e70b78
Add error case tests, fix missing continue in _validate_overrides
honnibal Mar 22, 2026
e51ff4b
Replace is_promise check in serialize_config with inline_paths parameter
honnibal Mar 22, 2026
4f19665
Allow * as valid intermediate section in validation, fix * skeleton n…
honnibal Mar 22, 2026
ef258a6
Add tests for * (positional) sections: parse, roundtrip, nesting, int…
honnibal Mar 22, 2026
f145ac7
Add coverage pragmas for unreachable defensive checks in _parser
honnibal Mar 22, 2026
730f4e7
Remove dead InterpolationMissingOptionError handling from validation
honnibal Mar 22, 2026
bdcc8a7
Add interpolation edge case tests: single quotes, 74688, bare $, same…
honnibal Mar 22, 2026
de4fd12
Reorder
honnibal Mar 22, 2026
522638e
Fold _interpolation.py into _parser.py
honnibal Mar 22, 2026
214b7c2
Add Config serialization tests: str/bytes/disk roundtrip, copy, merge…
honnibal Mar 22, 2026
d24f905
Add validation.py: Schema, FieldInfo, type validation, pydantic compat
honnibal Mar 22, 2026
b3a46d3
Add schema validation and default filling to Config
honnibal Mar 22, 2026
959ee76
Fix validate to recurse into nested schemas, bring in validation/pyda…
honnibal Mar 22, 2026
3b0606a
Fix pydantic shim tests: install pydantic, remove registry deps, add …
honnibal Mar 22, 2026
8a259aa
Add pydantic v2 ensure_schema tests to cover model_fields extraction …
honnibal Mar 22, 2026
ae907d8
Fix alias bug in Schema.__init_subclass__, add validation edge case t…
honnibal Mar 22, 2026
f82f15f
Pragma out pydantic import fallbacks in validation.py
honnibal Mar 22, 2026
65dd3c5
Add typechecker edge case tests, achieve 100% coverage
honnibal Mar 22, 2026
3677683
Remove deepcopy try/except, add merge edge case tests, achieve 100% o…
honnibal Mar 22, 2026
203058b
Implement registry.fill() default-filling from function signatures
honnibal Mar 22, 2026
ce50f66
Add registry resolve/promise tests, achieve 100% on _registry.py
honnibal Mar 22, 2026
768a97a
Add util tests: frozen collections, partial, try_dump_json error path
honnibal Mar 22, 2026
c6e317e
Add _errors tests: from_error and parent formatting — 100% across all…
honnibal Mar 22, 2026
c072a48
Fix typechecker for complex generics: guard isinstance/issubclass aga…
honnibal Mar 22, 2026
670795d
Handle ForwardRef in typechecker: accept unresolved forward references
honnibal Mar 22, 2026
7bc343a
Resolve ForwardRefs in dataclass field types via get_type_hints
honnibal Mar 22, 2026
ef03c5d
Add schema/validate params to registry.resolve() and fill() for spaCy…
honnibal Mar 22, 2026
4ba0992
Add promise arg validation in fill: check missing required args and u…
honnibal Mar 22, 2026
5778753
Fix overrides: apply to nested dict after interpretation, not at conf…
honnibal Mar 22, 2026
1d91680
Fix fill: skip unknown functions, strip extras with schema
honnibal Mar 22, 2026
676875c
Fix flake8 errors: remove unused imports and fix formatting
honnibal Mar 22, 2026
bf82bbd
Run black, isort, flake8: format and fix lint errors
honnibal Mar 22, 2026
6ee8e24
Fix pyright errors: suppress NewType attribute access and pydantic .e…
honnibal Mar 22, 2026
1c6ff77
Switch to ruff for formatting/linting
honnibal Mar 22, 2026
6ab029f
Merge origin/main, preferring redo-validation branch
honnibal Mar 22, 2026
5b7b23f
Add catalogue as explicit dependency
honnibal Mar 22, 2026
2f7afbc
Remove catalogue as hard dependency — only needed by consumers who us…
honnibal Mar 22, 2026
0c526cd
Fix import sorting in typechecker.py
honnibal Mar 22, 2026
7393fb3
Fix CI: remove old confection/tests/, export Schema, fix pyrightconfi…
honnibal Mar 22, 2026
eec05ae
Port fixes from main for issues #52, #58, #59, #61
honnibal Mar 22, 2026
a56e5e1
Fix Python 3.10 compat: use TYPE_CHECKING guard for Self import
honnibal Mar 22, 2026
83335c6
Fix Python 3.10 compat: conditional Self import, typing_extensions de…
honnibal Mar 22, 2026
ce08360
Fix CI: run pytest tests/ instead of --pyargs confection
honnibal Mar 22, 2026
87fb173
Skip pydantic v1 tests on Python 3.14+
honnibal Mar 22, 2026
480a88a
Add type annotations and docstrings to typechecker module
honnibal Mar 22, 2026
c11c45d
Add TypeVar ForwardRef bound test, pragma defensive catches in typech…
honnibal Mar 22, 2026
2 changes: 1 addition & 1 deletion .github/workflows/tests.yml
@@ -74,4 +74,4 @@ jobs:

       - name: Run tests
         run: |
-          python -m pytest --pyargs confection
+          python -m pytest tests/
1,216 changes: 16 additions & 1,200 deletions confection/__init__.py

Large diffs are not rendered by default.

268 changes: 268 additions & 0 deletions confection/_config.py
@@ -0,0 +1,268 @@
from __future__ import annotations

import copy
import sys
from pathlib import Path
from typing import Any, Dict, List, Optional, Union

if sys.version_info >= (3, 11):
    from typing import Self
else:
    from typing_extensions import Self

from ._errors import ConfectionError, ConfigValidationError
from ._parser import parse_config, serialize_config
from .validation import ValidationError, ensure_schema


class Config(dict):
    # TODO: Improve doc string
    """Dict subclass that loads and saves the TOML-style configuration format
    from/to a string, file or bytes.
    """

    is_interpolated: bool

    def __init__(
        self,
        data: Optional[Union[Dict[str, Any], "Config"]] = None,
        *,
        is_interpolated: Optional[bool] = None,
        section_order: Optional[List[str]] = None,
    ) -> None:
        """Initialize a new Config object with optional data."""
        dict.__init__(self)
        if data is None:
            data = {}
        if not isinstance(data, (dict, Config)):
            raise ConfectionError(
                f"Can't initialize Config with data. Expected dict or "
                f"Config but got: {type(data)}"
            )
        # Whether the config has been interpolated. We can use this to check
        # whether we need to interpolate again when it's resolved. We assume
        # that a config is interpolated by default.
        if is_interpolated is not None:
            self.is_interpolated = is_interpolated
        elif isinstance(data, Config):
            self.is_interpolated = data.is_interpolated
        else:
            self.is_interpolated = True
        if section_order is not None:
            self.section_order = section_order
        elif isinstance(data, Config):
            self.section_order = data.section_order
        else:
            self.section_order = []
        # Update with data
        self.update(data)

    def interpolate(self) -> Self:
        """Interpolate (resolve var references) a config.

        Returns a copy of the object.
        """
        # This is currently the most effective way because we need our custom
        # to_str logic to run in order to re-serialize the values so we can
        # interpolate them again. ConfigParser.read_dict will just call str()
        # on all values, which isn't enough.
        return type(self)().from_str(self.to_str())

    def copy(self) -> Self:
        """Deepcopy the config."""
        config = copy.deepcopy(self)
        return type(self)(
            config,
            is_interpolated=self.is_interpolated,
            section_order=self.section_order,
        )

    def merge(
        self, updates: Union[Dict[str, Any], "Config"], remove_extra: bool = False
    ) -> Self:
        """Deep merge the config with updates, using current as defaults."""
        defaults = self.copy()
        updates = Config(updates).copy()
        merged = deep_merge_configs(updates, defaults, remove_extra=remove_extra)
        return type(self)(
            merged,
            is_interpolated=defaults.is_interpolated and updates.is_interpolated,
            section_order=defaults.section_order,
        )

    def validate(self, schema) -> Self:
        """Validate the config against a schema. Raises ConfigValidationError
        if validation fails.
        """
        schema = ensure_schema(schema)
        _validate_recursive(dict(self), schema, self)
        return self

    def fill_defaults(self, schema) -> Self:
        """Fill in missing values from schema defaults and remove extra
        fields if the schema forbids them. Modifies in place and returns self.
        """
        schema = ensure_schema(schema)
        extra = schema.model_config.get("extra", "allow")
        # Fill defaults
        for name, field in schema.model_fields.items():
            if name not in self and not field.is_required():
                self[name] = field.default
            elif name in self and isinstance(self[name], dict):
                field_schema = field.annotation
                if isinstance(field_schema, type) and hasattr(
                    field_schema, "model_fields"
                ):
                    sub_schema = ensure_schema(field_schema)
                    _fill_defaults_recursive(self[name], sub_schema)
        # Strip extras
        if extra == "forbid":
            known = set(schema.model_fields.keys())
            for key in list(self.keys()):
                if key not in known:
                    del self[key]
        return self

    def from_str(
        self,
        text: str,
        *,
        interpolate: bool = True,
        overrides: Dict[str, Any] = {},
        schema=None,
    ) -> Self:
        """Load the config from a string."""
        self.clear()
        self.update(parse_config(text, interpolate=interpolate, overrides=overrides))
        if overrides and interpolate:
            # Re-interpolate now that overrides are applied. The recursive
            # from_str call will have no overrides, so this doesn't loop.
            self = self.interpolate()
        self.is_interpolated = interpolate
        if schema is not None:
            self.fill_defaults(schema)
            self.validate(schema)
        return self

    def to_str(self, *, interpolate: bool = True) -> str:
        """Write the config to a string."""
        return serialize_config(self, interpolate=interpolate)

    def to_bytes(self, *, interpolate: bool = True) -> bytes:
        """Serialize the config to a byte string."""
        return self.to_str(interpolate=interpolate).encode("utf8")

    def from_bytes(
        self,
        bytes_data: bytes,
        *,
        interpolate: bool = True,
        overrides: Dict[str, Any] = {},
    ) -> Self:
        """Load the config from a byte string."""
        return self.from_str(
            bytes_data.decode("utf8"), interpolate=interpolate, overrides=overrides
        )

    def to_disk(self, path: Union[str, Path], *, interpolate: bool = True) -> None:
        """Serialize the config to a file."""
        path = Path(path) if isinstance(path, str) else path
        with path.open("w", encoding="utf8") as file_:
            file_.write(self.to_str(interpolate=interpolate))

    def from_disk(
        self,
        path: Union[str, Path],
        *,
        interpolate: bool = True,
        overrides: Dict[str, Any] = {},
    ) -> Self:
        """Load config from a file."""
        path = Path(path) if isinstance(path, str) else path
        with path.open("r", encoding="utf8") as file_:
            text = file_.read()
        return self.from_str(text, interpolate=interpolate, overrides=overrides)


def _fill_defaults_recursive(data, schema):
    """Fill defaults and strip extras recursively for nested schemas."""
    extra = schema.model_config.get("extra", "allow")
    for name, field in schema.model_fields.items():
        if name not in data and not field.is_required():
            data[name] = field.default
        elif name in data and isinstance(data[name], dict):
            field_schema = field.annotation
            if isinstance(field_schema, type) and hasattr(field_schema, "model_fields"):
                _fill_defaults_recursive(data[name], ensure_schema(field_schema))
    if extra == "forbid":
        known = set(schema.model_fields.keys())
        for key in list(data.keys()):
            if key not in known:
                del data[key]


def _validate_recursive(data, schema, config, parent=""):
    """Validate data against a schema, recursing into nested schemas."""
    try:
        schema.model_validate(data)
    except ValidationError as e:
        section = f" in [{parent}]" if parent else ""
        raise ConfigValidationError(
            config=config,
            errors=e.errors(),
            title=f"Config validation error{section}",
        ) from None
    # Recurse into fields that are themselves schemas
    for name, field in schema.model_fields.items():
        annotation = field.annotation
        if (
            isinstance(annotation, type)
            and hasattr(annotation, "model_validate")
            and name in data
            and isinstance(data[name], dict)
        ):
            child_parent = f"{parent}.{name}" if parent else name
            _validate_recursive(data[name], annotation, config, parent=child_parent)


def deep_merge_configs(
    config: Union[Dict[str, Any], Config],
    defaults: Union[Dict[str, Any], Config],
    *,
    remove_extra: bool = False,
) -> Union[Dict[str, Any], Config]:
    """Deep merge two configs."""
    if remove_extra:
        # Filter out values in the original config that are not in defaults
        keys = list(config.keys())
        for key in keys:
            if key not in defaults:
                del config[key]
    for key, value in defaults.items():
        if isinstance(value, dict):
            node = config.setdefault(key, {})
            if not isinstance(node, dict):
                continue
            value_promises = [k for k in value if k.startswith("@")]
            value_promise = value_promises[0] if value_promises else None
            node_promises = [k for k in node if k.startswith("@")] if node else []
            node_promise = node_promises[0] if node_promises else None
            # We only update the block from defaults if it refers to the same
            # registered function
            if (
                value_promise
                and node_promise
                and (
                    value_promise in node
                    and node[value_promise] != value[value_promise]
                )
            ):
                continue
            if node_promise and (
                node_promise not in value or node[node_promise] != value[node_promise]
            ):
                continue
            defaults = deep_merge_configs(node, value, remove_extra=remove_extra)
        elif key not in config:
            config[key] = value
    return config
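
To make the promise rule in `deep_merge_configs` easier to follow, here is a simplified, self-contained sketch of the same idea. It is not the PR's code: the names `promise_key` and `merge_defaults` are hypothetical, `remove_extra` is left out, and it assumes at most one `@` key per block. A nested block is only filled from the defaults when the config block either has no `@` key or names the same registered function as the default.

```python
from typing import Any, Dict, Optional


def promise_key(block: Dict[str, Any]) -> Optional[str]:
    """Hypothetical helper: return the first key starting with "@", if any."""
    for key in block:
        if key.startswith("@"):
            return key
    return None


def merge_defaults(config: Dict[str, Any], defaults: Dict[str, Any]) -> Dict[str, Any]:
    """Fill missing keys in `config` from `defaults`, modifying `config` in place.

    Simplified sketch: assumes each block has at most one "@" key.
    """
    for key, value in defaults.items():
        if isinstance(value, dict):
            node = config.setdefault(key, {})
            if not isinstance(node, dict):
                # The config overrides a whole section with a plain value.
                continue
            node_promise = promise_key(node)
            value_promise = promise_key(value)
            # Skip the block if the config refers to a different registered
            # function than the defaults do: its arguments would not apply.
            if node_promise is not None and (
                node_promise != value_promise
                or node[node_promise] != value[value_promise]
            ):
                continue
            merge_defaults(node, value)
        elif key not in config:
            config[key] = value
    return config


defaults = {
    "optimizer": {"@optimizers": "Adam.v1", "learn_rate": 0.001},
    "training": {"seed": 0, "dropout": 0.1},
}
config = {"optimizer": {"@optimizers": "SGD.v1"}, "training": {"seed": 42}}
merged = merge_defaults(config, defaults)
```

In this sketch the `optimizer` block is left untouched because it names a different function, while `training` keeps its own `seed` and picks up the default `dropout`.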
22 changes: 22 additions & 0 deletions confection/_constants.py
@@ -0,0 +1,22 @@
import re

# Regex to detect whether a value contains a variable
VARIABLE_RE = re.compile(r"\$\{[\w\.:]+\}")

# Internal prefix used to mark section references for custom interpolation
SECTION_PREFIX = "__SECTION__:"

# Field used for positional arguments, e.g. [section.*.xyz]. The alias is
# required for the schema (shouldn't clash with user-defined arg names)
ARGS_FIELD = "*"
ARGS_FIELD_ALIAS = "VARIABLE_POSITIONAL_ARGS"

# Aliases for fields that would otherwise shadow pydantic attributes. Can be any
# string, so we're using name + space so it looks the same in error messages etc.
RESERVED_FIELDS = {
    "validate": "validate\u0020",
    "model_config": "model_config\u0020",
    "model_validate": "model_validate\u0020",
    "model_fields": "model_fields\u0020",
}
RESERVED_FIELDS_REVERSE = {v: k for k, v in RESERVED_FIELDS.items()}
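
As a quick illustration of how these constants behave, the sketch below copies `VARIABLE_RE` and two `RESERVED_FIELDS` entries from the diff so it runs without installing the package. The helpers `find_variables`, `to_alias` and `from_alias` are hypothetical names used only for this example; how the package itself consumes the constants is not shown in this part of the diff.

```python
import re
from typing import List

# Copied from the diff so the example is self-contained.
VARIABLE_RE = re.compile(r"\$\{[\w\.:]+\}")
RESERVED_FIELDS = {
    "validate": "validate\u0020",
    "model_config": "model_config\u0020",
}
RESERVED_FIELDS_REVERSE = {v: k for k, v in RESERVED_FIELDS.items()}


def find_variables(value: str) -> List[str]:
    """Hypothetical helper: return every ${...} reference found in a value."""
    return VARIABLE_RE.findall(value)


def to_alias(name: str) -> str:
    """Hypothetical helper: map a reserved field name to its alias."""
    return RESERVED_FIELDS.get(name, name)


def from_alias(alias: str) -> str:
    """Hypothetical helper: map an alias back to the original field name."""
    return RESERVED_FIELDS_REVERSE.get(alias, alias)


refs = find_variables("${paths.train}/data and ${system:seed}")
no_refs = find_variables("a bare $ and an empty ${}")
```

The pattern requires at least one word character, dot or colon between the braces, so a bare `$` or an empty `${}` is not treated as a variable reference.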