Feature/design renaissance #469
Merged
Major improvements for agent accessibility and developer experience:

## Agentic Interface (MCP Server)
- Add MCP server with tools: search_datasets, get_dataset_schema, load_dataset, list_all_datasets
- Add structured error classes with codes and recovery hints
- Add as_json parameter to search() and list() for programmatic access
- Add include_schema parameter to get_as_dict() for schema-aware loading
- Add get_schema() method to FoundryDataset

## CLI Rebuild
- Complete rewrite using typer with rich output
- Commands: search, get, list, schema, status, catalog, version
- MCP commands: mcp start, mcp install
- HuggingFace: push-to-hf command

## HuggingFace Export Bridge
- Add push_to_hub() function for exporting to HF Hub
- Auto-generate Dataset Cards from DataCite metadata

## Code Quality
- Remove debug print statement from foundry_dataset.py
- Fix silent download failures (now raises DownloadError)
- Standardize error handling (print → logger)
- Add beginner-friendly example notebook

## Tests
- Add test_errors.py for structured error classes
- Add test_new_features.py for new API features
- Update test_https_download.py for DownloadError

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
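To illustrate the "structured error classes with codes and recovery hints" item, here is a minimal sketch of what such a class hierarchy could look like. Only the name `DownloadError` comes from the commit message; the base class `FoundryError`, the constructor arguments, the code string `DOWNLOAD_FAILED`, and the `to_dict()` helper are assumptions made for illustration and may not match the code in this PR.

```python
from typing import Any, Dict, Optional


class FoundryError(Exception):
    """Hypothetical base class carrying a machine-readable code and a recovery hint."""

    def __init__(
        self,
        message: str,
        code: str,
        recovery_hint: Optional[str] = None,
        details: Optional[Dict[str, Any]] = None,
    ) -> None:
        super().__init__(message)
        self.message = message
        self.code = code
        self.recovery_hint = recovery_hint
        self.details = details or {}

    def to_dict(self) -> Dict[str, Any]:
        # A plain dict is easy for an agent or CLI to serialize as JSON.
        return {
            "error": type(self).__name__,
            "code": self.code,
            "message": self.message,
            "recovery_hint": self.recovery_hint,
            "details": self.details,
        }


class DownloadError(FoundryError):
    """Raised instead of failing silently; the signature here is assumed."""

    def __init__(self, url: str, reason: str) -> None:
        super().__init__(
            message=f"Failed to download {url}: {reason}",
            code="DOWNLOAD_FAILED",  # hypothetical code string
            recovery_hint="Check network connectivity and retry the download.",
            details={"url": url, "reason": reason},
        )
```

A caller can then catch `FoundryError`, inspect `code` to decide what to do, and surface `recovery_hint` to the user or agent rather than parsing a free-form message.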
## New Documentation Structure
- docs/SUMMARY.md: Complete navigation
- docs/installation.md: Installation guide
- docs/quickstart.md: 5-minute quick start
- docs/guide/: User guides (searching, loading, ML frameworks, schemas)
- docs/features/: Feature docs (CLI, MCP server, HuggingFace, errors)
- docs/support/faq.md: Frequently asked questions

## Updated Documentation
- README.md: Modern API examples, new features highlighted
- docs/README.md: Rewritten introduction
- docs/concepts/overview.md: Architecture diagram, concepts explained

## Example Notebooks
- Simplified examples using HTTPS default
- Removed fallback code (new version installed)
- Added 3 progressive tutorials (quickstart, working with data, advanced)

## Code Changes
- foundry/foundry.py: Changed use_globus default to False

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Ran tests locally. This seems to be a GitHub testing issue.
blaiszik added a commit that referenced this pull request on Jan 14, 2026

This file was part of PR #469 but was not included in the merge, causing ModuleNotFoundError when importing foundry.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
blaiszik added a commit that referenced this pull request on Jan 14, 2026

* Restore missing mdf_client.py from design-renaissance branch

  This file was part of PR #469 but was not included in the merge, causing ModuleNotFoundError when importing foundry.

  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Fix DOI search to return correct dataset

  The forge DOI search can return multiple results where only one actually has the matching DOI. Previously, get_metadata_by_doi() blindly returned the first result, which often didn't have the requested DOI. Now it iterates through results to find the one with the exact DOI match, fixing the test_dataframe_search_by_doi and test_dataframe_download_by_doi tests.

  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
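The DOI fix described in the commit message above can be sketched as follows. This is not the code from the commit: the helper name `find_exact_doi_match` and the assumed record layout (`record["dc"]["identifier"]["identifier"]`, a DataCite-style nesting) are illustrative assumptions. The comparison is case-insensitive because DOI names are case-insensitive.

```python
from typing import Any, Dict, List, Optional


def _record_doi(record: Dict[str, Any]) -> Optional[str]:
    """Pull the DOI out of a search result, assuming a DataCite-style layout."""
    dc = record.get("dc")
    if not isinstance(dc, dict):
        return None
    identifier = dc.get("identifier")
    if not isinstance(identifier, dict):
        return None
    value = identifier.get("identifier")
    return value if isinstance(value, str) else None


def find_exact_doi_match(
    results: List[Dict[str, Any]], doi: str
) -> Optional[Dict[str, Any]]:
    """Return the first result whose DOI equals `doi`, or None if none match.

    Unlike returning results[0], this skips near-miss hits that a fuzzy
    search may rank ahead of the requested record.
    """
    wanted = doi.strip().lower()
    for record in results:
        found = _record_doi(record)
        if found is not None and found.strip().lower() == wanted:
            return record
    return None
```

Returning `None` when nothing matches leaves it to the caller to decide whether to raise an error or fall back to another lookup.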
blaiszik added a commit that referenced this pull request on Jan 14, 2026

* Restore missing mdf_client.py from design-renaissance branch

  This file was part of PR #469 but was not included in the merge, causing ModuleNotFoundError when importing foundry.

  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Fix DOI search to return correct dataset

  The forge DOI search can return multiple results where only one actually has the matching DOI. Previously, get_metadata_by_doi() blindly returned the first result, which often didn't have the requested DOI. Now it iterates through results to find the one with the exact DOI match, fixing the test_dataframe_search_by_doi and test_dataframe_download_by_doi tests.

  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Move torch/tensorflow to optional extras to fix CI disk space

  The combined size of torch, tensorflow, and NVIDIA CUDA dependencies exceeded GitHub Actions runner disk space (~4GB+). These ML frameworks are now available as optional extras via pip install .[torch] or pip install .[tensorflow].

  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Fix flake8 linting errors

  - Remove unused imports (sys, rprint, Optional, pandas, numpy)
  - Fix unused exception variable
  - Remove f-string without placeholders
  - Split long line in MCP server description
  - Add noqa comment for intentional re-export

  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Replace mdf_forge with internal MDFClient in tests

  Update test imports to use foundry.mdf_client.MDFClient instead of mdf_forge.Forge, which is no longer a required dependency.

  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
blaiszik added a commit that referenced this pull request on Jan 14, 2026

* Restore missing mdf_client.py from design-renaissance branch

  This file was part of PR #469 but was not included in the merge, causing ModuleNotFoundError when importing foundry.

  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Fix DOI search to return correct dataset

  The forge DOI search can return multiple results where only one actually has the matching DOI. Previously, get_metadata_by_doi() blindly returned the first result, which often didn't have the requested DOI. Now it iterates through results to find the one with the exact DOI match, fixing the test_dataframe_search_by_doi and test_dataframe_download_by_doi tests.

  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Move torch/tensorflow to optional extras to fix CI disk space

  The combined size of torch, tensorflow, and NVIDIA CUDA dependencies exceeded GitHub Actions runner disk space (~4GB+). These ML frameworks are now available as optional extras via pip install .[torch] or pip install .[tensorflow].

  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Fix flake8 linting errors

  - Remove unused imports (sys, rprint, Optional, pandas, numpy)
  - Fix unused exception variable
  - Remove f-string without placeholders
  - Split long line in MCP server description
  - Add noqa comment for intentional re-export

  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Replace mdf_forge with internal MDFClient in tests

  Update test imports to use foundry.mdf_client.MDFClient instead of mdf_forge.Forge, which is no longer a required dependency.

  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Add optional extras and document installation

  Move heavy ML dependencies to optional extras to reduce default install size:

  - pip install foundry-ml[torch]
  - pip install foundry-ml[tensorflow]
  - pip install foundry-ml[huggingface]
  - pip install foundry-ml[excel]
  - pip install foundry-ml[examples]
  - pip install foundry-ml[dev]

  Update README with extras install instructions and NumPy 2.0 compatibility note.

  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
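Once torch and tensorflow become optional extras, code paths that need them typically import them lazily and fail with an actionable message when the extra is missing. The sketch below shows one common way to do that. The helper name `require_extra` is hypothetical and is not taken from this PR; only the `foundry-ml[...]` extras names come from the commit message above.

```python
import importlib
from types import ModuleType


def require_extra(module_name: str, extra: str) -> ModuleType:
    """Import an optional dependency, or raise with the matching install hint.

    `extra` is the name of the optional extra that provides `module_name`,
    for example "torch" or "tensorflow".
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"'{module_name}' is required for this feature. "
            f"Install it with: pip install foundry-ml[{extra}]"
        ) from exc
```

A function that converts a dataset to tensors could call `torch = require_extra("torch", "torch")` at the top of its body, so that importing the package itself never requires the heavy dependency.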