NikitaZemtsov/menu_parser

Menu Recognizer

Extracts dish data from a restaurant menu (PDF or image) and outputs it as structured, normalized JSON. Extraction uses google-adk with Gemini.

Getting Started

Assuming you have Homebrew installed, install the local tooling:

brew bundle

Install the asdf python/pdm plugins:

asdf plugin add python
asdf plugin add pdm

Install the pinned Python / PDM versions (.tool-versions):

asdf install

Create and activate the venv, then install dependencies:

pdm venv create
source .venv/bin/activate
pdm install -G dev

Environment Variables

Copy the example file and add your Gemini API key:

cp .env.example .env

Then set GOOGLE_API_KEY in .env to your key from Google AI Studio.

Usage

Put one or more menu files (.pdf, .png, .jpg, .jpeg, .webp) into a directory — each file is treated as one menu — then run:

pdm run extract            # reads ./menus
pdm run extract <dir>      # or a directory you choose

The menus/ folder already ships with a sample menu (espn_bet (1).pdf), so pdm run extract works out of the box with no extra setup.

Menus are processed concurrently, and the dishes extracted from each menu are written to output/<filename>.json.
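The concurrent, one-file-per-menu flow described above could look roughly like the sketch below. It uses only the standard library; extract_dishes is a hypothetical stand-in for the real agent call, and all function names are illustrative rather than the project's actual API. Dropping the input extension when naming the output file is also an assumption.

```python
import asyncio
import json
from pathlib import Path

# Extensions listed in the Usage section.
MENU_SUFFIXES = {".pdf", ".png", ".jpg", ".jpeg", ".webp"}


def find_menus(directory: Path) -> list[Path]:
    """Return the menu files in `directory`, one per menu, in sorted order."""
    return sorted(
        p for p in directory.iterdir()
        if p.is_file() and p.suffix.lower() in MENU_SUFFIXES
    )


async def extract_dishes(menu: Path) -> list[dict]:
    """Hypothetical stand-in for the real agent call; returns no dishes."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return []


async def process_menu(menu: Path, out_dir: Path) -> Path:
    """Extract one menu and write its dishes to out_dir/<stem>.json."""
    dishes = await extract_dishes(menu)
    out_path = out_dir / f"{menu.stem}.json"  # assumption: extension is dropped
    out_path.write_text(json.dumps(dishes, indent=2), encoding="utf-8")
    return out_path


async def process_all(menu_dir: Path, out_dir: Path) -> list[Path]:
    """Process every menu in `menu_dir` concurrently."""
    out_dir.mkdir(parents=True, exist_ok=True)
    menus = find_menus(menu_dir)
    # gather() runs the per-menu coroutines concurrently and keeps input order.
    return await asyncio.gather(*(process_menu(m, out_dir) for m in menus))
```

Calling asyncio.run(process_all(Path("menus"), Path("output"))) would process the default directory. Because each menu writes its own file, one slow menu does not block the others.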

Data model

Each menu is a flat JSON array of Dish objects (src/schema.py):

  • dish_id — stable id; other dishes reference it from their options.
  • category — menu section (e.g. BURGERS).
  • dish_name / description — multi-line descriptions are merged; description is null if absent.
  • price — a float, or null for $X placeholders and price-less items (e.g. sauces).
  • currency — a separate field, ISO 4217 code (e.g. USD); null when there is no price.
  • options — list of OptionGroup ({category, dishes}); each choice is a light ChoiceDish ({dish_id, surcharge}).

Key idea: a single physical item (a side, a sauce) is stored once as a standalone Dish; combos reference it by dish_id from their options. So "choice of side", wings + sauces, and flights are modeled without duplicating data. The graph is non-recursive (Dish → OptionGroup → ChoiceDish), which keeps it clean and usable as an LLM output schema. Extra-cost choices carry a surcharge (e.g. "+$2").
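To make the shape concrete, here is a minimal sketch of the three types using standard-library dataclasses. The real models live in src/schema.py and are Pydantic models, so treat this as an approximation: the field types (string ids, surcharge stored as text) are assumptions, and the example dishes and prices are invented for illustration.

```python
from __future__ import annotations

import json
from dataclasses import asdict, dataclass, field


@dataclass
class ChoiceDish:
    dish_id: str                  # id of a standalone Dish defined elsewhere
    surcharge: str | None = None  # assumed to be text such as "+$2"


@dataclass
class OptionGroup:
    category: str
    dishes: list[ChoiceDish] = field(default_factory=list)


@dataclass
class Dish:
    dish_id: str
    category: str
    dish_name: str
    description: str | None = None
    price: float | None = None    # None for $X placeholders and price-less items
    currency: str | None = None   # ISO 4217 code; None when there is no price
    options: list[OptionGroup] = field(default_factory=list)


# Sides are stored once as standalone dishes (values are illustrative).
fries = Dish("side-fries", "SIDES", "Fries")
rings = Dish("side-onion-rings", "SIDES", "Onion Rings")

# The combo references the sides by id instead of duplicating them.
burger = Dish(
    dish_id="burger-classic",
    category="BURGERS",
    dish_name="Classic Burger",
    price=14.0,
    currency="USD",
    options=[
        OptionGroup(
            category="Choice of side",
            dishes=[
                ChoiceDish("side-fries"),
                ChoiceDish("side-onion-rings", surcharge="+$2"),
            ],
        )
    ],
)

# A menu is a flat JSON array of dishes.
menu_json = json.dumps([asdict(d) for d in (fries, rings, burger)], indent=2)
```

Since ChoiceDish holds only an id and a surcharge, the nesting stops at a fixed depth, which is what keeps the schema non-recursive.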

AI Usage

  • Tool: Built end-to-end with Claude Code (Claude) — scaffolding (pdm/asdf/ruff), the Pydantic schema, the google-adk agent + tool, the runner, the CLI entry point, and the agent instructions.
  • Approach: google-adk + Gemini (gemini-flash-latest). The agent reads the PDF (multimodal) and calls one tool, save_dishes, which validates each dish and writes it to session state; the CLI reads that state and dumps JSON, one file per menu.
  • Adapted / why: the tool takes a list[str] of per-dish JSON because ADK reliably passes only simple types. Each dish is validated on its own, including a referential-integrity check (every option dish_id must already exist), so one bad item doesn't sink the batch. Currency is normalized and validated against ISO 4217 codes, and the Dish JSON schema is injected into the tool description dynamically.
  • Assumptions / edge cases: $X and price-less items → price=null; sides/sauces/rubs are stored as standalone dishes so combos can reference them by id; implicit combos (e.g. a name like "8 wings & 8 sauces") are modeled as option groups after an instruction fix for the FLIGHTS case the model first missed.
  • Known gaps: no automated tests; validated against the single provided menu; output quality depends on the model; most drink prices are null because the menu prints $X.
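The per-dish validation and referential-integrity idea from the list above can be sketched as follows. This is not the project's save_dishes: it uses a plain dict in place of ADK session state and hand-written checks in place of the Pydantic schema, and the required fields, the currency subset, and the return shape are all assumptions made for illustration.

```python
import json

REQUIRED_FIELDS = ("dish_id", "category", "dish_name")  # assumed minimum
KNOWN_CURRENCIES = {"USD", "EUR", "GBP"}  # illustrative subset of ISO 4217


def validate_dish(raw: str, known_ids: set[str]) -> dict:
    """Parse one dish from JSON text; raise ValueError if it is unusable."""
    try:
        dish = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"invalid JSON: {exc}") from exc
    if not isinstance(dish, dict):
        raise ValueError("dish must be a JSON object")
    missing = [name for name in REQUIRED_FIELDS if not dish.get(name)]
    if missing:
        raise ValueError(f"missing fields: {missing}")

    price = dish.get("price")
    if price is None:
        # Price-less items carry no currency either.
        dish["price"] = None
        dish["currency"] = None
    else:
        if isinstance(price, bool) or not isinstance(price, (int, float)):
            raise ValueError("price must be a number or null")
        currency = str(dish.get("currency") or "").strip().upper()
        if currency not in KNOWN_CURRENCIES:
            raise ValueError(f"unknown currency: {currency!r}")
        dish["price"] = float(price)
        dish["currency"] = currency

    # Referential integrity: every referenced dish must already be saved.
    for group in dish.get("options") or []:
        for choice in group.get("dishes", []):
            if choice.get("dish_id") not in known_ids:
                raise ValueError(f"unknown option dish_id: {choice.get('dish_id')!r}")
    return dish


def save_dishes(dishes: list[str], state: dict) -> dict:
    """Validate each dish independently so one bad item doesn't sink the batch."""
    saved = state.setdefault("dishes", [])
    known_ids = {d["dish_id"] for d in saved}
    added, errors = 0, []
    for index, raw in enumerate(dishes):
        try:
            dish = validate_dish(raw, known_ids)
        except ValueError as exc:
            errors.append(f"dish {index}: {exc}")
            continue
        saved.append(dish)
        known_ids.add(dish["dish_id"])
        added += 1
    return {"saved": added, "errors": errors}
```

Returning the error list instead of raising gives the caller, here presumably the model, a chance to fix and resend only the rejected dishes. It also means standalone items such as sides have to be sent before the combos that reference them.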
