
Fix/semantic search image build time#1081

Open
vaclisinc wants to merge 35 commits into main from
fix/semantic-search-infra

Conversation

@vaclisinc
Contributor

@vaclisinc vaclisinc commented Feb 26, 2026

Overview and Problem statement

In the old version, we pre-downloaded the bge-base-en-v1.5 model at build time. This caused long build times and made quick bug fixes painful.

This PR (1) removes the model pre-download from the Dockerfile to speed up deploys on k8s and locally, (2) installs the CPU-only build of torch to save disk space, and (3) fixes the semantic search bar.

Implementation

  1. Dockerfile — Removed the RUN step that pre-downloaded the BAAI/bge-base-en-v1.5 model (~400MB) at build time. This step was the main bottleneck slowing down every image build.
  2. docker-compose.yml / K8s semantic-search.yaml — Added a volume mount that persists the HuggingFace model cache (/root/.cache/huggingface) to the host.
  3. requirements — sentence-transformers pulls in the GPU build of torch (~4.35GB) as a dependency by default; the install now uses the CPU-only build of torch (~1GB) instead.
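The cache mount in item 2 might look roughly like the following in docker-compose.yml. This is a minimal sketch; the service name, build path, and host directory are assumptions, and the real wiring is in the commit referenced below.

```yaml
# Sketch of the HuggingFace cache mount (assumed service and paths).
services:
  semantic-search:
    build: ./apps/semantic-search
    volumes:
      # Persist the model cache on the host so the ~400MB model is
      # downloaded once and reused across rebuilds and pod restarts.
      - ./data/semantic-search/model-cache:/root/.cache/huggingface
```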

The model is downloaded the first time the service starts up and is cached to the host volume. All subsequent deploys, image rebuilds, and pod restarts load it from the volume — the same pattern as the existing FAISS index persistence via hostPath.

The same applies to local dev: the model is downloaded only the first time you run docker compose up --build -d. After that, it is read directly from ./data/semantic-search/model-cache/.

The detailed solution is in this commit: 192ad51
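For item 3, one common way to force the CPU-only torch wheels is an extra package index in requirements.txt. This is a sketch of the technique, not necessarily the exact change in the commit above.

```
# requirements.txt (sketch): resolve torch from the CPU-only wheel index
--extra-index-url https://download.pytorch.org/whl/cpu
torch
sentence-transformers
```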

Result

The semantic-search image now builds faster than the other images. Total build time in GitHub Actions drops by ~86.5% (from 20 minutes to 2m 42s).
image

search bar:
image

vaclisinc and others added 30 commits January 15, 2026 17:20
…y context

When using git URL context with subdirectory (:apps/semantic-search),
the file path must be relative to that subdirectory, not repo root.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The label should be app.kubernetes.io/name=semantic-search, not the full deployment name

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Pre-download BAAI/bge-base-en-v1.5 model during Docker build
  so container doesn't need to download 420MB on every startup
- Increase startupProbe to 10 minutes (from 5) for safety

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…erms and (2) make index save in disk -> not deleted by every deployment
- Restore deleted semantic-search module files (client.ts, controller.ts, requirements.txt)
- Re-add semantic search routes to express loader
- Restore ClassBrowser AI search UI components
- Update fuzzy-find imports to use @repo/common
- Add semantic-search to typedef validation exclusions
- Restore semantic search config in packages/common

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Change import from @repo/common to @repo/common/models
- Add explicit type annotation for termsWithClasses.map

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Re-queue failed index builds with exponential backoff (up to 10 rounds)
- Retry entire startup cycle when backend isn't ready yet
- Enable PVC for dev environments so indexes persist across pod restarts

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Datapuller needs this to call /refresh on the semantic search service
after updating class data.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
No longer needed since we use hostPath instead of PVC.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@vaclisinc vaclisinc changed the title Fix/semantic search infra Fix/semantic search image build long time Feb 26, 2026
@vaclisinc vaclisinc changed the title Fix/semantic search image build long time Fix/semantic search image build time Feb 26, 2026
@vaclisinc vaclisinc requested a review from ARtheboss February 26, 2026 01:10

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e5bae94653


Comment on lines +49 to +53
const result = (await response.json()) as { size?: number };
log.info(
  `[Semantic Search] Refreshed ${term.name}: ${result.size ?? "unknown"} courses indexed`
);
return true;


P1: Handle async refresh status before marking term refreshed

refreshSemanticSearchForTerm treats every 2xx response as a successful rebuild, but /refresh is asynchronous and can return {"status":"already_building"} while another term is still in progress (see SemanticSearchEngine.refresh_async). In a multi-term run, later requests can be acknowledged but never queued/built, yet this code still counts them as refreshed, leaving semantic indexes stale for those terms.

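One way to act on this suggestion is to inspect the response body instead of only the HTTP status, and retry when another build is in flight. This is a sketch in TypeScript; the response shape and the helper names are assumptions based on the review comment, not the actual service contract.

```typescript
// Assumed response shape of POST /refresh (per the review comment).
type RefreshResponse = { status?: string; size?: number };

// Only count a 2xx response as a successful rebuild when the service
// did not report that another term's build is already in progress.
function wasRefreshQueued(body: RefreshResponse): boolean {
  return body.status !== "already_building";
}

// Re-issue the refresh with exponential backoff until it is queued,
// instead of silently leaving the term's index stale.
async function refreshUntilQueued(
  doRefresh: () => Promise<RefreshResponse>,
  maxAttempts = 5,
  baseDelayMs = 500
): Promise<boolean> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    if (wasRefreshQueued(await doRefresh())) return true;
    await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** attempt));
  }
  return false;
}
```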

);
const filteredClasses = useMemo(() => {
  // If AI search is active and we have semantic results, filter by those
  if (aiSearchActive && semanticResults.length > 0) {


P2: Keep AI search mode from falling back on empty semantic hits

When AI search is enabled, filteredClasses only uses semantic results if semanticResults.length > 0; an empty semantic response therefore falls through to normal fuzzy search. That means users can run “AI Search” and still see keyword matches even though semantic search found nothing, which produces misleading results for no-hit queries.

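A sketch of the fix the reviewer is describing: when AI search is active, filter strictly by the semantic hits, even when there are none. The names here (filterClasses, fuzzyFilter, Course) are illustrative, not the component's real identifiers.

```typescript
type Course = { id: string };

// When AI search is on, always use the semantic result set, so an
// empty semantic response shows "no matches" rather than silently
// falling back to keyword/fuzzy matches.
function filterClasses(
  aiSearchActive: boolean,
  semanticIds: string[],
  all: Course[],
  fuzzyFilter: (all: Course[]) => Course[]
): Course[] {
  if (aiSearchActive) {
    const hits = new Set(semanticIds);
    return all.filter((c) => hits.has(c.id));
  }
  return fuzzyFilter(all);
}
```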

@github-actions

github-actions bot commented Mar 5, 2026

Linting Failed

Note: The status check will always pass. Run npm run lint -- --continue to see the full output locally.

Click to expand lint output

> lint
> turbo run lint --continue --output-logs=errors-only



• Packages in scope: @repo/common, @repo/eslint-config, @repo/gql-typedefs, @repo/shared, @repo/sis-api, @repo/storybook, @repo/theme, @repo/typescript-config, ag-frontend, api-sandbox, backend, datapuller, frontend, staff-frontend
• Running lint in 14 packages
• Remote caching disabled
frontend:lint
cache miss, executing 3dc199213b2eaa24

> lint
> eslint src/


/home/runner/work/berkeleytime/berkeleytime/apps/frontend/src/components/BubbleCard/index.tsx
  106:10  warning  Fast refresh only works when a file only exports components. Use a new file to share constants or functions between components  react-refresh/only-export-components

/home/runner/work/berkeleytime/berkeleytime/apps/frontend/src/components/Capacity/index.tsx
  9:14  warning  Fast refresh only works when a file only exports components. Use a new file to share constants or functions between components  react-refresh/only-export-components

/home/runner/work/berkeleytime/berkeleytime/apps/frontend/src/components/Chart/ChartContext.tsx
  7:17  warning  Fast refresh only works when a file only exports components. Use a new file to share constants or functions between components  react-refresh/only-export-components

/home/runner/work/berkeleytime/berkeleytime/apps/frontend/src/components/Chart/index.tsx
   5:10  warning  Fast refresh only works when a file only exports components. Use a new file to share constants or functions between components  react-refresh/only-export-components
   8:3   warning  Fast refresh only works when a file only exports components. Use a new file to share constants or functions between components  react-refresh/only-export-components
   9:3   warning  Fast refresh only works when a file only exports components. Use a new file to share constants or functions between components  react-refresh/only-export-components
  10:3   warning  Fast refresh only works when a file only exports components. Use a new file to share constants or functions between components  react-refresh/only-export-components
  11:3   warning  Fast refresh only works when a file only exports components. Use a new file to share constants or functions between components  react-refresh/only-export-components

/home/runner/work/berkeleytime/berkeleytime/apps/frontend/src/components/ClassBrowser/List/index.tsx
  323:3  error  Parsing error: '}' expected

/home/runner/work/berkeleytime/berkeleytime/apps/frontend/src/components/ScheduleSummary/index.tsx
  11:14  warning  Fast refresh only works when a file only exports components. Use a new file to share constants or functions between components  react-refresh/only-export-components

✖ 10 problems (1 error, 9 warnings)

npm error Lifecycle script `lint` failed with error:
npm error code 1
npm error path /home/runner/work/berkeleytime/berkeleytime/apps/frontend
npm error workspace frontend
npm error location /home/runner/work/berkeleytime/berkeleytime/apps/frontend
npm error command failed
npm error command sh -c eslint src/
[WARN] command finished with error, but continuing...
::error::frontend#lint: command (/home/runner/work/berkeleytime/berkeleytime/apps/frontend) /opt/hostedtoolcache/node/22.12.0/x64/bin/npm run lint exited (1)

 Tasks:    6 successful, 7 total
Cached:    0 cached, 7 total
  Time:    11.475s 
Failed:    frontend#lint

 ERROR  run failed: command  exited (1)
