A reproducible pipeline for EEG-based biomarker extraction using the Linear Observer Control Framework (LOCF), with a dashboard for cohort- and subject-level visualization.
Project Supervisor: Zheng Wang
This platform:
- Ingests raw EEG data (starting with the LEMON dataset)
- Preprocesses and quality-checks recordings
- Extracts standard EEG features and LOCF-derived biomarkers (K_norm, L_norm, etc.)
- Stores outputs in a structured biomarker database
- Displays results in an interactive dashboard
Longer-term targets: sleep, mood, depression, meditation, PBM, digital-twin applications.
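The stages above could eventually be chained in a single per-subject driver. The sketch below is purely illustrative — none of these function bodies exist in the repo; it only shows the intended data flow (ingest → preprocess → features/biomarkers → database row):

```python
# Illustrative data flow only; the real stages live under src/.
# All values and thresholds here are stand-ins, not project logic.

def run_subject(subject_id: str) -> dict:
    raw = {"subject": subject_id, "eeg": [0.1, -0.2, 3.5, 0.05]}  # ingestion stand-in
    clean = [x for x in raw["eeg"] if abs(x) < 1.0]               # crude artifact rejection
    features = {"mean_amp": sum(clean) / len(clean)}              # standard-feature stand-in
    biomarkers = {"K_norm": 0.0, "L_norm": 0.0}                   # LOCF computation goes here
    return {"subject": subject_id, **features, **biomarkers}

record = run_subject("sub-0001")  # one row destined for the biomarker database
```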
```
src/
  ingestion/      # data loading, metadata parsing
  preprocessing/  # artifact rejection, filtering, epoching
  features/       # standard EEG feature extraction
  biomarkers/     # LOCF biomarker computation
  database/       # schema, write/read utilities
  dashboard/      # Dash/Streamlit app
scripts/          # CLI entry points
notebooks/        # exploratory and onboarding notebooks
configs/          # pipeline configs and path templates
docs/             # biomarker dictionary, design docs
tests/            # unit and smoke tests
outputs/          # generated outputs (not committed)
```
| Name | Owns |
|---|---|
| Pawan | preprocessing, feature extraction, biomarker extraction, LEMON validation |
| Vedansh | database schema, backend/API, dashboard UI |
| Aditya | data ingestion, metadata parsing, orchestration, config, reproducibility |
```
git clone <repo-url>
cd eeg-biomarker-platform
```

```
conda env create -f environment.yml
conda activate eeg-biomarker-platform
```

or

```
pip install -r requirements.txt
```

```
cp configs/paths.example.yaml configs/paths.local.yaml
```

Edit `configs/paths.local.yaml` to point to your local LEMON data (synced from Google Drive).

```
jupyter notebook notebooks/00_environment_check.ipynb
python scripts/run_preprocessing.py --config configs/preprocessing.yaml --subject sub-0001
python scripts/run_biomarkers.py --config configs/biomarkers.yaml --subject sub-0001
python scripts/build_database.py
python scripts/launch_dashboard.py
```

- Do not commit raw dataset files to GitHub.
- Raw LEMON data and shared example notebooks live in Google Drive.
- Reference data paths through `configs/paths.local.yaml`; never hard-code paths. `configs/paths.example.yaml` is the committed template; `paths.local.yaml` is gitignored.
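A minimal `paths.local.yaml` might look like the fragment below. The key names are illustrative assumptions, not the committed schema — copy `configs/paths.example.yaml` for the real template:

```yaml
# Illustrative only; the actual keys are defined in configs/paths.example.yaml.
lemon_raw_dir: /path/to/LEMON/raw   # local copy synced from Google Drive
outputs_dir: ./outputs              # generated files, gitignored
```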
| Type | Location |
|---|---|
| QC summaries | outputs/qc/ |
| Subject biomarker tables | outputs/biomarkers/ |
| Cohort summary tables | outputs/cohort/ |
Example:

```
outputs/qc/sub-0001_ses-rest_run-01_qc.csv
outputs/biomarkers/sub-0001_ses-rest_run-01_biomarkers.parquet
```
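The output names follow a BIDS-style `sub-*_ses-*_run-*` pattern. A small helper like this hypothetical one (not existing project code) could keep naming consistent across modules:

```python
# Hypothetical helper to build output filenames in the
# sub-<id>_ses-<session>_run-<NN>_<kind>.<ext> pattern shown above.

def output_name(subject: str, session: str, run: int, kind: str, ext: str) -> str:
    """output_name('0001', 'rest', 1, 'qc', 'csv') -> 'sub-0001_ses-rest_run-01_qc.csv'"""
    return f"sub-{subject}_ses-{session}_run-{run:02d}_{kind}.{ext}"

print(output_name("0001", "rest", 1, "biomarkers", "parquet"))
# sub-0001_ses-rest_run-01_biomarkers.parquet
```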
- Every pull request must be linked to an issue.
- Notebook logic must be converted to scripts/modules before merging.
- Use standard file and variable naming across modules.
- Every major feature needs at least one test or smoke check.
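A smoke check can be as small as asserting one invariant of a module's output. This sketch assumes a hypothetical biomarker-record shape rather than real project code, and runs under pytest or as a plain script:

```python
# Hypothetical smoke check: a biomarker record must carry the expected
# columns with finite values before it enters the database.
import math

def check_biomarker_record(record: dict) -> bool:
    required = {"subject", "K_norm", "L_norm"}
    if not required <= record.keys():
        return False
    return all(math.isfinite(record[k]) for k in ("K_norm", "L_norm"))

def test_smoke_biomarker_record():
    assert check_biomarker_record({"subject": "sub-0001", "K_norm": 0.42, "L_norm": 0.87})
    assert not check_biomarker_record({"subject": "sub-0001"})  # missing columns

test_smoke_biomarker_record()
```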
| # | Milestone |
|---|---|
| 1 | Onboarding and replication |
| 2 | LEMON ingestion and preprocessing |
| 3 | Biomarker database |
| 4 | Dashboard prototype |
| 5 | Prototype release |
See docs/milestones.md for detailed checklists.