Master Thesis Code

Introduction

This repository contains all scripts developed for my master's thesis research: A cohort and molecular epidemiological study of the association between platelet count and colorectal cancer survival.

The project aims to explore the relationship between platelet count and colorectal cancer survival using data from the UK Biobank and West China cohorts. The repository is designed to ensure reproducibility, transparency, and robustness in the computational workflows associated with the thesis.

File Descriptions

File	Description
`00a_ukb_survival_data.py`	Processes survival data for the UK Biobank cohort.
`00b_ukb_extra_data.py`	Processes additional data for the UK Biobank cohort.
`01a_hx_survival_data.py`	Processes survival data for the West China cohort.
`01b_hx_extra_data.py`	Processes additional data for the West China cohort.
`02_baseline_stats.R`	Generates baseline statistics for both cohorts.
`03_survival_curve.R`	Creates survival curves for the UK Biobank and West China cohorts.
`04_coxph_forest.R`	Fits Cox proportional hazards models and generates forest plots.
`05_coxph_rcs.R`	Visualizes restricted cubic splines for Cox proportional hazards models.
`06_coxph_rolling.R`	Implements rolling Cox proportional hazards models.
`07_genetic_instruments.R`	Identifies genetic instruments for Mendelian randomization.
`08_gwas_info.R`	Processes GWAS-related information for the West China cohort.
`09_plink_data.sh`	Processes SNP data using PLINK, including sex discrepancy checks, PCA, and SNP extraction.
`10_gwas_coxph.R`	Conducts Cox proportional hazards regression for SNP data.
`11_2smr.R`	Performs two-sample Mendelian randomization analysis.
`12_prs.R`	Calculates polygenic risk scores.
`13_nlmr.R`	Conducts nonlinear Mendelian randomization analysis.
`14_eqtl_mr.R`	Performs eQTL Mendelian randomization analysis.
`15_gene_survival.R`	Analyzes the association between gene expression and survival.
`16_gene_enrichment.R`	Conducts gene enrichment analysis.
`run_all.sh`	A bash script to sequentially execute all numbered analysis scripts and log the results.

Usage

This project uses Python, R, and PLINK. Ensure all three are installed and correctly configured.
The Python environment is managed by uv. Run `uv sync` in the terminal to reproduce it.
The R environment is managed by the renv package. Run `renv::restore()` in R to rebuild it.
PLINK version 1.9 is required. Ensure it is available in `$PATH`.
Run the scripts in order of their numerical prefixes, or execute `run_all.sh` to automate the process.
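
For readers who want to see what running the scripts in order looks like, the following is a minimal sketch of a sequential runner. It is not the contents of `run_all.sh`, which may work differently. The function names `runner_for` and `run_numbered`, the log format, and the choice of `uv run python` and `Rscript` as interpreters are all assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a sequential runner; the real run_all.sh may differ.

# Map a script's file extension to the command used to run it (assumed mapping).
runner_for() {
  case "$1" in
    *.py) echo "uv run python" ;;
    *.R)  echo "Rscript" ;;
    *.sh) echo "bash" ;;
    *)    return 1 ;;
  esac
}

# Run every file whose name starts with a two-digit prefix, in glob (sorted)
# order, appending all output to a log file and stopping at the first failure.
run_numbered() {
  local dir="$1" log="$2" script runner
  for script in "$dir"/[0-9][0-9]*_*; do
    [ -f "$script" ] || continue
    # Skip files with an extension this sketch does not know how to run.
    runner="$(runner_for "$script")" || continue
    echo "== $script" >> "$log"
    # $runner is intentionally unquoted so "uv run python" splits into words.
    $runner "$script" >> "$log" 2>&1 || return 1
  done
}
```

With these functions sourced, `run_numbered . run_all.log` would run the numbered scripts in the current directory and collect their output in `run_all.log`. Stopping at the first failure is a design choice for this sketch, on the reasoning that later steps depend on the outputs of earlier ones.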

Author

Changtao Li

License

This project is licensed under the MIT License.
