ML Proteomics

Machine Learning models and pipelines to predict proteomics values from mRNA expression data.

Official Codebase for the Paper:

Ochoteco-Asensio, J., et al. (2022). "Predicting missing proteomics values from mRNA expression data using machine learning". Computational and Structural Biotechnology Journal.
DOI: 10.1016/j.csbj.2022.04.017

Project Overview

This repository contains the implementation of the machine learning strategies described in the paper. The study demonstrates how transcriptomics data can be used to accurately predict missing protein abundance levels by combining Recursive Feature Elimination (RFE) with various regression models.

Directory Structure

- scripts/: Core logic and analysis scripts.
  - modelling/: Model training and evaluation logic.
  - recursive_feature_elimination/: RFE pipelines.
  - data_cleaning/: Pre-processing and normalization.
  - go_terms_analysis/: Functional enrichment analysis of prioritized features.
  - utils/: Shared utility functions (functions_JOA.R).
- data/: Input datasets (RDS format).
- output/: Generated results, including plots and model metrics.

Key Features

RFE Pipeline: Integrated Recursive Feature Elimination using caret.
Parallel Processing: Support for multi-core execution via doParallel.
Comprehensive Visualization: Automated plotting of model performance and feature importance.
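
The RFE and parallel-processing features listed above can be illustrated with a minimal sketch. It is not taken from the repository's scripts: the simulated data, the `gene*` column names, the choice of `lmFuncs` (linear regression), the subset sizes, and the two-worker cluster are all assumptions made for illustration. It only shows how `caret::rfe` and a `doParallel` backend typically fit together.

```r
library(caret)
library(doParallel)

set.seed(1)

# Simulated stand-in data (NOT from this repository): 60 samples x 20 "transcripts".
n <- 60
p <- 20
mrna <- as.data.frame(matrix(rnorm(n * p), nrow = n,
                             dimnames = list(NULL, paste0("gene", seq_len(p)))))

# Hypothetical protein abundance driven by the first three transcripts plus noise.
protein <- 2 * mrna$gene1 - 1.5 * mrna$gene2 + mrna$gene3 + rnorm(n, sd = 0.5)

# Register a small parallel backend; caret runs its resampling loop through foreach.
cl <- parallel::makeCluster(2)
doParallel::registerDoParallel(cl)

# 5-fold cross-validated RFE with linear-regression helper functions (an assumption;
# the paper evaluates several regression models).
ctrl <- rfeControl(functions = lmFuncs, method = "cv", number = 5,
                   allowParallel = TRUE)
fit <- rfe(x = mrna, y = protein, sizes = c(2, 3, 5, 10), rfeControl = ctrl)

# Shut the workers down and fall back to sequential execution.
parallel::stopCluster(cl)
foreach::registerDoSEQ()

print(predictors(fit))  # transcripts retained in the selected subset
print(fit$results)      # resampled performance for each subset size
```

`fit$results` reports the resampled RMSE and R-squared for each candidate subset size, which is the kind of table a performance plot would be drawn from.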

Usage

Most scripts are designed to be run from the project root. Opening the ml_proteomics.Rproj file in RStudio sets the working directory to the project root.
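
When running outside RStudio, a small guard can confirm that the working directory is the project root before reading anything from data/. The sketch below is an illustration only: `at_project_root` and `read_project_rds` are hypothetical helpers that do not exist in the repository, and any file name passed to them is a placeholder.

```r
# Hypothetical helper (not part of the repository): TRUE when `dir` contains
# the project file named in this README.
at_project_root <- function(dir = getwd()) {
  file.exists(file.path(dir, "ml_proteomics.Rproj"))
}

# Hypothetical wrapper: read an RDS file from data/ relative to the project root,
# failing early with a clear message when called from the wrong directory.
read_project_rds <- function(file_name, root = getwd()) {
  if (!at_project_root(root)) {
    stop("Run this from the project root (the folder containing ml_proteomics.Rproj).")
  }
  readRDS(file.path(root, "data", file_name))
}
```

Failing early keeps relative paths such as data/ and output/ from silently resolving against an unrelated directory.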

Developed by Juan Ochoteco Asensio
