Fact Checking Auto-ML

This project is a small AutoML-style NLP pipeline for binary fact checking. It trains classifiers that decide whether a claim is supported by a piece of evidence.

The current dataset is intentionally small and local. data.csv contains 30 rows with balanced labels:

1: supported / true claim-evidence pair
0: unsupported / false claim-evidence pair

Project Structure

.
├── data.csv              # Input dataset with claim, evidence, and label columns
├── features.py           # Text feature extractors
├── models.py             # Model factory for supported classifiers
├── run.py                # Training, Optuna search, evaluation, and experiment saving
├── requirements.txt      # Python dependencies
├── Makefile              # Setup and run shortcuts
└── experiments/          # Saved trial configs and F1 scores

Pipeline

run.py performs the full experiment workflow:

1. Load data.csv.
2. Split the data into train and test sets with an 80/20 split.
3. Use Optuna to run 10 trials.
4. For each trial, choose one feature extractor:
   - v1: TfidfVectorizer
   - v2: CountVectorizer with unigram and bigram features
5. For each trial, choose one model:
   - logreg: LogisticRegression
   - rf: RandomForestClassifier
6. Train the selected model and evaluate it with F1 score.
7. Save each trial under experiments/exp_<trial_id>/.
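
The workflow above can be sketched roughly as follows. This is an illustration, not the contents of run.py. The FEATURES factories, the split_data, build_model, and run_study helpers, the hyperparameter names (C, n_estimators) and their ranges, and the way claim and evidence are joined into one string are all assumptions made for the example. The stratify flag is likewise hypothetical and is off by default to match the current behaviour.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Assumed registry: feature name -> factory returning a fresh vectorizer.
FEATURES = {
    "v1": lambda: TfidfVectorizer(),
    "v2": lambda: CountVectorizer(ngram_range=(1, 2)),
}


def split_data(texts, labels, stratify=False):
    """80/20 split with a fixed seed; stratify=True is an optional extra."""
    return train_test_split(
        texts,
        labels,
        test_size=0.2,
        random_state=42,
        stratify=labels if stratify else None,
    )


def build_model(trial, name):
    """Return a classifier; hyperparameter names and ranges are illustrative."""
    if name == "logreg":
        c = trial.suggest_float("C", 1e-2, 1e2, log=True)
        return LogisticRegression(C=c, max_iter=1000)
    if name == "rf":
        n_estimators = trial.suggest_int("n_estimators", 50, 300)
        return RandomForestClassifier(n_estimators=n_estimators, random_state=42)
    raise ValueError(f"unknown model: {name}")


def objective(trial, texts, labels):
    """One trial: pick a feature extractor and a model, return test-set F1."""
    feature_name = trial.suggest_categorical("feature", list(FEATURES))
    model_name = trial.suggest_categorical("model", ["logreg", "rf"])
    x_train, x_test, y_train, y_test = split_data(texts, labels)
    pipeline = make_pipeline(FEATURES[feature_name](), build_model(trial, model_name))
    pipeline.fit(x_train, y_train)
    return f1_score(y_test, pipeline.predict(x_test))


def run_study(csv_path="data.csv", n_trials=10):
    """Load the CSV, run the search, and return the finished Optuna study."""
    # Imported lazily so the helpers above work without Optuna or pandas.
    import optuna
    import pandas as pd

    df = pd.read_csv(csv_path)
    # Assumption: claim and evidence are joined into one text field.
    texts = (df["claim"] + " " + df["evidence"]).tolist()
    labels = df["label"].tolist()
    study = optuna.create_study(direction="maximize")
    study.optimize(lambda trial: objective(trial, texts, labels), n_trials=n_trials)
    return study
```

Under these assumptions, calling run_study() from the repository root would run the search, after which study.best_params and study.best_value hold the best parameter set and its F1 score.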

Each experiment folder contains:

config.json: feature choice, model choice, and hyperparameters
score.json: F1 score for that trial
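
A minimal sketch of how such a folder could be written and read back, using only the standard library. The helper names save_trial and load_scores and the "f1" key inside score.json are assumptions for illustration; the actual key names used by run.py may differ.

```python
import json
from pathlib import Path


def save_trial(base_dir, trial_id, config, f1):
    """Write config.json and score.json under <base_dir>/exp_<trial_id>/."""
    exp_dir = Path(base_dir) / f"exp_{trial_id}"
    exp_dir.mkdir(parents=True, exist_ok=True)
    (exp_dir / "config.json").write_text(json.dumps(config, indent=2))
    # Assumption: the score file holds a single "f1" key.
    (exp_dir / "score.json").write_text(json.dumps({"f1": f1}, indent=2))
    return exp_dir


def load_scores(base_dir):
    """Return {experiment folder name: parsed score.json} for every saved trial."""
    return {
        path.parent.name: json.loads(path.read_text())
        for path in sorted(Path(base_dir).glob("exp_*/score.json"))
    }
```

With this layout, load_scores("experiments") would give a quick overview of all recorded trials without rerunning the search.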

The recorded experiments currently report an F1 score of 0.8 for all 10 trials.
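
To put that number in context, the sketch below enumerates every confusion-matrix outcome on a test set of a given size that yields a given F1 score. It assumes the 80/20 split of 30 rows leaves 6 rows for testing. The function name is made up for this example, and the code says nothing about which outcome the recorded trials actually produced.

```python
from fractions import Fraction
from itertools import product


def confusion_counts_with_f1(n_samples, target):
    """List (tp, fp, fn, tn) tuples summing to n_samples whose F1 equals target."""
    matches = []
    for tp, fp, fn in product(range(n_samples + 1), repeat=3):
        tn = n_samples - tp - fp - fn
        denominator = 2 * tp + fp + fn
        if tn < 0 or denominator == 0:
            continue  # not a valid split of n_samples, or F1 undefined
        # F1 = 2*TP / (2*TP + FP + FN), compared exactly via fractions.
        if Fraction(2 * tp, denominator) == target:
            matches.append((tp, fp, fn, tn))
    return matches


# Assumption: 30 rows with an 80/20 split leave 6 rows for testing.
for counts in confusion_counts_with_f1(6, Fraction(4, 5)):
    print(counts)
```

On 6 test rows only a handful of outcomes give exactly 0.8, and a single misclassified row is enough to reach it. Different feature and model choices can therefore land on the same score.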

Setup

Create a virtual environment and install dependencies:

make setup

This runs:

python3 -m venv .venv
.venv/bin/pip install --upgrade pip
.venv/bin/pip install -r requirements.txt

Run

Run the AutoML search:

make run

Or run the script directly:

.venv/bin/python run.py

The script prints the label distribution, runs the Optuna study, saves trial outputs to experiments/, and prints the best parameter set and best F1 score.

Data Format

data.csv must contain these columns:

claim,evidence,label

Example:

"Paris is capital of France","Paris is the capital city of France",1
"Apple is a fruit","Apple Inc makes phones",0

The script expects at least two labels and at least two samples per class.
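
These requirements can be checked before starting a search. The following is a sketch that uses only the standard library and assumes data.csv has a header row with the column names above. The function names and error messages are illustrative and are not taken from run.py.

```python
import csv
from collections import Counter

REQUIRED_COLUMNS = ("claim", "evidence", "label")


def validate_rows(rows):
    """Check a sequence of dict rows against the documented data.csv rules."""
    rows = list(rows)
    if not rows:
        raise ValueError("dataset is empty")
    missing = [col for col in REQUIRED_COLUMNS if col not in rows[0]]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    counts = Counter(row["label"] for row in rows)
    if len(counts) < 2:
        raise ValueError("need at least two distinct labels")
    sparse = [label for label, n in counts.items() if n < 2]
    if sparse:
        raise ValueError(f"labels with fewer than two samples: {sparse}")
    return counts


def validate_csv(path):
    """Read a CSV file with a header row and validate its contents."""
    with open(path, newline="", encoding="utf-8") as handle:
        return validate_rows(csv.DictReader(handle))
```

validate_csv("data.csv") returns the per-label counts when the file passes and raises ValueError otherwise. Labels are compared as strings here because csv.DictReader does not convert types.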

Dependencies

Main libraries:

pandas
scikit-learn
optuna
numpy

mlflow and joblib are listed in requirements.txt, but the current scripts do not use them yet.

Notes

The dataset is very small, so the recorded scores should be treated as a demonstration result rather than a reliable benchmark.
train_test_split uses random_state=42, but it does not currently stratify by label.
New feature extractors can be added in features.py and registered in the FEATURES dictionary in run.py.
New models can be added in models.py and included in the Optuna objective in run.py.
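
As a rough sketch of the last two notes, the example below adds one extractor and one model. The factory-function layout, the MODELS dictionary, the get_model helper, and the new "v3" and "svm" entries are all hypothetical. Only the FEATURES dictionary name and the v1, v2, logreg, and rf choices come from this README.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC


# features.py (sketch): one factory function per extractor.
def tfidf_words():
    return TfidfVectorizer()


def count_uni_bigrams():
    return CountVectorizer(ngram_range=(1, 2))


def tfidf_char_ngrams():
    """Hypothetical new extractor: character 2- to 4-grams within word bounds."""
    return TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4))


# run.py (sketch): registering the new extractor is a one-line change.
FEATURES = {
    "v1": tfidf_words,
    "v2": count_uni_bigrams,
    "v3": tfidf_char_ngrams,  # hypothetical addition
}

# models.py (sketch): map each model name to its estimator class.
MODELS = {
    "logreg": LogisticRegression,
    "rf": RandomForestClassifier,
    "svm": LinearSVC,  # hypothetical addition
}


def get_model(name, **params):
    """Build the named classifier with the given hyperparameters."""
    if name not in MODELS:
        raise ValueError(f"unknown model: {name}")
    return MODELS[name](**params)
```

For the search to consider the new entries, the Optuna objective in run.py would also need "v3" and "svm" among its categorical choices, along with any hyperparameters the new model should tune.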
