shvadoodi/simple-model

🗞️ News Category Predictor

This project trains a BERT-based model to classify news headlines into categories such as Politics, Sports, Crime, etc., using a labeled dataset (news_dataset.csv).


🧰 Setup

1. Install Dependencies

Note that some of the requirements may need to be modified to work on your system.

pip install -r requirements.txt

2. Prepare or Train BERT Model

The project expects a trained model and tokenizer in the bert_news_model/ folder.

To train the model locally:

python train_bert.py

This will save:

  • bert_news_model/ → trained BERT model
  • label_encoder.pkl → label encoder
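As a rough illustration of what label_encoder.pkl might hold, the sketch below builds a label-to-index mapping with a plain dict and round-trips it through pickle. This is an assumption for illustration only: train_bert.py may use a different encoder (for example scikit-learn's LabelEncoder), and the sample labels and the helper name build_label_mapping are hypothetical.

```python
import pickle


def build_label_mapping(labels):
    """Map each distinct label to an integer index, in sorted order."""
    return {label: idx for idx, label in enumerate(sorted(set(labels)))}


# Hypothetical labels; the real ones come from news_dataset.csv.
sample_labels = ["Politics", "Sports", "Crime", "Sports"]
mapping = build_label_mapping(sample_labels)

# Round-trip through pickle in memory (the project writes a file instead).
restored = pickle.loads(pickle.dumps(mapping))
```

Sorting the labels before numbering them keeps the indices stable across runs, which matters because the prediction step has to decode indices with the same mapping used in training.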

3. Predict a Headline

To test the model on a sample headline:

python predict_bert.py
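Conceptually, classification ends by turning the model's raw scores into a category name. The following is a minimal sketch of that final step, assuming the model emits one score (logit) per category. The scores, the index-to-label table, and the function names are hypothetical and are not taken from predict_bert.py.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_category(logits, index_to_label):
    """Return the most likely label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return index_to_label[best], probs[best]


# Hypothetical mapping and scores, for illustration only.
index_to_label = {0: "Crime", 1: "Politics", 2: "Technology"}
category, confidence = predict_category([0.2, -1.0, 3.1], index_to_label)
```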

📁 Files

  • train_bert.py: Fine-tunes BERT using your news dataset.
  • predict_bert.py: Loads the model and classifies a test headline.
  • news_dataset.csv: Your labeled training data (must have text, label columns).
  • bert_news_model/: Output folder for the trained model.
  • label_encoder.pkl: Stores label-to-index mapping.
  • suppress_warnings.py: Silences TF/PyTorch startup logs.
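Since news_dataset.csv must have text and label columns, it can help to check the header before starting a long training run. The sketch below does this with Python's standard csv module. The check_columns helper and the in-memory sample row are assumptions for illustration and are not part of the repository.

```python
import csv
import io

REQUIRED_COLUMNS = {"text", "label"}


def check_columns(csv_file):
    """Return the set of required columns missing from the CSV header."""
    reader = csv.DictReader(csv_file)
    header = set(reader.fieldnames or [])
    return REQUIRED_COLUMNS - header


# In-memory stand-in for the real file, used only to demonstrate the check.
sample = io.StringIO(
    "text,label\nGoogle released a new tool for developers.,Technology\n"
)
missing = check_columns(sample)
```

To check the real file, pass an open handle instead, for example open("news_dataset.csv", newline="", encoding="utf-8"). An empty result means both columns are present.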

📌 Example

Summary: Google released a new tool for developers.
Predicted Category: Technology
Summary: A man was arrested in Toronto after a stabbing incident.
Predicted Category: Crime

✅ Notes

  • A GPU is recommended for faster training (torch.cuda.is_available() should return True); training on CPU will be much slower.
  • Compatible with Windows, Linux, or WSL environments.
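To see whether PyTorch can use a GPU on your machine, a quick check along these lines may help. The pick_device helper is a hypothetical name for this sketch. It falls back to "cpu" when PyTorch is not installed, which is a choice made for this example and not necessarily how train_bert.py behaves.

```python
def pick_device():
    """Return "cuda" if PyTorch sees a GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # PyTorch is missing; report CPU so the check still runs.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Training would run on: {device}")
```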

About

Process of creating a model using BERT and Keras (for learning purposes)
