ocr_digital_screen_reader is an open-source Python tool for reading numbers or text from digital screens in images—especially designed for use on cropped screen segments from devices such as meters, oscilloscopes, digital panels, or lab equipment.
- Deep learning-based OCR: Uses a CRNN model for accurate character recognition.
- Ready for digit and character screens: Trained on images of real device displays.
- Command-line interface: Predict text from a single image or batch easily.
- Extensible: Ready for integration into larger automated pipelines or real-time workflows.
Clone this repository and install the dependencies:
```sh
git clone https://github.com/emirbaycan/ocr_digital_screen_reader.git
cd ocr_digital_screen_reader
pip install -r requirements.txt
```

Predict text from a digital screen image:
```sh
python ocr_model_test.py \
    --model ocr_crnn.pth \
    --charmap char_to_idx.json \
    --input test_image.png
```

- By default, the script uses `ocr_crnn.pth` as the model, `char_to_idx.json` as the character map, and `test_image.png` as the input.
- You can change these paths with the `--model`, `--charmap`, and `--input` arguments.
- To save the prediction to a file:
```sh
python ocr_model_test.py --input your_image.png --output result.txt
```
Example Output:
```
🔹 Predicted Text: 358
```
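A CRNN emits one character distribution per horizontal slice of the image, and the final string is usually recovered with greedy CTC decoding (collapse repeated indices, drop blanks). The repository does not document its exact decoder, so the sketch below is illustrative only; the `blank=0` convention, the `idx_to_char` mapping, and the sample frame sequence are assumptions, not the project's actual values.

```python
def ctc_greedy_decode(frame_indices, idx_to_char, blank=0):
    """Collapse repeated indices and drop CTC blanks to recover the text."""
    decoded = []
    prev = None
    for idx in frame_indices:
        if idx != prev and idx != blank:  # skip blanks and immediate repeats
            decoded.append(idx_to_char[idx])
        prev = idx
    return "".join(decoded)

# Hypothetical per-frame argmax output for a screen reading "358":
idx_to_char = {1: "3", 2: "5", 3: "8"}  # blank = 0 (assumed convention)
frames = [0, 1, 1, 0, 2, 2, 2, 0, 3, 3]
print(ctc_greedy_decode(frames, idx_to_char))  # prints "358"
```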
You can train your own OCR model using your own device screens or other digit/text datasets.
1. **Dataset Preparation**
   - List each image path and its label in `ocr_dataset.txt` (one per line: `<image_path> <label>`).
   - Character mappings are stored in `char_to_idx.json` (auto-generated from your labels).
2. **Training**
   - Start training with:

     ```sh
     python ocr_model_train.py
     ```

   - Training settings (epochs, batch size, etc.) can be edited at the top of the script.
3. **Augmentation**
   - The training pipeline uses Albumentations for augmenting images (resize, blur, contrast, compression).
```
ocr_digital_screen_reader/
│
├── checkpoints/         # Model checkpoints during training
├── cropped_screens/     # Example cropped input images
├── ocr_crnn.pth         # Trained model weights
├── char_to_idx.json     # Character to index mapping
├── ocr_labels.json      # Ground truth labels for dataset images
├── ocr_dataset.txt      # Paths and labels for each image
├── ocr_model_test.py    # Script for predicting text from an image
├── ocr_model_train.py   # Script for training the model
├── ocr_model.py         # Model definition (CRNN)
├── requirements.txt     # Python dependencies
└── test_image.png       # Example test image
```
All dependencies are listed in requirements.txt, including:
- torch
- torchvision
- ultralytics
- albumentations
- numpy
- opencv-python
- Pillow
- matplotlib
- tqdm
- av
Install all dependencies with:
```sh
pip install -r requirements.txt
```

- You can use any annotation method that outputs a simple text file:

  ```
  path/to/image1.png 12345
  path/to/image2.png 67890
  ```

- For larger or custom datasets, adapt the loader in `ocr_create_dataset.py`.
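Since `char_to_idx.json` is auto-generated from the labels, the generation step can be sketched roughly as below. The actual logic lives in the repository's own scripts; details such as index ordering and a reserved index 0 for the CTC blank are assumptions here.

```python
import json

def build_char_map(dataset_lines):
    """Parse "<image_path> <label>" lines and build a char-to-index map.

    Index 0 is reserved for the CTC blank (an assumed convention).
    """
    chars = sorted({ch for line in dataset_lines
                    for ch in line.split(maxsplit=1)[1].strip()})
    return {ch: i + 1 for i, ch in enumerate(chars)}

lines = ["path/to/image1.png 12345", "path/to/image2.png 67890"]
char_to_idx = build_char_map(lines)
print(json.dumps(char_to_idx))  # digits "0"-"9" mapped to indices 1-10
```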
Feel free to open issues or pull requests for improvements, bug fixes, or new features.
MIT License
Developed by Emir Baycan ([GitHub](https://github.com/emirbaycan))
For questions or feedback, open an issue or contact via GitHub.