
ocr_digital_screen_reader

ocr_digital_screen_reader is an open-source Python tool for reading numbers or text from digital screens in images. It is designed especially for cropped screen segments from devices such as meters, oscilloscopes, digital panels, and lab equipment.


Features

  • Deep learning-based OCR: Uses a CRNN model for accurate character recognition.
  • Ready for digit and character screens: Trained on images of real device displays.
  • Command-line interface: Predict text from a single image or a batch of images.
  • Extensible: Ready for integration into larger automated pipelines or real-time workflows.

Installation

Clone this repository and install the dependencies:

git clone https://github.com/emirbaycan/ocr_digital_screen_reader.git
cd ocr_digital_screen_reader
pip install -r requirements.txt

Usage

Predict text from a digital screen image:

python ocr_model_test.py \
  --model ocr_crnn.pth \
  --charmap char_to_idx.json \
  --input test_image.png
  • By default, the script uses ocr_crnn.pth as the model, char_to_idx.json as the character map, and test_image.png as the input image.
  • Override these defaults with the --model, --charmap, and --input arguments.
  • To save prediction to a file:
    python ocr_model_test.py --input your_image.png --output result.txt

Example Output:

🔹 Predicted Text: 358
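A CRNN trained with CTC outputs a per-timestep character distribution that includes a blank symbol; the final string is obtained by collapsing repeated indices and dropping blanks. The exact decoding code in ocr_model_test.py may differ, but the idea can be sketched as follows (assuming index 0 is the CTC blank and idx_to_char is the inverse of char_to_idx.json):

```python
def ctc_greedy_decode(timestep_indices, idx_to_char, blank=0):
    """Standard CTC best-path decoding: collapse repeats, then drop blanks."""
    out = []
    prev = None
    for idx in timestep_indices:
        if idx != prev and idx != blank:
            out.append(idx_to_char[idx])
        prev = idx
    return "".join(out)

# Hypothetical per-timestep argmax indices for a screen reading "358":
idx_to_char = {1: "3", 2: "5", 3: "8"}
print(ctc_greedy_decode([1, 1, 0, 2, 0, 0, 3, 3], idx_to_char))  # -> 358
```

Repeated indices ("1, 1") count as one character unless separated by a blank, which is why the blank symbol is essential for reading doubled digits such as "88".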

Model Training

You can train your own OCR model on images of your own device screens or on other digit/text datasets.

  1. Dataset Preparation

    • List each image path and its label in ocr_dataset.txt (one per line: <image_path> <label>).
    • Character mappings are stored in char_to_idx.json (auto-generated from your labels).
  2. Training

    • Use the training script to start training:
      python ocr_model_train.py
    • Training settings (epochs, batch size, etc.) can be edited at the top of the script.
  3. Augmentation

    • The training pipeline uses Albumentations for augmenting images (resize, blur, contrast, compression).
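The character map in char_to_idx.json is auto-generated from the dataset labels. The actual generation lives in the training code and may differ in detail, but a minimal sketch of the idea looks like this (reserving index 0 for the CTC blank, as is conventional for CRNN training):

```python
import json

def build_char_map(labels):
    """Map each character seen in the labels to an index, reserving 0 for the CTC blank."""
    chars = sorted({c for label in labels for c in label})
    return {c: i + 1 for i, c in enumerate(chars)}

# Example labels, one per line of ocr_dataset.txt:
labels = ["358", "12.7", "OFF"]
char_to_idx = build_char_map(labels)
print(json.dumps(char_to_idx, indent=2))
```

Sorting the character set before indexing makes the mapping deterministic, so regenerating the file from the same labels always yields the same indices.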

File Structure

ocr_digital_screen_reader/
│
├── checkpoints/         # Model checkpoints during training
├── cropped_screens/     # Example cropped input images
├── ocr_crnn.pth         # Trained model weights
├── char_to_idx.json     # Character to index mapping
├── ocr_labels.json      # Ground truth labels for dataset images
├── ocr_dataset.txt      # Paths and labels for each image
├── ocr_model_test.py    # Script for predicting text from image
├── ocr_model_train.py   # Script for training the model
├── ocr_model.py         # Model definition (CRNN)
├── requirements.txt     # Python dependencies
└── test_image.png       # Example test image

Requirements

All dependencies are listed in requirements.txt, including:

  • torch
  • torchvision
  • ultralytics
  • albumentations
  • numpy
  • opencv-python
  • Pillow
  • matplotlib
  • tqdm
  • av

Install all dependencies with:

pip install -r requirements.txt

Dataset Annotation

  • You can use any annotation method that outputs a simple text file in the <image_path> <label> format:
    path/to/image1.png 12345
    path/to/image2.png 67890
    
  • For larger or custom datasets, adapt the loader in ocr_create_dataset.py.
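A loader for this format only needs to split each line into a path and a label. As a starting point for adapting ocr_create_dataset.py, here is a minimal sketch (assuming labels contain no whitespace; splitting from the right lets image paths contain spaces):

```python
def parse_dataset_file(lines):
    """Parse `<image_path> <label>` lines into (path, label) pairs.

    Assumes labels contain no whitespace; blank lines are skipped.
    """
    samples = []
    for lineno, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        parts = line.rsplit(maxsplit=1)
        if len(parts) != 2:
            raise ValueError(f"line {lineno}: expected '<image_path> <label>', got {raw!r}")
        samples.append((parts[0], parts[1]))
    return samples

samples = parse_dataset_file(["path/to/image1.png 12345", "path/to/image2.png 67890"])
print(samples)  # -> [('path/to/image1.png', '12345'), ('path/to/image2.png', '67890')]
```

Raising on malformed lines (rather than silently skipping them) surfaces annotation mistakes before they corrupt a training run.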

Contributing

Feel free to open issues or pull requests for improvements, bug fixes, or new features.


License

MIT License


Contact

Developed by Emir Baycan (github.com/emirbaycan).

For questions or feedback, open an issue or contact via GitHub.

