A Flask-based web application for translating Ukrainian international passports into Turkish and for multilingual translation of documents and images, with additional features such as named entity recognition (NER).
- Ukrainian Passport Translation: Translates Ukrainian international passports into Turkish.
- General Document Translation: Translates documents and images from any language into English, with named entity recognition using spaCy.
- Document Scanning: Detects and extracts document boundaries from uploaded images.
- Changelog: Tracks and displays changes made to translations.
- Email Notifications: Sends emails with custom content.
- Backend: Python, Flask
- Frontend: HTML, CSS, SCSS, JS
- Libraries:
  - OpenCV for image processing
  - spaCy for natural language processing
  - Mistral AI for image-to-text conversion
  - Flask for the web framework
  - ReportLab for PDF generation
  - NumPy for numerical operations
  - smtplib for email handling
git clone https://github.com/your-username/passport_translator.git
cd passport_translator
python -m venv venv
source venv/bin/activate # On Windows use `venv\Scripts\activate`
- Install the dependencies from requirements.txt:
pip install -r requirements.txt
- Download the spaCy model:
python -m spacy download en_core_web_trf
- Install PyTorch with GPU support (if applicable): Visit the PyTorch installation page and follow the instructions to install the appropriate version for your system. For example:
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
- Configure the email settings in the .env file:
EMAIL_BACKEND=smtp
EMAIL_HOST=smtp.example.com
EMAIL_PORT=587
EMAIL_HOST_USER=your-email@example.com
EMAIL_HOST_PASSWORD=your-password
EMAIL_USE_TLS=True
- Run the application:
python main.py
- Ukrainian Passport Translation:
  - Upload an image of a Ukrainian passport.
  - Select the points that mark the boundaries of the document to be translated.
  - The application translates the text into Turkish and generates a PDF file.
- General Document Translation:
  - Upload an image or document in any language.
  - Select the points that mark the boundaries of the document to be translated.
  - The application translates the text into English and highlights named entities.
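
The boundary-selection step in both workflows can be illustrated with a minimal sketch. It assumes the user picks exactly four corner points in image coordinates (y grows downward); the helper names `order_points` and `target_size` are hypothetical and not taken from the project. The sketch only orders the corners and estimates the size of the flattened document. In an OpenCV pipeline, the ordered corners and the target rectangle could then be passed to `cv2.getPerspectiveTransform` and `cv2.warpPerspective`; whether this application does exactly that is an assumption.

```python
import math


def order_points(points):
    """Return four (x, y) points as top-left, top-right, bottom-right, bottom-left."""
    if len(points) != 4:
        raise ValueError("expected exactly four corner points")
    # The top-left corner has the smallest x + y, the bottom-right the largest.
    by_sum = sorted(points, key=lambda p: p[0] + p[1])
    top_left, bottom_right = by_sum[0], by_sum[3]
    # Of the two remaining corners, the top-right has the smaller y - x.
    top_right, bottom_left = sorted(by_sum[1:3], key=lambda p: p[1] - p[0])
    return [top_left, top_right, bottom_right, bottom_left]


def target_size(ordered):
    """Estimate (width, height) of the flattened document from ordered corners."""
    tl, tr, br, bl = ordered
    width = max(math.dist(tl, tr), math.dist(bl, br))
    height = max(math.dist(tl, bl), math.dist(tr, br))
    return round(width), round(height)


# Hypothetical clicks, given in arbitrary order.
clicked = [(410, 30), (20, 40), (400, 620), (10, 600)]
corners = order_points(clicked)
size = target_size(corners)
```

Ordering the corners first keeps the result independent of the order in which the user clicks them.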
The changelog feature tracks changes made to translations of documents and images. You can view the changelog in the application.
The application can send email notifications with custom content. Make sure to configure your email settings in the .env file.
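
As a rough illustration of how the `EMAIL_*` settings might be used with `smtplib`, the sketch below builds and sends a message. It assumes the values from the .env file have already been loaded into the process environment; the function names `load_email_settings`, `build_message`, and `send_message` are hypothetical and not taken from the project.

```python
import os
import smtplib
from email.message import EmailMessage


def load_email_settings(env=os.environ):
    """Read the EMAIL_* settings into a plain dictionary."""
    return {
        "host": env.get("EMAIL_HOST", "smtp.example.com"),
        "port": int(env.get("EMAIL_PORT", "587")),
        "user": env.get("EMAIL_HOST_USER", ""),
        "password": env.get("EMAIL_HOST_PASSWORD", ""),
        "use_tls": env.get("EMAIL_USE_TLS", "True").lower() == "true",
    }


def build_message(sender, recipient, subject, body):
    """Create a plain-text email with custom content."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send_message(msg, settings):
    """Send the message over SMTP, upgrading to TLS when configured."""
    with smtplib.SMTP(settings["host"], settings["port"]) as server:
        if settings["use_tls"]:
            server.starttls()
        server.login(settings["user"], settings["password"])
        server.send_message(msg)
```

Keeping message construction separate from sending makes the content easy to check without contacting a mail server.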
The application uses the spaCy English transformer model (en_core_web_trf) for named entity recognition. The tables below list the model's NER accuracy metrics and the entity labels it recognizes:
| Metric | Value | Description |
|---|---|---|
| ENTS_P | 90.08% | Precision: How many of the detected entities were correct |
| ENTS_R | 90.30% | Recall: How many of the actual entities were correctly detected |
| ENTS_F | 90.19% | F1-score: Harmonic mean of precision and recall |
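
The F1-score in the table can be reproduced from the listed precision and recall. A minimal check, using only the values shown above:

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as percentages, taken from the table.
print(round(f1_score(90.08, 90.30), 2))  # prints 90.19
```
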

| Label | Description | Examples |
|---|---|---|
| CARDINAL | Numbers without units | "three", "1000", "millions" |
| DATE | Calendar dates | "January 1", "2024", "last year" |
| EVENT | Historical or cultural events | "World War II", "the Assumption of the Blessed Virgin Mary" |
| FAC | Buildings and infrastructure objects | "Golden Gate Bridge", "The Bakhchisaray Palace" |
| GPE | Countries, cities, regions | "Ukraine", "Crimea", "California" |
| LANGUAGE | Names of languages | "English", "Ukrainian" |
| LAW | Legal documents, statutes | "Constitution", "Article 5", "The Civil Rights Act" |
| LOC | Natural locations | "Mount Everest", "the Crimean Mountains", "Sahara" |
| MONEY | Amounts of money with currency | "$50", "10 euros", "one million yen" |
| NORP | Nationalities, religious or political groups | "Ukrainians", "Christians", "Slavic" |
| ORDINAL | Positions in a sequence | "first", "2nd", "third" |
| ORG | Companies, institutions | "Google", "UN", "Harvard University" |
| PERCENT | Percent values | "5%", "thirty percent" |
| PERSON | Given names, surnames | "George Washington", "Bruno Pelletier" |
| PRODUCT | Manufactured objects | "iPhone", "Tesla Model S", "PlayStation" |
| QUANTITY | Numbers with measurement units | "5 kilograms", "30 miles" |
| TIME | Specific times of day | "2 PM", "midnight", "noon" |
| WORK_OF_ART | Titles of books, films, paintings | "Mona Lisa", "1000 and One Nights", "War and Peace" |
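
To show how these labels might be used for highlighting, the sketch below wraps entity spans in HTML `<mark>` tags. The function name `highlight_entities` and the HTML output format are assumptions, not the project's actual implementation. The spans are `(start, end, label)` character offsets, which spaCy exposes on each entity as `ent.start_char`, `ent.end_char`, and `ent.label_`.

```python
from html import escape


def highlight_entities(text, spans):
    """Wrap each (start, end, label) character span in a <mark> tag."""
    parts = []
    cursor = 0
    for start, end, label in sorted(spans):
        if start < cursor:
            continue  # skip spans that overlap an earlier one
        parts.append(escape(text[cursor:start]))
        parts.append(
            f'<mark title="{escape(label)}">{escape(text[start:end])}</mark>'
        )
        cursor = end
    parts.append(escape(text[cursor:]))
    return "".join(parts)


# With spaCy and the model installed, the spans could be obtained like this:
#   import spacy
#   nlp = spacy.load("en_core_web_trf")
#   doc = nlp(text)
#   spans = [(ent.start_char, ent.end_char, ent.label_) for ent in doc.ents]
example = highlight_entities("Kyiv is in Ukraine", [(0, 4, "GPE"), (11, 18, "GPE")])
```

Escaping the text before inserting it into HTML avoids rendering problems when a translated document contains characters such as `<` or `&`.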