To better understand how to implement the final structure of FoNu_NLP_TG, I decided to gain more experience with Transformers and learn their core architecture. This project serves as a sandbox for FoNu_NLP_TG.
What is FoNu_NLP_TG?
FoNu_NLP_TG ("Fo Nu" means "speak" in Ewe, and TG stands for Togo) is a research project focused on experimenting, exploring, and fine-tuning transformers, with a special emphasis on applications for Togolese languages.
We've started a blog to document our progress and share insights about transformer models and NLP. The blog is available in multiple formats:
- GitHub Pages (automatically updated)
- Source files in the repository
- Selected posts on Medium (coming soon)
- Encoder: N layers (usually 6) with self-attention and feed-forward networks.
- Decoder: N layers with self-attention, source-attention (to encoder), and feed-forward networks.
- Attention: Mechanism to weigh word importance.
- Forward Pass: Input → Encoder → Memory → Decoder → Output.
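The forward pass above can be sketched in plain numpy. This is a minimal illustration, not the repo's actual code: it uses single-head attention with toy random weights and omits embeddings, residual connections, and layer normalization.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # toy model dimension

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attention(q, k, v):
    # Scaled dot-product attention over the key/value rows
    return softmax(q @ k.T / np.sqrt(d)) @ v

def feed_forward(x, w1, w2):
    return np.maximum(x @ w1, 0) @ w2  # two-layer ReLU MLP

def encoder_layer(x, w1, w2):
    # Self-attention followed by the feed-forward network
    return feed_forward(attention(x, x, x), w1, w2)

def decoder_layer(y, memory, w1, w2):
    y = attention(y, y, y)             # decoder self-attention
    y = attention(y, memory, memory)   # source-attention over encoder output
    return feed_forward(y, w1, w2)

src = rng.standard_normal((5, d))  # 5 source-token representations
tgt = rng.standard_normal((3, d))  # 3 target-token representations
w1 = rng.standard_normal((d, 4 * d))
w2 = rng.standard_normal((4 * d, d))

memory = encoder_layer(src, w1, w2)       # Input -> Encoder -> Memory
out = decoder_layer(tgt, memory, w1, w2)  # Memory -> Decoder -> Output
print(out.shape)  # (3, 8)
```

Stacking N such layers (with residuals and normalization) gives the full architecture.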
- Standard: Encoder-decoder with multi-head attention (as in the Harvard Annotated Transformer).
- Variants: BERT (encoder-only), GPT (decoder-only).
- Customization: You can adjust N, the hidden size, or the number of attention heads, but the overall structure is usually fixed.
- How It Works: Attention calculates "scores" between words. For "Hello world", it checks how much "Hello" relates to "world" using their hidden states.
- Training: The model learns these relationships from data (e.g., "Hello" often precedes "world").
- Multi-Head Attention: Looks at multiple relationships at once (e.g., syntax, meaning).
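The score computation described above can be shown concretely. In this hedged sketch, the two hidden states for "Hello" and "world" are just random toy vectors; a single head computes softmax(QKᵀ/√d_k)V, and multi-head attention would simply run several such heads in parallel on projected inputs.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # pairwise relevance scores
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = w / w.sum(axis=-1, keepdims=True)      # softmax over the keys
    return weights @ V, weights

# Toy hidden states for the tokens "Hello" and "world" (d_k = 4)
rng = np.random.default_rng(0)
H = rng.standard_normal((2, 4))
output, weights = scaled_dot_product_attention(H, H, H)
print(weights)  # row i: how much token i attends to each token
```

Each row of `weights` sums to 1, so the output for each token is a weighted mix of all the value vectors.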
```bash
# Clone the repository
git clone https://github.com/yourusername/Trans.git
cd Izzy-nlpV1

# Create a virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Download spaCy models (if needed)
python -m spacy download en_core_web_sm
```

I will just list the most important classes/structures:
- Izzy-nlpV1/: A lighter implementation based on the original paper (I think)
  - transformer.py: Core transformer
  - encoder.py: The Encoder class
  - decoder.py: The Decoder class
  - positionalEncoding.py: Positional encoding and embedding computation
  - multiHead.py: The multi-head attention mechanism
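As background for the positional-encoding component, here is a sketch of the sinusoidal scheme from "Attention Is All You Need" (PE[pos, 2i] = sin(pos/10000^(2i/d)), PE[pos, 2i+1] = cos(...)). This is an illustrative version and not necessarily identical to the repo's positionalEncoding.py.

```python
import numpy as np

def positional_encoding(max_len, d_model):
    """Sinusoidal positional encodings, one row per position."""
    pos = np.arange(max_len)[:, None]                    # (max_len, 1)
    i = np.arange(0, d_model, 2)[None, :]                # even dimension indices
    angles = pos / np.power(10000.0, i / d_model)        # (max_len, d_model/2)
    pe = np.zeros((max_len, d_model))
    pe[:, 0::2] = np.sin(angles)                         # even dims: sine
    pe[:, 1::2] = np.cos(angles)                         # odd dims: cosine
    return pe

pe = positional_encoding(50, 16)
print(pe.shape)  # (50, 16)
```

These encodings are added to the token embeddings so the model can use word order, since attention itself is permutation-invariant.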
See requirements.txt for the complete list.
- The Annotated Transformer
- Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context
- Attention Is All You Need
More to come ...
