Note:
This version of the project works well for roughly a thousand viewers per week; it is not designed to scale beyond that.
If you want a highly scalable version, contact me at ansimran@protonmail.com.
The main goal here is to showcase Machine Learning skills, not full-stack AI development skills.
For an example of my work in full-stack AI development with scalability in mind, you can check this project: 👉 Production-ready self-corrective RAG.
tags: PyTorch, numpy, pandas, Data Processing, Tokenization, Padding, Generator, Positional-Encoding, Padding-Mask, Look-Ahead-Mask, Encoder, Decoder, MultiHead-Self-Attention, Residual-Connections, Batch Normalization, Feed-Forward-Neural-Networks, Embedding-Layer, Dropout-Layer, Masked-MultiHead-Self-Attention-(Causal-Attention), Linear-Layer, Log-Softmax, Training-Loop, Epochs, Learning Rate, Batch-Size, Pad-Index, Loss-Function, Optimizer, Predictions, Gradients & Updating Weights.
Based on the Natural Language Processing Specialization by DeepLearning.AI.
📘 Full NLP Specialization GitHub Repo Here: Natural Language Processing from Scratch
- Loading
- Preprocessing
- Tokenization
- Padding
- Generator
- Positional Encoding
- Padding Mask
- Look Ahead Mask
- Encoder Layer
  - MultiHead Self-Attention
  - Residual Connection & Batch Normalization
  - Feed Forward Neural Network
  - Residual Connection & Batch Normalization
- Full Encoder
  - Embedding Layer
  - Positional Encoding
  - Dropout Layer
  - Encoder Layers
- Decoder Layer
  - Masked MultiHead Self-Attention (Causal Attention)
  - Residual Connection & Batch Normalization
  - MultiHead Attention
  - Residual Connection & Batch Normalization
  - Feed Forward Neural Network
  - Residual Connection & Batch Normalization
- Full Decoder
  - Embedding Layer
  - Positional Encoding
  - Dropout Layer
  - Decoder Layers
- Full Transformer
  - Encoder + Decoder + Linear Layer
  - Log Softmax
- Epochs, Learning Rate, Batch Size and Pad-Index
- Loss Function
- Optimizer
- Computing Loss
- Predictions
- Clearing Gradients
- Updating Weights
- Next Word Prediction Function
- Summarization Function
- Some Remarks on Results
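
Three of the building blocks listed above (positional encoding, padding mask, look-ahead mask) are compact enough to sketch in a few lines of PyTorch. The snippet below is only an illustration and is not taken from the project's code: the function names, the default `pad_idx=0`, and the convention that `True` marks a position to be masked out are all assumptions made for this example.

```python
import math

import torch


def positional_encoding(max_len: int, d_model: int) -> torch.Tensor:
    """Sinusoidal positional encoding, shape (max_len, d_model).

    Hypothetical helper: even columns hold sin, odd columns hold cos.
    """
    # Column vector of positions 0..max_len-1, shape (max_len, 1).
    position = torch.arange(max_len, dtype=torch.float32).unsqueeze(1)
    # One frequency per sin/cos pair: 1 / 10000^(2i / d_model).
    div_term = torch.exp(
        torch.arange(0, d_model, 2, dtype=torch.float32)
        * (-math.log(10000.0) / d_model)
    )
    pe = torch.zeros(max_len, d_model)
    pe[:, 0::2] = torch.sin(position * div_term)
    # Slice div_term so an odd d_model also works.
    pe[:, 1::2] = torch.cos(position * div_term[: d_model // 2])
    return pe


def padding_mask(token_ids: torch.Tensor, pad_idx: int = 0) -> torch.Tensor:
    """Boolean mask of shape (batch, seq_len); True where the token is padding.

    pad_idx=0 is an assumption for this sketch.
    """
    return token_ids == pad_idx


def look_ahead_mask(seq_len: int) -> torch.Tensor:
    """Boolean mask of shape (seq_len, seq_len); True strictly above the diagonal.

    Row i is True at every column j > i, so position i cannot see later tokens.
    """
    return torch.triu(torch.ones(seq_len, seq_len), diagonal=1).bool()


if __name__ == "__main__":
    batch = torch.tensor([[5, 7, 0, 0]])   # two real tokens, two pads
    print(positional_encoding(4, 6).shape)
    print(padding_mask(batch))
    print(look_ahead_mask(3))
```

With this `True` = masked convention, the two masks can be passed as `key_padding_mask` and `attn_mask` to `torch.nn.MultiheadAttention`, which treats `True` entries in boolean masks as positions that must not be attended to. A hand-written attention layer would instead fill those positions with a large negative value before the softmax.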
