tahaislam/README.md

Hi there, I'm Taha Islam 👋

I'm a passionate Data Scientist, Data Analyst, and Machine Learning Engineer with a strong foundation in building robust data solutions. Based in Toronto, Canada, I thrive on transforming complex data into actionable insights and creating intelligent systems that drive innovation.

My journey in data is fueled by curiosity and a commitment to leveraging technology to solve real-world problems, from crafting real-time data pipelines to developing sophisticated machine learning models.


🚀 What I'm Currently Working On & Exploring:

  • Real-time Data Pipelines: Deep diving into distributed streaming with Apache Kafka for high-throughput data ingestion and processing. (This is where my kafka-realtime-pipeline project comes in!)
  • Machine Learning Applications: Building and deploying models for predictive analytics and pattern recognition.
  • Data Engineering Fundamentals: Focusing on efficient data storage, transformation, and management to support scalable data initiatives.

💻 Tech Stack & Skills:

| Category | Technologies & Tools |
| --- | --- |
| Languages | Python, Java, SQL (PostgreSQL), Bash |
| Data Streaming | Apache Kafka, Kafka Connect, Kafka Streams (learning) |
| Databases | PostgreSQL, MySQL, SQL Server |
| ML/Data Science | Pandas, NumPy, Scikit-learn, TensorFlow / Keras, PyTorch, Matplotlib, Seaborn, Tableau |
| Tools/Concepts | Docker, Git, REST APIs, ETL, Data Warehousing, Cloud Platforms (AWS/Azure basics) |

✨ Featured Projects:

Here are some projects that showcase my skills and interests:

  • Real-time Kafka Data Pipeline
    • Description: A comprehensive pipeline demonstrating real-time data ingestion (Python Producer), messaging (Apache Kafka), processing (Java Consumer), and integration with PostgreSQL (Kafka Connect).
    • Key Tech: Kafka, Kafka Connect, Java, Python, PostgreSQL, Docker.
  • Machine Learning Projects
    • Description: A collection of various machine learning models and analyses tackling different datasets and problem types.
    • Key Tech: Python, Scikit-learn, Pandas, NumPy, Matplotlib.
  • Computer Vision Project
    • Description: An exploration into computer vision techniques, including image processing, object detection, or facial recognition.
    • Key Tech: Python, OpenCV, TensorFlow/Keras (if applicable).
  • Data Engineering Concepts
    • Description: Projects focusing on fundamental data engineering principles, covering ETL, data warehousing, and scalable data solutions.
    • Key Tech: SQL, Python, ETL principles.

📫 Connect with Me:


Pinned

  1. quantumlane (Public)

    A small data platform for GTA transit data. Live observability, deliberate trade-offs, under CAD $20/month.

    Python

  2. CityofToronto/bdit_data-sources (Public)

    Data sources used by the Transportation Data & Analytics Unit

    Jupyter Notebook 42 9

  3. hybrid-rag-parser (Public)

    An advanced, 'table-aware' RAG (Retrieval-Augmented Generation) pipeline that ingests complex PDF documents (like contracts, invoices, and forms) and allows a user to ask questions about them.

    Python

  4. airflow-mate (Public)

    Airflow-mate provides code to install and upgrade Airflow easily and extends the usage of the most commonly used operators and classes.

    Shell

  5. metro (Public)

    Mode Departure Time Route Choice Model

  6. kafka-realtime-pipeline (Public)

    A real-time data pipeline demonstrating Apache Kafka, Python, Java, and PostgreSQL

    Python 1