I’m a Data Engineer who builds data systems that actually run in production.
I work with streaming data, batch pipelines, and ML workflows, designing systems that hold up under failures, tight budgets, and real operational constraints.
I come from an engineering background and transitioned into data engineering by building end-to-end systems: ingestion, processing, observability, and deployment.
I design and implement systems that:
- Ingest and process data reliably
- Support analytics and machine learning use cases
- Follow cloud-native and event-driven principles
- Can be operated and understood by real teams
My work lives at the intersection of data engineering, backend systems, and applied machine learning.
Real-time, event-driven architecture
I’m watching events arrive in real time: orders placed, payments confirmed, inventory updated.
Data is flowing fast, and if something breaks, the business feels it immediately.
I built a streaming platform that listens to those events, routes them, processes them, and exposes what’s happening through clear observability (a minimal routing sketch follows below).
- Real-time ingestion using Kafka
- Event-driven processing and routing
- Routing concepts inspired by AWS Route 53
- Observability with Grafana
- Designed for reliability and failure visibility
Tech: Python, Kafka, Event-Driven Architecture, Grafana
Repository: https://github.com/tuni56/ecommerce-streaming-data-platform
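
To make the routing idea concrete, here is a minimal sketch assuming kafka-python; the topic and event-type names are hypothetical, and the repository’s actual implementation may differ.

```python
# Minimal routing sketch, assuming kafka-python; topic and event-type
# names are hypothetical, not the repository's actual configuration.
import json

from kafka import KafkaConsumer, KafkaProducer

consumer = KafkaConsumer(
    "orders.events",  # hypothetical source topic
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

# Route each event to a downstream topic by type; anything
# unrecognized goes to a dead-letter topic so failures stay visible.
ROUTES = {
    "order_placed": "fulfillment.events",
    "payment_confirmed": "billing.events",
    "inventory_updated": "stock.events",
}

for message in consumer:
    event = message.value
    target = ROUTES.get(event.get("type"), "dead-letter.events")
    producer.send(target, event)
```

Keeping the route table a plain dict, with a dead-letter topic as the fallback, is what makes failures visible instead of silent.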
Batch ingestion and analytics
I’m organizing raw data as it arrives, structuring it so analytics teams don’t have to fight the data (a minimal layout sketch follows below).
- End-to-end ingestion and processing
- Structured data lake layout
- Designed for analytics and reporting
- Automation and data quality checks
Tech: Python, Data Pipelines
Repository: https://github.com/tuni56/datalake-analytics-pipeline
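
As a rough illustration of the layered layout and quality checks, a minimal sketch assuming pandas + pyarrow, with hypothetical paths and column names (not the repository’s actual pipeline):

```python
# Minimal layered-layout sketch, assuming pandas + pyarrow; paths and
# column names are hypothetical, not the repository's actual pipeline.
from pathlib import Path

import pandas as pd

RAW = Path("datalake/raw/orders")          # landing zone: files as they arrive
CURATED = Path("datalake/curated/orders")  # analytics-ready, partitioned

def quality_checks(df: pd.DataFrame) -> pd.DataFrame:
    """Drop rows that would break downstream analytics."""
    df = df.dropna(subset=["order_id", "order_date"])
    df = df[df["amount"] >= 0]
    return df.drop_duplicates(subset=["order_id"])

for raw_file in RAW.glob("*.csv"):
    df = pd.read_csv(raw_file, parse_dates=["order_date"])
    df = quality_checks(df)
    # Partition by date so analysts can prune scans to the days they need.
    df["order_day"] = df["order_date"].dt.date.astype(str)
    df.to_parquet(CURATED, partition_cols=["order_day"])
```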
Machine learning on AWS
I’m preparing data, training models, evaluating results, and making the workflow reproducible (a minimal training-job sketch follows below).
- End-to-end ML lifecycle
- Built on AWS SageMaker
- Focus on deployable ML workflows
Tech: Python, AWS SageMaker, Machine Learning
Repository: https://github.com/tuni56/churn-prediction-aws-streamlit
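
For flavor, a minimal sketch of a managed training step with the SageMaker Python SDK; the entry point, role ARN, and S3 paths are placeholders, not the repository’s actual configuration.

```python
# Minimal managed-training sketch with the SageMaker Python SDK; the
# entry point, role ARN, and S3 paths are placeholders.
import sagemaker
from sagemaker.sklearn import SKLearn

session = sagemaker.Session()
role = "arn:aws:iam::123456789012:role/SageMakerExecutionRole"  # placeholder

estimator = SKLearn(
    entry_point="train.py",      # hypothetical training script
    framework_version="1.2-1",
    instance_type="ml.m5.large",
    instance_count=1,
    role=role,
    sagemaker_session=session,
)

# Launch a managed, repeatable training job from versioned S3 data.
estimator.fit({"train": "s3://example-bucket/churn/train/"})
```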
Predictive analytics for inventory decisions
I’m forecasting demand so the business can act before problems happen (a minimal feature sketch follows below).
- Time-series forecasting
- Feature engineering
- Model training pipelines
Tech: Python, Forecasting Models
Repository: https://github.com/tuni56/demand-forecasting-system
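
A minimal sketch of the lag-feature idea with scikit-learn on synthetic data; the column names, lags, and model choice are assumptions, not the repository’s actual pipeline.

```python
# Minimal lag-feature sketch with scikit-learn on synthetic data;
# column names, lags, and the model choice are assumptions.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(0)
df = pd.DataFrame({"demand": 100 + rng.normal(0, 10, 365)})

# Lag and rolling features: yesterday, last week, and a 7-day average
# (shifted by one day so no feature can see the value it predicts).
for lag in (1, 7):
    df[f"lag_{lag}"] = df["demand"].shift(lag)
df["rolling_7"] = df["demand"].shift(1).rolling(7).mean()
df = df.dropna()

# Time-ordered split: never evaluate on days the model trained "after".
split = int(len(df) * 0.8)
features = ["lag_1", "lag_7", "rolling_7"]
train, test = df.iloc[:split], df.iloc[split:]

model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(train[features], train["demand"])
print("MAE:", mean_absolute_error(test["demand"], model.predict(test[features])))
```

The split is time-ordered on purpose: shuffling would leak future demand into training and inflate the metric.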
Distributed systems & event-driven backend
I’m coordinating services that need to talk, fail, recover, and stay consistent (the coordination idea is sketched below).
- Microservices architecture
- Event-driven communication using Kafka
- Coordination with ZooKeeper
- Built with Java, Spring Boot, Spring Cloud
- Designed to simulate AWS-managed services locally
Focus: Distributed systems, messaging, service discovery
Tech: Java, Spring Boot, Spring Cloud, Kafka, ZooKeeper
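
The project itself is Java/Spring Cloud, but the coordination idea fits in a few lines; purely as illustration, a minimal service-registration sketch using the Python kazoo client, with hosts and paths as assumptions.

```python
# Illustration only: the project is Java/Spring Cloud, but this is the
# same coordination idea in Python with the kazoo client. Hosts and
# paths are assumptions.
from kazoo.client import KazooClient

zk = KazooClient(hosts="127.0.0.1:2181")
zk.start()

# Register this instance as an ephemeral, sequential znode: if the
# process dies, ZooKeeper removes the node and peers see it disappear.
zk.create(
    "/services/orders/instance-",
    b"host-a:8080",
    ephemeral=True,
    sequence=True,
    makepath=True,
)

# Service discovery: list the live instances of the "orders" service.
print(zk.get_children("/services/orders"))
```

The ephemeral node is the key trick: when a process dies, ZooKeeper deletes its registration, so discovery only ever reflects live instances.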
Data Engineering
- Python, SQL
- Kafka
- Batch and streaming pipelines
- Data modeling and data flow design
Cloud & Infrastructure
- AWS (S3, Lambda, DynamoDB, SageMaker, API Gateway)
- Infrastructure as Code: Terraform
- IAM and least-privilege design
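
As one small example of what least-privilege means in practice, a sketch with boto3 and a hypothetical bucket and policy name: read access to one prefix of one bucket, nothing else.

```python
# Least-privilege sketch with boto3; bucket and policy names are
# hypothetical: read access to one prefix of one bucket, nothing else.
import json

import boto3

policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["s3:GetObject"],
        "Resource": "arn:aws:s3:::example-datalake/curated/*",
    }],
}

iam = boto3.client("iam")
iam.create_policy(
    PolicyName="datalake-curated-read",  # hypothetical name
    PolicyDocument=json.dumps(policy),
)
```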
Machine Learning
- scikit-learn
- Time-series forecasting
- ML pipelines
- SageMaker workflows
Backend & Systems
- Java
- Spring Boot, Spring Cloud
- Microservices
- Event-driven architectures
- AWS-native data architectures
- Infrastructure as Code with Terraform
- Observability and system design
Open to: Data Engineer / Data Platform Engineer roles
LinkedIn: https://www.linkedin.com/in/rociobaigorria/
Email: rociomnbaigorria@gmail.com
Location: Argentina (GMT-3) – open to remote roles
Making data accessible to people who actually need to use it.
This is how I think about systems: flow, pressure, failures, recovery.

