Halo

Blog

Thoughts, stories & technical writings

data-engineer

Apache Airflow: Sensor

Apache Airflow is my favorite orchestration tool, and I’m very familiar with it. One feature I really like is sensors. …

data-engineer

Data Architecture

In this discussion, I will share the modern data pipeline tech stack, based on my experience as a data …

data-engineer

Apache Sedona

I learned about Apache Sedona while working at my previous company. That company had a product that relied heavily on …

data-engineer

Snowflake Storage Integration

I want to talk about storage integration in Snowflake, which I think is quite powerful for migration and similar use …

data-engineer

Polars

I first started using Polars when working on an IoT project. Before that, I used Pandas for processing and deep …

data-engineer

DynamoDB

DynamoDB is a fully managed NoSQL database from AWS. I once used it to store IoT data from electric motorcycles. …

data-engineer

Apache Spark

Spark is an engine for processing large-scale data. It usually has two main components: the master node and the …
