Blog
Thoughts, stories & technical writings
Apache Airflow: Sensor
Apache Airflow is my favorite orchestration tool, and I’m very familiar with it. One feature I really like is sensors. …
Data Architecture
In this discussion, I will talk about the modern data pipeline tech stack, based on my experience as a data …
Apache Sedona
I learned about Apache Sedona while working at my previous company. That company had a product that relied heavily on …
Snowflake Storage Integration
I want to talk about storage integration in Snowflake, which I think is quite powerful for migration and similar use …
Polars
I first started using Polars when working on an IoT project. Before that, I used Pandas for processing and deep …
DynamoDB
This is a fully managed NoSQL database from AWS. I once used DynamoDB to store IoT data from electric motorcycles. …
Apache Spark
Spark is an engine for processing large-scale data. It usually has two main components: the master node and the …