Wednesday, September 1, 2021

Building a serverless “Twitter data stack” for free with dbt and BigQuery (using Python, Google Cloud Functions, and Pub/Sub triggers)

➔ Overview

This project deploys a serverless Python app on Google Cloud Functions and uses a Pub/Sub trigger to pull tweets from the Twitter API, push them to BigQuery, process them with dbt, and then send tweets to people who have gone quieter than usual.
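As a rough sketch of what the function's entry point could look like: Pub/Sub-triggered background functions in Python receive an `event` dict whose `data` field is base64-encoded, plus a `context` object. The names `decode_pubsub_message`, `run_pipeline`, `fetch_tweets`, `load_rows`, and the `usernames` payload key are all hypothetical stand-ins for illustration, not the project's actual code; the real Twitter and BigQuery calls are passed in as callables so the flow can be read (and tested) on its own.

```python
import base64
import json


def decode_pubsub_message(event):
    """Return the Pub/Sub payload as a dict (empty dict if no data was sent)."""
    raw = event.get("data")
    if not raw:
        return {}
    # Pub/Sub delivers the message body as a base64-encoded string.
    return json.loads(base64.b64decode(raw).decode("utf-8"))


def run_pipeline(event, context, fetch_tweets, load_rows):
    """Pull tweets and hand them to a loader.

    fetch_tweets and load_rows are hypothetical callables standing in for
    the Twitter API client and the BigQuery insert step.
    """
    params = decode_pubsub_message(event)
    rows = fetch_tweets(params.get("usernames", []))
    load_rows(rows)
    # Returning the row count makes the function easy to check in tests.
    return len(rows)
```

In a deployed function, a thin wrapper with the `(event, context)` signature would call `run_pipeline` with the real fetch and load functions; the dbt run and the outgoing tweets would follow as later steps.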

Inspiration

I was inspired recently by a tweet from Claire Carroll:

I'd been looking for an excuse to dig into some technologies that are new to me and decided to turn this into a weekend project.

The technologies on my short list included Docker and Google Cloud Platform (GCP), especially BigQuery. Using these services has the added advantage of making the project completely free (compared to Snowflake, which has no free tier, or AWS, whose free tier is well known to be dangerous and impossible to understand).