My 5-Day Cloud Guru Challenge Journey

Once upon a time, late on the 9th of October, I discovered the Cloud Guru Cloud Challenge. I was both enthused and frustrated: enthused because I could see the value of this endeavor, and frustrated because I had discovered the opportunity with only five days left until the deadline. Naturally, I decided to participate, but I had to make some sacrifices in order to finish the challenge.

I made sacrifices in all three ETL phases. At the time, I was not familiar with pandas (I am now) or with Postgres. Therefore, I wrote my own Extract and Transform code and went with MySQL because I have many years of experience with it. I would have liked to use both pandas and Postgres, but with so little time I needed to reduce the list of unknowns.

So I got it done, at 10 pm on 15 October 2020.

My Python Lambda function runs on a daily schedule, launched by a CloudWatch event. My solution uses my own Python module to Extract and Transform the two CSV files. The main Lambda function Loads the data into the MySQL database. Converting the date fields into date objects makes it easy to compare dates and decide what to load: the entire historical data set the first time the job runs, and only the most recent day’s data thereafter. There is also an SNS module for notifications.
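To illustrate the date comparison, here is a minimal sketch of the idea, not the actual code from my repository. The column names (date, cases), the date format, and the function names parse_rows and rows_to_load are assumptions made up for this example. In a real job, latest_loaded would presumably come from a query against the MySQL table, with None meaning the table is still empty.

```python
import csv
import io
from datetime import date, datetime


def parse_rows(csv_text):
    """Parse CSV text into dicts, converting the (assumed) 'date' column to date objects."""
    rows = []
    for record in csv.DictReader(io.StringIO(csv_text)):
        # Assumes ISO-style dates such as 2020-10-15.
        record["date"] = datetime.strptime(record["date"], "%Y-%m-%d").date()
        rows.append(record)
    return rows


def rows_to_load(rows, latest_loaded):
    """Return every row on the first run, otherwise only rows newer than latest_loaded."""
    if latest_loaded is None:
        # Nothing in the database yet: load the entire historical data set.
        return list(rows)
    # Date objects compare directly, so the incremental filter is a one-liner.
    return [r for r in rows if r["date"] > latest_loaded]


# Hypothetical sample data, for illustration only.
sample = "date,cases\n2020-10-13,10\n2020-10-14,12\n2020-10-15,15\n"
rows = parse_rows(sample)
first_run = rows_to_load(rows, None)                # all three rows
daily_run = rows_to_load(rows, date(2020, 10, 14))  # only the newest row
```

Because the dates are real date objects rather than strings, the same comparison keeps working even if a source file changes its date formatting, as long as the parsing step is adjusted.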

The entire solution is built using a CloudFormation IaC template. I spent many hours troubleshooting the Lambda function so that it could reach both the RDS MySQL database and the Internet to download the CSV files. It turned out to be a Security Group issue, which was rather frustrating.

The source is found at github.com/bgaber/cguruchalleng2

I guess the easiest part of this project was connecting QuickSight to the datastore. At least something was easy.

cloudgurchallenge2.PNG

I really enjoyed this challenge and will be sure to participate in the next one, where I will have the full 30 days. Had I had more time, I would have used both pandas and Postgres, and I think I would have strongly considered a scheduled Fargate task instead of Lambda.

I recommend the Cloud Guru Challenge to anyone wanting experience with AWS. In the end, my solution will not live happily ever after, because it needs some more TLC to make it happy.

Brian