Reading Update | Hopes for 2022

2021 was a year in which I got to fulfill some of my goals. I kept a better track record of writing articles (most of them reading updates), and I got to speak on a podcast and at Coalesce. Of course, this meant gaining new knowledge in areas like dbt and data testing. I also got started on a second brain, which is helping me learn more and better (writing really helps). These have been some of my main goals.
I was introduced to Scala and technologies like Spark, but for 2022 I’m hoping to:
- Read a Scala book
 - Read a Flink book
 - Work with Apache Iceberg in either a batch or a streaming job
 - Speak at two conferences/meetups
 - Write more in the second brain (a summary of the new things learned from each article)
 
Data Engineering
- Don’t Let the Internet Dupe You, Event Sourcing is Hard - Event sourcing is great but, at least for now, can be considerably harder than batch mode
 - Metadata Indexing in Iceberg - The creator of Iceberg gives some insights into how its metadata indexing works
 - Iceberg Spec - Not an article, but I think it’s a must-read for anyone trying to use Apache Iceberg
 - How to ETL at Petabyte-Scale with Trino - Some ideas on how to run ETL using Trino
 - Announcing OpenMetadata - Standardization is great, and OpenMetadata is a step in the right direction
 - How Uber Achieves Operational Excellence in the Data Quality Experience - Another take on how Uber has standardized its data ops and how it ensures quality
 - Launching at LinkedIn: The Story of Apache Pinot - A history of how Apache Pinot started
 - Revisiting Java in 2021 - Java 17 has gained some great features since Java 8, and this article tries to summarize them
 - Using an ETL framework vs writing yet another ETL script - Airbyte is an open-source alternative to Fivetran, and the article tries to convey why it’s the best approach compared with closed-source ones
 - Series on data protection at Airbnb