Creating a better data culture
Krishna Puttaswamy, an engineer on Uber’s data experimentation team, wrote a great piece on creating a better data culture from first principles.
Some of my favorites:
“Data as code: Data should be treated as code. Creation, deprecation, and critical changes to data artifacts should go through the design review process with appropriate written documents where consumers’ views are taken into account. Schema changes have mandatory reviewers who sign off before changes are landed. Schema reuse/extension is preferred to creating new schemas. Data artifacts have tests associated with them and are continuously tested. These are practices we normally apply to service APIs, and we should extend that same rigor to thinking about data.
Data is owned: Data is code and all code must be owned. Each data artifact should have a clear owner, a clear purpose, and should be deprecated when its utility is over.
Accelerate data productivity: Data tools must be designed to optimize collaboration between producers and consumers, with mandatory owners, documentation, and reviewers when necessary. Data tools must integrate with other related tools well by passing necessary metadata seamlessly. Data tools should meet the same developer grade as services, offering the ability to write and run tests before landing changes, to test changes in a staging environment before rolling to production, and integrating well with the existing monitoring/alerting ecosystem.”
This should definitely be an iterative process. Choose one principle and try to implement it. Once successful, rinse and repeat. Do read the post as there is a ton of gold there!
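If you want a concrete starting point for "data is owned" and "data artifacts have tests", here is a minimal Python sketch. To be clear, none of this comes from the Uber post: the `DataArtifact` class, the helper functions, and the `trips` example are all hypothetical names I made up for illustration, and a real setup would more likely live in your catalog or pipeline tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataArtifact:
    """Hypothetical descriptor for a dataset (an assumption, not Uber's design)."""
    name: str
    owner: str
    purpose: str
    schema: dict  # column name -> expected Python type


def validate_artifact(artifact):
    """Return a list of problems; an empty list means the artifact passes."""
    problems = []
    if not artifact.owner.strip():
        problems.append("missing owner")
    if not artifact.purpose.strip():
        problems.append("missing purpose")
    if not artifact.schema:
        problems.append("empty schema")
    return problems


def is_backward_compatible(old_schema, new_schema):
    """Treat a change as an extension if every old column keeps its type."""
    return all(new_schema.get(col) is typ for col, typ in old_schema.items())


def check_rows(artifact, rows):
    """Return the indexes of rows that do not match the artifact's schema."""
    bad = []
    for i, row in enumerate(rows):
        if set(row) != set(artifact.schema):
            bad.append(i)
            continue
        if any(not isinstance(row[c], t) for c, t in artifact.schema.items()):
            bad.append(i)
    return bad


# Example: an owned artifact, a schema extension, and a row-level test.
trips = DataArtifact(
    name="trips",
    owner="data-team",
    purpose="trip analytics",
    schema={"trip_id": str, "fare": float},
)
assert validate_artifact(trips) == []
assert is_backward_compatible(trips.schema, {**trips.schema, "city": str})
```

Checks like these could run in CI before a schema change lands, which is roughly the "same rigor as service APIs" idea from the quote.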
Twitter Thread I enjoyed:
Podcasts:
Put your whole team on the same page with Atlan, a Data Engineering Podcast episode with Atlan’s Co-Founder, Prukalpa. I may be biased, but an incredible story!
Decentralizing Data: From Data Mesh to Data Monolith, Barry O’Reilly’s podcast with Zhamak Dehghani
Links Roundup:
Data Domains and Team Topologies from Yet Another Data Blog
Scaling Data Culture is a Marathon, Not a Sprint from Fivetran
Thanks again to all new subscribers! The reception has been amazing. Please do share with fellow data lovers!