Modern Data Stack


Creating a better data culture

Andrew Ermogenous
Apr 11, 2021

Krishna Puttaswamy, an engineer on Uber’s data experimentation team, wrote a great piece on creating a better data culture from first principles.

Uber’s Journey Toward Better Data Culture From First Principles

Some of my favorites:

  • “Data as code: Data should be treated as code. Creation, deprecation, and critical changes to data artifacts should go through the design review process with appropriate written documents where consumers’ views are taken into account. Schema changes have mandatory reviewers who sign off before changes are landed. Schema reuse/extension is preferred to creating new schemas. Data artifacts have tests associated with them and are continuously tested. These are practices we normally apply to service APIs, and we should extend that same rigor to thinking about data.

  • Data is owned: Data is code and all code must be owned. Each data artifact should have a clear owner, a clear purpose, and should be deprecated when its utility is over.

  • Accelerate data productivity: Data tools must be designed to optimize collaboration between producers and consumers, with mandatory owners, documentation, and reviewers when necessary. Data tools must integrate with other related tools well by passing necessary metadata seamlessly. Data tools should meet the same developer grade as services, offering the ability to write and run tests before landing changes, to test changes in a staging environment before rolling to production, and integrating well with the existing monitoring/alerting ecosystem.”

This should definitely be an iterative process. Choose one principle and try to implement it. Once successful, rinse and repeat. Do read the post as there is a ton of gold there!
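
If you want a concrete starting point, here is a rough sketch of what "data as code" and "data is owned" could look like as a check that runs before a change lands. It is not Uber's tooling. The `DataArtifact` record, the two `validate_*` helpers, and the example table are all hypothetical names made up for illustration, and the sketch assumes rows arrive as plain Python dictionaries.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass(frozen=True)
class DataArtifact:
    """Hypothetical metadata record for one dataset or table."""
    name: str
    owner: str               # team or person accountable for the data
    purpose: str             # why the artifact exists
    schema: Dict[str, type]  # column name -> expected Python type


def validate_artifact(artifact: DataArtifact) -> List[str]:
    """Return metadata problems; an empty list means the artifact passes."""
    problems = []
    if not artifact.owner.strip():
        problems.append(f"{artifact.name}: missing owner")
    if not artifact.purpose.strip():
        problems.append(f"{artifact.name}: missing purpose")
    if not artifact.schema:
        problems.append(f"{artifact.name}: empty schema")
    return problems


def validate_rows(artifact: DataArtifact, rows: List[Dict[str, Any]]) -> List[str]:
    """Check sample rows against the declared schema."""
    problems = []
    for i, row in enumerate(rows):
        # Columns the schema declares but the row lacks.
        for col in sorted(artifact.schema.keys() - row.keys()):
            problems.append(f"row {i}: missing column {col}")
        # Columns the row carries but the schema never declared.
        for col in sorted(row.keys() - artifact.schema.keys()):
            problems.append(f"row {i}: undeclared column {col}")
        # Type check for the columns that are present.
        for col, expected in artifact.schema.items():
            if col in row and not isinstance(row[col], expected):
                problems.append(
                    f"row {i}: {col} expected {expected.__name__}, "
                    f"got {type(row[col]).__name__}"
                )
    return problems


# Hypothetical example artifact and sample rows.
trips = DataArtifact(
    name="trips",
    owner="marketplace-data",
    purpose="One row per completed trip",
    schema={"trip_id": str, "fare_usd": float},
)
sample = [
    {"trip_id": "t1", "fare_usd": 12.5},
    {"trip_id": "t2", "fare_usd": "12.5"},  # wrong type on purpose
]
print(validate_artifact(trips))
print(validate_rows(trips, sample))
```

Running something like this in CI on every schema change is one small way to start on a single principle before taking on the rest.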


Twitter threads I enjoyed:

Matt Lerner (@matthlerner):
After 17 years, we finally “cracked” a $100M churn problem at PayPal. Zero fancy tech. Just a spreadsheet, some simple SQL, and a physicist named Ben. 👇🏼

March 30th 2021

2,614 Retweets, 17,019 Likes

James Densmore (@jamesdensmore):
Two Slack messages that create anxiety for data teams:
- "Quick question"
- "This numbers on this dashboard don't look right"
I've been playing those in my head whenever I'm trying to convince myself to invest more time in data discovery, validation, etc. It's worth the effort.

April 6th 2021

13 Retweets, 114 Likes

Podcasts:

  • Put your whole team on the same page with Atlan, a Data Engineering Podcast episode with Atlan’s co-founder, Prukalpa. I may be biased, but it is an incredible story!

  • Decentralizing Data: From Data Mesh to Data Monolith, Barry O’Reilly’s podcast with Zhamak Dehghani


Links Roundup:

  • Building Powerful Data Teams: On Investing in Junior Talent

  • Data Domains and Team Topologies from Yet Another Data Blog

  • Scaling Data Culture is a Marathon, Not a Sprint from Fivetran

  • The Algorithms That Make Instacart Roll

  • Why Every Data Team Needs a Money Tree


Thanks again to all new subscribers! The reception has been amazing. Please do share with fellow data lovers!

© 2022 Andrew Ermogenous