The Decisions That Set Data Teams Up for Success
Reality is complicated: people and organizations behave in unexpected ways, and external events can throw wrench after wrench into our most well-oiled workflows. For data teams, it can be tempting to address these challenging moments with the latest shiny tool or a flashy new hire. And it may even work—up to a point.
What helps teams develop resilience in the face of change, disappointing outcomes, or the occasional bout of corporate chaos is rarely this kind of quick magical fix. Instead, it's the gradual accumulation of multiple smart decisions—the ones that make reliable practices and consistent performance possible. This week, we've selected a handful of articles that focus precisely on the kinds of choices that help data teams stand out and stay successful in the long run. Enjoy!
- Data tests are "one of the most fundamental and practical ways to validate data quality," says Xiaoxu Gao—a task at the core of many, if not most, data teams' missions. Creating robust tests that lead to solid business decisions is not a trivial challenge, though; Xiaoxu's article offers a helpful roadmap to avoid some of the most common pitfalls.
- Taking a step back from the nitty-gritty of testing, shane murray asks a crucial question: which team should be responsible for data quality in the first place? As you might guess, there's no one-size-fits-all answer to this common conundrum, but you'll be in a stronger position to make the right call once you have a more granular understanding of the tradeoffs involved in each option.
- Data scientists are natural connectors, whether it's between business and product teams, marketers and customers, or technical and less-technical partners. Robert Yi recently shared a thought-provoking post calling for data teams to take ownership of the data they provide to other stakeholders; it outlines some of the steps they can take to ensure clear and effective communication across business functions.
- When you work on portfolio projects, the goal is often to identify the most accurate model and call it a day. Hennie de Harder reminds us that in real-life work situations, many other factors come into play—from cost to implementation complexity. That's why data teams must have a consistent method for comparing the different ML solutions under consideration.
Here's one small decision we guarantee you won't regret: reading more of our weekly highlights! We've published some stellar articles in the past few days, and wouldn't want you to miss them.
- Rik Jongerius and Wessel walk us through the fascinating project they executed for Dutch railway operator NS: it entailed delivering real-time train crowdedness predictions to mobile-app users.
- It can be tricky to translate data into insights that drive action—especially at a larger organization. Khouloud El Alami shares some practical ideas for bridging that gap.
- How do language models encode and represent historical events? Yennie Jun explores a topic whose relevance will grow rapidly as AI tools become more common in educational contexts.
- If you've been intrigued by the recent arrival of open-source LLMs, consider following (and coding) along with Het Trivedi's tutorial on running the Falcon-7B model in the cloud as a microservice.
- Leveraging the power of Spark and Tableau Desktop, Yu Huang, M.D., M.S. in CS shows how you can automate the process of creating dashboards.
- If you're in the mood for a fun (and enlightening) project recap, Shaked Zychlinski explains how they created a working prototype of a ChatGPT-based French tutor (complete with voice-to-text and text-to-voice functionality).
Thank you for supporting our authors! If you enjoy the articles you read on TDS, consider becoming a Medium member – it unlocks our entire archive (and every other post on Medium, too).
Until the next Variable,
TDS Editors