A Python-first data lakehouse

https://news.ycombinator.com/rss Hits: 10
Summary

“Good design is actually a lot harder to notice than poor design, in part because good designs fit our needs so well that the design is invisible”. Donald A. Norman, The Design of Everyday Things.Data and ML scientists - A life in the middleDespite AI eating the world and data becoming one of the most important things for every company on the planet, but getting models from prototype to production is still pretty problematic. According to HBR, fewer than 1 in 5 models ever make it into production. And when they do, it often takes weeks or months.Good data scientists have a special skill set in data manipulation, math, statistics and machine learning. Great data scientists are also very aware of the business problem and the business need. In my experience, the closer data scientists are to the problem, the more impact they create.Now, in many cases, that closeness has to extend all the way into production which means that data scientists must be involved with software development. Most real-world ML projects today, like recommendation engines or RAG systems, require a working understanding of software engineering and its best practices (incidentally, these use cases are often the most valuable because they impact final users directly).Now, what is the problem with that? Well, these kinds of applications require real software and infrastructure expertise, and while most data scientists know enough Python to explore data and build models, they’re usually less comfortable with the rest: distributed systems, cloud environments, CI/CD, testing frameworks, data lakes, and the messy reality of production-grade software.The perilous road from notebooks to productionWhen asked, countless companies end up telling us the same story: data scientists produced a working prototype on a Jupyter notebook, but it is unclear what happens next.Historically, the industry has defaulted to one of two options — neither of them ideal.Option 1 - Ship the notebook and YOLO. Some of the most we...

First seen: 2025-06-20 16:26

Last seen: 2025-06-21 01:36