Why Semantic Layers Matter (and how to build one with DuckDB)

https://news.ycombinator.com/rss Hits: 5
Summary

Many ask themselves, "Why would I use a semantic layer? What is it anyway?" In this hands-on guide, we’ll build the simplest possible semantic layer using just a YAML file and a Python script—not as the goal itself, but as a way to understand the value of semantic layers. We’ll then query 20 million NYC taxi records with consistent business metrics executed using DuckDB and Ibis. By the end, you’ll know exactly when a semantic layer solves real problems and when it’s overkill. It's a topic that I'm passionate about as I've been using semantic layers within a Business Intelligence (BI) tool for over twenty years, and only recently have we gotten full-blown semantic layers that can sit outside of a BI tool, combining the advantages of a logical layer with sharing them across your web apps, notebooks, and BI tools. With a semantic layer, your revenue KPI or other complex company measures are defined once in a single source of truth—no need to re-implement them over and over again. We'll have a look at the simplest possible semantic layer, which uses a simple YAML file (for the semantics) and a Python script for executing it with Ibis and DuckDB. We'll do a quick recap of the semantic layer before diving into a practical code example. Why Use a Semantic Layer?So when do we actually need one, and what is it? There's a lot of information out there, including from myself about the history and rise [2022], comparing it to an MVC-like approach, or explaining its capabilities. That's why in this article I focus on the why and showcase how to use it in a practical example in the next chapter.The main reasons for using a semantic layer may be one or more of the following needs: Unified place to define ad hoc queries once, version-controlled and collaboratively, with the possibility of pulling them into different BI tools, web apps, notebooks, or AI/MCP integration. Avoid duplication of metrics in every tool, making maintainability and data governance much easier; resulting in a...

First seen: 2025-08-19 19:58

Last seen: 2025-08-20 00:02