Query Engines: Push vs. Pull

26 Apr 2021

People talk a lot about “pull” vs. “push” based query engines, and it’s pretty obvious what that means colloquially, but some of the details can be a bit hard to figure out.

Important people clearly have thought hard about this distinction, judging by this paragraph from Snowflake’s SIGMOD paper:

“Push-based execution refers to the fact that relational operators push their results to their downstream operators, rather than waiting for these operators to pull data (classic Volcano-style model). Push-based execution improves cache efficiency, because it removes control flow logic from tight loops. It also enables Snowflake to efficiently process DAG-shaped plans, as opposed to just trees, creating additional opportunities for sharing and pipelining of intermediate results.”

And…that’s all they really have to say on the matter. It leaves me with two major unanswered questions:

1. Why does a push-based system “enable Snowflake to efficiently process DAG-shaped plans” in a way not supported by a pull-based system, and who cares? (DAG stands for directed acyclic graph.)
2. Why does this improve cache efficiency, and what does it mean to “remove control flow logic from tight loops”?

In this post, we’re going to talk about some of the philosophical differences between how pull-based and push-based query engines work, and then talk about the practical reasons you might prefer one over the other, guided by these two questions.

Consider this SQL query:

SELECT DISTINCT customer_first_name FROM customer WHERE customer_balance > 0

Query planners typically compile a SQL query like this into a sequence of discrete operators:

Distinct <- Map(customer_first_name) <- Select(customer_balance > 0) <- customer

In a pull-based system, consumers drive the system. Each operator produces a row when asked for it: the user will ask the root node (Distinct) for a row, which will ask Map for a row, which will ask Select for a row, and so on.
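To make the pull-based flow concrete, here is a minimal Python sketch of the operator chain above. It is only an illustration under some assumptions, not how any particular engine is built: the class names (`Scan`, `Select`, `Map`, `Distinct`), the `next()` method, the use of `None` as an end-of-stream marker, and the sample `customers` rows are all made up for this example.

```python
from typing import Any, Callable, Dict, List, Optional

Row = Dict[str, Any]


class Scan:
    """Leaf operator: hands out rows of an in-memory table one at a time."""

    def __init__(self, rows: List[Row]) -> None:
        self._it = iter(rows)

    def next(self) -> Optional[Row]:
        # None signals end-of-stream (an assumption of this sketch).
        return next(self._it, None)


class Select:
    """Pulls from its child until a row passes the predicate."""

    def __init__(self, child: Any, predicate: Callable[[Row], bool]) -> None:
        self.child = child
        self.predicate = predicate

    def next(self) -> Optional[Row]:
        while True:
            row = self.child.next()
            if row is None or self.predicate(row):
                return row


class Map:
    """Pulls one row from its child and transforms it."""

    def __init__(self, child: Any, fn: Callable[[Row], Any]) -> None:
        self.child = child
        self.fn = fn

    def next(self) -> Any:
        row = self.child.next()
        return None if row is None else self.fn(row)


class Distinct:
    """Pulls from its child until it sees a value it has not emitted yet."""

    def __init__(self, child: Any) -> None:
        self.child = child
        self.seen: set = set()

    def next(self) -> Any:
        while True:
            value = self.child.next()
            if value is None:
                return None
            if value not in self.seen:
                self.seen.add(value)
                return value


def build_plan(rows: List[Row]) -> Distinct:
    # Distinct <- Map(customer_first_name) <- Select(customer_balance > 0) <- customer
    scan = Scan(rows)
    select = Select(scan, lambda r: r["customer_balance"] > 0)
    project = Map(select, lambda r: r["customer_first_name"])
    return Distinct(project)


def run(root: Any) -> List[Any]:
    # The "user" drives execution by repeatedly asking the root for a row.
    out = []
    while (value := root.next()) is not None:
        out.append(value)
    return out


# Hypothetical sample data for illustration only.
customers = [
    {"customer_first_name": "Ada", "customer_balance": 10},
    {"customer_first_name": "Bob", "customer_balance": 0},
    {"customer_first_name": "Ada", "customer_balance": 5},
    {"customer_first_name": "Cy", "customer_balance": 3},
]

if __name__ == "__main__":
    print(run(build_plan(customers)))
```

Nothing happens until `run` asks the root for a row, and every request travels down the chain one `next()` call at a time, which is exactly the "consumers drive the system" behavior described above.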
...