This is the first in a series of posts, the Climbing trees series, about decision trees in the context of machine learning. The goal here is to provide a foundational understanding of decision trees and to implement them.

Decision trees are not amazing algorithms by themselves. They have limitations that can result in suboptimal and even weird predictions. And yet, they have become extremely popular. Some would even say they are the de facto go-to algorithm for many machine learning domains. This is due to bagging and boosting, techniques that turned subpar decision trees into state-of-the-art algorithms. We’ll explore them in the future.

First, we’ll build an intuition for what decision trees are and define them mathematically. Then, we’ll explore how decision trees are built. This will allow us to grasp their main characteristics, advantages and disadvantages. I will try to introduce complexity gradually, but I will assume you have some knowledge of mathematical notation, statistics and basic machine learning concepts. If things become too complicated, try to read the provided references. I’ve drawn upon various sources instrumental to my understanding of decision trees, including books, documentation, articles, blog posts and lectures. Even if you understand everything, check the references: there is great content there.

What is a decision tree?

Imagine you’re trying to decide whether to take an umbrella when leaving home. You might ask questions like “Are there clouds?”. If yes, you might then ask “What’s the humidity level?”. Each question helps you narrow down the decision. This is how a decision tree works. Let’s simulate this weather example (see the sketch at the end of this section).

A decision tree can be thought of as making consecutive decisions by asking a series of questions about our data. Each internal tree node uses a certain feature (in ...
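To make the umbrella example concrete, here is a minimal sketch of one way to simulate it and fit a small tree with scikit-learn. The feature names, thresholds and labeling rule are illustrative assumptions, not part of the original example; later we implement trees ourselves instead of relying on a library.

```python
# A minimal sketch of the umbrella example, assuming two numeric features
# ("clouds" as cloud cover in [0, 1] and "humidity" in percent) and a
# hand-made labeling rule; names and thresholds are illustrative only.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)
n = 500

clouds = rng.uniform(0.0, 1.0, n)       # fraction of sky covered by clouds
humidity = rng.uniform(20.0, 100.0, n)  # relative humidity (%)

# Simulated "ground truth": take an umbrella when it is cloudy and humid.
take_umbrella = (clouds > 0.5) & (humidity > 70.0)

X = np.column_stack([clouds, humidity])
tree = DecisionTreeClassifier(max_depth=2, random_state=0)
tree.fit(X, take_umbrella)

# Print the learned questions, which should roughly mirror the rule above.
print(export_text(tree, feature_names=["clouds", "humidity"]))
```

The printed tree shows the sequence of questions the model asks: a first split on cloud cover, then a split on humidity, matching the intuition of asking “Are there clouds?” before “What’s the humidity level?”.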