Did you know your favorite website can detect when you’re browsing it in public transport and when you scroll it laying in your bed? Today we’ll learn how they can do it and how this info is used to fight bots. I gave this talk at Google Developer Student Club Žilina, and CodeBeer in Bratislava last year. To be honest, I had already forgotten about it, but recently I found it in my notes, and damn, this stuff is interesting! So after a bit of editing and updates, here it is as an article. A small disclaimer before we start: I didn’t work in a big company that manages bots or does bot protection, nor do I have any bots of my own. I learned all this as part of research for one of the projects I did for my client, found this stuff very fascinating, and I really wanted to share it with others. Just keep in mind it’s not a definitive guide and there are many, many more details to each technique covered here. Table of contents Intro We’ll be talking about web bots and their evolution. And by “web bots” here I mean both programs that imitate human users to perform some actions (like shitposting on social media) and scrapers that just collect information (usually imitating human users too). While some bots might be very useful for the internet (like search engine crawlers or archivers), a lot of times it’s not the case. Site owners were fighting bots from the very beginning of the internet, and techniques to detect them evolved as bots became more sophisticated. Today we’ll cover the evolution of both bots and their detection techniques and see how this cat-and-mouse game unfold. Simplest bot Let’s start from the ground up. One of the simplest bots you can imagine is just a script that makes a request to the website using some HTTP client library or CLI program like curl or wget. ❯ curl -v http://httpbin.org/html * Host httpbin.org:80 was resolved. * IPv6: (none) * IPv4: 52.45.33.43, 18.209.97.55, 34.198.95.5, 35.169.248.232 * Trying 52.45.33.43:80... * Connected to httpbin...
First seen: 2025-06-28 11:30
Last seen: 2025-06-28 17:31