Some websites have been overwhelmed by the sheer volume of bot traffic. Credit: Marco VDM/Getty

In February, the online image repository DiscoverLife, which contains nearly three million photographs of different species, started to receive millions of hits to its website every day, a much higher volume than normal. At times, this spike in traffic was so high that it slowed the site down to the point that it became unusable. The culprit? Bots.

These automated programs, which attempt to ‘scrape’ large amounts of content from websites, are increasingly becoming a headache for scholarly publishers and researchers who run sites hosting journal papers, databases and other resources.

Much of the bot traffic comes from anonymized IP addresses, and the sudden increase has led many website owners to suspect that these web-scrapers are gathering data to train generative artificial intelligence (AI) tools such as chatbots and image generators.

“It’s the wild west at the moment,” says Andrew Pitts, the chief executive of PSI, a company based in Oxford, UK, that provides a global repository of validated IP addresses for the scholarly communications community. “The biggest issue is the sheer volume of requests” to access a website, “which is causing strain on their systems. It costs money and causes disruption to genuine users.”

Those that run affected sites are working on ways to block the bots and reduce the disruption they cause. But this is no easy task, especially for organizations with limited resources. “These smaller ventures could go extinct if these sorts of issues are not dealt with,” says Michael Orr, a zoologist at the Stuttgart State Museum of Natural History in Germany.

A flood of bots

Internet bots have been around for decades, and some have been useful. For example, Google and other search engines have bots that scan millions of web pages to identify and retrieve content.
But the rise of generati...