All Articles


Wikimedia launches dataset on Kaggle to dissuade AI scraping, ease server load
The Wikimedia Foundation has released a new beta dataset on Kaggle, offering clean, structured Wikipedia content tailored for machine learning workflows, apparently to reduce server strain caused by large-scale AI scraping.
Apr 17 · 2 min read


Wikimedia claims it's groaning under traffic from bots scraping for AI
At least 65 per cent of the most resource-intensive traffic to Wikimedia's core data centres comes from bots, stretching the non-profit's re
Apr 5 · 2 min read