Most Top News Sites Block AI Bots. Right-Wing Media Welcomes Them

Key Points

New data shows that over 88 percent of top-ranked news outlets in the US now block web crawlers used by artificial intelligence companies to collect training data for chatbots and other AI projects.

One sector of the news business is a glaring outlier, though: Right-wing outlets lag far behind their liberal counterparts when it comes to bot-blocking.

Data collected in mid-January on about 40 top news sites by Ontario-based AI detection startup Originality AI shows that almost all of them block AI web crawlers, including newspapers like The New York Times, The Washington Post, and The Guardian, general-interest magazines like The Atlantic, and special-interest sites like Bleacher Report.

From a technical point of view, a media company allowing its content to be included in AI training data should have some impact on the model parameters, says Jeremy Baum, an AI ethics researcher at UCLA. However, he says he's skeptical that right-wing sites declining to block AI scraping would have a measurable effect on the outputs of finished AI systems such as chatbots.

Data journalist Ben Welsh keeps a running tally of news websites blocking AI crawlers from OpenAI, Google, and the nonprofit Common Crawl project, whose data is widely used in AI.
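
For readers curious how such a tally could be checked, here is a minimal Python sketch. It rests on assumptions not stated in the article: that sites signal blocking through their robots.txt file, and that the relevant crawler tokens are GPTBot (OpenAI), Google-Extended (Google), and CCBot (Common Crawl). The function name and the sample file are hypothetical, and this is not a description of Welsh's or Originality AI's actual method.

```python
from urllib.robotparser import RobotFileParser

# Assumed crawler tokens; the article names the organizations, not the tokens.
AI_CRAWLERS = {
    "OpenAI": "GPTBot",
    "Google": "Google-Extended",
    "Common Crawl": "CCBot",
}


def blocked_ai_crawlers(robots_txt: str, url: str = "https://example.com/") -> dict[str, bool]:
    """Return, per organization, whether its crawler is disallowed from `url`."""
    parser = RobotFileParser()
    # Parse the robots.txt text directly, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return {
        owner: not parser.can_fetch(agent, url)
        for owner, agent in AI_CRAWLERS.items()
    }


# Hypothetical robots.txt that blocks two AI crawlers and allows everyone else.
SAMPLE = """\
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: *
Allow: /
"""

if __name__ == "__main__":
    for owner, blocked in blocked_ai_crawlers(SAMPLE).items():
        print(f"{owner}: {'blocked' if blocked else 'allowed'}")
```

A robots.txt rule is only a request to crawlers, so a check like this shows what a site asks for, not what any crawler actually does.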