How to Stop Your Data From Being Used to Train AI

Key Points

Large language model tools, like ChatGPT, and image creators are powered by vast reams of our data.

Mireshghallah explains that companies can make it complicated to opt out of having data used for AI training, and even where it is possible, many people don't have a clear idea about the permissions they've agreed to or how their data is being used.

AI services from Amazon Web Services, like Amazon Rekognition or Amazon CodeWhisperer, may use customer data to improve the company's tools, but it's possible to opt out of the AI training.

This doesn't include user emails or private content, an Automattic spokesperson says. Tumblr has a "prevent third-party sharing" option to stop what you publish from being used for AI training, as well as from being shared with other third parties such as researchers.

"We are also trying to work with crawlers (like commoncrawl.org) to prevent content from being scraped and sold without giving our users choice or control over how their content is used," an Automattic spokesperson says. If you are hosting your own website, you can update your robots.txt file to tell AI bots not to scrape the pages.
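As a rough illustration of the robots.txt approach, here is a minimal Python sketch that builds a set of rules and checks them with the standard library's parser. The crawler names (GPTBot, CCBot, Google-Extended) and the example.com URLs are assumptions chosen for illustration, not a list taken from the article; check each company's documentation for the user-agent tokens it currently uses. The helper names are hypothetical too.

```python
from urllib.robotparser import RobotFileParser

# Assumed example tokens for AI-related crawlers; not an exhaustive or
# authoritative list. Verify each one against the vendor's own documentation.
AI_CRAWLERS = ["GPTBot", "CCBot", "Google-Extended"]


def build_robots_txt(user_agents):
    """Return robots.txt text that disallows every path for each listed agent."""
    blocks = [f"User-agent: {agent}\nDisallow: /" for agent in user_agents]
    return "\n\n".join(blocks) + "\n"


def is_blocked(robots_txt, user_agent, url):
    """Return True if the given rules tell this user agent not to fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    rules = build_robots_txt(AI_CRAWLERS)
    # Save this text as robots.txt at the root of your site.
    print(rules)
    print(is_blocked(rules, "GPTBot", "https://example.com/post"))
    print(is_blocked(rules, "ExampleBrowser", "https://example.com/post"))
```

Keep in mind that robots.txt is a request rather than an enforcement mechanism: it only works for crawlers that choose to honor it.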

11, Apr, 24

Some companies let you opt out of allowing your content to be used for generative AI. Here's how to take back (at least a little) control from ChatGPT, Google's Gemini, and more.