Here's Proof You Can Train an AI Model Without Slurping Copyrighted Content

Posted on: 21 Mar, 01:17 AM

Key Points

Two announcements Wednesday offer evidence that large language models can in fact be trained without the permissionless use of copyrighted materials...

And the nonprofit Fairly Trained announced that it has awarded its first certification for a large language model built without copyright infringement, showing that technology like that behind ChatGPT can be built in a different way to the AI industry's contentious norm...

On Wednesday, researchers released what they claim is the largest available AI dataset for language models composed purely of public domain content.

"As far as I am aware, this is currently the largest public domain dataset to date for training LLMs," says Stella Biderman, the executive director of EleutherAI, an open source, collective project that releases AI models.

Although it doesn't have additional LLMs on its docket, Fairly Trained recently certified its first company to offer AI voice models, the Spanish voice-changing startup VoiceMod, as well as its first AI band, a heavy-metal project called Frostbite Orckings...

Full story at WIRED

