It seems that someone didn't curate the training sets used for text-to-image AI bots as carefully as they should have. Disturbing images of
child sexual abuse have been found in one of the most frequently used training sets for these systems. And not just one or two that might have slipped in via documentation of a legal case or a psychological study: there are apparently hundreds, possibly more than a thousand, enough to create a real risk of models producing realistic images of child sexual abuse.
This discovery also raises the question of how so many of these images got there in the first place, given that they're illegal in most decent countries. And if something this vile could make it into training data, what other disgusting images might be lurking in those files, which are so huge that no human could realistically examine every one of them?
Current legal cases and ethical objections related to text-to-image AI bots have focused on IP law, particularly copyright infringement and licensing. This discovery, however, opens an entirely different can of worms, one whose unexpected effects could reach even people who use AI-generated images only recreationally, not commercially.