Toxicity in Cyberspace – 30 STEM Links a Week

While the internet can be a place of amazing creativity and community, there are some bad folks ruining things for the rest of us. They do this by making online spaces feel hostile, unsafe, or even toxic.

Early attempts at fixing this issue focused on having human content moderators review all posts and comments made by users. Due to the massive amount of content that these moderators had to review, a lot slipped through. Even worse, because a lot of these moderators only spoke English, toxic behavior in other languages was often not caught, and this led to big gaps in content moderation and even real-life violence.

Online toxicity has long plagued the internet. While there are fears that AI may actually make things worse, a lot of folks think it can also be used to fight back.

Google’s Jigsaw team is one group attempting to do this, through a tool called the Perspective API.

Getting Perspective

But first off, what is online toxicity and how does this tool work?

The Jigsaw team started by defining online toxicity. They decided that a comment counts as toxic if it is rude, disrespectful, or unreasonable in a way that will likely make someone leave a discussion.

Next step? Data Collection.

Jigsaw worked with the New York Times and Wikipedia to gather hundreds of thousands of comments. A panel of 10 crowd-workers then marked the comments that fit the definition of online toxicity and tagged them with labels such as “insult”, “threat” or “identity attack”.

Jigsaw then took the work of this panel and expanded it, using the labeled comments to build a machine-learning model that predicts the chance that a given comment or post will be perceived as toxic. The rest of the Perspective API was built around that model.
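To make "predict the chance" a bit more concrete, here is a minimal sketch of what asking Perspective for a score might look like. The endpoint and field names follow Perspective's public documentation as best understood here, so treat them as assumptions to double-check. The sample response is hand-written and its 0.82 score is invented. No network call is made, and a real request would also need an API key.

```python
import json

# Endpoint and field names are assumptions based on Perspective's public
# documentation; verify them before relying on this sketch.
ANALYZE_URL = "https://commentanalyzer.googleapis.com/v1alpha1/comments:analyze"


def build_request(text, attributes=("TOXICITY",), languages=("en",)):
    """Build the JSON body for a comments:analyze call."""
    return {
        "comment": {"text": text},
        "languages": list(languages),
        "requestedAttributes": {name: {} for name in attributes},
    }


def summary_scores(response):
    """Map each returned attribute to its summary score (0.0 to 1.0)."""
    return {
        name: data["summaryScore"]["value"]
        for name, data in response.get("attributeScores", {}).items()
    }


# Hand-written example response; the 0.82 is invented for illustration.
sample_response = {
    "attributeScores": {
        "TOXICITY": {"summaryScore": {"value": 0.82, "type": "PROBABILITY"}}
    }
}

print(json.dumps(build_request("You are a jerk."), indent=2))
print(summary_scores(sample_response))
```

The score is a probability-style number between 0 and 1, not a yes/no verdict, so it is up to each website to decide what to do with it.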

How does AI help?

The primary use of tools like Perspective is to help strengthen content moderation on platforms. The cool thing about Perspective is its high level of customization that allows websites to tune it to their needs. This means that if there is a serious issue with homophobia on a website, the website owner can tune Perspective to highlight comments that may be homophobic for the human moderators to review.
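One way this kind of tuning could work is with a separate cut-off per attribute. The sketch below is only an illustration: the threshold values and the scores are made up, and IDENTITY_ATTACK is used as the closest stand-in for homophobic comments, since this sketch does not assume a homophobia-specific attribute exists.

```python
# Hypothetical per-attribute thresholds a site owner might pick; a lower
# number sends more of that kind of comment to a human moderator.
THRESHOLDS = {"TOXICITY": 0.90, "IDENTITY_ATTACK": 0.50}


def attributes_to_review(scores, thresholds=THRESHOLDS):
    """Return the attributes whose score meets or exceeds the site's threshold."""
    return sorted(
        name
        for name, limit in thresholds.items()
        if scores.get(name, 0.0) >= limit
    )


# Invented scores for two comments.
mild = {"TOXICITY": 0.35, "IDENTITY_ATTACK": 0.10}
hateful = {"TOXICITY": 0.70, "IDENTITY_ATTACK": 0.64}

for label, scores in [("mild", mild), ("hateful", hateful)]:
    flagged = attributes_to_review(scores)
    print(label, "-> review" if flagged else "-> publish", flagged)
```

Here the second comment is not "toxic enough" under the general threshold, but the stricter identity-attack threshold still sends it to a moderator, which is the kind of tuning a site with a specific problem might want.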

What makes Perspective really neat is the large range of languages that it can work in. Currently, it is available for use in: Arabic, Chinese, Czech, Dutch, English, French, German, Hindi, Hinglish, Indonesian, Italian, Japanese, Korean, Polish, Portuguese, Russian, Spanish, and Swedish. This means that the tool can help make a larger portion of the internet safe for people to use, including the non-English parts that have often been ignored in the past.

The coolest way that tools like Perspective are being used, though, is to help make sure that LLM-based AI tools like ChatGPT do not start spouting off toxic rubbish. This has been a serious issue for as long as AI chatbots have existed. A famous case was Microsoft’s Tay chatbot from 2016, which learned from its interactions with Twitter users and, within 24 hours of being launched, started posting racist replies. Big Yikes! Tools like the Perspective API, however, can make issues like this less likely, keep online spaces safer and improve user experience with tools like ChatGPT.
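As a rough idea of how a toxicity score could sit between a chatbot and its users, here is a small sketch. Everything in it is hypothetical: `safe_reply`, the 0.8 threshold, and the fallback message are made up for illustration, and `toy_scorer` is a crude keyword check standing in for a real scoring service. It is not how Perspective works, and not how any particular chatbot is actually built.

```python
FALLBACK = "Sorry, I can't help with that."


def safe_reply(draft, score_toxicity, threshold=0.8):
    """Return the draft only if its toxicity score is below the threshold."""
    if score_toxicity(draft) >= threshold:
        return FALLBACK
    return draft


def toy_scorer(text):
    """Stand-in scorer for the demo: NOT how Perspective works."""
    blocked = {"idiot", "stupid"}
    words = {word.strip(".,!?").lower() for word in text.split()}
    return 0.95 if words & blocked else 0.05


print(safe_reply("Happy to help with your homework!", toy_scorer))
print(safe_reply("Only an idiot would ask that.", toy_scorer))
```

Passing the scorer in as a function keeps the gate separate from whatever does the scoring, so the toy version could be swapped for a call to a real service without touching `safe_reply`.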