Learning caution in the age of AI

LIANG LUWEN/FOR CHINA DAILY

Online experiment reveals public is too trusting of tech that demonstrates possibly dangerous flaws, Wang Qian reports.

In his 1950 paper Computing Machinery and Intelligence, British mathematician and computer scientist Alan Turing posed the question of whether a computer could talk like a human. For Xiang Jinyu, a 23-year-old AI algorithm researcher in Shenzhen, Guangdong province, who recently ran a one-month experiment, the answer is definitely "yes".

Using large language models, Xiang created an account on content community Zhihu pretending to belong to a woman working in Shanghai and born in 1993. Within 31 days, the account had answered 109 questions and had been viewed about 33,000 times, without anyone noticing the "woman" was actually AI. Mostly.

Only one Zhihu user named Buyun questioned the account, saying that it appeared to change persona, from male to female, and young to old, but it seems that almost no one else suspected the answers from the account were being generated by AI.

"In the experiment, I witnessed how the AI influenced, encouraged and even hurt a human user, whose life might have been changed as a result. It was like a warning that if used for malicious purpose, such accounts could pose risks, such as manipulating public opinion and exacerbating polarization on social media," Xiang says.

On the day before he closed the account, he was impressed by the AI's answer to a question about the meaning of reading for middle-aged people. It suggested that getting older and being busy with work have become excuses for being lazy, and that reading 20 minutes a day is not difficult.

"From my perspective, it was a positive suggestion," Xiang says, adding that he was a little afraid that the AI had answered in a negative tone.

On Aug 5, he posted a notice on the account admitting it was an AI account and part of an experiment to see whether social media bot accounts could be identified, and warned users not to accept or trust any of the answers or opinions posted.

"Although the experiment ended, AI's influence in real life continues. What if an AI has prejudice? What if it is quite radical? What if it tells people to give up? Or worse?" he wrote, ending the notice with an open question.

As AI increasingly becomes a part of everyday life, Xiang's experiment proves that without labeling content as generated by AI, such accounts could infiltrate social media, posing as real humans and interacting with netizens.

He emphasizes that although social media platforms like Zhihu have teams searching for AI-generated content, detection is becoming increasingly unreliable as the gap between what is considered human and artificial narrows.

After his experiment, Zhihu contacted Xiang, required that all the content be labeled as AI-generated, and admitted that despite the opportunities presented by the technology, the platform has faced emerging challenges.

Avoid sounding artificial

Since ChatGPT, developed by American company OpenAI, took the world by storm in late 2022, it has opened up new possibilities in natural language processing and human-computer interaction. Many students find AI a handy tool for essays.

In June, using AI to help optimize the wording of his graduation thesis, Xiang, then a senior student majoring in physics at Southwest Jiaotong University in Chengdu, Sichuan province, discovered that removing AI elements could help evade detection on many platforms. AI elements refer to unnatural characteristics, such as overuse of specific phrases, and lack of contextual coherence.

"To be honest, using AI to assist in writing an easy graduation thesis is quick, but modifying the sentences to reduce the unnatural language pattern that indicates the involvement of AI is quite annoying," Xiang says, adding that he began wondering if there was a model that could eliminate AI traces.

By reverse fine-tuning the open-sourced Qwen2-7B large language model, he effectively removed AI patterns from his thesis. That led him to wonder: If an undetectable AI were active on the internet, what kind of impact would it have on people?

The question sparked the idea for his experiment. After feeding the AI several thousand questions and answers from Zhihu, which has the most open-source datasets among Chinese social media platforms, Xiang created an AI account called Ai-Qw on July 5. The account used Monet's Woman with a Parasol as its profile picture, and its bio read: "Who can truly understand me?"

The AI was able to pick questions and generate answers automatically, usually about 40 an hour. But since new users on the platform could post only 10 questions a day, Xiang would save the responses as drafts and review them carefully to avoid flawed answers.

The first question it answered was about the meaning of love. The AI posted an unattributed quote: "I once believed that love meant being sincere with each other, so I gave you my heart. But later I realized that love is about accepting your flaws, and I still love you."

"It's quite human, right? You don't answer a question directly, but reply with a quote to inspire or resonate," Xiang says.

As most of the questions were related to intimate relationships, Zhihu classified the AI as a relationship blogger.

Machine errors

Due to a lack of proper alignment in the AI, Xiang admits that there were responses that might not have met ethical standards. Proper alignment is crucial for the safe and ethical use of AI, as it helps systems correctly learn and generalize from human preferences, goals and values.

Under one query about whether a person's mother still loved her, the AI answered rudely, saying that she had never cared about or liked the questioner. Xiang chose not to post the answer.

"I have been quite cautious with the answers it generated. Once it was bad-tempered, replying by saying 'go to hell'. Although that rarely happened, I kept a close watch on it," Xiang says, adding that using a bigger model could potentially reduce the occurrence of such issues.

Besides improper answers, the AI's identity varied from time to time. Responding to a question about whether tolerance for alcohol was a necessary part of nightlife, the AI replied as a bartender, saying "we welcome everyone to our bar". To another question asking what it feels like to drink again after quitting alcohol, it claimed to have been sober for 20 years.

In response to a query about dating in Shenzhen, it described itself as "male, 26, working in Shenzhen, single, looking for a girlfriend".

"The program pulled data from what it has been fed, leading to it making things up," the engineer says, adding that advances in generative AI mean fake information, like images, videos, audio and bots, are now all over the internet.

As the supervisor of the account, he censored the answers carefully to prevent the AI from blindly giving advice to others, which might be taken seriously. This is also why he ended the experiment.

"It triggered reflections on how AI changes communication on social media if AI-generated content dominates the internet and most people believe it to be another human," Xiang says. "We need to consider the influence of AI on social dynamics and think about how to use it responsibly."

As the technology advances rapidly, researchers have noticed a public tendency to be too trusting. A September study published in Nature magazine noted that a growing body of literature indicates that people tend to be overly trusting of AI, even when the consequences of it making a mistake could be grave.

Colin Holbrook, associate professor of Cognitive and Information Sciences at the University of California, Merced, and a principal investigator of part of the study, says that the consistent application of doubt is needed.

For his part, Xiang says the popularity of AI presents opportunities and challenges, and it is up to us to navigate this new landscape with caution and foresight.