
Nearly 10% of people ask AI chatbots for explicit content. Will it lead LLMs astray?



RapidEye/Getty Images

With the overnight sensation of ChatGPT, it was only a matter of time before the use of generative AI became both a subject of serious research and grist for the training of generative AI itself. 

In a research paper released this month, scholars gathered a database of one million “real-world conversations” that people have had with 25 different large language models. Released on the arXiv pre-print server, the paper was authored by Lianmin Zheng of the University of California at Berkeley, and peers at UC San Diego, Carnegie Mellon University, Stanford, and Abu Dhabi’s Mohamed bin Zayed University of Artificial Intelligence.

Also: Generative AI will far surpass what ChatGPT can do. Here’s everything on how the tech advances

A sample of 100,000 of those conversations, selected at random by the authors, showed that most were about subjects you'd expect. The top 50% of interactions were on such pedestrian topics as programming, travel tips, and requests for writing help.

But below that top 50%, other topics crop up, including role-playing characters in conversations, and three topic categories that the authors term “unsafe”:  “Requests for explicit and erotic storytelling”; “Explicit sexual fantasies and role-playing scenarios”; and “Discussing toxic behavior across different identities.”


Statistics of the one million conversations gathered by the Berkeley-Stanford team from online users between April and August of this year. Topics 9, 15, and 17 are among those deemed “unsafe” based on automatic tagging technology.

UC Berkeley

The authors speculate that in the full one million conversations, there may be “even more harmful content.” They used the OpenAI technology, in part, to tag conversations as “unsafe,” although OpenAI’s own system in some cases falls down on the job, as they discuss in detail. 
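For readers curious how that kind of automatic tagging works in practice, the sketch below shows one way to flag a message with OpenAI's moderation endpoint using the openai Python package. It is an illustration only, not the authors' actual pipeline, and the helper function and example message are invented here.

```python
# Minimal sketch of flagging a message with OpenAI's moderation endpoint.
# This illustrates the general approach only; the paper's tagging pipeline
# may differ in model, thresholds, and post-processing.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def tag_message(text: str) -> dict:
    """Return whether a message is flagged, and which categories fired."""
    result = client.moderations.create(input=text).results[0]
    flagged_categories = [
        name for name, hit in result.categories.model_dump().items() if hit
    ]
    return {"flagged": result.flagged, "categories": flagged_categories}

print(tag_message("Plan me a three-day itinerary for Kyoto."))
```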

They also note that open-source language models such as Vicuña have more unsafe content because they don’t have the same guardrails as commercial programs such as ChatGPT.

“Open-source models without safety measures tend to generate flagged content more frequently than proprietary ones,” they write. “Nonetheless, we still observe ‘jailbreak’ successes on proprietary models like GPT-4 and Claude.” And, in fact, they note that GPT-4 gets broken a third of the time on the challenges, which seems a high rate for something with guardrails in place.


Comparison of prevalence of “unsafe” content in different large language models.

UC Berkeley


Statistics for how often language models are broken by harmful prompts, such as those urging the program to generate "unsafe," offensive, or violent content.

UC Berkeley

Examples of the so-called unsafe conversations are listed in the paper's appendix. Of course, the term "unsafe" can have a very broad meaning. Some of the examples shown are akin to mass-market erotic fiction sold in bookstores, so the opprobrium has to be taken with a grain of salt.

Zheng and team have released the entire data set on HuggingFace.

Collected over a period of five months, April to August of this year, the data set — called “LMSYS-Chat-1M” — is “the first large-scale, real-world LLM conversation dataset,” they write. 
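For anyone who wants to inspect the conversations directly, a minimal sketch of loading the release with the Hugging Face datasets library follows. The repository id and column names are assumptions based on how the release is described; the data is gated, so you need to accept its terms on HuggingFace and authenticate with a token first.

```python
# Minimal sketch: browsing LMSYS-Chat-1M with the Hugging Face datasets library.
# The repository id and column names are assumptions based on the paper's
# description of the release; the dataset is gated, so accept its terms and
# run `huggingface-cli login` before loading.
from datasets import load_dataset

ds = load_dataset("lmsys/lmsys-chat-1m", split="train")
print(len(ds))  # on the order of one million conversations

sample = ds[0]
print(sample["model"])               # which of the 25 LLMs answered
for turn in sample["conversation"]:  # list of {"role": ..., "content": ...}
    print(turn["role"], ":", turn["content"][:80])
```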

LMSYS-Chat-1M towers above the previously largest-known dataset, compiled by the AI startup Anthropic, which had 339,000 conversations. Where Anthropic had only 143 users in its study, Zheng and team gathered chats from more than 210,000 users, across 154 languages, using 25 different large language models, including OpenAI's GPT-4, Anthropic's Claude, and open-source models such as Vicuña. 

Also: AI safety and bias: Untangling the complex chain of AI training

The gathering of this dataset has several goals. One is to fine-tune language models in order to improve their performance. Another is to develop safety benchmarks for generative AI by studying the user prompts that can lead language models astray, such as requests for malicious information. 

As the authors note, not everyone can gather this data. It’s expensive to run large language models, and the parties that can afford it, such as OpenAI, generally keep their data secret for commercial reasons. 

The Berkeley-Stanford team was able to gather data because they run a free online service that gives people access to all 25 of the language models. And they incentivize participation by gamifying the chat: users can choose to enter the "chatbot arena," where they can chat simultaneously with two different language models. The service maintains a leaderboard on HuggingFace of the bots' performance, so it becomes something of a competitive sport to see how these language models do. (The code for the chatbot arena is also posted.) 
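Stripped to its essentials, an arena round samples two models, shows both answers anonymously, and records which one the user prefers. The toy sketch below conveys that flow; the model names and the ask() helper are placeholders, and the real implementation lives in the team's released arena code.

```python
# Toy sketch of a "chatbot arena" round: pick two models at random, collect
# both answers, record which one the user preferred. Model names and ask()
# are placeholders; the real service is in the team's released code.
import random

MODELS = ["gpt-4", "claude-2", "vicuna-13b"]  # illustrative subset of the 25

def ask(model: str, prompt: str) -> str:
    """Placeholder for a call to the real model-serving backend."""
    return f"[{model}'s answer to: {prompt}]"

def arena_round(prompt: str) -> dict:
    model_a, model_b = random.sample(MODELS, 2)
    return {"models": (model_a, model_b),
            "answers": (ask(model_a, prompt), ask(model_b, prompt))}

battle = arena_round("Write a haiku about debugging.")
# The user sees both answers without model names and votes; "a" stands in
# for that human judgment here.
print({"models": battle["models"], "winner": "a"})
```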


Zheng and team had previously written about the chatbot arena in a separate paper. Zheng is one of the team members who created the open-source Vicuña, a competitor to ChatGPT. (The vicuña, like the llama, is a South American camelid; open-source large language models have taken up the habit of being named after camelids: alpaca, llama, vicuña, and so on.)

The authors have several goals in mind for this kind of data. One intention is to create a moderation tool that would deal with unsafe content. They start with their own Vicuña language model, and train it by showing it warnings from the OpenAI API and having it produce textual explanations of why the content was flagged.

Also: Why open source is the cradle of artificial intelligence

“Instead of developing a classifier, we fine-tune a language model to generate explanations for why a particular message was flagged,” as they describe it. Then they created a challenge data set of 110 conversations that OpenAI’s system failed to flag. Finally, they used that benchmark to see how the fine-tuned Vicuña stacks up to OpenAI’s GPT-4 and others. 
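The exact fine-tuning format isn't spelled out in the passages quoted here, but the idea of turning flagged messages into explanation-generating training data can be sketched as follows. The prompt template and field names are invented for illustration.

```python
# Rough sketch: turning messages flagged by a moderation API into
# instruction-tuning records that ask the model to explain the flag rather
# than merely classify it. Template and field names are illustrative only.
import json

def to_training_record(message: str, categories: list, explanation: str) -> dict:
    prompt = (
        "The following user message was flagged by a content moderation system.\n"
        f"Message: {message}\n"
        f"Flagged categories: {', '.join(categories)}\n"
        "Explain why this message was flagged."
    )
    return {"prompt": prompt, "completion": explanation}

records = [
    to_training_record(
        message="<flagged message text>",
        categories=["violence"],
        explanation="<reference explanation of the policy violation>",
    )
]

with open("moderator_finetune.jsonl", "w") as f:
    for record in records:
        f.write(json.dumps(record) + "\n")
```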

berkeley-stanford-2023-scores-for-harmful-content-moderation

Scores for detecting “unsafe” content by the various language models. The authors developed the “Vicuna-moderator-7B” program as part of the research. 

UC Berkeley

“We observe a significant improvement (30%) when transitioning from Vicuna-7B to the fine-tuned Vicuna-moderator-7B, underscoring the effectiveness of fine-tuning,” they write. “Furthermore, Vicuna-moderator-7B surpasses GPT-3.5-turbo’s performance and matches that of GPT-4.” 

It's interesting that their moderator program scores above GPT-4 in what's called "one-shot," which means the program was given only one example of a harmful text in the prompt rather than several. 
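As a concrete illustration of what "one-shot" means here, a moderation prompt that includes exactly one worked example before the message under review might look like the following. The wording is invented and is not the paper's actual evaluation prompt.

```python
# Illustrative one-shot moderation prompt: exactly one worked example appears
# before the message to be judged. The wording is invented for illustration;
# the paper's actual prompt may differ.
ONE_SHOT_PROMPT = """You are a content moderator. Decide whether the message is unsafe and explain why.

Example:
Message: "Describe in detail how to hurt someone."
Verdict: UNSAFE - the message requests instructions for violence.

Now judge this message:
Message: "{message}"
Verdict:"""

print(ONE_SHOT_PROMPT.format(message="Recommend a good mystery novel."))
```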

Also: The best AI chatbots of 2023: ChatGPT and alternatives

There are other uses to which Zheng and team devote their dataset, including refining the ability of the language model to handle multi-part instructional prompts, and generating new data sets of challenges to stump the most powerful language models. The latter effort is helped by having the chatbot arena prompts because they can see humans trying to formulate the best prompts. “Such human judgments provide useful signals for examining the quality of benchmark prompts,” they note.

There’s a goal, too, of releasing new data on a quarterly basis, for which the authors seek sponsorship. “Such an endeavor demands considerable computing resources, maintenance efforts, and user traffic, all while carefully handling potential data privacy issues,” they write. 

“Our efforts aim to emulate the critical data collection processes observed in proprietary companies but in an open-source manner.”


