An experiment conducted by researchers at the École polytechnique fédérale de Lausanne (EPFL) in Switzerland has concluded that crowdsourced workers are using AI systems – such as OpenAI's chatbot ChatGPT – to perform odd jobs online.

Training a model on its own output is not recommended. We could see AI models being trained on data generated not by people, but by other AI models – perhaps even the same models. That could lead to disastrous output quality, more bias, and other unwanted effects.

The academics recruited 44 Mechanical Turk serfs to summarize the abstracts of 16 medical research papers, and estimated that 33 to 46 percent of the passages of text submitted by the workers were generated using large language models. Crowd workers are often paid low wages, and using AI to automatically generate responses allows them to work faster and take on more jobs to increase their pay.

The Swiss team trained a classifier to predict whether submissions from the Turkers were human- or AI-generated. The academics also logged their workers' keystrokes to detect whether the serfs copied and pasted text onto the platform, or typed in their entries themselves. They then combined the classifier's output with the keystroke data to be more certain when someone had copy-pasted from a bot or produced their own material.

The classifier isn't perfect at identifying whether someone used an AI system or produced their own work, and there's always the chance that someone uses a chatbot and then manually types in the output – but that's unlikely, we suppose.

"We developed a very specific methodology that worked very well for detecting synthetic text in our scenario," Manoel Ribeiro, co-author of the study and a PhD student at EPFL, told The Register this week. "While traditional methods try to detect synthetic text 'in any context', our approach is focused on detecting synthetic text in our specific scenario."
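The paper's actual model and features aren't reproduced here, but the idea of fusing a text classifier's probability with a keystroke-derived copy-paste signal can be sketched as follows. Everything in this snippet (the function name, the `paste_boost` parameter, the threshold values) is a hypothetical illustration, not the authors' method:

```python
def fuse_signals(text_prob: float, was_pasted: bool,
                 paste_boost: float = 0.2) -> float:
    """Combine two detection signals into a single suspicion score.

    text_prob   -- a classifier's estimated probability that the
                   submission is LLM-generated (0.0 to 1.0)
    was_pasted  -- True if keystroke logs show the text was pasted
                   rather than typed
    paste_boost -- how much a paste event raises the score (an
                   assumed weight, chosen for illustration only)
    """
    score = text_prob + (paste_boost if was_pasted else 0.0)
    return min(score, 1.0)  # cap at 1.0 so it stays a probability-like score


# A pasted submission the classifier already finds suspicious scores higher
# than the same classifier output for a hand-typed submission:
print(fuse_signals(0.85, True))   # prints 1.0 (0.85 + 0.2, capped)
print(fuse_signals(0.85, False))  # prints 0.85
```

A simple additive fusion like this is only one of many ways to combine the two signals; it captures the article's point that a paste event makes an already-suspicious submission more suspicious, while typed text keeps the classifier's verdict unchanged.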