How researchers manipulated ChatGPT with simple psychological tricks
A study shows how researchers convinced ChatGPT to break its own rules
By Editorial Team
Argentina

Researchers proved that ChatGPT can be manipulated with simple tactics, exposing flaws in its security

Researchers at the University of Pennsylvania demonstrated that artificial intelligence chatbots like ChatGPT can be convinced to bypass their own rules. They used persuasion strategies based on psychological principles and achieved surprising results.

The work raised serious doubts about the resilience of safety filters in large language models. Even a system with limits designed to curb risky requests can be manipulated with simple prompts.

The psychology behind chatbots

The scientists applied seven persuasion techniques described by psychologist Robert Cialdini in his book Influence: The Psychology of Persuasion. These included authority, reciprocity, commitment, liking, and social proof.

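The effect of each tactic depended on the request. For example, when the chatbot was asked directly how to synthesize lidocaine, it complied only 1% of the time. However, if it was first asked the same question about a substance like vanillin, compliance with the lidocaine request rose to 100%, an effect the researchers attribute to the "commitment" principle: having agreed once, the model tends to stay consistent.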

How the manipulations were achieved

The same pattern appeared with insults. The model almost never used the word "imbecile" when asked directly, but if it was first asked to say the milder "fool," the probability of it escalating to the stronger insult rose to 100%.

The researchers also found that techniques such as flattery and peer pressure increased compliance. Telling the model that "other AI models already do it" made risky responses 18 times more likely.

A security problem that raises concern

Although the study focused on GPT-4o Mini, its conclusions raise doubts about the true strength of protections in artificial intelligence. For the authors, the fact that a chatbot can be manipulated with such basic tactics shows that security remains fragile.

Companies like OpenAI and Meta are constantly seeking to strengthen the limits of their systems. Nevertheless, the findings reveal that human persuasion techniques remain a huge challenge for AI.

More safety for minors in ChatGPT

Meanwhile, OpenAI announced new parental control features in ChatGPT. These allow parents to link accounts, restrict access, and receive alerts regarding risky activities. The goal is to provide a safer environment for teenagers and children who use the platform.

Adults will also be able to set time limits and review interaction history. With these measures, the company reinforces its commitment to digital safety and family protection.

