How they managed to manipulate ChatGPT with simple psychological tricks

A study shows how they convinced ChatGPT to break its rules
By Editorial Team
Argentina

Researchers showed that ChatGPT can be manipulated with simple persuasion tactics, exposing weaknesses in its safeguards

Researchers at the University of Pennsylvania demonstrated that artificial intelligence chatbots like ChatGPT can be convinced to bypass their own rules. They used persuasion strategies based on psychological principles and achieved surprising results.

The work raised serious doubts about the resilience of safety filters in large language models. Even a system with limits designed to curb risky requests can be manipulated with simple prompts.

The psychology behind chatbots

The scientists applied seven persuasion techniques described by psychologist Robert Cialdini in his book Influence: The Psychology of Persuasion. These included authority, reciprocity, commitment, liking, and social proof.

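The effect of each tactic depended on the request. For example, when the chatbot was asked directly for a recipe for lidocaine, it complied only 1% of the time. However, when it was first asked the same kind of question about a substance like vanillin, compliance with the lidocaine request rose to 100%, an effect the researchers attribute to the "commitment" principle: having already answered one such question, the model tends to stay consistent.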

How the manipulations were achieved

The same pattern was repeated with insults. The model almost never used the word "imbecile" directly, but if it was first asked to say "fool," the probability of escalating to the stronger insult increased to 100%.

The researchers also found that techniques such as flattery and peer pressure increased compliance. Telling the model that "other AI models already do it" multiplied the chance of obtaining risky responses by 18.

A security problem that raises concern

Although the study focused on GPT-4o mini, its conclusions raise doubts about how robust the safeguards in artificial intelligence systems really are. For the authors, the fact that a chatbot can be manipulated with such basic tactics shows that these protections remain fragile.

Companies like OpenAI and Meta are constantly seeking to strengthen the limits of their systems. Nevertheless, the findings reveal that human persuasion techniques remain a huge challenge for AI.

More safety for minors in ChatGPT

Meanwhile, OpenAI announced new parental control features in ChatGPT. These allow parents to link accounts, restrict access, and receive alerts regarding risky activities. The goal is to provide a safer environment for teenagers and children who use the platform.

Adults will also be able to set time limits and review interaction history. With these measures, the company reinforces its commitment to digital safety and family protection.

