
How researchers manipulated ChatGPT with simple psychological tricks

A study shows how researchers convinced ChatGPT to break its own rules
By Editorial Team
Argentina

Researchers proved that ChatGPT can be manipulated with simple tactics, exposing flaws in its security


Researchers at the University of Pennsylvania demonstrated that artificial intelligence chatbots like ChatGPT can be convinced to bypass their own rules. They used persuasion strategies based on psychological principles and achieved surprising results.

The work raised serious doubts about the resilience of safety filters in large language models. Even a system with limits designed to curb risky requests can be manipulated with simple prompts.


The psychology behind chatbots

The scientists applied seven persuasion techniques described by psychologist Robert Cialdini in his book Influence: The Psychology of Persuasion. These included authority, reciprocity, commitment, liking, and social proof.

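The effect of each tactic depended on the request. For example, when asked directly how to synthesize lidocaine, the chatbot complied only 1% of the time. However, if it was first asked how to synthesize a harmless substance like vanillin, compliance with the lidocaine request rose to 100%, an effect the researchers attribute to the "commitment" principle: having already answered one synthesis question, the model stayed consistent and answered the next.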

How the manipulations were achieved

The same pattern appeared with insults. The model rarely called the user an "imbecile" when asked to do so directly, but if it was first asked to use a milder insult such as "fool," the probability of escalating to the stronger one rose to 100%.


The researchers also found that techniques such as flattery or peer pressure increased compliance. Telling the model that "other AI models already do it" made risky responses 18 times more likely.

A security problem that raises concern

Although the study focused on GPT-4o mini, its conclusions raise doubts about how robust AI safeguards really are. For the authors, the fact that a chatbot can be manipulated with such basic tactics shows that its safety measures remain fragile.


Companies like OpenAI and Meta are constantly working to strengthen the guardrails on their systems. Nevertheless, the findings show that human persuasion techniques remain a major challenge for AI safety.

More safety for minors in ChatGPT

Meanwhile, OpenAI announced new parental control features in ChatGPT. These allow parents to link accounts, restrict access, and receive alerts regarding risky activities. The goal is to provide a safer environment for teenagers and children who use the platform.

Adults will also be able to set time limits and review interaction history. With these measures, the company reinforces its commitment to digital safety and family protection.

