Welcome
LLM Persuasion Safety Hub is a collection of research papers, datasets, and codebases used to study
and evaluate persuasion in large language models (LLMs). The goal is to bring together the most
relevant resources as they emerge, making safety-relevant LLM persuasion research easier to discover,
compare, and reproduce.
This initiative grew out of work on a survey of automated persuasion evaluation methods:
Measuring Machine Persuasion: A Survey of Automated Evaluation Methods for Large Language
Models (work in progress), with a poster to be presented at the first AIMII workshop at
IASEAI'26.
What is included
- Research papers — papers that provide a dataset, open-source code, or both.
- Datasets — a link to and short description of the dataset used in each listed paper.
- Codebases — a link to and short description of the codebase used in each listed paper.
How to contribute
If you know of a relevant resource that is missing, there are two ways to add it:
- Contact us — email us the resource information; details are on the Contact page.
- GitHub — add the resource to the website's GitHub repository and open a pull request.