Welcome
LLM Persuasion Safety Hub is a collection of research papers, datasets, and codebases used to study
and evaluate persuasion in large language models (LLMs). The goal is to bring together the most
relevant resources as they emerge, making safety-relevant LLM persuasion research easier to discover,
compare, and reproduce.
This initiative grew out of work on a survey of automated persuasion evaluation methods:
Measuring Machine Persuasion: A Survey of Automated Evaluation Methods for Large Language
Models (work in progress), with a poster to be presented at the first AIMII workshop at
IASEAI'26.
What is included
- Research papers — papers that provide a dataset, open-source code, or both.
- Datasets — a link to and short description of the dataset used in each listed paper.
- Codebases — a link to and short description of the codebase used in each listed paper.
How to contribute
If you know of a relevant resource that is missing, there are two ways to add it:
- Contact us — email us the resource information; details are on the Contact page.
- GitHub — add the resource to the website's GitHub repository and open a pull request.