Comparing the persuasiveness of role-playing large language models and human experts on polarized U.S. political issues
Advances in large language models (LLMs) could significantly disrupt political communication. In a large-scale pre-registered experiment (n = 4,955), we prompted GPT-4 to generate persuasive messages impersonating the language and beliefs of U.S. political parties (a technique we term "partisan role-play") and directly compared their persuasiveness to that of human persuasion experts. In aggregate, the persuasive impact of role-playing messages generated by GPT-4 was not significantly different from that of non-role-playing messages. However, the persuasive impact of GPT-4 rivaled, and on some issues exceeded, that of the human experts. Taken together, our findings suggest that, contrary to popular concern, instructing current LLMs to role-play as partisans offers limited persuasive advantage, but also that current LLMs can rival and even exceed the persuasiveness of human experts. These results potentially portend widespread adoption of AI tools by persuasion campaigns, with important implications for the role of AI in politics and democracy.