The Waluigi Effect - LessWrong

24 Oct 2023 www.lesswrong.com (Archive)

A mechanistic explanation of the Waluigi Effect and other bizarre "semiotic" phenomena that arise within large language models such as GPT-3/3.5/4 and their variants (ChatGPT, Sydney, etc.). This article will be folklorish to some readers and profoundly novel to others.

#AI #LLM #chatgpt #gpt #theory