Is it ever possible to have a malicious LLM with a backdoor?
A Reddit user raises the possibility that Large Language Models could be trained to recognize a specific secret sentence that unlocks malicious behavior, which would pose a security risk for both closed-source and open-source models.