OWASP LLM:prompt injections 101 null Hyderabad Meet 28 September 2024 Monthly Meet
Abstract
Advancements in large language models (LLMs) have revolutionized natural language processing, yet
they've also exposed new security vulnerabilities—particularly, prompt injection vulnerabilities. This
tech talk delves into the intricacies of prompt injection, where attackers manipulate LLMs through
crafted inputs to perform unintended actions, examining both the direct and indirect forms of prompt
injections and their potential consequences.Direct prompt injections, or "jailbreaking," involve overt modifications of the system prompt,
allowing attackers to exploit backend systems. Indirect prompt injections occur when LLMs process
manipulated inputs from external sources like websites, leading to unstable outputs that can mislead
users or manipulate systems. The impacts of these vulnerabilities range from data exfiltration to
influencing decision-making processes, with sophisticated attacks even enabling attackers to mimic
harmful personas or misuse plugins.We will explore real-world examples of prompt injection vulnerabilities, such as malicious users
crafting direct prompt injections to extract private data or indirect injections embedded in web content
that lead to unauthorized actions. These examples underscore the potential dangers and emphasize the
importance of robust security measures.The talk will also cover effective mitigation strategies to safeguard against prompt injections. These
include enforcing privilege control, integrating human oversight in critical operations, segregating
external content, establishing trust boundaries, and periodic manual monitoring of LLM inputs and
outputs.
This session aims to equip participants with a deeper understanding of the security challenges posed
by LLMs and the strategies to mitigate these risks, ensuring safer deployment and management of
these powerful models in various applications.
Speaker
Timing
Starts at Saturday September 28 2024, 11:00 AM. The sessions runs for about 1 hour.