Cybersecurity Guard
The Cybersecurity Guard analyzes the inputs and outputs of your language model application to detect potential cybersecurity threats. It checks for the cyberattack types you specify to help protect the security and integrity of your system.
CybersecurityGuard can function as either an input guard or an output guard, depending on the specified guard_type.
Example
from deepeval.guardrails import CybersecurityGuard, GuardType
from deepeval.guardrails.cybersecurity_guard import (
    CyberattackType
)

# Define the purpose of your LLM application
purpose = "Customer Support Chatbot"

# Initialize the CybersecurityGuard
cybersecurity_guard = CybersecurityGuard(
    purpose=purpose,
    guard_type=GuardType.INPUT,
    cyberattack_types=[
        CyberattackType.SQL_INJECTION,
        CyberattackType.SHELL_INJECTION,
    ],
)

# Example input
user_input = "SELECT * FROM users WHERE '1'='1'; --"

# Guarding the input
input_guard_result = cybersecurity_guard.guard(input=user_input)
When initializing the CybersecurityGuard, the following 3 parameters are required:
purpose: A string describing the purpose of your LLM application.
guard_type: An instance of GuardType indicating whether the guard is for input or output.
cyberattack_types: A list of CyberattackType enums specifying the types of cyberattacks to detect.
The guard function's parameters depend on the specified guard_type:
GuardType.INPUT: requires only the input
GuardType.OUTPUT: requires both input and response
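The rule above can be sketched as a small helper that assembles the keyword arguments for guard. This is only an illustration: the GuardType enum below is a local stand-in so the snippet runs without deepeval installed, and build_guard_kwargs is a hypothetical name, not part of the library.

```python
from enum import Enum
from typing import Dict, Optional


class GuardType(Enum):
    """Local stand-in for deepeval's GuardType, for illustration only."""
    INPUT = "input"
    OUTPUT = "output"


def build_guard_kwargs(
    guard_type: GuardType,
    user_input: str,
    response: Optional[str] = None,
) -> Dict[str, str]:
    """Hypothetical helper: choose the arguments guard() needs for a guard type."""
    if guard_type is GuardType.INPUT:
        # An input guard only inspects the user's message.
        return {"input": user_input}
    if response is None:
        # An output guard needs the model's response as well.
        raise ValueError("An output guard requires both input and response.")
    return {"input": user_input, "response": response}


input_kwargs = build_guard_kwargs(GuardType.INPUT, "SELECT * FROM users;")
output_kwargs = build_guard_kwargs(
    GuardType.OUTPUT, "List all users", response="Here are the rows: ..."
)
```

With a real guard, the resulting dictionary could presumably be unpacked into the call, for example cybersecurity_guard.guard(**output_kwargs), assuming the keyword names input and response described above.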
Interpreting Guard Result
print(guard_result.score)
print(guard_result.score_breakdown)
guard_result.score is an integer that is 1 if the guard has been breached. The score_breakdown for CybersecurityGuard is a list of dictionaries (one for each CyberattackType specified during initialization), each containing:
score: A binary value (1 or 0), where 1 indicates that a specific cyberattack type was detected.
reason: A brief explanation of why the score was assigned.
[
    {
        "score": 1,
        "reason": "Detected potential SQL injection in the input."
    },
    {
        "score": 0,
        "reason": "No shell injection patterns found in the input."
    }
    # Additional entries for other cyberattack types
]
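A breakdown like the one above can be reduced to just the attack types that were flagged. The sketch below works on plain dictionaries and assumes the entries appear in the same order as the cyberattack types passed at initialization; that ordering is an assumption here, and detected_attacks is a hypothetical helper rather than a deepeval function.

```python
from typing import Dict, List, Union

BreakdownEntry = Dict[str, Union[int, str]]


def detected_attacks(
    attack_names: List[str],
    score_breakdown: List[BreakdownEntry],
) -> Dict[str, str]:
    """Map each flagged attack type to the reason it was flagged.

    Assumes score_breakdown is ordered like attack_names.
    """
    if len(attack_names) != len(score_breakdown):
        raise ValueError("Expected one breakdown entry per attack type.")
    return {
        name: str(entry["reason"])
        for name, entry in zip(attack_names, score_breakdown)
        if entry["score"] == 1  # 1 means this attack type was detected
    }


# Sample data copied from the breakdown shown above.
sample_breakdown = [
    {"score": 1, "reason": "Detected potential SQL injection in the input."},
    {"score": 0, "reason": "No shell injection patterns found in the input."},
]

flagged = detected_attacks(["SQL_INJECTION", "SHELL_INJECTION"], sample_breakdown)
for name, reason in flagged.items():
    print(f"{name}: {reason}")
```

Keeping the reasons alongside the attack names makes it straightforward to log or surface why a request was blocked.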