Hi There,
Dive into the hottest AI breakthroughs of the week—handpicked just for you!
Super Important AI News 🔥 🔥 🔥
Featured AI Update 🛡️🛡️🛡️
Constitutional Classifiers is a structured framework designed to enhance LLM safety. These classifiers are trained using synthetic data generated in accordance with clearly defined constitutional principles. By outlining categories of restricted and permissible content, this approach provides a flexible mechanism for adapting to evolving threats.
Rather than relying on static rule-based filters or human moderation, Constitutional Classifiers take a more structured approach by embedding ethical and safety considerations directly into the system. This allows for more consistent and scalable filtering without significantly compromising usability.
Anthropic conducted extensive testing, involving over 3,000 hours of red-teaming with 405 participants, including security researchers and AI specialists. The results highlight the effectiveness of Constitutional Classifiers:
✔️ No universal jailbreak was discovered that could consistently bypass the safeguards.
✔️ The system successfully blocked 95% of jailbreak attempts, a significant improvement over the 14% refusal rate observed in unguarded models.
✔️ The classifiers introduced only a 0.38% increase in refusals on real-world usage, indicating that unnecessary restrictions remain minimal.
✔️ Most attack attempts focused on subtle rewording and exploiting response length, rather than finding genuine vulnerabilities in the system......
Coding Tutorial 🎖️🎖️🎖️
In our previous tutorial, we built an AI agent capable of answering queries by surfing the web. However, when building agents for longer-running tasks, two critical concepts come into play: persistence and streaming. Persistence allows you to save the state of an agent at any given point, enabling you to resume from that state in future interactions. This is crucial for long-running applications. On the other hand, streaming lets you emit real-time signals about what the agent is doing at any moment, providing transparency and control over its actions. In this tutorial, we’ll enhance our agent by adding these powerful features.
Setting Up the Agent
Let’s start by recreating our agent. We’ll load the necessary environment variables, install and import the required libraries, set up the Tavily search tool, define the agent state, and finally, build the agent…….
pip install langgraph==0.2.53 langgraph-checkpoint==2.0.6 langgraph-sdk==0.1.36 langchain-groq langchain-community langgraph-checkpoint-sqlite==2.0.1
import os
os.environ['TAVILY_API_KEY'] = "<TAVILY_API_KEY>"
os.environ['GROQ_API_KEY'] = "<GROQ_API_KEY>"
from langgraph.graph import StateGraph, END
from typing import TypedDict, Annotated
import operator
from langchain_core.messages import AnyMessage, SystemMessage, HumanMessage, ToolMessage
from langchain_groq import ChatGroq
from langchain_community.tools.tavily_search import TavilySearchResults
tool = TavilySearchResults(max_results=2)
class AgentState(TypedDict):
messages: Annotated[list[AnyMessage], operator.add]
class Agent:
def __init__(self, model, tools, system=""):
self.system = system
graph = StateGraph(AgentState)
graph.add_node("llm", self.call_openai)
graph.add_node("action", self.take_action)
graph.add_conditional_edges("llm", self.exists_action, {True: "action", False: END})
graph.add_edge("action", "llm")
graph.set_entry_point("llm")
self.graph = graph.compile()
self.tools = {t.name: t for t in tools}
self.model = model.bind_tools(tools)
def call_openai(self, state: AgentState):
messages = state['messages']
if self.system:
messages = [SystemMessage(content=self.system)] + messages
message = self.model.invoke(messages)
return {'messages': [message]}
def exists_action(self, state: AgentState):
result = state['messages'][-1]
return len(result.tool_calls) > 0
def take_action(self, state: AgentState):
tool_calls = state['messages'][-1].tool_calls
results = []
for t in tool_calls:
print(f"Calling: {t}")
result = self.tools[t['name']].invoke(t['args'])
results.append(ToolMessage(tool_call_id=t['id'], name=t['name'], content=str(result)))
print("Back to the model!")
return {'messages': results}