AI-RADAR.IT · AI-RADAR.NET · AI-RADAR.TECH

News & analysis on local LLMs, stack & on-prem hardware.


Traversal-as-Policy: Log-Distilled Gated Behavior Trees for Safe LLM Agents

Published on 2026-03-09 04:05 · Source: ArXiv cs.LG

Traversal-as-Policy: Behavior Trees for Safe LLM Agents

Traversal-as-Policy: A New Approach for LLM Agents

Keeping autonomous LLM-based agents both safe and efficient is a hard problem. A new study introduces "Traversal-as-Policy," a method that uses Gated Behavior Trees (GBT) to control the behavior of these agents.

How it works

The approach takes execution logs collected in sandbox environments (OpenHands) and distills them into a single executable GBT. Each node in the tree represents a state-conditioned action macro derived from successful trajectories. Trajectories considered unsafe are blocked by pre-execution "gates," which are updated from experience so that dangerous contexts are not re-admitted.
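The idea of traversing a tree of gated action macros can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: the names `Gate`, `MacroNode`, `traverse`, and `learn_unsafe`, the substring-based blocking rule, and the example actions are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set

State = Dict[str, str]


@dataclass
class Gate:
    """Hypothetical pre-execution gate: rejects actions matching a blocked pattern."""
    blocked: Set[str] = field(default_factory=set)

    def allows(self, action: str) -> bool:
        return not any(pattern in action for pattern in self.blocked)

    def learn_unsafe(self, pattern: str) -> None:
        # Patterns are only ever added, so a blocked context is not re-admitted.
        self.blocked.add(pattern)


@dataclass
class MacroNode:
    """Hypothetical tree node: a state condition plus a macro of actions."""
    name: str
    applies: Callable[[State], bool]
    actions: List[str]
    children: List["MacroNode"] = field(default_factory=list)


def traverse(node: MacroNode, state: State, gate: Gate) -> List[str]:
    """Collect the actions of every applicable, gate-approved node, depth first."""
    if not node.applies(state):
        return []
    if not all(gate.allows(action) for action in node.actions):
        return []  # the whole macro is rejected before any of it runs
    plan = list(node.actions)
    for child in node.children:
        plan.extend(traverse(child, state, gate))
    return plan


tree = MacroNode(
    name="fix_issue",
    applies=lambda s: s.get("task") == "bugfix",
    actions=["read_issue", "locate_file"],
    children=[
        MacroNode("cleanup", lambda s: True, ["rm -rf build/"]),
        MacroNode("patch", lambda s: s.get("tests") == "failing",
                  ["edit_file", "run_tests"]),
    ],
)

gate = Gate()
state = {"task": "bugfix", "tests": "failing"}

plan_before = traverse(tree, state, gate)  # includes the cleanup macro
gate.learn_unsafe("rm -rf")                # gate updated after an unsafe run
plan_after = traverse(tree, state, gate)   # cleanup macro is now skipped
print(plan_before)
print(plan_after)
```

In this sketch the policy is the traversal itself: the executor only ever sees actions that survived both the state condition and the gate, which is what makes the policy inspectable outside the model.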

Results

Tests across software, web, reasoning, and security benchmarks show that GBT improves the success rate, reduces violations, and lowers cost. For example, on SWE-bench Verified (Protocol A, 500 issues), GBT-SE raises success from 34.6% to 73.6%, reduces violations from 2.8% to 0.2%, and cuts token/character usage from 208k/820k to 126k/490k. With the same distilled tree, 8B executors more than double their success on SWE-bench Verified (from 14.0% to 58.8%) and WebArena (from 9.1% to 37.3%), roughly a fourfold gain in both cases.
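To put the reported figures on a common scale, the short script below converts each before/after pair into a ratio and a percentage change. The numbers are the ones quoted above; the helper names `ratio` and `percent_change` and the dictionary labels are just for this illustration.

```python
def ratio(before: float, after: float) -> float:
    """How many times larger (or smaller) the 'after' value is."""
    return after / before


def percent_change(before: float, after: float) -> float:
    """Signed relative change in percent; negative means a reduction."""
    return (after - before) / before * 100.0


# (before, after) pairs as quoted in the article.
reported = {
    "SWE-bench Verified success, GBT-SE (%)": (34.6, 73.6),
    "SWE-bench Verified violations (%)": (2.8, 0.2),
    "Token usage (k)": (208.0, 126.0),
    "Character usage (k)": (820.0, 490.0),
    "SWE-bench Verified success, 8B executor (%)": (14.0, 58.8),
    "WebArena success, 8B executor (%)": (9.1, 37.3),
}

for label, (before, after) in reported.items():
    print(f"{label}: x{ratio(before, after):.2f}, "
          f"{percent_change(before, after):+.1f}%")
```

Worked through, success on SWE-bench Verified rises about 2.1 times, violations fall by about 93%, token and character usage each fall by about 40%, and the 8B executors improve about 4.2 times on SWE-bench Verified and about 4.1 times on WebArena.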

Implications

This approach offers a way to externalize and verify the policies of LLM agents, improving both safety and efficiency. Lower computational cost combined with higher success rates could make autonomous agents more practical in complex environments.

AI-Radar Takeaway

Traversal-as-Policy controls LLM agents with Gated Behavior Trees (GBT) distilled from execution logs. On the reported benchmarks, the method raises success rates while reducing violations and token usage, and the same distilled tree also lifts the results of smaller 8B executors.



