HumanMCP: A New Dataset for Evaluating Model Context Protocols
A new dataset, HumanMCP, has been developed to evaluate tool retrieval for the Model Context Protocol (MCP). MCP servers expose thousands of open-source, standardized tools that connect large language models (LLMs) to external systems.
The dataset stands out for its realistic user queries, written to simulate how people actually phrase requests. Existing datasets often lack such queries, which limits how accurately they can assess tool usage across the MCP server ecosystem. Building on the MCP Zero dataset, HumanMCP pairs diverse, high-quality queries with 2,800 tools across 308 MCP servers.
Each tool is associated with several user "personas" that represent varying levels of intent, from precise requests to ambiguous and exploratory commands. This range reflects the complexity of real-world interactions and allows a more accurate evaluation of tool retrieval systems.
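To make the persona idea concrete, here is a minimal sketch of how such an evaluation could be set up. It is not HumanMCP's actual schema or method: the field names, persona labels, sample tools, and the word-overlap retriever are all hypothetical and stand in for a real retrieval system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    server: str       # hypothetical field: MCP server that hosts the tool
    name: str         # hypothetical field: tool identifier
    description: str  # hypothetical field: text the retriever matches against


@dataclass(frozen=True)
class Query:
    text: str
    persona: str      # hypothetical label such as "precise" or "ambiguous"
    target_tool: str  # name of the tool the query was written for


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of words without punctuation."""
    return {w.strip(".,?!").lower() for w in text.split()} - {""}


def retrieve(query: str, tools: list[Tool], k: int) -> list[str]:
    """Toy retriever (an assumption): rank tools by word overlap with the query."""
    q = tokenize(query)
    ranked = sorted(tools, key=lambda t: len(q & tokenize(t.description)), reverse=True)
    return [t.name for t in ranked[:k]]


def recall_at_k(queries: list[Query], tools: list[Tool], k: int) -> dict[str, float]:
    """Fraction of queries per persona whose target tool appears in the top k."""
    hits: dict[str, int] = {}
    totals: dict[str, int] = {}
    for q in queries:
        totals[q.persona] = totals.get(q.persona, 0) + 1
        if q.target_tool in retrieve(q.text, tools, k):
            hits[q.persona] = hits.get(q.persona, 0) + 1
    return {p: hits.get(p, 0) / n for p, n in totals.items()}


# Made-up sample data, not taken from the dataset.
TOOLS = [
    Tool("files-server", "read_file", "Read the contents of a file from disk"),
    Tool("weather-server", "get_forecast", "Get the weather forecast for a city"),
    Tool("search-server", "web_search", "Search the web for pages matching a query"),
]
QUERIES = [
    Query("Get the weather forecast for Paris tomorrow", "precise", "get_forecast"),
    Query("Should I bring an umbrella this weekend?", "ambiguous", "get_forecast"),
]

if __name__ == "__main__":
    print(recall_at_k(QUERIES, TOOLS, k=1))
```

In this toy setup the precise query shares most of its words with the tool description and is retrieved at rank 1, while the ambiguous query shares none and is missed. That gap between personas is the kind of difference a dataset with varied intent levels is meant to expose.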