MCP-Bench: Benchmarking Tool-Using LLM Agents with Complex Real-World Tasks via MCP Servers
Published as an arXiv preprint, arXiv:2508.20453, 2025
Introduces MCP-Bench, a benchmark for evaluating tool-using LLM agents on complex real-world tasks via Model Context Protocol (MCP) servers.
Recommended citation: Z. Wang, Q. Chang, H. Patel, S. Biju, C. E. Wu, Q. Liu, A. Ding, A. Rezazadeh, et al. "MCP-Bench: Benchmarking Tool-Using LLM Agents with Complex Real-World Tasks via MCP Servers." arXiv preprint arXiv:2508.20453, 2025.