
Artificial Analysis
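Artificial Analysis is an independent platform for evaluating and comparing AI models, used for choosing LLMs, image models, and AI providers. It tracks model intelligence, speed, price, context window, latency, quality, and provider availability, helping teams make decisions before integrating a model.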


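LMArena, formerly widely known as LMSYS Chatbot Arena or Chatbot Arena, is an AI model leaderboard based on human preference that covers text and a growing set of newer modalities. It is well suited to tracking how models are perceived, but it should not be the sole basis for model selection.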
Added: Jan 2026
Website: lmarena.ai

Editorial Review
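The Chatbot Arena paper describes it as an open platform for evaluating LLMs through crowdsourced pairwise human comparisons. LMSYS methodology updates also explain the move from online Elo-style rating to the Bradley-Terry model, which yields more stable ratings and confidence intervals.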
The platform has evolved from LMSYS Chatbot Arena/LMArena branding toward Arena-style leaderboards, but the core idea is human-preference model comparison.
The Chatbot Arena paper and LMSYS updates describe blind pairwise comparisons and Bradley-Terry/Elo-like rating methodology.
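To make the rating method concrete, here is a minimal sketch of fitting a Bradley-Terry model to pairwise votes. It is an illustration under stated assumptions, not LMArena's actual pipeline: the function names, the iterative update used for the fit, and the Elo-like scale constants (400, 1000) are all choices made for this example, and it ignores ties and confidence intervals.

```python
import math
from collections import defaultdict


def fit_bradley_terry(battles, iters=500, tol=1e-10):
    """Estimate Bradley-Terry strengths from (winner, loser) pairs.

    Uses the classic iterative update p_i = W_i / sum_j n_ij / (p_i + p_j),
    where W_i is model i's win count and n_ij the number of battles
    between i and j. Assumes every model has at least one win.
    """
    models = sorted({m for pair in battles for m in pair})
    wins = defaultdict(float)
    pair_counts = defaultdict(float)
    for winner, loser in battles:
        wins[winner] += 1
        pair_counts[frozenset((winner, loser))] += 1

    strengths = {m: 1.0 for m in models}
    for _ in range(iters):
        updated = {}
        for i in models:
            denom = 0.0
            for j in models:
                if i == j:
                    continue
                n_ij = pair_counts.get(frozenset((i, j)), 0.0)
                if n_ij:
                    denom += n_ij / (strengths[i] + strengths[j])
            updated[i] = wins[i] / denom if denom > 0 else strengths[i]
        # Strengths are only defined up to a constant factor, so rescale.
        total = sum(updated.values())
        updated = {m: v * len(models) / total for m, v in updated.items()}
        delta = max(abs(updated[m] - strengths[m]) for m in models)
        strengths = updated
        if delta < tol:
            break
    return strengths


def win_probability(strengths, a, b):
    """Modelled probability that a is preferred over b."""
    return strengths[a] / (strengths[a] + strengths[b])


def to_elo_scale(strengths, scale=400.0, base=1000.0):
    """Map strengths to an Elo-like scale (constants are assumptions)."""
    return {m: base + scale * math.log10(s) for m, s in strengths.items()}


if __name__ == "__main__":
    # Hypothetical votes between three placeholder models.
    votes = (
        [("model-a", "model-b")] * 7 + [("model-b", "model-a")] * 3
        + [("model-b", "model-c")] * 7 + [("model-c", "model-b")] * 3
        + [("model-a", "model-c")] * 8 + [("model-c", "model-a")] * 2
    )
    fitted = fit_bradley_terry(votes)
    for name, rating in sorted(to_elo_scale(fitted).items(), key=lambda kv: -kv[1]):
        print(f"{name}: {rating:.0f}")
```

Unlike an online Elo update, which depends on the order in which votes arrive, this fit uses all votes at once, which is one reason a Bradley-Terry estimate tends to be more stable.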
LMArena should not be your only basis for choosing a model. Use it as one signal and also evaluate cost, latency, safety, context length, tool use, and your own domain tasks.

LiveCodeBench is a holistic, contamination-free benchmark for evaluating LLMs on code that continuously collects new problems over time.

Compare LLM API pricing across 200+ models from OpenAI, Anthropic, Google, and more. Includes token counters, cost calculators, and benchmark comparisons.

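whichllm combines hardware detection with recency-aware benchmark rankings to help developers find the local LLM best suited to their own machine, rather than guessing from parameter count alone.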