Humanity's Last Exam - AI Benchmark
Author: Vectoring AI
Uploaded: 2026-02-02
Views: 4
Description:
This episode covers Humanity’s Last Exam (HLE), a new frontier benchmark designed to push AI systems far beyond traditional tests like MMLU.
Official website: https://lastexam.ai/
Dataset: https://huggingface.co/datasets/cais/hle
Leaderboard: https://scale.com/leaderboard/humanit...
Article: https://arxiv.org/abs/2501.14249
Created by the Center for AI Safety and Scale AI, HLE includes 2,500 expert-crafted questions spanning more than 100 advanced subjects — from mathematics and physics to law, chemistry, linguistics, and even specialized fields like Palmyrene inscriptions and ballet technique.
In this episode, we break down:
The origins and motivations behind HLE
How nearly 1,000 experts across 50 countries contributed
Why these questions are designed to be non-searchable and originality-checked
How accuracy and calibration are evaluated
Why this benchmark matters for AGI safety and research
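For readers who want a concrete picture of the accuracy and calibration topic, here is a minimal Python sketch. It assumes each model answer comes with a stated confidence between 0 and 1 and a correct/incorrect grade. The function names, the 10-bin default, and the binned RMS definition of calibration error are illustrative assumptions, not the official HLE evaluation code.

```python
import math


def accuracy(correct):
    """Fraction of questions graded correct (correct is a list of 0/1 or bools)."""
    return sum(correct) / len(correct)


def rms_calibration_error(confidences, correct, n_bins=10):
    """Binned RMS calibration error (hypothetical helper, generic definition).

    Answers are grouped into equal-width confidence bins. For each non-empty
    bin we compare mean stated confidence with observed accuracy, then take
    the size-weighted root mean square of those gaps.
    """
    n = len(correct)
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # Clamp so that a confidence of exactly 1.0 lands in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    total = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(c for c, _ in members) / len(members)
        bin_acc = sum(ok for _, ok in members) / len(members)
        total += (len(members) / n) * (bin_acc - mean_conf) ** 2
    return math.sqrt(total)


if __name__ == "__main__":
    # Made-up example: a model that claims 90% confidence but is right 1 time in 4.
    confs = [0.9, 0.9, 0.9, 0.9]
    graded = [1, 0, 0, 0]
    print("accuracy:", accuracy(graded))
    print("calibration error:", rms_calibration_error(confs, graded))
```

A low accuracy paired with a high calibration error, as in the made-up data above, is the pattern of an overconfident model: it is usually wrong yet reports high confidence.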
#HumanitysLastExam #AIbenchmark #AGI #VectoringAI #AISafety #LLM #ArtificialIntelligence #MMLU #ScaleAI #CenterForAISafety #gpt