The Data Addition Dilemma

Tags: Simons Institute, theoretical computer science, UC Berkeley, Computer Science, Theory of Computing, foundations of computing, Domain Adaptation and Related Areas, Irene Y Chen

Author: Simons Institute

Uploaded: 2024-11-12

Views: 1234

Description: Irene Y Chen (UC Berkeley)
https://simons.berkeley.edu/talks/ire...
Domain Adaptation and Related Areas

When training machine learning models, combining data from different sources isn't always beneficial. More data generally helps, but mixing data from dissimilar sources can reduce overall accuracy, create unpredictable fairness issues, and worsen performance for underrepresented groups. We call this situation the "Data Addition Dilemma". We find that it possibly arises from an empirically observed trade-off between the performance gains of data scaling and the deterioration caused by distribution shift. We thus establish baseline strategies for navigating this dilemma, introducing distribution shift heuristics that guide decisions about which data sources to add when scaling data, so that the added data yields the expected performance improvements. We conclude with a discussion of the considerations required for data collection and suggestions for studying data composition and scale in the age of increasingly large models.
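
To make the idea of a distribution shift heuristic concrete, here is a minimal sketch in Python. It is not the method presented in the talk: the scoring rule (average standardized difference in per-feature means), the function names `shift_score` and `rank_sources`, and the synthetic sources `source_near` and `source_far` are all assumptions chosen for illustration. The sketch ranks candidate data sources by how far they sit from a reference dataset, so that the least-shifted source can be considered first.

```python
import random
from statistics import mean, pstdev


def shift_score(reference, candidate):
    """Hypothetical heuristic: mean absolute standardized difference
    between per-feature means of the candidate and the reference."""
    n_features = len(reference[0])
    total = 0.0
    for j in range(n_features):
        ref_col = [row[j] for row in reference]
        cand_col = [row[j] for row in candidate]
        # Scale by the reference spread; fall back to 1.0 for constant features.
        scale = pstdev(ref_col) or 1.0
        total += abs(mean(cand_col) - mean(ref_col)) / scale
    return total / n_features


def rank_sources(reference, candidates):
    """Return (name, score) pairs, smallest estimated shift first."""
    scores = {name: shift_score(reference, rows) for name, rows in candidates.items()}
    return sorted(scores.items(), key=lambda item: item[1])


def sample(center, n, rng):
    """Draw n two-feature rows from a unit-variance Gaussian around `center`."""
    return [[rng.gauss(center, 1.0), rng.gauss(center, 1.0)] for _ in range(n)]


# Synthetic data, purely illustrative.
rng = random.Random(0)
reference = sample(0.0, 500, rng)
candidates = {
    "source_near": sample(0.2, 500, rng),  # mild shift
    "source_far": sample(3.0, 500, rng),   # strong shift
}

ranking = rank_sources(reference, candidates)
for name, score in ranking:
    print(f"{name}: estimated shift {score:.2f}")
```

A score like this captures only shifts in feature means. It says nothing about label shift or about how much a model would actually gain from the extra samples, so in practice it would serve as a first filter before retraining and evaluating on held-out data.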
