Fixing a Broken AI: Battling Class Imbalance with SMOTE & XGBoost
Author: BioniChaos
Uploaded: 2025-12-16
Views: 3
Description:
We are back with another intense coding session, working on our Kaggle competition entry for detecting Body-Focused Repetitive Behaviors (BFRBs). In this video we face a classic machine-learning nightmare: a model that looks amazing on the surface but is failing badly under the hood.
We start by running our full training pipeline on over 8,000 sensor sequences, using data from IMU, thermopile, and time-of-flight sensors. Initial indicators look incredible: our binary classification model (the "Bouncer") hits a 98% F1 score, perfectly identifying when a gesture occurs. But when we ask the model which gesture is happening, things fall apart.
We analyze the Confusion Matrix to find that our model has learned to predict only two gestures out of ten, resulting in a disastrous Gesture F1 score. Through a mix of technical analysis and a comedy skit breakdown (featuring the "King of Idiots" meta-model), we realize our previous high scores were due to data leakage during grid search. The reality is we are dealing with severe class imbalance.
Join us as we debug the code live. We implement a two-front strategy to fix the imbalance:
Data Level: Aggressively tuning SMOTE parameters (increasing k-neighbors) to generate better synthetic data for rare gestures.
Algorithm Level: Implementing Class Weights in XGBoost to heavily penalize the model for missing rare classes.
We also squash a critical bug where our safe_fit function was silently ignoring our weight parameters, effectively rendering our fixes useless. Watch as we patch the code, handle the data leakage, and relaunch the training to chase a competitive spot on the Kaggle leaderboard.
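The bug class described is easy to reproduce: a wrapper that accepts keyword arguments but never forwards them to `fit`. This `safe_fit` is a reconstruction for illustration, not the video's actual code:

```python
def safe_fit(model, X, y, **fit_kwargs):
    """Fit a model, returning None instead of raising on failure.

    The fix is forwarding **fit_kwargs to model.fit; the buggy version
    called model.fit(X, y) and silently dropped sample_weight.
    """
    try:
        model.fit(X, y, **fit_kwargs)
        return model
    except Exception:
        return None
```

Because `**fit_kwargs` swallows unknown keywords without error, the broken version trains "successfully" and the missing weights only show up as unchanged scores.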
All code and project updates are available at BioniChaos.com.
#MachineLearning #DataScience #Python #Kaggle #XGBoost #BioniChaos #AI #Coding #ClassImbalance #SMOTE #BiomedicalAI #WebDevelopment
00:00
Intro and model expectations: Anticipating an F1 score above 0.9.
01:00
Explaining the "Safe Fit" fix and how early stopping handles bad models without crashing.
02:46
Launching the full training run on 8,151 sequences and monitoring feature extraction.
06:03
The Skit: A Comedian and Data Scientist break down the "Bouncer" model (Binary Classification).
08:10
The Confusion Matrix disaster: Why the model is only guessing two gestures.
12:08
Live training update: Binary F1 hits 98%, but Gesture Classification is the real test.
19:30
Clarifying the dataset structure: Target gestures vs. Non-target noise.
22:54
Analyzing the Kaggle Leaderboard: Comparing our target against the top public and private scores.
26:04
The reality check: Why our full run dropped to a 76% average and the issues with "Rare" classes.
33:23
Implementing the fixes: Tuning SMOTE k-neighbors and adding computed Class Weights.
36:00
Plain language explanation: How we are forcing the model to pay attention to minority classes.
40:30
Skit Part 2: Realizing the previous 95% score was "cheating" due to data leakage in grid search.
50:50
Investigating the raw data: Identifying the 4:1 imbalance ratio between common and rare gestures.
53:44
Bug fix: Correcting the safe_fit function to properly pass sample weights to XGBoost.
55:00
Restarting the training pipeline with the new imbalance strategy and monitoring initial progress.
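The 4:1 imbalance figure from the raw-data investigation is a ratio of class counts; a quick way to check it, with made-up gesture labels standing in for the real dataset:

```python
from collections import Counter

# Illustrative label counts, not the actual BFRB dataset.
labels = ["scratch"] * 400 + ["pull"] * 380 + ["pinch"] * 100 + ["flick"] * 95
counts = Counter(labels)

# Ratio of the most common class to the rarest one.
ratio = max(counts.values()) / min(counts.values())
print(f"imbalance ratio ≈ {ratio:.1f}:1")  # ≈ 4.2:1
```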
Check out the tools we develop at https://bionichaos.com
Support BioniChaos on Patreon: / bionichaos
Become a channel member to get exclusive perks: / @bionichaos