SAM - Segment Anything model for promptable pixel segmentation
Author: Mak Gaiduk
Uploaded: 2024-10-15
Views: 905
Description:
This video covers SAM (Segment Anything Model), an attempt by Facebook researchers to build a foundation model for segmentation.
Classical segmentation problems like Semantic Segmentation or Instance Segmentation rely on a fixed set of labels. SAM instead deals with "Promptable Segmentation": given a bounding box or a point "prompt", return the segmentation mask for the object to which that prompt belongs, no matter what that object might be.
This unusual approach made it possible to create a huge, diverse segmentation dataset and to train a good model on it. On the other hand, SAM cannot solve other segmentation tasks (like instance segmentation) on its own, but it can do so when paired with an extra model that generates the prompts, for example an object detector.
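The "extra model to generate prompts" idea can be sketched as a small pipeline: a detector proposes boxes, and each box is passed as a prompt to a promptable segmenter. This is only a minimal illustration, not SAM's actual API. The names segment_instances, toy_detector and toy_segmenter are hypothetical, and the toy functions are stand-ins for a real object detector and a real promptable model.

```python
from typing import Callable, List, Tuple

import numpy as np

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1) in pixel coordinates


def segment_instances(
    image: np.ndarray,
    detect_boxes: Callable[[np.ndarray], List[Box]],
    segment_with_box: Callable[[np.ndarray, Box], np.ndarray],
) -> List[Tuple[Box, np.ndarray]]:
    """Turn a promptable segmenter into an instance segmenter.

    Every box from the detector is used as one prompt, and the segmenter
    returns one binary mask per prompt.
    """
    results = []
    for box in detect_boxes(image):
        mask = segment_with_box(image, box)
        results.append((box, mask))
    return results


def toy_detector(image: np.ndarray) -> List[Box]:
    # Hypothetical stand-in: a real detector would predict these boxes.
    return [(0, 0, 2, 2), (4, 4, 8, 8)]


def toy_segmenter(image: np.ndarray, box: Box) -> np.ndarray:
    # Hypothetical stand-in: a real promptable model would return the mask
    # of the object inside the box; here the whole box is simply filled.
    x0, y0, x1, y1 = box
    mask = np.zeros(image.shape[:2], dtype=bool)
    mask[y0:y1, x0:x1] = True
    return mask


image = np.zeros((8, 8, 3), dtype=np.uint8)
instances = segment_instances(image, toy_detector, toy_segmenter)
for box, mask in instances:
    print(box, int(mask.sum()))
```

The segmenter never needs to know class labels; any labels would come from the detector that produced the boxes.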
The video also dives deep into the architecture of the model - prompt encoding, Decoder with query-to-image cross attention, upscaling with Transposed Convolution, mask segmentation, and Dice and IoU loss.
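As a reminder of what Dice and IoU measure, here is a minimal NumPy sketch of the two overlap quantities on a single mask. It shows the generic textbook definitions and is not necessarily the exact training objective used by SAM. The function names and the eps smoothing term are assumptions made for illustration.

```python
import numpy as np


def iou(pred_mask: np.ndarray, target_mask: np.ndarray, eps: float = 1e-7) -> float:
    """Intersection over Union of two binary masks."""
    pred = pred_mask.astype(bool)
    target = target_mask.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    # eps keeps the ratio defined when both masks are empty.
    return float((intersection + eps) / (union + eps))


def soft_dice_loss(probs: np.ndarray, target_mask: np.ndarray, eps: float = 1e-7) -> float:
    """Dice loss on per-pixel foreground probabilities.

    Dice = 2 * |A ∩ B| / (|A| + |B|); the loss is 1 - Dice, so a perfect
    overlap gives a loss close to 0.
    """
    probs = probs.astype(np.float64)
    target = target_mask.astype(np.float64)
    intersection = (probs * target).sum()
    dice = (2.0 * intersection + eps) / (probs.sum() + target.sum() + eps)
    return float(1.0 - dice)


# Two predicted foreground pixels, one of which is correct.
pred = np.array([[1, 1], [0, 0]])
target = np.array([[1, 0], [0, 0]])
print(iou(pred, target), soft_dice_loss(pred, target))
```

Both quantities compare overlap relative to mask size, so a small object is not drowned out by the background pixels around it.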
Important links:
Original paper: https://arxiv.org/pdf/2304.02643
DETR paper: https://arxiv.org/pdf/2005.12872
MaskFormer paper: https://arxiv.org/pdf/2107.06278
Tesla Autopilot video with 3d depth estimation: • FULL Andrej Karpathy Tesla Autonomous Driv...
Tesla Optimus video with occupancy detection: https://www.youtube.com/live/ODSJsviD...
00:00 - Intro
01:17 - Segmentation Task
07:04 - Promptable Segmentation
08:58 - Data Engine
14:34 - Model Architecture
15:45 - Prompt Encoding
18:02 - Decoder, Cross Attention
28:04 - Loss
33:08 - Transposed Convolution
37:02 - Results