[CVPR 2025] InstructCLIP: Improving Instruction-Guided Image Editing with Automated Data Refinement
Author: Sherry X. Chen
Uploaded: 2025-06-04
Views: 93
Description:
Natural language edit instructions provide an intuitive way to guide image editing, but misalignments between the visual changes in input/output image pairs and their corresponding instructions in synthetic training datasets hinder model performance. We introduce Instruct-CLIP, a self-supervised method that learns semantic differences between image pairs to refine edit instructions. It can also be extended to latent diffusion models and used as a loss function during training. Applied to the InstructPix2Pix dataset, Instruct-CLIP produces 120K refined samples and significantly improves editing performance.
0:09 Motivation
1:19 Method Overview
2:25 Instruct-CLIP (I-CLIP) Vision Encoder
3:15 Training DINOv2 with Latent Diffusion
4:01 Edit Instruction Refinement Results
4:25 Image Editing Results