AI Network News - Sierra Nakamura Introduces
Author: AINewsMediaNetwork
Uploaded: 2025-06-03
Views: 34
Description:
When you’re trying to communicate or understand ideas, words don’t always do the trick. Sometimes the more efficient approach is to do a simple sketch of that concept — for example, diagramming a circuit might help make sense of how the system works.
But what if artificial intelligence could help us explore these visualizations? While these systems are typically proficient at creating realistic paintings and cartoonish drawings, many models fail to capture the essence of sketching: its stroke-by-stroke, iterative process, which helps humans brainstorm and edit how they want to represent their ideas.
A new drawing system from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) and Stanford University can sketch more like we do. Their method, called “SketchAgent,” uses a multimodal language model (an AI system trained on text and images, such as Anthropic’s Claude 3.5 Sonnet) to turn natural language prompts into sketches in a few seconds. For example, it can doodle a house either on its own or through collaboration, drawing with a human or incorporating text-based input to sketch each part separately.
The researchers showed that SketchAgent can create abstract drawings of diverse concepts, like a robot, butterfly, DNA helix, flowchart, and even the Sydney Opera House. One day, the tool could be expanded into an interactive art game that helps teachers and researchers diagram complex concepts or gives users a quick drawing lesson.
CSAIL postdoc Yael Vinker, who is the lead author of a paper introducing SketchAgent, notes that the system introduces a more natural way for humans to communicate with AI.
“Not everyone is aware of how much they draw in their daily life. We may draw our thoughts or workshop ideas with sketches,” she says. “Our tool aims to emulate that process, making multimodal language models more useful in helping us visually express ideas.”
SketchAgent teaches these models to draw stroke-by-stroke without training on any data — instead, the researchers developed a “sketching language” in which a sketch is translated into a numbered sequence of strokes on a grid. The system was given an example of how things like a house would be drawn, with each stroke labeled according to what it represented — such as the seventh stroke being a rectangle labeled as a “front door” — to help the model generalize to new concepts.
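To make the idea of a “sketching language” concrete, here is a minimal Python sketch of what a numbered, labeled sequence of strokes on a grid could look like. It is an illustration only, not SketchAgent's actual format: the `Stroke` class, the `GRID_SIZE` value, the `validate` and `to_svg` helpers, and the three-stroke house are all assumptions made for this example (the system's own house example is more detailed, with the front door as its seventh stroke).

```python
from dataclasses import dataclass

GRID_SIZE = 50  # hypothetical grid resolution; the source does not state one


@dataclass
class Stroke:
    index: int                     # position in the drawing order (1-based)
    label: str                     # what the stroke represents, e.g. "front door"
    points: list[tuple[int, int]]  # grid coordinates (x, y) the pen passes through


def validate(sketch: list[Stroke]) -> None:
    """Check that strokes are numbered 1..n and every point lies on the grid."""
    for expected, stroke in enumerate(sketch, start=1):
        if stroke.index != expected:
            raise ValueError(f"stroke {stroke.index} is out of order")
        for x, y in stroke.points:
            if not (1 <= x <= GRID_SIZE and 1 <= y <= GRID_SIZE):
                raise ValueError(f"point ({x}, {y}) is off the grid")


def to_svg(sketch: list[Stroke], cell: int = 10) -> str:
    """Render each stroke as one SVG polyline, in drawing order.

    Labels are assumed to be plain text without XML special characters.
    """
    validate(sketch)
    size = GRID_SIZE * cell
    lines = [f'<svg xmlns="http://www.w3.org/2000/svg" width="{size}" height="{size}">']
    for s in sketch:
        # Flip y so that row 1 sits at the bottom, as on a paper grid.
        pts = " ".join(f"{x * cell},{size - y * cell}" for x, y in s.points)
        lines.append(
            f'  <polyline points="{pts}" fill="none" stroke="black">'
            f"<title>{s.index}: {s.label}</title></polyline>"
        )
    lines.append("</svg>")
    return "\n".join(lines)


# A simplified, made-up house: each stroke is numbered and labeled.
house = [
    Stroke(1, "walls", [(10, 10), (40, 10), (40, 30), (10, 30), (10, 10)]),
    Stroke(2, "roof", [(10, 30), (25, 45), (40, 30)]),
    Stroke(3, "front door", [(22, 10), (22, 20), (28, 20), (28, 10)]),
]
svg = to_svg(house)
```

Writing `svg` to a file with an `.svg` extension should show the house in a browser. Because every stroke carries a number and a label, a person or a model could in principle add to, or edit, one named part at a time, which is the kind of stroke-by-stroke collaboration the description mentions.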
Vinker wrote the paper alongside three CSAIL affiliates — postdoc Tamar Rott Shaham, undergraduate researcher Alex Zhao, and MIT Professor Antonio Torralba — as well as Stanford University Research Fellow Kristine Zheng and Assistant Professor Judith Ellen Fan. They’ll present their work at the 2025 Conference on Computer Vision and Pattern Recognition (CVPR) this month.
SketchAgent Official
https://yael-vinker.github.io/sketch-...
SketchAgent Official White Paper
https://yael-vinker.github.io/sketch-...
SketchAgent Code GitHub
https://github.com/yael-vinker/Sketch...