Turn Any Audio into ANY Language — Build a Free Multi-Language Transcriber with Whisper.
Author: JigCode
Uploaded: 2025-11-20
Views: 103
Description:
Turn any audio or video into English and multi-language captions using a local AI transcriber built with Whisper, Argos Translate, and Streamlit – completely free, no API keys, and no paid SaaS involved.
In this video, we create a multi-language AI transcriber that:
Converts speech to text with Whisper AI
Generates timestamped English subtitles
Translates them into multiple languages with Argos Translate
Allows you to download everything as .txt and .srt files for YouTube, Reels, Shorts, lectures, podcasts, and more
If you’ve ever wanted an open-source alternative to tools like Otter, Descript, or other AI captioning tools, this is it.
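The flow listed above (speech to text, then per-language translation of each timestamped segment) can be sketched roughly as follows. This is not the code from the repo: `translate_segments` and `transcribe_and_translate` are hypothetical names chosen for illustration, and the sketch assumes the `openai-whisper` and `argostranslate` packages are installed along with the needed Argos language pairs.

```python
def translate_segments(segments, target_langs, translate_fn, source_lang="en"):
    """Translate the text of each segment, keeping its timestamps.

    `segments` is a list of dicts with "start", "end" and "text" keys,
    which is the shape Whisper returns under result["segments"].
    `translate_fn(text, from_code, to_code)` does the actual translation.
    Returns {language_code: [translated segment dicts]}.
    """
    translated = {}
    for lang in target_langs:
        translated[lang] = [
            {**seg, "text": translate_fn(seg["text"].strip(), source_lang, lang)}
            for seg in segments
        ]
    return translated


def transcribe_and_translate(path, target_langs, model_size="base"):
    """Hypothetical end-to-end helper: Whisper first, then Argos Translate."""
    # Imported lazily so the helper above can be used without these packages.
    import whisper
    import argostranslate.translate

    model = whisper.load_model(model_size)
    # Assumption: the audio is spoken in English. Whisper also accepts
    # task="translate" to produce English text from other spoken languages.
    result = model.transcribe(path)
    segments = result["segments"]
    return segments, translate_segments(
        segments, target_langs, argostranslate.translate.translate
    )
```

Passing the translation function in as a parameter keeps the segment logic independent of Argos, so a different engine could be swapped in without touching the timestamps.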
🔍 What you’ll learn
How to install and run Whisper locally for speech-to-text
How to use Argos Translate for free text translations on your own machine
How to build a simple Streamlit UI for uploading audio/video files and downloading transcripts
How to generate proper SRT subtitle files with timestamps from Whisper segments
How to produce subtitles in multiple languages from one audio file
Everything runs on your computer, so your audio doesn’t have to go through any cloud APIs.
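To make the SRT step concrete, here is a minimal sketch of turning Whisper segments into subtitle text. It assumes each segment is a dict with `start` and `end` in seconds plus a `text` field, which is what Whisper's `transcribe` returns under `"segments"`. The function names are illustrative and may differ from the ones in the repo.

```python
def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = max(0, round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments) -> str:
    """Build SRT text: a 1-based index, a time range, then the caption."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        text = seg["text"].strip()  # Whisper text usually has a leading space
        blocks.append(f"{index}\n{start} --> {end}\n{text}\n")
    # A blank line separates consecutive subtitle blocks.
    return "\n".join(blocks)
```

The same function works for translated subtitles: pass in segments whose `text` has been replaced by the translation, and the timestamps carry over unchanged.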
🧱 Tech Stack
🧠 Whisper – AI speech recognition and transcription
🌍 Argos Translate – open-source translation engine
🖥 Streamlit – lightweight Python web UI
🎥 ffmpeg – backend for audio/video decoding
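As a rough idea of how the Streamlit layer in this stack could be wired up, the sketch below shows an upload box, a transcript view, and a download button. It is an assumption-laden outline, not the app from the repo: `save_upload`, `render_app`, and the `transcribe` callback are hypothetical names, and the callback is assumed to take a file path and return plain text.

```python
import tempfile
from pathlib import Path


def save_upload(data: bytes, filename: str) -> Path:
    """Write uploaded bytes to a temporary file, keeping the extension."""
    suffix = Path(filename).suffix
    with tempfile.NamedTemporaryFile(delete=False, suffix=suffix) as tmp:
        tmp.write(data)
    return Path(tmp.name)


def render_app(transcribe):
    """Draw the page. `transcribe(path) -> str` is supplied by the caller."""
    import streamlit as st  # imported lazily; requires the streamlit package

    st.title("Multi-language transcriber")
    uploaded = st.file_uploader(
        "Audio or video file",
        type=["mp3", "wav", "m4a", "mp4", "mkv", "mov", "ogg", "opus"],
    )
    if uploaded is None:
        return  # nothing uploaded yet

    path = save_upload(uploaded.getvalue(), uploaded.name)
    with st.spinner("Transcribing..."):
        text = transcribe(str(path))

    st.text_area("Transcript", text, height=300)
    st.download_button(
        "Download .txt", data=text, file_name="transcript.txt", mime="text/plain"
    )
```

To try it, call `render_app(...)` with your own transcription function at the bottom of a script and start it with `streamlit run` followed by the script name.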
💻 Source Code
Full project (app + transcriber + README):
👉 GitHub repo: https://github.com/jigs074/jigcode-Mu...
⚙️ Requirements
Python 3.10+
ffmpeg installed and added to your PATH
An internet connection is needed only the first time, to download:
Whisper model (e.g., base, small)
Argos language models (e.g., en → fr, en → es)
Afterward, the transcriber runs locally.
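A quick way to verify these requirements before the first run is sketched below. `check_requirements` and `install_argos_pair` are hypothetical helper names. The Argos calls follow the pattern shown in the Argos Translate README, and the install step is the part that needs the one-time internet connection.

```python
import shutil
import sys


def check_requirements(min_python=(3, 10), which=shutil.which, version=None):
    """Return a list of human-readable problems (empty if all is fine).

    `which` and `version` are parameters only so the check is easy to test.
    """
    version = sys.version_info[:2] if version is None else tuple(version)
    problems = []
    if version < tuple(min_python):
        wanted = ".".join(map(str, min_python))
        found = ".".join(map(str, version))
        problems.append(f"Python {wanted}+ required, found {found}")
    if which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH")
    return problems


def install_argos_pair(from_code="en", to_code="fr"):
    """Download and install one Argos language pair (needs internet once)."""
    import argostranslate.package  # requires the argostranslate package

    argostranslate.package.update_package_index()
    available = argostranslate.package.get_available_packages()
    # Raises StopIteration if the requested pair is not in the index.
    package = next(
        p for p in available if p.from_code == from_code and p.to_code == to_code
    )
    argostranslate.package.install_from_path(package.download())
```

Running `check_requirements()` with no arguments inspects the current interpreter and PATH; an empty list means both checks passed.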
🧪 Use Cases
Auto-caption YouTube videos, Shorts, and Reels
Generate transcripts for lectures, podcasts, interviews
Create multilingual subtitles for courses and tutorials
Help learners with subtitles in their native language
Build internal tools where audio stays on your machine
❓ FAQ
Q: Does this use any paid API like OpenAI or Google Cloud?
A: No. Everything uses open-source models: Whisper + Argos. No API keys required.
Q: Can I change the Whisper model size?
A: Yes – you can switch between tiny, base, small, medium, large depending on your CPU/GPU and desired accuracy.
Q: What file types are supported?
A: Anything ffmpeg can decode: mp3, wav, m4a, mp4, mkv, mov, ogg, opus, etc.
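For the model-size question above, switching sizes usually comes down to the name passed to `whisper.load_model`. A minimal sketch, assuming the `openai-whisper` package (the `load_model` wrapper and the `MODEL_SIZES` tuple are illustrative, not taken from the repo):

```python
MODEL_SIZES = ("tiny", "base", "small", "medium", "large")


def load_model(size: str = "base"):
    """Load a Whisper model by size, rejecting names outside the FAQ list."""
    if size not in MODEL_SIZES:
        raise ValueError(
            f"Unknown model size {size!r}; choose one of {', '.join(MODEL_SIZES)}"
        )
    import whisper  # imported lazily; weights download on first use

    return whisper.load_model(size)
```

Smaller models such as `tiny` and `base` are faster and lighter on memory, while `medium` and `large` trade speed for accuracy, so it can help to start small and move up only if the transcript quality is not good enough.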
📣 If this helped you, drop a comment telling me which feature you want next:
Audio → audio translation
Better UI
Batch processing / CLI
Hit like so more developers and creators discover it.
Subscribe for more real AI builds you can actually run on your own machine.