Web, 2024

Transcription Pro

Structured Multi-Speaker Transcription System

Role

Full Stack Developer

Status

Core processing pipeline complete; UI and workflow iteration ongoing

Stack

Next.js, React, Flask, Tailwind CSS, Redis, Transcription Models, Diarization Models

The Challenge

Most transcription SaaS tools fall short for enterprise or internal workflows, especially when handling large files, multiple speakers, or custom output requirements. Self-hosted solutions are often brittle or difficult to scale.

The Solution

I designed a split-stack architecture focused on correctness and scalability. A Next.js frontend handles job submission and status, while a Flask backend with workers manages long-running transcription and diarization tasks asynchronously.
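The submit-then-poll flow described above can be illustrated with a minimal sketch. This is not the project's actual code: it assumes an in-process queue.Queue and a dictionary as stand-ins for Redis so that it runs on its own, and the names submit_job, get_status, worker, and fake_transcribe are hypothetical.

```python
import queue
import threading
import uuid

# In-memory stand-ins for Redis (assumption): a job queue and a status table.
jobs = queue.Queue()
status = {}
status_lock = threading.Lock()


def submit_job(audio_path):
    """Enqueue a job and return its ID immediately, as a submit endpoint might."""
    job_id = uuid.uuid4().hex
    with status_lock:
        status[job_id] = {"state": "queued", "result": None}
    jobs.put((job_id, audio_path))
    return job_id


def get_status(job_id):
    """Return a copy of the job record, as a status endpoint might."""
    with status_lock:
        return dict(status[job_id])


def fake_transcribe(audio_path):
    # Placeholder for the long-running transcription and diarization step.
    return [{"speaker": "SPEAKER_00", "start": 0.0, "end": 1.5,
             "text": f"transcript of {audio_path}"}]


def worker():
    """Pull jobs until a None sentinel arrives, recording each state change."""
    while True:
        item = jobs.get()
        if item is None:
            jobs.task_done()
            break
        job_id, audio_path = item
        with status_lock:
            status[job_id]["state"] = "running"
        try:
            result = fake_transcribe(audio_path)
            with status_lock:
                status[job_id] = {"state": "done", "result": result}
        except Exception as exc:
            with status_lock:
                status[job_id] = {"state": "failed", "result": str(exc)}
        finally:
            jobs.task_done()
```

In this sketch the caller gets a job ID back without waiting for the audio to be processed, and the frontend would poll get_status until the state becomes "done" or "failed". A real deployment would replace the in-memory queue and table with Redis so that jobs survive restarts and can be shared across worker processes.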

Key Features

Asynchronous Job Queueing with Redis
Large File Audio Processing
Multi-Speaker Diarization and Transcription
Structured Transcript Outputs
Export Formats (SRT, VTT, JSON, TXT)
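
As an illustration of the structured output and export features listed above, here is a minimal sketch of turning diarized segments into SRT and VTT text. The segment shape (speaker, start, end, text) and the function names are assumptions made for this example, not the project's actual schema.

```python
def srt_timestamp(seconds):
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments):
    """Render segments as numbered SRT cues with a speaker prefix."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"{seg['speaker']}: {seg['text']}\n"
        )
    # Joining with a newline leaves one blank line between cues.
    return "\n".join(blocks)


def to_vtt(segments):
    """Render segments as WebVTT cues, using voice spans for speakers."""
    lines = ["WEBVTT", ""]
    for seg in segments:
        # WebVTT uses a dot before milliseconds where SRT uses a comma.
        start = srt_timestamp(seg["start"]).replace(",", ".")
        end = srt_timestamp(seg["end"]).replace(",", ".")
        lines.append(f"{start} --> {end}")
        lines.append(f"<v {seg['speaker']}>{seg['text']}")
        lines.append("")
    return "\n".join(lines)
```

Keeping one neutral list of segments and rendering each format from it means a JSON or TXT export can be added as another small function over the same data.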

Description

A web-based transcription and diarization system designed around reliable, high-volume audio processing. The core pipeline and job orchestration are production-ready, with UI and workflow refinement in progress.