Web, 2024

Transcription Pro

Structured Multi-Speaker Transcription System

Role

Full Stack Developer

Status

Core processing pipeline complete, UI and workflow iteration ongoing

Stack

Next.js, React, Flask, Tailwind CSS, Redis, Transcription Models, Diarization Models

The Challenge

Most transcription SaaS tools fall short for enterprise or internal workflows, especially when handling large files, multiple speakers, or custom output requirements. Self-hosted solutions are often brittle or difficult to scale.

The Solution

I designed a split-stack architecture focused on correctness and scalability. A Next.js frontend handles job submission and status display, while a Flask backend queues each job in Redis and hands it to background workers, which run the long-running transcription and diarization tasks asynchronously so that no request has to wait for processing to finish.
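
The submit-then-poll flow described above can be sketched as follows. This is a minimal illustration and not the project's actual code. It uses an in-process `queue.Queue` and a dictionary as stand-ins for Redis so that it runs with the standard library alone. The names `submit_job`, `get_status`, `fake_transcribe`, and the job fields are all hypothetical.

```python
import queue
import threading
import uuid

# In-process stand-ins for the Redis queue and job store. Assumption: the
# real system keeps both in Redis so workers can run as separate processes.
job_queue = queue.Queue()
jobs = {}
jobs_lock = threading.Lock()


def submit_job(audio_path):
    """Register a job and enqueue it; returns a job id without waiting."""
    job_id = uuid.uuid4().hex
    with jobs_lock:
        jobs[job_id] = {"status": "queued", "audio_path": audio_path, "result": None}
    job_queue.put(job_id)
    return job_id


def get_status(job_id):
    """What a status endpoint could return while the frontend polls."""
    with jobs_lock:
        job = jobs[job_id]
        return {"status": job["status"], "result": job["result"]}


def fake_transcribe(audio_path):
    """Hypothetical placeholder for the transcription and diarization models."""
    return [{"speaker": "SPEAKER_00", "start": 0.0, "end": 1.5,
             "text": f"transcript of {audio_path}"}]


def worker():
    """Process queued job ids until a None sentinel arrives."""
    while True:
        job_id = job_queue.get()
        if job_id is None:
            job_queue.task_done()
            break
        with jobs_lock:
            jobs[job_id]["status"] = "running"
            audio_path = jobs[job_id]["audio_path"]
        try:
            result = fake_transcribe(audio_path)
            with jobs_lock:
                jobs[job_id].update(status="done", result=result)
        except Exception as exc:
            # Record the failure so the frontend can show it instead of hanging.
            with jobs_lock:
                jobs[job_id].update(status="failed", result=str(exc))
        finally:
            job_queue.task_done()
```

Starting `worker` in a thread, calling `submit_job("meeting.wav")`, and then polling `get_status` with the returned id mirrors what the frontend would do over HTTP. Keeping job state outside the request handler is what lets submission return immediately.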

Key Features

  • Asynchronous Job Queueing with Redis
  • Large File Audio Processing
  • Multi-Speaker Diarization and Transcription
  • Structured Transcript Outputs
  • Export Formats (SRT, VTT, JSON, TXT)

Description

A web-based transcription and diarization system designed around reliable, high-volume audio processing. The core pipeline and job orchestration are production-ready, with UI and workflow refinement in progress.

Transcription Pro dashboard interface
Transcription Pro architecture diagram