AI & Infrastructure · Ongoing
On-Prem AI Compute Platform
GPU Inference & Product Processing
Role
Architect & Lead Developer
Status
In active use powering AI workloads for VidCutAI and Nulite products
Stack
Nvidia RTX 4080 · Nvidia RTX 3070 Ti · Nvidia RTX 3060 · Docker · Redis · Transcription Models · Diarization Models · Local LLMs
The Challenge
AI products like VidCutAI and Transcription Pro require consistent, high-throughput GPU compute for transcription, diarization, and language models without the cost or latency of cloud-only solutions.
The Solution
I built dedicated GPU-backed machines integrated into the homelab network, separating production and testing workloads. These systems expose internal APIs for AI processing and support both local and hybrid (API-assisted) workflows.
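The routing between products and GPU nodes could look roughly like the sketch below. The node names, VRAM-based capacity check, and job fields are illustrative assumptions, not the platform's real interface; the Redis-backed queue is stubbed with an in-memory deque so the sketch stays self-contained.

```python
from collections import deque
from dataclasses import dataclass, field

@dataclass
class GpuNode:
    """One inference node; VRAM figures match the cards in the stack."""
    name: str
    vram_gb: int
    env: str  # "prod" or "test"
    queue: deque = field(default_factory=deque)  # stand-in for a Redis list

NODES = [
    GpuNode("rtx4080", 16, "prod"),
    GpuNode("rtx3070ti", 8, "prod"),
    GpuNode("rtx3060", 12, "test"),
]

def dispatch(job: dict, nodes=NODES) -> GpuNode:
    """Route a job to the least-loaded node in the right environment
    that has enough VRAM for the requested model."""
    candidates = [
        n for n in nodes
        if n.env == job["env"] and n.vram_gb >= job["vram_needed_gb"]
    ]
    if not candidates:
        raise RuntimeError(f"no {job['env']} node fits job {job!r}")
    node = min(candidates, key=lambda n: len(n.queue))
    node.queue.append(job)
    return node

# A diarization job needing 10 GB can only fit the 4080 among prod nodes.
target = dispatch({"kind": "diarization", "env": "prod", "vram_needed_gb": 10})
print(target.name)  # rtx4080
```

Keeping the production/testing split as a routing attribute (rather than separate code paths) is one simple way to guarantee test jobs never land on production cards.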
Key Features
- Dedicated GPU Inference Nodes
- Production and Testing Environments
- Self-hosted Transcription, Diarization, and LLM Workloads
- Internal APIs for AI Processing
- Tight Integration with Desktop and Web Products
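The "local and hybrid (API-assisted)" workflows can be sketched as a simple fallback chain: prefer the on-prem GPU backend, and reach for a cloud API only when the local pass fails. The function names, the `RuntimeError` failure condition, and the stub backends are assumptions for illustration only.

```python
from typing import Callable, Optional

def transcribe_hybrid(
    audio_path: str,
    local_backend: Callable[[str], Optional[str]],
    api_backend: Callable[[str], str],
) -> tuple[str, str]:
    """Try the on-prem GPU backend first; fall back to a cloud API
    only when the local pass raises or returns nothing."""
    try:
        text = local_backend(audio_path)
        if text:
            return text, "local"
    except RuntimeError:
        pass  # e.g. node busy or model out of memory; fall through
    return api_backend(audio_path), "api"

# Stub backends standing in for real model servers.
def healthy_local(path: str) -> str:
    return "hello world"

def busy_local(path: str) -> str:
    raise RuntimeError("node busy")

def cloud_api(path: str) -> str:
    return "hello world (api)"

print(transcribe_hybrid("clip.wav", healthy_local, cloud_api))  # ('hello world', 'local')
print(transcribe_hybrid("clip.wav", busy_local, cloud_api))     # ('hello world (api)', 'api')
```

Returning which backend served the request makes it easy to track how often the system actually stays on-prem versus spilling to the API.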
Description
A dedicated on-prem AI compute layer built on top of the homelab infrastructure, designed to run GPU-intensive inference and batch processing workloads for AI-driven products.