GesturePro — Real-Time Sign Language Translation

Breaking the Communication Barrier
GesturePro is an interactive sign language translator that empowers hearing-impaired and non-speaking individuals by using AI to translate sign language gestures into text and speech in real time. The platform uses computer vision and deep learning to recognize hand gestures in a webcam feed and convert them into readable text, bridging the communication gap as the conversation happens.
Hearing-impaired individuals face daily communication barriers. Existing translation tools are expensive, too slow for live conversation, or dependent on specialized hardware.
A browser-based, real-time sign language translator using just a webcam — no special hardware needed. Powered by ML hand-tracking and gesture recognition.
Full-Stack AI Architecture
GesturePro is built as a three-tier architecture — a Next.js frontend for real-time video capture, a FastAPI backend for authentication and data management, and an ML pipeline for gesture recognition. The entire system is containerized with Docker for consistent deployment.
Three-Tier System Design
Architected full-stack system separating real-time frontend, API backend, and ML inference pipeline into independently deployable services.
Evidence: Docker Compose orchestration with health checks across 3 services.
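The three-service layout above might look like the following Compose sketch. Service names, ports, images, and healthcheck commands are illustrative assumptions, not the project's actual configuration:

```yaml
# Hypothetical docker-compose.yml sketch; names, ports, and commands are assumptions.
services:
  frontend:
    build: ./frontend            # Next.js app
    ports: ["3000:3000"]
    depends_on:
      backend:
        condition: service_healthy
  backend:
    build: ./backend             # FastAPI app
    ports: ["8000:8000"]
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 10s
      retries: 5
    depends_on:
      db:
        condition: service_healthy
  db:
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: example
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 10s
      retries: 5
```

The `condition: service_healthy` dependencies are what make the health checks matter: each tier waits for the one below it before starting.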
Frontend
Next.js + TailwindCSS
Real-time webcam capture, video streaming UI, authentication flow (sign-in/sign-up), and responsive gesture translation display.
Backend
FastAPI + PostgreSQL
RESTful API with user authentication, session management, translation history storage, and health-checked Docker services.
ML Pipeline
Python + TensorFlow
Hand landmark detection via MediaPipe, processed training data, Jupyter notebooks for experimentation, and saved models for inference.
Computer Vision Pipeline (MediaPipe)
Implemented MediaPipe hand landmark detection extracting 21 key points per frame, creating skeletal hand representations for ML classification.
Evidence: Real-time 21-landmark tracking powering gesture recognition at webcam speed.
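Before classification, the 21 landmarks are typically flattened into a fixed-length feature vector. A minimal sketch of one common normalization (translate relative to the wrist, scale by hand size), offered as an assumption rather than GesturePro's exact preprocessing:

```python
import numpy as np

def landmarks_to_features(landmarks) -> np.ndarray:
    """Convert 21 (x, y, z) MediaPipe hand landmarks to a 63-dim vector.

    Translation-normalized to the wrist (landmark 0) and scale-normalized
    by the largest wrist-to-fingertip distance, so the features are
    invariant to where the hand sits in the frame and how far it is
    from the camera.
    """
    pts = np.asarray(landmarks, dtype=np.float32).reshape(21, 3)
    pts = pts - pts[0]                  # put the wrist at the origin
    scale = np.linalg.norm(pts, axis=1).max()
    if scale > 0:
        pts /= scale
    return pts.ravel()                  # shape (63,)
```

MediaPipe's `results.multi_hand_landmarks` provides the per-frame `(x, y, z)` coordinates that would feed a function like this.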
What GesturePro Does
The platform combines real-time computer vision with a clean, accessible interface to create a seamless translation experience.
ML Model Training (TensorFlow)
Trained TensorFlow classifier on MediaPipe landmark data to recognize ASL gestures with real-time inference capability.
Evidence: Working real-time gesture → text translation pipeline.
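A classifier over 63-dimensional landmark vectors can be as small as a multilayer perceptron. The layer sizes and the 26-letter output below are illustrative assumptions about the model, not its published architecture:

```python
import tensorflow as tf

NUM_CLASSES = 26  # assumption: one class per ASL fingerspelling letter

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(63,)),   # 21 landmarks x (x, y, z)
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```

Because the input is a short landmark vector rather than raw pixels, a model this small can comfortably run inference at webcam frame rates.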
Real-Time Video Capture
Browser-based webcam access streams hand gestures directly to the ML model — no downloads or special hardware required.
Hand Landmark Detection
MediaPipe extracts 21 hand landmarks per frame, creating a skeleton representation of hand position and finger orientation.
AI Gesture Classification
TensorFlow model classifies hand landmarks into sign language letters and words with real-time inference.
Instant Text Translation
Recognized gestures are immediately converted to on-screen text, enabling fluid conversation without delays.
User Authentication
Secure sign-in/sign-up flow with session management, enabling personalized translation history and preferences.
Containerized Deployment
Docker Compose orchestrates all services (frontend, backend, database) with health checks for reliable deployment.
Accessibility Through Technology
GesturePro represents a step toward making communication universally accessible. By combining browser-based computer vision with deep learning, the platform removes the cost and hardware barriers that have historically limited sign language translation tools.
Gestures are recognized and displayed as text within milliseconds of capture.
Works with any standard webcam — no specialized sensors or gloves needed.
Containerized 3-tier system with auth, persistence, and ML inference.
Fully open-source on GitHub for community contribution and extension.
Accessibility-First Engineering
Built an accessible-by-default platform: zero cost, no extra hardware, and a browser-only deployment that democratizes ASL translation.
Evidence: Open-source, zero-hardware-cost solution deployed on Vercel.
Future Roadmap
Multi-Sign-Language Support
Expand beyond ASL to include BSL, ISL, and other sign language systems for global accessibility.
Text-to-Speech Output
Add voice synthesis so translated text can be spoken aloud for two-way communication.
Mobile-First PWA
Progressive Web App for on-the-go translation using smartphone cameras.
Learning Analytics
Track user progress in learning sign language with personalized practice recommendations.