Intelligent Document Processing Platform
AI-Powered Multilingual OCR & Document Intelligence
Converts scanned PDFs and photographed forms into clean, structured Markdown across 10+ languages including complex scripts like Urdu, Arabic and Amh…
I'm Hammad Tariq, a Software Engineer with 5+ years building production-grade systems: async APIs, self-hosted AI pipelines, and data architectures that cut cost and scale cleanly.
Hammad Tariq
Backend, AI & Data Engineer
Tools of the trade
Whether it's orchestrating data pipelines or fine-tuning LLMs for domain automation, the focus is always scalability, observability and clean execution.
I design and ship the unglamorous parts that decide whether a system survives production: async ingestion, confidence scoring, on-prem verification, and pipelines that fail soft and recover on their own. Recent work has cut operating costs by up to 99% and turned hours of manual effort into seconds.
Embedding AI models (Gemini, Claude, Whisper) into production APIs with low latency and self-hosted economics.
Orchestrated ETL/ELT with Dagster, dbt and Airflow — from raw ingestion to analytics-ready warehouses.
Dockerised, CI/CD-driven deployments on AWS with cost optimisation and observability baked in.
Async architectures, isolated failure domains and graceful degradation — built to survive real traffic.
Real systems with measured outcomes — problem, architecture, and the numbers that moved.
A Zero-Cost-Per-Lookup B2B Email-Discovery Pipeline
A self-hosted pipeline that turns a company domain into SMTP-verified emails for titled employees at effectively zero marginal cost, replacing a five-…
OSINT-grade people search, synthesised in under 15 seconds
A people-search API that aggregates public-web signals from Google, Bing and DuckDuckGo, fuses them through a proprietary clustering engine, and synth…
Turning spoken words into structured, speaker-attributed intelligence
A RESTful API that transforms raw audio into speaker-attributed transcripts using a multi-model pipeline: Whisper for transcription, PyAnnote for diar…
Practical, production-tested deep dives on backend, AI and data engineering — the decisions and the numbers behind real work.
A practical AI cost optimization guide for 2026: why your AI bill exploded, how task-based model routing cuts LLM costs 30–50%, and a real case study that slashed document-processing AI spend by 99%.
An honest, production-tested comparison of Dagster vs Airflow vs Prefect for ETL in 2026 — and how to pick the best ETL orchestration tool for your stack.
Three ways to engage — from a focused consultation to full end-to-end delivery.
From production APIs to self-hosted AI that kills per-call costs, let's scope it. I reply within one business day.