
By 2025, the fashion recommendation space was crowded with LLM demos. The pattern was predictable: upload a photo, describe an occasion, and get a list of "recommended items" with beautiful descriptions and purchase links that either didn't exist, led to the wrong products, or pointed to items that were sold out.
The interesting design question wasn't "how do we recommend better outfits?" It was "how do we architect an agent system where hallucination at the product level is structurally impossible?"
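One way to read "structurally impossible" is that the model never writes product names or links at all: it only returns identifiers, and everything the user sees is looked up in a catalog. The sketch below illustrates that idea under assumptions; `Product`, `resolve_recommendations`, and the SKU-keyed catalog are hypothetical names, not the system's actual code.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Product:
    # Hypothetical catalog record; field names are illustrative only.
    sku: str
    name: str
    url: str
    in_stock: bool


def resolve_recommendations(
    llm_skus: list[str], catalog: dict[str, Product]
) -> list[Product]:
    """Map model-proposed SKUs to real catalog entries.

    The model output is treated as a list of opaque identifiers. Names and
    URLs are always read from the catalog, never from model-generated text,
    so an invented SKU simply fails the lookup and is dropped.
    """
    resolved: list[Product] = []
    seen: set[str] = set()
    for sku in llm_skus:
        product = catalog.get(sku)
        # Drop unknown SKUs, sold-out items, and duplicates.
        if product is None or not product.in_stock or sku in seen:
            continue
        seen.add(sku)
        resolved.append(product)
    return resolved
```

With this shape, the worst a hallucinated identifier can do is shrink the result list; it cannot surface as a dead link or a mismatched product.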
A naive sequential execution of 5 LLM calls would take 20–30 seconds. Total p95 latency in production: 8.4 seconds.
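A common way to close a gap like this is to run calls that don't depend on each other concurrently, so total time approaches the slowest call rather than the sum. Whether that is how the 8.4-second figure was reached is an assumption here. The sketch below uses `asyncio.sleep` as a stand-in for an LLM call; `fake_llm_call` and the 0.05-second durations are hypothetical and chosen only to keep the demo fast.

```python
import asyncio
import time

# Five stand-in calls; names and durations are illustrative, not measured.
CALLS = [(f"call_{i}", 0.05) for i in range(1, 6)]


async def fake_llm_call(name: str, seconds: float) -> str:
    # Placeholder for a real network-bound LLM request.
    await asyncio.sleep(seconds)
    return name


async def run_sequential(calls):
    # Each call waits for the previous one: total time is roughly the sum.
    return [await fake_llm_call(name, secs) for name, secs in calls]


async def run_concurrent(calls):
    # Independent calls overlap: total time is roughly the slowest call.
    return await asyncio.gather(
        *(fake_llm_call(name, secs) for name, secs in calls)
    )


def timed(runner, calls):
    """Run an async runner to completion and return (results, elapsed seconds)."""
    start = time.perf_counter()
    results = asyncio.run(runner(calls))
    return list(results), time.perf_counter() - start


if __name__ == "__main__":
    _, sequential_s = timed(run_sequential, CALLS)
    _, concurrent_s = timed(run_concurrent, CALLS)
    print(f"sequential: {sequential_s:.2f}s, concurrent: {concurrent_s:.2f}s")
```

This only helps where calls are genuinely independent; any call that needs another's output still has to wait for it, so the floor is the longest dependency chain.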