Natural-language product discovery on a sportswear PDP, indexed across 500+ products
A 'chat with the product page', built and user-tested in under six weeks.

The challenge
Product detail pages have looked the same since the early 2010s — a hero image, some tabs, a size selector, filters in a sidebar. Meanwhile, the way people interact with software has shifted completely. They ask questions. They describe what they want in natural language. They expect the system to understand intent, not just parse keywords.
That gap had become a real problem for the brand. A shopper looking for "something bold and minimal for a city weekend" had no way to express that through a colour dropdown.
Product pages are also the highest-converting pages on any e-commerce site, which makes them the riskiest place to experiment. The brand's digital team didn't want a rebuild of their PDP. They wanted to validate conversational commerce quickly and cheaply before committing production resources to it: an experiment real enough to generate behavioural signal from real users, without requiring platform-wide changes.
The harder problem was the AI itself. Keyword search works fine when someone types "black running shoes size 10." It fails when someone types "comfortable shoes for walking around a new city for a few days." Those queries describe products that overlap, but they don't share a single keyword. The system had to understand what the shopper meant, not what they said.
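The keyword gap is easy to demonstrate. A minimal sketch (product names invented for illustration) shows that naive keyword matching scores an intent-style query at zero against exactly the products it should return:

```typescript
// Naive keyword matching: count query tokens that also appear in the product title.
function keywordOverlap(query: string, title: string): number {
  const tokenize = (s: string) =>
    new Set(s.toLowerCase().split(/\W+/).filter(Boolean));
  const queryTokens = tokenize(query);
  const titleTokens = tokenize(title);
  let shared = 0;
  for (const token of queryTokens) {
    if (titleTokens.has(token)) shared++;
  }
  return shared;
}

// A spec-style query matches on keywords...
console.log(keywordOverlap("black running shoes size 10", "Black Running Shoes")); // 3

// ...but an intent-style query shares nothing with a product it should surface.
// "Ultraboost Lightweight Trainer" is a hypothetical product name.
console.log(
  keywordOverlap(
    "comfortable shoes for walking around a new city",
    "Ultraboost Lightweight Trainer"
  )
); // 0
```

The second result is the whole problem: a perfectly relevant product scores zero, so keyword ranking never surfaces it.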
What we learned
| Product pages froze in 2010 | Hero image, tabs, size selector — the interface stopped evolving while user expectations of software kept moving. |
| PDPs are the riskiest experiment | The highest-converting pages on any e-commerce site are the worst place to break something to learn. |
| Keywords miss intent | 'Comfortable shoes for walking around a new city' shares no keywords with the products it should return. |
The solution
We built a working prototype of a conversational product discovery layer embedded directly within the existing product detail page — no changes to the underlying e-commerce platform — deployed and user-tested in under six weeks.
Three architecture decisions did the heavy lifting. The conversational interface sits on top of the existing PDP, not in place of it. The familiar layout stays. The conversation appears as an optional layer. Shoppers who prefer browse-and-filter can ignore it; those who engage get progressively deeper interaction as they ask more questions. The component is headless — deployable on any product detail page without touching the underlying platform. That meant the prototype could be tested in isolation, and a future production version could roll out page by page or per A/B segment without a platform-wide deployment.
On the backend, two technical choices shaped everything. Pinecone for semantic product matching: every product in the catalogue stored as a vector embedding, so natural-language queries find the closest semantic matches regardless of keyword overlap. Sub-500ms query latency, fast enough that the interaction feels like a real-time dialogue rather than a database lookup. LangGraph for conversation state management: the assistant remembers what you've said, builds on your preferences, and treats a follow-up as a continuation rather than a new search. LangGraph also coordinated multi-step orchestration when a shopper asked a comparison question, retrieving information about multiple products and evaluating them against the stated criteria.
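The ranking mechanism underneath the vector store is cosine similarity between a query embedding and the stored product embeddings. Pinecone does this at scale with approximate search; the toy three-dimensional vectors and SKUs below are invented purely to show the shape of the computation:

```typescript
type EmbeddedProduct = { sku: string; vector: number[] };

// Cosine similarity: 1 means same direction, 0 means unrelated.
function cosine(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank the catalogue by similarity to the query embedding, keep the top k.
function topK(query: number[], catalogue: EmbeddedProduct[], k: number): EmbeddedProduct[] {
  return [...catalogue]
    .sort((x, y) => cosine(query, y.vector) - cosine(query, x.vector))
    .slice(0, k);
}

const catalogue: EmbeddedProduct[] = [
  { sku: "CITY-TRAINER", vector: [0.9, 0.1, 0.0] },
  { sku: "TRACK-SPIKE", vector: [0.0, 1.0, 0.0] },
];

console.log(topK([1, 0, 0], catalogue, 1)[0].sku); // "CITY-TRAINER"
```

In production the query vector comes from an embedding model, not hand-written numbers, and the nearest-neighbour search is delegated to the index — which is what keeps latency under the 500ms budget.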
The less glamorous but arguably most important work was the fashion knowledge layer. Generic large language models produce generic responses — asked about sportswear, they default to spec-sheet descriptions. We built a layer that translates shopper language into product attribute language ("casual but not boring" maps to specific style characteristics), applies occasion logic ("gym to dinner" narrows to a different product set than "weekend travel"), and calibrates tone to match the brand. The difference is audible: "Here are three products matching your query" versus "For a city weekend where you want to look put-together without overdressing, here's what I'd pick — and here's why." The second one builds confidence. That's what converts.
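The translation idea can be sketched as a phrase-to-attribute lookup that runs before the vector search. The phrase table and attribute names below are invented examples, not the brand's actual taxonomy:

```typescript
type AttributeFilter = { style?: string[]; occasion?: string[] };

// Hypothetical phrasebook mapping shopper language to product attributes.
const PHRASEBOOK: Record<string, AttributeFilter> = {
  "casual but not boring": { style: ["relaxed-fit", "statement-colour"] },
  "gym to dinner": { occasion: ["training", "smart-casual"] },
  "weekend travel": { occasion: ["travel", "all-day-comfort"] },
};

// Collect every attribute filter whose trigger phrase appears in the query.
function translateIntent(query: string): AttributeFilter {
  const merged: AttributeFilter = {};
  const lowered = query.toLowerCase();
  for (const [phrase, attrs] of Object.entries(PHRASEBOOK)) {
    if (lowered.includes(phrase)) {
      if (attrs.style) merged.style = [...(merged.style ?? []), ...attrs.style];
      if (attrs.occasion) merged.occasion = [...(merged.occasion ?? []), ...attrs.occasion];
    }
  }
  return merged;
}

console.log(translateIntent("something casual but not boring for weekend travel"));
// { style: ["relaxed-fit", "statement-colour"], occasion: ["travel", "all-day-comfort"] }
```

The real layer used the language model for this mapping rather than a static table, but the contract is the same: fuzzy shopper language in, structured attribute filters out.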
The build ran six weeks total. Week 1 discovery, weeks 2-5 build with weekly demos, week 6 user testing with real shoppers on real product pages.
What this shaped
| Layer over, don't replace | Wrap the existing PDP with an optional chat surface so browse-and-filter shoppers stay unaffected. |
| Headless beats platform-wide | A component deployable per-page lets you A/B by segment instead of betting the whole site at once. |
| Knowledge layer is the moat | Translating 'casual but not boring' to product attributes is what separates a useful AI from a generic one. |
The impact
The functional prototype was delivered in under six weeks with a team of three engineers and one designer. User testing confirmed the hypothesis: shoppers engaged more deeply, explored more products, and reported higher confidence in their selections. The brand advanced the concept to its production roadmap.
The lesson Twistag took away — and now applies on every conversational AI engagement — is that the engineering value isn't in the model. It's in the knowledge layer that translates between user intent and the brand's product reality. That's where a generic AI tool becomes a useful one.
What this proved
| Six weeks beats six months | Real users on real product pages give signal no planning document ever will. |
| Reasoning builds buyer confidence | 'Here's what I'd pick and why' converts where 'here are three matches' doesn't, because it sounds like advice. |
| The model isn't the engineering value | What ships is the layer that translates between user intent and the brand's product reality. |
Technologies used
- Pinecone
- LangGraph
- Next.js
- Node.js

