tony ed64712525 feat: Phase 5 — live streaming trip building via SSE
Grok now drives the trip rendering in real time instead of dumping
the full result after ~90 seconds.

Backend
- GrokHeadlessClient gains a chatStream() async generator that spawns
  grok with --output-format streaming-json (NDJSON of {type,data}
  events), buffers the "text" tokens, and emits partial events as the
  buffer becomes parseable.
- tryPartialJsonParse: lenient JSON repair. Walks the buffer once,
  closes open structures in stack order, drops in-progress strings
  and dangling keys, and returns whatever object is currently
  consistent. Tested by progressively slicing a multi-stop itinerary.
- New SSE endpoint POST /api/chat/stream with events: open / thinking
  / partial / done / error. Uses res.on('close') plus a check of
  res.writableEnded as the client-disconnect signal: a close while
  writableEnded is still false means the client went away.
  (req.on('close') fires in Express 5 once the body is consumed,
  which was killing the grok child.)
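
The repair pass described for tryPartialJsonParse can be sketched
roughly as below. This is an illustration under stated assumptions, not
the committed implementation: the "last consistent prefix" bookkeeping
(safeEnd, safeClosers) and the rule of also dropping a number or literal
that is still being written are choices made for this sketch.

```typescript
// Sketch only: a lenient partial-JSON parser in the spirit of
// tryPartialJsonParse. Internal names and the exact repair rules are
// assumptions, not taken from the commit.
function tryPartialJsonParse(buffer: string): unknown {
  const stack: string[] = [];   // closers for structures that are still open
  let inString = false;
  let escaped = false;
  let expectingKey = false;     // next string inside an object is a key
  let stringIsKey = false;      // the string being scanned is a key
  let safeEnd = 0;              // end of the last consistent prefix
  let safeClosers = "";         // closers that prefix needs, innermost first

  const markSafe = (end: number): void => {
    safeEnd = end;
    safeClosers = [...stack].reverse().join("");
  };

  for (let i = 0; i < buffer.length; i++) {
    const ch = buffer[i];
    if (inString) {
      if (escaped) escaped = false;
      else if (ch === "\\") escaped = true;
      else if (ch === '"') {
        inString = false;
        // A finished value string is consistent; a finished key is
        // still dangling until its value arrives.
        if (!stringIsKey) markSafe(i + 1);
      }
      continue;
    }
    switch (ch) {
      case '"':
        inString = true;
        stringIsKey = expectingKey;
        expectingKey = false;
        break;
      case "{":
        stack.push("}");
        expectingKey = true;
        markSafe(i + 1);
        break;
      case "[":
        stack.push("]");
        expectingKey = false;
        markSafe(i + 1);
        break;
      case "}":
      case "]":
        stack.pop();
        expectingKey = false;
        markSafe(i + 1);
        break;
      case ",":
        // Everything before a comma is a complete value (this is what
        // picks up numbers and literals), so cut just before it.
        markSafe(i);
        expectingKey = stack[stack.length - 1] === "}";
        break;
      default:
        break; // whitespace, colons, number and literal characters
    }
  }

  try {
    return JSON.parse(buffer.slice(0, safeEnd) + safeClosers);
  } catch {
    return null; // nothing consistent yet, or the input is malformed
  }
}
```

With this sketch, the slice
`{"stops":[{"name":"Baldock Services","lat":51.9` yields
`{"stops":[{"name":"Baldock Services"}]}`: the dangling "lat" key and
its unfinished number are dropped, and the three open structures are
closed in stack order.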

Frontend
- sendMessage switches to fetch + ReadableStream against
  /api/chat/stream and parses SSE blocks. Each partial event runs a
  fast synchronous normalizePartialItinerary (no Nominatim lookups;
  stops missing lat/lng are dropped so partial rendering doesn't
  block on geocoding).
- The done event runs the full async normalizer for the final pass
  and caches the result per variant.
- Stops, day cards, map markers, polylines, the variant strip, and
  the trip summary all update progressively as Grok writes each stop.
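
The client-side steps above (splitting the stream into SSE blocks, then
a cheap synchronous filter on each partial) might look something like
the sketch below. The names parseSseBlocks and dropUngeocodedStops and
the Stop shape are hypothetical and used only for illustration; the
real sendMessage and normalizePartialItinerary are not shown in this
commit message.

```typescript
// Sketch only: hypothetical helpers, not the code from this commit.
interface SseEvent {
  event: string; // e.g. open / thinking / partial / done / error
  data: string;  // raw payload; the caller decides whether to JSON.parse it
}

// Splits accumulated text into complete SSE blocks (separated by a
// blank line) and returns the unfinished tail to prepend to the next
// chunk.
function parseSseBlocks(buffer: string): { events: SseEvent[]; rest: string } {
  const blocks = buffer.replace(/\r\n/g, "\n").split("\n\n");
  const rest = blocks.pop() ?? ""; // last piece may be an incomplete block
  const events: SseEvent[] = [];

  for (const block of blocks) {
    let event = "message"; // default event name when no "event:" field
    const dataLines: string[] = [];
    for (const line of block.split("\n")) {
      if (line === "" || line.startsWith(":")) continue; // comment line
      const colon = line.indexOf(":");
      const field = colon === -1 ? line : line.slice(0, colon);
      let value = colon === -1 ? "" : line.slice(colon + 1);
      if (value.startsWith(" ")) value = value.slice(1); // one optional space
      if (field === "event") event = value;
      else if (field === "data") dataLines.push(value);
    }
    // Blocks without any data field are skipped rather than dispatched.
    if (dataLines.length > 0) {
      events.push({ event, data: dataLines.join("\n") });
    }
  }
  return { events, rest };
}

// Assumed stop shape: coordinates may be absent while the model is
// still writing the stop.
interface Stop {
  name: string;
  lat?: number;
  lng?: number;
}

// Synchronous filter in the spirit of the partial normalizer: keep
// only stops that already carry usable coordinates, with no geocoding.
function dropUngeocodedStops(stops: Stop[]): Stop[] {
  return stops.filter((s) => Number.isFinite(s.lat) && Number.isFinite(s.lng));
}
```

A reader loop would append each decoded chunk to the previous `rest`,
call parseSseBlocks, and handle the returned events in order, so a
block split across two network chunks is only processed once it is
complete.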

Verified with a London → Edinburgh prompt: 6 partial events landed
across the 76-second stream, with the rail filling in
"Baldock Services" → "+Grantham A1" → "+Premier Inn Newcastle"
→ "+Fort Kinnaird" before the final done event.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 16:01:00 +01:00