HERMES++ answers language queries while predicting roads
The prevailing view has been that autonomous‑driving world models must choose between two extremes: a perception‑only pipeline that reconstructs the current bird’s‑eye‑view (BEV) layout, or a generative model that rolls forward future geometry without a semantic grasp of the scene. HERMES++ demonstrates that a single network can inhabit both roles, answering natural‑language queries while extrapolating the road ahead. Previously, scene‑understanding systems relied on dense BEV encoders tuned fo...
Original source: Dev