A rental-search assistant with LLM features and multi-market support kept shipping user-facing defects despite 1,553 passing automated tests. An analysis of 252 bug-fix commits showed that 44% of the fixes landed at four seams the test suite did not see: the browser runtime, the non-default market, end-to-end flows, and the whole-system level. In one case, a fix applied without a seam guard let the same defect ship twice, which highlights the need for targeted testing at these boundaries.
LLM-Integrated App Bug Seams Reveal Testing Gaps
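
To make the idea of a seam guard concrete, here is a minimal sketch for the non-default-market seam. It is not taken from the project described above. The `Market` registry, the `format_rent` function, and the `check_market_seam` helper are all hypothetical names chosen for illustration. The sketch assumes the guard's job is to run one assertion against every configured market rather than only the default one.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical market registry; codes and currencies are illustrative only.
@dataclass(frozen=True)
class Market:
    code: str
    currency: str
    default: bool = False

MARKETS = [
    Market("us", "USD", default=True),
    Market("de", "EUR"),
    Market("jp", "JPY"),
]

def format_rent(amount: int, market: Market) -> str:
    """Hypothetical function under test: render a monthly rent for a market."""
    return f"{amount:,} {market.currency}/month"

def check_market_seam(
    render: Callable[[int, Market], str] = format_rent,
    markets: List[Market] = MARKETS,
) -> List[str]:
    """Seam guard: apply the same check to every market, not only the default.

    Returns the codes of the markets whose rendered output fails the check.
    """
    failures = []
    for market in markets:
        rendered = render(1200, market)
        if market.currency not in rendered:
            failures.append(market.code)
    return failures
```

A renderer that hardcodes the default market's currency would pass a test that only covers `us`, while this guard would report `de` and `jp` as failing. That is the kind of defect the summary attributes to the non-default-market seam.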