Problem: output comparison used exact string equality after whitespace
normalisation, causing correct solutions to fail on problems where
floating-point answers are accepted within a tolerance (e.g. 1e-6).
Solution: add an optional ui.panel.epsilon config value. When set,
actual and expected output are compared token-by-token: numeric tokens
are compared with math.abs(a - b) <= epsilon, non-numeric tokens fall
back to exact string equality. Per-problem epsilon can also be stored
in the cache and takes precedence over the global default.