Because the two evaluation tools measure different things, perhaps?

Schrödinger's cat walks into a bar. And doesn't.