This comparison is super useful for anyone relying on AI tools. The hallucination issue with Claude is particularly telling: it shows these models will confidently serve up fiction when they lack data. What makes this experiment valuable is that you didn't just accept the first answer but pushed back, which revealed how often AI doubles down before correcting itself. I've had similar experiences where follow-up questions expose the gaps.