“Can Large Language Models Reason?”
See Betteridge’s law
On a dataset of programming challenges, GPT-4 solved 10 of 10 problems published before 2021 (GPT-4’s pre-training cutoff) and 0 of 10 problems published after 2021.
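The comparison above amounts to splitting problems by publication year relative to the training cutoff and counting solves on each side. A minimal sketch, assuming a hypothetical `Result` record and placeholder entries that only mirror the reported 10/10 and 0/10 counts (the problem IDs and exact years are invented for illustration, not the actual dataset):

```python
from dataclasses import dataclass

CUTOFF_YEAR = 2021  # cutoff year as stated in the text


@dataclass(frozen=True)
class Result:
    problem_id: str  # hypothetical identifier, not from the source
    year: int        # publication year of the problem
    solved: bool     # whether the model's solution was accepted


def solve_counts(results, cutoff_year=CUTOFF_YEAR):
    """Return {"before": (solved, total), "after": (solved, total)}."""
    counts = {"before": [0, 0], "after": [0, 0]}
    for r in results:
        if r.year == cutoff_year:
            # Problems from the cutoff year itself are ambiguous; skip them.
            continue
        key = "before" if r.year < cutoff_year else "after"
        counts[key][0] += int(r.solved)
        counts[key][1] += 1
    return {key: tuple(value) for key, value in counts.items()}


# Placeholder records mirroring only the counts reported above.
results = (
    [Result(f"old-{i}", 2020, True) for i in range(10)]
    + [Result(f"new-{i}", 2022, False) for i in range(10)]
)

for split, (solved, total) in solve_counts(results).items():
    print(f"{split} cutoff: {solved}/{total} solved")
```

A large gap between the two splits is what one would expect if the older problems, or their solutions, appeared in the training data; with only 10 problems per side, the sketch illustrates the check rather than a statistically robust test.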