In a paper posted to arXiv titled “How does ChatGPT behavior change over time?”, Lingjiao Chen, Matei Zaharia, and James Zou question the consistency of OpenAI’s large language models (LLMs), specifically GPT-3.5 and GPT-4. Using API access, they tested the March and June 2023 versions of the models on tasks such as solving math problems, answering sensitive questions, generating code, and visual reasoning. GPT-4’s ability to identify prime numbers reportedly plummeted from 97.6 percent accuracy in March to just 2.4 percent by June. Curiously, GPT-3.5’s performance improved over the same period.
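A benchmark like the prime-number test comes down to scoring a model's yes/no answers against ground truth. The sketch below is an illustrative reconstruction, not the authors' code: the yes/no answer format, the sample numbers, and the `accuracy` helper are assumptions.

```python
def is_prime(n: int) -> bool:
    """Ground-truth primality check by trial division."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def accuracy(answers: dict[int, str]) -> float:
    """Fraction of model answers ('yes'/'no') that match the truth."""
    correct = sum(
        (ans.strip().lower() == "yes") == is_prime(n)
        for n, ans in answers.items()
    )
    return correct / len(answers)

# Mock responses standing in for API calls to a model snapshot;
# the model wrongly calls 9 prime here, so accuracy is 3/4.
mock_responses = {7: "yes", 8: "no", 9: "yes", 11: "yes"}
print(accuracy(mock_responses))  # 0.75
```

In the actual study, the answers would come from calls to each dated model snapshot (e.g. `gpt-4-0314` vs. `gpt-4-0613`) rather than a hard-coded dictionary.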
The study follows frequent user complaints of a perceived decline in GPT-4’s performance in recent months. Popular theories as to why include OpenAI throttling the models to reduce computational overhead, speed up output, and save GPU resources; fine-tuning (additional training) intended to reduce harmful output having unintended side effects; and a series of unsubstantiated conspiracy theories, such as OpenAI deliberately weakening GPT-4’s coding ability to push more people toward paying for GitHub Copilot.
Meanwhile, OpenAI has consistently denied these allegations. Last Thursday, OpenAI VP of Product Peter Welinder tweeted:
No, we haven’t made GPT-4 dumber. On the contrary: we make each new version smarter than the previous one. The current hypothesis: The more you use it, the more you’ll start to notice problems you didn’t see before.
OpenAI is aware of the new research and says it is watching for reports of declining GPT-4 capabilities.
The results didn’t convince all experts, but some say the uncertainty itself points to a bigger problem: OpenAI doesn’t publish details about its models. Critics have repeatedly pointed to OpenAI’s closed approach to AI.
In the case of GPT-4, the company disclosed neither the sources of its training data, nor the source code, nor the neural network weights, nor even a paper describing its architecture.
Cover photo credit: Jonathan Raa/NurPhoto via Getty Images