Reviews · MAY 5, 2026
Reviewed: GPT-5.5 Instant ships as ChatGPT's new default with a 52.5% hallucination-reduction claim
OpenAI's May 5 update to the default ChatGPT model promises sharper answers on medicine, law, and finance. The headline number is internal; the rollout is universal.
OpenAI on Tuesday, May 5 released GPT-5.5 Instant as the new default model in ChatGPT, replacing GPT-5.3 Instant across the entire consumer user base. The headline figure, from OpenAI's internal evaluations, is a 52.5% reduction in hallucinated claims on high-stakes prompts spanning medicine, law, and finance.
Read carefully, the announcement is a reliability update inside the existing "Instant" family — the low-latency tier that fields the long tail of ChatGPT interactions — rather than a frontier-capability extension. OpenAI did not publish third-party benchmark scores for the new model on launch day. The lab did not publicly disclose the prompt-construction protocol behind the hallucination-reduction figure beyond the high-level domain categories.
The numbers we have:
- 52.5% fewer hallucinated claims vs. GPT-5.3 Instant, on prompts in medicine, law, and finance (OpenAI internal eval).
The numbers we don't have, on launch day:
- Scores on MMLU, GPQA, HumanEval, SWE-bench, or any other public benchmark.
- The breakdown of the 52.5% across the three named domains.
- The eval set size or prompt-construction methodology.
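Why the missing baseline matters can be shown with simple arithmetic. The sketch below assumes the 52.5% figure is a relative reduction in the hallucination rate, which is how the claim reads but is not something OpenAI has spelled out. The baseline rates are hypothetical, chosen only to illustrate the spread; the function name is ours.

```python
def rate_after_reduction(baseline_rate: float, relative_reduction: float = 0.525) -> float:
    """Hallucination rate that remains after a relative reduction is applied."""
    if not 0.0 <= baseline_rate <= 1.0:
        raise ValueError("baseline_rate must be between 0 and 1")
    return baseline_rate * (1.0 - relative_reduction)


# Hypothetical baselines: OpenAI did not publish the actual rates.
for baseline in (0.02, 0.10, 0.20):
    after = rate_after_reduction(baseline)
    drop = baseline - after
    print(f"baseline {baseline:.1%} -> after {after:.2%} (absolute drop {drop:.2%})")
```

The same 52.5% claim corresponds to very different absolute improvements depending on where GPT-5.3 Instant started, which is why the undisclosed eval details limit how much weight the number can carry.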
OpenAI bundled four adjacent announcements with the launch: three product moves and one research disclosure. The first was a new realtime voice intelligence model in its API, which the lab said can reason, translate, and transcribe speech within a single inference path. The second was a preview of a personal-finance experience in ChatGPT for U.S. Pro subscribers, with bank-account connection and a dashboard built on top of the chat surface. The third was a preview expansion of Codex into the ChatGPT mobile application, with extended remote-SSH, hooks, access-token, and HIPAA support for enterprise teams. The fourth was an unrelated research disclosure: the lab said one of its models had disproved a central conjecture in discrete geometry, a claim that did not come with a technical write-up.
For operators currently running production workloads on the GPT-5 series, the practical upgrade question is whether to retest those workloads against GPT-5.5 Instant before letting the rollout settle. The honest answer is yes: the 52.5% figure is an aggregate over high-stakes domains and may not translate to a specific workload's hallucination profile. The default model has changed under everyone running ChatGPT-based features, so the right move is to re-establish a baseline before assuming the upgrade improves every workload.
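One way to re-establish that baseline is to run the same workload prompts through the old and new model and compare flagged-answer rates. The following is a minimal sketch, not a description of OpenAI's eval: `hallucination_rate`, `relative_reduction`, the toy models, and the grader are all hypothetical stand-ins, and a real harness would replace them with actual model calls and a grader backed by human review or a checked reference set.

```python
from typing import Callable, Sequence


def hallucination_rate(
    answer_fn: Callable[[str], str],
    prompts: Sequence[str],
    is_hallucinated: Callable[[str, str], bool],
) -> float:
    """Fraction of prompts whose answer the grader flags as hallucinated."""
    if not prompts:
        raise ValueError("need at least one prompt")
    flagged = sum(1 for p in prompts if is_hallucinated(p, answer_fn(p)))
    return flagged / len(prompts)


def relative_reduction(old_rate: float, new_rate: float) -> float:
    """Relative reduction from old to new; a negative value is a regression."""
    if old_rate == 0:
        raise ValueError("baseline rate is zero; relative reduction is undefined")
    return (old_rate - new_rate) / old_rate


# Toy stand-ins so the sketch runs end to end (all hypothetical).
reference = {"capital of France?": "Paris", "2 + 2?": "4"}
old_answers = {"capital of France?": "Lyon", "2 + 2?": "4"}
prompts = list(reference)


def old_model(prompt: str) -> str:
    return old_answers[prompt]


def new_model(prompt: str) -> str:
    return reference[prompt]


def grader(prompt: str, answer: str) -> bool:
    # Flags any answer that disagrees with the checked reference.
    return answer != reference[prompt]


old_rate = hallucination_rate(old_model, prompts, grader)
new_rate = hallucination_rate(new_model, prompts, grader)
print(f"old {old_rate:.1%}, new {new_rate:.1%}, "
      f"relative reduction {relative_reduction(old_rate, new_rate):.1%}")
```

Keeping the prompt set and the grader fixed across both runs is the point of the design: any difference in the rate is then attributable to the model change rather than to the measurement.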
The release lands two weeks before Google's I/O announcements positioning Gemini 3.5 Flash, Gemini Omni, and Gemini Spark as a coordinated three-product response. The frontier-lab cadence is now: ship a reliability move, ship a modality move, ship an agent move, together rather than in sequence. OpenAI led with reliability. Google follows with modality.