AI TL;DR
Explosive leaks suggest Google's internal Gemini 3.5 'Snow Bunny' model can generate 3,000 lines of code from a single prompt, scores 80-88% on a lateral reasoning benchmark, and may outperform GPT-5.2 and Claude Opus 4.5.
The AI world is buzzing with leaks about Google's next frontier model. In late January 2026, internal test versions of Gemini 3.5—codenamed "Snow Bunny"—began surfacing through A/B testing on Google AI Studio, and the capabilities being reported are nothing short of extraordinary.
What is "Snow Bunny"?
"Snow Bunny" is an internal codename for what appears to be Google's most advanced thinking model to date. Unlike previous Gemini iterations, this model reportedly introduces System 2 reasoning (also called "Deep Think" mode), which lets it deliberate at length before generating a response.
Early testers encountering mysterious model IDs like "DN9" and "D13" on Google AI Studio have described it as a "frontier-class engine built for full-stack creation" rather than a typical chatbot.
Mind-Blowing Code Generation
The most viral demonstrations of Snow Bunny involve its unprecedented coding capabilities:
The Game Boy Emulator Test
One tester reported that the model generated a working Nintendo Game Boy emulator, over 3,000 lines of code, from a single prompt. The code reportedly needed only minor manual adjustments to run correctly. A comparable project could take a human developer weeks.
"Vibe Coding" at Scale
The model can apparently interpret vague, high-level descriptions of an application's structure and turn them into production-ready code. Reported outputs include:
- Complete web applications from conceptual descriptions
- Clean HTML/CSS/JS without syntax errors
- Multi-file project generation with proper architecture
"This isn't code assistance—it's code generation at a level we've never seen before." — Anonymous AI Studio Tester
Reasoning Benchmarks: Crushing the Competition
Where Snow Bunny truly shines is in advanced reasoning tasks:
| Benchmark | Gemini 3.5 (Leaked) | GPT-5.2 | Claude Opus 4.5 |
|---|---|---|---|
| Hieroglyph (Lateral Reasoning) | 80-88% | 55% | ~60% |
| Logic Tests | 80%+ | 72% | 75% |
The Hieroglyph Benchmark
This test measures a model's ability to infer patterns and rules in symbolic sequences—a proxy for genuine abstract reasoning. Snow Bunny's reported 80-88% score represents a massive leap from previous models and suggests Google may have cracked something fundamental in reasoning architecture.
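The leaks do not describe the benchmark's actual item format, so the following is only a toy illustration of what "inferring rules in symbolic sequences" can look like and how an accuracy percentage is computed. The items, the `predict_periodic` baseline, and the `accuracy` helper are all hypothetical and are not taken from Hieroglyph itself.

```python
from typing import Callable

# Hypothetical items: (symbol sequence, expected next symbol).
ITEMS: list[tuple[str, str]] = [
    ("ABABAB", "A"),       # period-2 repetition
    ("ABCABCAB", "C"),     # period-3 repetition
    ("XXYXXYXX", "Y"),     # period-3 repetition
    ("ABBABBBABBB", "B"),  # runs of B grow by one; not periodic
]


def predict_periodic(seq: str) -> str:
    """Naive baseline: find the shortest period that fits, then extend it."""
    if not seq:
        raise ValueError("sequence must be non-empty")
    # A period equal to len(seq) always fits, so the loop always returns.
    for period in range(1, len(seq) + 1):
        if all(seq[i] == seq[i - period] for i in range(period, len(seq))):
            return seq[len(seq) - period]
    raise AssertionError("unreachable")


def accuracy(solver: Callable[[str], str], items: list[tuple[str, str]]) -> float:
    """Fraction of items where the solver predicts the expected symbol."""
    correct = sum(solver(seq) == answer for seq, answer in items)
    return correct / len(items)


if __name__ == "__main__":
    print(f"baseline accuracy: {accuracy(predict_periodic, ITEMS):.0%}")
```

The baseline handles the three repeating sequences but misses the last item, whose rule (growing runs) is not a fixed period. Items of that kind, where the rule has to be inferred rather than matched, are presumably what a lateral reasoning benchmark is meant to probe.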
Specialized Variants Revealed
The leaks suggest Google is developing a family of specialized Gemini 3.5 models:
Fierce Falcon
- Optimized for speed and logical reasoning
- Likely the lightweight, fast-inference variant
- Designed for real-time applications
Ghost Falcon
- Excels in UI design and visual effects
- Strong audio creation capabilities
- Multimodal output generation
Snow Bunny (Core)
- The flagship full-capability model
- Deep reasoning and agentic workflows
- Maximum context and capability
Multimodal Mastery
Beyond text and code, Snow Bunny reportedly handles:
- Complex SVG generation for vector graphics
- Waveform-level audio tracks directly from prompts
- Advanced image understanding with reasoning
- Cross-modal generation (text → image → code workflows)
When Can We Expect Access?
Currently, Snow Bunny appears sporadically in A/B tests on Google AI Studio. Users have reported encountering the model unpredictably, with no official announcement from Google.
Based on Google's historical release patterns, a full public release, likely branded as Gemini 3.5 or Gemini 3.5 Ultra, could arrive within the next few weeks. Some speculate that the launch will be timed to coincide with the Super Bowl for maximum publicity.
What This Means for the AI Race
If these leaks are accurate, Google is about to leap ahead in several key areas:
- Reasoning depth that surpasses current OpenAI and Anthropic offerings
- Code generation capabilities that could transform software development
- Specialized variants offering optimized performance for specific use cases
The "Thinking Model" era that began with OpenAI's o1 is entering its next phase, and Google appears determined to lead it.
Note: Google has not officially confirmed these leaked specifications. All claims are based on reports from developers who encountered test versions through A/B testing.
