Google I/O 2026 recap: Gemini Omni, Spark, and Gemini 3.5

What Google announced on May 19, 2026 — and what actually deserves your attention

May 20, 2026 · 8 min read · Clipia

On May 19, 2026, Google held its I/O keynote, and for the first time in a couple of years the headline wasn't "yet another model." It was a paradigm pivot: Gemini is shifting from a tool you prompt and wait on into a system that sees, hears, generates, and gets work done on its own.

The short version: one model now makes video from any mix of text, images, audio, and video (Omni); another runs in the background 24/7 and closes tasks while you sleep (Spark); and the new core model, Gemini 3.5 Flash, is four times faster than the previous frontier model on output tokens per second.

Here's everything that landed at I/O 2026 — without the marketing fog — and what actually deserves your attention.

One note: we're already wiring Gemini Omni into Clipia — no separate Google subscription, no VPN. Subscribe to our Telegram channel to hear about it on launch day. Meanwhile, Clipia already runs Veo 3.1, Seedance 2.0, Kling 3.0 — same class of models.

Watch the keynote

A one-minute keynote sizzle — no commentary, just Gemini Omni results:

The full Google I/O 2026 keynote recording (~1h 50m) and Google's official 35-minute recap, both on Google's YouTube channel:

▶ Full keynote (1h 50m) · Condensed 35-minute recap

TL;DR — the 60-second version

Gemini Omni — multimodal "anything in, anything out" model. Inputs: text, images, audio, video. Output: up to 10 seconds of video with sound. Conversational voice editing.
Gemini Spark — AI agent that runs in the background 24/7 on Google Cloud, even with your laptop closed and your phone locked. Reads email, drafts in Docs, fills Sheets.
Gemini 3.5 Flash — new frontier model, 4× faster, beats 3.1 Pro on coding and multimodal benchmarks.
AI Search — search box now accepts images, files, video, and even Chrome tabs as input.
Smart Glasses — eyewear with Samsung, Gentle Monster, and Warby Parker, Gemini built in. US, this fall.
Wear OS 7, Googlebooks, Universal Cart, Neural Expressive — a wave of ecosystem updates.
AI Ultra — now $100/month, with a $200/month top tier.

Gemini Omni — the big story for content people

Gemini Omni takes any input and outputs video

This is the one model worth watching the full keynote for. Google calls it "create anything" — and it's not just marketing. Omni is natively multimodal: not three models behind a single API, but one neural network that takes a mix of formats and outputs video while understanding the whole context at once.

What goes in

Text — describe the scene, dialogue, action.
Images — up to 5 references: character, background, props.
Audio — voice, music, sound effects.
Video — an existing clip to remix or edit.

What comes out

Video up to 10 seconds long with natively generated audio. Not "effects on top" — a single scene where the model holds physics, character continuity, and logical flow.
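To make those limits concrete, here is a minimal Python sketch that checks a generation brief against the two numbers quoted above: at most 5 image references and at most 10 seconds of output. The `OmniBrief` class, its field names, and `validate_brief` are hypothetical and exist only for illustration. They are not Google's API, which is not public yet.

```python
from dataclasses import dataclass, field

# Limits quoted in this article; everything else is a hypothetical illustration.
MAX_IMAGE_REFERENCES = 5
MAX_OUTPUT_SECONDS = 10


@dataclass
class OmniBrief:
    """Hypothetical container for a mixed-input video brief (not a real API type)."""
    text: str = ""
    image_refs: list[str] = field(default_factory=list)
    audio_refs: list[str] = field(default_factory=list)
    video_refs: list[str] = field(default_factory=list)
    output_seconds: float = 10.0


def validate_brief(brief: OmniBrief) -> list[str]:
    """Return a list of human-readable problems; an empty list means the brief fits."""
    problems = []
    has_input = brief.text or brief.image_refs or brief.audio_refs or brief.video_refs
    if not has_input:
        problems.append("brief needs at least one input (text, image, audio, or video)")
    if len(brief.image_refs) > MAX_IMAGE_REFERENCES:
        problems.append(
            f"too many image references: {len(brief.image_refs)} > {MAX_IMAGE_REFERENCES}"
        )
    if not 0 < brief.output_seconds <= MAX_OUTPUT_SECONDS:
        problems.append(f"output length must be between 0 and {MAX_OUTPUT_SECONDS} seconds")
    return problems
```

Returning a list of problems instead of raising on the first one lets a UI show every issue with a brief at once.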

The demos Google ran

A marble rolling through a maze — with believable bounce physics and audio for every hit, including the bell ringing at the finish.
A claymation explainer of protein folding — consistent animation, readable captions, recognizable style.
A professor at a chalkboard — writing out a trigonometric identity and explaining it aloud, with the text on the board staying legible.

Here's the key keynote demo — the marble with real physics and natively generated audio:

Video officially published by Google in the blog post "Introducing Gemini Omni". More demos in our deep dive into Omni.

The killer feature — Conversational Editing

This isn't "press a button, apply a filter." It's a dialogue with the model:

"Swap the character for a blonde."
"Make the lighting more cinematic."
"Pull the camera up."
"Stabilize the motion."

Omni holds the conversation's context and doesn't break the scene between edits. Faces, props, and lighting stay consistent across iterations.
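Google hasn't said how Omni keeps that context internally. From the client side, one way to picture it is a long-lived session that stores the original brief plus every edit so far, so each new instruction is read against the full history. The sketch below is an assumption for illustration only; `EditSession` and its methods are hypothetical names, not part of any Google product.

```python
from dataclasses import dataclass, field


@dataclass
class EditSession:
    """Hypothetical long-lived session: one base brief plus every edit so far."""
    base_prompt: str
    edits: list[str] = field(default_factory=list)

    def add_edit(self, instruction: str) -> None:
        # Append rather than replace, so earlier edits are never lost.
        self.edits.append(instruction)

    def undo(self) -> None:
        # Drop the most recent edit, if there is one.
        if self.edits:
            self.edits.pop()

    def context(self) -> str:
        """Full history a model would need to keep the scene consistent."""
        lines = [f"Scene: {self.base_prompt}"]
        lines += [f"Edit {i}: {text}" for i, text in enumerate(self.edits, start=1)]
        return "\n".join(lines)


session = EditSession("A marble rolling through a maze")
session.add_edit("Make the lighting more cinematic.")
session.add_edit("Pull the camera up.")
print(session.context())
```

Because the session keeps the whole edit list rather than only the latest instruction, undoing a step is a one-line operation.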

Safety — SynthID on every frame

Every video Omni produces is watermarked with an invisible SynthID signal. It can be checked later to identify AI-generated content. Bonus: OpenAI, Kakao, and ElevenLabs all signed on to the same scheme — an industry standard is forming.

Availability

Gemini Omni Flash — live now, on AI Plus at $20/month and up.
YouTube Shorts and YouTube Create — free, rolling out the same week as the announcement.
Omni Pro — teased, no firm date.

Want to try AI video without committing to a Google subscription? Compare models on Clipia →

Gemini Spark — the first agent that works while you sleep

Gemini Spark works in the background while the user sleeps

Yes, "AI that does everything for you while your laptop's closed" sounds like marketing. But the engineering underneath is real: Spark literally runs on virtual machines in Google Cloud, not on your device. So it actually keeps working when your iPhone is in your pocket and the MacBook lid is down.

What Spark does

Reads your Gmail inbox.
Fills Sheets with information scattered across Docs and email threads.
Coordinates meetings through Calendar.
Operates third-party services: Canva, OpenTable, Instacart.
Accepts voice tasks and queues several in parallel.

The demo that flipped the keynote

On stage, Google walked through Spark planning and organizing a full party single-handed: it pulled the guest list from email, booked a table via OpenTable, designed posters in Canva, drafted an invoice in Sheets, and sent invites. All from a single spoken brief.

Daily Brief — morning triage from your agent

Spark ships with Daily Brief: overnight, the agent works through your inbox, calendar, and tasks, and surfaces "what matters today" in the morning. Not a post-hoc AI summary — proactive triage.

What's under the hood

Spark runs on Gemini 3.5 Flash plus Antigravity, Google's platform for building agents. It checks in with you before high-stakes actions (for example, before a payment) and quietly handles the rest.
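The check-in rule is a familiar agent pattern: run routine steps automatically and pause for a human on anything risky. Here is a minimal sketch of that pattern in Python. It is not Spark's or Antigravity's actual code; `Action`, `run_actions`, and the contents of `HIGH_STAKES_KINDS` are hypothetical, and only "payment" comes from the example above.

```python
from dataclasses import dataclass
from typing import Callable

# Assumption: only "payment" is named in the article; a real agent would list more kinds.
HIGH_STAKES_KINDS = {"payment"}


@dataclass
class Action:
    kind: str
    description: str


def run_actions(actions: list[Action], confirm: Callable[[Action], bool]) -> list[str]:
    """Run low-stakes actions directly; ask `confirm` before high-stakes ones."""
    log = []
    for action in actions:
        if action.kind in HIGH_STAKES_KINDS and not confirm(action):
            # The user declined, so the action is skipped instead of executed.
            log.append(f"skipped: {action.description}")
            continue
        log.append(f"done: {action.description}")
    return log


plan = [
    Action("draft", "draft party invites"),
    Action("payment", "pay the restaurant deposit"),
]
# Here the user declines every confirmation request.
print(run_actions(plan, confirm=lambda action: False))
```

Passing `confirm` in as a callback keeps the policy (what counts as high-stakes) separate from the channel used to ask the user, whether that is a push notification, a chat message, or a voice prompt.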

Availability

Beta — week after I/O.
AI Ultra subscribers in the US only — $100/month.
iOS and Android — both at launch.
Global rollout and Chrome integration — later this year.

Gemini 3.5 Flash + Antigravity — the engine for everything else

Not the loudest announcement, but it's the model Omni and Spark stand on.

4× faster than the previous frontier on output tokens per second.
Beats 3.1 Pro on coding, agentic, and multimodal benchmarks.
Antigravity — the upgraded developer platform for building custom agents on top of Gemini.
Gemini 3.5 Pro is in private testing, with a public release expected next month.

For developers, the minimum time it takes to ship a working agent just dropped another notch.
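What "4× faster on output tokens per second" means in practice is simple arithmetic: for a response of fixed length, streaming time drops to a quarter. The baseline speed in this sketch is a made-up round number, not a published benchmark, and the calculation ignores time to first token and network latency.

```python
def generation_seconds(output_tokens: int, tokens_per_second: float) -> float:
    """Time to stream a response of a given length at a given output speed."""
    return output_tokens / tokens_per_second


# Hypothetical baseline speed, chosen only to make the arithmetic easy to follow.
baseline_tps = 100.0
faster_tps = 4 * baseline_tps  # the "4x faster" claim from the keynote

response_tokens = 2_000
before = generation_seconds(response_tokens, baseline_tps)
after = generation_seconds(response_tokens, faster_tps)
print(f"{before:.0f} s -> {after:.0f} s")  # prints "20 s -> 5 s"
```

For an agent that chains many model calls, that saving applies to every step, which is why output speed matters more for agents than for single chat replies.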

AI Search — drop anything into the search bar

Google rethought the search box. It's no longer "type here." It's an intelligent input surface:

Images — ask "what is this" or "where is this from."
Files — PDFs, docs, spreadsheets.
Video — "find me this moment in this clip."
Chrome tabs — search that's aware of your open pages.

Suggestions go past autocomplete: the system predicts intent, not just letters.

Paired with it: Universal Cart — a shared cart where Gemini watches prices, deals, and stock across stores. Utilitarian, but it turns Google from search into a shopping agent.

Smart Glasses — Gemini moves onto your face

Google and Samsung announced partnerships with two eyewear brands:

Gentle Monster — premium.
Warby Parker — mass US audience.

What's inside:

Voice access to Gemini.
Photos and video on command.
Real-time translation: talk across languages while the glasses overlay the translation.
Navigation overlay.

Ships fall 2026, US first. This isn't a Vision Pro replay: the play is lightweight, everyday wear.

Wear OS 7, Googlebooks, Neural Expressive — the ecosystem

The smaller announcements that together complete the picture:

Wear OS 7 — Gemini foundation for watches plus a new Create My Widget feature: describe a widget by voice, the system assembles it.
Googlebooks — premium Android laptops from Acer, ASUS, and Lenovo, Gemini baked in. Out this fall.
Neural Expressive — new design language for the Gemini app: fluid animation, saturated colors, haptic response. The "stern corporate" era is over.

SynthID becomes an industry standard

Quiet but important: OpenAI, Kakao, and ElevenLabs all signed on to SynthID — Google's invisible watermark for AI-generated content. That means more models worldwide will tag their video and audio the same way. For moderation platforms and journalism, this is a meaningful step toward a unified "is this AI" signal.

What this means for content creators

The big signal out of I/O 2026 is the death of the one-shot prompt. The old loop was: write a prompt, get an image or clip, finish it in an editor. The new one:

You edit video by voice in chat until it's the version you wanted.
An agent in the background pulls references, drafts shot lists, writes captions.
Search accepts a screenshot instead of a sentence.

This doesn't mean you should rush to migrate to Gemini. It means the interface to AI is shifting from prompt to conversation. Models and platforms that miss that transition will fall behind.

If you already work with AI video but don't want to lock into a single subscription — Clipia gives you Seedance 2.0, Kling 3.0, Veo 3.1, Nano Banana 2, and others side by side. No "either/or" plans: pay for credits, pick by task.

What Clipia is doing about this announcement

Let's not pretend to be neutral observers: I/O 2026 hits us directly.

Gemini Omni — we're plugging it in. Right now, Omni lives inside Google AI Plus at $20/month. There's no public API yet, but the day Google opens one, Omni shows up in Clipia without a separate subscription. In parallel, we're finishing infrastructure for conversational editing (a new UX pattern that needs long-lived sessions).

What's already running with us:

Veo 3.1 — cinematic model from the same Google DeepMind team that built Omni.
Seedance 2.0 — up to 9 references and 2K resolution.
Kling 3.0 — Multi-Shot and Motion Control.
Nano Banana 2 — best-in-class I2I setup.

One account, credits instead of subscriptions, no VPN.

To get the launch notification: → Clipia Telegram channel. We post the day any new model goes live.

Deep dive into Gemini Omni — separate article →

Bottom line: what to take from I/O 2026

Try Gemini Omni on AI Plus at $20/month — it's the most interesting model for creators in the last year. Or wait for it to appear on Clipia.
Wait for Spark to go global. It's US-only and $100/month for now, but we expect it to become the new normal everywhere within 3–6 months.
Get ready for conversational interfaces. "Apply filter" buttons are going away. Chat is what stays.
Watch SynthID. If you work with clients, transparency tagging is about to become a requirement.

Google didn't show "revolution tomorrow." They showed that the revolution is shipping — quietly, into production, behind a subscription. Which is, frankly, scarier than any hype cycle.

Subscribe to Clipia Telegram → · Try the models now →

Maksim Zakharov, Founder of Clipia.ai

Founder of an AI image and video generation platform with 50+ models including Veo, Kling, Seedance, and Midjourney. Personally tests every new model on real-world tasks, runs side-by-side comparisons, and writes in-depth reviews based on actual generations. Keeps articles updated after new model releases.
