Agents Week Hit Different
Cloudflare dropped Agents Week this week — six announcements, all focused on making their platform the best place to build AI agents. I’d been building Agent V3 on saltwaterbrc.com for weeks (9 tools, 9 Cloudflare products, autonomous planning). When I saw the announcements, the question wasn’t “should I integrate these?” — it was “how fast can I ship?”
The answer: same day. Three new integrations. The agent went from 9 tools to 11, from 9 products to 12, and gained an entirely new interface (voice). Here’s what we added and why each one matters.
Integration 1: Browser Run — The Agent Can Visit Websites
Before: The agent had web_search — it called the Brave Search API and got back titles, URLs, and snippets. Like reading Google results without clicking any links.
After: The agent has browse_web — it opens a real headless Chrome browser on Cloudflare’s network, navigates to any URL, waits for JavaScript to render, and extracts the full page content.
How It Works
Browser Run (formerly Browser Rendering) runs headless Chrome instances across Cloudflare’s global network. We use the Quick Actions REST API — no Puppeteer binding needed, just a fetch() call. Three modes:
- Markdown mode: Renders the page, extracts content as clean markdown. Perfect for reading articles, documentation, pricing pages.
- Screenshot mode: Captures a visual PNG of the page, stores it in R2. Perfect for “show me what this looks like.”
- Scrape mode: Extracts specific HTML elements using CSS selectors. Perfect for pulling pricing tables or specific data points.
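To make the three modes concrete, here's a minimal sketch of how a tool might call the Quick Actions endpoints with plain fetch(). This is an illustration, not the site's actual implementation: the endpoint paths and the `elements` selector field follow the Browser Rendering REST API, and `accountId` / `token` stand in for values you'd pull from your Worker's env bindings.

```typescript
// Sketch: one Quick Actions REST call per mode, no Puppeteer binding needed.
type Mode = "markdown" | "screenshot" | "scrape";

// Map a mode to its endpoint and request body. Scrape mode takes CSS selectors.
export function buildQuickAction(mode: Mode, url: string, selector?: string) {
  const body: Record<string, unknown> = { url };
  if (mode === "scrape" && selector) body.elements = [{ selector }];
  return { endpoint: mode, body };
}

// Fire the request against the Browser Rendering API.
export async function browse(
  accountId: string,
  token: string,
  mode: Mode,
  url: string,
  selector?: string,
) {
  const { endpoint, body } = buildQuickAction(mode, url, selector);
  const res = await fetch(
    `https://api.cloudflare.com/client/v4/accounts/${accountId}/browser-rendering/${endpoint}`,
    {
      method: "POST",
      headers: { Authorization: `Bearer ${token}`, "Content-Type": "application/json" },
      body: JSON.stringify(body),
    },
  );
  return res.json(); // markdown text, screenshot bytes, or scraped elements
}
```

The nice part of the REST surface is that the tool stays a thin wrapper: the mode picks the endpoint, and the response shape is all the tool has to handle.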
Why It Matters
The agent can now do things like:
- “Go to AWS Lambda’s pricing page and summarize the free tier” — it actually visits the page, reads the rendered content, and synthesizes an answer
- “Take a screenshot of competitor.com” — captures a visual, stores it in R2, shows it inline in chat
- “What are the h2 headings on this blog post?” — scrapes specific elements without loading the full page
This is the jump from “can search the web” to “can browse the web.” Your agent stops being limited to what search engines index.
The Security Layer
SSRF protection is built in — the tool validates URLs, blocks private IPs, localhost, and internal network ranges. Only public HTTP/HTTPS URLs are allowed. Screenshots are stored with session isolation in R2 (each session gets its own prefix, same pattern as image generation).
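A simplified sketch of that URL guard looks like the following. This is not the site's actual code, just the standard allowlist shape: public http(s) only, with loopback, RFC 1918, link-local, and IPv6 loopback hostnames rejected. A production guard would also re-check the resolved IP address to defeat DNS rebinding.

```typescript
// Hostname patterns that should never be fetched by a server-side browser.
const PRIVATE_PATTERNS = [
  /^localhost$/i,
  /^127\./,                        // loopback
  /^10\./,                         // RFC 1918
  /^192\.168\./,                   // RFC 1918
  /^172\.(1[6-9]|2\d|3[01])\./,    // RFC 1918 (172.16.0.0/12)
  /^169\.254\./,                   // link-local / cloud metadata endpoints
  /^0\./,
  /^\[?::1\]?$/,                   // IPv6 loopback
];

// Allow only public HTTP/HTTPS URLs.
export function isAllowedUrl(raw: string): boolean {
  try {
    const u = new URL(raw);
    if (u.protocol !== "http:" && u.protocol !== "https:") return false;
    return !PRIVATE_PATTERNS.some((p) => p.test(u.hostname));
  } catch {
    return false; // not a parseable URL at all
  }
}
```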
Integration 2: Voice Agents — Talk to the Agent
This one changes the interaction model entirely. Instead of typing into a chat box, you speak. The agent listens, thinks, and speaks back.
How It Works
Cloudflare released @cloudflare/voice — an experimental voice pipeline for the Agents SDK. It adds real-time speech-to-text and text-to-speech to any Agent class using a mixin pattern:
withVoice(Agent) → VoiceAgent with STT + TTS pipeline
The architecture:
- Your mic captures audio (16kHz mono PCM)
- Deepgram Flux (via Workers AI) transcribes continuously — the model itself detects when you stop talking
- onTurn() fires with your transcript and runs the LLM
- Deepgram Aura (via Workers AI) synthesizes the response to audio, sentence by sentence
- Your speakers play the streamed audio
All of this happens over a single WebSocket connection to a Durable Object. The same Durable Object, same tools, same SQLite conversation history. Voice is just another input/output channel.
What We Built
A new /voice page on saltwaterbrc.com with a dedicated voice interface — status ring (listening/thinking/speaking), audio level meter, real-time transcript, mute controls. The server-side agent uses withVoice(Agent) with pipeline hooks for noise filtering (drops transcripts under 3 characters) and TTS text cleanup (spells out abbreviations, strips markdown formatting).
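The two pipeline hooks are just string processing, so here's an illustrative version of each. The hook names and wiring in @cloudflare/voice may differ (the package is experimental); the abbreviation table below is a made-up sample, not the site's actual list.

```typescript
// Sample abbreviation expansions so TTS doesn't read "R2" as "r-two... r?".
const ABBREVIATIONS: Record<string, string> = {
  R2: "R two",
  STT: "speech to text",
  TTS: "text to speech",
};

// Noise filter: drop transcripts under 3 characters (coughs, "uh", stray noise).
export function shouldHandleTranscript(text: string): boolean {
  return text.trim().length >= 3;
}

// TTS cleanup: strip markdown formatting and expand abbreviations so the
// synthesized speech sounds natural.
export function cleanForTts(text: string): string {
  let out = text
    .replace(/[*_`#]+/g, "")                   // emphasis, code ticks, headings
    .replace(/\[([^\]]+)\]\([^)]*\)/g, "$1");  // [label](url) -> label
  for (const [abbr, spoken] of Object.entries(ABBREVIATIONS)) {
    out = out.replace(new RegExp(`\\b${abbr}\\b`, "g"), spoken);
  }
  return out.trim();
}
```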
Why It Matters
Some of the best use cases for agents aren’t text-first. You’re on a commute, juggling tasks, or just want to have a conversation. Voice makes the agent accessible in contexts where typing doesn’t work.
The key insight from the Cloudflare blog: “Voice should not require a separate stack.” The agent doesn’t split into a “text version” and a “voice version.” It’s one agent, one Durable Object, multiple interfaces. User starts typing, switches to voice, goes back to text — same conversation, same state.
Integration 3: Workflows V2 — Durable Multi-Step Research
This is the reliability upgrade. Agent V3’s autonomous loop was impressive but fragile — if the LLM planned 6 steps and step 4 failed, the whole chain was lost. No checkpoint, no retry, no recovery.
How It Works
Cloudflare Workflows is a durable execution engine. You define steps, each step checkpoints its result, and the engine handles retries and recovery:
Step 1: Browse competitor1.com/pricing → ✅ checkpoint saved
Step 2: Browse competitor2.com/pricing → ✅ checkpoint saved
Step 3: Browse competitor3.com/pricing → ❌ timeout → retry → ✅ checkpoint saved
Step 4: Analyze all content with LLM → ✅ checkpoint saved
Step 5: Generate HTML report in R2 → ✅ checkpoint saved
If Step 3 fails, the workflow retries Step 3. Steps 1 and 2 don’t re-run because their results are already saved. If the Worker restarts mid-execution, the workflow picks up exactly where it left off.
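The semantics are easy to model in a few lines. The toy `doStep` below is not the Workflows engine (which persists checkpoints to durable storage behind its `step.do(name, callback)` API and survives Worker restarts); it's an in-memory sketch that shows the two behaviors that matter: completed steps are skipped on re-entry, and failed steps retry up to a limit.

```typescript
type Checkpoints = Map<string, unknown>;

// Run a named step durably: return the checkpointed result if the step
// already succeeded, otherwise execute it with retries and checkpoint on success.
export async function doStep<T>(
  checkpoints: Checkpoints,
  name: string,
  fn: () => Promise<T>,
  retries = 2,
): Promise<T> {
  if (checkpoints.has(name)) return checkpoints.get(name) as T; // already done: skip
  let lastErr: unknown;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      const result = await fn();
      checkpoints.set(name, result); // checkpoint on success
      return result;
    } catch (err) {
      lastErr = err; // retry
    }
  }
  throw lastErr; // out of retries; earlier checkpoints survive for the next run
}
```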
What We Built
A new deep_research tool that the agent can invoke for complex, multi-URL research tasks. The agent provides a task description and a list of URLs (up to 5). Behind the scenes, a Workflows instance spins up that:
- Validates all URLs
- Browses each URL via Browser Run (one durable step per URL, with 2 retries each)
- Analyzes all content with Workers AI
- Generates a styled HTML report and stores it in R2
- Returns the download URL
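The steps above can be sketched as a workflow body. This is written against a minimal `step.do` interface rather than the real WorkflowEntrypoint class, and `browsePage`, `analyze`, and `storeReport` are hypothetical injected dependencies (standing in for Browser Run, Workers AI, and R2 calls) so the flow itself stays visible and testable.

```typescript
interface Step {
  do<T>(name: string, fn: () => Promise<T>): Promise<T>;
}

interface Deps {
  browsePage(url: string): Promise<string>;
  analyze(pages: string[]): Promise<string>;
  storeReport(html: string): Promise<string>; // returns the download URL
}

export async function runDeepResearch(step: Step, urls: string[], deps: Deps): Promise<string> {
  if (urls.length === 0 || urls.length > 5) throw new Error("expected 1-5 URLs");
  // One durable step per URL: a failed page retries without re-browsing the others.
  const pages: string[] = [];
  for (const [i, url] of urls.entries()) {
    pages.push(await step.do(`browse-${i}`, () => deps.browsePage(url)));
  }
  const analysis = await step.do("analyze", () => deps.analyze(pages));
  return step.do("store-report", () => deps.storeReport(`<html><body>${analysis}</body></html>`));
}
```

The design point is the per-URL step granularity: wrapping all five page visits in one step would mean one timeout re-fetches everything.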
Why It Matters
This is the difference between “demo agent” and “production agent.” Real-world research tasks involve multiple pages, each of which might time out or fail. Without durable execution, you’re building on sand. With Workflows V2, every step is a checkpoint, every failure gets a retry, and the agent can handle tasks that take minutes, not seconds.
The Medium-Lift Items
Three other Agents Week announcements are on our radar:
Registrar API (Beta): Agents can now search for and register domains programmatically. We could add a check_domain tool that lets the agent check domain availability and pricing. Fun demo, practical for prospect conversations.
Project Think (Next-Gen Agents SDK): A preview of the next generation of the Agents SDK — a batteries-included platform for agents that “think, act, and persist.” When this goes GA, we’ll migrate Agent V3 to the new SDK. For now, we’re watching the preview closely.
Agent Lee (In-Dashboard Agent): Not something we integrate — it’s Cloudflare’s own AI agent built into the dashboard. Manage your stack via prompts instead of clicking through tabs. We’ve been using it to manage our own infrastructure and it’s already saving time on routine tasks.
The Scorecard
After Agents Week integrations, here’s where saltwaterbrc.com stands:
| Before | After |
|---|---|
| 9 tools | 11 tools |
| 9 Cloudflare products | 12 Cloudflare products |
| Text-only interface | Text + Voice |
| Fragile multi-step chains | Durable Workflows with checkpointing |
| Can search the web | Can browse the web (real headless Chrome) |
The new products in the stack: Browser Run, Workflows, and Workers AI voice models (Deepgram Flux STT + Deepgram Aura TTS).
What’s Next
The agent is getting smarter, more reliable, and more accessible with every iteration. The V1 → V2 → V3 progression was about capability (stateless → stateful → autonomous). The Agents Week upgrades are about robustness and reach — the agent can now browse the real web, speak with you, and run complex tasks without breaking.
If you’re building on Cloudflare’s platform, Agents Week is a signal: this is where the investment is going. The developer platform and the AI platform are converging. Every product announcement this week was about making agents better — faster compute, real browsers, durable execution, voice, identity, payments.
Try the agent at saltwaterbrc.com/agent or the new voice interface at saltwaterbrc.com/voice.