
After the Summer Lull: AI’s Agentic Arms Race Heats Up

Enterprise AI shifts from copilots to production agents embedded in ERP/CRM. Cohere’s $500M, Oracle×Gemini in OCI & Fusion, Salesforce’s Agentforce at scale, Kyndryl’s “100 agents/100 days,” GPT-5, Anthropic’s 1M-token context, and MCP.

August is usually a slow month for enterprise tech. Not this year. In the span of days we saw Cohere add half a billion dollars to its war chest, Oracle fold Google’s Gemini models into OCI and Fusion apps, Salesforce push agents from “pilot” to product, Kyndryl prove that systems integrators can industrialise agent delivery, OpenAI ship GPT‑5, and Anthropic respond with longer context windows and sharper protocol strategy. The centre of gravity in AI is sliding from one‑off copilots to production agents embedded in the systems that run firms. The practical consequences are architectural as much as they are commercial.

The through‑line: models, orchestration and business applications are becoming a single purchase. A year ago you bought a model, then picked a framework and hoped a vendor marketplace had the right connectors. Today the model ships with a control plane, the control plane ships with a marketplace, and the marketplace is being absorbed by the ERP/CRM you already pay for. That is the agentic stack.

Cohere raises $500M and aims it squarely at enterprise agents

Cohere’s latest round values the company at $6.8 billion and brings in heavyweight operators (Joelle Pineau as Chief AI Officer; François Chadwick as CFO). The investor list—Radical, Inovia, AMD, NVIDIA, PSP, Salesforce Ventures—reads like a who’s who of the enterprise AI supply chain. It is not a bet on consumer chat; it is a bet on operational AI.

What Cohere is trying to win is the part of the market where retrieval, privacy, latency and predictable cost matter more than theatrical benchmarks. “North,” its enterprise agent platform, is the organising idea: vertical agents that live next to knowledge bases and transaction systems, speak your taxonomy, and can be governed—who can call what tool, on whose data, under which policy. The capital buys time to harden that proposition: better embeddings and filtering for RAG, safer action models, and the unglamorous features that make a procurement team say yes—SLA clarity, audit trails, and crisp TCO stories.
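A governance gate of the kind described above, "who can call what tool, on whose data, under which policy", can be surprisingly small. Here is a minimal sketch in Python; the roles, tools and data domains are invented for illustration, and this shows the pattern only, not Cohere's or North's actual API.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Policy:
    # which roles may invoke which tools, and on which data domains
    allowed_tools: dict    # role -> set of tool names
    allowed_domains: dict  # role -> set of data domains

def authorize(policy, role, tool, domain):
    """Return True only if this role may call this tool on this data domain."""
    return (tool in policy.allowed_tools.get(role, set())
            and domain in policy.allowed_domains.get(role, set()))

# Illustrative policy: a support agent may search the knowledge base and
# open tickets, but only over public docs and the ticketing domain.
policy = Policy(
    allowed_tools={"support_agent": {"search_kb", "create_ticket"}},
    allowed_domains={"support_agent": {"public_docs", "tickets"}},
)

assert authorize(policy, "support_agent", "search_kb", "public_docs")
assert not authorize(policy, "support_agent", "issue_refund", "payments")
```

Because the policy is data rather than code buried in prompts, it is the kind of thing an audit trail can record and a procurement team can review.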

This is also a signalling round. It tells buyers that the “third cloud” of enterprise LLMs will not be a two‑horse race, and it tells system integrators they can safely standardise on Cohere for accounts that will never permit data to leave a particular geographic or contractual boundary.

Oracle invites Gemini into OCI and Fusion — AI moves into the system of record

Oracle and Google agreed to let Oracle sell and serve Gemini models inside OCI and to surface them in Fusion Cloud Applications (finance, HR, supply chain). Two details matter for practitioners:

  • Commercial portability. Customers can pay for Gemini with Oracle cloud credits. That collapses weeks of legalese and vendor onboarding into something the cloud ops team already understands.

  • Placement. AI is moving down the stack. Instead of stitching a bot onto the side of an ERP, the model sits inside workflow screens and policy boundaries your auditors already review. That reframes risk: the integration map gets simpler; the change‑management plan gets harder because the work itself changes.

For multi‑cloud enterprises this deal reduces a classic trade‑off. You do not have to pick Oracle’s SaaS or Google’s models; you can have both and keep billing and data residency where they already are. The technical work shifts to governance—deciding which processes are safe for generation and action, and which remain deterministic.

Salesforce turns agents from experiments into operations

Salesforce’s Agentforce now ships with hundreds of prebuilt agents and, more importantly, the managerial plumbing to run them at scale: an agent command centre, observability, and explicit controls for who can do what. Salesforce’s own help site has handled more than a million requests autonomously—a useful proof that the economics work when the surface area is large and the tasks repeatable.

Enterprises do not buy agents; they buy outcomes that survive month‑end close. The emerging pattern on Salesforce looks like this: start with narrow, high‑volume tasks (case triage, entitlement checks, appointment scheduling), push results and rationales to the customer record, and let managers adjust thresholds and hand‑off rules without filing a ticket with engineering. The result is not headcount elimination so much as queue compression and fewer hand‑offs. Where firms do see savings is in reduced swivel‑chair between systems, lower rework, and faster time to entitlement—all of which show up in the same dashboards finance already trusts.

The strategic point is simpler: the big SaaS vendors are productising agents, not just offering a marketplace. That raises the bar for custom builds; the gap you must clear is no longer a blank page but a working baseline.

Kyndryl demonstrates the factory model: 100 agents in 100 days

Systems integrators succeed when they can turn one successful pattern into a thousand deployments. Kyndryl’s collaboration with Google Cloud—delivering “100 AI agents in 100 days”—is a useful data point for how that looks in practice. The work is part engineering, part operations: curate a library of patterns, stand up a control plane that treats agents like microservices, establish golden paths for identity, secrets and logging, and teach client teams to own the run‑time.

The lesson for buyers is not the headline number; it is the factory method behind it. When agents become composable units, the constraint shifts from model capability to environment readiness: permissions, action scopes, and how quickly business owners can provide labelled examples of “good” versus “bad” outcomes. The integrator that masters those human loops will own the next phase of digital transformation.

OpenAI’s GPT‑5: more capability, less theatre

OpenAI’s GPT‑5 arrives with cleaner reasoning, better control over style and format, stronger multi‑file coding, and a candid focus on reducing sycophancy. There is also a Pro tier for those who need sustained throughput and enterprise‑grade knobs. The subtext is that OpenAI understands where the competitive fight now sits: not in parlour tricks but in predictable behaviour under governance.

Technically, GPT‑5 is easier to aim. Combined with a tools interface and an actions model, it plans and executes over longer horizons without becoming opaque to the operator. Practically, it shortens the time between a product manager writing a policy and an agent adhering to it. You still need retrieval, guardrails and observability, but you spend less time second‑guessing the model’s intent and more time designing the workflow.

Anthropic’s counter: longer context, protocol discipline, and a reminder that APIs are political

Anthropic’s update pushes Claude Sonnet to 1 million tokens and ships Opus 4.1 with faster, sharper reasoning. The pricing nudges are transparent, and the message is clear: if you want to keep entire repositories, contract estates or research corpora in‑memory, we can do it and we will charge for the privilege. In parallel, Anthropic doubled down on the Model Context Protocol (MCP), the cleanest attempt so far to standardise how models discover and invoke tools.

Then there is the API dispute: Anthropic cut back OpenAI’s access to Claude following claims of policy violations around internal testing. The details will fade; the principle will not. API access is now a strategic lever. Procurement and architecture teams should assume that vendor relationships can change mid‑programme and design for graceful degradation and model substitution from day one.
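Designing for graceful degradation and model substitution can be as simple as putting a fallback chain behind one interface. A minimal sketch, with invented provider names standing in for real SDK clients:

```python
def complete(prompt, providers):
    """Try each provider in order; fall through if one is cut off.

    `providers` is a list of (name, callable) pairs; each callable returns a
    string or raises. In production you would catch provider-specific errors
    and log the substitution for audit.
    """
    errors = {}
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:
            errors[name] = str(exc)
    raise RuntimeError(f"all providers failed: {errors}")

# Illustrative stand-ins for vendor SDK calls.
def vendor_a(prompt):
    raise ConnectionError("access revoked mid-programme")

def vendor_b(prompt):
    return f"answer to: {prompt}"

used, answer = complete("summarise contract", [("vendor_a", vendor_a), ("vendor_b", vendor_b)])
assert used == "vendor_b"
```

The point is not the six lines of retry logic; it is that the rest of the system only ever sees `complete`, so a vendor relationship changing mid‑programme becomes a configuration change rather than a rewrite.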

Interoperability grows teeth: MCP becomes the common bridge

MCP matters because it lowers the integration tax. Tooling becomes model‑agnostic; models become swappable without rewiring the firm. OpenAI’s adoption of MCP, DeepMind’s support, and the emergence of SDKs across Python, TypeScript and C# are the kind of dull, cumulative progress that changes outcomes. In practice, MCP servers expose familiar enterprise assets—databases, document stores, Git, ticketing—under a consistent contract. MCP clients (your agents) negotiate capability and permission without bespoke glue code.

Two caveats. First, a standard is only as good as its governance. Expect divergence at the edges where monetisation meets control—streaming versus batched tool calls, tracing formats, and how much context a model can demand per call before metering breaks budgets. Second, portability cuts both ways; it also lowers switching costs for your rival. The advantage will sit with teams that can instrument agents (not just models), capture outcomes, and feed those back into policy and prompt libraries that are unique to the firm.
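The discover‑then‑invoke contract is the heart of the idea. The toy registry below mirrors that pattern in plain Python so the shape is visible; it is not the MCP SDK or wire protocol, and all names are illustrative.

```python
class ToolServer:
    """A toy server in the spirit of MCP: tools sit behind a uniform
    contract, so any client (agent) can discover capabilities and invoke
    them without bespoke glue code per system."""

    def __init__(self):
        self._tools = {}

    def register(self, name, description, fn):
        self._tools[name] = {"description": description, "fn": fn}

    def list_tools(self):
        # Capability discovery: clients see names and descriptions, not code.
        return [{"name": n, "description": t["description"]}
                for n, t in self._tools.items()]

    def call(self, name, arguments):
        if name not in self._tools:
            raise KeyError(f"unknown tool: {name}")
        return self._tools[name]["fn"](**arguments)

# Expose a familiar enterprise asset under the contract.
server = ToolServer()
server.register("query_tickets", "Search the ticketing system",
                lambda status: [{"id": 1, "status": status}])

assert server.list_tools()[0]["name"] == "query_tickets"
assert server.call("query_tickets", {"status": "open"}) == [{"id": 1, "status": "open"}]
```

Swap the model behind the client and the registry does not change; that is the integration tax being lowered.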

What actually changes on Monday morning

Architecture. Treat agents as first‑class services. Give them identities, scopes and SLOs. Put them behind the same API gateways and observability stacks as everything else, and make rollback boring. Your golden path should include retrieval by default, a governed action layer, and, where appropriate, a memory strategy that combines short‑term buffers with durable traces for audit.

Data and control. The hard part is not model choice; it is designing interfaces with humans. Decide where a human remains in the loop, where they are on the loop (monitoring), and where they are out of the loop. Encode those positions as policy, not folklore, and test them like you test access control.

Cost. Long context is powerful but not free. A 1M‑token window changes feasibility for codebase‑wide refactors and portfolio‑level analysis; it also changes the unit economics of every run. Track spend at the agent and action level, not at the model level. You cannot optimise what you cannot see.

Procurement. The bundle is back. You will increasingly be offered a model, an orchestration layer, and a slate of prebuilt agents from the same vendor or a close ally. The right answer is rarely ideological. Take the bundle where it buys you time; insist on MCP‑grade escape hatches where it jeopardises your options.

People. The teams that win are the ones that turn change management into a muscle: clear communications to staff about what tasks will move to agents, explicit retraining for judgment work, and transparent metrics so people can see the improvement, not just feel the disruption.

Frequently asked questions

What’s the real advantage of a 1M‑token window?
It keeps the entire working set in play—monorepos, program histories, or a stack of contracts with their amendments—so the agent reasons across wholes rather than samples. You avoid the failure modes of brittle chunking while gaining traceability: the model can cite within the same session.

If vendors are embracing MCP, do I still need a bespoke integration layer?
Yes, but less of it. You will still maintain canonical connectors for core systems and your own policy enforcement. MCP reduces the bespoke code you write to let agents discover and invoke those capabilities, and it lets you move between models without re‑plumbing.

How should we think about lock‑in now that models live inside SaaS apps?
Lock‑in shifts from infrastructure to workflow. When the agent lives inside the CRM or ERP, the gravitational pull is real. The countermeasure is to keep your business logic and prompts in versioned repositories, prefer MCP‑compatible tools, and prove you can replay a critical workflow on two vendors before you scale it.

Where do pilots go wrong?
Two places: thin problem statements (“add an agent to support”) and hazy success criteria. Tight pilots describe the workflow, the data allowed, the action scope, and the target metric (cycle time, error rate, recovery value). They also plan for operations: alarms, rollback, and ownership.

What about safety and audit?
Treat agents like colleagues with superpowers. They must log what they saw, what they decided, and which tool they used. You do not need esoteric alignment research to be responsible; you need instrumentation, policy and the will to disable capabilities that cannot yet be governed.
