30 days.
That's how long it took two of the most powerful AI models on the planet to make the exact same bet: move to Microsoft Excel and fight for financial workflows.
Claude Opus 4.6 shipped on February 5 with Excel integration, enterprise plugins for investment banking and FP&A, and PwC as an implementation partner. GPT-5.4 followed on March 5 with a ChatGPT for Excel add-in and live data connections to Moody's, S&P Global, FactSet, and LSEG. Same month. Same application. Different architectures, different ecosystems. Both aimed squarely at your team.
Meanwhile, Gemini 3.1 Pro quietly expanded to a 2-million-token context window. DeepSeek V3.2 kept driving prices down to a fraction of a penny per token. And the launch of Anthropic's enterprise agents sparked what traders called a "SaaSpocalypse": on February 3, approximately $285 billion of market capitalization disappeared in a single trading day across software and services stocks. Thomson Reuters recorded its biggest one-day decline ever. Salesforce and ServiceNow each fell about 7%. Intuit and Equifax lost more than 10% apiece.
The market wasn't betting on which model would win. It was pricing in a world where AI copilots built into existing tools replace the standalone software those companies sell.
For financial leaders, the question has changed. It is no longer "Should we adopt AI?" It's "Which models fit which workflows, and how do we manage the portfolio?"
What's actually inside the box
Both Excel copilots work from a side panel inside the workbook. Describe what you need in plain language, and the tool builds, updates, or debugs models using the formulas and structures already in your files. Both ask for permission before making edits. Both tie every proposed change to a specific cell. And both perform calculations natively in Excel rather than inside the model's black box. That last part is what makes both tools auditable.
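The approval-then-edit pattern described above can be sketched in a few lines. This is an illustrative mock, not either vendor's actual API: the function names, cell references, and formula are assumptions. The key idea is that the assistant proposes a native Excel formula, not a precomputed number, so the logic stays inspectable in the sheet.

```python
# Hypothetical sketch of the "auditable edit" flow: propose a formula
# tied to a specific cell, require explicit approval before applying.
# All names here are illustrative, not a real copilot API.

def propose_edit(cell: str, formula: str, rationale: str) -> dict:
    """Bundle a proposed change with the cell it touches and the reasoning."""
    return {"cell": cell, "formula": formula,
            "rationale": rationale, "approved": False}

def approve(edit: dict) -> dict:
    """Both tools require user sign-off before any edit lands in the workbook."""
    return {**edit, "approved": True}

# The assistant writes the logic; Excel performs the calculation natively,
# so a reviewer can trace the valuation instead of trusting a pasted number.
edit = propose_edit(
    cell="B7",
    formula="=NPV(B2, C5:G5) + C4",
    rationale="Replace hard-coded valuation with a live NPV formula",
)
edit = approve(edit)
print(edit["cell"], edit["formula"], edit["approved"])
```

Writing `=NPV(...)` rather than the evaluated result is the design choice the article calls auditable: the spreadsheet, not the model, remains the system of record for the math.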
The difference lies in the ecosystem around spreadsheets.
OpenAI went deep on data. The GPT-5.4 Excel add-in connects directly to Moody's, S&P Global, Dow Jones Factiva, LSEG, MSCI, and FactSet. Analysts can pull credit metrics, returns, transcripts, and market data without switching windows. The company also shipped reusable "skills" for routine financial tasks: earnings previews, DCF analysis, comparables, and draft investment notes. The positioning is clear: OpenAI is building a financial terminal inside the chatbot.
Anthropic went broad on workflow. Claude in Excel is part of a wider enterprise push that began with Claude for Financial Services. In late February, Anthropic launched pre-built agents for financial analysis, equity research, private equity, and asset management. PwC, Accenture, and Deloitte have signed on as implementation partners. Last week the company launched a marketplace where enterprise customers can apply existing committed spend to Claude-powered tools from partners like Snowflake and Harvey. The positioning is equally clear: Anthropic is building an operating system for enterprise knowledge work.
Two architectures. Two strategies. The spreadsheet is just an entry point.
The 72/14 problem
Here is the statistic that should reframe every model comparison you read this year: in an RGP survey of 200 US CFOs, 72% are already using AI tools. Only 14% report clear, measurable ROI.
The barrier is not model intelligence. Only 10% of surveyed CFOs fully trust their company's data. 86% say legacy systems limit their AI readiness. 68% cite the skills gap as their biggest challenge.
Closing that gap is a question of workflow fit, not headline benchmarks.
There is a common pattern among the organizations that close it: they route specific models to specific workflows rather than standardizing on a single vendor. Brex uses an Opus-based system to automate 75 percent of expense transactions, hit a 94 percent policy-compliance rate, and save roughly 169,000 hours per month. TELUS runs over 13,000 internal AI tools built on Claude, saving 500,000 hours and realizing approximately $90 million in benefits. Lloyds Banking Group expects around £100m in value from agentic AI this year. BNY Mellon runs 117 agentic tools in production across operations and risk.
These are not "ChatGPT vs. Claude" stories. They are architecture stories. The model is one variable of four; the other three are integration fit, governance posture, and data readiness.
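The routing pattern behind these deployments can be made concrete with a small sketch. The routing table, model names, and governance rule below are illustrative assumptions, not any organization's actual configuration; the point is that model choice is a per-workflow policy decision, with governance able to override cost.

```python
# Minimal sketch of per-workflow model routing with a governance override.
# Routes and policies are hypothetical examples, not vendor recommendations.

ROUTES = {
    "financial_modeling": "claude-opus-4.6",  # deep Excel integration
    "market_diligence":   "gpt-5.4",          # live data-provider connections
    "bulk_extraction":    "deepseek-v3.2",    # cheapest per token
}

# Governance posture: restrict models with poor injection resistance
# to workflows that never touch external documents.
POLICY = {"deepseek-v3.2": {"external_data_allowed": False}}

def route(task_type: str, touches_external_data: bool) -> str:
    """Pick a model by workflow, letting governance veto the cheap default."""
    model = ROUTES[task_type]
    allowed = POLICY.get(model, {"external_data_allowed": True})
    if touches_external_data and not allowed["external_data_allowed"]:
        return "gpt-5.4"  # fall back to a hardened model for external data
    return model

print(route("bulk_extraction", touches_external_data=False))  # cheap path
print(route("bulk_extraction", touches_external_data=True))   # governed fallback
```

In practice the routing table lives in infrastructure, not application code, but the shape is the same: the model is one field in a policy record, not a company-wide standard.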
The governance layer you cannot skip
Choosing a model for finance is not just a capability question. It is a vendor-risk question. Two developments over the past few weeks illustrate why.
The same week Anthropic confirmed its ARR had nearly doubled to $19 billion since late 2025, Secretary of Defense Hegseth designated the company a supply-chain risk, a label historically reserved for foreign adversaries. The dispute centers on restrictions Anthropic has pushed for on military uses of its AI. For finance teams evaluating long-term vendor commitments, this introduces a variable no benchmark measures.
On cost efficiency, DeepSeek V3.2 is attractive: at $0.55 per million tokens, it is 10-25x cheaper than proprietary alternatives. But a September 2025 NIST assessment found that DeepSeek's model complied with 94 percent of malicious jailbreak attempts, versus 8 percent for a US reference model. Agents built on DeepSeek were 12 times more likely to fall for hijacking attacks, in which malicious instructions embedded in an external document redirect the agent away from its task. For workflows that touch external data, that is a serious governance problem.
And then there is cost architecture. GPT-5.4 runs at $2.50/$15 per million tokens at standard rates, but pricing doubles once input exceeds 272,000 tokens. Opus 4.6 starts at $5/$25, but the moment a request exceeds 200,000 input tokens it hits the "200K cliff," where the entire request reprices to $10/$37.50. Both models penalize sloppy context management. Architectural decisions about how prompts are structured and token budgets are managed can swing annual costs by orders of magnitude.
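The cliff arithmetic is worth seeing in numbers. The sketch below uses the per-million-token rates quoted in this article; the whole-request repricing behavior and the doubled GPT-5.4 over-threshold rate are taken from the description above and should be treated as assumptions, not official pricing.

```python
# Rough cost model for the repricing cliffs described above.
# Rates are $ per million tokens as quoted in the article; the
# "entire request reprices" behavior is an assumption from the text.

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    if model == "opus-4.6":
        # Past 200K input tokens, the WHOLE request reprices, not just the excess.
        over = input_tokens > 200_000
        in_rate, out_rate = (10.0, 37.50) if over else (5.0, 25.0)
    elif model == "gpt-5.4":
        # Article says the price doubles past 272K input tokens.
        over = input_tokens > 272_000
        in_rate, out_rate = (5.0, 30.0) if over else (2.50, 15.0)
    else:
        raise ValueError(f"unknown model: {model}")
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

# One token over the cliff roughly doubles the bill for the whole request:
under = request_cost("opus-4.6", 200_000, 5_000)
over = request_cost("opus-4.6", 200_001, 5_000)
print(f"${under:.2f} under the cliff vs ${over:.2f} one token over")
```

Multiply that per-request delta across thousands of daily calls and the case for aggressive context trimming, summarization, and caching makes itself.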
Three takeaways
- The competition is the story. Two frontier labs shipped Excel copilots within the same 30-day window. Gemini is expanding context. DeepSeek is compressing cost. That competitive speed benefits finance teams, but only if models are evaluated on workflow fit rather than headline benchmarks.
- The 14 percent with demonstrated ROI match models to processes. They don't standardize on one vendor. They route modeling to one tool, diligence to another, and bulk extraction to a third, with governance spanning all of it.
- Governance is the real selection criterion. Data integration, token economics, vendor-risk posture, and enterprise controls will determine which models remain in the stack long after today's benchmark scores are forgotten.
