# The Models Trying to Fill the Fable Gap — Transcript (2026-06-18)

https://aidailybrief.ai/e/2026-06-18 · Listen: https://pod.link/1680633614

---

[00:00:00] Today on the AI Daily Brief, the models trying to replace Fable. Before that in the headlines, what we learned about AI and global politics at the G7. The AI Daily Brief is a daily podcast and video about the most important news and discussions in AI. All right, friends, quick announcements before we dive in. First of all, thank you to today's sponsors, KPMG, Section, Assembly, and OutSystems. To get an ad-free version of the show, go to patreon.com/aidailybrief, or you can subscribe on Apple Podcasts. To learn more about sponsoring the show, send us a note at sponsors@aidailybrief.ai.



You should also check out the new aidailybrief.ai. One of the big things that I have heard from folks is that they want easier ways to share specific parts of these episodes with folks inside their organizations.

So that's what we've tried to build with the new website. It divides every episode up into dozens of short, easily shareable cards.

Lastly today, there is a link down in the show notes to check out a preview of [00:01:00] something that is coming soon: training.besuper.ai. If you've been following along with the AIDB learning program journey, keep an eye there for some more announcements to come soon.

With that, let's talk G7. Now, spoiler alert: we do not have any particularly big updates when it comes to when we're getting Fable 5 back and the resolution between Anthropic and the US government. However, what we did have was a number of the key players all in the same room, as a slew of AI leaders joined the usual heads of state at this year's G7 meeting in France.



Sam Altman, Demis Hassabis, Meta's Alexandr Wang, and yes, Dario Amodei were all present as part of the US contingent. France brought along Mistral CEO Arthur Mensch, while Cohere CEO Aidan Gomez attended as part of the Canadian delegation. Another half dozen executives from regional AI champions also attended.



Now, it is not unusual for corporate executives to attend this sort of diplomatic and trade meeting, but it is certainly the first time the G7 has seen such a heavy representation from the AI industry.

And [00:02:00] frankly, their attendance makes even more sense in the context of the US government's effective banning of Mythos and Fable. At a meeting that is all about international cooperation, for the first time, the global community is reckoning with the idea that access to US-made frontier models is not a given.

Now, the pivotal discussion came at a closed-door lunch meeting focused on AI and innovation.



Flanking Donald Trump on either side were Google DeepMind's Demis Hassabis and OpenAI's Sam Altman, with Anthropic's Dario Amodei on the exact opposite side of the table, next to France's Emmanuel Macron.

At the meeting, Dario Amodei and Demis Hassabis reportedly led the call for international cooperation on AI risk, with the US taking the lead. In his address, Amodei said that international cooperation should include structured access to frontier models, chip trade deals that exclude China, and a unified approach to AI risks, including cyberattacks and bioterrorism. He urged G7 leaders to, quote, "resist the temptation to splinter over the deployment of advanced AI."

Meanwhile, Canadian Prime Minister Mark Carney, who has recently pushed sovereign AI policies and [00:03:00] called for cooperation between AI middle powers, agreed that the US could lead an AI coalition.

France's Macron voiced the concerns of European leaders, warning that the Trump administration had now made it clear that the US government holds the AI kill switch. He told reporters after the meeting that he made a forceful plea for the US to not keep frontier AI to themselves. Macron said the US and Europe have a shared interest in keeping the technology from authoritarian regimes.

"So let us move forward together," he commented. "Our relevant agencies must first cooperate so that in the areas of security and cybersecurity, we have a smooth government-to-government relationship."

Sam Altman was aligned on this view that AI is now in the domain of government, and regulation should not just be left to corporate policy alone. He said that the technology must be shaped by people, democratic institutions, and society as a whole, not just, in his words, by the companies building the most capable systems.

Altman added, "We need an international forum for discussion that establishes globally accepted standards for testing, provides expert and impartial analysis of capabilities and risks, and serves as a venue for cooperation among nations."

OpenAI Head of [00:04:00] Global Affairs Chris Lehane framed the discussion as moving towards international regulation. He said there really is a coalescing around a forum or a space for the different democratic countries to be able to work together to ultimately see if there's a way to establish some type of AI safety standards.

Lehane said the US would lead this body, adding, "The ability to generate or create standards would be an avenue or pathway helping to ensure ongoing and continued access to frontier models."

Now all of this is well and good, kind of platitudinal, but what do you expect from a G7 meeting?

But when it came to the rubber hitting the road, i.e., global access to Mythos, it doesn't appear that the US government gave any ground. Now, there's essentially no commentary on what the US delegation said during the meeting itself. President Trump made some classically generic comments in a press conference, stating that the meeting was, quote, "excellent," and that AI is, quote, "going to be the biggest thing ever. We have to be careful with it. It's both great and could be bad. We have to be careful with it, but we're leading China. We're leading the world on that."

The European commentary struck a very different tone.

Euronews framed the mood in EU policy circles as particularly sour. [00:05:00] They noted that European leaders expected to be discussing the need to form a united front against China and the need to rebuild AI supply chains to route around that Eastern superpower. Instead, they found themselves pleading for access to frontier models that are viewed as critical to securing shared financial infrastructure.

Thomas Regnier, the European Commission spokesperson for tech sovereignty, said, "We are a trusted partner. I would challenge you to find a more trusted partner than Europe."

We also got news that UK Prime Minister Keir Starmer had requested a carve-out for British nationals and companies from the export control restrictions and was denied.

And perhaps for that reason, even as they try to get access to Fable and Mythos, there is clearly a shift in European thinking as well. Italian politician Brando Benifei said it plainly, commenting, "The Anthropic kill switch shows that tech sovereignty was never abstract. The G7 should not lock allies into competing AI dependencies. Europe must cooperate with the US, Canada, and democratic partners, but from a position of strength."

Still, ultimately, it's becoming very clear that strength in the AI era comes from one thing: putting GPUs on [00:06:00] racks. And on that front, Europe is lagging badly behind.

In April, the European Commission unveiled a grand plan to build up to five AI gigafactories to support the training of frontier models. The only problem is that only 20 billion euros were committed to the project, which expects to deploy around 100,000. For comparison, the hyperscalers are on track to spend three times as much every month building out AI data centers in the US.





Now, when it came to the Mythos situation specifically, when President Trump was asked by a reporter about the negotiations with Anthropic, he said simply, "They're going fine." Looking over at Commerce Secretary Howard Lutnick, he reiterated, "Going fine."



Now, meanwhile, as this was all happening, reporting from Wired added some context to the China dimension of the export control ban.

The TLDR was that when Anthropic expanded access to Mythos a few weeks ago, one of the companies that got access was Korean telecom giant SK Telecom. The US government, concerned about ties to China, ordered the company to revoke SK Telecom's access a few days before the ban, adding at least some [00:07:00] credence to the China reasoning for the ban as opposed to just personality politics.

Now, what one thinks of SK Telecom's supposed connections to China is a different matter.

Citrini analyst Jukan writes, "My God, I'm honestly beyond disappointed with the Trump administration. SK Telecom has absolutely nothing to do with Huawei or China. In fact, the only Korean telecom operator that uses Huawei equipment is LG."

"In DC, 'China-linked company' is sometimes a real thing and sometimes an utterly thought-terminating process. When I heard it was a Korean company, I immediately thought SK Telecom and went, 'Ah, yes, the network where some of the most valuable IP in the entire AI hardware field is being transacted daily.'"



When all was said and done, I don't think anyone was particularly surprised at the amount of talk versus action from a G7 meeting. I think what people found notable was the extent to which the White House's actions around Anthropic have shifted the tone globally, and how little we got from the White House about any sort of timeline or sense of how things are actually going when it comes to the Anthropic issue. I would say that on average, most people who are watching this just had their timelines for [00:08:00] when we get Fable back extended, not shortened.



Now, moving back into the AI industry itself, we got a really big personnel move with legendary AI researcher Noam Shazeer leaving Google to join OpenAI. In 2017, Shazeer was one of the lead authors on the seminal research paper, "Attention Is All You Need," which introduced the transformer architecture and kicked off the entire LLM revolution. In 2021, after Google refused to release a chatbot of his design, Shazeer left the company to found Character.AI.

In 2024, Google rehired Shazeer as the technical lead on the Gemini project. And in order to retain Shazeer, Google spent $2.7 billion licensing Character.AI's technology in one of the first big acqui-hire deals of the modern AI era.



In short, Shazeer is one of an elite group of AI researchers that can command a multi-billion dollar investment, right up there with Andrej Karpathy, Noam Brown, and Ilya Sutskever. And yet, less than two years after Google paid up for Shazeer, he's already out the door.

Sam Altman has said this move has been a long time coming, posting, "Noam is one of the people I have [00:09:00] most wanted to work with since the very beginning of OpenAI. It only took 10 years. I think it will be worth the wait." OpenAI reportedly told employees that Shazeer would be working on creating new architectures for AI models. Google, meanwhile, was magnanimous in losing one of the world's preeminent researchers.

A spokesperson said, "We are grateful for Noam's meaningful contributions to Google over the years, and we wish him well." Still, for many, the news raises even more questions about the future of Google's AI roadmap. The rumor mill has been awfully quiet about the release of Gemini 3.5 Pro, which they said would be coming in June.

Yuchen Jin wrote, "Noam's leaving Google makes Gemini's future feel uncertain. More than one DeepMind person has told me Noam saved Gemini. There's even lore that he tweaked a few lines of training code and Gemini's quality instantly jumped. Gemini's coding ability still feels behind. I really hope Gemini can find its way back to its former glory. We need more model choices."



Now, one more little product update from OpenAI.

The quest to remove the side quests continues as ChatGPT announces that they'll be sunsetting Pulse. Pulse was introduced [00:10:00] last year and served as a daily AI briefing. Users could tune Pulse to generate relevant daily content based on their interests.

OpenAI said the feature would be removed within the next two weeks and encouraged users to build their own daily briefing using scheduled tasks.

Now, OpenAI is presenting this as an expansion of the feature set, coupling the removal of Pulse to the expansion of the more generalized scheduled tasks feature. As part of that expansion, scheduled tasks will now be available to all paid ChatGPT subscribers, even those on the cut-price Go tier.

Now, for some, this paints a very clear picture of what type of users OpenAI is prioritizing now. On the announcement thread, ChatGPT subscriber Diav wrote, "After sunsetting 4.5 and Pulse, will there be any reason to keep Pro subscription for someone who is not a coder and has zero interest in Codex?"

And the short answer is I'm not sure that OpenAI cares right now. That's gonna do it for this slightly extended edition of the headlines. Next up, the main episode.

One of the most important AI questions right now isn't who's using [00:11:00] AI, it's who's using it well.

KPMG and the University of Texas at Austin just analyzed 1.4 million real workplace AI interactions and found something surprising. The highest impact users aren't better prompt engineers. They treat AI like a reasoning partner.

They frame problems, guide thinking, iterate, and push for better answers. And the good news: these behaviors are teachable at scale.

If you're trying to move from AI access to real capability, KPMG's research on sophisticated AI collaboration is worth your time. Learn more at kpmg.com/us/sophisticated. That's kpmg.com/us/sophisticated.

Here's a harsh truth. Your company is probably spending thousands or millions of dollars on AI tools that are being massively underutilized. Half of companies have AI tools, but only 12% use them for business value. Most employees are still using AI to summarize meeting notes. If you're the one responsible for AI adoption at your company, you need Section.

Section is a platform that helps you manage AI transformation [00:12:00] across your entire organization.

It coaches employees on real use cases, tracks who's using AI for business impact, and shows you exactly where AI is and isn't creating value.

The result? You go from rolling out tools to driving measurable AI value. Your employees move from meeting summaries to solving actual business problems, and you can prove the ROI. Stop guessing if your AI investment is working. Check out Section at sectionai.com.

That's S-E-C-T-I-O-N-A-I dot com.

You know AssemblyAI for having the most accurate streaming speech-to-text out there.

But they just went a step further and launched a full voice agent API. The idea is simple: one connection and they handle everything, the listening, the thinking, the speaking. You just stream audio in and get your agent's voice response back. We're talking about things like outbound sales calls that actually qualify leads, customer support that handles complex requests without a script, and scheduling agents that sound like a human assistant, and you can build one in five minutes with one API.

And importantly, their streaming model is the best at catching all the stuff that [00:13:00] breaks on other voice agents, things like phone numbers, emails, names, and medical terms.

And for those of you who are still in experimentation mode, there are no contracts and unlimited concurrency, so you can actually test it out without any friction. Head to assemblyai.com/brief and try the live voice agent demo right there on the site. No sign-up needed.

This episode of the AI Daily Brief is brought to you by OutSystems, a leading agentic systems platform built for the enterprise. Organizations all over the world are building, orchestrating, and governing agentic systems on the OutSystems platform, and with good reason.

OutSystems' open and unified platform allows teams to architect, deliver, and scale governed agentic systems with agility.

Teams of any size and technical depth can use OutSystems to build, deploy, and manage AI apps and agents quickly and cost effectively without compromising reliability and security.

With OutSystems, you can rapidly launch ideas from concept to completion. It's the leading agentic systems platform that is unified, agile, and enterprise-proven, allowing you to accelerate growth, reduce operational friction, and deliver real enterprise [00:14:00] impact with AI. OutSystems. Build your agentic future.

As the discussions between Anthropic and the US government continue, we are firmly in the fallout phase of the Fable 5 loss.



Indeed, even by Monday, when Fable hadn't come back online by the beginning of the workweek and the markets opening, it was pretty clear that this was going to end up being a bigger fight than just a weekend annoyance.

Over at the G7 meetings this week, we started to see the geopolitical ramifications of the Fable shutdown, with Europe in particular and other US allies trying to figure out both where they fit within the US's prioritization and what they needed to do to, on the one hand, retain access to US models while also not being totally reliant on the US.

Over among AI builders, meanwhile, while the first couple days were disbelief and mourning, since then it's been all about what sort of systems we can MacGyver together to get close to Fable-type performance. Now, for organizations and enterprises, the question is even more interesting.



While few, if any, enterprises had actually shifted any sort of [00:15:00] meaningful workflows over to Fable, it was yet more fuel to the fire of needing to think beyond just blindly using whatever the most powerful state-of-the-art model is. Now, up until the Fable banning, the reason that conversation had started among enterprises was not a question of access, but a question of cost.

As agentic workloads actually came online, people's AI bills were going up in meaningful ways, and that led many, if not most, organizations using AI extensively to start to think about more comprehensive strategies that, again, weren't just slapping the most powerful model on top of every single use case they had.

And yet what's very clear, as we now come up on a week of Fable being gone, is that the question of the value of the state of the art is higher than ever. Now, for some, this is a market question. For a couple of months now, one of the lurking bear narratives among investors has been that if American frontier models remain expensive compared to cheaper Chinese models, at some point buyers would just shift their behavior, and all of a sudden that revenue that seemed, at least for a time, to be justifying the big infrastructure build-out might no longer be as durable.



For [00:16:00] individual organizations, though, who aren't thinking about market implications, it is a moment in which many are considering alternatives. Chubby on X pointed to headlines in Bloomberg and CNBC and wrote, "All the major news outlets agree. The biggest winner in the Anthropic controversy is open source." Whether it's Bloomberg, Fortune, or CNBC, the consensus is clear. As Bloomberg put it, "Making the model open means that companies, governments, or organizations with sufficient hardware can run it locally and never have to worry about it being yanked on a whim."

In short, what companies are recognizing is that using open weight or open source models is potentially not just a cost issue, but also one of predictability around access. If we're getting to the point where the power of AI is such that governments are going to have kill switches, that makes it very hard to build mission-critical workflows around.

As Citrini Research put it, "The risk of the government deciding that a model is too dangerous should only add to the reasons why open source models running on local hardware can be a reasonable alternative."



So what are some of the new model solutions, Chinese or otherwise, that companies are [00:17:00] starting to look towards? Well, the first thing that we should note is the Chinese labs themselves are certainly taking advantage of this particular moment.

Right around the same time that Fable came out, we got Kimi K2.7 Code.

The official account wrote, "Kimi K2.7 Code, our latest coding model, is now released and open sourced." Compared to K2.6, it saw about a 22% improvement on Kimi CodeBench V2, 11% on ProgramBench, and 31.5% on MLS Bench Light.

They argued that it had new reasoning efficiency, with 30% lower reasoning token usage compared to K2.6.



Clearly, cost is a consideration, both in terms of raw per-token cost, but also in terms of efficiency and how many tokens a model uses to solve a problem.

Unfortunately, for many who tried it, it seemed like the benchmarks didn't totally match the reality. As opposed to some past Kimi models, people aren't really raving about this one yet and are finding some issues.

VentureBeat argued that while teams who are using K2.6 in production right now can swap in K2.7 Code and immediately expect lower inference costs, for people who aren't using Kimi, this won't necessarily be the reason to switch over.

Putting a [00:18:00] fine point on that, on the Agent Arena leaderboard, K2.7 Code ranked 19th overall and only sixth among open models.

Another model that's getting some buzz in the last 24 hours or so is called Vibe Thinker 3B from Weibo AI, and this one has people talking because of the ease with which you can actually run it on local hardware.

Orcas108 wrote, "What is happening in AI? A three billion parameter model just put up coding benchmark scores in the same league as Claude Opus 4.5. Three billion. The weights are on Hugging Face, anyone can test it. I genuinely don't know if this is a breakthrough or if the benchmarks are broken." Now, I don't wanna assume everyone knows what a three billion parameter model is, but TLDR, it's very, very small.

The frontier models are now well into the trillions of parameters, so you're talking about something that is a tiny fraction of the size.

Now, it seems like what's going on here is that this is super tuned for reasoning and really bad at knowledge.

Software engineer Drew Black wrote, "This is territory I started researching too. It's great to see something out in the wild. Take a small model and crank its reasoning [00:19:00] power up to 11. Then knowledge can live outside the model in a database. Something like this reduces the hardware and power requirements needed to run it. Humans don't know everything all the time, so why should an AI model? It just needs the intelligence to figure things out."

So once again, here we have something that is not really at this stage for enterprises, but is all pointing in this similar trajectory of the viability of smaller and open models running on local hardware.



The Chinese open model getting the most buzz right now by far is Z.ai's GLM 5.2.

BridgeMind AI wrote, "Two days ago, the US banned Claude Fable 5. Yesterday, China dropped GLM 5.2. Today, 5.2 is number one on BridgeBench, and number one on reasoning, beating Fable 5 at one tenth of the cost and three hundred tokens per second. You cannot export control your way out of an open source race. The ban didn't slow China down."

Indeed, a lot of the early coverage has been around GLM 5.2 beating models like GPT on a variety of highly valuable tasks, including long horizon coding tasks, for a fraction of the [00:20:00] cost.

On the front-end code arena, arena.ai found GLM behind Fable 5, but ahead of all the Opus models. And on Design Arena, it even went ahead of Fable 5.

Now, in some tests, there was evidence of at least a little bit of benchmark maxing, in other words, where a model is specifically tuned to try to do well on benchmarks for exactly this sort of first impression. AI entrepreneur Bindu Reddy wrote, "GLM 5.2 is mind-blowingly good on benchmarks. Yes, it even beats Opus 4.8 and GPT 5.5 on some of them. However, it is also benchmaxxed. Internal evals have it behind them. Still a huge win for open source AI." And when it came specifically to certain use cases like design, a lot of folks are sharing examples where you don't have to trust the benchmarks and you can just see it with your own two eyes.

Hassan from Together wrote, "This model is insane at design. I asked GLM 5.2 and Opus 4.8 to build me a landing page, and you can't even tell the difference. GLM costs six cents while Opus costs [00:21:00] 49 cents. More than 6X cheaper while being faster and more token efficient. Another win for open source AI."

Now, obviously there is a lot of talk about this model being distilled from Anthropic. Pete Cooper wrote, "GLM 5.2 is absolutely convinced that it is actually Claude. When I tell it that it's GLM 5.2, it refuses to believe me."

So the argument here is not that all of a sudden everyone is going to run out and start using GLM 5.2, but that there is more consideration for doing so than there ever has been because of the gap left by Fable 5.

G Money wrote, "Is anyone running GLM 5.2 locally on a Mac Studio? How is it? Seriously considering a Mac Studio for the first time now."

Now, when it comes to actual changes in enterprise behavior, the maybe bigger news is that Microsoft is apparently considering using a locally hosted fine-tune of DeepSeek V4 to power Copilot CoWork.

Microsoft, as we know, has moved CoWork to usage-based pricing and is thus looking for a way to provide cheaper access to their enterprise customers.

Based on Axios reporting, it seems [00:22:00] like Microsoft isn't just fixated on DeepSeek. They're trying a variety of open source models as lower cost alternatives to the models that are coming out of Anthropic and OpenAI.

At the same time, this doesn't seem like a theoretical plan, with Axios reporting that Microsoft says it expects to make a lower cost model available in the coming weeks.

Reporter Deirdre Bosa wrote, "Was only a matter of time. DeepSeek already hosted on hyperscaler cloud since last year. Microsoft moving closer to it for enterprise just normalizes and gives cover for others to embrace and adopt it, which is already happening. Bigger question, does this give the Chinese stack a foot in the door in the US since DeepSeek is optimizing for Huawei chips?"



As an aside, Gail Wiener pointed out the absolute irony of this when it comes to US policy. The US government bans Fable-5 and Mythos-5 worldwide because frontier models are too dangerous to let foreigners touch. Won't even exempt the UK because the threat model says the weights themselves are a national security asset.

And simultaneously, the most deeply embedded US enterprise software company on Earth quietly fine-tunes a Chinese model and prepares to ship it inside the productivity stack of every Fortune 500 that runs [00:23:00] Microsoft 365.

Importantly, though, it's not just raw Chinese models that are potentially changing the business AI landscape. One of the models that we've talked about most recently, specifically around the token efficiency question, is Cursor's Composer 2.5. Now, this one was built on a foundation of one of the Kimi models, but was post-trained to be specifically good at coding tasks, with very impressive results in the benchmarks, scoring up in the range of Opus and GPT 5.5.

Now, of course, the most important thing is not just the benchmark scores, but the fact that it does so at a fraction of the cost of either of those comparable models.



And now that it's been a couple of weeks, we're starting to get reports from the ground around how good Composer is in practice, not just on the benchmarks. Talking about his experience with Composer 2.5, Ryan Shaw wrote, "I haven't bothered with anything else in weeks. Stronger than 5.5 medium oftentimes, even though I know nobody believes me."

Engineer Yasser writes, "2.5 for a dollar, it scored 65%. Fable for $12, it scored 70%. Why would I use Fable for only a 5% increase and pay 12X the [00:24:00] price?"
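The trade-off in that quote can be put into numbers. What follows is only a back-of-the-envelope sketch: the two (price, score) pairs are the ones quoted above, and the function and variable names are made up for illustration.

```python
# The only inputs are the two (dollars per run, score in percent) pairs from the quote.
def cost_comparison(cheap, pricey):
    """Compare two (dollars_per_run, score_percent) pairs.

    Returns the price ratio and the extra dollars paid per extra
    percentage point when moving from the cheap option to the pricey one.
    """
    cheap_cost, cheap_score = cheap
    pricey_cost, pricey_score = pricey
    price_ratio = pricey_cost / cheap_cost
    extra_dollars_per_point = (pricey_cost - cheap_cost) / (pricey_score - cheap_score)
    return price_ratio, extra_dollars_per_point


composer_2_5 = (1.0, 65.0)  # quoted: $1, 65%
fable = (12.0, 70.0)        # quoted: $12, 70%

price_ratio, extra_dollars_per_point = cost_comparison(composer_2_5, fable)
print(f"{price_ratio:.0f}x the price, ${extra_dollars_per_point:.2f} per extra point")
```

On those quoted figures, the last five points of score cost $11, or $2.20 per point, while the first 65 points cost a dollar in total. Whether that premium is worth paying depends on how much the marginal accuracy matters for the task at hand.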

And yet, of course, this has not been everyone's experience. Ethan Novak wrote, "We're trying out Cursor's Composer 2.5. Results are not what I expected. I found the model indirectly changing files and items without my approval. Opus 4.8 doesn't go on a rogue UI overhaul off one prompt versus Composer. Many people told me to try it for UI, but I'm not seeing effective results."



And after Artificial Analysis updated its benchmarks to be more focused on agentic coding tasks, throwing out some of the more saturated benchmarks, Composer 2.5 fell fairly significantly, landing closer to the open Chinese models like GLM 5.1 as compared to where it previously was, around GPT 4.7.



Now, another interesting experiment that has a lot of people excited is OpenRouter's Fusion API.

They're calling it the smartest compound model in the market, achieving Fable-level intelligence at half the price.

OpenRouter writes, "We benchmarked Fusion on 100 hard research tasks and found, one, panels of models consistently outperform individual models. Two, beyond-frontier performance can be achieved with [00:25:00] frontier panels. And three, panels of budget models can surpass frontier models at a much lower cost."

So basically what you have here is an API that routes model tasks more automatically and performs better than or comparable to state-of-the-art models based on that routing.

Explaining how it works, they write, "When you send a prompt to Fusion, we fan it out to a panel of models in parallel, each with web search and bash tools enabled. A judge model reads every response and extracts the structure. Then a synthesizer writes the final answer grounded in that analysis."
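To make that fan-out, judge, synthesize flow concrete, here is a minimal sketch of the pattern. It is not OpenRouter's API or implementation: the models are plain Python callables standing in for real model calls, the tools are left out, and `fuse`, `majority_judge`, and `simple_synthesizer` are hypothetical names. The judge here is a simple majority vote, a stand-in for whatever structure the real judge extracts.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict

Model = Callable[[str], str]  # hypothetical: prompt in, answer out


def fuse(prompt: str,
         panel: Dict[str, Model],
         judge: Callable[[Dict[str, str]], str],
         synthesizer: Callable[[str, str], str]) -> str:
    """Fan a prompt out to a panel in parallel, judge the responses, synthesize one answer."""
    with ThreadPoolExecutor(max_workers=len(panel)) as pool:
        futures = {name: pool.submit(model, prompt) for name, model in panel.items()}
        responses = {name: future.result() for name, future in futures.items()}
    analysis = judge(responses)            # the judge reads every response
    return synthesizer(prompt, analysis)   # the final answer is grounded in the analysis


def majority_judge(responses: Dict[str, str]) -> str:
    """Stand-in judge: pick the answer most panel members agree on."""
    answer, _count = Counter(responses.values()).most_common(1)[0]
    return answer


def simple_synthesizer(prompt: str, analysis: str) -> str:
    """Stand-in synthesizer: pair the prompt with the judged answer."""
    return f"{prompt} -> {analysis}"


# Stub panel: two members agree, one dissents.
panel = {
    "model_a": lambda p: "4",
    "model_b": lambda p: "4",
    "model_c": lambda p: "5",
}
final_answer = fuse("2+2?", panel, majority_judge, simple_synthesizer)
print(final_answer)
```

The structural point is that panel members never see each other's output. Only the judge and synthesizer do, which is what lets a panel of cheaper models be swapped in without touching the rest of the pipeline.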

Shanu Matthew wrote, "This seems to be pretty huge and validates the future where each model will be called upon to do the specific tasks that they excel at for intelligence versus cost trade-offs versus all other models. As each lab further gets better at specific tasks, this will become more of the default, assuming labs and model providers allow for this. Multi-model becomes the default and model panels or councils are the way."

Investor Anish Acharya writes, "Been saying for a bit that this compound architecture makes sense for both aggregators and labs. Model consumers want the right capability and cost controls on a per task and even perhaps per token [00:26:00] basis. Labs need to protect otherwise depreciating model assets, and one way to do it is to selectively expose new features via specific token paths.

"E.g., I've long thought /Ultracode was a Mythos-class model exposed via 4.X. The compound workflow matches the way many of us work, which is using adversarial models to generate, review, iterate, and test. More generally, where all this is going feels like it will be labs vertically integrating to sell capabilities so that they can capture the downstream economics from their models without exposing them to distillation."



Which brings me to one of the most interesting experiments where some of these ideas are being put into practice, which is around Harvey.

Explaining the background, Harvey president and co-founder Gabe Pereyra wrote, "The belief a few months ago was that model costs were halving every six months, meaning tokens would get cheap, and so application layer companies would need to find a way to charge for the value of tokens by selling the work or services. What actually happened is AI got much more expensive than people realized at the time.

"The shift from chat to agents led to an explosion in costs. One user could trigger hundreds of agents, and each of those agents could trigger more agents. Agents started running longer and more autonomously. On top of that, frontier models [00:27:00] like Mythos are getting more expensive, not less.

"The problem for application layer companies like Harvey is how do you take that large token cost and convert it into something useful for your customers? A rough analogy is every company is about to get the ability to hire infinite employees. The main challenge is going to be figuring out how to manage those employees and make your business model work the same way it did with human employees.

"For Harvey, this means we don't have to become a services company. The infrastructure for every law firm to deploy, train, and manage a large number of agents is going to be so complex that model and cloud providers and law firms likely won't build all of it."

Now, this dovetails perfectly with an experiment that Harvey recently discussed where they worked with Fireworks to build a worker-advisor agent. The idea, as described by head of applied research Niko, is that an open-weight worker, in this case GLM 5.1, delegates high-stakes and complex tasks to a closed frontier advisor. In their test, this was Opus 4.7. This combination of models not only allowed them to do things much more cheaply than just using Opus, but actually got increased performance as well.
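The worker-advisor idea can be sketched in a few lines. This is an illustration of the general pattern only, not Harvey's or Fireworks' implementation: the episode doesn't say how the worker decides when to delegate, so the `needs_advisor` predicate, the `Task` fields, and the stub models below are all assumptions.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Task:
    prompt: str
    high_stakes: bool = False  # hypothetical flag; real systems would infer this


def run_task(task: Task,
             worker: Callable[[str], str],
             advisor: Callable[[str], str],
             needs_advisor: Callable[[Task], bool]) -> Tuple[str, str]:
    """Route a task: the cheap worker handles it unless it must be escalated."""
    if needs_advisor(task):
        return "advisor", advisor(task.prompt)  # expensive closed frontier model
    return "worker", worker(task.prompt)        # cheap open-weight model


# Stub models standing in for real model calls.
def open_weight_worker(prompt: str) -> str:
    return f"worker draft for: {prompt}"


def frontier_advisor(prompt: str) -> str:
    return f"advisor answer for: {prompt}"


tasks: List[Task] = [
    Task("summarize this memo"),
    Task("assess liability exposure in this contract", high_stakes=True),
    Task("reformat these citations"),
]

results = [run_task(t, open_weight_worker, frontier_advisor, lambda t: t.high_stakes)
           for t in tasks]
advisor_calls = sum(1 for handled_by, _ in results if handled_by == "advisor")
print(f"{advisor_calls} of {len(tasks)} tasks escalated to the advisor")
```

The savings come from the ratio: if only a minority of tasks are escalated, the expensive model's price applies to that minority rather than to every call.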

Patrick Ojo writes, "The insight [00:28:00] isn't that open source beat frontier, it's that smart routing beat brute force. Using the most expensive model for every task is not a quality strategy, it's a laziness tax. The teams building routing layers that send each task to the right model at the right cost are now demonstrably ahead on both dimensions simultaneously. Inference optimization just became a first-class competitive advantage."

Now, Harvey says that this is just the beginning of their experimentation with models, and I think that they're going to be the vanguard for something that lots of others start experimenting with as well.

We don't know when Fable is coming back. What's clear is that, as powerful as it is, in a world where frontier costs continue to go up, companies are simply going to have to start to get more sophisticated about the combinations of models they use to get the best results.

If we are looking for a bright side to what is an incredibly confusing and chaotic situation, it's that it puts more emphasis on the point that this sort of inference optimization and token efficiency exploration was coming for us no matter what.

And now companies have more of a chance to get ahead of it than they might have otherwise had when everyone could just get lost in the sauce of the glory of Fable [00:29:00] 5. Anyways, interesting trends to continue to watch.



For now, that's gonna do it for the AI Daily Brief. Appreciate you listening or watching as always, and until next time, peace.
