New Miami AI Startup Claims Breakthrough in Model Efficiency—But Experts Want Proof
A freshly launched Miami-based startup called Subquadratic has made headlines by announcing what could be a fundamental shift in how artificial intelligence systems handle large volumes of information. The company unveiled its SubQ 1M-Preview model this week, claiming to have solved a mathematical constraint that has shaped every major machine learning system since the transformer architecture emerged in 2017.
If the assertions hold true, the implications could be transformative. Yet the response from the artificial intelligence research community has been decidedly mixed—ranging from cautious interest to open skepticism.
Understanding the Quadratic Scaling Problem
To grasp why this announcement matters, it helps to understand the fundamental challenge that has constrained large language model development. Every transformer-based system relies on a mechanism called attention, which compares each piece of input data against every other piece. This creates a mathematical relationship where doubling your input size quadruples the computational cost required to process it.
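The scaling relationship described above is simple enough to check with arithmetic. The sketch below counts token-to-token comparisons in dense attention; it is an illustration of the general quadratic-cost argument, not a model of any particular company's system.

```python
# Illustration: cost of full (dense) self-attention grows quadratically with
# input length, because each of the n tokens is compared against all n tokens.

def attention_comparisons(n_tokens: int) -> int:
    """Number of token-to-token comparisons in dense attention."""
    return n_tokens * n_tokens

base = attention_comparisons(1_000)     # 1,000,000 comparisons
doubled = attention_comparisons(2_000)  # 4,000,000 comparisons

# Doubling the input quadruples the work:
print(doubled / base)  # 4.0
```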
This constraint has shaped the entire industry. Current frontier models from OpenAI, Anthropic, and Google typically support context windows of 128,000 tokens, with some reaching one million tokens. Yet processing longer inputs becomes prohibitively expensive, pushing developers to build elaborate workarounds. Retrieval-augmented generation systems, prompt engineering techniques, and multi-agent orchestration have emerged as industry standards—essentially creating scaffolding around models that cannot efficiently process everything at once.
Why This Limitation Matters
These workarounds are not just inconvenient—they are costly and brittle. They force developers to manually curate retrieval systems and conditional logic, adding complexity that ultimately limits what artificial intelligence applications can accomplish. Subquadratic’s core argument is that a more direct solution exists.
How Subquadratic Claims to Solve It
The company’s proposed solution, called Subquadratic Sparse Attention, operates on a deceptively simple principle: most token-to-token comparisons in standard attention mechanisms are wasted computation. Rather than comparing every token to every other token, the system learns which comparisons actually matter and focuses computational resources there.
The approach is content-dependent, meaning the model determines where to focus based on semantic meaning rather than fixed patterns. This theoretically allows retrieval of specific information from distant positions in a context without incurring the quadratic computational penalty.
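To make the idea of content-dependent sparsity concrete, here is a generic top-k attention sketch: each query attends only to the k keys with the highest similarity scores, chosen from the content itself rather than a fixed pattern. This is a hypothetical illustration of the general technique, not Subquadratic's published architecture, which the company has not released.

```python
# Generic top-k sparse attention sketch (NOT Subquadratic's actual method).
# Each query selects its k most relevant keys by score, then softmax-weights
# only those k values instead of all n.
import numpy as np

def topk_sparse_attention(Q, K, V, k):
    n, d = Q.shape
    scores = Q @ K.T / np.sqrt(d)  # (n, n) similarity scores
    # Content-dependent selection: indices of the k largest scores per query.
    keep = np.argpartition(scores, -k, axis=1)[:, -k:]
    out = np.zeros_like(Q)
    for i in range(n):  # attend over k keys, not all n
        s = scores[i, keep[i]]
        w = np.exp(s - s.max())
        w /= w.sum()
        out[i] = w @ V[keep[i]]
    return out

rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(16, 8)) for _ in range(3))
out = topk_sparse_attention(Q, K, V, k=4)
print(out.shape)  # (16, 8)
```

Note that this naive version still computes the full score matrix before discarding most of it; a genuinely subquadratic method must also avoid forming that matrix in the first place, which is the hard part of the problem.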
The Efficiency Claims
According to Subquadratic’s technical documentation, this architecture achieves a 7.2x speedup over conventional attention at 128,000 tokens, escalating to 52.2x at one million tokens. The company claims its model reduces attention computation by approximately 1,000 times compared to frontier systems at maximum context length.
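The two speedup figures can at least be checked for internal consistency: if attention cost really drops from quadratic to linear, the speedup over dense attention should itself grow roughly linearly with context length. This back-of-envelope arithmetic uses only the numbers the company published; it is a plausibility check, not independent verification.

```python
# Consistency check: under O(n^2) -> O(n) scaling, the speedup over dense
# attention should grow roughly in proportion to context length n.

claimed_speedup_128k = 7.2
claimed_speedup_1m = 52.2

context_growth = 1_000_000 / 128_000                        # ~7.81x longer
speedup_growth = claimed_speedup_1m / claimed_speedup_128k  # ~7.25x larger

print(round(context_growth, 2), round(speedup_growth, 2))  # 7.81 7.25
```

The two ratios are close, so the claimed numbers are at least shaped like linear scaling; whether the model achieves them in practice is exactly what remains unverified.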
For context, this would represent an unprecedented efficiency gain. The company is backed by $29 million in seed funding from investors including Tinder co-founder Justin Mateen and former SoftBank Vision Fund partner Javier Villamizar, with the round valuing the startup at $500 million.
Benchmark Results: Impressive but Limited
Subquadratic released results on three benchmarks, with scores that appear competitive with or superior to those of models from major artificial intelligence developers. On SWE-Bench Verified, the model scored 81.8% against Opus 4.6’s 80.8%. On RULER at 128,000 tokens, it achieved 95% compared to Claude Opus’s 94.8%.
The most striking result came on MRCR v2, a multi-hop retrieval benchmark. The company reported 65.9%, substantially outperforming Claude Opus at 32.2% and Gemini 3.1 Pro at 26.3%.
Where Skepticism Enters
Several factors have prompted concern from machine learning researchers. First, the benchmark selection is narrow—three tests specifically emphasizing long-context retrieval and coding, precisely the tasks the system was designed for. Broader evaluations across general reasoning, mathematics, and safety remain unpublished.
Second, each benchmark was run only once, without confidence intervals, leaving the results vulnerable to statistical variance. Third, a significant gap exists between research-phase results and the production model shipped to users: on MRCR v2, the research version scored 83 while the verified production model scored 65.9, a roughly 17-point decline left largely unexplained.
Additionally, Subquadratic claims its model costs roughly $8 to achieve 95% accuracy on RULER 128K compared to Claude Opus at approximately $2,600. However, the company has not disclosed specific API pricing, making independent cost verification impossible.
The Skepticism Within the AI Research Community
Response from the artificial intelligence research community has crystallized around fundamental doubt. One prominent commentator framed the stakes starkly: “SubQ is either the biggest breakthrough since the Transformer or it’s artificial intelligence Theranos”—referencing the infamous fraud case.
Critical concerns include allegations that the system may be a fine-tuned variant of existing open-source models rather than a fundamentally new architecture. The company confirmed it uses weights from open-source models as a starting point, citing funding limitations. Others questioned whether the claimed linear scaling benefits actually materialize in practice or represent marginal improvements rebadged as breakthroughs.
Historical Precedent for Caution
The skepticism is reinforced by recent history. Magic.dev announced a 100-million-token context-window model in August 2024 with roughly identical efficiency claims and a similar funding level. As of early 2026, there is minimal public evidence the model achieved meaningful adoption. Similar promises have come from Kimi Linear, DeepSeek Sparse Attention, and other approaches—yet most delivered incremental rather than revolutionary improvements when evaluated independently.
The Team Behind the Claims
CEO Justin Dangel has founded and scaled five companies across health technology, insurance technology, and consumer goods. CTO Alexander Whedon previously worked as a software engineer at Meta and led generative artificial intelligence implementations. The team includes 11 PhD researchers from Meta, Google, Oxford, Cambridge, ByteDance, and Adobe.
This represents credible technical talent. However, neither co-founder has published foundational machine learning research, and the company has not released peer-reviewed papers. A technical report remains forthcoming.
What Comes Next
The fundamental question facing Subquadratic is whether its mathematical approach survives independent scrutiny. If verified, the implications would be transformative—enterprise applications processing entire codebases, contracts, and medical records could operate as single-pass systems rather than requiring elaborate retrieval infrastructure.
If independent evaluation contradicts the claims, Subquadratic joins a growing list of long-context innovations that sounded revolutionary at launch but proved unremarkable in practice. The burden of proof now rests with the company to demonstrate that it has genuinely solved a problem that has eluded organizations with vastly greater resources.
The artificial intelligence industry will be watching closely as this verification process unfolds.
Frequently Asked Questions
What is subquadratic attention and why does it matter?
Subquadratic attention is an approach that reduces computational scaling from quadratic (where doubling input quadruples cost) to linear (where doubling input doubles cost). This matters because it could fundamentally reduce the expense of processing long documents and context in large language models, making enterprise applications more economically feasible.
How does Subquadratic's model compare to ChatGPT and Claude?
On specific benchmarks for long-context retrieval, Subquadratic reports superior performance. However, the company has only published results on three narrow benchmarks emphasizing retrieval and coding. Broader comparisons across general reasoning, mathematics, and safety have not been released, making comprehensive evaluation impossible.
Why are researchers skeptical about Subquadratic's claims?
Researchers cite several concerns: narrow benchmark selection, single-run testing without confidence intervals, a significant gap between research and production performance, unverified cost claims, and historical precedent of similar promises delivering marginal rather than revolutionary improvements when independently evaluated.