
San Francisco-based AI lab Arcee made waves last year as one of the only US companies training large language models (LLMs) from scratch and publishing them to the public under open or partially open source licenses, allowing developers, individual entrepreneurs, and even mid-sized and large companies to use the powerful AI models for free and customize them at will.
Arcee is back this week with the release of its largest and most capable open language model yet: Trinity Large, a 400-billion-parameter mixture of experts (MoE) model, available now in preview.
Alongside the flagship version, Arcee is shipping a "raw" base checkpoint, Trinity-Large-TrueBase, which allows researchers to study what a 400-billion-parameter sparse MoE learns from raw data alone, before instruction tuning and reinforcement learning are applied.
By providing a clean slate at the 10 trillion token mark, Arcee enables AI builders in highly regulated industries to perform authentic audits and conduct their own specialized alignments without inheriting the "black box" biases or formatting quirks of a general-purpose chat template. This transparency allows for a deeper understanding of the distinction between a model’s intrinsic reasoning abilities and the useful behaviors adopted during the later stages of post-training.
This launch comes as powerful Chinese open source LLM alternatives such as Alibaba (Qwen), z.AI (Zhipu), DeepSeek, Moonshot and Baidu have flooded the market, effectively leading the category with high-efficiency architectures.
Trinity Large also comes after Meta notably withdrew from the open source frontier landscape. The Llama 4 debut in April 2025 was met with a mixed reception, and Yann LeCun, the former Meta AI researcher, later acknowledged that the company had used several specialized versions of the model to inflate scores on third-party benchmarks.
Amid this domestic void, only OpenAI, with its gpt-oss family released in summer 2025, and Arcee are currently carrying the mantle of new open source models made in the USA and trained entirely from scratch.
As sparse as they come
Trinity Large is distinguished by the extreme sparsity of its architecture. In an MoE design, "sparsity" refers to the model’s ability to selectively activate only a tiny fraction of its total parameters for a given task.
While Trinity Large hosts 400 billion parameters in total, only about 13 billion are active at any given time, and just 1.56% of its experts fire per token.
This architectural choice is significant because it gives the model the "awareness" of a massive system while retaining the inference speed and operational efficiency of a much smaller one, delivering performance roughly 2-3x faster than peers on the same hardware.
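To make that arithmetic concrete, the sketch below shows how active parameters are typically counted in a sparse MoE: routed experts contribute only a fraction of the expert weights per token, while attention, embeddings and other shared layers run for every token. The 400B total, 256 experts and top-4 routing come from the article; the dense-parameter share is an assumed figure chosen purely for illustration, not a published detail of Trinity Large.

```python
# Back-of-the-envelope sketch of active-parameter counting in a sparse MoE.
# The 400B total, 256 experts, and top-4 routing come from the article;
# the dense/shared parameter share is an assumed figure for illustration.

TOTAL_PARAMS = 400e9      # total parameters
NUM_EXPERTS = 256         # experts per MoE layer
ACTIVE_EXPERTS = 4        # experts routed per token

DENSE_SHARE = 0.017       # assumed share of always-on (attention/embedding) weights
dense_params = TOTAL_PARAMS * DENSE_SHARE
expert_params = TOTAL_PARAMS - dense_params

# Only ACTIVE_EXPERTS / NUM_EXPERTS of the expert weights fire per token,
# but the dense layers run for every token regardless.
active_params = dense_params + expert_params * ACTIVE_EXPERTS / NUM_EXPERTS

print(f"expert activation ratio: {ACTIVE_EXPERTS / NUM_EXPERTS:.2%}")   # 1.56%
print(f"active parameters per token: ~{active_params / 1e9:.0f}B "
      f"({active_params / TOTAL_PARAMS:.1%} of total)")                 # ~13B
```

Under splits like these, the 1.56% expert-activation ratio and the roughly 13 billion active parameters cited above can both hold at once, because the shared attention and embedding layers count toward the active total even though they sit outside the expert routing.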
Sovereignty and "TrueBase" philosophy
This release’s most significant contribution to the research community is Trinity-Large-TrueBase, a raw checkpoint taken at the 10-trillion-token mark.
Unlike nearly every other "open" release, which arrives only after being shaped by instruction tuning and reinforcement learning, TrueBase offers a rare, untouched look at the model's underlying intelligence.
In the rush to make models useful, most labs apply supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF) before the weights are released. While this makes the model a better speaker, it can obscure underlying knowledge distributions.
TrueBase is a true base model: it has not yet undergone learning rate annealing or pre-training phases two and three, where instruction data is typically introduced.
For researchers and businesses in highly regulated industries, starting with TrueBase enables authentic audits and personalized alignment. As Lucas Atkins, CTO of Arcee, noted during a video call with VentureBeat: "This is interesting because this checkpoint itself is already one of the most successful base models in the world."
Technology: engineering by constraint
The creation of Trinity Large is not the product of infinite resources, but rather of what Atkins calls "engineering by constraint".
Trained for approximately $20 million over just 33 days, the model represents a masterclass in capital efficiency.
Arcee, a team of just 30 people, was operating with total capital of just under $50 million, making the $20 million training run a genuine "bet the business" wager.
"I always thought that having a constraint, whether financial, personal or otherwise, was extremely important for creativity," Atkins explained. "When you have an unlimited budget, you don’t need to work your way out of complex problems.".
Architecture: 4-of-256 sparsity and SMEBU
Trinity Large uses a 4-of-256 sparse MoE architecture, meaning it activates only 4 of its 256 experts for each token.
This high degree of sparsity, one of the highest ever successfully trained, created significant stability issues during pre-training.
To solve this problem, Arcee developed soft-tight Momentum Expert Bias Updates (SMEBU). This mechanism ensures that experts both specialize and are used evenly across a general web corpus, preventing a few experts from becoming "winners" while others languish as untrained "dead weight."
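The article does not spell out the exact SMEBU formulation, but the general idea behind momentum-based expert bias updates can be sketched: a routing bias is nudged toward under-used experts based on a momentum-smoothed estimate of each expert's load. The PyTorch snippet below is an illustrative stand-in only, with the momentum and step-size constants chosen arbitrarily; it is not Arcee's actual implementation.

```python
# Illustrative top-4-of-256 MoE routing with a momentum-smoothed expert bias
# for load balancing. A generic stand-in for the idea behind expert-bias
# updates; NOT Arcee's SMEBU, whose exact details are not published here.
import torch
import torch.nn.functional as F

NUM_EXPERTS, TOP_K, DIM = 256, 4, 1024

router = torch.nn.Linear(DIM, NUM_EXPERTS, bias=False)
expert_bias = torch.zeros(NUM_EXPERTS)   # load-balancing bias (not learned by SGD)
usage_ema = torch.zeros(NUM_EXPERTS)     # momentum-smoothed expert usage
MOMENTUM, BIAS_LR = 0.9, 1e-2            # arbitrary illustrative constants

def route(tokens: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
    """Pick TOP_K experts per token; the bias affects selection, not gate weights."""
    logits = router(tokens)                                # [tokens, 256]
    _, top_idx = (logits + expert_bias).topk(TOP_K, dim=-1)
    gates = F.softmax(logits.gather(-1, top_idx), dim=-1)  # [tokens, 4]
    return top_idx, gates

def update_bias(top_idx: torch.Tensor) -> None:
    """Nudge biases so under-used experts get picked more often (assumed scheme)."""
    global usage_ema
    counts = torch.bincount(top_idx.flatten(), minlength=NUM_EXPERTS).float()
    load = counts / counts.sum()                           # fraction of routed tokens
    usage_ema = MOMENTUM * usage_ema + (1 - MOMENTUM) * load
    target = 1.0 / NUM_EXPERTS                             # ideal uniform load
    expert_bias.add_(BIAS_LR * (target - usage_ema))       # boost cold, damp hot experts

# Example: route a batch of 8 token embeddings, then update the bias.
tokens = torch.randn(8, DIM)
idx, gates = route(tokens)
update_bias(idx)
```

Smoothing the usage signal with momentum before adjusting the bias keeps each correction gentle, so noisy batch-to-batch routing does not whipsaw expert assignments; that stability concern is presumably what a mechanism like SMEBU targets, even if its actual mechanics differ from this sketch.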
The speed of training was facilitated by Arcee’s early access to Nvidia B300 (Blackwell) GPUs. These chips offered approximately twice the speed of the previous Hopper generation and significant memory increases.
"The pre-training was 33 days," Atkins noted. "We could have done it on Hopper, and it probably would have taken two to three months. And at this point we are in a whole new generation of models".
In partnership with DatologyAI, Arcee used over 8 trillion synthetic data tokens. However, this was not the typical "imitation" synthetic data in which a smaller model learns to speak like a larger one.
Instead, the intention was to take raw web text, such as blogs or Wikipedia articles, and rewrite it synthetically to condense the information into a smaller number of total tokens. This process helped the model learn to reason about information rather than simply memorizing exact strings of tokens.
The architectural design also incorporates alternating local sliding-window and global attention layers in a 3:1 ratio. This hybrid approach makes the model very efficient in long-context scenarios. Although trained at a sequence length of 256,000 tokens, Trinity Large natively supports a 512,000-token context, and evaluations suggest it remains performant even at the 1-million-token horizon.
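As a rough illustration of that 3:1 layout, the sketch below builds an alternating schedule in which three local sliding-window layers are followed by one global layer. The layer count and window width are assumptions made for the example, not disclosed specifications of Trinity Large.

```python
# Sketch of a 3:1 alternating local/global attention schedule.
# NUM_LAYERS and LOCAL_WINDOW are illustrative assumptions only.

NUM_LAYERS = 48
LOCAL_WINDOW = 4096   # tokens visible to a "local" sliding-window layer

def attention_schedule(num_layers: int) -> list[str]:
    """Every fourth layer attends globally; the other three use a local window."""
    return ["global" if (i + 1) % 4 == 0 else "local" for i in range(num_layers)]

def visible_keys(layer_kind: str, query_pos: int) -> range:
    """Causal key positions a query token at `query_pos` may attend to."""
    if layer_kind == "global":
        return range(0, query_pos + 1)                 # full history
    start = max(0, query_pos + 1 - LOCAL_WINDOW)       # sliding window
    return range(start, query_pos + 1)

schedule = attention_schedule(NUM_LAYERS)
print(schedule[:8])                         # ['local', 'local', 'local', 'global', ...]
print(len(visible_keys("local", 250_000)))  # capped at 4,096 keys
print(len(visible_keys("global", 250_000))) # 250,001 keys, grows with position
```

Because only one layer in four pays the full quadratic attention cost, the compute and memory burden of very long sequences grows far more slowly than in an all-global model, which is in line with the article's claim that Trinity Large stays usable well past its 256,000-token training length.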
Technical comparison: Trinity Large vs gpt-oss-120b
As a US alternative, Trinity Large can be compared to OpenAI’s gpt-oss-120b.
Although both models use sparse architectures to achieve state-of-the-art performance under permissive licensing, they serve different operational roles.
While gpt-oss-120b currently holds an advantage in specific reasoning and math tests, Trinity Large offers a significant advantage in context capacity and raw parameter depth for complex, multi-step agent workflows.
Sovereignty: filling the void
The release of Trinity Large is as much a geopolitical statement as it is a technical one. CEO Mark McQuade noted to VentureBeat in the same interview that the void of U.S. open source models at the frontier forced a pivot in Arcee’s strategy.
"There was this kind of change where US or Western based players stopped opening these models," McQuade said. "We build on these models and then go further into organizations… but the Chinese labs have only just started… producing cutting-edge models and making them available as open source.".
For McQuade, this created a dependency that made American businesses increasingly uncomfortable. "Especially in the conversations we have with large organizations, they weren’t able to use Chinese-based architectures," he explained. "We want to be that champion in the United States. [It] in fact does not exist at the moment."
By releasing under the Apache 2.0 license, Arcee provides the permissive licensing framework that allows businesses to "own" the model layer entirely. This is critical for industries like finance and defense, where relying on a third-party hosted model or a restrictive cloud provider is not an option.
Balancing Intelligence and Utility
Arcee is currently focused on an upcoming "thinking" model, moving Trinity Large from a general instruction model to a full reasoning model. The team is wrestling with the balance between intelligence and utility, striving to create a model that excels on benchmarks without becoming a "yapper" or ineffective in real production applications.
"We built Trinity so you can own it," » declares the team, signaling a return to the core values of the American open source movement. As the industry moves toward agent workflows and massive contextual requirements, Trinity Large is not positioning itself as a "packaging," but as a sovereign infrastructure layer that developers can finally control.




