Can the Swiss LLM Compete?
Update (September 5, 2025):
This blog post has been updated with the latest information about the released model, Apertus. Read our detailed report on the release for more information.
On September 2, 2025, Switzerland released its own Large Language Model (LLM) named Apertus, developed by ETH Zurich, EPFL, and the Swiss National Supercomputing Centre (CSCS). The project emphasizes transparency, data privacy, and linguistic diversity – positioning itself as a transparent, public-good alternative to commercial AI models.
But how realistic is this goal? Can a publicly funded model truly compete with billion-dollar projects from Silicon Valley? We've examined all available facts – objectively, critically, and without tech hype.
What's Behind the Swiss LLM?
The model is part of the Swiss AI Initiative, launched in late 2023. Key highlights:
- Open Source: Fully open-source – including model weights, source code, training data, and checkpoints on Hugging Face and GitHub (see the loading sketch after this list)
- Model Sizes: Two variants with 8 billion and 70 billion parameters
- Multilingualism: Trained on 15 trillion tokens across 1,811 languages (pretraining) and 149 languages (post-training) – 40% non-English
- Infrastructure: Developed on the Alps supercomputer at CSCS with 10,752 NVIDIA GH200 Grace-Hopper chips
- Data Privacy: Compliant with GDPR, EU AI Act, and Swiss data protection laws
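Because the weights and checkpoints are published openly, the model can be loaded with standard tooling. The sketch below uses Hugging Face's transformers library; the repository name is an assumption for illustration, so check the hub for the actual model IDs:

```python
# Minimal sketch: loading an Apertus checkpoint with Hugging Face transformers.
# ASSUMPTION: the repo id "swiss-ai/Apertus-8B" is illustrative only; look up
# the actual model IDs on the Hugging Face hub before running.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "swiss-ai/Apertus-8B"  # hypothetical repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "What makes Apertus different from other open models?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```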
Research: Well-Staffed but Not Over-Funded
ETH Zurich and EPFL rank among the world's leading universities for engineering and natural sciences. In artificial intelligence, they're well-positioned:
- Prof. Andreas Krause (ETH) is an internationally recognized expert in Reinforcement Learning
- Prof. Martin Jaggi (EPFL) leads the Machine Learning & Optimization Lab
- Since 2024, the Swiss National AI Institute has strengthened collaboration between research and application
However, ETH and EPFL cannot match the salaries and resources of OpenAI or xAI. They offer something different – an environment for open, ethically-oriented research. For a public model like the Swiss LLM, this is a solid foundation.
Computing Power: Strong – But Not Competitive
Apertus was trained on the Alps supercomputer, operational at CSCS since September 2024:
Hardware Specifications:
- 10,752 NVIDIA GH200 Grace-Hopper chips (total)
- Training ran on 2,048–4,096 GPUs
- ~6 million GPU-hours total training
- 6.74×10²⁴ FLOPs (floating-point operations) of total training compute – sanity-checked in the sketch after this list
- ~5 GWh energy consumption (hydropower)
- ~80% scaling efficiency
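These figures are internally consistent. Here's a quick back-of-envelope check using the common C ≈ 6·N·D rule of thumb for dense-transformer training compute – a standard approximation, not an official Apertus calculation:

```python
# Back-of-envelope sanity check of the published training numbers, using the
# standard C ≈ 6·N·D approximation for dense-transformer training compute.
# All figures below come from this post; the formula is a common rule of
# thumb, not an official Apertus calculation.

params = 70e9   # 70B-parameter variant
tokens = 15e12  # ~15 trillion training tokens

flops = 6 * params * tokens
print(f"estimated compute: {flops:.2e} FLOPs")  # ~6.3e24

gpu_hours = 6e6  # ~6 million GPU-hours
energy_wh = 5e9  # ~5 GWh
print(f"average draw per GPU: {energy_wh / gpu_hours:.0f} W")  # ~833 W
```

The 6·N·D estimate (6.3×10²⁴ FLOPs) lands within about 7% of the published 6.74×10²⁴, and the implied ~833 W average per GPU is plausible for GH200-class hardware.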
For comparison:
- GPT-4 was reportedly trained with approximately 25,000 A100 GPUs over 90–100 days
- Grok 4 from xAI uses the Colossus supercomputer with up to 200,000 NVIDIA H100 GPUs
Conclusion: For an academic project, Alps is powerful. But compared to the massive data centers of major tech companies, it falls significantly behind – affecting training speed and model size.
Training Data: Quality Over Quantity
Apertus was trained on approximately 15 trillion tokens. Particularly noteworthy is the high proportion of non-English data (40%) and coverage of 1,811 languages in pretraining and 149 languages in post-training – including rare ones like Romansh or Zulu.
The data was ethically sourced: no illegal scraping, robots.txt directives and copyright were respected, and toxicity filters and PII protection were applied. While this limits access to certain specialized information, CSCS emphasizes: «For general tasks, this doesn't lead to measurable performance losses.»
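To make «respecting robots.txt» concrete: a compliant crawler consults a site's policy before fetching anything. The sketch below uses only Python's standard library; the URL and user-agent string are placeholders, not details of the actual Apertus data pipeline:

```python
# Illustrative sketch of one ingredient of ethical sourcing: honoring
# robots.txt before fetching a page. The site and user agent here are
# placeholders, not the actual Apertus crawler.
from urllib.robotparser import RobotFileParser

rp = RobotFileParser("https://example.com/robots.txt")
rp.read()  # fetch and parse the site's crawling policy

url = "https://example.com/articles/some-page.html"
if rp.can_fetch("MyResearchCrawler", url):
    print("allowed to fetch:", url)
else:
    print("disallowed by robots.txt:", url)
```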
Linguistic Diversity: Where the Swiss LLM Leads
Support for 1,811 languages in pretraining and 149 languages in post-training is remarkable – even compared to commercial models:
| Model | Language Coverage |
|---|---|
| Apertus | 1,811 languages (pretraining), 149 languages (post-training) |
| GPT-4.5 | ~80–120 languages |
| Claude 4 | No official number |
| Llama 4 | 12 languages (200+ in training) |
This breadth is particularly relevant for:
- SMEs with international audiences
- Organizations with multilingual communication
- Applications in linguistically diverse countries
Transparency & Data Privacy: Advantage with Compromises
Apertus is fully open-source and transparent – code, weights, training data, and checkpoints are publicly available on Hugging Face and GitHub. It meets the requirements of GDPR, EU AI Act, and Swiss data protection regulations.
This makes it attractive for:
- Government agencies and institutions
- Companies in regulated industries
- Research and education
However: Avoiding certain data sources – such as medical literature – may limit performance in specialized tasks. Commercial models have advantages here because they can access proprietary content.
Model Comparison: How Does the Swiss LLM Perform?
| Model | Parameters | Openness | Training Hardware | Strengths |
|---|---|---|---|---|
| Apertus | 8B / 70B | Fully open source | Alps: 2,048–4,096 GH200 GPUs | Linguistic diversity, data privacy, transparency |
| GPT-4.5 | ~2T (estimated) | Proprietary | Azure: ~25,000 A100 GPUs | Creativity, natural conversation, agentic planning |
| Claude 4 | Not published | Proprietary | Anthropic: internal clusters | Adaptive reasoning, coding |
| Llama 4 | 109B / 400B | Open weights | Meta: ~20,000 H100 GPUs | Multimodality, 200 languages, agentic tasks |
| Grok 4 | ~1.8T MoE | Proprietary | Colossus: 200,000 H100 GPUs | Reasoning, real-time data, humor |
What Does This Mean in Practice?
Apertus won't be the most powerful AI on the market. But it's a strong tool for many concrete applications – especially in Europe:
Suitable for:
- Multilingual chatbots and customer support (see the usage sketch below)
- Text summarization and translation
- Applications in regulated sectors (e.g., healthcare)
- Research, education, and open-source projects
Not suitable for:
- Highly complex reasoning tasks
- Multimodal applications (e.g., speech + image + video)
- Performance at GPT-4o or Grok level
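To illustrate the first of these use cases, here's a minimal sketch of a multilingual support/translation call. It assumes an instruction-tuned Apertus checkpoint with a chat template is available on the Hugging Face hub – the repository name is illustrative, not confirmed:

```python
# Hedged sketch of a multilingual support/translation call via transformers.
# ASSUMPTION: "swiss-ai/Apertus-8B-Instruct" is an illustrative repo id for an
# instruction-tuned checkpoint with a chat template; verify on the hub.
from transformers import pipeline

chat = pipeline("text-generation",
                model="swiss-ai/Apertus-8B-Instruct",
                device_map="auto")

messages = [{
    "role": "user",
    "content": "Translate into German, French, and Italian: "
               "'Thank you for your message. We will reply within two business days.'",
}]

result = chat(messages, max_new_tokens=200)
# With chat-style input, recent transformers versions return the full message
# list; the assistant's reply is the last entry.
print(result[0]["generated_text"][-1]["content"])
```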
Conclusion: An Important Model – With Clear Focus
Apertus is not a miracle model. But it's a responsibly developed, transparent, and linguistically comprehensive AI system that excels precisely where commercial models often have deficits: in data privacy, openness, and regulatory security.
In a market increasingly dominated by «black-box» models, Switzerland is deliberately setting a different tone. As a transparent, public-good alternative, Apertus is comparable to Llama 3 – not competitive at the frontier, but a solid, open baseline for research and application.
Apertus demonstrates that respectable AI models can be developed even without billion-dollar budgets – models that set new standards in crucial areas such as data privacy and transparency.