Can the Swiss LLM Compete?
Update (September 5, 2025):
This blog post has been updated with the latest information about the released model, Apertus. Read our detailed report on the release for more information.
On September 2, 2025, Switzerland released its own Large Language Model (LLM) named Apertus, developed by ETH Zurich, EPFL, and the Swiss National Supercomputing Centre (CSCS). The project emphasizes transparency, data privacy, and linguistic diversity – positioning itself as a transparent, public-good alternative to commercial AI models.
But how realistic is this goal? Can a publicly funded model truly compete with billion-dollar projects from Silicon Valley? We've examined all available facts – objectively, critically, and without tech hype.
What's Behind the Swiss LLM?
The model is part of the Swiss AI Initiative, launched in late 2023. Key highlights:
- Open Source: Fully open-source – including model weights, source code, training data, and checkpoints on Hugging Face and GitHub (see the loading sketch after this list)
- Model Sizes: Two variants with 8 billion and 70 billion parameters
- Multilingualism: Trained on 15 trillion tokens across 1,811 languages (pretraining) and 149 languages (post-training) – 40% non-English
- Infrastructure: Developed on the Alps supercomputer at CSCS with 10,752 NVIDIA GH200 Grace-Hopper chips
- Data Privacy: Compliant with GDPR, EU AI Act, and Swiss data protection laws
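Because the weights and checkpoints are published openly, the model can be loaded with standard tooling. The sketch below uses Hugging Face's transformers library; the repository name is an assumption for illustration, so check the hub for the actual model IDs:

```python
# Minimal sketch: loading an Apertus checkpoint with Hugging Face transformers.
# ASSUMPTION: the repo id "swiss-ai/Apertus-8B" is illustrative only; look up
# the actual model IDs on the Hugging Face hub before running.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "swiss-ai/Apertus-8B"  # hypothetical repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "What makes Apertus different from other open models?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```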
Research: Well-Staffed but Not Over-Funded
ETH Zurich and EPFL rank among the world's leading universities for engineering and natural sciences. In artificial intelligence, they're well-positioned:
- Prof. Andreas Krause (ETH) is an internationally recognized expert in Reinforcement Learning
- Prof. Martin Jaggi (EPFL) leads the Machine Learning & Optimization Lab
- Since 2024, the Swiss National AI Institute has strengthened collaboration between research and application
However, ETH and EPFL cannot match the salaries and resources of OpenAI or xAI. They offer something different – an environment for open, ethically-oriented research. For a public model like the Swiss LLM, this is a solid foundation.
Computing Power: Strong – But Not Competitive
Apertus was trained on the Alps supercomputer, operational at CSCS since September 2024:
Hardware Specifications:
- 10,752 NVIDIA GH200 Grace-Hopper chips (total)
- Training ran on 2,048–4,096 GPUs
- ~6 million GPU-hours total training
- 6.74×10²⁴ FLOPs (floating-point operations) of total training compute – sanity-checked in the sketch after this list
- ~5 GWh energy consumption (hydropower)
- ~80% scaling efficiency
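These figures are internally consistent. Here's a quick back-of-envelope check using the common C ≈ 6·N·D rule of thumb for dense-transformer training compute – a standard approximation, not an official Apertus calculation:

```python
# Back-of-envelope sanity check of the published training numbers, using the
# standard C ≈ 6·N·D approximation for dense-transformer training compute.
# All figures below come from this post; the formula is a common rule of
# thumb, not an official Apertus calculation.

params = 70e9   # 70B-parameter variant
tokens = 15e12  # ~15 trillion training tokens

flops = 6 * params * tokens
print(f"estimated compute: {flops:.2e} FLOPs")  # ~6.3e24

gpu_hours = 6e6  # ~6 million GPU-hours
energy_wh = 5e9  # ~5 GWh
print(f"average draw per GPU: {energy_wh / gpu_hours:.0f} W")  # ~833 W
```

The 6·N·D estimate (6.3×10²⁴ FLOPs) lands within about 7% of the published 6.74×10²⁴, and the implied ~833 W average per GPU is plausible for GH200-class hardware.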
For comparison:
- GPT-4 was reportedly trained with approximately 25,000 A100 GPUs over 90–100 days
- Grok 4 from xAI uses the Colossus supercomputer with up to 200,000 NVIDIA H100 GPUs
Conclusion: For an academic project, Alps is powerful. But compared to the massive data centers of major tech companies, it falls significantly behind – affecting training speed and model size.
Training Data: Quality Over Quantity
Apertus was trained on approximately 15 trillion tokens. Particularly noteworthy is the high proportion of non-English data (40%) and coverage of 1,811 languages in pretraining and 149 languages in post-training – including rare ones like Romansh or Zulu.
The data was ethically sourced: no illegal scraping, robots.txt directives and copyright were respected, and toxicity filters and PII protection were applied. While this limits access to certain specialized information, CSCS emphasizes: «For general tasks, this doesn't lead to measurable performance losses.»
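To make «respecting robots.txt» concrete: a compliant crawler consults a site's policy before fetching anything. The sketch below uses only Python's standard library; the URL and user-agent string are placeholders, not details of the actual Apertus data pipeline:

```python
# Illustrative sketch of one ingredient of ethical sourcing: honoring
# robots.txt before fetching a page. The site and user agent here are
# placeholders, not the actual Apertus crawler.
from urllib.robotparser import RobotFileParser

rp = RobotFileParser("https://example.com/robots.txt")
rp.read()  # fetch and parse the site's crawling policy

url = "https://example.com/articles/some-page.html"
if rp.can_fetch("MyResearchCrawler", url):
    print("allowed to fetch:", url)
else:
    print("disallowed by robots.txt:", url)
```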
Linguistic Diversity: Where the Swiss LLM Leads
Support for 1,811 languages in pretraining and 149 languages in post-training is remarkable – even compared to commercial models:
| Model | Language Coverage |
|---|---|
| Apertus | 1,811 languages (pretraining), 149 languages (post-training) |
| GPT-4.5 | ~80–120 languages |
| Claude 4 | No official number |
| Llama 4 | 12 languages (200+ in training) |
This breadth is particularly relevant for:
- SMEs with international audiences
- Organizations with multilingual communication
- Applications in linguistically diverse countries
Transparency & Data Privacy: Advantage with Compromises
Apertus is fully open-source and transparent – code, weights, training data, and checkpoints are publicly available on Hugging Face and GitHub. It meets the requirements of GDPR, EU AI Act, and Swiss data protection regulations.
This makes it attractive for:
- Government agencies and institutions
- Companies in regulated industries
- Research and education
However: Avoiding certain data sources – such as medical literature – may limit performance in specialized tasks. Commercial models have advantages here because they can access proprietary content.
Model Comparison: How Does the Swiss LLM Perform?
| Model | Parameters | Openness | Training Hardware | Strengths |
|---|---|---|---|---|
| Apertus | 8B / 70B | Fully open source | Alps: 2,048–4,096 GH200 GPUs | Linguistic diversity, data privacy, transparency |
| GPT-4.5 | ~2T (estimated) | Proprietary | Azure: ~25,000 A100 GPUs | Creativity, natural conversation, agentic planning |
| Claude 4 | Not published | Proprietary | Anthropic: internal clusters | Adaptive reasoning, coding |
| Llama 4 | 109B / 400B | Open weights | Meta: ~20,000 H100 GPUs | Multimodality, 200 languages, agentic tasks |
| Grok 4 | ~1.8T MoE | Proprietary | Colossus: 200,000 H100 GPUs | Reasoning, real-time data, humor |
What Does This Mean in Practice?
Apertus won't be the most powerful AI on the market. But it's a strong tool for many concrete applications – especially in Europe:
Suitable for:
- Multilingual chatbots and customer support (see the usage sketch below)
- Text summarization and translation
- Applications in regulated sectors (e.g., healthcare)
- Research, education, and open-source projects
Not suitable for:
- Highly complex reasoning tasks
- Multimodal applications (e.g., speech + image + video)
- Performance at GPT-4o or Grok level
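To illustrate the first of these use cases, here's a minimal sketch of a multilingual support/translation call. It assumes an instruction-tuned Apertus checkpoint with a chat template is available on the Hugging Face hub – the repository name is illustrative, not confirmed:

```python
# Hedged sketch of a multilingual support/translation call via transformers.
# ASSUMPTION: "swiss-ai/Apertus-8B-Instruct" is an illustrative repo id for an
# instruction-tuned checkpoint with a chat template; verify on the hub.
from transformers import pipeline

chat = pipeline("text-generation",
                model="swiss-ai/Apertus-8B-Instruct",
                device_map="auto")

messages = [{
    "role": "user",
    "content": "Translate into German, French, and Italian: "
               "'Thank you for your message. We will reply within two business days.'",
}]

result = chat(messages, max_new_tokens=200)
# With chat-style input, recent transformers versions return the full message
# list; the assistant's reply is the last entry.
print(result[0]["generated_text"][-1]["content"])
```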
Conclusion: An Important Model – With Clear Focus
Apertus is not a miracle model. But it's a responsibly developed, transparent, and linguistically comprehensive AI system that excels precisely where commercial models often have deficits: in data privacy, openness, and regulatory security.
In a market increasingly dominated by «black-box» models, Switzerland is deliberately setting a different tone. As a transparent, public-good alternative, Apertus is comparable to Llama 3 – not competitive at the frontier, but a solid, open baseline for research and application.
Apertus demonstrates that respectable AI models can be developed even without billion-dollar budgets – models that set new standards in crucial areas such as data privacy and transparency.