
GPT-NL opens up: Inside the Dutch bid for sovereign AI

At BrabantHack, Saskia Lensink offered a look under the hood of a homegrown LLM: why “good enough” might be exactly what Europe needs.

Published on April 11, 2026


Bart, co-founder of Media52 and Professor of Journalism, oversees IO+, events, and Laio. A journalist at heart, he still writes as many stories as he can.

It was an unlikely moment for a deep dive into large language models. The audience, around a hundred developers, engineers, and hackers, had just finished a full day of building AI solutions across healthcare, plant-based proteins, and defense. Dinner had been served, drinks were flowing, and the final pitches were still ahead.

Yet when Saskia Lensink stepped onto the stage at the High Tech Campus Eindhoven, the room leaned in.

Not because she promised the next breakthrough model in familiar Silicon Valley fashion; she didn't. Instead, Lensink offered a grounded, transparent account of what it actually takes to build a European alternative to Big Tech, and why that might matter more than raw performance.

A different starting point

Lensink, a linguist by training who works at TNO, began with a simple observation: language technology used to be niche. Today, it has become part of everyday life, embedded in tools and workflows that millions rely on without even thinking about it.

For TNO, that shift comes with responsibility. Their mission, loosely translated as "making the world a bit nicer," has been reframed in AI terms as building systems that are not only innovative but also responsible, sovereign, and competitive. That last word is crucial. As Lensink pointed out, there is little value in developing a European alternative if it fails to match real-world needs. A sovereign model that no one uses is, ultimately, irrelevant.

The case for a European model

The urgency behind GPT-NL is not purely technical. It is geopolitical, legal, and societal at the same time. Lensink pointed to the growing number of lawsuits over copyright infringement, in which content owners challenge how large language models were trained on vast amounts of scraped data. At the same time, concerns around privacy continue to grow, especially when data is processed or stored outside European jurisdiction.

There is also a more subtle but equally important issue: control. If European organizations rely entirely on foreign AI infrastructure, they risk losing grip on how their data is used and how outcomes are generated. Add to that the emerging risks of data poisoning (where malicious content is deliberately inserted into training datasets), and the picture becomes even more complex.

Against this backdrop, regulations such as the European AI Act set a clear direction for Europe. But regulation alone is not enough. It needs to be matched by technological alternatives that actually embody those principles.

Building GPT-NL: focus over scale

With "only" €13.5 million in funding, the GPT-NL team had to make hard choices from the start. Competing head-on with global tech giants on scale was simply not an option. Instead, they chose focus.

Rather than building a general-purpose consumer model, GPT-NL is designed for professional environments where compliance, security, and reliability are essential. That choice also shaped the technical roadmap. The model concentrates on core capabilities widely used in real-world applications: summarizing complex information, simplifying texts for different audiences, and working effectively in retrieval-based systems that combine external knowledge with language generation.
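To make the last of those capabilities concrete: a retrieval-based (RAG) system first fetches relevant documents and then feeds them to the model alongside the question. The sketch below is a deliberately minimal illustration of that pattern using naive keyword overlap for retrieval; the `retrieve` and `answer` functions are invented for this example and say nothing about GPT-NL's actual interface.

```python
# Minimal sketch of a retrieval-based pipeline: rank documents against the
# query, then build a prompt that combines the retrieved context with the
# question. In a real system the prompt would be sent to the language model.

def retrieve(query: str, documents: list[str], k: int = 1) -> list[str]:
    """Rank documents by naive keyword overlap with the query."""
    q_words = set(query.lower().split())
    scored = sorted(documents,
                    key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

def answer(query: str, documents: list[str]) -> str:
    """Combine retrieved context and the question into a single prompt."""
    context = "\n".join(retrieve(query, documents))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

docs = ["GPT-NL is trained on lawfully acquired Dutch data.",
        "Snellius is the Dutch national supercomputer."]
print(answer("What data is GPT-NL trained on?", docs))
```

The point Lensink made is visible even in this toy version: the quality of the output depends as much on what the retriever hands the model as on the model itself, which is why she noted that GPT-NL performs well "when given sufficient context."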

In terms of performance, the ambition is deliberately pragmatic. The team aims for a level comparable to earlier widely adopted models, not because that is the ceiling, but because it is sufficient to unlock meaningful use cases. In this context, “good enough” becomes a strategic choice rather than a limitation.

Training from scratch, on purpose

One of the defining decisions behind GPT-NL is that it was trained entirely from scratch on the Snellius supercomputer. This was not about reinventing the wheel, but about maintaining full control over what goes into the model.

By starting from zero, the team ensured that every piece of data could be traced, validated, and justified. The dataset itself reflects that philosophy. It consists of nearly two trillion tokens of text, all of which have been lawfully acquired. Extensive filtering was applied to remove sensitive or private information, resulting in a dataset that complies with strict European privacy standards.
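One ingredient of such filtering is scrubbing personal data from raw text before it enters the training set. The sketch below shows the general shape of that step; the two regex patterns are simplified examples chosen for illustration and are not the project's actual pipeline, which is far more extensive.

```python
# Illustrative PII-scrubbing step: replace matches of simple patterns
# (e-mail addresses, Dutch-style phone numbers) with placeholder tokens.
# Real pipelines combine many such rules with statistical detectors.
import re

PII_PATTERNS = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "<EMAIL>"),            # e-mail addresses
    (re.compile(r"\b(\+31|0)[\s-]?\d[\d\s-]{7,10}\b"), "<PHONE>"),  # phone numbers
]

def scrub(text: str) -> str:
    """Replace every match of each PII pattern with its placeholder."""
    for pattern, placeholder in PII_PATTERNS:
        text = pattern.sub(placeholder, text)
    return text

sample = "Contact jan@example.nl or call 06-12345678."
print(scrub(sample))  # → Contact <EMAIL> or call <PHONE>.
```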

This careful approach has not gone unnoticed. The project even received recognition for its privacy-first methodology, demonstrating that large-scale AI development and regulatory compliance need not be at odds. By publishing parts of the dataset and the underlying pipelines on GitHub, the team has also embraced transparency in a way that contrasts sharply with many commercial models.

A new model for data ownership

If the technical choices behind GPT-NL are notable, the economic model may be even more so. In a landscape where content creators increasingly push back against AI companies, GPT-NL has taken a collaborative route.

Dutch news organizations did not simply hand over their data. Instead, they entered into agreements that tie their contribution to future value creation. Professional users of GPT-NL will pay a license fee, part of which flows back to the data providers. This creates a shared incentive structure in which better data leads to a better model, which in turn generates more value for everyone involved.
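The mechanics of that incentive structure can be sketched in a few lines. Everything in the example below is hypothetical: the fee, the share flowing back to providers, the pro-rata split by tokens contributed, and the organization names are all invented to illustrate the idea, since the article does not disclose the actual terms.

```python
# Hypothetical illustration of the shared-incentive model: a fixed slice of
# each license fee is pooled and split among data providers in proportion
# to how much data each contributed. All numbers are invented.

def distribute(fee: float, provider_share: float,
               contributions: dict[str, float]) -> dict[str, float]:
    """Split the provider slice of a license fee pro rata by tokens contributed."""
    pool = fee * provider_share
    total = sum(contributions.values())
    return {name: pool * tokens / total for name, tokens in contributions.items()}

payout = distribute(fee=10_000, provider_share=0.25,
                    contributions={"news_org_a": 600e9, "news_org_b": 400e9})
print(payout)  # → {'news_org_a': 1500.0, 'news_org_b': 1000.0}
```

The design choice worth noticing is the feedback loop: because payouts scale with contribution, providers are rewarded for supplying more (and better) data, which is exactly the "better data leads to a better model" dynamic Lensink described.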

What emerged is an ecosystem rather than a one-off project. Data providers, technology developers, and end users are no longer separate actors but part of a shared system. According to Lensink, this dynamic is already visible, with organizations actively encouraging others to join and contribute.

Where it works, and where it doesn’t (yet)

Lensink was careful not to oversell the current state of GPT-NL. The model performs well in areas such as summarization and, when given sufficient context, in retrieval-based applications. At the same time, tasks like text simplification still require further refinement, and the model can struggle when used without clear input or structure.

That is not a flaw, but a reflection of its intended use. GPT-NL is not designed as a free-form chatbot for casual interaction. It is built for structured environments where the boundaries are known, and the stakes are higher. In those contexts, reliability and control often matter more than raw flexibility.

This makes the model particularly relevant for sectors such as government, finance, healthcare, and defense, where concerns around data sovereignty and compliance are not theoretical but operational realities.

From Dutch model to European ambition

At the time of the BrabantHack presentation, GPT-NL had reached a mid-stage level of maturity, with multiple feasibility studies running in parallel across different sectors. These collaborations are not just about testing the technology, but mainly about understanding where it truly adds value and where it needs to evolve.

The roadmap reflects that iterative approach. A broader public release is on the horizon, followed by a hosted version that will make the model more accessible. Beyond that, the ambition expands toward a European-scale initiative, tentatively referred to as GPT-EU.

That next phase will require more data, more partners, and more capabilities. It will also require something less tangible but equally important: a shared belief that Europe should not only regulate AI, but also build it.

The real takeaway: fit for purpose

During the Q&A, a question about defense applications captured the essence of the project. Could GPT-NL be adapted for such sensitive domains? Lensink’s answer was telling. The real value of GPT-NL, she argued, lies not in being a universal model but in being adaptable to specific contexts. Whether in healthcare, government, or defense, the goal is to create systems that are tailored, controlled, and aligned with users' needs.

In that sense, GPT-NL represents a shift in thinking. Instead of chasing the most powerful model, it focuses on building the most appropriate one.

A quiet shift in AI

As the session came to a close, the mood in the room had shifted. What started as a technical presentation had turned into something more reflective. The questions moved beyond features and performance toward hiring ("Yes, just send us your application"), collaboration, and long-term impact.

GPT-NL may not dominate benchmarks or headlines, but that is not its ambition. It is testing a different path: one in which trust, transparency, and alignment with societal values are treated as core features rather than afterthoughts.

In a field defined by scale and speed, that approach may seem modest. But in Eindhoven, late one evening after a long day of hacking, it felt quietly significant.