A Forgotten Manifesto: Mozilla Betrays Its Own Values on Open Source AI

The Mozilla Foundation (MoFo), “a global nonprofit dedicated to keeping the Internet a public resource that is open and accessible to all,” proudly proclaims “we’re reclaiming the internet, building trustworthy AI, and holding tech companies accountable.” And yet, when the Open Source Initiative (OSI) recently released its contentious Open Source AI Definition (OSAID), which does not require the “source” of AI models (i.e., the data), Mozilla celebrated it as “an important step forward for open source AI.”

This definition has already been roundly rejected across the industry: fellow Debian Linux developers consider it “obvious bulls—t”; the Free Software Foundation is working on its own definition; the Software Freedom Conservancy is running a single-issue ticket against the board; and even Bruce Perens, co-founder of the OSI (since exiled by it) and author of the Open Source Definition (OSD), argues the 25-year-old definition doesn’t need to be changed at all, as it can already be applied as-is to the source, including data (as with AI models). The data is the source for AI, and without it, we simply cannot ensure openness.

In response to Mozilla’s “naïve” letter in the name of “economists” — which despite making some great points sadly admitted “there are many ways to define open source AI” and caveated that “open source AI, appropriately defined, will generate many global economic benefits” — Perens had this to say:

Such a framework might have tremendous economic benefit, as you say, but it ignores the fact that the training data is the source code of the AI’s knowledge, and in an “Open Source” AI, should be as Open as the rest. So, build it if you wish, but please call it something other than “Open Source” [like Open Weight — ed.], so that we can go forward with truly Open Source AI systems, in which both the framework and the training data are open. Such things already exist and more will be built.

Indeed, researchers like Pleias and AI2 would like a word about models like OLMo 2, “the best fully open language model to date,” which include the data. Let’s also not forget that there was far less Open Source software when the definition was released a quarter century ago than there is today, so the excuse of defining a niche is unfounded.

While Mozilla acknowledged OSAID as “an initial attempt” that “will need refinement over time,” the OSI has had over a year — they claim two — to incorporate feedback on the critical issue of data openness but has steadfastly refused to address it. Instead, the OSI silenced its own community (which hasn’t seen a third-party post in nearly two months), rejecting proposals for consensus from respected community leaders in favor of a version that conveniently happens to appease its paying sponsors. Mozilla’s support for this debacle is particularly troubling given its reputation as a champion of openness and transparency.

For context, Mozilla published a Joint Statement on AI Safety and Openness last year, with 1,821 signatories agreeing that “when it comes to AI safety and security, openness is an antidote, not a poison.” And yet here they are poisoning the same antidote that previously pulled us out of the dark ages of proprietary software monopolies of the 80s and 90s and could do so again in a world of cloud-based closed AI models.

Mozilla researchers also released a paper (“Towards a Framework for Openness in Foundation Models”) which rightly started to address the need for both openness and completeness, echoing ideas found in the Linux Foundation’s Model Openness Framework. However, instead of building on these efforts to address the completeness issue, Mozilla’s leadership (though apparently not its employees or community members) has so far aligned itself with the flawed OSI process and its OSAID product. In this post I hope to trigger an internal discussion that will realign their voice with their values, and I encourage employees to take it up with management, who have so far been resistant to external input from folks like Perens (which is strange given Mozilla was in the room with him when Open Source was created back in 1998!).

When the MIT Technology Review breathlessly announced, “We finally have a definition for open-source AI,” Mozilla’s senior advisors rushed to the defense of the OSI and its OSAID, claiming the exclusion of data transparency was justified by “issues like copyright and data ownership” given “the lack of transparency about where training data comes from [which] has led to innumerable lawsuits against big AI companies”. But these challenges are precisely why meaningful openness in AI is essential. Without access to training data, it is impossible to verify vendors’ respect for others’ copyrighted works (which appears to be sorely lacking so far), or to assess, let alone address, critical issues like fairness, bias, and security. With more and more human output being tainted by unknown others’ copyright claims (including those of the content oligarchs) via closed AI models, it is particularly perplexing to find Mozilla on the wrong side of the negotiating table today.

Furthermore, freeware AI models released without the data are like sterile GM seeds: by preventing modification beyond narrow fine-tuning, they are incapable of forming the foundation of future generations. This prevents tomorrow’s developers from “standing on the shoulders” of today’s, and renders client companies and countries captive consumers of their vendors (for however long support of said models suits the vendors’ business model, after which they find themselves at a dead end). This violates the very spirit of Open Source, which goes beyond the ability to use and share software, guaranteeing users the unfettered freedoms to both study and modify it.

More and more software is being written by, and incorporating, AI (including at Mozilla.ai, which claims to be “democratizing open-source AI to solve real user problems”), so this flawed definition risks undermining not only Open Source AI but all Open Source software, especially if the OSI predictably tries to “harmonise” the two definitions in the new year. That’s why we launched the Open Source Declaration: to lock in the Open Source Definition at v1.9 and put it out of harm’s way.

The Mozilla Manifesto

Let’s take a closer look at how Mozilla’s continued support for OSI’s OSAID contradicts its own manifesto, examining point-by-point how their principles stack up against Mozilla’s own stance on Open Source AI, starting with its addendum, a Pledge for a Healthy Internet:

Principle 0: “The open, global internet is the most powerful communication and collaboration resource we have ever seen. It embodies some of our deepest hopes for human progress. It enables new opportunities for learning, building a sense of shared humanity, and solving the pressing problems facing people everywhere. Over the last decade we have seen this promise fulfilled in many ways. We have also seen the power of the internet used to magnify divisiveness, incite violence, promote hatred, and intentionally manipulate fact and reality. We have learned that we should more explicitly set out our aspirations for the human experience of the internet. We do so now.”

While already being [self-]promoted as a success, the OSI’s chosen process used the wrong tool in the wrong way for the wrong job. It should be treated by the “co-design” community as a catastrophic failure to be learned from in a post-mortem, not as a case study to be replicated elsewhere, and certainly not anywhere else within the technology industry. The rights of the minorities carefully selected to participate in the process (over experts in artificial intelligence or open source, let alone both) were trampled on by its facilitators, even though co-design’s inventors “see the role of the designer as a facilitator rather than an expert”. These participants absolutely need the data to deal with the biases present in today’s AI, and have either been told they don’t, or have been misrepresented.

Principle 1: The internet is an integral part of modern life—a key component in education, communication, collaboration, business, entertainment, and society as a whole.

Even now this is increasingly the case, but how is this principle served by embracing opaque, black-box AI models that consolidate power in the hands of a few corporations? These “fauxpensource” models undermine the collaborative spirit of the internet. Let’s not forget that the release of the source of Netscape Navigator (codenamed Mozilla) in 1998 was the spark that triggered the Open Source revolution!

Principle 2: The internet is a global public resource that must remain open and accessible.

The OSAID is fundamentally at odds with this principle. By excluding data from its definition of “open,” the OSI perpetuates a system where only the privileged few have access to the tools and resources necessary to build and innovate. Mozilla must stand against such closed models if it is to be true to its own manifesto. There are no participation prizes, and nearly good enough is not good enough; the definition must function as a litmus test for what is “open and accessible” or it will deliver neither.

Principle 3: The internet must enrich the lives of individual human beings.

AI has the potential to enrich lives, but only if it is open, transparent, and accountable. Closed AI models risk creating a future where individuals are mere consumers of pre-packaged, opaque systems controlled by a cabal of capitalists. Far from enriching lives, models that conceal biases can destroy them, denying opportunities like loans and employment to those who need them most — including, ironically, the OSI’s own minorities carefully selected as malleable “co-designers”.

Principle 4: Individuals’ security and privacy on the internet are fundamental and must not be treated as optional.

Privacy and security are intrinsically linked, and transparency in training data is critical for ensuring both privacy and security in AI systems. I’ve already covered how OSAID is Turning Open Source AI Insecurity up to 11, and access to the data is the only way to give users a chance of finding the needles in the haystack. The idea that closed models known for exhibiting recall with photographic memory can be used to conceal personal information, medical data, and even CSAM is dangerous nonsense. Anyone making the claim that this is safe either doesn’t understand the technology and shouldn’t be involved in its definition, or does and is deliberately misleading their followers (including Mozilla) and shouldn’t be involved in its definition.

Principle 5: Individuals must have the ability to shape the internet and their own experiences on it.

Closed AI models limit individuals’ ability to understand, let alone shape, their digital experiences. Open Source AI models that include the data are essential for empowering users to innovate and customise AI to meet their needs. Without the data, users have no way to study a model to verify vendors’ claims regarding content (most models today having been built on the backs of others’ copyrights), and their ability to modify it is restricted to fine-tuning. Describing data (OSAID’s infamous “data information”) instead of delivering it doesn’t help, because a recipe calling for “unobtainium” (like the Facebook/Instagram social graph data incorporated into Meta’s Llama, raising questions about user consent and restricting its deployment in Europe) cannot be reproduced. If I use their recipe to train on texts with my mother, I’m not going to end up with Llama (but I’m required to call it that anyway by their decidedly non-free terms).

Principle 6: The effectiveness of the internet as a public resource depends upon interoperability (protocols, data formats, content), innovation, and decentralized participation worldwide.

OSAID’s exclusion of data from its definition undermines interoperability and decentralisation, locking the future of AI into proprietary silos. Furthermore, the OSI’s own role in its release undermines its self-declared unilateral “stewardship” of the Open Source Definition, which effectively ended when it lost community consensus. In today’s multipolar world, relying on a single country’s standard-bearer when Open Source is vital to all countries is no longer sustainable. We must usher in an era of global governance, giving a voice and sovereignty to all stakeholders affected by such decisions. Mozilla should be part of that effort, not working against it.

Principle 7: Free and open source software promotes the development of the internet as a public resource.

The Digital Public Goods Alliance (DPGA) rightly emphasises the role of Open Data in AI systems as essential for digital public goods. Mozilla should champion this position instead of supporting OSAID’s diluted approach: its own manifesto is far more aligned with the DPGA’s views on The Role of Open Data in AI systems as Digital Public Goods and their decision “to continue requiring open training data for AI systems to be considered DPGs”.

Principle 8: Transparent community-based processes promote participation, accountability, and trust.

The OSI’s OSAID process was neither transparent nor community-driven. Participation was limited to select “co-designers” drawn from minorities in an act of performative diversity that actively excluded established experts (see So, you want to write about the OSI’s Open Source AI Definition (OSAID)…). Regular references to LGBTQ+ and specifically trans issues echo rainbow capitalism, and the result deprives all participants of access to the data they need to assess and address biases against them. If nothing else, the rampant gaslighting and harassment that went on is not something Mozilla should be associating itself with. A former OSI board member once stated that “a process that is not open cannot be trusted to produce a product that can be considered open”, and their chosen “co-design” process was anything but, per The New Stack’s “The Case Against OSI’s Open Source AI Definition”.

Principle 9: Commercial involvement in the development of the internet brings many benefits; a balance between commercial profit and public benefit is critical.

Mozilla must get back to its roots and resist the temptation to prioritise sponsor-friendly definitions over public benefit, despite (or perhaps because of) recent events ending the era of its being propped up by Google as evidence of competition. While the OSI was founded to find a balance between free and proprietary software, that balance risks tipping way too far in favor of commercial interests with OSAID; given it defines four classes of data only to accept any of them (or none), the OSI finds itself at the extreme closed end of the spectrum. Whether there are actual conflicts of interest at the OSI or merely the appearance of them, the trustworthiness of Mozilla’s brand is under threat, and no amount of rebranding will save it if it doesn’t choose its allegiances more carefully.

Principle 10: Magnifying the public benefit aspects of the internet is an important goal, worthy of time, attention, and commitment.

With AI invading all aspects of the Internet, from its design and operation to coding and execution, as well as content and consumption, one could substitute artificial intelligence for internet in this principle: “Magnifying the public benefit aspects of artificial intelligence is an important goal”. AI can be used as a force for good (for example, to assist people in their work) or evil (to replace them), and current discussions about restricting Open Source AI with regulations constitute an existential risk to it. If Mozilla truly believes that Open Source AI is a worthy pursuit, then it must work to ensure it is actually open.

Looking to the Future

It is imperative for Mozilla to reconsider its support for the OSI and OSAID, ideally revoking its apparently unsupported endorsement. Instead, Mozilla should advocate for a truly open and community-driven definition of Open Source AI, one that includes data transparency and better aligns with the principles of the Open Source Definition; ideally the Open Source Definition itself, either as-is at v1.9, or patched to explicitly cover the completeness dimension (i.e., data).

This is not merely a matter of semantics; it is about preserving the integrity of the open internet and ensuring that the future of AI is one of openness and accountability. Mozilla has the opportunity to lead by example — now it must have these difficult discussions and rise to the challenge. Or not, and risk fading into obscurity.