#bloodpressure 130 80 #pulse 75
#openWeights #opensource
108.6 kg #openWeights #opensource (measured after breakfast)
108.0 kg #openweights #opensource
#OpenAI signals it plans to release a new open-weight AI model - joining #Meta and #Mistral in the race to make #AI more open and accessible. Great rundown by @lizziegibney.bsky.social.
#OpenWeights #LLM #AGI #ResearchSky
‘AI models are capable of nove...
105.0 kg #openWeights #opensource
103.2 kg #openWeights #opensource
My current weight is 101.0 kg.
#openWeights #opensource
🎤🎙️ Behold, the latest marvel: an open-weights #TTS model so realistic, it's like having a conversation with your fridge 🤖. Yet another GitHub project where the buzzwords outweigh the substance—because who needs meaningful human dialogue when AI can babble for us? 🙄 #InnovationOrIllusion
https://github.com/nari-labs/dia #OpenWeights #AIConversation #TechBuzz #GitHubProjects #HackerNews #ngated
Dia, an open-weights TTS model for generating realistic dialogue
https://github.com/nari-labs/dia
#HackerNews #Dia #TTS #openweights #dialogue #realistic #AItechnology #texttospeech
Skywork-OR1: new SOTA 32B thinking model with open weights
https://github.com/SkyworkAI/Skywork-OR1
#HackerNews #SkyworkOR1 #SOTA #AI #Model #OpenWeights #MachineLearning #Innovation
Meta promised a 10M-token AI. The reality? Junk summaries, hardware nightmares, and disappointed devs. Here's what went wrong with Llama 4. #Llama4 #AIhype #OpenWeights
https://geekoo.news/metas-llama-4-hype-hits-a-wall-of-reality/
There haven't been many times when I've said something positive about OpenAI, but they managed to call their upcoming AI model open-weights instead of pretending it's open-source like so many others.
#opensource #openweights #AI
https://siliconangle.com/2025/03/31/openai-launch-first-open-weights-model-since-2019/
"While DeepSeek has been very non-specific about just what kind of code it will be sharing, an accompanying GitHub page for "DeepSeek Open Infra" promises the coming releases will cover "code that moved our tiny moonshot forward" and share "our small-but-sincere progress with full transparency." The page also refers back to a 2024 paper detailing DeepSeek's training architecture and software stack.
The move threatens to widen the contrast between DeepSeek and OpenAI, whose market-leading ChatGPT models remain completely proprietary, making their inner workings opaque to outside users and researchers. The open source release could also help provide wider and easier access to DeepSeek even as its mobile app is facing international restrictions over privacy concerns."
open source vs open weights
A little link like this https://intelligence-artificielle.developpez.com/actu/356012/L-equilibre-delicat-entre-securite-et-innovation-dans-l-IA-bannir-les-modeles-open-weights-serait-un-desastre-selon-un-chercheur-l-administration-Biden-envisage-de-bloquer-l-acces-a-ces-modeles-afin-d-eviter-les-abus/ to understand the nuance between open source and open weights (for LLMs),
and thus that most AI models, including DeepSeek R-1, are open weights, not open source.
Training Data: The Source Code of the AI Era
In software development, we've long understood the distinction between source code and compiled binaries. Source code is what programmers write - the human-readable instructions that define a program's behavior. When we compile this code, we transform it into machine code that computers can execute efficiently. This compilation process necessarily obscures the original logic, making the compiled binary much harder to understand than the source code.
This familiar process offers a powerful new way to think about AI models and transparency: training data is the true source code of AI systems, and the training process is compilation.
Why Training Data is Source Code
Just as programmers express their intentions through source code, AI developers express their goals through carefully curated training data. This data encodes the patterns, relationships, and behaviors we want our AI models to learn. The choice of what to include or exclude from the training set is analogous to a programmer's decisions about what functionality to implement in their code.
Consider what training data actually represents: it's the human-selected examples that define what we want the model to learn. When we include certain types of examples and exclude others, we're essentially "programming" the model's behavior. When we clean and preprocess the data in specific ways, we're writing the instructions that will shape how the model understands its task.
Training as Compilation
The training process, then, is remarkably similar to compilation. Just as a compiler transforms human-readable source code into optimized machine instructions, the training process transforms human-curated training data into optimized model weights. Both processes:
- transform a human-readable input into an efficient, machine-executable artifact
- optimize aggressively for execution, discarding the structure and intent that shaped the input
- produce an output that is extremely difficult to reverse back into its source
The Distribution Misconception
This perspective reveals something crucial about the current practice of releasing "open weights": it's fundamentally about distribution, not transparency. When companies release model weights while keeping their training data private, they're doing exactly what software companies do when they release compiled binaries without source code - they're providing a way to use the technology without revealing how it was created.
Just as having access to a compiled binary doesn't tell you much about how the program was designed, having access to model weights doesn't give you real insight into the training data that shaped the model's behavior. The weights, like a binary, are the end product of a transformation process that has obscured the original "programming" - the carefully selected training data.
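This information loss is easy to demonstrate at toy scale. The following sketch (illustrative only; real model training is vastly more complex) "compiles" a handful of hand-picked examples into a weight vector using NumPy's least-squares solver - after which the curation decisions behind those examples are no longer visible in the weights:

```python
import numpy as np

# "Source code": hand-curated training examples chosen by a human.
X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([2.0, 4.0, 6.0, 8.0])  # the examples encode the pattern y = 2x

# "Compilation": training optimizes weights to fit the curated data.
X_b = np.hstack([X, np.ones_like(X)])        # add a bias column
w, *_ = np.linalg.lstsq(X_b, y, rcond=None)  # least-squares "training"

# The "binary": an opaque weight vector, stripped of every curation decision.
print(w)  # approximately [2. 0.]
```

The weights fully determine the model's behavior, yet nothing in them records which examples were chosen, how they were cleaned, or what was excluded.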
Implications for True AI Transparency
Understanding training data as source code clarifies what real AI transparency would require. Just as true software transparency means access to source code, true AI transparency would require access to training data. This is why many companies are hesitant to provide it - the training data, like source code, represents their core intellectual property and competitive advantage.
This also explains why examining model weights, while valuable for understanding certain aspects of AI systems, can never provide complete transparency. The weights are like a compiled binary - they contain the instructions for execution but have lost much of the context and intent that went into their creation.
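A concrete way to see why weights can never be "decompiled" back into their training data: entirely different datasets can compile down to identical weights. Another toy NumPy sketch, under the same illustrative assumptions as above:

```python
import numpy as np

def train(X, y):
    """Least-squares 'training': returns a (slope, bias) weight vector."""
    X_b = np.hstack([X, np.ones_like(X)])
    w, *_ = np.linalg.lstsq(X_b, y, rcond=None)
    return np.round(w, 6)

# Two disjoint training sets that happen to encode the same pattern, y = 2x.
w_a = train(np.array([[1.0], [2.0]]), np.array([2.0, 4.0]))
w_b = train(np.array([[30.0], [40.0]]), np.array([60.0, 80.0]))

# The resulting weights are indistinguishable, so no analysis of the
# weights alone can recover which dataset produced them.
print(np.allclose(w_a, w_b))  # True
```

Mapping weights back to a unique training set is impossible in principle, not merely difficult in practice.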
Moving Forward
This reframing suggests we need to be more precise in our discussions about AI transparency. When companies release model weights, they're not really participating in "open source" AI - they're simply choosing a distribution method that makes their technology more accessible while protecting their core IP (the training data).
If we want genuine transparency in AI development, we need to focus on the "source code" - the training data. This might mean:
- releasing training datasets alongside model weights
- documenting how the data was collected, filtered, and curated
- giving researchers auditable access to training corpora, much as the open source movement standardized access to code
Understanding that training data is the true source code of AI systems helps clarify these challenges and points the way toward meaningful solutions. It suggests that the path to genuine AI transparency might look more like the open source software movement than current "open weights" initiatives.
The next time you hear about a company releasing their model weights, remember: you're getting the compiled binary, not the source code. The real question is: how do we move toward a future where AI systems can be truly understood through their training data, just as we understand programs through their source code?
Our CEO @AmandaBrock comments to @forbes on the launch of an Open Weights Definition by the Open Source Alliance https://www.forbes.com/sites/adrianbridgwater/2025/01/22/open-weight-definition-adds-balance-to-open-source-ai-integrity in the run-up to AI Openness day at SOOCon25 on 5th February and the French-led AI Action Summit on 10 February. #AI #openweights #openmodels #opensource #opensourceai
@silentexception could you fork it to use common corpus or dolma to train it?
https://huggingface.co/blog/Pclanglais/two-trillion-tokens-open
@fredbrooker@witter.cz not even the 3 trillion tokens worth of data from dolma?
Or, the more recent, 2 trillion tokens in common corpus?
https://huggingface.co/blog/Pclanglais/two-trillion-tokens-open
@silentexception and I'm trying to dig deeper into that very same problem.
I'm not trying to disprove or dispute.
We have thus far established 2 hindrances to forking Llama:
1. Licensing terms
2. Prohibitive costs of infrastructure to train
@silentexception what (other than licensing terms) would enable this forking?