Introducing Mistral Small 4 | Mistral AI
LLM Architecture Gallery
Short Doco: How LLMs Took Over The World - Everything is a Pattern
I am wondering: isn't the path big AI corps are taking, serving models from huge server farms, rather at odds with how capitalism usually works?
Normally costs fall over time (see solar or microchips). LLMs keep getting smaller, and suddenly they fit on your own device.
I checked OVH Cloud's hosted model offerings. They would all fit on a 64 GB Strix Halo, probably even in 32 GB of RAM. The SOTA models still have an edge, but honestly not much.
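As a rough sanity check on the "fits in 64 GB" claim, here is a minimal back-of-the-envelope sketch. The formula (weights ≈ parameters × bits per weight / 8, plus a ~10% fudge factor for KV cache and runtime buffers) and the parameter/quantization pairs are my assumptions, not OVH's figures; real usage varies with context length and runtime.

```python
def model_memory_gb(params_b: float, bits_per_weight: float, overhead: float = 0.10) -> float:
    """Rough RAM/VRAM estimate for running a quantized model.

    params_b: parameter count in billions.
    bits_per_weight: e.g. 16 (fp16), 8 (Q8_0), ~4.5 (Q4_K_M-style quants).
    overhead: assumed fudge factor for KV cache and runtime buffers.
    """
    weights_gb = params_b * bits_per_weight / 8  # billions of params x bytes per weight
    return weights_gb * (1 + overhead)

# Hypothetical sizes: dense models at a ~4.5-bit quant vs. 8-bit
for params, bits in [(8, 4.5), (32, 4.5), (70, 4.5), (70, 8)]:
    print(f"{params}B @ {bits} bpw ≈ {model_memory_gb(params, bits):.1f} GB")
```

By this estimate a 70B model at ~4.5 bits per weight needs about 43 GB, which is why it fits a 64 GB Strix Halo, while a 32B model at the same quant lands around 20 GB and squeezes into 32 GB of RAM.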
CanIRun.ai — Can your machine run AI models?
llama.cpp + MCP - Docker and more
How to... (Maybe I am missing something)
Guide to run Qwen3.5 locally
Anyone interested in AI radio?
A possible hardware solution for ultra-fast (73x faster than H200) self-hosted small models that is not dependent on RAM
PewDiePie trains his own AI
llmfit - find the best model that runs on your computer
Smaller Qwen3.5 models released
ollama 0.17 Released With Improved OpenClaw Onboarding
ggml.ai (the founding team of llama.cpp) joins Hugging Face to ensure the long-term progress of Local AI
Artificial Analysis' Intelligence Index and cost benchmarks are useful guides for deciding which models to use. Analysis of the top models.
I Sold Out for $20 a Month and All I Got Was This Perfectly Generated Terraform
Qwen3.5: Towards Native Multimodal Agents
gpt-oss:20b running in your browser thanks to Transformers.js v4 and ONNX Runtime Web
cedric (@cedric_chee)
MiniMax M2.5 has 230B total parameters but only 10B active, small enough to make it a strong candidate for the homelab. The M2 series has been popular in the LocalLLaMA community, and the author hopes inference providers push TPS to the limit to show what it can do.
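To see why a 230B-total / 10B-active MoE suits the homelab, here is a minimal sketch of the usual rule of thumb; the figures (4-bit quantization, ~256 GB/s of memory bandwidth) are my assumptions, not measured MiniMax numbers. Memory scales with total parameters, while decode speed is roughly capped by bandwidth divided by the active-weight bytes read per token.

```python
def moe_estimates(total_b: float, active_b: float, bits_per_weight: float,
                  mem_bandwidth_gbs: float) -> tuple[float, float]:
    """Back-of-the-envelope MoE sizing (assumed figures, not a benchmark).

    Returns (weights_gb, decode_tps_ceiling). Decode speed is approximated as
    bandwidth / active-weight-bytes-per-token, ignoring KV cache and attention.
    """
    bytes_per_weight = bits_per_weight / 8
    weights_gb = total_b * bytes_per_weight            # every expert stays resident
    active_gb_per_token = active_b * bytes_per_weight  # only routed experts are read
    return weights_gb, mem_bandwidth_gbs / active_gb_per_token

# Hypothetical: 230B total / 10B active at 4-bit on a ~256 GB/s homelab box
mem, tps = moe_estimates(230, 10, 4, 256)
print(f"~{mem:.0f} GB of weights, up to roughly {tps:.0f} tok/s decode ceiling")
```

The point is that the 230B footprint sets the RAM requirement (~115 GB at 4-bit), but per-token cost looks like a 10B model, which is what makes high TPS plausible on modest hardware.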