Testing local LLMs on a used GPU (RTX 3060/12GB), part 2: trying TinyLlama 1.1B with llama.cpp
https://qiita.com/nabe2030/items/15e7b6cffd46fafb34d4?utm_campaign=popular_items&utm_medium=feed&utm_source=popular_items
#MistralSmall24B-Instruct is a really nice model to run locally for coding advice, summarization, or creative writing.
With a recent #llama_cpp build on a #GeForce #RTX4090 at Q8, the 24GB of VRAM is nearly maxed out, and I am seeing text generation at 7-9 tokens/s.
https://huggingface.co/mistralai/Mistral-Small-24B-Instruct-2501
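For anyone wanting to reproduce that setup, here is a minimal sketch using the llama-cpp-python bindings, assuming a CUDA build and a local Q8_0 GGUF of the model (the filename is a placeholder):

```python
# Minimal sketch: full GPU offload of a Q8_0 Mistral-Small GGUF.
# Assumes llama-cpp-python built with CUDA support; the path is hypothetical.
from llama_cpp import Llama

llm = Llama(
    model_path="Mistral-Small-24B-Instruct-2501-Q8_0.gguf",  # placeholder path
    n_gpu_layers=-1,  # offload all layers; a Q8 24B model roughly fills 24GB VRAM
    n_ctx=4096,       # larger contexts need additional VRAM for the KV cache
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize the GGUF format in two sentences."}],
    max_tokens=256,
)
print(out["choices"][0]["message"]["content"])
```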
Llama.cpp guide – Running LLMs locally on any hardware, from scratch
https://steelph0enix.github.io/posts/llama-cpp-guide/
#ycombinator #llama_cpp #llama #cpp #llm #building #running #guide #inference #local #scratch #hardware
Quickly building an LLM app with Streamlit using Llama-3-ELYZA-JP-8B
https://qiita.com/sat01m0/items/9a7c07e80afa4b121e35?utm_campaign=popular_items&utm_medium=feed&utm_source=popular_items
#qiita #Streamlit #LLM #llama_cpp #ELYZA #Llama_3_ELYZA_JP_8B
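The article is in Japanese, but the pattern it covers is compact enough to sketch. A minimal, hypothetical version with llama-cpp-python (the GGUF path is a placeholder, not the article's actual code):

```python
# Minimal Streamlit + llama-cpp-python chat sketch (save as app.py,
# run with `streamlit run app.py`). The model path is a placeholder.
import streamlit as st
from llama_cpp import Llama

@st.cache_resource  # load the model once per server process, not on every rerun
def load_model():
    return Llama(model_path="Llama-3-ELYZA-JP-8B-q4_k_m.gguf", n_ctx=4096)

st.title("Local LLM chat")
prompt = st.text_input("Ask something:")
if prompt:
    out = load_model().create_chat_completion(
        messages=[{"role": "user", "content": prompt}], max_tokens=512
    )
    st.write(out["choices"][0]["message"]["content"])
```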
FYI GGUF is now following a naming convention of `<Model>-<Version>-<ExpertsCount>x<Parameters>-<EncodingScheme>-<ShardNum>-of-<ShardTotal>.gguf`
https://github.com/ggerganov/ggml/blob/master/docs/gguf.md#gguf-naming-convention
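Names following this convention look like Mixtral-v0.1-8x7B-Q2_K.gguf (the experts count and shard suffix are optional). As a sketch, the convention can be parsed with a regex along these lines; note it assumes hyphen-free model names, so treat it as illustrative rather than an official parser:

```python
# Sketch: parsing the quoted GGUF naming convention with a regex.
# Limitation: the model name itself must not contain hyphens.
import re

GGUF_NAME = re.compile(
    r"^(?P<model>[A-Za-z0-9_.]+)"                        # e.g. Mixtral
    r"-(?P<version>v\d+(?:\.\d+)*)"                      # e.g. v0.1
    r"-(?:(?P<experts>\d+)x)?"                           # optional experts count, e.g. 8x
    r"(?P<parameters>\d+(?:\.\d+)?[KMBT])"               # e.g. 7B
    r"-(?P<encoding>[A-Za-z0-9_]+)"                      # e.g. Q2_K
    r"(?:-(?P<shard>\d{5})-of-(?P<shard_total>\d{5}))?"  # optional shard suffix
    r"\.gguf$"
)

for name in ("Mixtral-v0.1-8x7B-Q2_K.gguf",
             "Grok-v1.0-100B-Q4_0-00003-of-00009.gguf"):
    m = GGUF_NAME.match(name)
    print(name, "->", m.groupdict() if m else "no match")
```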
Just merged a metadata override for adding custom authorship information to #gguf metadata in https://github.com/ggerganov/llama.cpp/issues/7165. If you are a model weight distributor, you may want to take note of this so that your models are easier to search for on #huggingface
#AI #LLMs #llama_cpp #safetensors
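For reference, the override is supplied as a small JSON file passed to the converter's --metadata flag. A hedged sketch of what that could look like; the exact key names are defined in the issue and in gguf-py, so treat the fields below as assumptions:

```python
# Hedged sketch of an authorship metadata override file for the GGUF converter.
# Key names are assumptions based on the gguf "general.*" metadata fields;
# check issue #7165 / gguf-py for the authoritative set.
import json

overrides = {
    "general.name": "MyModel",                     # hypothetical model name
    "general.author": "Jane Doe",                  # hypothetical author
    "general.version": "v1.0",
    "general.url": "https://example.com/mymodel",  # hypothetical URL
    "general.license": "apache-2.0",
}

with open("metadata.json", "w") as f:
    json.dump(overrides, f, indent=2)

# Then pass it to the converter, along the lines of:
#   python convert_hf_to_gguf.py <model_dir> --metadata metadata.json
```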
Thanks to Josh Ramer for contributing a debug helper script to #llama_cpp that makes it easy to debug a specific test in GDB. This should improve the maintainer experience and help keep the llama.cpp project stable!
To use this helper script, refer to this document for further guidance https://github.com/ggerganov/llama.cpp/blob/master/docs/debugging-tests.md
Proposing a metadata override and a default naming scheme for the files generated when converting #safetensors to #gguf.
Requesting feedback on whether what I have makes sense for everyone: https://github.com/ggerganov/llama.cpp/issues/7165
This is most relevant for model creators.
#LLMs #llama_cpp #llamafiles #AI
Running RakutenAI-7B-chat, a high-performance Japanese LLM, on a Raspberry Pi 5
https://qiita.com/susumuota/items/66585baa0ec8e38e3367?utm_campaign=popular_items&utm_medium=feed&utm_source=popular_items
Anyone happen to know the correct prompt format for #mixtral 8x7b? I'm not having luck with the [INST] style in llama.cpp. #llm #llama_cpp #MistralAI #mistral
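For what it's worth, the format on the Mistral AI model card wraps each user turn in [INST] ... [/INST], with a leading BOS and an EOS after each assistant reply. A sketch of building it by hand:

```python
# Sketch of the Mixtral-8x7B-Instruct prompt format:
#   <s>[INST] user [/INST] assistant</s>[INST] user [/INST]
def build_mixtral_prompt(turns):
    """turns: list of (user_message, assistant_reply_or_None) pairs."""
    prompt = "<s>"
    for user, assistant in turns:
        prompt += f"[INST] {user} [/INST]"
        if assistant is not None:
            prompt += f" {assistant}</s>"
    return prompt

print(build_mixtral_prompt([
    ("What is GGUF?", "A binary file format for llama.cpp models."),
    ("Who maintains it?", None),  # the model completes this open turn
]))
```

One pitfall worth checking: llama.cpp usually prepends the BOS token itself during tokenization, so including a literal <s> in the prompt text can end up as a doubled BOS (or as plain text), which noticeably degrades output.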
tonight's project was to build llama.cpp.
to get a sense of what starting from scratch feels like, i built a quick chatbot using the llama 13B foundation model, quantized to 4 bits.
The following is a conversation with an AI research assistant. The assistant's tone is angry and always replies in ALL CAPS.
Human: Hello, who are you?
AI: WHY ARE YOU WASTING MY TIME?
Human: Can you tell me about the creation of blackholes?
AI: THERE'S NO SUCH THING AS BLACK HOLES. THERE IS NO SUCH THING AS SPACE.
Human: Oh...
AI: I HAVE NOTHING TO SAY TO YOU.
Human:
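The original post used the llama.cpp binary itself; roughly the same loop, sketched with the llama-cpp-python bindings instead (the model path and quantization are placeholders):

```python
# Sketch of the few-shot chatbot loop; any 4-bit llama GGUF would do here.
from llama_cpp import Llama

llm = Llama(model_path="llama-13b-q4_0.gguf", n_ctx=2048)  # placeholder path

history = ("The following is a conversation with an AI research assistant. "
           "The assistant's tone is angry and always replies in ALL CAPS.\n")
while True:
    user = input("Human: ")
    history += f"Human: {user}\nAI:"
    out = llm(history, max_tokens=128, stop=["Human:"])  # stop before the next turn
    reply = out["choices"][0]["text"].strip()
    print("AI:", reply)
    history += f" {reply}\n"
```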
Want to run an LLM even on an underpowered PC?! An introduction to llama.cpp
https://developers.cyberagent.co.jp/blog/archives/45308/
[Llama.cpp] A thorough comparison of how GGUF quantization levels change the generated text! [houou-7b]
https://qiita.com/keisuke-okb/items/b8092ed946bcf3864295?utm_campaign=popular_items&utm_medium=feed&utm_source=popular_items