Testing local LLMs on a used GPU (RTX 3060/12GB), part 2: trying TinyLlama 1.1B with llama.cpp
https://qiita.com/nabe2030/items/15e7b6cffd46fafb34d4?utm_campaign=popular_items&utm_medium=feed&utm_source=popular_items
#MistralSmall24B-Instruct is a really nice model to run locally for coding advice, summarization, or creative writing.
With a recent #llama_cpp build on a #GeForce #RTX4090 at Q8, the 24GB of VRAM is nearly maxed out, and I am seeing text generation at 7-9 tokens/s.
https://huggingface.co/mistralai/Mistral-Small-24B-Instruct-2501
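For anyone wanting to reproduce that setup, here is a minimal sketch using the llama-cpp-python bindings, assuming a CUDA build and a local Q8_0 GGUF of the model (the filename is a placeholder):

```python
# Minimal sketch: full GPU offload of a Q8_0 Mistral-Small GGUF.
# Assumes llama-cpp-python built with CUDA support; the path is hypothetical.
from llama_cpp import Llama

llm = Llama(
    model_path="Mistral-Small-24B-Instruct-2501-Q8_0.gguf",  # placeholder path
    n_gpu_layers=-1,  # offload all layers; a Q8 24B model roughly fills 24GB VRAM
    n_ctx=4096,       # larger contexts need additional VRAM for the KV cache
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize the GGUF format in two sentences."}],
    max_tokens=256,
)
print(out["choices"][0]["message"]["content"])
```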
Llama.cpp guide – Running LLMs locally on any hardware, from scratch
https://steelph0enix.github.io/posts/llama-cpp-guide/
#ycombinator #llama_cpp #llama #cpp #llm #building #running #guide #inference #local #scratch #hardware
Quickly building an LLM app with Streamlit using Llama-3-ELYZA-JP-8B
https://qiita.com/sat01m0/items/9a7c07e80afa4b121e35?utm_campaign=popular_items&utm_medium=feed&utm_source=popular_items
#qiita #Streamlit #LLM #llama_cpp #ELYZA #Llama_3_ELYZA_JP_8B
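The article is in Japanese, but the pattern it covers is compact enough to sketch. A minimal, hypothetical version with llama-cpp-python (the GGUF path is a placeholder, not the article's actual code):

```python
# Minimal Streamlit + llama-cpp-python chat sketch (save as app.py,
# run with `streamlit run app.py`). The model path is a placeholder.
import streamlit as st
from llama_cpp import Llama

@st.cache_resource  # load the model once per server process, not on every rerun
def load_model():
    return Llama(model_path="Llama-3-ELYZA-JP-8B-q4_k_m.gguf", n_ctx=4096)

st.title("Local LLM chat")
prompt = st.text_input("Ask something:")
if prompt:
    out = load_model().create_chat_completion(
        messages=[{"role": "user", "content": prompt}], max_tokens=512
    )
    st.write(out["choices"][0]["message"]["content"])
```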
FYI GGUF is now following a naming convention of `<Model>-<Version>-<ExpertsCount>x<Parameters>-<EncodingScheme>-<ShardNum>-of-<ShardTotal>.gguf`
https://github.com/ggerganov/ggml/blob/master/docs/gguf.md#gguf-naming-convention
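Names following this convention look like Mixtral-v0.1-8x7B-Q2_K.gguf (the experts count and shard suffix are optional). As a sketch, the convention can be parsed with a regex along these lines; note it assumes hyphen-free model names, so treat it as illustrative rather than an official parser:

```python
# Sketch: parsing the quoted GGUF naming convention with a regex.
# Limitation: the model name itself must not contain hyphens.
import re

GGUF_NAME = re.compile(
    r"^(?P<model>[A-Za-z0-9_.]+)"                        # e.g. Mixtral
    r"-(?P<version>v\d+(?:\.\d+)*)"                      # e.g. v0.1
    r"-(?:(?P<experts>\d+)x)?"                           # optional experts count, e.g. 8x
    r"(?P<parameters>\d+(?:\.\d+)?[KMBT])"               # e.g. 7B
    r"-(?P<encoding>[A-Za-z0-9_]+)"                      # e.g. Q2_K
    r"(?:-(?P<shard>\d{5})-of-(?P<shard_total>\d{5}))?"  # optional shard suffix
    r"\.gguf$"
)

for name in ("Mixtral-v0.1-8x7B-Q2_K.gguf",
             "Grok-v1.0-100B-Q4_0-00003-of-00009.gguf"):
    m = GGUF_NAME.match(name)
    print(name, "->", m.groupdict() if m else "no match")
```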
Just merged a metadata override for adding custom authorship information to #gguf metadata in https://github.com/ggerganov/llama.cpp/issues/7165. If you are a model weight distributor, you may want to take note of this so that your models are easier to search for on #huggingface
#AI #LLMs #llama_cpp #safetensors
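For reference, the override is supplied as a small JSON file passed to the converter's --metadata flag. A hedged sketch of what that could look like; the exact key names are defined in the issue and in gguf-py, so treat the fields below as assumptions:

```python
# Hedged sketch of an authorship metadata override file for the GGUF converter.
# Key names are assumptions based on the gguf "general.*" metadata fields;
# check issue #7165 / gguf-py for the authoritative set.
import json

overrides = {
    "general.name": "MyModel",                     # hypothetical model name
    "general.author": "Jane Doe",                  # hypothetical author
    "general.version": "v1.0",
    "general.url": "https://example.com/mymodel",  # hypothetical URL
    "general.license": "apache-2.0",
}

with open("metadata.json", "w") as f:
    json.dump(overrides, f, indent=2)

# Then pass it to the converter, along the lines of:
#   python convert_hf_to_gguf.py <model_dir> --metadata metadata.json
```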
Thanks to Josh Ramer for contributing a debug helper script to #llama_cpp that makes it easy to debug a specific test in GDB. This should improve the maintainer experience and help keep the llama.cpp project stable!
To use this helper script, refer to this document for further guidance https://github.com/ggerganov/llama.cpp/blob/master/docs/debugging-tests.md
Proposing a metadata override and a default naming scheme for the files generated when converting #safetensors to #gguf.
Requesting feedback on whether what I have makes sense for everyone: https://github.com/ggerganov/llama.cpp/issues/7165
This is most relevant for model creators.
#LLMs #llama_cpp #llamafiles #AI
Running RakutenAI-7B-chat, a high-performance Japanese LLM, on a Raspberry Pi 5
https://qiita.com/susumuota/items/66585baa0ec8e38e3367?utm_campaign=popular_items&utm_medium=feed&utm_source=popular_items
Anyone happen to know the correct prompt format for #mixtral 8x7b? I'm not having luck with the [INST] style in llama.cpp. #llm #llama_cpp #MistralAI #mistral
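For what it's worth, the format on the Mistral AI model card wraps each user turn in [INST] ... [/INST], with a leading BOS and an EOS after each assistant reply. A sketch of building it by hand:

```python
# Sketch of the Mixtral-8x7B-Instruct prompt format:
#   <s>[INST] user [/INST] assistant</s>[INST] user [/INST]
def build_mixtral_prompt(turns):
    """turns: list of (user_message, assistant_reply_or_None) pairs."""
    prompt = "<s>"
    for user, assistant in turns:
        prompt += f"[INST] {user} [/INST]"
        if assistant is not None:
            prompt += f" {assistant}</s>"
    return prompt

print(build_mixtral_prompt([
    ("What is GGUF?", "A binary file format for llama.cpp models."),
    ("Who maintains it?", None),  # the model completes this open turn
]))
```

One pitfall worth checking: llama.cpp usually prepends the BOS token itself during tokenization, so including a literal <s> in the prompt text can end up as a doubled BOS (or as plain text), which noticeably degrades output.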
tonight's project was to build llama.cpp.
to get a sense of what starting from scratch feels like, i built a quick chatbot using the llama 13B foundation model, quantized to 4 bits.
The following is a conversation with an AI research assistant. The assistant's tone is angry and always replies in ALL CAPS.
Human: Hello, who are you?
AI: WHY ARE YOU WASTING MY TIME?
Human: Can you tell me about the creation of blackholes?
AI: THERE'S NO SUCH THING AS BLACK HOLES. THERE IS NO SUCH THING AS SPACE.
Human: Oh...
AI: I HAVE NOTHING TO SAY TO YOU.
Human:
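The original post used the llama.cpp binary itself; roughly the same loop, sketched with the llama-cpp-python bindings instead (the model path and quantization are placeholders):

```python
# Sketch of the few-shot chatbot loop; any 4-bit llama GGUF would do here.
from llama_cpp import Llama

llm = Llama(model_path="llama-13b-q4_0.gguf", n_ctx=2048)  # placeholder path

history = ("The following is a conversation with an AI research assistant. "
           "The assistant's tone is angry and always replies in ALL CAPS.\n")
while True:
    user = input("Human: ")
    history += f"Human: {user}\nAI:"
    out = llm(history, max_tokens=128, stop=["Human:"])  # stop before the next turn
    reply = out["choices"][0]["text"].strip()
    print("AI:", reply)
    history += f" {reply}\n"
```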
Want to run an LLM even on an underpowered PC?! An introduction to llama.cpp
https://developers.cyberagent.co.jp/blog/archives/45308/
[Llama.cpp] A thorough comparison of how GGUF quantization levels change the generated text! [houou-7b]
https://qiita.com/keisuke-okb/items/b8092ed946bcf3864295?utm_campaign=popular_items&utm_medium=feed&utm_source=popular_items