NanoGPT Slowrun: Language Modeling with Limited Data, Infinite Compute
#HackerNews #NanoGPT #Slowrun #LanguageModeling #LimitedData #InfiniteCompute
Python Trending (@pythontrending)
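Release announcement for dLLM (dllm): Simple Diffusion Language Modeling. It appears to be a simple implementation or research reference that applies diffusion-based techniques to language modeling, and reads as an open project for experimenting with and researching diffusion-based LLMs.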
Day 9/21: Multi-Head Attention unlocks multi-dimensional processing for language models. Instead of a single attention mechanism, the model runs several "heads" in parallel, each free to specialize in grammar, semantics, lexical relationships, or sentiment. Example: for the sentence "Sarah likes the Paris museum", the heads analyze structure, relationships, and context at the same time. Computation stays efficient while semantic depth improves. #AI #MachineLearning #NgônNgữTựNhiên #LanguageModeling
https://www.reddit.com/r/LocalLLaMA/comments/1pon3oz/day_9_21_day
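A minimal sketch of the parallel heads described in the post above, in plain Python. Everything here is an assumption for illustration, not code from the linked thread: the weights are random, the sizes `d_model = 8` and `n_heads = 2` are arbitrary, and the helper names (`attention_head`, `multi_head_attention`, `random_matrix`) are hypothetical. Each head projects the same token vectors into its own smaller space, computes scaled dot-product attention there, and the per-head outputs are concatenated and mixed by an output projection.

```python
import math
import random

def softmax(row):
    # Numerically stable softmax over one row of scores.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def matmul(a, b):
    # Plain nested-list matrix product: (n x k) @ (k x m) -> (n x m).
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def attention_head(x, w_q, w_k, w_v):
    # One head: project to queries/keys/values, then scaled dot-product attention.
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    scale = math.sqrt(len(q[0]))
    scores = [[s / scale for s in row] for row in matmul(q, transpose(k))]
    weights = [softmax(row) for row in scores]
    return matmul(weights, v), weights

def multi_head_attention(x, heads, w_o):
    # Run every head on the same input, concatenate per token, then mix with w_o.
    outputs = [attention_head(x, *head)[0] for head in heads]
    concat = [sum((out[i] for out in outputs), []) for i in range(len(x))]
    return matmul(concat, w_o)

def random_matrix(rows, cols, rng):
    # Hypothetical stand-in for learned weights or embeddings.
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
tokens = ["Sarah", "likes", "the", "Paris", "museum"]
d_model, n_heads = 8, 2
d_head = d_model // n_heads

x = random_matrix(len(tokens), d_model, rng)  # one random vector per token
heads = [
    tuple(random_matrix(d_model, d_head, rng) for _ in range(3))  # (w_q, w_k, w_v)
    for _ in range(n_heads)
]
w_o = random_matrix(n_heads * d_head, d_model, rng)

out = multi_head_attention(x, heads, w_o)
print(len(out), "tokens,", len(out[0]), "features per token")
```

With random weights the heads do not actually specialize; any division of labor between grammar, semantics, and context would only emerge through training.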
🚀 Cydonia and Magidonia v4.2.0 have been released! Both are built on Magistral 2509 and are more creative than v4.1. The models are shared on Hugging Face. #AI #LanguageModeling #HuggingFace #Cydonia
https://www.reddit.com/r/LocalLLaMA/comments/1oa29de/drummers_cydonia_and_magidonia_24b_v420/
Tokenization for language modeling: BPE vs. Unigram Language Modeling (2020)
https://ndingwall.github.io/blog/tokenization
#HackerNews #Tokenization #LanguageModeling #BPE #Unigram #NLP
Andrey Markov & Claude Shannon Counted Letters to Build the First Language-Generation Models
Shannon’s model said: “OCRO HLI RGWR NMIELWIS”
#Shannon #Markov #NLP #AIhistory #LanguageModeling
https://spectrum.ieee.org/andrey-markov-and-claude-shannon-built-the-first-language-generation-models
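A rough sketch of the letter-counting idea in the post above, assuming a toy corpus (the `corpus` string and all function names here are made up for illustration and are not taken from the linked article). The first sampler draws each letter independently according to its frequency, which produces gibberish of the kind quoted above; the second conditions each letter on the one before it, in the style of a Markov chain.

```python
import random
from collections import Counter, defaultdict

def bigram_counts(text):
    # For each character, count which characters follow it in the text.
    table = defaultdict(Counter)
    for prev, nxt in zip(text, text[1:]):
        table[prev][nxt] += 1
    return table

def sample_independent(counts, n, rng):
    # Draw n characters independently, weighted by how often each one occurs.
    letters = list(counts)
    weights = [counts[c] for c in letters]
    return "".join(rng.choices(letters, weights=weights, k=n))

def sample_markov(table, start, n, rng):
    # Draw each character conditioned on the previous one.
    out = [start]
    while len(out) < n:
        followers = table.get(out[-1])
        if not followers:  # dead end: this character was never followed by anything
            break
        letters = list(followers)
        weights = [followers[c] for c in letters]
        out.append(rng.choices(letters, weights=weights, k=1)[0])
    return "".join(out)

# Toy corpus, an assumption for this sketch; real experiments used far more text.
corpus = "THE QUICK BROWN FOX JUMPS OVER THE LAZY DOG AND THE CAT SAT ON THE MAT"
rng = random.Random(42)

counts = Counter(corpus)
table = bigram_counts(corpus)

independent = sample_independent(counts, 24, rng)
markov = sample_markov(table, "T", 24, rng)
print("letter frequencies only:", independent)
print("previous-letter model:  ", markov)
```

Every adjacent pair in the Markov sample occurs somewhere in the corpus, so its output tends to look more word-like than the independent sample even with this little data.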
🎯 #OuteTTS introduces a novel approach to text-to-speech synthesis using pure #languagemodeling
🔧 Built on #LLaMa architecture with just 350M parameters, featuring:
Zero-shot #voicecloning capability
Integration with #WavTokenizer (75 tokens/sec)
Local deployment via #llamacpp
#GGUF format compatibility
🔍 Technical Implementation:
Audio tokenization process
CTC forced alignment
Structured prompt system
Temperature-adjustable outputs
⚠️ Current Limitations:
Limited vocabulary range
String-only input support
Best performance with shorter sentences
Variable temperature sensitivity
https://github.com/edwko/OuteTTS
https://huggingface.co/OuteAI/OuteTTS-0.1-350M
New #languagemodeling #nlp #ai #paper, led by Angelica Chen! We break the steepest MLM training loss drop into *2* phase changes: first in internal grammatical structure, then external capabilities. Big implications for emergence, simplicity bias, and interpretability! https://arxiv.org/abs/2309.07311
GPT-4 API by OpenAI Now Available – Analytics India Magazine #GPT4API
Hashtags: #chatGPT #AIAdvancements #LanguageModeling Summary: OpenAI, an artificial intelligence research lab, has announced the general availability of its GPT-4 API. GPT-4, or Generative Pre-trained Transformer 4, is the latest version of OpenAI's language model, which has been trained on a vast amount of internet text to generate human-like responses. The GPT-4 API allows developers to…
https://webappia.com/gpt-4-api-by-openai-now-available-analytics-india-magazine-gpt4api/
Chat-GPT: The Unnoticed Transformation of Your Thought Process #AIRevolution
Hashtags: #AIRevolution #LanguageModeling #CognitiveShift Summary: Generative models, a type of artificial intelligence (AI) model, generate responses without signaling uncertainty and cannot communicate their lack of certainty. This means that they can confidently provide answers, even if those answers are false or hallucinated. This poses a problem because people are more likely to…
https://webappia.com/chat-gpt-the-unnoticed-transformation-of-your-thought-process-airevolution/
Improving Dialogue through Optimization of Language Models #DialogueOptimization
Hashtags: #DialogueOptimization #LanguageModeling #ConversationalAI Summary: ChatGPT has become one of the most popular technologies in the IT industry due to its impressive capabilities in optimizing language models for dialogue generation. It offers a human-like conversational experience and has gained immense popularity, with over 100 million users, 1 million of whom joined within just 5 days…
Revealing the structure of language model capabilities
https://arxiv.org/abs/2306.10062
Building a theoretical understanding of the capabilities of large language models (LLMs) is vital for our ability to predict & explain the behavior of these systems. ... we analyzed data from 29 LLMs across 27 cognitive tasks. LLM capabilities are better explained by 3 well-delineated factors that represent reasoning, comprehension & core language modeling.
#LLM #LargeLanguageModels #reasoning #LanguageModeling #comprehension #GPT
#LanguageModeling is trending, to a large extent because of #ChatGPT. But did you know language modeling has been with us for more than a century? And that it was born of the collaboration of a poet and a mathematician?
Our engineer Carsten Schnober tells us more:
https://blog.esciencecenter.nl/language-modeling-the-first-100-years-357556816148