#ise2023

2023-08-28

Last leg in our brief #timeline of (Large) #languagemodels (so far) is 2023, which saw the advent of many new and updated #LLMs:
- BARD #chatbot is introduced by Google
- LLaMA is introduced by Meta
- GPT-4 is introduced by OpenAI
- LLaMA 2 is introduced by Meta
- and many others...
#ISE2023 #lecture slides: drive.google.com/file/d/1atNvM
GPT-4 tech report: arxiv.org/pdf/2303.08774
@fizise @KIT_Karlsruhe #ai #artificialintelligence #llm #llms #gpt #openai #llama #lamda #bard

Slide from Information Service Engineering 2023 lecture, Brief History of (Large) Language Models, 2023:
- BARD is introduced by Google, a chatbot based on the Google LaMDA language model
- LLaMA is introduced by Meta (LLaMA 65B trained on 1.4T tokens)
- GPT-4 is introduced by OpenAI
- LLaMA 2 is introduced by Meta (LLaMA 70B trained on 2T tokens)
Bibliography: OpenAI: GPT-4 - Technical Report, arXiv:2303.08774 [cs.CL]
2023-08-24

Next stop on our brief #timeline of (Large) #LanguageModels is 2022:
InstructGPT is introduced by OpenAI, a GPT-3 model complemented and fine-tuned with reinforcement learning from human feedback.
ChatGPT is introduced by OpenAI as a combination of GPT-3, Codex, and InstructGPT including lots of additional engineering.
#ise2023 lecture slides: drive.google.com/file/d/1atNvM
#RLHF explained: huggingface.co/blog/rlhf
#ai #creativeai #rlhf #gpt3 #gpt #openai #chatgpt #lecture #artificialintelligence #llm

Slide from Information Service Engineering 2023 lecture, Brief History of (Large) Language Models:
2022, InstructGPT is introduced by OpenAI, a GPT-3 model complemented and fine-tuned with reinforcement learning from human feedback. Improved instruction following and a lower likelihood of hallucinated answers.
ChatGPT introduced by OpenAI, a combination of GPT-3, Codex, and InstructGPT plus a massive engineering effort.
Bibliography: N. Lambert, L. Castricato, L. von Werra, A. Havrilla (2022). Illustrating Reinforcement Learning from Human Feedback (RLHF). huggingface.co.
2023-08-22

Next stop in our brief #timeline of (large) #languagemodels is 2021:
DALL-E is released by OpenAI and raises text2img to a new level.
Codex is released by OpenAI, able to translate natural language into programming code.
WebGPT is released by OpenAI for answering open-ended questions.
LaMDA is introduced by Google.
Slides from #ise2023 #lecture: drive.google.com/file/d/1atNvM
Codex paper: arxiv.org/abs/2107.03374
DALL-E paper: arxiv.org/abs/2102.12092
@fizise #ai #generativeAI #GPT #dalle #openai
#lamda

Slide from Information Service Engineering 2023 lecture, A Brief History of (Large) Language Models:
"2021:
DALL-E is released by OpenAI, a 12B parameter version of GPT-3 trained to generate images from text descriptions.
GitHub Copilot is released, an AI pair-programmer for coding.
Codex released by OpenAI, able to translate natural language into programming code based on 159GB code and documentation.
WebGPT released by OpenAI, a fine-tuned GPT-3 for answering open-ended questions with citations and links to sources.
LaMDA (Language Model for Dialogue Application) introduced by Google."
Bibliography:
Chen, M., et al. (2021). Evaluating Large Language Models Trained on Code. arXiv:2107.03374.
Ramesh, A., et al. (2021). Zero-Shot Text-to-Image Generation. arXiv:2102.12092.
2023-08-17

Next leg in our brief history of (Large) #LanguageModels is 2020, when #GPT-3 was released by OpenAI, based on 45TB data crawled from the web. A “data quality” predictor was trained to boil down the training data to 550GB of “high quality” data. Learning from the prompt was introduced (few-shot learning).
Lecture slides: drive.google.com/file/d/1atNvM
paper: proceedings.neurips.cc/paper/2
@fizise #ai #artificialintelligence #creativeai #llm #ise2023 #lecture

Lecture slide from Information Service Engineering 2023, A Brief History of (Large) Language Models: GPT-3 was released by OpenAI, based on 45TB data crawled from the web. A “data quality” predictor was trained to boil down the training data to 550GB of “high quality” data. Learning from the prompt is introduced (few-shot learning).
Bibliography: T. B. Brown et al. (2020). Language models are few-shot learners. In Proceedings of the 34th Int. Conf. on Neural Information Processing Systems (NIPS'20). Curran Associates Inc., Red Hook, NY, USA, Article 159, 1877–1901.
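To illustrate what "learning from the prompt" means: no weights are updated; the model is simply conditioned on a few demonstrations in its input and completes the pattern, as in this translation demo adapted from the Brown et al. paper:

    Translate English to French:
    sea otter => loutre de mer
    peppermint => menthe poivrée
    cheese =>

Given this prompt, GPT-3 continues with the most likely completion, "fromage", without any task-specific fine-tuning.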
2023-08-15

Next stop in our Brief History of (Large) #languagemodels is 2019: GPT-2 was released by OpenAI as a direct scale-up of GPT, comprising 1.5B parameters and trained on 8M web pages.
Slides (from #ise2023 lecture): drive.google.com/file/d/1atNvM
Paper: d4mucfpksywv.cloudfront.net/be
#llm #llms #ai #artificialintelligence #generativeai #gpt #lecture #historyofAI

Slides from the lecture Information Service Engineering 2023, Brief History of Large Language Models: 2019, GPT-2 was released by OpenAI as a direct scale-up of GPT, comprising 1.5B parameters and trained on 8M web pages.
Bibliography: Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., & Sutskever, I. (2019). Language Models are Unsupervised Multitask Learners.
2023-08-14

@bsletten The #ise2023 summer lecture was not recorded. The intention was to bring students back to university lecture halls. There is an 80% overlap with the #ise2021 lecture, which is already on #youtube. However, if you are interested in our latest lecture on #knowledgegraphs, don't miss the (free) #KG2023 online course "Knowledge Graphs - Foundations and Applications" on #OpenHPI, which starts in Oct 2023.
open.hpi.de/courses/knowledgeg
@fizise @tabea @sashabruns @Hasso_Plattner_Institute

Sample from the online course video of our free Knowledge Graphs - Foundations and Applications KG2023 OpenHPI lecture, which will be broadcast in October 2023. In the video player you can see my colleague Ann Tan and me (Harald Sack). The cover slide on the right is on knowledge graph embeddings and shows a Japanese woodcut in the style of Hiroshige, created via ArtBot with Stable Diffusion.
2023-08-05

Next stop in our brief #timeline of (large) #languagemodels from the #ise2023 lecture is the advent of Graphics Processing Units (#gpu). In 1999, Nvidia's GeForce 256 was one of the very first, enabling highly parallel computations for #neuralnetworks.
Slides: drive.google.com/file/d/1atNvM
@fizise #artificialintelligence #lecture #ai #machinelearning #llm

Slide from the Information Service Engineering 2023 lecture, A Brief History of Large Language Models: "NVIDIA introduced the first Graphics Processing Unit (GPU) card, the Nvidia GeForce 256". Bibliography: John Peddie, Famous Graphics Chips: Nvidia’s GeForce 256, IEEE Computer Society.
Link: https://www.computer.org/publications/tech-news/chasing-pixels/nvidias-geforce-256
2023-08-04

1997, with the advent of Long Short-Term Memory recurrent #neuralnetworks, marks the next step in our brief history of (large) #languagemodels from last week's #ise2023 lecture. Introduced by Sepp Hochreiter and Jürgen Schmidhuber, #LSTM #RNNs enabled efficient processing of sequences of data.
Slides: drive.google.com/file/d/1atNvM
#nlp #llm #llms #ai #artificialintelligence #lecture @fizise

Slide from Information Service Engineering 2023 Lecture, A Brief History of Large Language Models. "Long Short-Term Memory (LSTM) Recurrent Neural Networks are introduced by Sepp Hochreiter and Jürgen Schmidhuber, which enabled the efficient processing of sequences of data (instead of single data points), able to learn from data and to generate text." Depicted is a schematic view of an LSTM.
Bibliography: 
Hochreiter, Sepp; Schmidhuber, Juergen (1996). LSTM can solve hard long time lag problems. Advances in Neural Information Processing Systems, pp. 473–479.
Link: https://dl.acm.org/doi/10.5555/2998981.2999048
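For reference, the now-standard LSTM cell (including the forget gate, which was added by Gers et al. in 2000, slightly after the cited paper) computes per time step t:

    f_t = \sigma(W_f x_t + U_f h_{t-1} + b_f)          (forget gate)
    i_t = \sigma(W_i x_t + U_i h_{t-1} + b_i)          (input gate)
    o_t = \sigma(W_o x_t + U_o h_{t-1} + b_o)          (output gate)
    \tilde{c}_t = \tanh(W_c x_t + U_c h_{t-1} + b_c)   (candidate cell state)
    c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t    (cell state update)
    h_t = o_t \odot \tanh(c_t)                         (hidden state)

The additive cell-state update is what lets error signals flow across long time lags, mitigating the vanishing gradient problem of plain RNNs.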
2023-08-03

Next step in our brief timeline of (large) #languagemodels from our #ise2023 lecture was statistical language modeling with n-grams based on large text corpora, as introduced and popularized by Frederick Jelinek and Stanley F. Chen, using statistical tools such as Bayes' theorem, the Markov assumption, and maximum likelihood estimation.
Slides: drive.google.com/file/d/1atNvM
@fizise #nlp #llm #llms #artificialintelligence #ai #lecture #creativeAI

Slide from the Information Service Engineering 2023 lecture, Brief Timeline of (Large) Language Models, about statistical N-gram models.
"1990s: N-grams for statistical language modeling were introduced and popularized by Frederick Jelinek and Stanley F. Chen from IBM Thomas J. Watson Research Center, who developed efficient algorithms and techniques for estimating n-gram probabilities from large text corpora for speech recognition and machine translation." Furthermore, the formula to compute the conditional probability of an n-gram is given, along with a depiction of the N-gram Shakespeare generator with 1-grams, 2-grams, 3-grams, and 4-grams. The picture of William Shakespeare has been created via ArtBot.
Bibliography: F. Jelinek (1997), Statistical Methods for Speech Recognition, MIT Press, Cambridge, MA.
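For completeness, the conditional probability mentioned in the slide description is the usual maximum likelihood estimate under the Markov assumption (standard textbook notation, not copied verbatim from the slide):

    P(w_n \mid w_{n-N+1}^{n-1}) = \frac{C(w_{n-N+1}^{n-1}\, w_n)}{C(w_{n-N+1}^{n-1})}

where C(\cdot) counts occurrences in the corpus; for a bigram model (N = 2) this reduces to P(w_n \mid w_{n-1}) = C(w_{n-1} w_n) / C(w_{n-1}).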
2023-08-02

Went to the university office to collect the #ise2023 final exams, to be reviewed later together with the @fizise TA team. But for now, a #perfectEspresso at home… because it was too rainy today for walking to Espresso Stazione ☕️ #lecture #coffeechallenge #karlsruhe @KIT_Karlsruhe

A cup of espresso placed on a reflective black surface (in fact an oven) that mirrors parts of a window
2023-08-01

Slide 2 of our Brief Timeline of (Large) #LanguageModels from the last #ise2023 lecture introduced us to #ELIZA, Joseph Weizenbaum's simple #Chatbot from 1966 that simulates a conversation with a psychoanalyst. Weizenbaum was shocked that some people, including his secretary, attributed human-like feelings to the computer program...
Slides: drive.google.com/file/d/1atNvM
#nlp #ai #llm #artificialintelligence @fizise

Slide 2 from the last Information Service Engineering 2023 lecture, with a brief history of Large Language Models:
ELIZA was an early natural language processing computer program created from 1964 to 1966 at the MIT Artificial Intelligence Laboratory by Joseph Weizenbaum, which simulated conversation, giving users an illusion of understanding on the part of the program.
Bibliography: Weizenbaum, Joseph (1966). ELIZA—a computer program for the study of natural language communication between man and machine. Communications of the ACM, 9(1), 36–45.
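ELIZA's apparent understanding boils down to keyword spotting, pronoun reflection, and canned response templates. A minimal Python sketch of the mechanism (toy rules for illustration only, not Weizenbaum's original DOCTOR script):

    import random
    import re

    # Reflect first-person words back at the user ("i am" -> "you are", ...)
    REFLECTIONS = {"i": "you", "me": "you", "my": "your", "am": "are"}

    # Keyword patterns -> response templates; {0} is filled with the reflected match
    RULES = [
        (r"i am (.*)",   ["How long have you been {0}?", "Why do you say you are {0}?"]),
        (r"i feel (.*)", ["Why do you feel {0}?", "Do you often feel {0}?"]),
        (r".*\bmother\b.*", ["Tell me more about your family."]),
    ]

    def reflect(text):
        return " ".join(REFLECTIONS.get(word, word) for word in text.split())

    def eliza(utterance):
        for pattern, templates in RULES:
            match = re.match(pattern, utterance.lower())
            if match:
                reflected = [reflect(group) for group in match.groups()]
                return random.choice(templates).format(*reflected)
        return "Please tell me more."  # fallback when no keyword matches

    print(eliza("I am sad about my job"))
    # -> e.g. "Why do you say you are sad about your job?"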
2023-07-31

One of the final sections of the #ise2023 lecture was an excursion with a #timeline of (Large) #LanguageModels. We started our tour in 1948 with Claude Shannon's seminal work "A Mathematical Theory of Communication".
Slides: drive.google.com/file/d/1atNvM
@fizise #llm #ai #nlp #artificialintelligence #informationtheory #lecture

Slide from last week's Information Service Engineering 2023 lecture, A Timeline for Large Language Models, depicting a portrait photograph of Claude Elwood Shannon in front of a 1950s mainframe computer.
Text: "Claude Shannon proposed the idea of using n-grams as a means to analyze the statistical properties of language in "A Mathematical Theory of Communication" (1948). While Shannon's primary focus was on communication and information transmission, he recognized the relevance of n-grams in modeling language and predicting the likelihood of word sequences."
Bibliography:
Shannon, Claude Elwood (July 1948). A Mathematical Theory of Communication, Bell System Technical Journal. 27 (3): 379–423.
2023-07-28

As a 2nd topic of this last #ise2023 lecture, we discussed #KnowledgeGraph Completion. The simplest approach to unsupervised #linkprediction based on (here, translation-based) knowledge graph embeddings was explained using the example of Isaac Asimov.
Slides: drive.google.com/file/d/1atNvM
@fizise @enorouzi #scifi #knowledgegraphs #ai #deeplearning #embeddings

Slide from the last Information Service Engineering 2023 lecture, ISE Applications, 5.2 Knowledge Graph Completion:
Link Prediction with KG Embeddings
- Use Translational Embeddings 
 -- Unsupervised methods, e.g. TransE, use z_s + z_p to predict z_o
 -- Supervised Methods for prediction, based on embedding vectors

Vectors for "Isaac Asimov" and "occupation" are added. For the resulting vector a nearest neighbor search is conducted to find - besides others - "SciFi Writer".
2023-07-27

Ok, I tried out Runway Gen-2. I did some tests with prompts only, but also with uploaded images (mostly generated by another generative AI). Lessons learned: 1) don't expect too much...
2) you have to try very often...
3) don't expect too much ;-)
Below you can see the video generated based on my stablediffusion "Singularity" picture from the #ise2023 lecture. #generativeAI #runway #stablediffusion #stablediffusionart #aiart #singularity

2023-07-27

How can we find out the importance of a node in a #knowledgeGraph? In the last #ise2023 lecture, we were discussing graph centrality measures and how they can be applied in the context of knowledge graphs.
Slides: drive.google.com/file/d/1atNvM
SPARQL query (cf. image below: the 100 most "important" #SciFi authors according to #wikidata): w.wiki/78Un
@fizise @enorouzi #semanticweb #lecture #ai #datascience #analytics

SPARQL query from the Information Service Engineering 2023 lecture, no. 13, ISE Applications: What are the most important science fiction authors according to Wikidata? Please find the SPARQL query in the slides or in the link above.
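Since the query itself sits behind the short link, here is a hypothetical reconstruction of the idea in Python with SPARQLWrapper, using the Wikipedia sitelink count as a simple, degree-like importance proxy (the item wd:Q18844224 for "science fiction writer" and the ranking criterion are my assumptions, not necessarily what the slides use):

    from SPARQLWrapper import SPARQLWrapper, JSON

    # Rank science-fiction writers by their number of Wikipedia sitelinks,
    # a crude centrality/popularity proxy available directly in Wikidata.
    QUERY = """
    SELECT ?author ?authorLabel ?sitelinks WHERE {
      ?author wdt:P106 wd:Q18844224 ;
              wikibase:sitelinks ?sitelinks .
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
    }
    ORDER BY DESC(?sitelinks)
    LIMIT 100
    """

    endpoint = SPARQLWrapper("https://query.wikidata.org/sparql")
    endpoint.setQuery(QUERY)
    endpoint.setReturnFormat(JSON)
    for row in endpoint.query().convert()["results"]["bindings"]:
        print(row["authorLabel"]["value"], row["sitelinks"]["value"])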
2023-07-26

Topics of the last #ise2023 lecture: The Graph in #KnowledgeGraphs, Knowledge Graph Completion, A Brief History of Large Language Models, and Knowledge Graphs and Large Language Models. I will highlight some topics in the upcoming toots...
Slides: drive.google.com/file/d/1atNvM
#llms #languagemodels #deeplearning #linkprediction #kgc #lecture #machinelearning #transformers #gpt @fizise @enorouzi

Cover slide of the last Information Service Engineering 2023 lecture, ISE Applications 01. Picture created via ArtBot (Deliberate), 2023, [CC-BY-4.0]. Prompt: "The seeds of modern Artificial Intelligence were planted by philosophers who attempted to describe the process of human thinking as the mechanical manipulation of symbols. Deep learning is a class ….”
2023-07-26

Last #ise2023 lecture of this semester is about to start. 8:00AM is always tough for the students as well as for the professor 🥳 @fizise @KIT_Karlsruhe @enorouzi #ai #machinelearning #KnowledgeGraphs

ISE 2023 lecture. We see my interim laptop showing the title slide, standing on the desk at the front of the lecture hall.
2023-07-25

Last thing we discussed in the "Limits of #AI" chapter of the #ISE2023 lecture was the threat of the so-called #Singularity. What is the singularity? Under which circumstances could it possibly happen? How real is this threat, and should we already aim for potential regulations?
Slides: drive.google.com/file/d/1LUOA-
@fizise @enorouzi @KIT_Karlsruhe #machinelearning #deeplearning #philosophy #aiart #stablediffusionart #creativeai

Slide from Information Service Engineering 2023 lecture no. 12, Basic Machine Learning 03, on the Singularity. By the singularity, we understand a hypothetical future point in time at which technological growth becomes uncontrollable and irreversible, resulting in unforeseeable changes to human civilization. According to the most popular version of the singularity hypothesis, I. J. Good's intelligence explosion model, an upgradable intelligent agent will eventually enter a "runaway reaction" of self-improvement cycles, each new and more intelligent generation appearing more and more rapidly, causing an "explosion" in intelligence and resulting in a powerful superintelligence that qualitatively far surpasses all human intelligence.
2023-07-24

When discussing the limits of #AI in last week's #ise2023 lecture, we also talked about the Chinese Room Problem introduced by John Searle in 1980.
Slides: drive.google.com/file/d/1LUOA-
#machinelearning #artificialintelligence #deeplearning #lecture #philosophy @fizise @enorouzi

A slide from the Information Service Engineering 2023 lecture no. 12, Basic Machine Learning 03, about the limits of AI. The picture shows a "Chinese Room". The Chinese Room problem is a thought experiment that challenges the notion of artificial intelligence's understanding: a person inside a closed room, who does not speak Chinese, can mechanically manipulate Chinese characters to produce correct responses but does not actually comprehend the language. The picture has been created via ArtBot with a prompt describing the Chinese Room problem.
