
What are Generative AI and Large Language Models? 

Generative AI (GEN-AI) is a specific type of AI that focuses on generating new content, such as text, images, audio, or video. These systems are trained on large datasets and use machine learning algorithms to generate new content. This is useful for countless applications: generating text for chatbots, creating art, or producing speech and other media files.

Large Language Models (LLMs) are a form of generative AI capable of understanding and generating text. You can use this technology to predict answers to questions, write creatively (headlines, blog posts, etc.), translate text, or generate summaries. You can also use it to generate, translate, or detect errors in code. LLMs are generally trained on enormous amounts of text data, sometimes even petabytes. By taking in vast amounts of web data, including hundreds of thousands of Wikipedia entries, social media posts, and news articles, they learn the relationships between sentences, words, and parts of words. LLMs are also self-supervised: the learning algorithm does not require human-annotated data (labels), because it derives those labels from the training data itself in the first phase and then uses them in a later training phase.
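
To make the idea of self-supervision concrete, here is a minimal sketch in plain Python. It uses simple whitespace splitting as a stand-in for a real tokenizer (an assumption purely for illustration) and shows how training examples can be derived from raw text without any human labeling: the text that comes next is itself the label.

    # Illustrative sketch: deriving (context, next-word) training pairs from raw text.
    # Real LLMs use subword tokenizers and neural networks; whitespace splitting
    # is used here only to show where the "labels" come from.

    text = "The President of the US is Biden"
    tokens = text.split()  # stand-in for a real tokenizer

    training_pairs = []
    for i in range(1, len(tokens)):
        context = tokens[:i]   # input: everything seen so far
        target = tokens[i]     # label: the next word, taken from the text itself
        training_pairs.append((context, target))

    for context, target in training_pairs:
        print(f"{' '.join(context)!r}  ->  {target!r}")
    # No human annotation is needed: the text provides its own labels.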

For many, LLMs seem magical, but in fact they are prediction machines. The model takes a piece of text as input (we call it ‘the prompt’) and then predicts what the next words should be, based on the data on which the model was trained. Behind the scenes, it calculates the probabilities of the various possible words and combinations of words that could follow. The output of the model is a massive list of possible words and their probabilities. In other words, if I ask a Large Language Model “Who is the President of the United States?” it will likely say “The President of the US is… Biden.” Not because the model knows the facts, but because of the probability of the word “Biden” given the data it has seen: in the context of the US President, “Biden” scores higher than any other name.
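
As an illustration of this “prediction machine” behavior, the sketch below uses the open-source Hugging Face transformers library with the small GPT-2 model (an assumption made for the example, not one of the models discussed in this article) to inspect the probabilities the model assigns to possible next tokens after a prompt.

    # Sketch: inspecting next-token probabilities with an open-source LLM (GPT-2).
    # Requires: pip install transformers torch
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    prompt = "The President of the United States is"
    inputs = tokenizer(prompt, return_tensors="pt")

    with torch.no_grad():
        logits = model(**inputs).logits  # a score for every token in the vocabulary

    # Probabilities for the token that would come right after the prompt
    next_token_probs = torch.softmax(logits[0, -1], dim=-1)
    top = torch.topk(next_token_probs, k=5)

    for prob, token_id in zip(top.values, top.indices):
        print(f"{tokenizer.decode(token_id):>12}  {prob.item():.3f}")

The output is exactly the “massive list of possible words and their probabilities” described above, cut down to the five most likely candidates.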

Because these models are trained on so much data, they can generate a huge variety of texts, including unexpected ones. Give an LLM a prompt to translate a text into another language, and it will likely produce a good translation. This is because the model has been trained on an enormous amount of multilingual text, which allows it to perform translations without being explicitly trained to do so.

ChatGPT and other Large Language Models 

The terms Generative AI and Large Language Model likely bring ChatGPT to mind. This application has been in the news a lot in recent months. ChatGPT is an LLM that can generate text in a conversational way. It is a fine-tuned version of another large language model, GPT-3.5, created by OpenAI, the AI company co-founded by Elon Musk, among others. Within five days of its release in November 2022, ChatGPT attracted one million users.

Google has also developed an LLM called Bard. While ChatGPT is based on GPT-3.5, Bard, which is also a conversational LLM, is based on (a lightweight version of) LaMDA. A major advantage of Bard is that it can generate text based on real-time data, whereas ChatGPT has been trained on data up to 2021 and therefore cannot answer questions about current events.

Aside from Google and OpenAI, there are many other parties working on Large Language Models. You can find an overview below. 

Name       | By              | Announced | Public
PaLM 2     | Google DeepMind | May ’23   | Beta
GPT-4      | OpenAI          | March ’23 | Yes
Bard       | Google          | Feb ’23   | Beta
Sparrow    | Google DeepMind | Sept ’22  | No
OPT-IML    | Meta AI         | Dec ’22   | Yes
ChatGPT    | OpenAI          | Nov ’22   | Yes
LaMDA 2    | Google AI       | May ’22   | No
PaLM       | Google DeepMind | Apr ’22   | Beta
Chinchilla | Google DeepMind | May ’22   | No
GLaM       | Google          | Dec ’21   | No
LaMDA      | Google AI       | Jun ’21   | No
GPT-3      | OpenAI          | May ’20   | Yes
Meena      | Google          | Jan ’20   | No
T5         | Google          | Oct ’19   | Yes
GPT-2      | OpenAI          | Feb ’19   | Yes
BERT       | Google          | Oct ’18   | Yes
GPT-1      | OpenAI          | Jun ’18   | Yes

How do Large Language Models work: prompts, parameters, and tokens

To understand how an LLM works and how it learns, there are some key concepts you need to know: 

  1. A prompt is the input on which an LLM can generate a response. Based on the prompt, an LLM predicts what the next words should be based on the data on which it is trained. 
  2. A zero-shot prompt is the simplest type of prompt. It only provides a description of a task or a piece of text for the LLM to start with. It can be literally anything: a question, the beginning of a story, instructions, etc. The clearer your prompt text, the easier it is for the LLM to predict the next text. 
  3. A one-shot prompt provides one example that the LLM can use to learn how to best complete the task. 
  4. Few-shot prompts provide multiple examples, usually between 10 and 100. They can be used to teach the LLM a pattern that it should continue. 
  5. Large language models are built around two more key concepts: parameters and tokens. 
  6. Parameters are the parts of the model that have been learned from historical training data and define the model’s skill at a problem. The more parameters, the more nuance there is in the model’s understanding of each word’s meaning and context. 
  7. Tokens are a numerical representation of words (or, more often, parts of words). When you send a prompt to an LLM, it is split into tokens, as illustrated in the sketch after this list. 
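
The sketch below illustrates these concepts: it builds a zero-shot, a one-shot, and a few-shot prompt as plain strings, and then shows how a prompt is split into tokens. It assumes the Hugging Face transformers library and the GPT-2 tokenizer, purely as an example.

    # Sketch: prompt styles and tokenization (assumes transformers + GPT-2 tokenizer).
    # Requires: pip install transformers
    from transformers import AutoTokenizer

    zero_shot = "Translate to French: 'Good morning, how are you?'"

    one_shot = (
        "Translate to French.\n"
        "English: Thank you very much. -> French: Merci beaucoup.\n"
        "English: Good morning, how are you? -> French:"
    )

    few_shot = (
        "Classify the sentiment of each review.\n"
        "Review: Great product, works perfectly. Sentiment: positive\n"
        "Review: Broke after two days. Sentiment: negative\n"
        "Review: Does exactly what it promises. Sentiment: positive\n"
        "Review: Shipping took forever and support never replied. Sentiment:"
    )

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    token_ids = tokenizer.encode(zero_shot)
    print(token_ids)                                   # the numerical representation
    print(tokenizer.convert_ids_to_tokens(token_ids))  # the pieces of words ("tokens")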

Fine-tune a Large Language Model to Your Needs 

Thanks to its training, an LLM knows a lot about language and has knowledge that is useful for all kinds of Natural Language Processing (NLP) tasks. Think, for example, of classifying and summarizing text and identifying sentiment. It is also possible to make small changes to the structure of the Large Language Model so that it focuses on, say, classifying topics rather than predicting next words, without losing what it has learned about language patterns.

Additionally, you can train an LLM specifically for your own business practice. Suppose you want to build a chatbot for your travel agency. The LLM you use knows a lot about countries around the world, but nothing about the package tours you offer as a travel agent. You can fine-tune an LLM by retraining the foundation model with your own data. This process, also known as transfer learning, can produce accurate models with smaller datasets and fewer training hours, making it cheaper than creating an entirely custom model for specific tasks. 
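
As a rough illustration of this transfer-learning approach, the sketch below fine-tunes a small open-source model on a handful of hypothetical travel-agency sentences. The choice of GPT-2 and the example texts are assumptions for the sake of the sketch; a real project would use far more data and a model suited to the task. It relies on the Hugging Face transformers and datasets libraries.

    # Sketch: fine-tuning a pre-trained model on your own (tiny) dataset.
    # Requires: pip install transformers datasets torch
    from datasets import Dataset
    from transformers import (AutoModelForCausalLM, AutoTokenizer,
                              DataCollatorForLanguageModeling, Trainer,
                              TrainingArguments)

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no padding token by default
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    # Hypothetical, domain-specific training texts for a travel agency
    texts = [
        "Our 10-day Japan package tour includes flights, hotels and a rail pass.",
        "The Tuscany tour departs every Saturday from May to September.",
        "All package tours can be cancelled free of charge up to 30 days before departure.",
    ]
    dataset = Dataset.from_dict({"text": texts}).map(
        lambda batch: tokenizer(batch["text"], truncation=True, max_length=128),
        batched=True, remove_columns=["text"],
    )

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="travel-llm", num_train_epochs=3,
                               per_device_train_batch_size=2),
        train_dataset=dataset,
        data_collator=DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False),
    )
    trainer.train()  # the model keeps its general language skills and picks up your domain

The foundation model is only retrained, not rebuilt, which is exactly why this approach needs far less data and compute than training a custom model from scratch.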

The performance of a pre-trained language model depends on its size. The larger the model, the better the quality of the output, but this comes at the cost of speed and price. Smaller models are cheaper to use and return output faster than larger models; however, they are less powerful and therefore better suited to simpler tasks such as classification, while larger models are more useful for generating creative content.

In conclusion

We can indeed state that the possibilities of Generative AI and LLMs are significant. The results are astonishing, and you can tailor existing models to your own needs, which promises a lot for the future. Whether everything surrounding this new technology is all sunshine and roses remains to be seen, but we will delve deeper into that in this series of articles.

Want to learn more about Generative AI and Large Language Models? Then tune in to the first episode of the second season of the DDMA Podcast: Shaping The Future, where they delve into the positive and negative implications of this technology for the field of marketing. 

Lee Boonstra

Applied AI Engineer and Developer Advocate at Google

Marike van de Klomp

Lead Product Owner Digital Channels & Conversational AI at ABN AMRO

Robin Hogenkamp

Senior Business Consultant CX at VodafoneZiggo

Romar van der Leij

Legal counsel
