LLM

Made with Stable Diffusion, 2024 by Leapfrog.cl

ChatGPT is possibly the most famous LLM (large language model). It is based on GPT foundation models (GPT-3.5, GPT-4) that were fine-tuned for conversational use.[1] GPT stands for Generative Pre-trained Transformer, a class of natural language processing models developed by OpenAI and designed to understand and generate human-like text. GPT models are pre-trained on huge datasets; the "pre-training phase involves learning the structure and nuances of language, including grammar, semantics, and context."[2]
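
To make this concrete, here is a minimal sketch, assuming the Hugging Face transformers library and the small, publicly available GPT-2 checkpoint (neither is named in this article; they are just convenient stand-ins), of a pre-trained GPT-style model generating human-like text from a prompt:

    from transformers import pipeline

    # Load a small, publicly available GPT-style model for text generation.
    generator = pipeline("text-generation", model="gpt2")

    # The pre-trained model continues the prompt one token at a time.
    result = generator("Large language models are", max_new_tokens=30)
    print(result[0]["generated_text"])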

What is an LLM?

LLM is a general term for a range of large-scale language models designed for natural language processing tasks; GPT models are a subset. LLMs are not limited to a single architecture such as the Transformer used in GPT models: they can be built on various architectures, including recurrent neural networks (RNNs) and convolutional neural networks (CNNs). LLMs are considered a form of generative AI and are very large deep learning models that are pre-trained and can then be fine-tuned for specific tasks or domains. This fine-tuning process "tailors the model's capabilities to particular applications, such as language translation, text completion, or question-answering".[2][3]
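
To illustrate such fine-tuned, task-specific models, here is a minimal sketch, again assuming the Hugging Face transformers library and two publicly available fine-tuned checkpoints (t5-small for translation and a DistilBERT model fine-tuned on SQuAD for question answering), neither of which is mentioned in the article:

    from transformers import pipeline

    # A model fine-tuned for English-to-German translation.
    translator = pipeline("translation_en_to_de", model="t5-small")
    print(translator("Large language models can translate text.")[0]["translation_text"])

    # A model fine-tuned for extractive question answering.
    qa = pipeline("question-answering", model="distilbert-base-cased-distilled-squad")
    answer = qa(question="What are LLMs pre-trained on?",
                context="LLMs are very large deep learning models pre-trained on huge datasets.")
    print(answer["answer"])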

LLMs can be pre-trained and then fine-tuned for specific purposes. "Pre-training and fine-tuning are key steps in developing large language models. Pre-training involves training a large language model for general purposes with a large data set, while fine-tuning involves training the model for specific aims with a much smaller data set."[2]
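
The split can also be sketched in code. The example below is a simplified illustration rather than a real training recipe: pre-training is assumed to have already happened (we only download a checkpoint trained on a large corpus), and fine-tuning then runs on a tiny, invented labelled data set for sentiment classification:

    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    # Pre-training already happened elsewhere: download the general-purpose checkpoint.
    tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
    model = AutoModelForSequenceClassification.from_pretrained(
        "distilbert-base-uncased", num_labels=2)

    # Fine-tuning: a much smaller, task-specific data set (hypothetical examples).
    examples = [("great product, works well", 1), ("broke after one day", 0)]
    optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

    model.train()
    for text, label in examples:
        batch = tokenizer(text, return_tensors="pt")
        outputs = model(**batch, labels=torch.tensor([label]))
        outputs.loss.backward()   # gradients come only from the small task data set
        optimizer.step()
        optimizer.zero_grad()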

There are three types of LLMs:

1. Generic (or raw) language models, which predict the next token (word), like an autocomplete in a search box.
2. Instruction-tuned models, trained to predict a response to the instructions given in the input.
3. Dialog-tuned models, trained to hold a dialog by predicting the next response.

Each type calls for a different prompt design to perform well. "Chain of thought reasoning" is one method for improving answers: "models are better at getting the right answer when they first output text that explains the reason for the answer."[2]
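
As a hypothetical illustration of chain-of-thought prompting (the questions and wording below are invented for this glossary entry), compare a direct prompt with one that shows the model a worked, reasoned answer first:

    # Direct prompt: the model is asked only for the final answer.
    direct_prompt = (
        "Q: A bus holds 40 people. How many people fit in 3 buses?\n"
        "A:"
    )

    # Chain-of-thought prompt: a worked example spells out the reasoning first,
    # which tends to lead the model to more accurate final answers.
    cot_prompt = (
        "Q: A shop sells pens in packs of 12. How many pens are in 4 packs?\n"
        "A: Each pack has 12 pens and there are 4 packs, so 12 * 4 = 48. "
        "The answer is 48.\n\n"
        "Q: A bus holds 40 people. How many people fit in 3 buses?\n"
        "A:"
    )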

Prompt design & engineering

"Prompt design involves creating a clear, concise, and informative prompt for the desired task, while prompt engineering focuses on improving performance. This may involve using domain-specific knowledge, providing examples of the desired output, or using keywords that are known to be effective for the specific system"[3], as well as adjusting the system's parameters to improve performance. In short, it is the task of developing prompts that guide a model to perform a specialized task: a process of structuring the input so that the response is accurate and effective.

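A hypothetical example (the review texts and labels are invented) of moving from a plainly designed prompt to an engineered one that fixes the output format and provides examples of the desired output:

    # Prompt design: a clear, concise description of the task.
    basic_prompt = (
        'Classify the sentiment of this review: "The battery dies within an hour."'
    )

    # Prompt engineering: the same task, with an explicit output format and
    # a few examples of the desired output to steer the model.
    engineered_prompt = (
        "You are a product-review classifier. "
        "Answer with exactly one word: positive or negative.\n\n"
        'Review: "Arrived quickly and works perfectly."\n'
        "Sentiment: positive\n\n"
        'Review: "Stopped charging after two days."\n'
        "Sentiment: negative\n\n"
        'Review: "The battery dies within an hour."\n'
        "Sentiment:"
    )
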
[1] "ChatGPT", Wikipedia.

[2] Recommended video: "Introduction to large language models", Google Cloud Tech.

[3] "Understanding the Difference Between GPT and LLM", blog.stackademic.com.
 

 

Recommended video(s)

  • "Introduction to large language models", Google Cloud Tech

Further reading(s)

  • Attention Is All You Need
  • Artificial intelligence the future of content management and the web, Dries Buy…

