ML Models - Introduction¶
Under the hood, Gainly uses a variety of machine learning (ML) models, including traditional ML models, foundation models (FM), and large language models (LLM), to implement its AI features.
What is a model?¶
In machine learning and artificial intelligence (AI), a model is a mathematical representation or algorithm designed to make predictions or decisions based on input data.
The model learns patterns from data during a process called training, where it adjusts its internal parameters to minimize the difference between its predictions and the actual outcomes. Once trained, the model can be used to infer new information, classify data, or make predictions.
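The training loop described above can be sketched in a few lines. This toy example fits a one-parameter linear model by gradient descent, repeatedly adjusting the parameter to shrink the gap between predictions and actual outcomes (purely illustrative; it is not how any production model is implemented):

```python
# Toy dataset generated by the true relationship y = 2 * x.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]

w = 0.0               # the model's single internal parameter
learning_rate = 0.05

for _ in range(200):  # training: adjust w to minimize squared error
    # Gradient of mean squared error with respect to w.
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    w -= learning_rate * grad  # step the parameter downhill

# Inference: apply the trained model to a new input.
prediction = w * 5.0
```

After training, `w` converges close to 2.0, so the model predicts roughly 10.0 for the unseen input 5.0.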
Traditional ML models¶
Traditional ML models are specialized algorithms designed to perform specific tasks by learning patterns from structured data. Unlike foundation models, these models are typically:
- Task-specific: Trained for a single, well-defined purpose such as classification
- Data-efficient: Require less training data than foundation models
- Computationally lighter: Generally require fewer computing resources to train and run
These models excel at structured data tasks such as:
- Numerical predictions
- Binary and multi-class classification
- Anomaly detection
- Pattern recognition in tabular data
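As a concrete sketch of a task-specific traditional model, the toy nearest-centroid classifier below performs binary classification on tabular (2-D) data: each class is summarized by the mean of its training points, and new points are assigned to the closer centroid. The data and labels are made up for illustration:

```python
import math

def centroid(points):
    """Mean of a list of 2-D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

# Labeled training data: two well-separated classes.
class_a = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5)]
class_b = [(8.0, 8.0), (8.5, 9.0), (9.0, 8.5)]

centroid_a, centroid_b = centroid(class_a), centroid(class_b)

def classify(point):
    """Binary classification: label by the nearest class centroid."""
    if math.dist(point, centroid_a) <= math.dist(point, centroid_b):
        return "A"
    return "B"
```

Note how little data the model needs for its single, narrow task, in contrast to the foundation models described next.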
Foundation models (FM)¶
Foundation models (FM) are large-scale models trained on vast amounts of data that can be adapted for a wide range of tasks. Unlike traditional ML models, foundation models are characterized by:
- Task-agnostic: Trained on broad datasets without specific task objectives
- Transfer learning: Ability to apply knowledge learned from one task to new, different tasks
- Scale-dependent: Performance generally improves with increased model size and training data
- Resource-intensive: Require significant computational resources to train and run
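The transfer-learning property listed above can be sketched with a toy setup: a "pretrained" feature extractor is kept frozen, and only a small task-specific head is trained on a handful of new examples. Real foundation models learn their features from vast datasets; the hand-coded extractor here is purely illustrative:

```python
def pretrained_features(x):
    """Stand-in for a frozen, pretrained encoder."""
    return (1.0, x, x * x)  # bias, linear, and quadratic features

# Tiny downstream dataset for a new task: y = 3x^2 + 1.
data = [(0.0, 1.0), (1.0, 4.0), (2.0, 13.0)]

# Train only the small linear head; the encoder stays untouched.
weights = [0.0, 0.0, 0.0]
for _ in range(20000):
    for x, y in data:
        feats = pretrained_features(x)
        error = sum(w * f for w, f in zip(weights, feats)) - y
        for i, f in enumerate(feats):
            weights[i] -= 0.01 * error * f  # gradient step on the head only

def predict(x):
    """Inference: frozen features plus the newly trained head."""
    return sum(w * f for w, f in zip(weights, pretrained_features(x)))
```

Because the heavy lifting is done by the reused features, the new task is learned from just three examples.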
Common applications of foundation models include:
- Text understanding and generation
- Image recognition and generation
- Speech processing
- Multi-modal tasks (combining text, images, audio)
In addition, these models serve as a foundation for developing more specialized models, with large language models being a prime example.
Large language models (LLM)¶
A large language model (LLM) is a type of foundation model (FM) specifically designed to understand and generate human-like text, trained on vast amounts of language data. Key characteristics include:
- Scale: Typically contain billions or trillions of parameters
- Architecture: Based on transformer neural networks
- Training: Self-supervised pre-training on internet-scale text data, followed by supervised fine-tuning (SFT) and, in some cases, reinforcement learning (RL)
- Versatility: Can perform multiple language tasks without task-specific training
Common capabilities of LLMs include:
- Text generation and completion
- Question answering
- Summarization
- Translation
- Code generation
- Reasoning and analysis
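The self-supervised objective behind these capabilities is next-token prediction. The toy bigram model below makes that concrete: it counts word transitions in a tiny corpus (the raw text itself provides the labels, with no human annotation) and generates text by repeatedly emitting the most likely next word. Real LLMs replace the count table with a transformer network over billions of parameters, but the underlying objective is the same:

```python
from collections import Counter, defaultdict

corpus = ("the model learns patterns from data "
          "and the model makes predictions from data")

# Self-supervised training data: each word is paired with the word
# that follows it in the raw text.
words = corpus.split()
transitions = defaultdict(Counter)
for prev, nxt in zip(words, words[1:]):
    transitions[prev][nxt] += 1

def generate(start, length):
    """Greedy generation: always pick the most frequent next word."""
    out = [start]
    for _ in range(length - 1):
        if out[-1] not in transitions:
            break
        out.append(transitions[out[-1]].most_common(1)[0][0])
    return " ".join(out)
```

For example, `generate("the", 2)` continues the prompt with the transition seen most often in the corpus, producing `"the model"`.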