ML Models - Introduction¶
Under the hood, Gainly uses a variety of machine learning (ML) models, including traditional ML models, foundation models (FM), and large language models (LLM), to implement its AI features.
What is a model?¶
In machine learning and artificial intelligence (AI), a model is a mathematical representation or algorithm designed to make predictions or decisions based on input data.
The model learns patterns from data during a process called training, where it adjusts its internal parameters to minimize the difference between its predictions and the actual outcomes. Once trained, the model can be used to infer new information, classify data, or make predictions.
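The training loop described above can be sketched in a few lines. This toy example fits a one-parameter linear model by gradient descent, repeatedly adjusting the parameter to shrink the gap between predictions and actual outcomes (purely illustrative; it is not how any production model is implemented):

```python
# Toy dataset generated by the true relationship y = 2 * x.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]

w = 0.0               # the model's single internal parameter
learning_rate = 0.05

for _ in range(200):  # training: adjust w to minimize squared error
    # Gradient of mean squared error with respect to w.
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    w -= learning_rate * grad  # step the parameter downhill

# Inference: apply the trained model to a new input.
prediction = w * 5.0
```

After training, `w` converges close to 2.0, so the model predicts roughly 10.0 for the unseen input 5.0.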
Traditional ML models¶
Traditional ML models are specialized algorithms designed to perform specific tasks by learning patterns from structured data. Unlike foundation models, these models are typically:
- Task-specific: Trained for a single, well-defined purpose such as classification
- Data-efficient: Require less training data than foundation models
- Computationally lighter: Generally require fewer computing resources to train and run
These models excel at structured data tasks such as:
- Numerical predictions
- Binary and multi-class classification
- Anomaly detection
- Pattern recognition in tabular data
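As a concrete sketch of a task-specific traditional model, the toy nearest-centroid classifier below performs binary classification on tabular (2-D) data: each class is summarized by the mean of its training points, and new points are assigned to the closer centroid. The data and labels are made up for illustration:

```python
import math

def centroid(points):
    """Mean of a list of 2-D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

# Labeled training data: two well-separated classes.
class_a = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5)]
class_b = [(8.0, 8.0), (8.5, 9.0), (9.0, 8.5)]

centroid_a, centroid_b = centroid(class_a), centroid(class_b)

def classify(point):
    """Binary classification: label by the nearest class centroid."""
    if math.dist(point, centroid_a) <= math.dist(point, centroid_b):
        return "A"
    return "B"
```

Note how little data the model needs for its single, narrow task, in contrast to the foundation models described next.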
Foundation models (FM)¶
Foundation models (FM) are large-scale models trained on vast amounts of data that can be adapted for a wide range of tasks. Unlike traditional ML models, foundation models are characterized by:
- Task-agnostic: Trained on broad datasets without specific task objectives
- Transfer learning: Ability to apply knowledge learned from one task to new, different tasks
- Scale-dependent: Performance generally improves with increased model size and training data
- Resource-intensive: Require significant computational resources to train and run
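The transfer-learning property listed above can be sketched with a toy setup: a "pretrained" feature extractor is kept frozen, and only a small task-specific head is trained on a handful of new examples. Real foundation models learn their features from vast datasets; the hand-coded extractor here is purely illustrative:

```python
def pretrained_features(x):
    """Stand-in for a frozen, pretrained encoder."""
    return (1.0, x, x * x)  # bias, linear, and quadratic features

# Tiny downstream dataset for a new task: y = 3x^2 + 1.
data = [(0.0, 1.0), (1.0, 4.0), (2.0, 13.0)]

# Train only the small linear head; the encoder stays untouched.
weights = [0.0, 0.0, 0.0]
for _ in range(20000):
    for x, y in data:
        feats = pretrained_features(x)
        error = sum(w * f for w, f in zip(weights, feats)) - y
        for i, f in enumerate(feats):
            weights[i] -= 0.01 * error * f  # gradient step on the head only

def predict(x):
    """Inference: frozen features plus the newly trained head."""
    return sum(w * f for w, f in zip(weights, pretrained_features(x)))
```

Because the heavy lifting is done by the reused features, the new task is learned from just three examples.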
Common applications of foundation models include:
- Text understanding and generation
- Image recognition and generation
- Speech processing
- Multi-modal tasks (combining text, images, audio)
In addition, these models serve as a foundation for developing more specialized models, with large language models being a prime example.
Large language models (LLM)¶
A large language model (LLM) is a type of foundation model (FM) specifically designed to understand and generate human-like text, trained on vast amounts of language data. Key characteristics include:
- Scale: Typically contain billions or trillions of parameters
- Architecture: Based on transformer neural networks
- Training: Self-supervised pre-training on internet-scale text data, followed by supervised fine-tuning (SFT) and, in some cases, reinforcement learning (RL)
- Versatility: Can perform multiple language tasks without task-specific training
Common capabilities of LLMs include:
- Text generation and completion
- Question answering
- Summarization
- Translation
- Code generation
- Reasoning and analysis
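The self-supervised objective behind these capabilities is next-token prediction. The toy bigram model below makes that concrete: it counts word transitions in a tiny corpus (the raw text itself provides the labels, with no human annotation) and generates text by repeatedly emitting the most likely next word. Real LLMs replace the count table with a transformer network over billions of parameters, but the underlying objective is the same:

```python
from collections import Counter, defaultdict

corpus = ("the model learns patterns from data "
          "and the model makes predictions from data")

# Self-supervised training data: each word is paired with the word
# that follows it in the raw text.
words = corpus.split()
transitions = defaultdict(Counter)
for prev, nxt in zip(words, words[1:]):
    transitions[prev][nxt] += 1

def generate(start, length):
    """Greedy generation: always pick the most frequent next word."""
    out = [start]
    for _ in range(length - 1):
        if out[-1] not in transitions:
            break
        out.append(transitions[out[-1]].most_common(1)[0][0])
    return " ".join(out)
```

For example, `generate("the", 2)` continues the prompt with the transition seen most often in the corpus, producing `"the model"`.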