Predictor ML Model
You can train a custom Predictor ML model from your Gainly Dashboard and use it in the Predict endpoint.
What is a Predictor Model?
A Predictor model is a machine learning (ML) model that predicts an outcome based on patterns in your historical structured data.
Types of Outcomes
Predictor models can predict the following types of outcomes:
- Numeric: Predict a continuous number value (examples: price, temperature)
- Non-Numeric: Predict a boolean value (examples: yes/no, true/false) or a string value (examples: category, video ID)
Example Use Cases
- Forecast monthly sales for each product
- Predict customer lifetime value based on early interactions
- Estimate how long a project will take based on task characteristics
- Predict optimal pricing based on market conditions
- Predict if a customer is likely to churn based on their activity patterns
- Determine if a transaction is potentially fraudulent
- Forecast subscription renewal based on engagement metrics
- Predict equipment failure based on performance metrics
- Categorize financial transactions based on attributes
- Classify customer segments based on behavioral metrics
- Determine product quality grades from manufacturing parameters
- Categorize risk levels based on financial indicators
- Recommend products based on user history
- Recommend personalized content based on user behavior
- Recommend new features for a user to try
- Recommend optimal subscription tiers based on usage metrics
- Predict the most effective marketing channel for each customer segment
- Determine optimal inventory levels based on seasonal patterns
- Suggest the best shipping method based on order details
- Forecast resource allocation needs based on project attributes
Dataset for Training
To train a Predictor ML model, we recommend at least 1000 samples of your historical structured data. These samples must be:
- High quality
- Balanced
- Representative of your domain
- Consistent
Dataset for Numeric vs. Non-Numeric Outcomes
Numeric outcomes:
- We recommend at least 1000 samples of your historical structured data to train on.
Non-Numeric outcomes:
- We recommend at least 500 samples per class of your historical structured data to train on.
- A class refers to a unique value in the label column (see below for more details on the label column).
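One way to check the per-class recommendation before uploading is to count how often each value appears in the label column. The following is a minimal sketch using only the Python standard library; the function names `count_label_classes` and `underrepresented_classes` and the sample data are illustrative assumptions, not part of Gainly's tooling.

```python
import csv
import io
from collections import Counter

# Recommended minimum samples per class for non-numeric outcomes (from the guidance above).
RECOMMENDED_PER_CLASS = 500


def count_label_classes(csv_file):
    """Count rows per unique value in the `label` column of an open CSV file."""
    reader = csv.DictReader(csv_file)
    return Counter(row["label"] for row in reader)


def underrepresented_classes(counts, minimum=RECOMMENDED_PER_CLASS):
    """Return the classes with fewer rows than `minimum`, sorted by name."""
    return sorted(name for name, n in counts.items() if n < minimum)


if __name__ == "__main__":
    # Tiny in-memory sample; a real dataset would be opened with
    # open(path, newline="", encoding="utf-8").
    sample = io.StringIO("plan_tier,label\nbasic,churn\npro,stay\npro,stay\n")
    counts = count_label_classes(sample)
    print(counts)
    print(underrepresented_classes(counts))
```

Classes reported by `underrepresented_classes` are candidates for collecting more historical data before training.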
Structured Data
In the context of training a Predictor model, structured data refers to tabular data that is organized with columns and rows and contains the following data types:
| Data Type | Format | Example Column Name | Example Values | 
|---|---|---|---|
| String | Text strings (1-100 chars) | customer_tier, video_id | "premium", "video_192864" | 
| Numeric | Integer or decimal | order_value | 29.99, 250 | 
| Boolean | Binary (1 or 0) | is_verified | 1, 0 | 
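If a source system stores booleans as text such as yes/no or true/false, those values would need to be converted to 1 or 0 to match the Boolean format in the table above. A minimal sketch, assuming a hypothetical helper `to_binary` and an assumed set of accepted spellings (adjust both to your own data):

```python
# Assumed spellings for true/false values; these sets are illustrative only.
TRUE_VALUES = {"1", "true", "yes", "y"}
FALSE_VALUES = {"0", "false", "no", "n"}


def to_binary(value):
    """Map a boolean-like value to 1 or 0, raising on anything unrecognized."""
    text = str(value).strip().lower()
    if text in TRUE_VALUES:
        return 1
    if text in FALSE_VALUES:
        return 0
    raise ValueError(f"Cannot interpret {value!r} as a boolean")


if __name__ == "__main__":
    print([to_binary(v) for v in ["Yes", "no", True, 0]])
```

Raising on unrecognized values, rather than guessing, keeps ambiguous entries from silently becoming training data.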
The dataset file must be in CSV format with the following column requirements.
Required Columns
| Column | Required | Description | Constraints | 
|---|---|---|---|
| label | Yes | Value (outcome) that the model will learn to predict | • Any supported data type (see above) • At least 2 unique values (outcomes) | 
Date and Time
- Date - Split into year, month, and day columns and use numeric values in each column
- Time - Split into hour, minute, and second columns and use numeric values in each column
Date Example
Convert a transaction_date column containing values such as 2024-12-25 into 3 separate columns:
- transaction_date_year: 2024
- transaction_date_month: 12
- transaction_date_day: 25
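The date example above can be sketched in Python as follows. This assumes dates arrive as ISO 8601 strings (YYYY-MM-DD) and times as HH:MM:SS. The helper names `split_date` and `split_time` are hypothetical, and the `_hour`, `_minute`, and `_second` suffixes are an assumed naming convention that mirrors the date example.

```python
from datetime import date, time


def split_date(column, value):
    """Split an ISO date string into numeric year, month, and day columns."""
    d = date.fromisoformat(value)
    return {
        f"{column}_year": d.year,
        f"{column}_month": d.month,
        f"{column}_day": d.day,
    }


def split_time(column, value):
    """Split an HH:MM:SS time string into numeric hour, minute, and second columns."""
    t = time.fromisoformat(value)
    return {
        f"{column}_hour": t.hour,
        f"{column}_minute": t.minute,
        f"{column}_second": t.second,
    }


if __name__ == "__main__":
    print(split_date("transaction_date", "2024-12-25"))
    print(split_time("transaction_time", "14:30:05"))
```

The original date or time column would then be dropped from the CSV and replaced by the generated numeric columns.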
Long-Form Text
Please note that long-form text is not supported in Predictor models. If your use case requires long-form text columns in addition to structured data columns, please refer to the Mixed Data Types page.
Additional Requirements
- CSV file must be UTF-8 encoded
- CSV file must have header row (first row) with column names
- Column names:
    - Must be unique
    - Must only contain lowercase letters, numbers, and underscores (a-z, 0-9, _)
- CSV file must include 1-1000 additional columns (excluding the label column). These columns provide the data that the model uses to learn patterns and make predictions about the label column.
- CSV file must contain (excluding the header row):
    - At least 100 rows of data
    - No more than 1,000,000 rows of data
- CSV file size must not exceed 200MB
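The structural requirements above can be checked locally before uploading. The sketch below uses only the Python standard library. The function names `check_header` and `check_csv` are hypothetical, and it assumes "200MB" means 200 × 1024 × 1024 bytes. It does not check data types or the number of unique label values.

```python
import csv
import os
import re

COLUMN_NAME_PATTERN = re.compile(r"[a-z0-9_]+")
MAX_FILE_BYTES = 200 * 1024 * 1024  # assumption: "200MB" read as 200 MiB
MIN_ROWS, MAX_ROWS = 100, 1_000_000
MIN_FEATURES, MAX_FEATURES = 1, 1000


def check_header(columns):
    """Return a list of problems found in the header row (empty if none)."""
    problems = []
    if len(set(columns)) != len(columns):
        problems.append("column names must be unique")
    bad = [c for c in columns if not COLUMN_NAME_PATTERN.fullmatch(c)]
    if bad:
        problems.append(f"invalid column names: {bad}")
    if "label" not in columns:
        problems.append("missing required 'label' column")
    features = [c for c in columns if c != "label"]
    if not MIN_FEATURES <= len(features) <= MAX_FEATURES:
        problems.append(f"expected 1-1000 non-label columns, found {len(features)}")
    return problems


def check_csv(path):
    """Return a list of problems found in the CSV file at `path` (empty if none)."""
    problems = []
    if os.path.getsize(path) > MAX_FILE_BYTES:
        problems.append("file exceeds 200MB")
    try:
        # Decoding as UTF-8 fails on files saved in another encoding.
        with open(path, newline="", encoding="utf-8") as f:
            reader = csv.reader(f)
            problems.extend(check_header(next(reader, [])))
            row_count = sum(1 for row in reader if row)  # skip blank lines
    except UnicodeDecodeError:
        return problems + ["file is not valid UTF-8"]
    if not MIN_ROWS <= row_count <= MAX_ROWS:
        problems.append(f"expected {MIN_ROWS}-{MAX_ROWS} data rows, found {row_count}")
    return problems
```

Calling `check_csv("training_data.csv")` (a placeholder path) returns an empty list when none of these checks fail.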
Steps for Training
1. Log in to your Gainly Dashboard.
2. Go to Settings > Custom Models.
3. Click the Create Model button.
4. Select Predictor as the Model Type.
5. Follow the on-screen instructions to train your model.