Predictor ML Model

You can train a custom Predictor ML model from your Gainly Dashboard and use it in the Predict endpoint.

What is a Predictor Model?

A Predictor model is a machine learning (ML) model that predicts an outcome based on patterns in your historical structured data.

Types of Outcomes

Predictor models can predict the following types of outcomes:

Numeric: Predict a continuous number value (examples: price, temperature)
Non-Numeric: Predict a boolean value (examples: yes/no, true/false) or a string value (examples: category, video ID)

Example Use Cases

Numeric Predictions:

Forecast monthly sales for each product
Predict customer lifetime value based on early interactions
Estimate how long a project will take based on task characteristics
Predict optimal pricing based on market conditions

Yes/No Decisions:

Predict if a customer is likely to churn based on their activity patterns
Determine if a transaction is potentially fraudulent
Forecast subscription renewal based on engagement metrics
Predict equipment failure based on performance metrics

Category Classifications:

Categorize financial transactions based on attributes
Classify customer segments based on behavioral metrics
Determine product quality grades from manufacturing parameters
Categorize risk levels based on financial indicators

Recommendations:

Recommend products based on user history
Recommend personalized content based on user behavior
Recommend new features for a user to try
Recommend optimal subscription tiers based on usage metrics

Other Examples:

Predict the most effective marketing channel for each customer segment
Determine optimal inventory levels based on seasonal patterns
Suggest the best shipping method based on order details
Forecast resource allocation needs based on project attributes

Dataset for Training

To train a Predictor ML model, we recommend at least 1,000 samples of your historical structured data. These samples must be:

High quality
Balanced
Representative of your domain
Consistent

Dataset for Numeric vs. Non-Numeric Outcomes

Numeric outcomes:

We recommend at least 1000 samples of your historical structured data to train on.

Non-Numeric outcomes:

We recommend at least 500 samples per class of your historical structured data to train on.
A class refers to a unique value in the label column (see below for more details on the label column).
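
To check the per-class recommendation before uploading, you can count how many rows fall into each class. The following is a minimal sketch using only the Python standard library. The helper names (`count_label_classes`, `underrepresented_classes`) and the sample data are hypothetical and are not part of the Gainly API.

```python
import csv
import io
from collections import Counter


def count_label_classes(csv_text: str, label_column: str = "label") -> Counter:
    """Count how many rows fall into each class (unique label value)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[label_column] for row in reader)


def underrepresented_classes(counts: Counter, minimum: int = 500) -> list[str]:
    """Return the classes with fewer rows than the recommended minimum."""
    return sorted(name for name, n in counts.items() if n < minimum)


# Tiny made-up dataset, for illustration only.
sample = "plan_tier,label\nbasic,churn\npremium,stay\nbasic,stay\n"
counts = count_label_classes(sample)
print(counts)
print(underrepresented_classes(counts))
```

Any class reported by `underrepresented_classes` is below the recommended 500 samples and may need more historical data.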

Structured Data

In the context of training a Predictor model, structured data refers to tabular data that is organized with columns and rows and contains the following data types:

Data Type	Format	Example Column Name	Example Values
String	Text strings (1-100 chars)	`customer_tier`, `video_id`	`"premium"`, `"video_192864"`
Numeric	Integer or decimal	`order_value`	`29.99`, `250`
Boolean	Binary (1 or 0)	`is_verified`	`1`, `0`
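
Because boolean columns must contain `1` or `0`, source data that stores values such as `true`/`false` or `yes`/`no` needs to be converted first. Here is a small sketch of one way to do that in Python. The accepted spellings in `TRUE_VALUES` and `FALSE_VALUES` are assumptions about your source data, not a list defined by Gainly, and `to_binary` is a hypothetical helper.

```python
# Assumed spellings of true/false in the source data; adjust to your dataset.
TRUE_VALUES = {"true", "yes", "y", "1"}
FALSE_VALUES = {"false", "no", "n", "0"}


def to_binary(value) -> int:
    """Map a boolean-like value to 1 or 0, or raise if it is not recognized."""
    text = str(value).strip().lower()
    if text in TRUE_VALUES:
        return 1
    if text in FALSE_VALUES:
        return 0
    raise ValueError(f"not a recognized boolean value: {value!r}")


print(to_binary("Yes"), to_binary(False))
```

Raising on unrecognized values, rather than defaulting to `0`, keeps unexpected entries from silently skewing the training data.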

The dataset file must be in CSV format with the following column requirements.

Required Columns

Column	Required	Description	Constraints
`label`	Yes	Value (outcome) that the model will learn to predict	• Any supported data type (see above) • At least 2 unique values (outcomes)

Date and Time

Date - Split into year, month, and day columns and use numeric values in each column
Time - Split into hour, minute, and second columns and use numeric values in each column

Date Example

Convert a transaction_date column containing values such as 2024-12-25 into 3 separate columns:

transaction_date_year: 2024
transaction_date_month: 12
transaction_date_day: 25
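
The conversion above can be sketched in Python as follows. This assumes the source values are ISO 8601 strings (`YYYY-MM-DD` for dates, `HH:MM:SS` for times). The helpers `split_date` and `split_time` and the `created_time` column are hypothetical names used for illustration.

```python
from datetime import date, time


def split_date(column: str, value: str) -> dict[str, int]:
    """Split an ISO date string into numeric year, month, and day columns."""
    parsed = date.fromisoformat(value)
    return {
        f"{column}_year": parsed.year,
        f"{column}_month": parsed.month,
        f"{column}_day": parsed.day,
    }


def split_time(column: str, value: str) -> dict[str, int]:
    """Split an ISO time string into numeric hour, minute, and second columns."""
    parsed = time.fromisoformat(value)
    return {
        f"{column}_hour": parsed.hour,
        f"{column}_minute": parsed.minute,
        f"{column}_second": parsed.second,
    }


# Replace the original date column with its three numeric parts.
row = {"transaction_date": "2024-12-25", "order_value": 29.99}
row.update(split_date("transaction_date", row.pop("transaction_date")))
print(row)
print(split_time("created_time", "14:30:05"))
```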

Long-Form Text

Long-form text is not supported in Predictor models. If your use case requires long-form text columns in addition to structured data columns, refer to the Mixed Data Types page.

Additional Requirements

CSV file must be UTF-8 encoded
CSV file must have a header row (first row) with column names
Column names:
- Must be unique
- Must only contain lowercase letters, numbers, and underscores (a-z, 0-9, _)
CSV file must include 1-1000 additional columns (excluding the label column). These columns provide the data that the model uses to learn patterns and make predictions about the label column.
CSV file must contain (excluding the header row):
- At least 100 rows of data
- No more than 1,000,000 rows of data
CSV file size must not exceed 200MB
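
A local pre-upload check for the requirements above might look like the sketch below. It is an illustration only and does not reproduce Gainly's own validation: `validate_dataset` is a hypothetical helper, "200MB" is assumed to mean 200 * 1024 * 1024 bytes, and blank lines are assumed not to count as data rows.

```python
import csv
import io
import re

COLUMN_NAME = re.compile(r"[a-z0-9_]+")
MAX_BYTES = 200 * 1024 * 1024  # assumption: "200MB" means 200 * 1024 * 1024 bytes


def validate_dataset(raw: bytes) -> list[str]:
    """Return a list of problems found; an empty list means all checks passed."""
    problems = []
    if len(raw) > MAX_BYTES:
        problems.append("file is larger than 200MB")
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        return problems + ["file is not UTF-8 encoded"]

    rows = csv.reader(io.StringIO(text, newline=""))
    header = next(rows, None)
    if not header:
        return problems + ["file has no header row"]

    if len(set(header)) != len(header):
        problems.append("column names are not unique")
    bad_names = [name for name in header if not COLUMN_NAME.fullmatch(name)]
    if bad_names:
        problems.append(f"invalid column names: {bad_names}")
    if "label" not in header:
        problems.append("missing required 'label' column")

    # Every column other than the label counts as an additional column.
    feature_count = sum(1 for name in header if name != "label")
    if not 1 <= feature_count <= 1000:
        problems.append(f"expected 1-1000 additional columns, found {feature_count}")

    # Assumption: blank lines are ignored when counting data rows.
    row_count = sum(1 for row in rows if row)
    if not 100 <= row_count <= 1_000_000:
        problems.append(f"expected 100-1,000,000 data rows, found {row_count}")
    return problems


example = ("label,order_value\n" + "yes,1\n" * 100).encode("utf-8")
print(validate_dataset(example))
```

This version reads the whole file into memory for simplicity. For files near the size limit, a streaming variant that reads the file line by line would be a better fit.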

Steps for Training

Log in to your Gainly Dashboard.
Go to Settings > Custom Models.
Click the Create Model button.
Select Predictor as the Model Type.
Follow the on-screen instructions to train your model.