Text Classifier ML Model
You can train a custom Text Classifier ML model from your Gainly Dashboard and use it in the Classify endpoint.
What is a Text Classifier Model?
A Text Classifier model is a machine learning (ML) model that classifies long-form text content (video/audio transcripts, support tickets, emails, reviews, comments, articles, etc.) based on your historical samples.
Example Use Cases
- Assign labels to video/audio transcripts
- Assign categories to product descriptions
- Classify user reviews/comments by sentiment
- Categorize customer feedback by topic or product
- Flag urgent vs non-urgent customer inquiries
- Detect spam in emails or user posts
- Route support tickets to the right team (technical, billing, sales)
- Assign topics to documents or articles
Dataset for Training
To train a Text Classifier ML model, we recommend at least 1,000 samples per class from your historical data. These samples must be:
- High quality
- Balanced
- Representative of your domain
- Consistent
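The balance and sample-count recommendations above can be checked before uploading. The following is a minimal sketch, not part of Gainly itself: the function name `class_balance_report` and the "imbalance ratio" measure are illustrative assumptions, and the 1,000 threshold is the recommendation from this page rather than a hard limit.

```python
from collections import Counter

# Recommended minimum from this page (a recommendation, not a hard limit).
RECOMMENDED_MIN_PER_CLASS = 1000


def class_balance_report(labels, min_per_class=RECOMMENDED_MIN_PER_CLASS):
    """Summarize how many samples each class has.

    Returns per-class counts, the classes that fall below min_per_class,
    and the ratio between the largest and the smallest class.
    """
    counts = Counter(labels)
    if not counts:
        raise ValueError("no labels provided")
    smallest = min(counts.values())
    largest = max(counts.values())
    return {
        "counts": dict(counts),
        "below_recommended": sorted(c for c, n in counts.items() if n < min_per_class),
        "imbalance_ratio": largest / smallest,
    }


# Hypothetical label column for a support-ticket dataset.
labels = ["billing"] * 1200 + ["technical"] * 1500 + ["sales"] * 300
report = class_balance_report(labels)
print(report["below_recommended"])  # classes with fewer than 1,000 samples
print(report["imbalance_ratio"])    # largest class size / smallest class size
```

A ratio close to 1 suggests a balanced dataset. What counts as "too imbalanced" is a judgment call that this page does not define.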
The dataset file must be in CSV format with exactly two columns as described below.
Required Columns
Column | Required | Description | Constraints |
---|---|---|---|
label | Yes | Target class/category for classification | • Non-empty string values only • 2-25 unique values (classes) • Length: 1-100 characters |
text | Yes | Text content to classify | • Non-empty string values only • Length: 25-5000 characters |
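The per-row constraints in the table can be checked locally before upload. This is a minimal sketch under stated assumptions: the helper name `row_problems` is hypothetical, lengths are counted in Unicode characters with `len()`, and whitespace-only values are treated as empty. Gainly's own validation may count or trim differently.

```python
# Length limits taken from the Required Columns table.
LABEL_MIN, LABEL_MAX = 1, 100
TEXT_MIN, TEXT_MAX = 25, 5000


def row_problems(label, text):
    """Return a list of constraint violations for one (label, text) row."""
    problems = []

    # Assumption: whitespace-only values count as empty.
    if not isinstance(label, str) or not label.strip():
        problems.append("label is empty")
    elif not LABEL_MIN <= len(label) <= LABEL_MAX:
        problems.append(f"label length {len(label)} is outside {LABEL_MIN}-{LABEL_MAX}")

    if not isinstance(text, str) or not text.strip():
        problems.append("text is empty")
    elif not TEXT_MIN <= len(text) <= TEXT_MAX:
        problems.append(f"text length {len(text)} is outside {TEXT_MIN}-{TEXT_MAX}")

    return problems


# Hypothetical rows: the first passes, the second has a text that is too short.
print(row_problems("billing", "I was charged twice for my subscription this month."))
print(row_problems("billing", "Refund please"))
```

The 2-25 unique classes constraint applies to the whole label column rather than to a single row, so it is not covered by this per-row check.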
Additional Requirements
- CSV file must be UTF-8 encoded
- CSV file must have a header row (first row) with the column names label and text
- CSV file must contain (excluding the header row):
  - At least 100 rows of data
  - No more than 20,000 rows of data
- CSV file size must not exceed 200MB
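The file-level requirements above can be checked with a short script. This is a sketch using only the Python standard library, not Gainly's actual validator. The function name `file_problems` is hypothetical, "200MB" is read as 200 × 1024 × 1024 bytes, either column order is accepted, and a leading UTF-8 byte order mark is tolerated. All of these are assumptions.

```python
import csv
import os

MAX_FILE_BYTES = 200 * 1024 * 1024  # assumption: "200MB" read as 200 MiB
MIN_ROWS, MAX_ROWS = 100, 20_000    # data rows, excluding the header
MIN_CLASSES, MAX_CLASSES = 2, 25    # unique label values


def file_problems(path):
    """Return a list of file-level requirement violations for a dataset CSV."""
    problems = []

    if os.path.getsize(path) > MAX_FILE_BYTES:
        problems.append("file is larger than 200MB")

    try:
        # "utf-8-sig" reads plain UTF-8 and also strips a leading BOM if present.
        with open(path, newline="", encoding="utf-8-sig") as f:
            reader = csv.DictReader(f)
            fieldnames = reader.fieldnames or []
            if sorted(fieldnames) != ["label", "text"]:
                return problems + [f"header is {fieldnames!r}, expected columns label and text"]
            rows = list(reader)
    except UnicodeDecodeError:
        return problems + ["file is not valid UTF-8"]

    if not MIN_ROWS <= len(rows) <= MAX_ROWS:
        problems.append(f"found {len(rows)} data rows, expected {MIN_ROWS}-{MAX_ROWS}")

    # DictReader stores extra fields under the key None and fills missing ones with None.
    if any(None in row or None in row.values() for row in rows):
        problems.append("some lines do not have exactly two columns")

    classes = {row["label"] for row in rows if row["label"]}
    if not MIN_CLASSES <= len(classes) <= MAX_CLASSES:
        problems.append(f"found {len(classes)} unique classes, expected {MIN_CLASSES}-{MAX_CLASSES}")

    return problems
```

Calling `file_problems("dataset.csv")` on a hypothetical local file returns an empty list when none of these checks fail. It does not check the per-row length constraints.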
Steps for Training
- Log in to your Gainly Dashboard.
- Go to Settings > Custom Models.
- Click the Create Model button.
- Select Text Classifier as the Model Type.
- Follow the on-screen instructions to train your model.