Text Classifier ML Model

You can train a custom Text Classifier ML model from your Gainly Dashboard and use it in the Classify endpoint.

What is a Text Classifier Model?

A Text Classifier model is a machine learning (ML) model that assigns a label to long-form text content (video/audio transcripts, support tickets, emails, reviews, comments, articles, etc.) based on labeled samples from your historical data.

Example Use Cases
  • Assign labels to video/audio transcripts
  • Assign categories to product descriptions
  • Classify user reviews/comments by sentiment
  • Categorize customer feedback by topic or product
  • Flag urgent vs non-urgent customer inquiries
  • Detect spam in emails or user posts
  • Route support tickets to the right team (technical, billing, sales)
  • Assign topics to documents or articles

Dataset for Training

To train a Text Classifier ML model, we recommend at least 1,000 samples per class from your historical data. These samples must be:

  • High quality
  • Balanced
  • Representative of your domain
  • Consistent
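
As a rough way to check the "Balanced" criterion before uploading, you can count samples per class and compare the largest class to the smallest. The sketch below is illustrative only: class_balance is a hypothetical helper, the sample labels are made up, and this page does not define what ratio counts as balanced, so the number is for your own judgment.

```python
from collections import Counter

def class_balance(labels):
    """Return (counts, ratio), where ratio = largest class size / smallest class size."""
    counts = Counter(labels)
    if not counts:
        raise ValueError("no labels provided")
    ratio = max(counts.values()) / min(counts.values())
    return counts, ratio

# Made-up label column for illustration.
labels = ["billing"] * 1200 + ["technical"] * 1000 + ["sales"] * 400
counts, ratio = class_balance(labels)
for label, n in counts.most_common():
    print(f"{label}: {n}")
print(f"largest/smallest ratio: {ratio:.1f}")
```

A ratio close to 1 means the classes are about the same size; a large ratio suggests collecting more samples for the smaller classes.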

The dataset file must be in CSV format with exactly two columns as described below.

Required Columns

Column   Required   Description                               Constraints
label    Yes        Target class/category for classification  • Non-empty string values only
                                                              • 2-25 unique values (classes)
                                                              • Length: 1-100 characters
text     Yes        Text content to classify                  • Non-empty string values only
                                                              • Length: 25-5000 characters
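
The column constraints above can be checked locally before you upload. This is a minimal sketch, not Gainly's own validation: check_rows is a hypothetical helper, and it assumes that length is counted the way Python's len counts characters and that a whitespace-only value counts as non-empty, neither of which this page specifies.

```python
def check_rows(rows):
    """Check (label, text) pairs against the column constraints.

    Returns a list of human-readable problems (empty if none were found).
    """
    problems = []
    labels = set()
    for i, (label, text) in enumerate(rows, start=1):
        # An empty string has length 0, so the range checks also cover "non-empty".
        if not (1 <= len(label) <= 100):
            problems.append(f"row {i}: label length {len(label)} outside 1-100")
        if not (25 <= len(text) <= 5000):
            problems.append(f"row {i}: text length {len(text)} outside 25-5000")
        labels.add(label)
    if not (2 <= len(labels) <= 25):
        problems.append(f"{len(labels)} unique labels, expected 2-25")
    return problems

# Made-up rows for illustration; the second text is too short.
rows = [
    ("billing", "I was charged twice for my subscription this month."),
    ("technical", "App crashes"),
]
for problem in check_rows(rows):
    print(problem)
```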

Additional Requirements

  • CSV file must be UTF-8 encoded
  • CSV file must have a header row (first row) with the column names label and text
  • CSV file must contain (excluding the header row):
    • At least 100 rows of data
    • No more than 20,000 rows of data
  • CSV file size must not exceed 200 MB
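
The file-level requirements above can also be checked with a short script before uploading. The sketch below uses only Python's standard library; check_file is a hypothetical helper, and it assumes the size limit means 200 × 1024 × 1024 bytes and that the header must match label,text exactly and in that order (this page gives the names but does not state the order or the byte count).

```python
import csv
import os
import sys

MAX_BYTES = 200 * 1024 * 1024      # assumption: the size limit read as 200 MiB
MIN_ROWS, MAX_ROWS = 100, 20_000   # data rows, excluding the header row

def check_file(path):
    """Return a list of problems found in the dataset file (empty if none)."""
    problems = []
    if os.path.getsize(path) > MAX_BYTES:
        problems.append("file is larger than the size limit")
    try:
        # Decoding as UTF-8 raises UnicodeDecodeError on invalid bytes.
        with open(path, encoding="utf-8", newline="") as f:
            reader = csv.reader(f)
            header = next(reader, None)
            if header != ["label", "text"]:
                problems.append(f"unexpected header row: {header}")
            n_rows = sum(1 for _ in reader)
    except UnicodeDecodeError:
        problems.append("file is not valid UTF-8")
        return problems
    if not (MIN_ROWS <= n_rows <= MAX_ROWS):
        problems.append(f"{n_rows} data rows, expected {MIN_ROWS}-{MAX_ROWS}")
    return problems

if __name__ == "__main__" and len(sys.argv) > 1:
    for problem in check_file(sys.argv[1]):
        print(problem)
```

Run it with the path to your CSV file as the only argument; no output means none of these checks failed. It does not check the per-column constraints.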

Steps for Training

  1. Log in to your Gainly Dashboard.
  2. Go to Settings > Custom Models.
  3. Click the Create Model button.
  4. Select Text Classifier as the Model Type.
  5. Follow the on-screen instructions to train your model.