Filter

You can use the filter parameter in your API requests to specify filter criteria for document attributes, and find matching documents.

Gainly API supports complex filter criteria, including AND, OR, NOT logic, as well as nested criteria.

The following API requests support the filter parameter:

Filter vs. Search

Search involves performing a lexical or semantic (AI-Semantic) text search within documents, specifically targeting the content and title fields to match the value provided in the query parameter.

Filter applies filter criteria (specified via the filter parameter) to document attributes without conducting a text search.

Specifying both query and filter parameters in an API request (that supports both parameters) will apply the filter criteria to narrow down the document set, then perform a text search within those documents.
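
As a rough illustration, a request body that combines both parameters might be assembled like this in Python. The query text and the metadata field names are placeholders, and the endpoint the body would be sent to is left out because it depends on which API request you use.

```python
import json

# Hypothetical request body: the filter narrows the document set first,
# then the text search runs only within the documents that remain.
payload = {
    "query": "waterproof hiking boots",  # placeholder search text
    "filter": {
        "must": [
            {"term": {"metadata.is_sale": True}},
            {"range": {"metadata.price": {"lte": 100}}},
        ]
    },
}

# Serialize to the JSON that would go in the request body.
body = json.dumps(payload, indent=2)
print(body)
```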

Use the Filter Documents endpoint to apply filter criteria without performing a text search. This is useful for cases that only require querying document attributes, such as finding stores within 5 miles of the user.

  • Filter answers the question "Which documents match my attribute criteria?" Use it to find exact matches in document attributes.
  • Search answers the question "How well does the text in each document match my search terms?" Use it to search text fields (title and content) and sort by relevance.

Syntax

"filter": {
  "must": [ // optional, 1 or more clauses
      {
        CLAUSE
      },
      // additional clauses
  ],
  "must_not": [ // optional, 1 or more clauses
      {
        CLAUSE
      },
      // additional clauses     
  ],
  "should": [ // optional, 1 or more clauses
      {
        CLAUSE
      },
      // additional clauses
  ],
  "minimum_should_match": 1 // optional
}

The top-level keys are:

must

Only documents that match all clauses under must will be included. You can think of it as the AND operator.

must_not

Only documents that match none of the clauses under must_not will be included. You can think of it as the NOT operator.

should

Documents that match at least N clauses under should will be included, where N is specified by minimum_should_match. When N = 1, you can think of this as the OR operator.

minimum_should_match

Specifies the minimum number of should clauses that must be matched. If you specify must criteria, the default for minimum_should_match is 0; otherwise, it defaults to 1.

This can be specified using:

Type Example
integer "minimum_should_match": 1
percentage "minimum_should_match": "50%"
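
The defaulting rule can be sketched as a small Python helper. The function name is hypothetical, and two behaviors are assumptions that the text above does not state: percentages are rounded down, and a filter with no should clauses requires no should matches.

```python
def effective_minimum_should_match(filter_body: dict) -> int:
    """Return how many should clauses a document has to match."""
    should = filter_body.get("should", [])
    if not should:
        # Assumption: with no should clauses there is nothing to match.
        return 0
    # Documented default: 0 when must criteria are present, otherwise 1.
    default = 0 if filter_body.get("must") else 1
    value = filter_body.get("minimum_should_match", default)
    if isinstance(value, str) and value.endswith("%"):
        # Assumption: a percentage of the should clauses, rounded down.
        return len(should) * int(value[:-1]) // 100
    return int(value)


colors = [{"term": {"metadata.color": c}} for c in ("blue", "red", "green")]
print(effective_minimum_should_match({"should": colors}))
print(effective_minimum_should_match({"should": colors, "minimum_should_match": "50%"}))
```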

Example

"filter": {
    "must": [
        {
            "term": {
                "metadata.length": 52
            }
        },
        {
            "term": {
                "metadata.is_sale": true
            }
        },
        {
            "range": {
                "metadata.price": {
                    "gte": 50,
                    "lte": 100
                }
            }
        }
    ],
    "must_not": [
        {
            "term": {
                "metadata.status": "discontinued"
            }
        }
    ],
    "should": [
        {
            "term": {
                "metadata.color": "blue"
            }
        },
        {
            "term": {
                "metadata.color": "red"
            }
        }
    ],
    "minimum_should_match": 1
}
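
To make the semantics of this example concrete, the sketch below evaluates a filter against an in-memory Python dict. It is not how the Gainly API is implemented; it only mirrors the must, must_not, and should rules described above for term and range clauses with an integer minimum_should_match. The helper names (get_field, clause_matches, filter_matches) and the sample document are made up for illustration.

```python
def get_field(doc: dict, path: str):
    """Resolve a dot-notation path such as 'metadata.price'; None if absent."""
    current = doc
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def clause_matches(doc: dict, clause: dict) -> bool:
    """Evaluate one term or range clause (the only kinds handled here)."""
    kind, body = next(iter(clause.items()))
    field, spec = next(iter(body.items()))
    value = get_field(doc, field)
    if kind == "term":
        return value == spec
    if kind == "range":
        if value is None:
            return False
        checks = {
            "gt": lambda v, bound: v > bound,
            "gte": lambda v, bound: v >= bound,
            "lt": lambda v, bound: v < bound,
            "lte": lambda v, bound: v <= bound,
        }
        return all(checks[op](value, bound) for op, bound in spec.items())
    raise ValueError(f"unsupported clause: {kind}")


def filter_matches(doc: dict, flt: dict) -> bool:
    """Apply must (AND), must_not (NOT), and should (at least N) rules."""
    must = flt.get("must", [])
    must_not = flt.get("must_not", [])
    should = flt.get("should", [])
    default_min = 0 if must else 1
    minimum = flt.get("minimum_should_match", default_min) if should else 0
    hits = sum(clause_matches(doc, c) for c in should)
    return (
        all(clause_matches(doc, c) for c in must)
        and not any(clause_matches(doc, c) for c in must_not)
        and hits >= minimum
    )


example_filter = {
    "must": [
        {"term": {"metadata.length": 52}},
        {"term": {"metadata.is_sale": True}},
        {"range": {"metadata.price": {"gte": 50, "lte": 100}}},
    ],
    "must_not": [{"term": {"metadata.status": "discontinued"}}],
    "should": [
        {"term": {"metadata.color": "blue"}},
        {"term": {"metadata.color": "red"}},
    ],
    "minimum_should_match": 1,
}

# Made-up document: on sale, priced in range, not discontinued, and red.
doc = {"metadata": {"length": 52, "is_sale": True, "price": 75,
                    "status": "active", "color": "red"}}
print(filter_matches(doc, example_filter))  # True
```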

Supported fields for filter criteria

The following fields (document attributes) can be used to specify the filter criteria:

  • metadata subfields. Specify using dot notation, for example:
    • metadata.color
    • metadata.sale.sale_start_date
  • created_at
  • language
  • source_uri
  • tenant_id
  • updated_at

Clauses

Clauses are used within the top-level keys must, must_not, and should to define your filter criteria.

Term

Use the term clause to match for an exact value, in a field of any type.

"filter": {
  "must": [
    {
      "term": {
        "metadata.length": 52
      }
    },
    {
      "term": {
        "metadata.is_sale": true
      }
    }
  ]
}

Case-insensitive matches

If you'd like to perform a case-insensitive match against a keyword (string) field, you can use the extended syntax for term:

"filter": {
  "must": [
    {
      "term": {
        "metadata.color": {
          "value": "blue",
          "case_insensitive": true
        }
      }
    },
    {
      "term": {
        "metadata.is_sale": true
      }
    }
  ]
}

The case_insensitive parameter controls whether capitalization is ignored when matching. It defaults to false.

Array fields

Matching against array fields is a special case for the term clause.

Suppose metadata.color is an array of strings. The following clause will match documents that have blue as one of the elements in the metadata.color array.

"filter": {
  "must": [
    {
      "term": {
        "metadata.color": "blue"
      }
    }
  ]
}

For example, both of the following documents will match the query shown above.

{
  // some fields
  "metadata": {
    "color": [
      "blue",
      "red",
      "green"
    ]
  }
  // some more fields
}
{
  // some fields
  "metadata": {
    "color": "blue"
  }
  // some more fields
}
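
The array behavior can be pictured with a short Python sketch. This is only a local model of the rule described above, with a hypothetical function name; it says nothing about how the API evaluates the clause internally.

```python
def term_matches(field_value, wanted) -> bool:
    """Model of a term clause: arrays match if any element equals the value."""
    if isinstance(field_value, list):
        return wanted in field_value
    return field_value == wanted


# The two example documents above, reduced to their metadata.color values.
print(term_matches(["blue", "red", "green"], "blue"))
print(term_matches("blue", "blue"))
```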

Terms

Use the terms clause to match for multiple values in a field.

"filter": {
  "must": [
    {
      "terms": {
        "metadata.color": [
          "blue",
          "red"
        ]
      }
    }
  ]
}

A document matches if the field contains any of the values in the array, with the same capitalization.

For example, Document-1 and Document-2 will match the query above. However, Document-3 will not match due to capitalization differences.

// Document-1
{
  // some fields
  "metadata": {
    "color": [
      "blue",
      "green"
    ]
  }
  // some more fields
}
// Document-2
{
  // some fields
  "metadata": {
    "color": "red"
  }
  // some more fields
}
// Document-3
{
  // some fields
  "metadata": {
    "color": "Blue"
  }
  // some more fields
}

Case-insensitive Match for Multiple Values

terms clause does not support case-insensitive matches. To perform a case-insensitive match for multiple values in a field, use the term clause inside should:

"filter": {
  "should": [
    {
      "term": {
        "metadata.color": {
          "value": "blue",
          "case_insensitive": true
        }
      }
    },
    {
      "term": {
        "metadata.color": {
          "value": "red",
          "case_insensitive": true
        }
      }
    }
  ],
  "minimum_should_match": 1
}

This query will match all three example documents shown above.
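
If the list of values is long or built at runtime, the should clauses can be generated instead of written by hand. The helper below is a minimal sketch with a hypothetical name; it only produces the same structure as the example above.

```python
import json


def case_insensitive_any(field: str, values: list) -> dict:
    """Build a filter that matches any of the values, ignoring case."""
    return {
        "should": [
            {"term": {field: {"value": value, "case_insensitive": True}}}
            for value in values
        ],
        "minimum_should_match": 1,
    }


color_filter = case_insensitive_any("metadata.color", ["blue", "red"])
print(json.dumps({"filter": color_filter}, indent=2))
```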

Range

Use the range clause to match for a range of values in a field.

"filter": {
  "must": [
    {
      "range": {
        "metadata.price": {
          "gte": 20,
          "lte": 50
        }
      }      
    },
    {
      "term": {
        "metadata.is_sale": true
      }
    }
  ]
}

You can use the following operators:

  • gt: Greater than
  • lt: Less than
  • gte: Greater than or equal to
  • lte: Less than or equal to

The following field types can be used in the range clause:

  • integer
  • float
  • date
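
When range clauses are built in code, a small helper can catch misspelled operators before the request is sent. This is a sketch with a hypothetical function name; it validates only the four operator names listed above and does not check field types.

```python
ALLOWED_OPERATORS = {"gt", "gte", "lt", "lte"}


def range_clause(field: str, **bounds) -> dict:
    """Build a range clause, rejecting operators other than gt/gte/lt/lte."""
    unknown = set(bounds) - ALLOWED_OPERATORS
    if unknown:
        raise ValueError(f"unsupported operators: {sorted(unknown)}")
    if not bounds:
        raise ValueError("at least one bound is required")
    return {"range": {field: dict(bounds)}}


price_clause = range_clause("metadata.price", gte=20, lte=50)
print(price_clause)
```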

Date Values

When using the range clause (as explained above) for a date field, you can use the following date values.

"filter": {
  "must": [
    {
      "range": {
        "metadata.released_on": {
          "gte": "now-1M"           // relative date value (released in the last month)
        }
      }      
    },
    {
      "range": {
        "created_at": {
          "gte": "2023-12-31",     // absolute date value (ISO 8601 UTC default)
          "lte": "2024-12-01T00:00:00Z"
        }
      }
    },
    {
      "range": {
        "created_at": {
          "gte": "31/12/2023",     // absolute date value
          "format": "dd/MM/yyyy"   // (optional) format of date values
        }
      }
    }
  ]
}

Date Formats and Values

Format:

  • You can specify desired date format using the format parameter in the API request
    • If format is not specified, default format is ISO 8601 UTC datetime. E.g., 2025-01-12T20:48:57.845Z
    • If format is specified, please make sure any absolute date values in your API request match the specified format
  • How to specify format:
    • yyyy - Year (e.g., 2025)
    • MM - Month (01-12)
    • dd - Day of month (01-31)
    • HH - Hour in 24h format (00-23)
    • mm - Minutes (00-59)
    • ss - Seconds (00-59)
    • Common combinations:
      • yyyy-MM-dd - Date only (e.g., 2025-01-12)
      • yyyy-MM-dd'T'HH:mm:ss'Z' - Full UTC datetime
      • yyyy-MM-dd'T'HH:mm:ssXXX - Datetime with timezone offset (e.g., 2025-01-12T14:48:57-06:00)
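
One way to check on the client side that a date value agrees with a custom format is to translate the format tokens into Python strptime directives. The sketch below is an assumption-laden convenience, not part of the API: the function names are hypothetical, and it handles only the six tokens listed above, not quoted literals such as 'T' or the XXX offset token.

```python
from datetime import datetime

# Format tokens from the list above mapped to Python strptime directives.
TOKEN_TO_DIRECTIVE = [
    ("yyyy", "%Y"),
    ("MM", "%m"),
    ("dd", "%d"),
    ("HH", "%H"),
    ("mm", "%M"),
    ("ss", "%S"),
]


def to_strptime(pattern: str) -> str:
    """Translate a pattern such as dd/MM/yyyy into %d/%m/%Y."""
    for token, directive in TOKEN_TO_DIRECTIVE:
        pattern = pattern.replace(token, directive)
    return pattern


def matches_format(value: str, pattern: str) -> bool:
    """Return True if the value parses under the given format pattern."""
    try:
        datetime.strptime(value, to_strptime(pattern))
        return True
    except ValueError:
        return False


print(to_strptime("dd/MM/yyyy"))
print(matches_format("31/12/2023", "dd/MM/yyyy"))
print(matches_format("2023-12-31", "dd/MM/yyyy"))
```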

Date Values:

  • You can use absolute date values in ISO 8601 format:
    • Full datetime: 2024-11-01T12:34:56Z
    • Date only: 2024-11-01
    • With timezone offset: 2024-11-01T12:34:56-07:00
  • You can use relative date values like now, now-1d, now-1M, etc. Supported date math units for relative date values:
    • y (years)
    • M (months)
    • w (weeks)
    • d (days)
    • h (hours)
    • m (minutes)
    • s (seconds)
  • Supported date math operators:
    • + (add)
    • - (subtract)
    • / (round down)
  • Examples of round down operator:
    • now-1d/d (yesterday at 00:00:00 - rounds down to start of day)
    • now/d (today at 00:00:00)
    • now+1y/y (start of next year)
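
To see what a relative date value resolves to, the date math can be approximated locally. The sketch below is a simplified model with a hypothetical function name: it covers a single offset with the fixed-length units (w, d, h, m, s) and rounding down to the day or hour. The y and M units need calendar arithmetic and are deliberately left out, and the API's own resolution may differ in edge cases.

```python
import re
from datetime import datetime, timedelta, timezone

UNIT_TO_KWARG = {"w": "weeks", "d": "days", "h": "hours",
                 "m": "minutes", "s": "seconds"}
# now, optionally +/-<number><unit>, optionally /<rounding unit>
PATTERN = re.compile(r"^now(?:([+-])(\d+)([wdhms]))?(?:/([dh]))?$")


def resolve_relative(expr: str, now: datetime) -> datetime:
    """Resolve expressions such as now-1d/d relative to the given instant."""
    match = PATTERN.match(expr)
    if match is None:
        raise ValueError(f"unsupported expression: {expr}")
    sign, amount, unit, round_unit = match.groups()
    result = now
    if unit:
        delta = timedelta(**{UNIT_TO_KWARG[unit]: int(amount)})
        result = result + delta if sign == "+" else result - delta
    if round_unit == "d":
        # Round down to the start of the day.
        result = result.replace(hour=0, minute=0, second=0, microsecond=0)
    elif round_unit == "h":
        # Round down to the start of the hour.
        result = result.replace(minute=0, second=0, microsecond=0)
    return result


fixed_now = datetime(2025, 1, 12, 20, 48, 57, tzinfo=timezone.utc)
print(resolve_relative("now-1d/d", fixed_now).isoformat())  # 2025-01-11T00:00:00+00:00
print(resolve_relative("now/d", fixed_now).isoformat())     # 2025-01-12T00:00:00+00:00
```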

Exists

Use the exists clause to match documents that contain metadata, or a specific metadata field.

"filter": {
  "must": [
    {
      "exists": {
        "field": "metadata.color"
      }
    },
    {
      "term": {
        "metadata.is_sale": true
      }
    }
  ]
}

A document matches if it contains the specified field.

A document will contain the specified field if a value has ever been set for that field in the given document.
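
Given how must_not is described earlier, placing an exists clause under must_not should select documents that lack a field. The sketch below only builds that structure; the helper name is hypothetical, and whether the API accepts a filter made of must_not alone is an assumption to verify against your own data.

```python
import json


def missing_field_filter(field: str) -> dict:
    """Build a filter for documents where the field has never been set."""
    return {"must_not": [{"exists": {"field": field}}]}


no_color = missing_field_filter("metadata.color")
print(json.dumps({"filter": no_color}, indent=2))
```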

Geo clauses

You can use the following clauses with geo_point fields.

Geo-distance

Use the geo_distance clause to match documents whose geo-point lies within a given distance of a specified geo-point.

"filter": {
  "must": [
    {
      "geo_distance": {
        "distance": "10mi",  // distance from the specified point
        "metadata.store_location_geo_point": {  // geo-point field in your documents
          "lat": 45.77,      // latitude of the specified point
          "lon": -110.91     // longitude of the specified point
        }
      }
    }
  ]
}

The example clause shown above finds all stores located within 10 miles of the specified lat/lon.

Supported Distance Units:

  • mi (miles)
  • nmi (nautical miles)
  • km (kilometers)
  • m (meters)

Sorting by distance

The geo_distance clause only selects stores within 10 miles of the specified lat/lon; it does not order them.

To display the stores closest to the user first, you also have to sort the results by geo-distance.

Nested filter criteria

Bool

Use the bool clause to nest your filter criteria:

"filter": {
  "must": [
    {
      "bool": {
        "must": [
          { "term": { "field1": "value1" }},
          { "term": { "field2": "value2" }}
        ],
        "must_not": [
          { "term": { "field3": "value3" }}
        ]
      }
    },
    { "term": { "field4": "value4" }}
  ],
  "must_not": [
    { "term": { "field5": "value5" }}
  ],
  "should": [
    { "term": { "field6": "value6" }},
    { "term": { "field7": "value7" }}
  ],
  "minimum_should_match": 1
}

Each bool clause can contain the top-level keys (must, must_not, should, minimum_should_match), allowing you to nest your filter criteria to as many levels as needed.

This enables precise control over your filter criteria, using a structured and layered approach.
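
Nested criteria are easier to read when composed from small helpers. The sketch below uses hypothetical helper names (term, all_of, any_of) and placeholder field values to express "(color is blue AND on sale) OR status is clearance", while excluding discontinued items.

```python
import json


def term(field: str, value) -> dict:
    return {"term": {field: value}}


def all_of(*clauses: dict) -> dict:
    """Nested AND: every clause has to match."""
    return {"bool": {"must": list(clauses)}}


def any_of(*clauses: dict) -> dict:
    """Nested OR: at least one clause has to match."""
    return {"bool": {"should": list(clauses), "minimum_should_match": 1}}


nested_filter = {
    "must": [
        any_of(
            all_of(term("metadata.color", "blue"), term("metadata.is_sale", True)),
            term("metadata.status", "clearance"),
        )
    ],
    "must_not": [term("metadata.status", "discontinued")],
}

print(json.dumps({"filter": nested_filter}, indent=2))
```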

Common Pitfalls

  • Always use dot notation for metadata fields (e.g., metadata.brand not just brand)
  • For date ranges:
    • Ensure dates are properly formatted when using absolute dates (ISO 8601 format by default)
    • When using custom date formats, specify the format explicitly using the format parameter
    • Be careful with timezone handling - UTC is the default
  • For term filters:
    • Term matches are case-sensitive by default
    • Use the extended syntax with case_insensitive: true if you need case-insensitive matching
  • For geo-distance:
    • Ensure the field is of type geo_point in your metadata schema
    • Always specify both lat and lon values
    • Use appropriate distance units (e.g., km, mi)
  • When using bool queries:
    • Avoid deeply nested boolean conditions as they can impact performance
    • Use must_not instead of negating individual conditions
  • General tips:
    • Test your filters with a small dataset first
    • Use the most specific filter type for your use case
    • Consider the performance impact of complex filters on large datasets