Filter
You can use the filter parameter in your API requests to specify filter criteria for document attributes, and find matching documents.
Gainly API supports complex filter criteria, including AND, OR, NOT logic, as well as nested criteria.
The following API requests support the filter parameter:
Filter vs. Search
Search involves performing a lexical or semantic (AI-Semantic) text search within documents, specifically targeting the content and title fields to match the value provided in the query parameter.
Filter applies filter criteria (specified via the filter parameter) to document attributes without conducting a text search.
Specifying both query and filter parameters in an API request (that supports both parameters) will apply the filter criteria to narrow down the document set, then perform a text search within those documents.
Use the Filter Documents endpoint to apply filter criteria without performing a text search. This is useful for cases that only require querying document attributes, such as finding stores within 5 miles of the user.
| Filter | Search |
|---|---|
| Filter answers the question "Which documents match my attribute criteria?" | Search answers the question "How well does the text in each document match my search terms?" |
| To find exact matches in document attributes. | To search text fields (title and content) and sort by relevance. |
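As a rough illustration of the difference, the sketch below builds two request bodies in Python: one that combines query and filter, and one that carries only filter. Only the parameter names query and filter come from this page; the values are illustrative, and no endpoint URL or client call is shown because those depend on the specific API request you use.

```python
import json

# Hypothetical body for a request that supports both parameters:
# the filter narrows the document set, then the text search runs
# within the remaining documents.
search_body = {
    "query": "running shoes",
    "filter": {
        "must": [
            {"term": {"metadata.is_sale": True}},
        ],
    },
}

# Attribute-only variant: same criteria, no text search.
filter_only_body = {"filter": search_body["filter"]}

print(json.dumps(search_body, indent=2))
print(json.dumps(filter_only_body, indent=2))
```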
Syntax
"filter": {
  "must": [ // optional, 1 or more clauses
    {
      CLAUSE
    },
    // additional clauses
  ],
  "must_not": [ // optional, 1 or more clauses
    {
      CLAUSE
    },
    // additional clauses
  ],
  "should": [ // optional, 1 or more clauses
    {
      CLAUSE
    },
    // additional clauses
  ],
  "minimum_should_match": 1 // optional
}
The top-level keys are:
must
Only documents that match all clauses under must will be included. You can think of it as the AND operator.
must_not
Only documents that match none of the clauses under must_not will be included. You can think of it as the NOT operator.
should
Documents that match at least N clauses under should will be included, where N is specified by minimum_should_match. When N = 1, you can think of this as the OR operator.
minimum_should_match
Specifies the minimum number of should clauses that must be matched. If you specify must criteria, the default for minimum_should_match is 0; otherwise, it defaults to 1.
This can be specified using:
| Type | Example |
|---|---|
| integer | "minimum_should_match": 1 |
| percentage | "minimum_should_match": "50%" |
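The default rule for minimum_should_match can be sketched as a small local helper. This is not part of the Gainly API; the function name effective_minimum_should_match is hypothetical, and it only restates the documented default (0 when must criteria are present, otherwise 1).

```python
def effective_minimum_should_match(filter_body):
    """Return the explicit setting if present, else the documented default."""
    if "minimum_should_match" in filter_body:
        # May be an integer (1) or a percentage string ("50%").
        return filter_body["minimum_should_match"]
    # Documented default: 0 when must criteria are specified, otherwise 1.
    return 0 if filter_body.get("must") else 1


color_clause = {"term": {"metadata.color": "blue"}}

with_must = {"must": [{"term": {"metadata.is_sale": True}}], "should": [color_clause]}
should_only = {"should": [color_clause]}
explicit = {"should": [color_clause], "minimum_should_match": "50%"}

print(effective_minimum_should_match(with_must))
print(effective_minimum_should_match(should_only))
print(effective_minimum_should_match(explicit))
```

In the first case the should clause becomes optional for inclusion, which is easy to overlook when adding must criteria to an existing filter.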
Example
"filter": {
  "must": [
    {
      "term": {
        "metadata.length": 52
      }
    },
    {
      "term": {
        "metadata.is_sale": true
      }
    },
    {
      "range": {
        "metadata.price": {
          "gte": 50,
          "lte": 100
        }
      }
    }
  ],
  "must_not": [
    {
      "term": {
        "metadata.status": "discontinued"
      }
    }
  ],
  "should": [
    {
      "term": {
        "metadata.color": "blue"
      }
    },
    {
      "term": {
        "metadata.color": "red"
      }
    }
  ],
  "minimum_should_match": 1
}
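To make the semantics of the example concrete, here is a minimal local sketch that evaluates the same filter against in-memory documents. It is an illustration of the must / must_not / should rules described above, not Gainly's implementation; the helper names and the sample documents are hypothetical, and only simple term and range clauses with an integer minimum_should_match are handled.

```python
def get_field(doc, dotted_name):
    """Resolve a dot-notation name such as 'metadata.price' in a nested dict."""
    current = doc
    for part in dotted_name.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


RANGE_OPS = {
    "gt": lambda value, bound: value > bound,
    "gte": lambda value, bound: value >= bound,
    "lt": lambda value, bound: value < bound,
    "lte": lambda value, bound: value <= bound,
}


def clause_matches(doc, clause):
    """Evaluate one simple term or range clause (no extended syntax)."""
    kind, body = next(iter(clause.items()))
    field, spec = next(iter(body.items()))
    value = get_field(doc, field)
    if kind == "term":
        return value == spec
    if kind == "range":
        return value is not None and all(
            RANGE_OPS[op](value, bound) for op, bound in spec.items()
        )
    raise ValueError(f"unsupported clause type: {kind}")


def filter_matches(doc, flt):
    """Apply the must (AND), must_not (NOT), and should (at least N) rules."""
    must = flt.get("must", [])
    must_not = flt.get("must_not", [])
    should = flt.get("should", [])
    # Integer values only; percentages are not handled in this sketch.
    needed = flt.get("minimum_should_match", 0 if must else 1)
    should_hits = sum(clause_matches(doc, c) for c in should)
    return (
        all(clause_matches(doc, c) for c in must)
        and not any(clause_matches(doc, c) for c in must_not)
        and (not should or should_hits >= needed)
    )


example_filter = {
    "must": [
        {"term": {"metadata.length": 52}},
        {"term": {"metadata.is_sale": True}},
        {"range": {"metadata.price": {"gte": 50, "lte": 100}}},
    ],
    "must_not": [{"term": {"metadata.status": "discontinued"}}],
    "should": [
        {"term": {"metadata.color": "blue"}},
        {"term": {"metadata.color": "red"}},
    ],
    "minimum_should_match": 1,
}

# Hypothetical documents for illustration.
red_on_sale = {"metadata": {"length": 52, "is_sale": True, "price": 75,
                            "status": "active", "color": "red"}}
discontinued = {"metadata": dict(red_on_sale["metadata"], status="discontinued")}
green_on_sale = {"metadata": dict(red_on_sale["metadata"], color="green")}

print(filter_matches(red_on_sale, example_filter))
print(filter_matches(discontinued, example_filter))
print(filter_matches(green_on_sale, example_filter))
```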
Supported fields for filter criteria
The following fields (document attributes) can be used to specify the filter criteria:
- metadata subfields. Specify using dot notation, for example:
  - metadata.color
  - metadata.sale.sale_start_date
- created_at
- language
- source_uri
- tenant_id
- updated_at
Clauses
Clauses are used within the top-level keys must, must_not, and should to define your filter criteria.
Term
Use the term clause to match an exact value in a field of any type.
Case-insensitive matches
To perform a case-insensitive match against a keyword (string) field, use the extended syntax for term, in which the field maps to an object with value and case_insensitive keys.
The case_insensitive parameter controls whether capitalization is ignored; it defaults to false.
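The two forms of the term clause can be sketched with a small Python helper that builds the clause as a dict. The clause shapes follow the examples elsewhere on this page; the helper name term_clause is hypothetical and exists only for illustration.

```python
def term_clause(field, value, case_insensitive=False):
    """Build a term clause, using the extended syntax only when needed."""
    if case_insensitive:
        # Extended syntax: the field maps to an object.
        return {"term": {field: {"value": value, "case_insensitive": True}}}
    # Short syntax: the field maps directly to the exact value.
    return {"term": {field: value}}


exact = term_clause("metadata.status", "discontinued")
relaxed = term_clause("metadata.color", "blue", case_insensitive=True)

print(exact)
print(relaxed)
```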
Array fields
Matching against array fields is a special case for the term clause.
Suppose metadata.color is an array of strings. The following clause will match documents that have blue as one of the elements in the metadata.color array.
For example, both of the following documents will match the query shown above.
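The array behavior can be mimicked locally as follows. This is a sketch of the described rule (a term matches if the value is one of the array elements), not the API's implementation; the function name and the sample documents are hypothetical.

```python
def term_matches(field_value, wanted):
    """Mimic a term match against either a scalar or an array field."""
    if isinstance(field_value, list):
        # Array field: match if any element equals the wanted value.
        return wanted in field_value
    return field_value == wanted


clause = {"term": {"metadata.color": "blue"}}

# Hypothetical documents whose metadata.color is an array of strings.
doc_a = {"metadata": {"color": ["blue", "green"]}}
doc_b = {"metadata": {"color": ["red", "blue", "white"]}}
doc_c = {"metadata": {"color": ["red"]}}

wanted = clause["term"]["metadata.color"]
for doc in (doc_a, doc_b, doc_c):
    print(term_matches(doc["metadata"]["color"], wanted))
```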
Terms
Use the terms clause to match any of several values in a field.
A document is included in the results if it matches any of the values in the array, with the correct capitalization.
For example, Document-1 and Document-2 will match the query above. However, Document-3 will not match due to capitalization differences.
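A possible shape for a terms clause, together with a local check of the capitalization rule, is sketched below. The clause shape (a field mapped to an array of values) is an assumption modeled on the term clause, so confirm it against the API reference before relying on it; the documents and the helper terms_matches are hypothetical.

```python
# Assumed clause shape: the field maps to an array of accepted values.
colors_clause = {"terms": {"metadata.color": ["blue", "red"]}}


def terms_matches(field_value, wanted_values):
    """Case-sensitive membership test, mirroring the rule described above."""
    return field_value in wanted_values


# Hypothetical documents for illustration.
docs = [
    {"metadata": {"color": "blue"}},
    {"metadata": {"color": "red"}},
    {"metadata": {"color": "Blue"}},  # differs only in capitalization
]

wanted_values = colors_clause["terms"]["metadata.color"]
results = [terms_matches(d["metadata"]["color"], wanted_values) for d in docs]
print(results)
```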
Case-insensitive Match for Multiple Values
The terms clause does not support case-insensitive matches. To perform a case-insensitive match for multiple values in a field, use term clauses inside should:
"filter": {
  "should": [
    {
      "term": {
        "metadata.color": {
          "value": "blue",
          "case_insensitive": true
        }
      }
    },
    {
      "term": {
        "metadata.color": {
          "value": "red",
          "case_insensitive": true
        }
      }
    }
  ],
  "minimum_should_match": 1
}
This query will match all three example documents shown above.
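When the list of values is long or dynamic, the same filter can be generated rather than written by hand. The helper below is a hypothetical convenience (its name is not part of the API); it produces exactly the should-based structure shown above.

```python
def case_insensitive_any(field, values):
    """Build a filter that matches any value in `values`, ignoring case."""
    return {
        "should": [
            {"term": {field: {"value": value, "case_insensitive": True}}}
            for value in values
        ],
        # At least one of the term clauses must match (OR semantics).
        "minimum_should_match": 1,
    }


color_filter = case_insensitive_any("metadata.color", ["blue", "red"])
print(color_filter)
```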
Range
Use the range clause to match a range of values in a field.
You can use the following operators:
- gt: Greater than
- lt: Less than
- gte: Greater than or equal to
- lte: Less than or equal to
The following field types can be used in the range clause:
- integer
- float
- date
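A range clause can be built with a small helper that rejects operators other than the four listed above. The clause shape follows the price example earlier on this page; the helper name range_clause is hypothetical.

```python
ALLOWED_OPERATORS = {"gt", "gte", "lt", "lte"}


def range_clause(field, **bounds):
    """Build a range clause such as range_clause('metadata.price', gte=50)."""
    if not bounds:
        raise ValueError("at least one of gt, gte, lt, lte is required")
    unknown = set(bounds) - ALLOWED_OPERATORS
    if unknown:
        raise ValueError(f"unsupported operators: {sorted(unknown)}")
    return {"range": {field: dict(bounds)}}


price_between = range_clause("metadata.price", gte=50, lte=100)
longer_than = range_clause("metadata.length", gt=40)

print(price_between)
print(longer_than)
```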
Date Values
When using the range clause (as explained above) for a date field, you can use the following date formats and values.
Date Formats and Values
Format:
- You can specify the desired date format using the format parameter in the API request
  - If format is not specified, the default format is ISO 8601 UTC datetime. E.g., 2025-01-12T20:48:57.845Z
  - If format is specified, please make sure any absolute date values in your API request match the specified format
- How to specify format:
  - yyyy - Year (e.g., 2025)
  - MM - Month (01-12)
  - dd - Day of month (01-31)
  - HH - Hour in 24h format (00-23)
  - mm - Minutes (00-59)
  - ss - Seconds (00-59)
  - Common combinations:
    - yyyy-MM-dd - Date only (e.g., 2025-01-12)
    - yyyy-MM-dd'T'HH:mm:ss'Z' - Full UTC datetime
    - yyyy-MM-dd'T'HH:mm:ssXXX - Datetime with timezone offset (e.g., yyyy-MM-dd'T'HH:mm:ss-06:00)
Date Values:
- You can use absolute date values in ISO 8601 format:
  - Full datetime: 2024-11-01T12:34:56Z
  - Date only: 2024-11-01
  - With timezone offset: 2024-11-01T12:34:56-07:00
- You can use relative date values like now, now-1d, now-1M, etc. Supported date math units for relative date values:
  - y (years)
  - M (months)
  - w (weeks)
  - d (days)
  - h (hours)
  - m (minutes)
  - s (seconds)
- Supported date math operators:
  - + (add)
  - - (subtract)
  - / (round down)
- Examples of round down operator:
  - now-1d/d (yesterday at 00:00:00 - rounds down to start of day)
  - now/d (today at 00:00:00)
  - now+1y/y (start of next year)
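The sketch below shows a range clause that uses relative date values, followed by a local computation of what the round-down examples resolve to. It assumes created_at is a date field and that rounding happens in UTC; the fixed "now" is illustrative, and the datetime arithmetic only mirrors the described behavior rather than calling the API.

```python
from datetime import datetime, timedelta, timezone

# Documents created during the previous seven full days (today excluded).
last_seven_days = {
    "range": {"created_at": {"gte": "now-7d/d", "lt": "now/d"}}
}


def start_of_day(moment):
    """Round a datetime down to 00:00:00, like the /d operator."""
    return moment.replace(hour=0, minute=0, second=0, microsecond=0)


# Fixed, illustrative value for "now" so the results are reproducible.
now = datetime(2025, 1, 12, 20, 48, 57, tzinfo=timezone.utc)

yesterday_start = start_of_day(now - timedelta(days=1))          # now-1d/d
today_start = start_of_day(now)                                   # now/d
next_year_start = datetime(now.year + 1, 1, 1, tzinfo=timezone.utc)  # now+1y/y

print(yesterday_start.isoformat())
print(today_start.isoformat())
print(next_year_start.isoformat())
```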
Exists
Use the exists clause to match documents that contain metadata or a specific metadata field.
A document is included in the results if it contains the specified field.
A document contains the specified field if a value has ever been set for that field in that document.
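One way an exists clause might be used is sketched below, including the inverse case (documents that lack a field) via must_not. The clause shape with a "field" key is an assumption, not confirmed by this page, so check it against the API reference; the helper name exists_clause is hypothetical.

```python
def exists_clause(field):
    """Build an exists clause (assumed shape: {"exists": {"field": ...}})."""
    return {"exists": {"field": field}}


# Documents that have a value set for metadata.color.
has_color = {"must": [exists_clause("metadata.color")]}

# Documents that have never had metadata.color set.
missing_color = {"must_not": [exists_clause("metadata.color")]}

print(has_color)
print(missing_color)
```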
Geo clauses
You can use the following clauses with geo_point fields.
Geo-distance
Use the geo_distance clause to match documents whose geo-points are within a specified distance of a given geo-point.
For example, a geo_distance clause with a distance of 10 miles finds all stores located within 10 miles of the specified lat/lon.
Supported Distance Units:
- mi (miles)
- nmi (nautical miles)
- km (kilometers)
- m (meters)
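A geo_distance clause for the "stores within 10 miles" case might be assembled as in the sketch below. The clause layout (a distance string plus a field mapped to lat and lon) is an assumption that this page does not confirm, and metadata.location is a hypothetical geo_point field; only the unit list and the lat/lon requirement come from the text above.

```python
SUPPORTED_UNITS = ("mi", "nmi", "km", "m")


def geo_distance_clause(field, lat, lon, distance, unit):
    """Build a geo_distance clause (assumed shape, see the note above)."""
    if unit not in SUPPORTED_UNITS:
        raise ValueError(f"unsupported distance unit: {unit}")
    return {
        "geo_distance": {
            "distance": f"{distance}{unit}",
            field: {"lat": lat, "lon": lon},
        }
    }


# Stores within 10 miles of an illustrative point.
nearby_stores = {
    "must": [geo_distance_clause("metadata.location", 40.7128, -74.0060, 10, "mi")]
}
print(nearby_stores)
```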
Sorting by distance
A geo_distance clause only finds stores located within the given distance of the specified lat/lon; it does not order them.
To display the stores closest to the user first, you also have to sort by geo-distance.
Nested filter criteria
Bool
Use the bool clause to nest your filter criteria.
Each bool clause can contain the top-level keys (must, must_not, should, minimum_should_match), allowing you to nest your filter criteria to as many levels as needed.
This enables precise control over your filter criteria, using a structured and layered approach.
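As an illustration of nesting, the sketch below expresses "on sale AND (blue OR (red AND price at most 50))" with two levels of bool. It assumes a bool clause wraps the same top-level keys directly, as described above; the field names and values are illustrative.

```python
nested_filter = {
    "must": [
        {"term": {"metadata.is_sale": True}},
        {
            "bool": {
                # OR across the two alternatives below.
                "should": [
                    {"term": {"metadata.color": "blue"}},
                    {
                        "bool": {
                            # AND within the second alternative.
                            "must": [
                                {"term": {"metadata.color": "red"}},
                                {"range": {"metadata.price": {"lte": 50}}},
                            ]
                        }
                    },
                ],
                "minimum_should_match": 1,
            }
        },
    ]
}

inner_or = nested_filter["must"][1]["bool"]
inner_and = inner_or["should"][1]["bool"]
print(len(inner_or["should"]), len(inner_and["must"]))
```

Setting minimum_should_match explicitly on the inner bool keeps the OR behavior obvious to readers, even where it would match the default.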
Common Pitfalls
- Always use dot notation for metadata fields (e.g., metadata.brand not just brand)
- For date ranges:
  - Ensure dates are properly formatted when using absolute dates (ISO 8601 format by default)
  - When using custom date formats, specify the format explicitly using the format parameter
  - Be careful with timezone handling - UTC is the default
- For term filters:
  - Term matches are case-sensitive by default
  - Use the extended syntax with case_insensitive: true if you need case-insensitive matching
- For geo-distance:
  - Ensure the field is of type geo_point in your metadata schema
  - Always specify both lat and lon values
  - Use appropriate distance units (e.g., km, mi)
- When using bool queries:
  - Avoid deeply nested boolean conditions as they can impact performance
  - Use must_not instead of negating individual conditions
- General tips:
  - Test your filters with a small dataset first
  - Use the most specific filter type for your use case
  - Consider the performance impact of complex filters on large datasets