Generate Answer (RAG)¶
Provide an AI-generated answer to the user's query based on your documents, using Retrieval-Augmented Generation (RAG). Supports conversation context, so you can quickly build context-aware AI chatbots/assistants, question-answering systems, and more.
Endpoint¶
API Request¶
Required Parameters
query string required
Query text for the search.
MinLength: 1
MaxLength: 1000
Optional Parameters
ai_search_cutoff_score integer
Only documents with a score above this threshold will be considered. Default is 50.
Info
Use higher values (closer to 90) to consider only the most relevant documents. Use lower values to include more documents, even if they are less relevant.
Min: 0
Max: 90
document_id array of strings
Array specifying one or more document IDs.
Info
- Only documents with an id matching one of the values in the array will be considered.
- Example (single document ID):
- Example (multiple document IDs):
For each document ID string:
MinLength: 1
MaxLength: 128
For the array:
MinLength: 1
MaxLength: 10
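As a minimal sketch of how such request bodies might look (the IDs `doc_abc123` and `doc_def456` are hypothetical placeholders, not real document IDs):

```python
# Hypothetical request bodies scoping the search to specific documents.
payload_single = {
    "query": "do alpacas actually hum?",
    "document_id": ["doc_abc123"],  # placeholder ID
}

payload_multiple = {
    "query": "do alpacas actually hum?",
    "document_id": ["doc_abc123", "doc_def456"],  # placeholder IDs
}

# Sanity checks mirroring the documented limits (1-10 IDs, each 1-128 chars).
assert 1 <= len(payload_multiple["document_id"]) <= 10
assert all(1 <= len(i) <= 128 for i in payload_multiple["document_id"])
```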
filter object
JSON object specifying the filter criteria.
greeting_response string
Custom response to return when query is detected as a greeting (e.g., "hello", "hi") rather than a question.
Info
- If not specified, a default greeting response along the lines of "Hello! How can I help you?" will be returned.
- Example:
MinLength: 1
MaxLength: 1000
language string (enum)
Language code (of a supported language) that indicates the language of the documents to search. Default is en.
Info
- Ignored if multilingual_search is true.
max_messages integer
Maximum number of previous messages used as conversation context for the search. Default is 6.
Info
- If previous_messages contains more messages than the specified max_messages, only the most recent messages up to this limit will be used, omitting older messages.
- Higher values provide more context but may adversely impact search performance and relevance.
Min: 1
Max: 10
max_output_tokens integer
Specifies the maximum number of tokens the model (LLM) can generate in its output, which sets an upper limit on the length of the generated answer. Default is 256.
Info
- Lower values reduce LLM output token usage and response time, but may result in less informative answers, or answers that are cut off.
Min: 128
Max: 512
multilingual_search boolean
Whether to search documents across multiple languages. Default is false.
Info
If set to true:
- Gainly will search documents across all languages.
- The answer will be automatically translated to match the language of the query.
- The language parameter will be ignored.
previous_messages array of objects
Previous messages in the conversation context.
Info
- Don't include this parameter in the initial API request.
- Set this parameter to the messages value from the previous API response to maintain the conversation context.
retrieval_limit integer
Limit on the maximum number of relevant passages retrieved. Default is 10.
Info
- Lower values will use fewer LLM input tokens but may result in a less relevant answer.
- For most use cases, we recommend a value between 5 and 15.
Min: 1
Max: 20
temperature float
Controls the randomness of the answer generated by the language models (LLMs). Default is 0.5.
Info
Lower values lead to a more predictable and conservative wording of the answer. Higher values lead to a more creative (and potentially less coherent) wording.
Min: 0
Max: 1
tenant_id array of strings
Array specifying one or more Tenant IDs.
Info
- Only documents with a tenant_id matching one of the values in the array will be considered.
- Example (single tenant ID):
- Example (multiple tenant IDs):
For each tenant ID string:
MinLength: 1
MaxLength: 250
For the array:
MinLength: 1
MaxLength: 10
POST /v20241104/generate-answer

```bash
curl -X POST "https://api.gainly.ai/v20241104/generate-answer" \
  -H "Content-Type: application/json" \
  -H "X-API-Key: YOUR_API_KEY_HERE" \ # (1)!
  -d '{
        "query": "do alpacas actually hum?"
      }'
```
- Replace YOUR_API_KEY_HERE with the value of your API key.
```text
# Prompt for AI coding assistants/IDEs (e.g., ChatGPT, Claude, GitHub Copilot, Cursor, Windsurf)

Using the Gainly API:

1. Write code to call the generate_answer operation (see OpenAPI spec: https://api.gainly.ai/v20241104/openapi.json)
2. Implement authentication using the header "X-API-Key" as described in the docs: https://docs.gainly.ai/latest/api-reference/authentication/
3. Implement rate limit handling as described in the docs: https://docs.gainly.ai/latest/api-reference/rate-limits/
4. Implement error handling
5. Handle the response according to the GenerateAnswerResults schema in the OpenAPI spec
6. Handle chat history using previous_messages parameter for multi-turn conversations as described in the docs: https://docs.gainly.ai/latest/docs/conversation-context/
```
```csharp
using System.Net.Http;
using System.Text.Json;
using System.Text;

var client = new HttpClient();
var url = "https://api.gainly.ai/v20241104/generate-answer";

var payload = new { query = "do alpacas actually hum?" };
var content = new StringContent(
    JsonSerializer.Serialize(payload),
    Encoding.UTF8,
    "application/json"
);

client.DefaultRequestHeaders.Add("X-API-Key", "YOUR_API_KEY_HERE"); // (1)!

var response = await client.PostAsync(url, content);
var result = await response.Content.ReadAsStringAsync();
Console.WriteLine(result);
```
- Replace YOUR_API_KEY_HERE with the value of your API key.
```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

func main() {
	url := "https://api.gainly.ai/v20241104/generate-answer"
	payload := map[string]interface{}{
		"query": "do alpacas actually hum?",
	}
	jsonData, _ := json.Marshal(payload)

	req, _ := http.NewRequest("POST", url, bytes.NewBuffer(jsonData))
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("X-API-Key", "YOUR_API_KEY_HERE") // (1)!

	resp, _ := http.DefaultClient.Do(req)
	defer resp.Body.Close()

	var result map[string]interface{}
	json.NewDecoder(resp.Body).Decode(&result)
	fmt.Println(result)
}
```
- Replace YOUR_API_KEY_HERE with the value of your API key.
```java
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.net.URI;

var client = HttpClient.newHttpClient();
var url = "https://api.gainly.ai/v20241104/generate-answer";

var payload = """
    {
        "query": "do alpacas actually hum?"
    }
    """;

var request = HttpRequest.newBuilder()
    .uri(URI.create(url))
    .header("Content-Type", "application/json")
    .header("X-API-Key", "YOUR_API_KEY_HERE") // (1)!
    .POST(HttpRequest.BodyPublishers.ofString(payload))
    .build();

var response = client.send(request, HttpResponse.BodyHandlers.ofString());
System.out.println(response.body());
```
- Replace YOUR_API_KEY_HERE with the value of your API key.
```javascript
const axios = require('axios'); // or: import axios from 'axios';

const url = 'https://api.gainly.ai/v20241104/generate-answer';
const payload = { query: 'do alpacas actually hum?' };
const headers = {
  'Content-Type': 'application/json',
  'X-API-Key': 'YOUR_API_KEY_HERE' // (1)!
};

axios.post(url, payload, { headers })
  .then(response => console.log(response.data))
  .catch(error => console.error('Error:', error.message));
```
- Replace YOUR_API_KEY_HERE with the value of your API key.
```php
<?php
$client = new \GuzzleHttp\Client();
$url = 'https://api.gainly.ai/v20241104/generate-answer';

$payload = [
    'query' => 'do alpacas actually hum?'
];

$response = $client->request('POST', $url, [
    'json' => $payload,
    'headers' => [
        'Content-Type' => 'application/json',
        'X-API-Key' => 'YOUR_API_KEY_HERE' # (1)!
    ],
]);

echo $response->getBody();
```
- Replace YOUR_API_KEY_HERE with the value of your API key.
```python
import requests

url = "https://api.gainly.ai/v20241104/generate-answer"
payload = {
    "query": "do alpacas actually hum?"
}
headers = {
    "Content-Type": "application/json",
    "X-API-Key": "YOUR_API_KEY_HERE"  # (1)!
}

response = requests.post(url, json=payload, headers=headers)
data = response.json()
print(data)
```
- Replace YOUR_API_KEY_HERE with the value of your API key.
```ruby
require 'json'
require 'uri'
require 'net/http'
require 'openssl'

url = URI('https://api.gainly.ai/v20241104/generate-answer')

http = Net::HTTP.new(url.host, url.port)
http.use_ssl = true

request = Net::HTTP::Post.new(url)
request['Content-Type'] = 'application/json'
request['X-API-Key'] = 'YOUR_API_KEY_HERE' # (1)!
request.body = { query: 'do alpacas actually hum?' }.to_json

response = http.request(request)
puts response.read_body
```
- Replace YOUR_API_KEY_HERE with the value of your API key.
API Response¶
{
"object": "generate_answer_result",
"url": "/v20241104/generate-answer",
"query": "do alpacas actually hum?",
"data": [
{
"answer": "Yes, alpacas do hum. The sources state that \"Alpacas' social behavior also includes various forms of communication, such as humming, which they use to express curiosity, contentment, or stress.\" This indicates that alpacas do, in fact, hum as a means of communication.",
"sources": [
{
"id": "B3AegpIB7caKVIeL78nJ",
"title": "Social Animals",
"source_uri": "/doc/7-social-animals",
"confidence_level": "high",
"metadata": null,
"tenant_id": "tenant224",
"language": "en",
"created_at": "2023-09-21T06:37:19.223796Z",
"updated_at": "2023-09-21T06:37:19.223802Z"
},
{
"id": "7nD4gZIB7caKVIeLtsiu",
"title": "Communication",
"source_uri": "/doc/15-communication",
"confidence_level": "medium",
"metadata": null,
"tenant_id": "tenant224",
"language": "en",
"created_at": "2023-10-21T06:37:19.223796Z",
"updated_at": "2023-10-21T06:37:19.223802Z"
}
],
"confidence_level": "high",
"stop_reason": "end_turn"
}
],
"messages": [
{
"role": "user", // user is asking the question
"content": [
{
"text": "do alpacas actually hum?"
}
]
},
{
"role": "assistant", // Gainly AI "assistant" is answering the question
"content": [
{
"text": "Yes, alpacas do hum. The sources state that \"Alpacas' social behavior also includes various forms of communication, such as humming, which they use to express curiosity, contentment, or stress.\" This indicates that alpacas do, in fact, hum as a means of communication."
}
]
}
],
"document_id": null,
"tenant_id": null,
"filter": null,
"language": "en",
"multilingual_search": false,
"retrieval_limit": 10,
"max_output_tokens": 512,
"temperature": 0.5,
"ai_search_cutoff_score": 35,
"total_number_of_results": 1,
"greeting_response": "Hello! How can I help you?",
"model": "model_2",
"token_usage": {
"semantic_tokens": 8, // determined by the size of 'query' field
"llm_tokens": {
"llm_input_tokens": 1386, // determined by the amount of text the LLM has to read
"llm_output_tokens": 79, // determined by the amount of text the LLM has to write
"model": "model_2"
}
},
"livemode": false
}
If the AI is unable to generate an answer for the query based on your documents, the API response format will be identical to the one above but with a total_number_of_results value of 0.
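A minimal sketch of handling that zero-result case on the client side (the helper name and fallback message below are our own, not part of the API):

```python
def extract_answer(response_json,
                   fallback="Sorry, I couldn't find an answer in your documents."):
    """Return the generated answer, or a fallback when the API produced none."""
    if response_json.get("total_number_of_results", 0) == 0:
        return fallback
    return response_json["data"][0]["answer"]

# No answer found: the API returns total_number_of_results == 0.
no_answer = {"total_number_of_results": 0, "data": []}
print(extract_answer(no_answer))
```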
Answer¶
answer
represents the AI-generated answer for the user's query.
Sources¶
sources
are the sources (your documents) from which the AI generated the answer.
Confidence Level¶
confidence_level
represents Gainly's assessment of how relevant a RAG AI-generated answer is to the query.
It will have one of the following values:
- very_high
- high
- medium
- low
- not_available
Stop Reason¶
stop_reason
indicates why the model stopped generating tokens while responding to the user's query.
- end_turn: The model successfully completed generating the answer.
- max_tokens: The model ran out of tokens, suggesting that you may need to do one or more of the following:
  - Increase the max_output_tokens value in your API request.
  - Reduce the retrieval_limit value in your API request.
- not_available: Reason is not available (rare cases).
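The max_tokens case can be detected and handled programmatically. A sketch under our own assumptions (the helper name and the doubling strategy are illustrative; the 512 ceiling comes from the documented max_output_tokens limit):

```python
def next_max_output_tokens(response_json, current_max, ceiling=512):
    """Suggest a larger max_output_tokens for a retry when the answer was cut
    off (stop_reason == "max_tokens"). Returns None when no retry is needed
    or the documented ceiling has already been reached."""
    stop_reason = response_json["data"][0]["stop_reason"]
    if stop_reason != "max_tokens" or current_max >= ceiling:
        return None
    return min(current_max * 2, ceiling)
```

Reducing retrieval_limit on retry, as suggested above, is an alternative when raising max_output_tokens alone is not enough.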
Messages¶
messages
array contains both the user's queries and the AI ("assistant") responses, ordered from oldest to newest, and is used to maintain conversation context.
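To maintain conversation context, pass the messages array from each response back as previous_messages in the next request. A minimal sketch (the helper name is our own):

```python
def build_followup_request(new_query, previous_response):
    """Build the next request body for a multi-turn conversation by passing
    the previous response's `messages` array as `previous_messages`."""
    return {
        "query": new_query,
        "previous_messages": previous_response.get("messages", []),
    }

# Using the messages array from the sample response above.
first_response = {
    "messages": [
        {"role": "user", "content": [{"text": "do alpacas actually hum?"}]},
        {"role": "assistant", "content": [{"text": "Yes, alpacas do hum."}]},
    ]
}
followup = build_followup_request("why do they hum?", first_response)
```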
Token Usage¶
token_usage
indicates the number of tokens used to process the query and generate an answer.
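For logging or usage metering, the nested token_usage object can be flattened into a single record. A sketch (the helper name and the derived llm_total_tokens field are our own, not fields the API returns):

```python
def summarize_token_usage(response_json):
    """Flatten the token_usage object from an API response for logging."""
    usage = response_json["token_usage"]
    llm = usage["llm_tokens"]
    return {
        "semantic_tokens": usage["semantic_tokens"],
        "llm_input_tokens": llm["llm_input_tokens"],
        "llm_output_tokens": llm["llm_output_tokens"],
        # Derived convenience field: total LLM tokens consumed.
        "llm_total_tokens": llm["llm_input_tokens"] + llm["llm_output_tokens"],
        "model": llm["model"],
    }

# With the sample response above: 1386 input + 79 output = 1465 LLM tokens.
sample = {
    "token_usage": {
        "semantic_tokens": 8,
        "llm_tokens": {"llm_input_tokens": 1386, "llm_output_tokens": 79,
                       "model": "model_2"},
    }
}
summary = summarize_token_usage(sample)
```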