Generate Answer (RAG)¶
Provide an AI-generated answer to the user's query based on your documents, using Retrieval-Augmented Generation (RAG). Supports conversation context, so you can quickly build context-aware AI chatbots/assistants, question-answering systems, and more.
Endpoint¶
API Request¶
Required Parameters
query string required
Query text for the search.
MinLength: 1
MaxLength: 1000
Optional Parameters
ai_search_cutoff_score integer
Only documents with a score above this threshold will be considered. Default is 50.
Info
Use higher values (closer to 90) to consider only the most relevant documents. Use lower values to include more documents, even if they are less relevant.
Min: 0
Max: 90
document_id array of strings
Array specifying one or more document IDs.
Info
- Only documents with an id matching one of the values in the array will be considered.
- Example (single document ID):
- Example (multiple document IDs):
For each document ID string:
MinLength: 1
MaxLength: 128
For the array:
MinLength: 1
MaxLength: 10
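As a minimal sketch of how such request bodies might look (the IDs `doc_abc123` and `doc_def456` are hypothetical placeholders, not real document IDs):

```python
# Hypothetical request bodies scoping the search to specific documents.
payload_single = {
    "query": "do alpacas actually hum?",
    "document_id": ["doc_abc123"],  # placeholder ID
}

payload_multiple = {
    "query": "do alpacas actually hum?",
    "document_id": ["doc_abc123", "doc_def456"],  # placeholder IDs
}

# Sanity checks mirroring the documented limits (1-10 IDs, each 1-128 chars).
assert 1 <= len(payload_multiple["document_id"]) <= 10
assert all(1 <= len(i) <= 128 for i in payload_multiple["document_id"])
```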
filter object
JSON object specifying the filter criteria.
greeting_response string
Custom response to return when query is detected as a greeting (e.g., "hello", "hi") rather than a question.
Info
- If not specified, a default greeting response along the lines of "Hello! How can I help you?" will be returned.
- Example:
MinLength: 1
MaxLength: 1000
language string (enum)
Language code (of a supported language) that indicates the language of the documents to search. Default is en.
Info
- Ignored if multilingual_search is true.
max_messages integer
Maximum number of previous messages used as conversation context for the search. Default is 6.
Info
- If previous_messages contains more messages than the specified max_messages, only the most recent messages up to this limit will be used, omitting older messages.
- Higher values provide more context but may adversely impact search performance and relevance.
Min: 1
Max: 10
max_output_tokens integer
Specifies the maximum number of tokens the model (LLM) can generate in its output, which sets an upper limit on the length of the generated answer. Default is 256.
Info
- Lower values reduce LLM output token usage and response time, but may result in less informative answers, or answers that are cut off.
Min: 128
Max: 512
multilingual_search boolean
Whether to search documents across multiple languages. Default is false.
Info
If set to true:
- Gainly will search documents across all languages.
- The answer will be automatically translated to match the language of the query.
- The language parameter will be ignored.
previous_messages array of objects
Previous messages in the conversation context.
Info
- Don't include this parameter in the initial API request.
- Set this parameter to the messages value from the previous API response to maintain the conversation context.
retrieval_limit integer
Limit on the maximum number of relevant passages retrieved. Default is 10.
Info
- Lower values will use fewer LLM input tokens but may result in a less relevant answer.
- For most use cases, we recommend a value between 5 and 15.
Min: 1
Max: 20
temperature float
Controls the randomness of the answer generated by the language models (LLMs). Default is 0.5.
Info
Lower values lead to a more predictable and conservative wording of the answer. Higher values lead to a more creative (and potentially less coherent) wording.
Min: 0
Max: 1
tenant_id array of strings
Array specifying one or more Tenant IDs.
Info
- Only documents with a tenant_id matching one of the values in the array will be considered.
- Example (single tenant ID):
- Example (multiple tenant IDs):
For each tenant ID string:
MinLength: 1
MaxLength: 250
For the array:
MinLength: 1
MaxLength: 10
POST /v20241104/generate-answer

```bash
curl -X POST "https://api.gainly.ai/v20241104/generate-answer" \
  -H "Content-Type: application/json" \
  -H "X-API-Key: YOUR_API_KEY_HERE" \ # (1)!
  -d '{
        "query": "do alpacas actually hum?"
      }'
```
- Replace YOUR_API_KEY_HERE with the value of your API key.
```text
# Prompt for AI coding assistants/IDEs (e.g., ChatGPT, Claude, GitHub Copilot, Cursor, Windsurf)

Using the Gainly API:

1. Write code to call the generate_answer operation (see OpenAPI spec: https://api.gainly.ai/v20241104/openapi.json)
2. Implement authentication using the header "X-API-Key" as described in the docs: https://docs.gainly.ai/latest/api-reference/authentication/
3. Implement rate limit handling as described in the docs: https://docs.gainly.ai/latest/api-reference/rate-limits/
4. Implement error handling
5. Handle the response according to the GenerateAnswerResults schema in the OpenAPI spec
6. Handle chat history using previous_messages parameter for multi-turn conversations as described in the docs: https://docs.gainly.ai/latest/docs/conversation-context/
```
```csharp
using System.Net.Http;
using System.Text.Json;
using System.Text;

var client = new HttpClient();
var url = "https://api.gainly.ai/v20241104/generate-answer";

var payload = new { query = "do alpacas actually hum?" };
var content = new StringContent(
    JsonSerializer.Serialize(payload),
    Encoding.UTF8,
    "application/json"
);

client.DefaultRequestHeaders.Add("X-API-Key", "YOUR_API_KEY_HERE"); // (1)!

var response = await client.PostAsync(url, content);
var result = await response.Content.ReadAsStringAsync();
Console.WriteLine(result);
```
- Replace YOUR_API_KEY_HERE with the value of your API key.
```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

func main() {
	url := "https://api.gainly.ai/v20241104/generate-answer"
	payload := map[string]interface{}{
		"query": "do alpacas actually hum?",
	}
	jsonData, _ := json.Marshal(payload)

	req, _ := http.NewRequest("POST", url, bytes.NewBuffer(jsonData))
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("X-API-Key", "YOUR_API_KEY_HERE") // (1)!

	resp, _ := http.DefaultClient.Do(req)
	defer resp.Body.Close()

	var result map[string]interface{}
	json.NewDecoder(resp.Body).Decode(&result)
	fmt.Println(result)
}
```
- Replace YOUR_API_KEY_HERE with the value of your API key.
```java
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.net.URI;

var client = HttpClient.newHttpClient();
var url = "https://api.gainly.ai/v20241104/generate-answer";

var payload = """
    {
        "query": "do alpacas actually hum?"
    }
    """;

var request = HttpRequest.newBuilder()
    .uri(URI.create(url))
    .header("Content-Type", "application/json")
    .header("X-API-Key", "YOUR_API_KEY_HERE") // (1)!
    .POST(HttpRequest.BodyPublishers.ofString(payload))
    .build();

var response = client.send(request, HttpResponse.BodyHandlers.ofString());
System.out.println(response.body());
```
- Replace YOUR_API_KEY_HERE with the value of your API key.
```javascript
const axios = require('axios'); // or: import axios from 'axios';

const url = 'https://api.gainly.ai/v20241104/generate-answer';
const payload = { query: 'do alpacas actually hum?' };
const headers = {
  'Content-Type': 'application/json',
  'X-API-Key': 'YOUR_API_KEY_HERE' // (1)!
};

axios.post(url, payload, { headers })
  .then(response => console.log(response.data))
  .catch(error => console.error('Error:', error.message));
```
- Replace YOUR_API_KEY_HERE with the value of your API key.
```php
<?php
$client = new \GuzzleHttp\Client();
$url = 'https://api.gainly.ai/v20241104/generate-answer';

$payload = [
    'query' => 'do alpacas actually hum?'
];

$response = $client->request('POST', $url, [
    'json' => $payload,
    'headers' => [
        'Content-Type' => 'application/json',
        'X-API-Key' => 'YOUR_API_KEY_HERE' # (1)!
    ],
]);

echo $response->getBody();
```
- Replace YOUR_API_KEY_HERE with the value of your API key.
```python
import requests

url = "https://api.gainly.ai/v20241104/generate-answer"
payload = {
    "query": "do alpacas actually hum?"
}
headers = {
    "Content-Type": "application/json",
    "X-API-Key": "YOUR_API_KEY_HERE"  # (1)!
}

response = requests.post(url, json=payload, headers=headers)
data = response.json()
print(data)
```
- Replace YOUR_API_KEY_HERE with the value of your API key.
```ruby
require 'json'
require 'uri'
require 'net/http'
require 'openssl'

url = URI('https://api.gainly.ai/v20241104/generate-answer')

http = Net::HTTP.new(url.host, url.port)
http.use_ssl = true

request = Net::HTTP::Post.new(url)
request['Content-Type'] = 'application/json'
request['X-API-Key'] = 'YOUR_API_KEY_HERE' # (1)!
request.body = { query: 'do alpacas actually hum?' }.to_json

response = http.request(request)
puts response.read_body
```
- Replace YOUR_API_KEY_HERE with the value of your API key.
API Response¶
{
"object": "generate_answer_result",
"url": "/v20241104/generate-answer",
"query": "do alpacas actually hum?",
"data": [
{
"answer": "Yes, alpacas do hum. The sources state that \"Alpacas' social behavior also includes various forms of communication, such as humming, which they use to express curiosity, contentment, or stress.\" This indicates that alpacas do, in fact, hum as a means of communication.",
"sources": [
{
"id": "B3AegpIB7caKVIeL78nJ",
"title": "Social Animals",
"source_uri": "/doc/7-social-animals",
"confidence_level": "high",
"metadata": null,
"tenant_id": "tenant224",
"language": "en",
"created_at": "2023-09-21T06:37:19.223796Z",
"updated_at": "2023-09-21T06:37:19.223802Z"
},
{
"id": "7nD4gZIB7caKVIeLtsiu",
"title": "Communication",
"source_uri": "/doc/15-communication",
"confidence_level": "medium",
"metadata": null,
"tenant_id": "tenant224",
"language": "en",
"created_at": "2023-10-21T06:37:19.223796Z",
"updated_at": "2023-10-21T06:37:19.223802Z"
}
],
"confidence_level": "high",
"stop_reason": "end_turn"
}
],
"messages": [
{
"role": "user", // user is asking the question
"content": [
{
"text": "do alpacas actually hum?"
}
]
},
{
"role": "assistant", // Gainly AI "assistant" is answering the question
"content": [
{
"text": "Yes, alpacas do hum. The sources state that \"Alpacas' social behavior also includes various forms of communication, such as humming, which they use to express curiosity, contentment, or stress.\" This indicates that alpacas do, in fact, hum as a means of communication."
}
]
}
],
"document_id": null,
"tenant_id": null,
"filter": null,
"language": "en",
"multilingual_search": false,
"retrieval_limit": 10,
"max_output_tokens": 512,
"temperature": 0.5,
"ai_search_cutoff_score": 35,
"total_number_of_results": 1,
"greeting_response": "Hello! How can I help you?",
"model": "model_2",
"token_usage": {
"semantic_tokens": 8, // determined by the size of 'query' field
"llm_tokens": {
"llm_input_tokens": 1386, // determined by the amount of text the LLM has to read
"llm_output_tokens": 79, // determined by the amount of text the LLM has to write
"model": "model_2"
}
},
"livemode": false
}
If the AI is unable to generate an answer for the query based on your documents, the API response format will be identical to the one above but with a total_number_of_results value of 0.
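A minimal sketch of handling that zero-result case on the client side (the helper name and fallback message below are our own, not part of the API):

```python
def extract_answer(response_json,
                   fallback="Sorry, I couldn't find an answer in your documents."):
    """Return the generated answer, or a fallback when the API produced none."""
    if response_json.get("total_number_of_results", 0) == 0:
        return fallback
    return response_json["data"][0]["answer"]

# No answer found: the API returns total_number_of_results == 0.
no_answer = {"total_number_of_results": 0, "data": []}
print(extract_answer(no_answer))
```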
Answer¶
answer
represents the AI-generated answer for the user's query.
Sources¶
sources
are the sources (your documents) from which the AI generated the answer.
Confidence Level¶
confidence_level
represents Gainly's assessment of how relevant a RAG AI-generated answer is to the query.
It will have one of the following values:
- very_high
- high
- medium
- low
- not_available
Stop Reason¶
stop_reason
indicates why the model stopped generating tokens while responding to the user's query.
- end_turn: The model successfully completed generating the answer.
- max_tokens: The model ran out of tokens, suggesting that you may need to do one or more of the following:
  - Increase the max_output_tokens value in your API request.
  - Reduce the retrieval_limit value in your API request.
- not_available: Reason is not available (rare cases).
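The max_tokens case can be detected and handled programmatically. A sketch under our own assumptions (the helper name and the doubling strategy are illustrative; the 512 ceiling comes from the documented max_output_tokens limit):

```python
def next_max_output_tokens(response_json, current_max, ceiling=512):
    """Suggest a larger max_output_tokens for a retry when the answer was cut
    off (stop_reason == "max_tokens"). Returns None when no retry is needed
    or the documented ceiling has already been reached."""
    stop_reason = response_json["data"][0]["stop_reason"]
    if stop_reason != "max_tokens" or current_max >= ceiling:
        return None
    return min(current_max * 2, ceiling)
```

Reducing retrieval_limit on retry, as suggested above, is an alternative when raising max_output_tokens alone is not enough.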
Messages¶
messages
array contains both the user's queries and the AI ("assistant") responses, ordered from oldest to newest, and is used to maintain conversation context.
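To maintain conversation context, pass the messages array from each response back as previous_messages in the next request. A minimal sketch (the helper name is our own):

```python
def build_followup_request(new_query, previous_response):
    """Build the next request body for a multi-turn conversation by passing
    the previous response's `messages` array as `previous_messages`."""
    return {
        "query": new_query,
        "previous_messages": previous_response.get("messages", []),
    }

# Using the messages array from the sample response above.
first_response = {
    "messages": [
        {"role": "user", "content": [{"text": "do alpacas actually hum?"}]},
        {"role": "assistant", "content": [{"text": "Yes, alpacas do hum."}]},
    ]
}
followup = build_followup_request("why do they hum?", first_response)
```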
Token Usage¶
token_usage
indicates the number of tokens used to process the query and generate an answer.
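For logging or usage metering, the nested token_usage object can be flattened into a single record. A sketch (the helper name and the derived llm_total_tokens field are our own, not fields the API returns):

```python
def summarize_token_usage(response_json):
    """Flatten the token_usage object from an API response for logging."""
    usage = response_json["token_usage"]
    llm = usage["llm_tokens"]
    return {
        "semantic_tokens": usage["semantic_tokens"],
        "llm_input_tokens": llm["llm_input_tokens"],
        "llm_output_tokens": llm["llm_output_tokens"],
        # Derived convenience field: total LLM tokens consumed.
        "llm_total_tokens": llm["llm_input_tokens"] + llm["llm_output_tokens"],
        "model": llm["model"],
    }

# With the sample response above: 1386 input + 79 output = 1465 LLM tokens.
sample = {
    "token_usage": {
        "semantic_tokens": 8,
        "llm_tokens": {"llm_input_tokens": 1386, "llm_output_tokens": 79,
                       "model": "model_2"},
    }
}
summary = summarize_token_usage(sample)
```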