RAG Query

Query a collection using natural language. Returns the most relevant text chunks with similarity scores.

POST /v1/query

Request

Headers:

x-api-key: your-api-key
Content-Type: application/json

Body:

Field	Type	Required	Description
`query`	string	Yes	Natural language query text. The backend embeds this and performs a semantic search against the collection.
`collection_name`	string	Yes	Collection to search.
`top_k`	integer	No	Number of results to return. Default: `5`.
`filters`	object	No	Optional metadata filter. Only chunks whose metadata contains the specified key-value pairs are returned.
`include_timing`	boolean	No	When `true`, the response includes a `timing_ms` object with `embedding_ms`, `search_ms`, and `total_ms` breakdowns. Default: `false`.

Example

curl -X POST http://localhost:8000/v1/query \
  -H "x-api-key: your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "How does vector indexing work?",
    "collection_name": "articles",
    "top_k": 5,
    "filters": {"source": "docs"}
  }'
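
The same request can also be built from Python. The following is a minimal sketch using only the standard library; the helper name `build_query_request` and its `base_url` default are illustrative assumptions, not part of the API. It constructs the request without sending it, so the payload can be inspected first.

```python
import json
import urllib.request


def build_query_request(query, collection_name, api_key, top_k=5,
                        filters=None, include_timing=False,
                        base_url="http://localhost:8000"):
    """Build (but do not send) a POST /v1/query request.

    Hypothetical helper: the name and the base_url default are assumptions.
    """
    body = {
        "query": query,
        "collection_name": collection_name,
        "top_k": top_k,
        "include_timing": include_timing,
    }
    # filters is optional, so only include it when the caller provides one.
    if filters is not None:
        body["filters"] = filters
    return urllib.request.Request(
        url=f"{base_url}/v1/query",
        data=json.dumps(body).encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


req = build_query_request(
    "How does vector indexing work?", "articles", "your-api-key",
    filters={"source": "docs"},
)
```

Passing `req` to `urllib.request.urlopen` sends it; the response body can then be decoded with `json.loads`.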

Response

{
  "status": "success",
  "data": {
    "query": "How does vector indexing work?",
    "collection": "articles",
    "results": [
      {
        "text": "Vector indexing uses approximate nearest neighbor algorithms like HNSW to enable fast similarity search over high-dimensional embeddings...",
        "score": 0.9342,
        "metadata": {"source": "docs", "page": 3},
        "external_id": "a1b2c3d4-chunk-0"
      },
      {
        "text": "HNSW builds a hierarchical graph structure where each layer provides increasingly refined proximity information...",
        "score": 0.8917,
        "metadata": {"source": "docs", "page": 5},
        "external_id": "a1b2c3d4-chunk-4"
      }
    ]
  },
  "error": null
}
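
A client will typically walk `data.results` and keep only chunks above some similarity threshold. The sketch below shows one way to do that on an already-decoded response. The function name `chunks_above` is hypothetical, and treating any `status` other than `"success"` as a failure is an assumption, since only the success shape is shown above.

```python
def chunks_above(response, min_score):
    """Return (score, external_id, text) tuples with score >= min_score."""
    # Assumption: a failed query reports a status other than "success".
    if response.get("status") != "success":
        raise ValueError(f"query failed: {response.get('error')}")
    return [
        (r["score"], r["external_id"], r["text"])
        for r in response["data"]["results"]
        if r["score"] >= min_score
    ]


# Sample data copied from the response above, with the text shortened.
sample = {
    "status": "success",
    "data": {
        "query": "How does vector indexing work?",
        "collection": "articles",
        "results": [
            {"text": "Vector indexing uses approximate nearest neighbor algorithms...",
             "score": 0.9342,
             "metadata": {"source": "docs", "page": 3},
             "external_id": "a1b2c3d4-chunk-0"},
            {"text": "HNSW builds a hierarchical graph structure...",
             "score": 0.8917,
             "metadata": {"source": "docs", "page": 5},
             "external_id": "a1b2c3d4-chunk-4"},
        ],
    },
    "error": None,
}

strong_matches = chunks_above(sample, min_score=0.9)
```

With the sample data, only the first chunk (score `0.9342`) passes a threshold of `0.9`.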

With timing:

{
  "status": "success",
  "data": {
    "query": "How does vector indexing work?",
    "collection": "articles",
    "results": [
      {
        "text": "Vector indexing uses approximate nearest neighbor algorithms...",
        "score": 0.9342,
        "metadata": {"source": "docs", "page": 3},
        "external_id": "a1b2c3d4-chunk-0"
      }
    ],
    "timing_ms": {
      "embedding_ms": 11.8,
      "search_ms": 3.2,
      "total_ms": 15.0
    }
  },
  "error": null
}
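
The `timing_ms` breakdown makes it easy to see where latency goes. As a rough sketch, the hypothetical helper below (`embedding_share` is not part of the API) computes the fraction of total time spent embedding the query, and returns `None` when the response was requested without `include_timing`.

```python
def embedding_share(response):
    """Return embedding_ms / total_ms, or None when timing is absent."""
    timing = response.get("data", {}).get("timing_ms")
    # timing_ms is only present when include_timing was true.
    if not timing or not timing.get("total_ms"):
        return None
    return timing["embedding_ms"] / timing["total_ms"]


# Timing values copied from the example response above.
timed = {"data": {"timing_ms": {"embedding_ms": 11.8,
                                "search_ms": 3.2,
                                "total_ms": 15.0}}}
share = embedding_share(timed)
print(f"embedding share: {share:.0%}")
```

In the example response, embedding accounts for roughly four fifths of the total time, with the vector search itself taking the remainder.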

Note

Unlike the `/v1/collections/{name}/search` endpoint, which requires a raw vector, the RAG query endpoint accepts plain text. The backend embeds the query internally before performing the search.

Errors

Code	Reason
`404`	Collection not found
`401`	Missing or invalid API key
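
Client code can map these status codes to a single exception type. The following is a minimal sketch; `QueryError` and `check_status` are hypothetical names, and treating every other status of 400 or above as an unexpected error is an assumption, since only `404` and `401` are documented here.

```python
class QueryError(Exception):
    """Raised when POST /v1/query returns an error status (hypothetical)."""

    def __init__(self, status_code, reason):
        super().__init__(f"{status_code}: {reason}")
        self.status_code = status_code
        self.reason = reason


# Reasons taken from the error table above.
ERROR_REASONS = {
    404: "Collection not found",
    401: "Missing or invalid API key",
}


def check_status(status_code):
    """Raise QueryError for error statuses; return None otherwise."""
    if status_code in ERROR_REASONS:
        raise QueryError(status_code, ERROR_REASONS[status_code])
    # Assumption: any other 4xx/5xx status is an undocumented failure.
    if status_code >= 400:
        raise QueryError(status_code, "Unexpected error")
```

When the request is sent with `urllib.request.urlopen`, an error status surfaces as `urllib.error.HTTPError`, whose `code` attribute can be passed to `check_status`.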
