RAG Query

Query a collection using natural language. Returns the most relevant text chunks with similarity scores.

POST /v1/query

Request

Headers:

x-api-key: your-api-key
Content-Type: application/json

Body:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| query | string | Yes | Natural language query text. The backend embeds this and performs a semantic search against the collection. |
| collection_name | string | Yes | Collection to search. |
| top_k | integer | No | Number of results to return. Default: 5. |
| filters | object | No | Optional metadata filter. Only return chunks whose metadata contains the specified key-value pairs. |
| include_timing | boolean | No | When true, the response includes a timing_ms object with embedding_ms, search_ms, and total_ms breakdowns. Default: false. |
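
The body fields can also be modeled on the client side. The Python sketch below is illustrative only: `QueryBody` and `to_payload` are hypothetical names that are not part of the API, and the non-empty check on the required fields is a client-side assumption. Optional fields that are left unset are omitted from the payload, so the documented server defaults apply.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class QueryBody:
    """Client-side model of the /v1/query request body (hypothetical helper)."""

    query: str
    collection_name: str
    top_k: Optional[int] = None              # server default: 5
    filters: Optional[Dict[str, Any]] = None
    include_timing: Optional[bool] = None    # server default: false

    def to_payload(self) -> Dict[str, Any]:
        # Assumption: reject empty required fields before sending anything.
        if not self.query or not self.collection_name:
            raise ValueError("query and collection_name are required")
        payload: Dict[str, Any] = {
            "query": self.query,
            "collection_name": self.collection_name,
        }
        # Only include optional fields that the caller set explicitly.
        if self.top_k is not None:
            payload["top_k"] = self.top_k
        if self.filters is not None:
            payload["filters"] = self.filters
        if self.include_timing is not None:
            payload["include_timing"] = self.include_timing
        return payload
```

For example, `QueryBody("How does vector indexing work?", "articles").to_payload()` yields a payload with only the two required fields.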

Example

curl -X POST http://localhost:8000/v1/query \
  -H "x-api-key: your-key" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "How does vector indexing work?",
    "collection_name": "articles",
    "top_k": 5,
    "filters": {"source": "docs"}
  }'
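
The same request can be sent from Python. This is a minimal sketch using only the standard library; `build_query_request` and `run_query` are hypothetical helper names, and the default base URL is simply the localhost address from the curl example.

```python
import json
import urllib.request
from typing import Any, Dict


def build_query_request(
    payload: Dict[str, Any],
    api_key: str,
    base_url: str = "http://localhost:8000",  # assumption: same host as the curl example
) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/query request."""
    return urllib.request.Request(
        url=f"{base_url}/v1/query",
        data=json.dumps(payload).encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def run_query(payload: Dict[str, Any], api_key: str,
              base_url: str = "http://localhost:8000") -> Dict[str, Any]:
    """Send the request and decode the JSON response body."""
    request = build_query_request(payload, api_key, base_url)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `run_query({"query": "How does vector indexing work?", "collection_name": "articles"}, "your-key")` would issue the equivalent of the curl command, relying on the default `top_k`.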

Response

{
  "status": "success",
  "data": {
    "query": "How does vector indexing work?",
    "collection": "articles",
    "results": [
      {
        "text": "Vector indexing uses approximate nearest neighbor algorithms like HNSW to enable fast similarity search over high-dimensional embeddings...",
        "score": 0.9342,
        "metadata": {"source": "docs", "page": 3},
        "external_id": "a1b2c3d4-chunk-0"
      },
      {
        "text": "HNSW builds a hierarchical graph structure where each layer provides increasingly refined proximity information...",
        "score": 0.8917,
        "metadata": {"source": "docs", "page": 5},
        "external_id": "a1b2c3d4-chunk-4"
      }
    ]
  },
  "error": null
}
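
A client will typically unpack `data.results` into something easier to work with. The sketch below assumes the response shape shown above; `Chunk` and `extract_chunks` are hypothetical names, and treating any `status` other than `"success"` as a failure is an assumption, since the shape of a failure body is not documented here.

```python
from typing import Any, Dict, List, NamedTuple


class Chunk(NamedTuple):
    external_id: str
    score: float
    text: str
    metadata: Dict[str, Any]


def extract_chunks(response: Dict[str, Any]) -> List[Chunk]:
    """Convert a decoded /v1/query response into a list of Chunk tuples."""
    # Assumption: anything other than "success" is reported as an error.
    if response.get("status") != "success":
        raise RuntimeError(f"query failed: {response.get('error')}")
    return [
        Chunk(
            external_id=item["external_id"],
            score=item["score"],
            text=item["text"],
            metadata=item.get("metadata", {}),
        )
        for item in response["data"]["results"]
    ]
```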

With timing:

{
  "status": "success",
  "data": {
    "query": "How does vector indexing work?",
    "collection": "articles",
    "results": [
      {
        "text": "Vector indexing uses approximate nearest neighbor algorithms...",
        "score": 0.9342,
        "metadata": {"source": "docs", "page": 3},
        "external_id": "a1b2c3d4-chunk-0"
      }
    ],
    "timing_ms": {
      "embedding_ms": 11.8,
      "search_ms": 3.2,
      "total_ms": 15.0
    }
  },
  "error": null
}
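
Because `timing_ms` appears only when `include_timing` is true, client code should treat it as optional. A small sketch, assuming the response shape above (`format_timing` is a hypothetical helper name):

```python
from typing import Any, Dict, Optional


def format_timing(response: Dict[str, Any]) -> Optional[str]:
    """Return a one-line timing summary, or None if timing was not requested."""
    timing = response.get("data", {}).get("timing_ms")
    if timing is None:
        return None
    return (
        f"embedding {timing['embedding_ms']:.1f} ms, "
        f"search {timing['search_ms']:.1f} ms, "
        f"total {timing['total_ms']:.1f} ms"
    )
```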

Note: Unlike the /v1/collections/{name}/search endpoint, which requires a raw vector, the RAG query endpoint accepts plain text. The backend embeds the query internally before performing the search.

Errors

| Code | Reason |
| --- | --- |
| 404 | Collection not found |
| 401 | Missing or invalid API key |
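
On the client, these status codes can be mapped to readable messages. The sketch below covers only the two codes documented above; `DOCUMENTED_ERRORS` and `describe_error` are hypothetical names, and the fallback message for other codes is an assumption.

```python
from typing import Dict

# Status codes and reasons copied from the Errors table.
DOCUMENTED_ERRORS: Dict[int, str] = {
    401: "Missing or invalid API key",
    404: "Collection not found",
}


def describe_error(status_code: int) -> str:
    """Map an HTTP status code from /v1/query to a readable message."""
    reason = DOCUMENTED_ERRORS.get(status_code)
    if reason is None:
        # Assumption: other codes are possible but not documented here.
        return f"Unexpected HTTP status {status_code}"
    return f"{status_code}: {reason}"
```

With `urllib.request`, a 4xx response raises `urllib.error.HTTPError`, whose `code` attribute can be passed directly to `describe_error`.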