Embeddings API
Embeddings are numerical representations of text that capture semantic meaning. They are essential for tasks like document search, recommendation systems, and clustering. Siraya AI provides a unified API to access state-of-the-art embedding models from multiple providers (OpenAI, Jina, Voyage AI, etc.).
Base URL
The embeddings endpoint used throughout this page is https://llm.siraya.pro/v1/embeddings.
How to Generate Embeddings
Basic Request (Python)
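import requests

url = "https://llm.siraya.pro/v1/embeddings"
headers = {
    "Authorization": "Bearer <API_KEY>",  # replace <API_KEY> with your key
    "Content-Type": "application/json",
}
data = {
    "model": "text-embedding-3-small",
    "input": "The quick brown fox jumps over the lazy dog",
}

response = requests.post(url, headers=headers, json=data, timeout=30)
response.raise_for_status()  # surface HTTP errors before parsing the body

# With a single input string, its vector is the first item in "data".
embedding = response.json()["data"][0]["embedding"]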
Batch Processing
To generate multiple embeddings in a single request, pass an array of strings to the input parameter:
{
  "model": "text-embedding-3-small",
  "input": ["First document text", "Second document text", "Third document text"]
}
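The batch request could be sent from Python roughly as follows. This is a minimal sketch, not a reference client: the URL, headers, and the `data[i]["embedding"]` shape are taken from the basic request above, while the helper names (`chunked`, `extract_embeddings`, `embed_texts`), the batch size of 64, and the optional per-item `index` field are assumptions made for illustration.

```python
from typing import Iterator

EMBEDDINGS_URL = "https://llm.siraya.pro/v1/embeddings"  # same URL as the basic request


def chunked(texts: list[str], size: int) -> Iterator[list[str]]:
    """Yield consecutive slices of `texts` with at most `size` items each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(texts), size):
        yield texts[start:start + size]


def extract_embeddings(body: dict) -> list[list[float]]:
    """Return the vectors from a parsed response body, in input order.

    Assumption: every item in "data" has an "embedding" list and may carry
    an "index" field with its input position. Without "index", the list
    position is used instead.
    """
    items = body["data"]
    ordered = sorted(
        enumerate(items),
        key=lambda pair: pair[1].get("index", pair[0]),
    )
    return [item["embedding"] for _, item in ordered]


def embed_texts(texts, api_key, model="text-embedding-3-small", batch_size=64):
    """Embed `texts` in batches and return one vector per input text."""
    import requests  # third-party; imported here so the helpers above run without it

    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    vectors = []
    for batch in chunked(texts, batch_size):  # batch_size is a hypothetical default
        response = requests.post(
            EMBEDDINGS_URL,
            headers=headers,
            json={"model": model, "input": batch},
            timeout=30,
        )
        response.raise_for_status()
        vectors.extend(extract_embeddings(response.json()))
    return vectors
```

Calling `embed_texts(["First document text", "Second document text"], api_key="<API_KEY>")` would then return a list with one vector per input, provided the response follows the assumed shape.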
Best Practices
- Model Selection: Use smaller/faster models (like text-embedding-3-small) for low-latency search and larger models (like voyage-large-2) for better semantic accuracy.
- Batching: Group multiple texts into a single request to reduce round-trip latency and overhead.
- Normalization: Most models return normalized (unit-length) embeddings. Compare vectors with cosine similarity; for unit-length vectors this is equal to the dot product.
- Caching: Since embeddings are deterministic for a given model and input, caching results can save significant costs.
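To make the cosine similarity recommendation concrete, here is a small standard-library sketch that scores and ranks document vectors against a query vector. The function names `cosine_similarity` and `rank_by_similarity` are illustrative and not part of the API. The norms are computed explicitly, so the code also works for models that do not return normalized vectors.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between vectors `a` and `b` (range -1.0 to 1.0)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, documents):
    """Return (document_index, score) pairs, most similar first."""
    scores = [(i, cosine_similarity(query, doc)) for i, doc in enumerate(documents)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)
```

For a large collection, a vectorized library or a vector database would be the more practical choice. The pure-Python version is only meant to show the computation.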
Visit the Models Directory to see all supported embedding models and their dimensions.
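The caching practice listed under Best Practices can be sketched as an in-memory lookup keyed on the model name and input text. Everything here is hypothetical scaffolding: `EmbeddingCache` is not part of the API, and `embed_fn` stands for whatever function you use to call the embeddings endpoint for a single text.

```python
import hashlib
import json


class EmbeddingCache:
    """In-memory cache that calls `embed_fn` once per (model, text) pair."""

    def __init__(self, embed_fn):
        # embed_fn is a caller-supplied callable: (model, text) -> list of floats
        self._embed_fn = embed_fn
        self._store = {}

    @staticmethod
    def _key(model, text):
        # JSON-encode the pair so different (model, text) splits cannot collide.
        raw = json.dumps([model, text], ensure_ascii=False)
        return hashlib.sha256(raw.encode("utf-8")).hexdigest()

    def get(self, model, text):
        """Return the cached vector, computing and storing it on first use."""
        key = self._key(model, text)
        if key not in self._store:
            self._store[key] = self._embed_fn(model, text)
        return self._store[key]
```

Including the model name in the key matters because different models return different vectors for the same text. For persistence across processes, the dictionary could be replaced by a database or key-value store with the same key scheme.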