Version: v5

Backends

Added in: v5.1.0

Four model backends ship with Harper. Each model entry in the models configuration selects one with its backend field.

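| Backend | Embeddings | Generation | Streaming | Tools |
| --- | --- | --- | --- | --- |
| ollama | yes | yes | | no |
| openai | yes | yes | | |
| anthropic | no | yes | | |
| bedrock | yes | yes | | varies by model |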

All backends support these common fields:

| Field | Description |
| --- | --- |
| backend | Which backend to use (required) |
| model | Provider-side model identifier (e.g. gpt-4o), used when a call does not pass its own model option |
| requestTimeoutMs | Per-request timeout in milliseconds; composed with any caller-provided AbortSignal |
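
One way a per-request timeout could be composed with a caller's AbortSignal is sketched below. This is an illustration under assumptions, not Harper's implementation: the helper name withTimeout and its return shape are hypothetical.

```typescript
// Hypothetical helper: combines a timeout with an optional caller-provided signal.
// It only illustrates what "composed with" can mean; Harper's internals may differ.
function withTimeout(
  timeoutMs: number,
  caller?: AbortSignal,
): { signal: AbortSignal; cancel: () => void } {
  const controller = new AbortController();

  // Abort the composed signal when the timeout elapses.
  const timer = setTimeout(() => controller.abort(), timeoutMs);

  // Abort the composed signal when the caller aborts; the timer is no longer needed.
  const onCallerAbort = (): void => {
    clearTimeout(timer);
    controller.abort();
  };

  if (caller) {
    if (caller.aborted) onCallerAbort();
    else caller.addEventListener("abort", onCallerAbort, { once: true });
  }

  // Call cancel() once the request settles to release the timer and the listener.
  const cancel = (): void => {
    clearTimeout(timer);
    caller?.removeEventListener("abort", onCallerAbort);
  };

  return { signal: controller.signal, cancel };
}
```

The composed signal fires on whichever comes first, the timeout or the caller's abort, so a request observes a single signal regardless of the cause.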

Ollama

Calls a local or remote Ollama server. No credentials are required.

    models:
      embedding:
        default:
          backend: ollama
          host: localhost:11434
          model: nomic-embed-text:latest
      generative:
        local:
          backend: ollama
          host: ollama.internal:11434
          model: mistral:7b

| Field | Default | Description |
| --- | --- | --- |
| host | localhost:11434 | Ollama server origin. A scheme-less value is treated as http://; a full origin (https://…) is used as given |
| model | | Ollama model name, e.g. nomic-embed-text:latest, mistral:7b |
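
The host rule described above can be sketched as a small function. The name ollamaOrigin is hypothetical and the code is only an illustration of the documented behavior, not Harper's source.

```typescript
// Hypothetical helper illustrating the documented rule for the host field.
function ollamaOrigin(host: string = "localhost:11434"): string {
  // A value that already carries a scheme is used as given.
  if (/^https?:\/\//i.test(host)) return host;
  // A scheme-less value is treated as http://.
  return `http://${host}`;
}
```

A regular expression is used here rather than the URL constructor because a value such as localhost:11434 would otherwise be parsed with "localhost:" as its scheme.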

When embedding with nomic-embed-text, the inputType option ('document' / 'query') is applied using the model's task prefixes; other models ignore it.
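
As a rough sketch of what applying task prefixes looks like: the nomic-embed-text model card documents the prefixes search_document: and search_query:, and the code below assumes those are the ones applied. The helper name applyInputType is hypothetical.

```typescript
type InputType = "document" | "query";

// Task prefixes documented for nomic-embed-text (assumed to be the ones in use here).
const NOMIC_PREFIX: Record<InputType, string> = {
  document: "search_document: ",
  query: "search_query: ",
};

// Hypothetical helper: prefix the text only for nomic-embed-text models;
// for any other model the inputType option has no effect.
function applyInputType(model: string, text: string, inputType?: InputType): string {
  if (!inputType || !model.startsWith("nomic-embed-text")) return text;
  return NOMIC_PREFIX[inputType] + text;
}
```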

The Ollama backend does not advertise tool support — declaring tools against it fails up front.

OpenAI

Calls the OpenAI API. Pointing baseUrl at another service lets the same backend call any OpenAI-compatible API that uses bearer-token authentication, including vLLM's OpenAI-compatible server, Google's Gemini OpenAI-compatible endpoint, Azure OpenAI's /openai/v1 endpoint, and hosted gateways such as OpenRouter or Together AI.

    models:
      embedding:
        default:
          backend: openai
          apiKey: ${OPENAI_API_KEY}
          model: text-embedding-3-large
      generative:
        default:
          backend: openai
          apiKey: ${OPENAI_API_KEY}
          model: gpt-4o
        vllm:
          backend: openai
          apiKey: ${VLLM_API_KEY}
          baseUrl: http://vllm.internal:8000/v1
          model: meta-llama/Llama-3.1-8B-Instruct

| Field | Default | Description |
| --- | --- | --- |
| apiKey | (required) | API key, sent as a bearer token. Use ${VAR} indirection |
| baseUrl | https://api.openai.com/v1 | API root; point at any OpenAI-compatible endpoint |
| model | | Model name, e.g. gpt-4o, text-embedding-3-large |
| organization | | Sent as the OpenAI-Organization header, for keys spanning multiple organizations |
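
To make the fields concrete, the sketch below shows how they could translate into a request URL and headers. The interface and function names are hypothetical; only the bearer token, the OpenAI-Organization header, and the default API root come from the descriptions above.

```typescript
// Hypothetical shape mirroring the documented fields.
interface OpenAIBackendConfig {
  apiKey: string;
  baseUrl?: string;
  organization?: string;
}

// Hypothetical helper: derive the URL and headers for a path under the API root.
function openAIRequest(
  config: OpenAIBackendConfig,
  path: string,
): { url: string; headers: Record<string, string> } {
  // Fall back to the documented default root and drop any trailing slash.
  const root = (config.baseUrl ?? "https://api.openai.com/v1").replace(/\/+$/, "");
  const headers: Record<string, string> = {
    Authorization: `Bearer ${config.apiKey}`, // apiKey is sent as a bearer token
    "Content-Type": "application/json",
  };
  // The organization header is only sent when the field is configured.
  if (config.organization) headers["OpenAI-Organization"] = config.organization;
  return { url: `${root}/${path.replace(/^\/+/, "")}`, headers };
}
```

With baseUrl set to http://vllm.internal:8000/v1, the path chat/completions resolves to http://vllm.internal:8000/v1/chat/completions, so an OpenAI-compatible server sees the same request shape.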

responseFormat: 'json' maps to OpenAI's JSON mode and responseFormat: { schema } to strict structured outputs (json_schema); OpenAI-compatible servers vary in their support for these.
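
The mapping can be sketched against OpenAI's response_format request parameter. The function name and the default schema name "response" are hypothetical; OpenAI requires a name for json_schema formats, and how Harper chooses one is not stated here.

```typescript
type ResponseFormat = "json" | { schema: Record<string, unknown> };

// Hypothetical mapping from the responseFormat option to OpenAI's response_format.
function toOpenAIResponseFormat(
  format: ResponseFormat,
  name: string = "response", // assumed placeholder name
): Record<string, unknown> {
  // 'json' corresponds to JSON mode.
  if (format === "json") return { type: "json_object" };
  // { schema } corresponds to strict structured outputs.
  return {
    type: "json_schema",
    json_schema: { name, schema: format.schema, strict: true },
  };
}
```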

Anthropic

Calls the Anthropic Messages API. Generation only — Anthropic does not offer an embeddings API.

    models:
      generative:
        claude:
          backend: anthropic
          apiKey: ${ANTHROPIC_API_KEY}
          model: claude-sonnet-4-6

| Field | Default | Description |
| --- | --- | --- |
| apiKey | (required) | API key, sent as the x-api-key header |
| baseUrl | https://api.anthropic.com | API root |
| model | | Model name, e.g. claude-sonnet-4-6 |

The Anthropic API requires a completion token limit on every request; when a call does not pass maxTokens, Harper sends 4096.
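
A minimal sketch of that default, assuming a request body with the Messages API fields model, max_tokens, and messages. The function and type names are hypothetical.

```typescript
interface Message {
  role: "user" | "assistant";
  content: string;
}

// Fallback used when a call does not pass maxTokens.
const DEFAULT_MAX_TOKENS = 4096;

// Hypothetical helper: build a Messages API body, filling in the token limit.
function anthropicBody(
  model: string,
  messages: Message[],
  maxTokens?: number,
): { model: string; max_tokens: number; messages: Message[] } {
  return { model, max_tokens: maxTokens ?? DEFAULT_MAX_TOKENS, messages };
}
```

Passing maxTokens explicitly overrides the fallback, which is worth doing for long outputs since 4096 tokens can truncate them.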

Amazon Bedrock

Calls AWS Bedrock. Credentials come from the standard AWS SDK chain (environment variables, shared credentials file, IAM instance/task roles) — there is no apiKey field.

The AWS SDK is not bundled with Harper. Install it in your project to use this backend:

    npm install @aws-sdk/client-bedrock-runtime

    models:
      embedding:
        titan:
          backend: bedrock
          region: us-east-1
          model: amazon.titan-embed-text-v2:0
      generative:
        claude:
          backend: bedrock
          region: us-east-1
          model: anthropic.claude-sonnet-4-5-20250929-v1:0

| Field | Default | Description |
| --- | --- | --- |
| region | (required) | AWS region hosting the Bedrock models |
| model | | Bedrock model identifier; the vendor prefix selects the request format |

The model identifier's vendor prefix (anthropic., meta., amazon.titan-, cohere., mistral.) determines the request/response format Harper uses; an unrecognized prefix is rejected with an error. Tool support depends on the underlying model family. Bedrock embedding APIs accept one text per request, so batch embed() calls are issued sequentially.
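
The prefix dispatch and the sequential batching described above can be sketched as follows. Both function names are hypothetical, and embedOne stands in for whatever issues a single Bedrock embedding request; none of this is Harper's actual code.

```typescript
// The vendor prefixes listed above.
const BEDROCK_PREFIXES = ["anthropic.", "meta.", "amazon.titan-", "cohere.", "mistral."] as const;
type BedrockPrefix = (typeof BEDROCK_PREFIXES)[number];

// Hypothetical helper: pick the request format from the model identifier.
function vendorPrefix(modelId: string): BedrockPrefix {
  const match = BEDROCK_PREFIXES.find((prefix) => modelId.startsWith(prefix));
  // An unrecognized prefix is rejected with an error.
  if (!match) throw new Error(`Unrecognized Bedrock model prefix: ${modelId}`);
  return match;
}

// Hypothetical helper: one text per request, so a batch is embedded one call at a time.
async function embedSequentially(
  texts: string[],
  embedOne: (text: string) => Promise<number[]>,
): Promise<number[][]> {
  const vectors: number[][] = [];
  for (const text of texts) {
    vectors.push(await embedOne(text)); // wait for each request before sending the next
  }
  return vectors;
}
```

Because the requests run one after another, the latency of a batch embed() call on this backend grows roughly linearly with the number of texts.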