Sending Messages
Send chat messages to an LLM and process the response. This guide covers message formatting, request options, and handling the response object.
Prerequisites
Before starting, you should:
- Install the package
- Have an OpenRouter API key
Overview
We'll send a chat message by:
- Creating a client with credentials
- Building a messages array
- Calling chat() with options
- Processing the response
Step 1: Create the Client
The client needs your API key and a default model. Create it once and reuse it for all requests.
import { createLLMClient } from '@motioneffector/llm'
const client = createLLMClient({
apiKey: process.env.OPENROUTER_KEY!,
model: 'anthropic/claude-sonnet-4'
})
Step 2: Build the Messages Array
Each message is an object with a role and a content string. The role is one of system, user, or assistant.
const messages = [
{ role: 'system' as const, content: 'You are a helpful assistant.' },
{ role: 'user' as const, content: 'What is TypeScript?' }
]
For simple requests, you can skip the system message:
const messages = [
{ role: 'user' as const, content: 'What is TypeScript?' }
]
Step 3: Send the Request
Call chat() with your messages. The method returns a promise that resolves to the response.
const response = await client.chat(messages)
Step 4: Process the Response
The response object contains the generated text and metadata:
console.log(response.content) // The generated text
console.log(response.usage) // Token counts
console.log(response.model) // Model that responded
console.log(response.latency) // Request time in ms
console.log(response.finishReason) // Why generation stopped
The usage object has detailed token counts:
console.log(response.usage.promptTokens) // Input tokens
console.log(response.usage.completionTokens) // Output tokens
console.log(response.usage.totalTokens) // Total
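Token counts are useful for tracking spend. Below is a minimal cost-estimator sketch; the Usage shape mirrors the fields above, but the per-token rates are hypothetical placeholders, so check your provider's actual pricing:

```typescript
// Shape matching the usage fields shown above.
interface Usage {
  promptTokens: number
  completionTokens: number
  totalTokens: number
}

// Hypothetical rates in USD per 1M tokens; real pricing varies by model.
const INPUT_RATE = 3.0
const OUTPUT_RATE = 15.0

function estimateCost(usage: Usage): number {
  return (
    (usage.promptTokens / 1_000_000) * INPUT_RATE +
    (usage.completionTokens / 1_000_000) * OUTPUT_RATE
  )
}

const usage: Usage = { promptTokens: 500, completionTokens: 200, totalTokens: 700 }
console.log(estimateCost(usage).toFixed(6)) // "0.004500"
```

Note that totalTokens is simply the sum of the other two counts.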
Complete Example
import { createLLMClient } from '@motioneffector/llm'
const client = createLLMClient({
apiKey: process.env.OPENROUTER_KEY!,
model: 'anthropic/claude-sonnet-4'
})
const response = await client.chat([
{ role: 'system', content: 'Be concise.' },
{ role: 'user', content: 'Explain async/await in JavaScript.' }
])
console.log(response.content)
console.log(`Tokens: ${response.usage.totalTokens}, Latency: ${response.latency}ms`)
Variations
With Temperature
Control randomness with temperature (0 = deterministic, 2 = very random):
const response = await client.chat(messages, {
temperature: 0.7
})
With Max Tokens
Limit response length:
const response = await client.chat(messages, {
maxTokens: 500
})
With Stop Sequences
Stop generation when specific text appears:
const response = await client.chat(messages, {
stop: ['END', '---']
})
Override Model Per-Request
Use a different model for one request:
const response = await client.chat(messages, {
model: 'openai/gpt-4o'
})
Combining Options
Options can be combined:
const response = await client.chat(messages, {
temperature: 0.3,
maxTokens: 1000,
model: 'openai/gpt-4o'
})
Troubleshooting
Empty Response
Symptom: response.content is empty.
Cause: Model hit max tokens or content filter.
Solution: Check response.finishReason. If 'length', increase maxTokens. If 'content_filter', the response was blocked.
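One way to branch on the finish reason in code. The helper name is mine, and the reason strings follow the values mentioned above:

```typescript
// Subset of the response fields we inspect, for illustration.
interface ChatResult {
  content: string
  finishReason: string
}

// Hypothetical helper: explain why content came back empty.
function diagnoseEmptyResponse(response: ChatResult): string {
  if (response.content !== '') return 'ok'
  switch (response.finishReason) {
    case 'length':
      return 'hit maxTokens: increase the limit and retry'
    case 'content_filter':
      return 'blocked by the content filter: rephrase the prompt'
    default:
      return `empty for another reason: ${response.finishReason}`
  }
}

console.log(diagnoseEmptyResponse({ content: '', finishReason: 'length' }))
// "hit maxTokens: increase the limit and retry"
```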
Rate Limit Errors
Symptom: RateLimitError thrown.
Cause: Too many requests to the API.
Solution: The library retries automatically. For persistent issues, add delays between requests or check your API quota.
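If you still hit limits after the built-in retries, spacing out sequential requests can help. A sketch of a simple pacing loop; the sleep helper and chatSequentially are mine, and the send callback stands in for a call to client.chat:

```typescript
// Simple promise-based sleep.
function sleep(ms: number): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, ms))
}

// Process prompts one at a time, pausing between requests.
async function chatSequentially(
  prompts: string[],
  send: (prompt: string) => Promise<string>,
  delayMs = 1000
): Promise<string[]> {
  const results: string[] = []
  for (const prompt of prompts) {
    results.push(await send(prompt))
    await sleep(delayMs)
  }
  return results
}
```

In practice, send would wrap client.chat with a single user message built from the prompt.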
Slow Responses
Symptom: Requests take several seconds.
Cause: Large prompts or model processing time.
Solution: Check response.usage.promptTokens. Reduce context if too large. Consider using streaming for perceived performance.
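A sketch of one way to trim oversized context before sending: estimate the prompt size and drop the oldest non-system messages until it fits a budget. The 4-characters-per-token heuristic and both helper names are assumptions for illustration, not part of the library:

```typescript
interface Message {
  role: 'system' | 'user' | 'assistant'
  content: string
}

// Rough heuristic: ~4 characters per token (an approximation, not an exact count).
function estimateTokens(messages: Message[]): number {
  return Math.ceil(messages.reduce((sum, m) => sum + m.content.length, 0) / 4)
}

// Drop the oldest non-system messages until the estimate fits the budget.
function trimHistory(messages: Message[], maxTokens: number): Message[] {
  const trimmed = [...messages]
  while (estimateTokens(trimmed) > maxTokens) {
    const idx = trimmed.findIndex((m) => m.role !== 'system')
    if (idx === -1) break // only system messages left; nothing safe to drop
    trimmed.splice(idx, 1)
  }
  return trimmed
}
```

For an accurate count, compare your estimate against response.usage.promptTokens after each request and adjust the heuristic.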
See Also
- Streaming Responses - Real-time output
- Messages - Message format details
- Client API - Full chat() reference