@motioneffector/llm

Sending Messages

Send chat messages to an LLM and process the response. This guide covers message formatting, request options, and handling the response object.

Prerequisites

Before starting, you should:

  - Have @motioneffector/llm installed in your project
  - Have an API key available in your environment (this guide reads it from OPENROUTER_KEY)

Overview

We'll send a chat message by:

  1. Creating a client with credentials
  2. Building a messages array
  3. Calling chat() with options
  4. Processing the response

Step 1: Create the Client

The client needs your API key and a default model. Create it once and reuse it for all requests.

import { createLLMClient } from '@motioneffector/llm'

const client = createLLMClient({
  apiKey: process.env.OPENROUTER_KEY!,
  model: 'anthropic/claude-sonnet-4'
})

Step 2: Build the Messages Array

Messages are objects with a role and content. The role is one of system, user, or assistant. The as const assertion keeps each role typed as a string literal instead of widening to string.

const messages = [
  { role: 'system' as const, content: 'You are a helpful assistant.' },
  { role: 'user' as const, content: 'What is TypeScript?' }
]

For simple requests, you can skip the system message:

const messages = [
  { role: 'user' as const, content: 'What is TypeScript?' }
]

Step 3: Send the Request

Call chat() with your messages. The method returns a promise that resolves to the response.

const response = await client.chat(messages)
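
If your environment doesn't support top-level await, wrap the call in an async function:

async function main() {
  const response = await client.chat(messages)
  console.log(response.content)
}

main()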

Step 4: Process the Response

The response object contains the generated text and metadata:

console.log(response.content)        // The generated text
console.log(response.usage)          // Token counts
console.log(response.model)          // Model that responded
console.log(response.latency)        // Request time in ms
console.log(response.finishReason)   // Why generation stopped

The usage object has detailed token counts:

console.log(response.usage.promptTokens)      // Input tokens
console.log(response.usage.completionTokens)  // Output tokens
console.log(response.usage.totalTokens)       // Total

Complete Example

import { createLLMClient } from '@motioneffector/llm'

const client = createLLMClient({
  apiKey: process.env.OPENROUTER_KEY!,
  model: 'anthropic/claude-sonnet-4'
})

const response = await client.chat([
  { role: 'system', content: 'Be concise.' },
  { role: 'user', content: 'Explain async/await in JavaScript.' }
])

console.log(response.content)
console.log(`Tokens: ${response.usage.totalTokens}, Latency: ${response.latency}ms`)

Variations

With Temperature

Control randomness with temperature (0 = most deterministic, 2 = most random):

const response = await client.chat(messages, {
  temperature: 0.7
})

With Max Tokens

Limit response length:

const response = await client.chat(messages, {
  maxTokens: 500
})

With Stop Sequences

Stop generation when specific text appears:

const response = await client.chat(messages, {
  stop: ['END', '---']
})

Override Model Per-Request

Use a different model for one request:

const response = await client.chat(messages, {
  model: 'openai/gpt-4o'
})

Combining Options

Options can be combined:

const response = await client.chat(messages, {
  temperature: 0.3,
  maxTokens: 1000,
  model: 'openai/gpt-4o'
})

Troubleshooting

Empty Response

Symptom: response.content is empty.

Cause: Model hit max tokens or content filter.

Solution: Check response.finishReason. If 'length', increase maxTokens. If 'content_filter', the response was blocked.
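
A minimal sketch of that check, using only the finish reason values named above:

if (response.finishReason === 'length') {
  // Truncated: retry with a higher output limit
  const retry = await client.chat(messages, { maxTokens: 2000 })
  console.log(retry.content)
} else if (response.finishReason === 'content_filter') {
  console.warn('Response was blocked by a content filter')
}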

Rate Limit Errors

Symptom: RateLimitError thrown.

Cause: Too many requests to the API.

Solution: The library retries automatically. For persistent issues, add delays between requests or check your API quota.
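
If you need an extra layer on top of the built-in retries, you can catch the error and back off manually. This sketch assumes RateLimitError is exported from the package; check your version's exports:

import { RateLimitError } from '@motioneffector/llm'

const delay = (ms: number) => new Promise<void>(resolve => setTimeout(resolve, ms))

try {
  const response = await client.chat(messages)
  console.log(response.content)
} catch (err) {
  if (err instanceof RateLimitError) {
    await delay(5000) // back off, then try once more
    const response = await client.chat(messages)
    console.log(response.content)
  } else {
    throw err
  }
}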

Slow Responses

Symptom: Requests take several seconds.

Cause: Large prompts or model processing time.

Solution: Check response.usage.promptTokens. If the prompt is large, trim older context before the next request. Consider streaming the response to improve perceived latency.
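
A quick way to spot oversized prompts, using a hypothetical budget of 4,000 input tokens:

console.log(`Prompt tokens: ${response.usage.promptTokens}`)
if (response.usage.promptTokens > 4000) {
  // Hypothetical threshold: keep the system message and the two most recent turns
  messages.splice(1, messages.length - 3)
}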

See Also