🌐 API Documentation - Linara Terminal

Complete reference for OpenRouter AI integration


Table of Contents

  1. API Overview
  2. Authentication
  3. Endpoints
  4. Request Format
  5. Response Format
  6. Error Handling
  7. Rate Limits
  8. Models Available
  9. Best Practices
  10. Code Examples

API Overview

What is OpenRouter?

OpenRouter is a unified API gateway that provides access to multiple AI models (GPT-4, Claude, Gemini, etc.) through a single interface.

Base URL: https://openrouter.ai/api/v1

Protocol: HTTPS (TLS 1.2+)

Format: JSON

Authentication: Bearer token

Why We Use It

| Benefit | Description |
|---------|-------------|
| Unified API | One interface for multiple models |
| Free tier | Google Gemini 2.0 Flash is free |
| Reliability | Automatic failover between providers |
| Simplicity | Standard OpenAI-compatible format |
| Flexibility | Easy to switch models |

Authentication

API Key Setup

  1. Get API Key:

    • Visit: https://openrouter.ai/
    • Sign up (free)
    • Go to Settings → Keys
    • Create new key
    • Copy: sk-or-v1-xxxxxxxxxxxxxxxx
  2. Configure in Project:

# Create .env file
echo 'OPENROUTER_API_KEY=sk-or-v1-YOUR_KEY_HERE' > .env
echo 'OPENROUTER_MODEL=google/gemini-2.0-flash-exp:free' >> .env
  3. Load in Code:
use std::env;

fn get_api_key() -> Result<String, String> {
    dotenvy::dotenv().ok();
    env::var("OPENROUTER_API_KEY")
        .map_err(|_| "API key not set".to_string())
}

Security Best Practices

// ❌ NEVER hardcode API keys
const API_KEY: &str = "sk-or-v1-abc123...";

// ✅ Use environment variables
let api_key = env::var("OPENROUTER_API_KEY")?;

// ✅✅ Best: Use system keyring (Linux/macOS/Windows)
fn get_api_key() -> Result<String, keyring::Error> {
    keyring::Entry::new("linara-terminal", "api-key")?.get_password()
}

Endpoints

Chat Completions

POST /chat/completions

Purpose: Generate text completions (our main use case)

URL: https://openrouter.ai/api/v1/chat/completions

Headers:

Authorization: Bearer sk-or-v1-YOUR_KEY_HERE
Content-Type: application/json
HTTP-Referer: https://github.com/zoxilsi/Linara-Terminal (optional)
X-Title: Linara Terminal (optional)
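
The header set above can be assembled in one place so the optional attribution headers are easy to toggle. The following is a minimal std-only sketch; `build_headers` and its `include_attribution` flag are hypothetical names chosen for illustration, not part of the Linara code base.

```rust
/// Builds the request headers listed above as (name, value) pairs.
/// `build_headers` is a hypothetical helper for illustration.
fn build_headers(api_key: &str, include_attribution: bool) -> Vec<(&'static str, String)> {
    // Required headers: bearer token and JSON content type.
    let mut headers = vec![
        ("Authorization", format!("Bearer {}", api_key)),
        ("Content-Type", "application/json".to_string()),
    ];
    if include_attribution {
        // Optional headers that identify the calling application.
        headers.push((
            "HTTP-Referer",
            "https://github.com/zoxilsi/Linara-Terminal".to_string(),
        ));
        headers.push(("X-Title", "Linara Terminal".to_string()));
    }
    headers
}
```

Each pair can then be passed to the HTTP client's header setter in a loop.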

Request Format

Basic Structure

{
  "model": "google/gemini-2.0-flash-exp:free",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "Hello!"
    }
  ]
}

Complete Request Schema

interface ChatCompletionRequest {
  model: string;                    // Required: Model identifier
  messages: Message[];              // Required: Conversation history
  temperature?: number;             // Optional: 0.0-2.0 (default: 1.0)
  top_p?: number;                   // Optional: 0.0-1.0 (default: 1.0)
  max_tokens?: number;              // Optional: Max response length
  stream?: boolean;                 // Optional: Stream response (default: false)
  stop?: string | string[];         // Optional: Stop sequences
}

interface Message {
  role: "system" | "user" | "assistant";
  content: string;
}
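
Since the schema documents ranges for the optional sampling parameters, it can help to check them before sending a request. The sketch below assumes a hypothetical `SamplingParams` struct with a `validate` method; the rule that `max_tokens` must be non-zero is an assumption, not something the schema states.

```rust
/// Optional sampling parameters from the request schema.
/// The struct and its `validate` method are hypothetical.
#[derive(Debug, Default)]
struct SamplingParams {
    temperature: Option<f64>,
    top_p: Option<f64>,
    max_tokens: Option<u32>,
}

impl SamplingParams {
    /// Checks each field that is set against the documented range.
    fn validate(&self) -> Result<(), String> {
        if let Some(t) = self.temperature {
            if !(0.0..=2.0).contains(&t) {
                return Err(format!("temperature {} is outside 0.0-2.0", t));
            }
        }
        if let Some(p) = self.top_p {
            if !(0.0..=1.0).contains(&p) {
                return Err(format!("top_p {} is outside 0.0-1.0", p));
            }
        }
        // Assumption: a zero-token response is never useful.
        if self.max_tokens == Some(0) {
            return Err("max_tokens must be greater than zero".to_string());
        }
        Ok(())
    }
}
```

Unset fields are skipped, so a default value passes validation and the API defaults apply.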

Our Implementation

#[derive(Serialize)]
struct OpenRouterRequest {
    model: String,
    messages: Vec<OpenRouterMessage>,
}

#[derive(Serialize)]
struct OpenRouterMessage {
    role: String,      // "system" | "user" | "assistant"
    content: String,   // Message text
}

// Build request
let request = OpenRouterRequest {
    model: "google/gemini-2.0-flash-exp:free".to_string(),
    messages: vec![
        OpenRouterMessage {
            role: "system".to_string(),
            content: "Convert natural language to Linux commands.".to_string(),
        },
        OpenRouterMessage {
            role: "user".to_string(),
            content: format!("Input: {}\nOutput:", user_input),
        },
    ],
};

Example Requests

Simple Command Translation

{
  "model": "google/gemini-2.0-flash-exp:free",
  "messages": [
    {
      "role": "system",
      "content": "Convert natural language to Linux commands. Respond with ONLY the command."
    },
    {
      "role": "user",
      "content": "list all files"
    }
  ]
}

Expected Response: ls -la

File Operation

{
  "model": "google/gemini-2.0-flash-exp:free",
  "messages": [
    {
      "role": "system",
      "content": "Convert natural language to Linux commands."
    },
    {
      "role": "user",
      "content": "create a folder called test"
    }
  ]
}

Expected Response: mkdir test


Response Format

Success Response (200 OK)

{
  "id": "gen-1234567890abcdef",
  "model": "google/gemini-2.0-flash-exp:free",
  "object": "chat.completion",
  "created": 1234567890,
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "ls -la"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 15,
    "completion_tokens": 3,
    "total_tokens": 18
  }
}

Response Schema

interface ChatCompletionResponse {
  id: string;                    // Unique request ID
  model: string;                 // Model used
  object: "chat.completion";     // Type identifier
  created: number;               // Unix timestamp
  choices: Choice[];             // Response options
  usage: Usage;                  // Token usage
}

interface Choice {
  index: number;                 // Choice index (0 for first)
  message: Message;              // AI response
  finish_reason: string;         // "stop" | "length" | "content_filter"
}

interface Usage {
  prompt_tokens: number;         // Input tokens
  completion_tokens: number;     // Output tokens
  total_tokens: number;          // Sum
}
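
The `finish_reason` field is worth inspecting before a generated command is trusted, because a response cut off by `length` may be an incomplete command. A possible sketch, using a hypothetical `FinishReason` enum with a catch-all variant for values the schema does not list:

```rust
/// The finish reasons listed in the response schema, plus a catch-all.
/// This enum is a hypothetical illustration, not Linara's own type.
#[derive(Debug, PartialEq)]
enum FinishReason {
    Stop,
    Length,
    ContentFilter,
    Other(String),
}

/// Maps the raw `finish_reason` string to the enum.
fn parse_finish_reason(raw: &str) -> FinishReason {
    match raw {
        "stop" => FinishReason::Stop,
        "length" => FinishReason::Length,
        "content_filter" => FinishReason::ContentFilter,
        other => FinishReason::Other(other.to_string()),
    }
}

/// Treats a response as complete only when the model stopped on its own.
fn is_complete(reason: &FinishReason) -> bool {
    *reason == FinishReason::Stop
}
```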

Our Parsing Implementation

#[derive(Deserialize)]
struct OpenRouterResponse {
    choices: Vec<OpenRouterChoice>,
}

#[derive(Deserialize)]
struct OpenRouterChoice {
    message: OpenRouterMessage,
}

#[derive(Deserialize)]
struct OpenRouterMessage {
    role: String,
    content: String,
}

// Parse response
let response: OpenRouterResponse = http_response.json().await?;
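// Use first() rather than indexing so an empty `choices` array is an error, not a panic
let command = response
    .choices
    .first()
    .map(|choice| choice.message.content.trim())
    .ok_or("API returned no choices")?;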

Error Handling

HTTP Status Codes

| Code | Meaning | Action |
|------|---------|--------|
| 200 | Success | Parse response |
| 400 | Bad Request | Check request format |
| 401 | Unauthorized | Verify API key |
| 429 | Too Many Requests | Retry with backoff |
| 500 | Server Error | Retry |
| 503 | Service Unavailable | Retry later |
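
The status table can be expressed directly as a lookup. This is a sketch with hypothetical names (`Action`, `action_for_status`); treating every code outside the table as a non-retryable failure is an assumption.

```rust
/// Actions from the status code table, plus a fallback.
#[derive(Debug, PartialEq)]
enum Action {
    ParseResponse,
    CheckRequestFormat,
    VerifyApiKey,
    RetryWithBackoff,
    Retry,
    RetryLater,
    Fail,
}

/// Maps an HTTP status code to the action listed in the table.
fn action_for_status(code: u16) -> Action {
    match code {
        200 => Action::ParseResponse,
        400 => Action::CheckRequestFormat,
        401 => Action::VerifyApiKey,
        429 => Action::RetryWithBackoff,
        500 => Action::Retry,
        503 => Action::RetryLater,
        // Assumption: codes not in the table are not retried.
        _ => Action::Fail,
    }
}
```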

Error Response Format

{
  "error": {
    "message": "Rate limit exceeded",
    "type": "rate_limit_error",
    "code": 429
  }
}

Our Error Handling

match http_response.status() {
    status if status.is_success() => {
        // Parse and return command
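        let data: OpenRouterResponse = http_response.json().await?;
        // An empty `choices` array becomes an error instead of a panic
        data.choices
            .first()
            .map(|choice| choice.message.content.clone())
            .ok_or_else(|| "API returned no choices".into())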
    }
    status if status == 429 => {
        // Rate limited - retry with backoff
        Err("API_RATE_LIMIT: Too many requests".into())
    }
    status if status.is_server_error() => {
        // Transient error - retry
        Err("API_TRANSIENT: Server error".into())
    }
    status => {
        // Client error - don't retry
        let body = http_response.text().await?;
        Err(format!("API error {}: {}", status, body).into())
    }
}

Retry Logic with Exponential Backoff

// Retry up to 3 times
let mut last_error: Option<Box<dyn std::error::Error>> = None;

for attempt in 0..3 {
    match make_api_call().await {
        Ok(response) if response.status() == 429 => {
            // Rate limited - remember the failure in case no retry succeeds
            last_error = Some("API_RATE_LIMIT: Too many requests".into());

            // Back off before the next attempt; nothing to wait for after the last one
            let backoff_ms = match attempt {
                0 => 300,   // First retry: 300ms
                1 => 800,   // Second retry: 800ms
                _ => 0,
            };

            if backoff_ms > 0 {
                sleep(Duration::from_millis(backoff_ms)).await;
            }
        }
        Ok(response) => return Ok(response),      // Success!
        Err(e) => last_error = Some(e.into()),    // Network error
    }
}

// All retries failed
Err(last_error.unwrap_or_else(|| "API call failed".into()))
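
The backoff schedule used in the loop above can be isolated into a small function, which makes the delays testable without making network calls. A std-only sketch; `backoff_after`, `worst_case_wait`, and `MAX_ATTEMPTS` are hypothetical names.

```rust
use std::time::Duration;

/// Number of attempts made by the retry loop.
const MAX_ATTEMPTS: u32 = 3;

/// Delay to wait after a rate-limited attempt (0-based),
/// or `None` when no further retry follows.
fn backoff_after(attempt: u32) -> Option<Duration> {
    match attempt {
        0 => Some(Duration::from_millis(300)),
        1 => Some(Duration::from_millis(800)),
        _ => None,
    }
}

/// Total time spent sleeping if every attempt is rate limited.
fn worst_case_wait() -> Duration {
    (0..MAX_ATTEMPTS).filter_map(backoff_after).sum()
}
```

With this schedule, a request that is rate limited on every attempt spends 1.1 seconds sleeping before it gives up.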

Rate Limits

Free Tier (Gemini 2.0 Flash)

| Limit Type | Value | Notes |
|------------|-------|-------|
| Requests per minute | ~60 | Varies by load |
| Tokens per request | 1M context | Very generous |
| Cost | $0.00 | Completely free |
| Uptime | Best effort | May have occasional outages |
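
Because the per-minute limit is approximate, one option is to throttle on the client side so most requests never reach a 429. The sketch below is a hypothetical sliding-window limiter (`RateLimiter` is not part of Linara); the limit and window are passed in rather than hard-coded, and the current time is a parameter so the logic can be tested deterministically.

```rust
use std::collections::VecDeque;
use std::time::{Duration, Instant};

/// Client-side sliding-window limiter (hypothetical helper).
struct RateLimiter {
    max_requests: usize,
    window: Duration,
    sent: VecDeque<Instant>,
}

impl RateLimiter {
    fn new(max_requests: usize, window: Duration) -> Self {
        Self { max_requests, window, sent: VecDeque::new() }
    }

    /// Returns true and records the request if it fits in the current window.
    fn try_acquire(&mut self, now: Instant) -> bool {
        // Drop timestamps that have aged out of the window.
        while let Some(&oldest) = self.sent.front() {
            if now.duration_since(oldest) >= self.window {
                self.sent.pop_front();
            } else {
                break;
            }
        }
        if self.sent.len() < self.max_requests {
            self.sent.push_back(now);
            true
        } else {
            false
        }
    }
}
```

A caller would construct it with something like `RateLimiter::new(60, Duration::from_secs(60))` and call `try_acquire(Instant::now())` before each request.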

Rate Limit Response

{
  "error": {
    "message": "Rate limit exceeded. Please try again later.",
    "type": "rate_limit_error",
    "code": 429
  }
}

Handling Rate Limits

  1. Detect: Check for HTTP 429 status
  2. Wait: Exponential backoff (300ms, 800ms)
  3. Retry: Automatic retry up to 3 times
  4. Inform: Show user-friendly message
if error.contains("429") || error.contains("rate-limited") {
    println!("🚦 Rate limited by free model.");
    println!("   • Wait 30 seconds and try again");
    println!("   • Or add your own OpenRouter key");
    println!("   • Or switch to a different model");
}

Models Available

Current Model (Free)

Model ID: google/gemini-2.0-flash-exp:free

| Feature | Value |
|---------|-------|
| Provider | Google |
| Cost | $0.00 (free) |
| Context Window | 1,048,576 tokens (1M) |
| Speed | Very fast (~500ms) |
| Quality | Excellent for code |
| Release | December 2024 |

Alternative Models

Claude 3.5 Sonnet (Paid)

# .env
OPENROUTER_MODEL=anthropic/claude-3.5-sonnet

| Feature | Value |
|---------|-------|
| Cost | $3.00 / 1M input tokens |
| Context | 200K tokens |
| Speed | Medium (~1s) |
| Quality | Excellent reasoning |

GPT-4 Turbo (Paid)

# .env
OPENROUTER_MODEL=openai/gpt-4-turbo

| Feature | Value |
|---------|-------|
| Cost | $10.00 / 1M input tokens |
| Context | 128K tokens |
| Speed | Slow (~2s) |
| Quality | Excellent general |
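
To compare the paid models, the input cost of a request can be estimated from the `prompt_tokens` value in the response's `usage` object. For example, 15 prompt tokens at $3.00 per 1M input tokens cost 15 / 1,000,000 × 3.00 = $0.000045. The helper below is a hypothetical sketch and covers input tokens only, since output pricing is not listed here.

```rust
/// Estimated input cost in USD for one request.
/// `usd_per_million_tokens` is the listed price per 1M input tokens.
fn input_cost_usd(prompt_tokens: u64, usd_per_million_tokens: f64) -> f64 {
    prompt_tokens as f64 / 1_000_000.0 * usd_per_million_tokens
}
```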

Switching Models

// Get model from environment
fn get_openrouter_model() -> String {
    env::var("OPENROUTER_MODEL")
        .unwrap_or_else(|_| "google/gemini-2.0-flash-exp:free".to_string())
}

// Use in request
let request = OpenRouterRequest {
    model: get_openrouter_model(),
    messages: vec![...],
};
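
One gap in the fallback above is that an `OPENROUTER_MODEL` variable that is set but blank would be passed through as an empty model ID. A possible refinement, assuming a hypothetical `resolve_model` helper that takes the raw value so it can be tested without touching the environment:

```rust
/// Default model used when nothing usable is configured.
const DEFAULT_MODEL: &str = "google/gemini-2.0-flash-exp:free";

/// Picks the configured model, falling back to the default
/// when the value is missing or blank.
fn resolve_model(configured: Option<&str>) -> String {
    match configured.map(str::trim) {
        Some(model) if !model.is_empty() => model.to_string(),
        _ => DEFAULT_MODEL.to_string(),
    }
}
```

It could be called as `resolve_model(std::env::var("OPENROUTER_MODEL").ok().as_deref())`.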

Best Practices

1. Optimize Prompts

Bad prompt:

"Convert this to a command: list files"

Good prompt:

"You are a Linux terminal command generator. Convert natural language to 
valid Linux commands. Respond with ONLY the command, no explanations.

Input: list files
Output:"
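
The good prompt splits naturally into a fixed system message and a per-request user message. A minimal sketch of that split; `build_messages` is a hypothetical helper that returns (role, content) pairs rather than Linara's own message type.

```rust
/// Fixed instructions sent as the system message.
const SYSTEM_PROMPT: &str = "You are a Linux terminal command generator. \
Convert natural language to valid Linux commands. \
Respond with ONLY the command, no explanations.";

/// Builds the (role, content) pairs for one request.
fn build_messages(user_input: &str) -> Vec<(&'static str, String)> {
    vec![
        ("system", SYSTEM_PROMPT.to_string()),
        // The trailing "Output:" cue nudges the model to answer with the command alone.
        ("user", format!("Input: {}\nOutput:", user_input.trim())),
    ]
}
```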

2. Validate Responses

// Always validate AI output
fn validate_command(cmd: &str) -> bool {
    // Not empty
    !cmd.is_empty() &&
    // Reasonable length
    cmd.len() < 200 &&
    // Contains alphanumeric
    cmd.chars().any(|c| c.is_alphanumeric()) &&
    // First token is executable
    looks_like_valid_command(cmd)
}
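
The `looks_like_valid_command` helper is not shown in this document. As an illustration only, one possible stand-in checks that the first token looks like a program name or path; the allowed character set is an assumption and will not catch every chatty reply (a sentence starting with a plain word still passes).

```rust
/// Hypothetical stand-in for `looks_like_valid_command`: accepts a command
/// when its first token looks like a program name or path.
fn looks_like_valid_command(cmd: &str) -> bool {
    let first = match cmd.split_whitespace().next() {
        Some(token) => token,
        None => return false, // empty or whitespace-only input
    };
    // The token must start like a name or path ...
    let starts_ok = first
        .chars()
        .next()
        .map_or(false, |c| c.is_ascii_alphanumeric() || "./_".contains(c));
    // ... and contain only characters common in executable names.
    let body_ok = first
        .chars()
        .all(|c| c.is_ascii_alphanumeric() || "._-/+".contains(c));
    starts_ok && body_ok
}
```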

3. Implement Caching

// Cache successful responses
struct CacheEntry {
    command: String,
    timestamp: SystemTime,
}

// 5-minute TTL
fn get_cached(&self, input: &str) -> Option<String> {
    if let Some(entry) = self.cache.get(input) {
        if entry.timestamp.elapsed().ok()? < Duration::from_secs(300) {
            return Some(entry.command.clone());
        }
    }
    None
}
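
For reference, a self-contained version of the same idea might look like the sketch below. It is not Linara's implementation: `CommandCache` is a hypothetical type, and it uses `Instant` instead of `SystemTime`, because a monotonic clock cannot fail or jump backwards when the system time changes.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// One cached translation and the moment it was stored.
struct CacheEntry {
    command: String,
    stored_at: Instant,
}

/// In-memory cache keyed by the user's natural-language input.
struct CommandCache {
    ttl: Duration,
    entries: HashMap<String, CacheEntry>,
}

impl CommandCache {
    fn new(ttl: Duration) -> Self {
        Self { ttl, entries: HashMap::new() }
    }

    /// Stores or overwrites the command for this input.
    fn insert(&mut self, input: &str, command: &str) {
        self.entries.insert(
            input.to_string(),
            CacheEntry { command: command.to_string(), stored_at: Instant::now() },
        );
    }

    /// Returns the cached command if it exists and is younger than the TTL.
    fn get(&self, input: &str) -> Option<String> {
        let entry = self.entries.get(input)?;
        if entry.stored_at.elapsed() < self.ttl {
            Some(entry.command.clone())
        } else {
            None
        }
    }
}
```

Expired entries are skipped on read but never removed here; a long-running process would also want periodic eviction.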

4. Handle Errors Gracefully

// User-friendly error messages
match result {
    Err(e) if e.contains("429") => {
        println!("🚦 Rate limited. Wait 30s or add own key.");
    }
    Err(e) if e.contains("timeout") => {
        println!("⏰ AI timed out. Try again.");
    }
    Err(e) => {
        println!("❌ Error: {}", e);
    }
    Ok(cmd) => { /* Success */ }
}
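
Matching on substrings of an error message is fragile, because any change to the wording silently breaks the match. An alternative, sketched here with a hypothetical `ApiError` enum, is to classify the failure once and derive the user-facing text from the variant.

```rust
/// Failure categories the terminal reports to the user (hypothetical type).
#[derive(Debug)]
enum ApiError {
    RateLimited,
    Timeout,
    Other(String),
}

impl ApiError {
    /// Text shown to the user for each category.
    fn user_message(&self) -> String {
        match self {
            ApiError::RateLimited => "Rate limited. Wait 30s or add own key.".to_string(),
            ApiError::Timeout => "AI timed out. Try again.".to_string(),
            ApiError::Other(detail) => format!("Error: {}", detail),
        }
    }
}
```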

5. Use Connection Pooling

// Reuse HTTP connections
let client = reqwest::Client::builder()
    .timeout(Duration::from_secs(5))
    .pool_max_idle_per_host(20)      // Keep 20 connections
    .tcp_keepalive(Duration::from_secs(60))  // 60s keepalive
    .build()?;

Code Examples

Complete API Call (Full Implementation)

use reqwest;
use serde::{Deserialize, Serialize};
use std::time::Duration;
use tokio::time::timeout;

#[derive(Serialize)]
struct OpenRouterRequest {
    model: String,
    messages: Vec<OpenRouterMessage>,
}

#[derive(Serialize, Deserialize)]
struct OpenRouterMessage {
    role: String,
    content: String,
}

#[derive(Deserialize)]
struct OpenRouterResponse {
    choices: Vec<OpenRouterChoice>,
}

#[derive(Deserialize)]
struct OpenRouterChoice {
    message: OpenRouterMessage,
}

async fn generate_command(
    client: &reqwest::Client,
    natural_input: &str,
    api_key: &str
) -> Result<String, Box<dyn std::error::Error>> {
    
    // Build request
    let request = OpenRouterRequest {
        model: "google/gemini-2.0-flash-exp:free".to_string(),
        messages: vec![
            OpenRouterMessage {
                role: "system".to_string(),
                content: "Convert natural language to Linux commands. \
                          Respond with ONLY the command.".to_string(),
            },
            OpenRouterMessage {
                role: "user".to_string(),
                content: format!("Input: {}\nOutput:", natural_input),
            },
        ],
    };
    
    // Make API call with timeout
    let response = timeout(
        Duration::from_secs(10),
        client
            .post("https://openrouter.ai/api/v1/chat/completions")
            .header("Authorization", format!("Bearer {}", api_key))
            .header("Content-Type", "application/json")
            .json(&request)
            .send()
    ).await??;
    
    // Check status
    if !response.status().is_success() {
        let status = response.status();
        let body = response.text().await?;
        return Err(format!("API error {}: {}", status, body).into());
    }
    
    // Parse response
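    let data: OpenRouterResponse = response.json().await?;
    // Use first() so an empty `choices` array is reported as an error, not a panic
    let command = data
        .choices
        .first()
        .map(|choice| choice.message.content.trim())
        .ok_or("API returned no choices")?;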
    
    // Clean markdown if present
    let command = command
        .trim_start_matches("```bash")
        .trim_start_matches("```")
        .trim_end_matches("```")
        .trim();
    
    Ok(command.to_string())
}

// Usage
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = reqwest::Client::new();
    let api_key = std::env::var("OPENROUTER_API_KEY")?;
    
    let command = generate_command(&client, "list all files", &api_key).await?;
    println!("Command: {}", command);  // e.g. "ls -la"
    
    Ok(())
}
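
The fence cleanup in `generate_command` only recognizes the `bash` language tag, so a reply fenced as `sh` or `shell` would keep its tag. A more general variant is sketched below; `strip_code_fence` is a hypothetical helper, and it assumes the language tag, if any, sits on its own line after the opening fence.

```rust
/// Removes a surrounding Markdown code fence, whatever its language tag.
fn strip_code_fence(raw: &str) -> &str {
    let text = raw.trim();
    let inner = match text.strip_prefix("```") {
        Some(rest) => rest,
        None => return text, // no fence: nothing to strip
    };
    // Drop the closing fence if present.
    let inner = inner.strip_suffix("```").unwrap_or(inner);
    // Drop the language tag, which runs up to the first newline.
    match inner.split_once('\n') {
        Some((_tag, body)) => body.trim(),
        None => inner.trim(),
    }
}
```

A single-line reply that puts the tag and the command on the same line is not handled; that case would need a list of known tags.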

Testing API Manually (curl)

# Test connection
curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer $OPENROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "google/gemini-2.0-flash-exp:free",
    "messages": [
      {"role": "user", "content": "Say hello"}
    ]
  }'

# Expected response
{
  "choices": [
    {"message": {"content": "Hello!"}}
  ]
}

Troubleshooting

Common Issues

| Issue | Cause | Solution |
|-------|-------|----------|
| 401 Unauthorized | Invalid API key | Check .env file |
| 429 Rate Limit | Too many requests | Wait 30s or upgrade |
| Connection timeout | Network issue | Check internet connection |
| Empty response | Model overloaded | Retry or switch model |
| Invalid JSON | Malformed request | Validate request structure |

Debug Mode

# Enable verbose logging
RUST_LOG=debug cargo run

# Check what's being sent
curl -v https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer $OPENROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d @request.json

Resources


Conclusion

This API documentation covered:

✅ Authentication: API key setup and security
✅ Endpoints: Chat completions endpoint
✅ Request/Response: JSON format and parsing
✅ Error Handling: Status codes and retry logic
✅ Rate Limits: Free tier limits and handling
✅ Models: Available models and switching
✅ Best Practices: Optimization techniques
✅ Examples: Complete working code

Next steps:

  1. Get your API key from OpenRouter
  2. Add to .env file
  3. Test with example code
  4. Integrate into your project