nicko-08/llm-api-server

llm-api-server

A Spring Boot-based backend service responsible for managing chat sessions, persisting conversation history, and orchestrating interactions with the LLM gateway service.

This service acts as the core orchestration layer in a multi-service AI system, ensuring reliable communication between the client application and the LLM provider.


Overview

The llm-api-server is designed to handle:

  • Chat session lifecycle management
  • Persistent storage of conversation history (PostgreSQL)
  • Integration with the LLM gateway (llm-api-gateway)
  • Structured request validation and error handling
  • Clean separation of concerns using layered architecture

It sits between the client and the LLM provider, so the client never talks to the provider directly, and acts as the stateful backend for conversational applications.


System Architecture

Angular Client → Spring Boot (llm-api-server) → Django (llm-api-gateway) → Groq API

Responsibilities of this service:

  • Manage chat sessions (create, retrieve, delete)
  • Store and retrieve message history
  • Call the LLM gateway with contextual conversation data
  • Aggregate and return structured responses
  • Ensure transactional consistency and data integrity

Tech Stack

  • Spring Boot
  • Spring Web (REST APIs)
  • Spring Data JPA
  • PostgreSQL
  • Docker / Docker Compose
  • Lombok

Key Features

1. Session Management

  • UUID-based session handling
  • Automatic session creation if not provided
  • Title generation based on first prompt
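The session rules above can be sketched in plain Java. This is a minimal illustration, not the repository's code: the class name `SessionSupport`, both method names, and the 40-character title limit are assumptions made for the example.

```java
import java.util.UUID;

// Hypothetical helper; names and the title limit are illustrative assumptions.
final class SessionSupport {

    // Assumed maximum title length; the actual service may use a different rule.
    static final int MAX_TITLE_LENGTH = 40;

    // Reuse the client's session id when present, otherwise create a new one.
    static UUID resolveSessionId(String sessionId) {
        if (sessionId == null || sessionId.isBlank()) {
            return UUID.randomUUID();
        }
        // Throws IllegalArgumentException if the string is not in UUID format.
        return UUID.fromString(sessionId.trim());
    }

    // Derive a title from the first prompt: collapse whitespace, then truncate.
    static String titleFromPrompt(String prompt) {
        String normalized = prompt.trim().replaceAll("\\s+", " ");
        if (normalized.length() <= MAX_TITLE_LENGTH) {
            return normalized;
        }
        return normalized.substring(0, MAX_TITLE_LENGTH - 3) + "...";
    }

    public static void main(String[] args) {
        System.out.println(resolveSessionId(null));
        System.out.println(titleFromPrompt("Explain hashmap in Java"));
    }
}
```

Truncating with an ellipsis keeps titles short enough for a session list while still showing the start of the first prompt.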

2. Persistent Conversation Storage

  • Messages stored in PostgreSQL
  • Indexed queries for efficient retrieval
  • Ordered message history per session

3. LLM Gateway Integration

  • Communicates with llm-api-gateway via REST
  • Sends prompt + conversation history
  • Handles downstream failures gracefully
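One way to keep downstream failures from leaking to callers is to hide the REST call behind an interface and translate any transport error into a single domain exception. The sketch below assumes this approach; `GatewayClient`, `Turn`, and `GatewayUnavailableException` are hypothetical names, and the actual service may handle failures differently.

```java
import java.util.List;

// Hypothetical sketch; none of these types are taken from the repository.
final class GatewaySketch {

    // One prior turn of the conversation; field names are assumptions.
    record Turn(String role, String content) {}

    // Abstraction over the REST call to llm-api-gateway.
    interface GatewayClient {
        String complete(String prompt, List<Turn> history) throws Exception;
    }

    // Domain-level failure so callers never see raw transport exceptions.
    static final class GatewayUnavailableException extends RuntimeException {
        GatewayUnavailableException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // Sends the prompt plus history and normalizes every failure mode.
    static String ask(GatewayClient client, String prompt, List<Turn> history) {
        try {
            String result = client.complete(prompt, List.copyOf(history));
            if (result == null || result.isBlank()) {
                throw new GatewayUnavailableException("Gateway returned an empty result", null);
            }
            return result;
        } catch (GatewayUnavailableException e) {
            throw e; // already in domain form
        } catch (Exception e) {
            throw new GatewayUnavailableException("LLM gateway call failed", e);
        }
    }
}
```

A centralized exception handler can then map the single exception type to one consistent error response.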

4. Layered Architecture

  • Controller → API layer
  • Service → business logic & orchestration
  • Repository → database access
  • Transform → DTO ↔ Entity mapping

5. Validation & Error Handling

  • Bean validation (@Valid, constraints)
  • Centralized exception handling (@RestControllerAdvice)
  • Structured error responses
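The idea behind centralized handling is that every exception is converted to the same response shape in one place. The plain-Java sketch below mimics that mapping without Spring; the `ApiError` fields and the choice of status codes per exception type are assumptions for illustration, not the service's actual contract.

```java
import java.util.NoSuchElementException;

// Hypothetical sketch of exception-to-response mapping; field names are assumptions.
final class ErrorMappingSketch {

    // Assumed shape of a structured error body.
    record ApiError(int status, String error, String message) {}

    // Single place where exceptions become client-facing errors.
    static ApiError toApiError(Exception e) {
        if (e instanceof IllegalArgumentException) {
            return new ApiError(400, "Bad Request", e.getMessage());
        }
        if (e instanceof NoSuchElementException) {
            return new ApiError(404, "Not Found", e.getMessage());
        }
        // Avoid exposing internal details for unexpected failures.
        return new ApiError(500, "Internal Server Error", "Unexpected error");
    }
}
```

In a Spring application the same mapping would typically live in methods of a class annotated with @RestControllerAdvice.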

6. Transaction Management

  • Service-level transaction boundaries
  • Ensures consistency for session + message operations

API Endpoints

Send Message

POST /api/chat/send

Request

{
  "prompt": "Explain hashmap in Java",
  "sessionId": "optional-uuid"
}

Response

{
  "prompt": "Explain hashmap in Java",
  "result": "A HashMap in Java is...",
  "sessionId": "generated-or-existing-id"
}
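The request and response shapes above suggest a flow along these lines: validate the prompt, resolve or create the session, call the gateway with the stored history, then persist both turns. The sketch below is an in-memory approximation under those assumptions. The `SendFlowSketch` class, the map-based store, and the `BiFunction` gateway stand-in are all hypothetical; the real service uses PostgreSQL and a REST call.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;
import java.util.function.BiFunction;

// Hypothetical in-memory sketch of the send flow; not the repository's code.
final class SendFlowSketch {

    record SendRequest(String prompt, String sessionId) {}
    record SendResponse(String prompt, String result, String sessionId) {}
    record Message(String role, String content) {}

    // In-memory stand-in for the PostgreSQL-backed repositories.
    private final Map<String, List<Message>> store = new HashMap<>();
    // Stand-in for the gateway call: (prompt, history) -> result.
    private final BiFunction<String, List<Message>, String> gateway;

    SendFlowSketch(BiFunction<String, List<Message>, String> gateway) {
        this.gateway = gateway;
    }

    SendResponse send(SendRequest request) {
        if (request.prompt() == null || request.prompt().isBlank()) {
            throw new IllegalArgumentException("prompt must not be blank");
        }
        // Create a session id when the client did not supply one.
        String sessionId = (request.sessionId() == null || request.sessionId().isBlank())
                ? UUID.randomUUID().toString()
                : request.sessionId();
        List<Message> history = store.computeIfAbsent(sessionId, id -> new ArrayList<>());
        // Call the gateway first so a failed call leaves no half-written turn.
        String result = gateway.apply(request.prompt(), List.copyOf(history));
        history.add(new Message("user", request.prompt()));
        history.add(new Message("assistant", result));
        return new SendResponse(request.prompt(), result, sessionId);
    }

    List<Message> messages(String sessionId) {
        return List.copyOf(store.getOrDefault(sessionId, List.of()));
    }
}
```

Passing the returned sessionId back in the next request continues the same conversation, so the gateway receives the earlier turns as context.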

Get All Sessions

GET /api/chat/sessions

Returns all sessions sorted by last update.


Get Session Messages

GET /api/chat/sessions/{sessionId}/messages

Returns ordered conversation history.


Delete Session

DELETE /api/chat/sessions/{sessionId}

Deletes a session and all associated messages.


Database Design

chat_sessions

  • id (UUID)
  • title
  • created_at
  • updated_at

chat_messages

  • id
  • session_id (FK)
  • role (user/assistant)
  • content
  • timestamp

Key Characteristics:

  • One-to-many relationship (session → messages)
  • Indexed for efficient retrieval
  • Cascade delete for cleanup
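The two tables and their relationship can be modeled in plain Java to show the intended behavior. This is an in-memory sketch only: the record field types (for example `long` for the message id and `Instant` for timestamps) are assumptions, and the real service relies on Spring Data JPA and PostgreSQL for ordering and cascade delete.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.UUID;

// Hypothetical in-memory model of chat_sessions and chat_messages.
final class SchemaSketch {

    record ChatSession(UUID id, String title, Instant createdAt, Instant updatedAt) {}
    record ChatMessage(long id, UUID sessionId, String role, String content, Instant timestamp) {}

    final List<ChatSession> sessions = new ArrayList<>();
    final List<ChatMessage> messages = new ArrayList<>();

    // Conversation history for one session, oldest first (id breaks timestamp ties).
    List<ChatMessage> history(UUID sessionId) {
        return messages.stream()
                .filter(m -> m.sessionId().equals(sessionId))
                .sorted(Comparator.comparing(ChatMessage::timestamp)
                        .thenComparingLong(ChatMessage::id))
                .toList();
    }

    // Mimics cascade delete: removing a session also removes its messages.
    void deleteSession(UUID sessionId) {
        sessions.removeIf(s -> s.id().equals(sessionId));
        messages.removeIf(m -> m.sessionId().equals(sessionId));
    }
}
```

In the database, an index on the session_id column would serve the same lookup that the stream filter performs here.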

Environment Configuration

Example .env:

POSTGRES_HOST=postgres
POSTGRES_PORT=5432
POSTGRES_DB=llmdb
POSTGRES_USER=postgres
POSTGRES_PASSWORD=system

DJANGO_URL=http://django-llm:8000/api/v1/prompt/

BACKEND_PORT=8080
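To show how the PostgreSQL variables fit together, the sketch below assembles a JDBC URL from them. How the application actually maps these variables to Spring properties is not shown in this README, so the `EnvConfigSketch` class and the fallback values are assumptions.

```java
import java.util.Map;

// Hypothetical helper; the real application may wire these values differently.
final class EnvConfigSketch {

    // Read a variable, falling back to an assumed default when unset or blank.
    static String get(Map<String, String> env, String key, String fallback) {
        String value = env.get(key);
        return (value == null || value.isBlank()) ? fallback : value;
    }

    // Build the PostgreSQL JDBC URL in the form jdbc:postgresql://host:port/database.
    static String jdbcUrl(Map<String, String> env) {
        return "jdbc:postgresql://"
                + get(env, "POSTGRES_HOST", "localhost") + ":"
                + get(env, "POSTGRES_PORT", "5432") + "/"
                + get(env, "POSTGRES_DB", "llmdb");
    }

    public static void main(String[] args) {
        System.out.println(jdbcUrl(System.getenv()));
    }
}
```

With the example .env above, the resulting URL points at the `postgres` host, which matches a Docker Compose service name rather than a local machine.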

Running Locally

1. Build project

./mvnw clean package

2. Run application

./mvnw spring-boot:run

Running with Docker

docker-compose up --build

Services included:

  • PostgreSQL
  • pgAdmin
  • Spring Boot backend

Design Decisions

Why a Separate Backend Service?

Routing requests through this service, rather than calling the LLM gateway directly from the client:

  • Enables persistence of conversation history
  • Centralizes business logic
  • Improves scalability and maintainability
  • Allows integration of additional services in the future

Why Layered Architecture?

Separating concerns ensures:

  • Maintainability
  • Testability
  • Clear boundaries between API, business logic, and data access

Why DTO + Transform Layer?

  • Prevents leaking internal entities
  • Enables strict API contracts
  • Decouples persistence from external interfaces
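A transform step might look like the sketch below, which copies only the fields the API exposes from an entity into a DTO. `MessageEntity`, `MessageDto`, and the choice to omit the database id and format the timestamp as an ISO-8601 string are assumptions for illustration.

```java
import java.time.Instant;

// Hypothetical sketch of entity-to-DTO mapping; not the repository's classes.
final class TransformSketch {

    // Internal persistence model (stand-in for a JPA entity).
    static final class MessageEntity {
        final long id;
        final String role;
        final String content;
        final Instant timestamp;

        MessageEntity(long id, String role, String content, Instant timestamp) {
            this.id = id;
            this.role = role;
            this.content = content;
            this.timestamp = timestamp;
        }
    }

    // External API shape: no database id, timestamp as an ISO-8601 string.
    record MessageDto(String role, String content, String timestamp) {}

    static MessageDto toDto(MessageEntity entity) {
        return new MessageDto(entity.role, entity.content, entity.timestamp.toString());
    }
}
```

Because callers only ever see `MessageDto`, the entity can gain or rename columns without changing the API contract.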

Future Improvements

  • Authentication & user accounts
  • Pagination for messages
  • Caching frequently accessed sessions
  • Streaming responses from LLM
  • Multi-tenant support
  • Metrics & monitoring (Prometheus/Grafana)

Related Services

  • llm-api-gateway → Django service for LLM interaction
  • llm-api-client → Angular frontend
