llm-api-server

A Spring Boot backend service that manages chat sessions, persists conversation history, and orchestrates interactions with the LLM gateway service.

This service is the core orchestration layer in a multi-service AI system. It sits between the client application and the LLM gateway, which in turn calls the LLM provider.

Overview

The llm-api-server is designed to handle:

Chat session lifecycle management
Persistent storage of conversation history (PostgreSQL)
Integration with the LLM gateway (llm-api-gateway)
Structured request validation and error handling
Clean separation of concerns using layered architecture

It decouples the client from the LLM gateway and provider, acting as the stateful backend service for conversational applications.

System Architecture

Angular Client → Spring Boot (llm-api-server) → Django (llm-api-gateway) → Groq API

Responsibilities of this service:

Manage chat sessions (create, retrieve, delete)
Store and retrieve message history
Call the LLM gateway with contextual conversation data
Aggregate and return structured responses
Ensure transactional consistency and data integrity
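
The responsibilities above can be sketched as a single request flow. This is a minimal in-memory illustration, assuming Java 17 or later; ChatFlowSketch, Message, and ChatResponse are hypothetical names, the map stands in for PostgreSQL, and the BiFunction stands in for the REST call to llm-api-gateway. The order of the steps is one plausible choice, not necessarily what the service does.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;
import java.util.function.BiFunction;

// Hypothetical illustration of the request flow; not the service's actual classes.
final class ChatFlowSketch {
    record Message(String role, String content) {}
    record ChatResponse(String prompt, String result, String sessionId) {}

    // Stand-in for the database: session id -> ordered messages.
    private final Map<String, List<Message>> store = new LinkedHashMap<>();
    // Stand-in for the gateway call: (prompt, prior history) -> reply text.
    private final BiFunction<String, List<Message>, String> gateway;

    ChatFlowSketch(BiFunction<String, List<Message>, String> gateway) {
        this.gateway = gateway;
    }

    ChatResponse send(String prompt, String sessionId) {
        // 1. Reuse the session if an id was supplied, otherwise create one.
        String id = (sessionId == null || sessionId.isBlank())
                ? UUID.randomUUID().toString()
                : sessionId;
        List<Message> history = store.computeIfAbsent(id, key -> new ArrayList<>());

        // 2. Call the gateway with the prior turns as context.
        String result = gateway.apply(prompt, List.copyOf(history));

        // 3. Record both turns only once the gateway call has returned.
        history.add(new Message("user", prompt));
        history.add(new Message("assistant", result));

        // 4. Return the structured response.
        return new ChatResponse(prompt, result, id);
    }

    List<Message> messages(String sessionId) {
        return List.copyOf(store.getOrDefault(sessionId, List.of()));
    }
}
```

In the real service, steps 1 and 3 would go through the repository layer inside a transaction rather than a map.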

Tech Stack

Spring Boot
Spring Web (REST APIs)
Spring Data JPA
PostgreSQL
Docker / Docker Compose
Lombok

Key Features

1. Session Management

UUID-based session handling
Automatic session creation if not provided
Title generation based on first prompt
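
A minimal sketch of how session resolution and title generation could work. SessionSupport, its method names, and the 40-character title limit are assumptions made for illustration; the README does not state the actual truncation rule.

```java
import java.util.UUID;

// Hypothetical helper; names and the title length limit are illustrative assumptions.
final class SessionSupport {
    static final int MAX_TITLE_LENGTH = 40; // assumed limit, not taken from the project

    private SessionSupport() {}

    // Returns the supplied session id, or a fresh random UUID when none was sent.
    // Throws IllegalArgumentException if the supplied value is not a UUID.
    static UUID resolveSessionId(String sessionId) {
        if (sessionId == null || sessionId.isBlank()) {
            return UUID.randomUUID();
        }
        return UUID.fromString(sessionId.trim());
    }

    // Derives a short title from the first prompt of a session.
    static String titleFrom(String firstPrompt) {
        String normalized = firstPrompt.trim().replaceAll("\\s+", " ");
        if (normalized.length() <= MAX_TITLE_LENGTH) {
            return normalized;
        }
        return normalized.substring(0, MAX_TITLE_LENGTH - 3) + "...";
    }
}
```

For example, SessionSupport.titleFrom("Explain hashmap in Java") returns the prompt unchanged because it fits within the assumed limit.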

2. Persistent Conversation Storage

Messages stored in PostgreSQL
Indexed queries for efficient retrieval
Ordered message history per session

3. LLM Gateway Integration

Communicates with llm-api-gateway via REST
Sends prompt + conversation history
Handles downstream failures gracefully
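
The gateway call might look roughly like the sketch below. It uses the JDK's java.net.http.HttpClient instead of whichever Spring client the project actually uses, and it assumes Java 17 or later. The payload shape (a "history" array of role/content objects), the timeouts, and the names GatewayClientSketch, Turn, and GatewayException are all assumptions; only the idea of sending prompt plus history and converting downstream failures into a single exception type comes from this README.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical client; the real service may use a different HTTP client and payload.
final class GatewayClientSketch {
    record Turn(String role, String content) {}

    static final class GatewayException extends RuntimeException {
        GatewayException(String message) { super(message); }
        GatewayException(String message, Throwable cause) { super(message, cause); }
    }

    private final HttpClient http = HttpClient.newBuilder()
            .connectTimeout(Duration.ofSeconds(5)) // assumed timeout
            .build();
    private final URI endpoint;

    GatewayClientSketch(String gatewayUrl) {
        this.endpoint = URI.create(gatewayUrl); // e.g. the DJANGO_URL value
    }

    // Builds the JSON body by hand to keep the sketch dependency-free.
    static String toJson(String prompt, List<Turn> history) {
        String turns = history.stream()
                .map(t -> "{\"role\":" + quote(t.role()) + ",\"content\":" + quote(t.content()) + "}")
                .collect(Collectors.joining(","));
        return "{\"prompt\":" + quote(prompt) + ",\"history\":[" + turns + "]}";
    }

    // Escapes a string as a JSON string literal.
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"': sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
                    else sb.append(c);
            }
        }
        return sb.append('"').toString();
    }

    // Sends prompt + history; any downstream failure surfaces as GatewayException.
    String send(String prompt, List<Turn> history) {
        HttpRequest request = HttpRequest.newBuilder(endpoint)
                .timeout(Duration.ofSeconds(30)) // assumed timeout
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(toJson(prompt, history)))
                .build();
        try {
            HttpResponse<String> response = http.send(request, HttpResponse.BodyHandlers.ofString());
            if (response.statusCode() / 100 != 2) {
                throw new GatewayException("Gateway returned HTTP " + response.statusCode());
            }
            return response.body();
        } catch (IOException e) {
            throw new GatewayException("Gateway unreachable", e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            throw new GatewayException("Interrupted while calling gateway", e);
        }
    }
}
```

Funnelling every failure into one exception type lets a central exception handler translate it into a structured error response for the client.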

4. Layered Architecture

Controller → API layer
Service → business logic & orchestration
Repository → database access
Transform → DTO ↔ Entity mapping
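
To illustrate the Transform layer, here is a minimal mapper sketch, assuming Java 17 or later. ChatMessageEntity, MessageDto, and MessageTransform are hypothetical names; the entity fields follow the chat_messages columns listed under Database Design, while the choice of which fields the DTO exposes is an assumption.

```java
import java.time.Instant;
import java.util.UUID;

// Hypothetical entity: mirrors a chat_messages row, including internal columns.
final class ChatMessageEntity {
    Long id;
    UUID sessionId;
    String role;
    String content;
    Instant timestamp;
}

// Hypothetical DTO: only the fields the API chooses to expose.
record MessageDto(String role, String content, Instant timestamp) {}

final class MessageTransform {
    private MessageTransform() {}

    // Entity -> DTO: drops the database id and the foreign key.
    static MessageDto toDto(ChatMessageEntity entity) {
        return new MessageDto(entity.role, entity.content, entity.timestamp);
    }

    // DTO -> Entity: the caller supplies the owning session.
    static ChatMessageEntity toEntity(MessageDto dto, UUID sessionId) {
        ChatMessageEntity entity = new ChatMessageEntity();
        entity.sessionId = sessionId;
        entity.role = dto.role();
        entity.content = dto.content();
        entity.timestamp = dto.timestamp() != null ? dto.timestamp() : Instant.now();
        return entity; // id stays null until the database assigns it
    }
}
```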

5. Validation & Error Handling

Bean validation (@Valid, constraints)
Centralized exception handling (@RestControllerAdvice)
Structured error responses
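
The service relies on Bean Validation annotations and @RestControllerAdvice, which need the Spring runtime. As a dependency-free stand-in, the sketch below mimics what those two pieces produce together: a request check that yields a structured error body. SendRequestValidator, ErrorResponse, and its field names are hypothetical, and the rules (non-blank prompt, sessionId parseable as a UUID when present) are assumptions based on the Send Message request shown later. Java 17 or later is assumed.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.UUID;

// Hypothetical stand-in for @Valid + @RestControllerAdvice; not the project's code.
final class SendRequestValidator {
    // Assumed error body shape; the real field names are not documented here.
    record ErrorResponse(int status, String message, Map<String, String> fieldErrors) {}

    private SendRequestValidator() {}

    // Returns an empty Optional for a valid request, otherwise a 400-style error body.
    static Optional<ErrorResponse> validate(String prompt, String sessionId) {
        Map<String, String> errors = new LinkedHashMap<>();
        if (prompt == null || prompt.isBlank()) {
            errors.put("prompt", "must not be blank");
        }
        if (sessionId != null && !sessionId.isBlank()) {
            try {
                UUID.fromString(sessionId);
            } catch (IllegalArgumentException e) {
                errors.put("sessionId", "must be a valid UUID");
            }
        }
        return errors.isEmpty()
                ? Optional.empty()
                : Optional.of(new ErrorResponse(400, "Validation failed", errors));
    }
}
```

Collecting all field errors before responding lets the client fix every problem in one round trip.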

6. Transaction Management

Service-level transaction boundaries
Ensures consistency for session + message operations

API Endpoints

Send Message

POST /api/chat/send

Request

{
  "prompt": "Explain hashmap in Java",
  "sessionId": "optional-uuid"
}

Response

{
  "prompt": "Explain hashmap in Java",
  "result": "A HashMap in Java is...",
  "sessionId": "generated-or-existing-id"
}
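
A caller could build this request with the JDK HTTP client as sketched below. SendMessageRequest is a hypothetical helper, and the base URL http://localhost:8080 used in the usage note is an assumption derived from BACKEND_PORT=8080 in the example .env.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper that builds the POST /api/chat/send request.
final class SendMessageRequest {
    private SendMessageRequest() {}

    static HttpRequest build(String baseUrl, String jsonBody) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/api/chat/send"))
                .header("Content-Type", "application/json")
                .header("Accept", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
                .build();
    }
}
```

To send it, pass the result to HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()). Omit sessionId from the body on the first message and reuse the sessionId from the response on later ones.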

Get All Sessions

GET /api/chat/sessions

Returns all sessions, sorted by last update time (updated_at).

Get Session Messages

GET /api/chat/sessions/{sessionId}/messages

Returns ordered conversation history.

Delete Session

DELETE /api/chat/sessions/{sessionId}

Deletes a session and all associated messages.

Database Design

chat_sessions

id (UUID)
title
created_at
updated_at

chat_messages

id
session_id (FK)
role (user/assistant)
content
timestamp

Key Characteristics:

One-to-many relationship (session → messages)
Indexed for efficient retrieval
Cascade delete for cleanup
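
The three characteristics above can be modelled in memory to show the intended behaviour. This is only an illustration, assuming Java 17 or later: InMemoryChatStore is a hypothetical class, and in the real service the foreign key, ordering, and cascade are handled by PostgreSQL and JPA mappings rather than by code like this. Breaking timestamp ties by id is also an assumption.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

// Hypothetical in-memory model of chat_sessions and chat_messages.
final class InMemoryChatStore {
    record Message(long id, UUID sessionId, String role, String content, Instant timestamp) {}

    private final Map<UUID, String> sessionTitles = new HashMap<>(); // chat_sessions
    private final List<Message> messages = new ArrayList<>();        // chat_messages
    private long nextId = 1;

    UUID createSession(String title) {
        UUID id = UUID.randomUUID();
        sessionTitles.put(id, title);
        return id;
    }

    void addMessage(UUID sessionId, String role, String content, Instant timestamp) {
        // Mimics the foreign key: a message must reference an existing session.
        if (!sessionTitles.containsKey(sessionId)) {
            throw new IllegalArgumentException("Unknown session " + sessionId);
        }
        messages.add(new Message(nextId++, sessionId, role, content, timestamp));
    }

    // Similar in spirit to: WHERE session_id = ? ORDER BY timestamp, id
    List<Message> history(UUID sessionId) {
        return messages.stream()
                .filter(m -> m.sessionId().equals(sessionId))
                .sorted(Comparator.comparing(Message::timestamp).thenComparingLong(Message::id))
                .toList();
    }

    // Mimics cascade delete: removing a session removes its messages too.
    void deleteSession(UUID sessionId) {
        sessionTitles.remove(sessionId);
        messages.removeIf(m -> m.sessionId().equals(sessionId));
    }
}
```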

Environment Configuration

Example .env:

POSTGRES_HOST=postgres
POSTGRES_PORT=5432
POSTGRES_DB=llmdb
POSTGRES_USER=postgres
POSTGRES_PASSWORD=system

DJANGO_URL=http://django-llm:8000/api/v1/prompt/

BACKEND_PORT=8080
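
As a rough illustration of how the POSTGRES_* variables combine, the sketch below assembles a PostgreSQL JDBC URL from an environment map. DatabaseSettings is a hypothetical helper and the fallback values are assumptions; the project most likely wires these variables through Spring configuration properties instead.

```java
import java.util.Map;

// Hypothetical helper; shows how the .env values map onto a JDBC URL.
final class DatabaseSettings {
    private DatabaseSettings() {}

    static String jdbcUrl(Map<String, String> env) {
        String host = env.getOrDefault("POSTGRES_HOST", "localhost"); // assumed default
        String port = env.getOrDefault("POSTGRES_PORT", "5432");      // PostgreSQL's default port
        String database = required(env, "POSTGRES_DB");
        return "jdbc:postgresql://" + host + ":" + port + "/" + database;
    }

    // Fails fast when a mandatory variable is missing or blank.
    static String required(Map<String, String> env, String name) {
        String value = env.get(name);
        if (value == null || value.isBlank()) {
            throw new IllegalStateException("Missing environment variable " + name);
        }
        return value;
    }
}
```

With the example values, DatabaseSettings.jdbcUrl(System.getenv()) yields jdbc:postgresql://postgres:5432/llmdb.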

Running Locally

1. Build project

./mvnw clean package

2. Run application

./mvnw spring-boot:run

Running with Docker

docker-compose up --build

Services included:

PostgreSQL
pgAdmin
Spring Boot backend

Design Decisions

Why a Separate Backend Service?

Instead of calling the LLM gateway directly from the client:

Enables persistence of conversation history
Centralizes business logic
Improves scalability and maintainability
Allows integration of additional services in the future

Why Layered Architecture?

Separating concerns supports:

Maintainability
Testability
Clear boundaries between API, business logic, and data access

Why DTO + Transform Layer?

Prevents leaking internal entities
Enables strict API contracts
Decouples persistence from external interfaces

Future Improvements

Authentication & user accounts
Pagination for messages
Caching frequently accessed sessions
Streaming responses from LLM
Multi-tenant support
Metrics & monitoring (Prometheus/Grafana)

Related Services

llm-api-gateway → Django service for LLM interaction
llm-api-client → Angular frontend
