Offline AI Chat

Offline AI Chat is a local CLI project for running GGUF models through llama.cpp inside an isolated Docker container without network access during runtime.

The project is intended for local interaction with an LLM on a user-controlled device in a predictable execution environment. The selected GGUF model is integrity-checked, added to the Docker image during the build stage, and then executed inside an isolated container without published TCP ports. Communication with the model is performed through a Unix socket and a console interface.
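The socket-based communication described above implies a readiness check: the launcher has to wait until llama.cpp has created and bound its Unix socket before opening the chat. A minimal sketch of such a wait loop is shown below; the socket path, timeout, and polling interval are illustrative assumptions, not the project's actual values.

```python
import socket
import time

def wait_for_unix_socket(path: str, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Poll a Unix domain socket until a connection succeeds or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        try:
            s.connect(path)  # succeeds once the server has bound and listened
            return True
        except OSError:
            time.sleep(interval)  # socket not ready yet; retry
        finally:
            s.close()
    return False
```

A launcher would call `wait_for_unix_socket("/run/llama/llama.sock")` (hypothetical path) before sending the first request.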

1. Quick Start

Clone the repository

git clone git@github.com:eZer-Net/Offline-AI-Chat.git
cd Offline-AI-Chat

Requirements

  • Linux host
  • Python 3.10+
  • Docker
  • user access to the Docker Engine (typically membership in the docker group)
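The requirements above can be checked before the first launch. The following sketch is an illustrative preflight check, not part of the project's own code:

```python
import shutil
import sys

def preflight(min_python=(3, 10)) -> list:
    """Return a list of human-readable problems; an empty list means the host looks ready."""
    problems = []
    if sys.version_info < min_python:
        problems.append(
            f"Python {min_python[0]}.{min_python[1]}+ required, found {sys.version.split()[0]}"
        )
    if shutil.which("docker") is None:
        problems.append("docker binary not found on PATH")
    if not sys.platform.startswith("linux"):
        problems.append("a Linux host is expected")
    return problems
```

If `preflight()` returns a non-empty list, the issues should be resolved before running `python3 run.py`.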

First launch

Run from the project root:

python3 run.py

Main menu:

1) Offline AI Chat
2) System Analyzer
3) Exit

Recommended first-run order:

  1. Start System Analyzer
  2. Review the suggested host profile
  3. Update .env if required
  4. Return to the main menu
  5. Start Offline AI Chat

When network access is required

Network access is required only in the following cases:

  • the base Docker image is not available locally;
  • the required GGUF files are not available locally and must be downloaded from the model catalog.

After the model files are prepared and the image is built, the chat scenario runs fully locally, with no network access inside the container.
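An offline container of this kind is typically started with Docker's `--network none` mode and no `-p` port publications. The sketch below composes such a `docker run` command; the image name and mount path are illustrative assumptions, not the project's actual values.

```python
def build_run_command(image: str, socket_dir: str) -> list:
    """Compose a `docker run` argv for an offline container: no network stack,
    no published TCP ports, and a host directory shared only for the Unix socket."""
    return [
        "docker", "run", "--rm",
        "--network", "none",                # disable networking inside the container
        "-v", f"{socket_dir}:/run/llama",   # expose only the socket directory
        image,
    ]
```

The resulting argv could be passed to `subprocess.run`; the key property is the absence of any `-p`/`--publish` flags combined with `--network none`.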

2. Project Functionality

The project implements two user scenarios.

Offline AI Chat

This scenario starts a local GGUF model and opens an interactive CLI chat.

The workflow includes:

  • reading the model catalog from _work-models/catalog.json;
  • selecting a model from the console menu;
  • checking local GGUF files;
  • downloading missing files from catalog-defined URLs;
  • verifying each model file by SHA256;
  • preparing a temporary Docker build context;
  • building a Docker image with the selected model baked into it;
  • starting llama.cpp inside an isolated container;
  • waiting for the Unix socket to become available;
  • checking API readiness and performing a smoke test;
  • starting the interactive CLI chat.
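The SHA256 verification step in the workflow above can be sketched as follows. This is an illustrative implementation, assuming the catalog stores a hex-encoded digest per file; it streams the file in chunks so multi-gigabyte GGUF files fit in constant memory.

```python
import hashlib
from pathlib import Path

def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA256 digest of a file by streaming 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_model(path: Path, expected_sha256: str) -> bool:
    """Compare the computed digest with the catalog entry (case-insensitive)."""
    return sha256_of(path) == expected_sha256.lower()
```

A model file that fails `verify_model` would be rejected before it can be baked into the Docker image.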

System Analyzer

This scenario evaluates Linux host resources and suggests values for .env:

  • CTX_SIZE
  • MEM_LIMIT
  • CPU_LIMIT
  • PIDS_LIMIT

After user confirmation, the selected values are written to .env and then used as runtime container limits.
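The analyzer's output can be sketched as a mapping from host resources to the four `.env` keys listed above. The heuristics below (context size tiers, memory headroom, one reserved CPU core, a fixed PID cap) are illustrative assumptions, not the project's actual formulas.

```python
def suggest_env(total_mem_gib: float, cpu_count: int) -> dict:
    """Derive conservative container limits from host resources (illustrative heuristics)."""
    return {
        "CTX_SIZE": str(4096 if total_mem_gib >= 16 else 2048),
        "MEM_LIMIT": f"{max(2, int(total_mem_gib * 0.6))}g",  # leave headroom for the host
        "CPU_LIMIT": str(max(1, cpu_count - 1)),              # keep one core free
        "PIDS_LIMIT": "256",                                  # cap fork bombs inside the container
    }

def write_env(values: dict, path: str = ".env") -> None:
    """Persist the confirmed values as KEY=VALUE lines."""
    with open(path, "w") as f:
        for key, value in values.items():
            f.write(f"{key}={value}\n")
```

For example, a 16 GiB, 8-core host would be suggested `CTX_SIZE=4096`, `MEM_LIMIT=9g`, `CPU_LIMIT=7`, `PIDS_LIMIT=256`.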

3. Project Purpose

The project addresses a practical security-driven use case: local work with a language model in a controlled environment.

A key design decision is that the selected GGUF model is baked into the Docker image during the build stage. This avoids operating the model as a separate runtime mount and simplifies deployment in environments where artifact control, predictability, and reduction of data exposure paths are critical.

From an information security perspective, the project follows a conservative principle:

everything that cannot be verified and controlled should be treated as unsafe by default.

Work with AI agents and external LLM services remains one of the major concerns for large organizations and internal environments because it creates risks such as:

  • leakage of sensitive data through external AI APIs;
  • loss of control over where user prompts and context are processed;
  • inability to validate the actual execution environment of the model;
  • dependency on third-party infrastructure and its data handling policies.

This project provides a local alternative that is simple to operate and easier to reason about from a security standpoint:

  • the model files remain under local control;
  • file integrity is verified by SHA256 before use;
  • the model is built into a Docker image with a fixed content set;
  • the runtime container is started without network access;
  • interaction with the model is performed locally through CLI.

As a result, the project provides a practical way to run a local LLM on a user-owned device and maintain a dialog with the model without sending the interaction flow to an external cloud AI service.

4. Project Documentation

Detailed technical documentation is available in the following files.

English documentation

Russian documentation

5. Conclusion

Offline AI Chat is an open source solution for local execution of GGUF models in an isolated Docker runtime with a CLI interface and baseline environment control.

The project is distributed under the MIT license. The license text is available in LICENSE.

This open source solution was created by the digital-shield.tech community.
