Skip to content

truehot/LMLocal

Repository files navigation

🤖 LMLocal

LMLocal is a local AI chat assistant for Visual Studio 2022/2026. It integrates with LM Studio, Ollama, Jan and OpenAI-compatible APIs to provide context-aware assistance.

Visual Studio License Status


Note

Safe & Controlled: LMLocal is strictly read-only by default. Writing tools are completely optional, must be turned on in Settings, and can be rolled back in one click.

Preview Release: This extension is currently in preview; features and UI are actively evolving.


✨ Features

Interface & User Experience

  • ☁️ In-IDE Chat UI – Tool window for LLM interaction without switching applications.
  • 🌊 Streaming Responses – Real-time token delivery for low-latency feedback.
  • 🤖 Model Selection – Quick access to switch between available AI models directly from the chat interface.
  • 🎨 Visual Themes – Multi-theme support (Dark, Mid-Dark, Mid-Light, Light).
  • 📋 Quick Copy – A button above code blocks that copies the code to your clipboard.
  • ↕️ Collapse Large Code Blocks – Limits the height of long code snippets with a scrollbar and an expand option.
  • 🎭 Role-Based Presets (Instructions) – A window with pre-defined AI presets. You can customize each preset's system prompt and temperature, or toggle them on/off.

Context & Solution Awareness

  • 🛠️ Advanced AI Tool Integration – Allows the AI to deeply analyze your open solution, read file contents, and execute actions like building the solution, formatting documents, or running unit tests.
  • 📝 Automated Code Editing – Enabled tools can automatically create, delete, or modify code files directly inside Visual Studio.
  • 🛡️ Changes & Rollback Manager – Shows all file modifications in a dedicated real-time panel above the chat, allowing you to review diffs, accept changes, or roll them back in one click.
  • Active Window Context – Dedicated "+" button to instantly include active editor content in the request.
  • 🧠 Thought/Reasoning Support – Support for reasoning models; "thoughts" are displayed in expandable blocks.
  • 🛡️ Smart History Buffering – Automatically hides messages beyond the 200-entry limit to keep the UI responsive.

Efficiency & Token Management

  • 📉 Conversation Summarization – Condenses older messages into a concise overview when the conversation grows long.
  • 🧹 History Optimization – Strips markdown formatting and trims extra whitespace to reduce token usage.
  • 📊 Live Stats – Status bar metrics: real-time speed (tokens/sec) and total token count.

Infrastructure & Settings

  • ⚙️ Persistent Settings – Centralized configuration for API URLs, stream timeouts, and history management.
  • 🔌 Connect on Startup – Automatically connects to the LLM server on extension startup.
  • Customizable Timeout – Adjustable streaming inactivity limit for slower local models (0 = never timeout).
  • 📂 Local Chat Logging – Saves all conversations to disk in %LOCALAPPDATA%\LMLocalChat\ChatHistory\ for future reference in .jsonl format.
  • 🌐 Streamable MCP Support – Integrates with the Model Context Protocol to dynamically scale the AI's toolkit via both local process-based (stdio) and remote network-based (http) transports.

🛠 Requirements

To use LMLocal, ensure you have:

  • Visual Studio 2022 or 2026
  • One of the following backends (installed and running):
    • LM Studio with local server at http://127.0.0.1:1234
    • Ollama with server at http://127.0.0.1:11434 and a loaded model
    • Jan with server at http://127.0.0.1:1337
    • Any OpenAI-compatible API (custom URL and optional key)
  • A chat-capable LLM loaded

🚀 Installation

Option 1: Visual Studio Marketplace (Recommended)

  1. Open Visual Studio.
  2. Go to Extensions > Manage Extensions.
  3. Search for LM Local and click Download.
  4. Restart Visual Studio to complete the installation.

Option 2: Manual VSIX

  1. Download the .vsix file from the Marketplace.
  2. Double-click the file and follow the VSIX Installer prompts.

🏁 Getting Started

Part 1. Initial Setup (One-Time Configuration)

  1. Launch: Open the LM Local Chat tool window using one of the following methods:
    • Method A: Open it directly from the top Extensions menu.
    • Method B: In the top menu, go to View ➔ Other Windows ➔ LM Local Chat.
  2. Position the Window (Optional): Click and drag the opened window to dock it wherever is most convenient for your workflow—for example, right next to the Solution Explorer.
  3. Configure Your Provider:
    • Click the menu icon () and open Settings....
    • Under the AI Provider section, select your preferred backend from the dropdown menu:
      • LM Studio (local) – Automatically targets http://127.0.0.1:1234
      • Ollama (local) – Automatically targets http://127.0.0.1:11434
      • Jan (local) – Automatically targets http://127.0.0.1:1337
      • OpenAI compatible (custom) – Allows you to supply a custom base URL and authorization keys for remote endpoints or custom gateways.
    • Note: Choosing a local provider automatically configures the correct default port and endpoint structure.
    • Tip: If you have multiple providers, it is recommended to set them up first via the "Providers..." menu option.
  4. Verify the Connection:
    • Click the "Test" button located directly to the right of the API Base URL input field.
    • This instantly pings the specified endpoint to verify if the server is active, accessible, and correctly responding.

Part 2. How to Use the Chat

  1. Select an Instruction Preset (Optional): Open the AI Instructions... window from the menu to select from pre-defined AI presets.
    • Each preset has its own pre-configured system prompt and temperature.
    • You can toggle individual presets or parameters on/off. If a custom instruction or preset is disabled, it will be automatically hidden in the main chat selection dropdown.
  2. Context (Optional): Click the + button to include the entire content of the active document into the conversation.
  3. Chat: Type your message and click Send or hit Enter ⌨️.

💡 Interface & Interaction Tips

  • Keyboard Shortcuts: Standard hotkeys work perfectly inside the chat window—use Ctrl + C to copy text and Ctrl + V to paste your messages (the right-click context menu is disabled).
  • Copying AI Code: To copy code blocks generated by the model, click the Copy button located in the top-right corner of the code block.
  • Model Reasoning: The model's internal thinking process is neatly hidden inside the collapsible Thoughts block at the beginning of the response. Click it anytime to expand and view the full logic.
  • Token & Context Tracking: Hover your mouse over the top connection bar (where the model name is shown). If supported by your provider (like LM Studio), a tooltip will appear showing exactly how many tokens have been consumed out of the maximum available context limit.
  • Model Selection: Click the model name in the top header to open the Select model window, where you can search, filter, and quickly switch between available models.
  • Active Window Context: Click the + button to instantly include the entire content of the file currently open within your active Visual Studio solution.
    • Auto-turn off: The button automatically deactivates after the request is sent, as the document becomes part of the active chat history.
    • UI & Logs: The attached file content is kept hidden to avoid cluttering the chat UI, but it is tracked and visible in the extension logs.
  • ⏹️ Stop – Cancel an active generation.
  • 🗑️ Clear chat... – Click the menu icon () to wipe the current session history and start fresh. Use this to clear the chat context if the history is consuming too many tokens.

🎭 AI Instructions & Modes

The "AI Instructions..." window allows you to define specialized System Prompts (roles) and creativity levels (temperature) for different development tasks. The extension comes with pre-configured behavior templates like Default, Improve, Review, Plan, Bugfix, Explain, and Tests.

Tip

Performance & Accuracy Tip: > For the best results, always select your desired mode (e.g., Bugfix, Review, Explain) before sending your message.

Once configured, you can instantly switch between these system roles using the dropdown menu directly in the main chat bar.

How to Customize Modes:

  1. Click the menu icon () and select "AI Instructions...".
  2. Select a target mode/role from the left panel (e.g., Review or Bugfix).
  3. Configure its behavior in the right panel:
    • Mode Toggle Checkbox: Check or uncheck this box to show or hide this specific mode in your main chat bar dropdown.
    • System Prompt: Enter the base instructions that define the AI's role, processing rules, and operational constraints (e.g., telling the Tests mode to act as a QA Engineer and strictly generate xUnit tests in C#).
    • Temperature: Set the randomness/creativity threshold. Use values closer to 0 (e.g., 0.1 or 0.2) for rigid, deterministic tasks like compiling and bug fixing, and closer to 1 for architectural planning or brainstorming.

    💡 Note: Always check your specific model's official documentation for recommended temperature settings, as some local models require strict defaults or a value of 0 to function properly without breaking formatting or structure.

  4. Click Save to apply the changes to your chat environment.

⚙️ Providers

The "Providers..." dialog allows you to create and save multiple provider profiles (servers) so you don't have to re-enter your API keys and base URLs every time. You can store as many profiles as you need, including both local servers (like Ollama running on your machine) and cloud remote services (like Groq, OpenAI, or Gemini).

Once configured, you can seamlessly switch between your saved profiles via the main settings.

🔒 Privacy & Data Usage Note: Unlike local servers which keep 100% of your data offline on your machine, cloud remote providers process your requests on external servers. Data retention policies vary significantly by provider - some services may use your prompt history and codebase context for model training by default. Always verify the provider's privacy policy and terms of service before transmitting proprietary or sensitive source code.

Here is a quick end-to-end example of how to configure a custom remote endpoint and activate it inside the extension.

Step 1: Create the Provider Profile

  1. Click the menu icon () and select "Providers...".
  2. Click "+ Add Profile" and fill in the fields:
    • Profile name: Ollama cloud
    • Provider type: Select OpenAI compatible from the dropdown.
    • API base URL: https://ollama.com/
    • API key: Enter your cloud provider API key.
    • 💡 Note: The extension allows any profile names, but if you create multiple profiles with completely identical fields, the system will always use the first one.*

  3. Click Apply, then click Save Changes to close the window.

Step 2: Switch to the New Provider

  1. Open the menu () again and select "Settings...".
  2. Under the AI Provider dropdown, select your newly created Ollama cloud profile.
  3. Save settings, and you are ready to chat!

Step 3: Select Your Model

  1. Click the model name (or "Select model..." placeholder) in the top header.
  2. Search, filter, and select your desired model from the window to activate it.

🆓 Free to Try (Free Limited Tiers / Credits Available)

Provider Provider Type API Base URL
Ollama OpenAI compatible https://ollama.com/
Groq OpenAI compatible https://api.groq.com/openai/
Mistral OpenAI compatible https://api.mistral.ai/
Cohere OpenAI compatible https://api.cohere.ai/compatibility/
OpenRouter OpenAI compatible https://openrouter.ai/api/
Google AI Studio Gemini (cloud) https://generativelanguage.googleapis.com
GitHub Models Github Models via Azure (cloud) https://models.inference.ai.azure.com/

💳 Pay to Try (Commercial / Premium)

Provider Provider Type API Base URL
OpenAI OpenAI compatible https://api.openai.com
DeepSeek DeepSeek (cloud) https://api.deepseek.com

Built‑in AI Tools

These tools let the AI read your code, search, edit files, build, and run tests. You control what it can do.

What you can control

In Settings (two checkboxes):

  • Enable built‑in AI tools (read‑only) – The AI can open and read files, but cannot change anything.
  • Enable built‑in AI tools (write/modify) – The AI can create, change, or delete files.

Tip: Turn on write/modify only for projects that are under version control (e.g., Git). That way you can always see what changed and revert if needed.

In the Built‑in Tools… dialog (list of built‑in tools):
Open this from the extension menu. You’ll see all built‑in tools (for example, delete_file, replace_file_content). You can enable or disable each tool separately. Even if the global write/modify checkbox is on, you can still turn off specific tools like delete_file. Use “Enable All” or “Disable All” to change many at once, then click Save.

Changes panel – see what was changed and revert if needed

When a tool edits a file, the changes are applied immediately to the actual files in your solution. LMLocal tracks all modified files and shows them in a collapsible Changes panel inside the chat window. This list persists across solution reloads and Visual Studio restarts, so you can always review what the AI did.

The panel lets you:

  • Click any file to see a diff of the changes.
  • See labels: New, Modified, or Deleted next to each file.
  • Switch between List view and Tree view.
  • Click Review all – opens a side‑by‑side diff window for all changed files.
  • Click Open all – opens all changed files in Visual Studio editor tabs.
  • Click Discard all – reverts all changes using internal backups (files are restored to their state before the AI edits).
  • Click Accept all – confirms the changes, removes the internal backups, and clears the list (you can no longer revert them afterward).

List of built‑in tools

Files and projects

  • create_file – Creates a new file with initial content.
  • delete_file – Deletes a file from the solution.
  • find_files – Searches for files by name.
  • list_directory – Lists files and folders in a given path.
  • get_solution_overview – Returns a summary of projects, folders, and files.
  • set_file_project_status – Includes or excludes a file from a project.

Reading file content

  • read_file_lines – Reads a specific range of lines.
  • search_file_content – Searches for a text string (case‑insensitive) inside solution files.
  • get_active_document – Returns the path and full text of the currently open document.

Editing and formatting code

  • replace_file_content – Replaces the entire file with new text.
  • replace_file_lines – Replaces a range of lines (by numbers) with new content.
  • insert_file_lines – Inserts lines at a specific position.
  • format_document – Applies Visual Studio’s code formatting to the file.
  • optimize_usings – Removes unused using statements and sorts the rest in C# files.

Analysing code

  • inspect_type – Shows members, base types, interfaces, and dependencies of a class/struct/interface.
  • find_symbol_references – Finds all references to a symbol (class, method, etc.) across the solution, with line numbers and context (uses Roslyn).

Build and tests

  • build_solution – Builds the whole solution (runs asynchronously).
  • run_tests – Runs dotnet test for a specific .csproj and shows live output.

📉 History Optimization: Strip Formatting

When the "Strip formatting from history" option is enabled in the extension settings, LMLocal automatically runs a cleanup pass on previous conversation turns before forwarding the payload to your AI backend. This reduces token overhead for local models by flattening structural Markdown syntax into lightweight plain text.

Note

Under the Hood Only: This optimization is invisible in the user interface. Your active chat window will always display responses with full Markdown rendering, code highlighting, and structural styling. The stripping process only alters the raw background history array sent to the model to save context tokens.

🧹 What Gets Removed / Transformed:

  • Code Block Enclosures: Triple backticks (```) are stripped; code contents remain as plain text.
  • Headers: Heading markers (#, ##, etc.) are removed, keeping only the text content.
  • Text Emphasis: Bold (**text**text), italics (*text*text), and strikethroughs (~~text~~text) are flattened.
  • Inline Code: Inline backticks (codecode) are dropped.
  • Links & Media: Hyperlinks ([label](url)label) and images (![alt](url)alt) discard their URLs/paths, preserving only their descriptive text labels.
  • List & Structural Layouts: Bullets (-, *, +, 1., 2.), blockquote symbols (> ), and horizontal rules (---, ***, ___) are erased.
  • Whitespace Compaction: Extra whitespace is trimmed, and redundant blank lines are clamped down (any sequence of 3 or more consecutive newlines is compressed into exactly 2 newlines).

🌐 Model Context Protocol (MCP) Support

LMLocal supports external tool integration via the Model Context Protocol (MCP). This allows you to hook up custom or third-party servers to give your local AI even more capabilities.

⚠️ Scope & Supported Transports

  • Protocol Version: Compatible with the MCP 2025-11-25 specification standard.
  • Tools-Only Support: LMLocal exclusively loads and registers Tools exposed by your MCP servers. Other MCP features like custom Prompts or Resources are currently ignored and will not be utilized by the assistant.
  • Transports: Supports HTTP-based streamable protocols (http) and (stdio).
  • NOT Supported: Legacy sse (Server-Sent Events) transports are unsupported.
  • No Execution Restrictions: Currently, the extension does not restrict, sandbox, or prompt for manual confirmation when the AI invokes an MCP tool. Connected tools execute automatically when called by the model.

Warning

Security Notice & Trusted Sources Only

  • Trust Infrastructure: Only connect to MCP servers and URLs that you fully trust or host locally yourself.
  • Review Third-Party Tools: Before enabling a public or third-party MCP endpoint, review its exposed tools and documentation to ensure it does not execute unauthorized commands or compromise sensitive project data.

⚙️ How to Configure MCP Servers

You can set up and manage connections to external MCP servers directly inside the configuration dialog:

  1. Open the LM Local Chat tool window.
  2. Click the menu icon () in the top-right corner.
  3. Select "MCP Extensions..." from the dropdown menu.
  4. In the dialog:
    • Check "Enable Model Context Protocol (MCP)" to turn the feature on.
    • Paste or edit your JSON configuration directly into the built-in text editor.
    • Click the "Discover Tools" button to validate your settings and instantly verify connection availability.

The extension saves your settings locally to %LOCALAPPDATA%\LMLocalChat\mcp.json.

📝 Configuration Examples

You can organize your configuration using either the servers or mcpServers root keys.

Note

LMLocal Custom Extensions The following parameters are custom LMLocal properties and are not part of the official MCP specification:

  • "disabled" (boolean): Temporarily deactivates an entire server process or HTTP connection without deleting its configuration block.
  • "permissions" (object): Used to mute specific tools discovered on the server.

Example 1: Public HTTP Server (with Tool Permissions)

{
  "mcpServers": {
    "microsoft-learn": {
      "type": "http",
      "url": "https://learn.microsoft.com/api/mcp",
      "permissions": {
        "microsoft_code_sample_search": "disable"
      }
    }
  }
}

Example 2: Demonstrates how to configure endpoints requiring a GitHub Personal Access Token (PAT).

{
  "servers": {
    "github-copilot": {
      "type": "http",
      "url": "https://api.githubcopilot.com/mcp/",
      "headers": {
        "Authorization": "Bearer ghp_your_personal_access_token_here"
      },
      "disabled": false
    }
  }
}

Example 3: Illustrates the required schema structure for connecting local executable-based MCP servers

{
  "servers": {
    "OmniToolBox": {
      "type": "stdio",
      "command": "C:\\MyMCP\\OmniToolBox.exe"
    }
  }
}

🔗 Developer Resources

Model Context Protocol .NET SDK — Use this official Microsoft SDK to build and compile your own custom MCP servers compatible with LMLocal. https://github.com/modelcontextprotocol/csharp-sdk


🔧 Troubleshooting

Issue Solution
No model shown Ensure a model is fully loaded in the LM Studio "Server" tab.
Connection Error Check if the LM Studio Server is ON at http://127.0.0.1:1234. Click to retry.
UI Lag Restart the tool window or check your local machine resources (CPU/GPU).

💾 Data & Configuration

LMLocal keeps things simple and stores your preferences locally. Configuration files are maintained in:

%LOCALAPPDATA%\LMLocalChat\


📜 License & Third-Party

  • License: MIT License. See LICENSE.txt for details.
  • Components:
    • marked v15.0.12 (MIT)
    • highlight.js v11.9.0 (BSD-3-Clause)

About

LMLocal is a Visual Studio extension that adds a dedicated chat interface for interacting with local LLMs via LM Studio. It operates as a manual assistant for prompts and code generation within the IDE.

Resources

License

Contributing

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors