
LLM Shortcut - iOS Shortcut (Ask AI)

Run LLMs on a desktop computer or remotely (via AnyScale) and access them from your iOS device

LLM Shortcut Logo

Overview

This guide details how to use an iOS Shortcut to interact with large language models (LLMs) hosted on a local desktop computer through a FastAPI endpoint. The setup lets you send prompts from your iOS device and receive responses from the LLM, enabling on-the-go access to advanced language models. There are two main components to this setup:

  • FastAPI Application: A FastAPI application that serves as an endpoint for the iOS Shortcut to send requests to.
  • iOS Shortcut: A shortcut that sends prompts to the FastAPI application and displays the response.
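Concretely, the FastAPI application acts as a thin translator between the Shortcut's simple prompt payload and an OpenAI-style chat-completions request, the format that AnyScale Endpoints is compatible with. A minimal sketch of that translation (the field names on the Shortcut side are assumptions for illustration, not the repository's actual code):

```python
# Sketch of the translation the FastAPI endpoint performs (hypothetical
# Shortcut-side field names; the chat-completions shapes follow the
# OpenAI convention that AnyScale Endpoints accepts).

def build_chat_request(prompt: str, model: str, max_tokens: int) -> dict:
    """Wrap the Shortcut's raw prompt in an OpenAI-style chat request."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

def extract_reply(chat_response: dict) -> str:
    """Pull the assistant's text out of an OpenAI-style chat response."""
    return chat_response["choices"][0]["message"]["content"]

# Example round trip with a canned response:
req = build_chat_request("Hello!", "mistralai/Mistral-7B-Instruct-v0.1", 256)
canned = {"choices": [{"message": {"role": "assistant", "content": "Hi there."}}]}
print(extract_reply(canned))  # prints "Hi there."
```

The same request shape works whether MODEL points at an AnyScale-hosted model or a local GGUF model served through llama.cpp.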

Why use LLM Shortcut?

  • Less battery drain - sending web requests is much less battery intensive than running inference on-device
  • Faster inference with remote models
  • More flexibility - run AnyScale-hosted models or any local GGUF model via llama.cpp
  • AnyScale compatible

Which models are available?

  • Current AnyScale model names include (as of 1/3/24):

    • meta-llama/Llama-2-7b-chat-hf
    • meta-llama/Llama-2-13b-chat-hf
    • meta-llama/Llama-2-70b-chat-hf
    • codellama/CodeLlama-34b-Instruct-hf
    • mistralai/Mistral-7B-Instruct-v0.1
    • mistralai/Mixtral-8x7B-Instruct-v0.1
    • Open-Orca/Mistral-7B-OpenOrca
    • HuggingFaceH4/zephyr-7b-beta
  • All GGUF models are available through local inference via llama.cpp

    • TheBloke regularly publishes GGUF conversions of many models

Prerequisites

  • Docker installed on your desktop computer
  • An ngrok account and AuthToken
  • An AnyScale API key (if using AnyScale-hosted models)
  • An iOS device with the Shortcuts app

Setup Instructions

Server (Desktop) Setup

  1. Git Clone:

    • Clone this repository to your local machine:
      git clone https://github.com/00brad/LLM-Shortcut.git
  2. Build Docker Image:

    • Create a .env file in the project root with your ngrok AuthToken and AnyScale API key (MODEL can be set to an AnyScale model name or a local GGUF file name):

      NGROK_AUTHTOKEN=your_ngrok_authtoken
      ANYSCALE_ENDPOINTS_KEY=your_anyscale_key
      MODEL=your_model_name
      MAX_TOKENS=your_max_tokens
      
    • Build the Docker image:

        docker build -t llm_shortcut .
  3. If using a local GGUF model (optional), download it into the project root:

    • Example:
        wget https://huggingface.co/TheBloke/Llama-2-7b-Chat-GGUF/resolve/main/llama-2-7b-chat.Q6_K.gguf
    • Set the model name in the .env file to the name of the downloaded model.
  4. Run the FastAPI Application:

    • Start the container (Uvicorn serves the FastAPI app inside it):
         docker run llm_shortcut
    • Upon startup, ngrok generates a public URL that tunnels to your local desktop computer.
    • The generated ngrok URL, which is your endpoint, will be displayed in the console after running:
      Edit URL
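On startup, the container reads its configuration from the environment variables defined in the .env file above. A minimal sketch of how those settings might be loaded (the variable names come from the .env file; the defaults here are assumptions):

```python
import os

def load_settings() -> dict:
    """Read the settings the .env file supplies (defaults are guesses)."""
    return {
        "ngrok_authtoken": os.getenv("NGROK_AUTHTOKEN", ""),
        "anyscale_key": os.getenv("ANYSCALE_ENDPOINTS_KEY", ""),
        # MODEL may be an AnyScale model name or a local GGUF file name
        "model": os.getenv("MODEL", "mistralai/Mistral-7B-Instruct-v0.1"),
        "max_tokens": int(os.getenv("MAX_TOKENS", "512")),
    }
```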

iOS Shortcut Setup

  1. Shortcut Setup:

    • Tap the following link to open the LLM Shortcut in the Shortcuts app.
    • Tap the '+ Add to Shortcut' button to add the shortcut to your library.
      Add to Shortcut
    • Tap the '...' button to edit the shortcut.
      Edit Shortcut
    • Change the URL in the first line to the ngrok URL generated in the previous section.
      Edit URL
    • Add the API key.
      Edit URL
    • Tap 'Done' to save the changes.
    • Tap the 'i' button to view the shortcut details, then tap 'Add to Home Screen' to add the shortcut to your home screen.
  2. Using the Shortcut:

    • Tap the shortcut and speak your prompt, or say "Hey Siri, Ask AI" to activate it.
      Edit URL
    • The prompt is sent to the FastAPI server, processed by the LLM, and a response is returned to your device.
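From the device's point of view, the whole exchange is a single HTTP POST. A client-side sketch of what the Shortcut does (the /prompt route, field names, and auth header are assumptions for illustration):

```python
import json
import urllib.request

def build_request(endpoint: str, api_key: str, prompt: str) -> urllib.request.Request:
    """Build the POST the Shortcut sends (route and field names are guesses)."""
    return urllib.request.Request(
        endpoint,
        data=json.dumps({"prompt": prompt}).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )

def ask_ai(endpoint: str, api_key: str, prompt: str) -> str:
    """Send the prompt and return the reply text from the JSON response."""
    with urllib.request.urlopen(build_request(endpoint, api_key, prompt)) as resp:
        return json.loads(resp.read())["response"]
```

Calling `ask_ai("https://your-ngrok-url/prompt", "your_api_key", "Hello!")` from a desktop Python shell is a convenient way to sanity-check the server before wiring up the Shortcut.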

Additional Notes

  • The ngrok URL will change each time the FastAPI server restarts.
  • You can run in detached mode by adding the -d flag to the docker run command. (Use docker logs to view the ngrok URL)
  • Ensure the .env file is correctly set up with your AnyScale API key, ngrok AuthToken, and model name.
  • Currently, the server is configured to use AnyScale models by default.

This setup allows you to leverage the power of language models directly from your iOS device, making it fast and convenient to use advanced AI capabilities wherever you go.
