How to download and run GGUF AI LLM models from Huggingface in the Ollama Open-WebUI?

p.kaczmarek2
Screenshot of the GGUF model upload interface in Open WebUI with the file mistral-7b-instruct-v0.2.Q8_0.gguf selected.
Open-WebUI is a great tool for running multimodal Large Language Models locally, but not all models are available for download directly through the OWUI web panel. Luckily, GGUF models can be downloaded externally and then uploaded to OWUI through the experimental GGUF import. Here I will show you, step by step, how to import such a model.

This topic assumes that you already have Open-WebUI set up; if not, please check out the previous tutorial:
    ChatGPT locally? AI/LLM assistants to run on your computer - download and installation
You may also find this interesting:
    Minitest: robot vision? Multimodal AI LLaVA and workshop photo analysis - 100% local

Let's start by considering what GGUF actually is.
GGUF, which stands for GPT-Generated Unified Format, is the successor to the GGML (GPT-Generated Model Language) format and was released on 21st August 2023. It is a file format used to store GPT-like models for inference. GGUF models can run on both GPU and CPU, and the format provides extensibility, stability and versatility.
    Diagram of GGUF file structure showing section breakdown and example metadata.
Details about GGUF can be found on the Hugging Face site.
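Before hunting for a model, it may help to see how simple the container actually is. According to the GGUF spec, every file starts with a small fixed header: 4 magic bytes ("GGUF"), a version number, a tensor count and a metadata key/value count, all little-endian. Here is a minimal Python sketch that reads just that header, so you can sanity-check a downloaded file; the filename is only an example:

Code:
    import struct

    # Read the fixed GGUF header: 4-byte magic, uint32 version,
    # uint64 tensor count, uint64 metadata key/value count (little-endian).
    with open("mistral-7b-instruct-v0.2.Q8_0.gguf", "rb") as f:
        magic = f.read(4)
        if magic != b"GGUF":
            raise ValueError(f"Not a GGUF file, magic was {magic!r}")
        (version,) = struct.unpack("<I", f.read(4))
        tensors, metadata_kvs = struct.unpack("<QQ", f.read(16))

    print(f"GGUF v{version}: {tensors} tensors, {metadata_kvs} metadata entries")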

    So first, you will need a GGUF model.
    Go to https://huggingface.co/models and browse the models for download.
Not all models are available in GGUF, so filter the entries by GGUF:
    Screenshot of Hugging Face website with GGUF model search.
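If you prefer scripting to clicking, the huggingface_hub Python library (pip install huggingface_hub) can apply the same GGUF filter. A minimal sketch, listing the most-downloaded GGUF repositories:

Code:
    from huggingface_hub import HfApi

    # List model repositories tagged as GGUF, most-downloaded first.
    api = HfApi()
    for model in api.list_models(library="gguf", sort="downloads", limit=5):
        print(model.id)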

    For example, let's download TheBloke/Mistral-7B-Instruct-v0.2-GGUF
    Screenshot of the Mistral-7B-Instruct model file list.

I've chosen the mistral-7b-instruct-v0.2.Q8_0.gguf version.
    Downloading the model mistral-7b-instruct-v0.2.Q8_0.gguf at 10.3 MB/s, 12 minutes remaining.
    Wait for the download to finish:
    Screenshot showing a downloaded GGUF model file named mistral-7b-instruct-v0.2.Q8_0.gguf.
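By the way, the same download can be scripted with huggingface_hub instead of the browser; this sketch fetches exactly the file chosen above into the local Hugging Face cache:

Code:
    from huggingface_hub import hf_hub_download

    # Download one GGUF file from the repository; returns the local path.
    path = hf_hub_download(
        repo_id="TheBloke/Mistral-7B-Instruct-v0.2-GGUF",
        filename="mistral-7b-instruct-v0.2.Q8_0.gguf",
    )
    print("Saved to:", path)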
Now enter the settings; you will need to upload the model you've just downloaded.
    Screenshot of Open WebUI interface with codellama:latest model loaded.
In the Models section, find the experimental GGUF upload form:
    Settings panel in Open-WebUI with options for managing GGUF models.
    Select the file you want to upload:
    Screenshot of the settings panel in the Open WebUI application with the Models tab selected.
Now you need to be very patient and keep this page open; there is currently no proper progress indicator for the upload.
    Screenshot of Open-WebUI settings with the option to upload a GGUF model selected.
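If the browser upload keeps failing on a file this large, there is a fallback: Open-WebUI lists whatever models its Ollama backend knows about, so you can import the GGUF into Ollama directly with a Modelfile. A minimal sketch, assuming the ollama CLI is on your PATH (the model name mistral-7b-q8 is just my own label):

Code:
    import pathlib
    import subprocess

    # Point a Modelfile at the downloaded GGUF, then register it with Ollama.
    gguf = pathlib.Path("mistral-7b-instruct-v0.2.Q8_0.gguf").resolve()
    pathlib.Path("Modelfile").write_text(f"FROM {gguf}\n")
    subprocess.run(["ollama", "create", "mistral-7b-q8", "-f", "Modelfile"], check=True)

After a refresh, the model should show up in Open-WebUI's model list like any other Ollama model.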
Finally, once the import is done, you will be able to select the new model on the main page:
    Open WebUI interface with loaded model mistral-7b-instruct-v0.2.Q8_0.gguf, size 7.2 GB
    Your model is now ready to run:
    Screenshot of Open WebUI displaying a conversation about creating a sticky website header using CSS and JavaScript.
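Since Open-WebUI sits on top of Ollama, you can also talk to the freshly imported model from code through Ollama's REST API (default port 11434). A minimal sketch, assuming the model kept the name of the uploaded file; adjust the name to whatever your instance shows:

Code:
    import json
    import urllib.request

    payload = {
        "model": "mistral-7b-instruct-v0.2.Q8_0.gguf",  # adjust to your instance
        "prompt": "Explain the GGUF format in one sentence.",
        "stream": False,  # one JSON object instead of a token stream
    }
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp)["response"])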

    And that's all! This way you can run any GGUF model, of course, as long as your hardware supports it!
    Have you managed to run some models that way? Which GGUF model is your favourite? Let me know and stay tuned.
