
How to download and run GGUF AI LLM models from Huggingface in the Ollama Open-WebUI?

p.kaczmarek2 14160 2

TL;DR

  • Open-WebUI GGUF import workflow for running external Hugging Face AI LLM models locally in Ollama Open-WebUI.
  • Download a GGUF file from Hugging Face, then upload it through the experimental GGUF form in Open-WebUI settings.
  • Example model: TheBloke/Mistral-7B-Instruct-v0.2-GGUF, using the mistral-7b-instruct-v0.2.Q8_0.gguf version.
  • Once import finishes, the new model appears on the main page and is ready to run, but the upload shows no good progress report display.
Generated by the language model.
  • Screenshot of the GGUF model upload interface in Open WebUI with the file mistral-7b-instruct-v0.2.Q8_0.gguf selected.
    Open-WebUI is a great tool for running multimodal Large Language Models locally, but not all models are available for download directly through the OWUI web panel. Luckily, GGUF models can be downloaded externally and then uploaded to OWUI through the experimental GGUF import. Here I will show you step by step how to import such a model.

    This topic assumes that you already have Open-WebUI set up. If not, please check out the previous tutorial:
    ChatGPT locally? AI/LLM assistants to run on your computer - download and installation
    You may also find interesting:
    Minitest: robot vision? Multimodal AI LLaVA and workshop photo analysis - 100% local

    Let's start by considering what GGUF is.
    GGUF, commonly expanded as GPT-Generated Unified Format, is the successor to the GGML format (sometimes expanded as GPT-Generated Model Language) and was released on 21st August 2023. It is a file format used to store GPT-like models for inference. GGUF models can run on both GPU and CPU, and the format provides extensibility, stability and versatility.
    Diagram of GGUF file structure showing section breakdown and example metadata.
    Details about GGUF can be found on the Hugging Face site.
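To make the format a little more concrete, here is a minimal sketch that reads the fixed-size header at the start of a GGUF file. It assumes the version 2/3 layout (4-byte magic `GGUF`, then a uint32 version, a uint64 tensor count and a uint64 metadata key/value count, all little-endian); version 1 files used narrower counts and are not handled. The function name `read_gguf_header` is only illustrative.

```python
import struct

GGUF_MAGIC = b"GGUF"


def read_gguf_header(path):
    """Return (version, tensor_count, metadata_kv_count) from a GGUF file.

    Assumes the version 2/3 header layout: 4-byte magic, uint32 version,
    uint64 tensor count, uint64 metadata key/value count, little-endian.
    """
    with open(path, "rb") as f:
        header = f.read(24)
    # A file that is too short or lacks the magic bytes is not GGUF.
    if len(header) < 24 or header[:4] != GGUF_MAGIC:
        raise ValueError(f"{path} does not look like a GGUF file")
    version, tensor_count, kv_count = struct.unpack("<IQQ", header[4:24])
    return version, tensor_count, kv_count
```

Calling `read_gguf_header("mistral-7b-instruct-v0.2.Q8_0.gguf")` on a finished download is a quick way to confirm the file really is GGUF before uploading several gigabytes.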

    So first, you will need a GGUF model.
    Go to https://huggingface.co/models and browse the models for download.
    Not all models are available in GGUF, so filter the entries by GGUF:
    Screenshot of Hugging Face website with GGUF model search.
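The same filter can be reached with a direct link. The sketch below builds such a link; the query parameter names `library` and `search` are an assumption based on the URLs the Hugging Face site produces at the time of writing, and the helper `gguf_search_url` is hypothetical.

```python
from urllib.parse import urlencode


def gguf_search_url(search_term=None):
    """Build a Hugging Face model-list URL restricted to GGUF models.

    The parameter names are assumptions taken from the site's own URLs.
    """
    params = {"library": "gguf"}
    if search_term:
        params["search"] = search_term
    return "https://huggingface.co/models?" + urlencode(params)
```

For example, `gguf_search_url("mistral")` gives a link that should list only GGUF repositories matching "mistral".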

    For example, let's download TheBloke/Mistral-7B-Instruct-v0.2-GGUF
    Screenshot of the Mistral-7B-Instruct model file list.

    I've chosen the mistral-7b-instruct-v0.2.Q8_0.gguf version.
    Downloading the model mistral-7b-instruct-v0.2.Q8_0.gguf at 10.3 MB/s, 12 minutes remaining.
    Wait for the download to finish:
    Screenshot showing a downloaded GGUF model file named mistral-7b-instruct-v0.2.Q8_0.gguf.
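If you prefer a script to the browser, the download can be sketched in Python as well. This is a minimal example using only the standard library. It assumes the repository is public and that the file lives on the `main` branch, and it relies on the `resolve/<revision>/<filename>` URL pattern that Hugging Face uses for direct file downloads. The function names are illustrative.

```python
import shutil
import urllib.request
from urllib.parse import quote


def gguf_download_url(repo_id, filename, revision="main"):
    """Build the direct-download URL for one file in a Hugging Face repo."""
    return (
        "https://huggingface.co/"
        f"{quote(repo_id)}/resolve/{quote(revision)}/{quote(filename)}"
    )


def download_gguf(repo_id, filename, dest_path):
    """Stream the file to disk in 1 MiB chunks so it never sits fully in RAM."""
    url = gguf_download_url(repo_id, filename)
    with urllib.request.urlopen(url) as response, open(dest_path, "wb") as out:
        shutil.copyfileobj(response, out, length=1024 * 1024)
    return dest_path
```

Usage would look like `download_gguf("TheBloke/Mistral-7B-Instruct-v0.2-GGUF", "mistral-7b-instruct-v0.2.Q8_0.gguf", "mistral-7b-instruct-v0.2.Q8_0.gguf")`. Expect it to take as long as the browser download, since the file is several gigabytes.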
    Now open the settings. This is where you upload the model you've downloaded.
    Screenshot of Open WebUI interface with codellama:latest model loaded.
    In the Models section, find the GGUF upload form:
    Settings panel in Open-WebUI with options for managing GGUF models.
    Select the file you want to upload:
    Screenshot of the settings panel in the Open WebUI application with the Models tab selected.
    Now you need to be very patient. Don't close this page. There is currently no useful progress indicator.
    Screenshot of Open-WebUI settings with the option to upload a GGUF model selected.
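Since the page gives little feedback, one way to check on the import from outside the browser is to ask Ollama which models it currently knows about. The sketch below polls Ollama's `/api/tags` endpoint until a model whose name contains a given fragment shows up. It assumes Ollama is reachable at its default address `http://localhost:11434` (this may not hold for every Docker setup) and that the imported model is registered under a name containing the file name. Both are assumptions, not something the tutorial confirms.

```python
import json
import time
import urllib.request


def model_is_listed(tags_response, name_fragment):
    """Return True if any model name in an /api/tags response contains the fragment."""
    models = tags_response.get("models", [])
    return any(name_fragment.lower() in m.get("name", "").lower() for m in models)


def wait_for_model(name_fragment, base_url="http://localhost:11434",
                   poll_seconds=15, timeout_seconds=3600):
    """Poll Ollama until the model appears or the timeout expires."""
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        # base_url is an assumed default; adjust it to your installation.
        with urllib.request.urlopen(f"{base_url}/api/tags") as response:
            if model_is_listed(json.load(response), name_fragment):
                return True
        time.sleep(poll_seconds)
    return False
```

Running `wait_for_model("mistral-7b-instruct")` in a terminal while the upload page stays open would return `True` once the model is registered, or `False` after the timeout.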
    Finally, once the import is done, you will be able to select the new model on the main page:
    Open WebUI interface with loaded model mistral-7b-instruct-v0.2.Q8_0.gguf, size 7.2 GB
    Your model is now ready to run:
    Screenshot of Open WebUI displaying a conversation about creating a sticky website header using CSS and JavaScript.

    And that's all! This way you can run any GGUF model, of course, as long as your hardware supports it!
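As a rough guide to what "hardware supports it" means, you can estimate how much memory the model weights alone will occupy. The sketch below is a back-of-the-envelope calculation. The parameter count (about 7.24 billion for Mistral 7B) and the 8.5 bits per weight for Q8_0 (blocks of 32 8-bit weights plus one 16-bit scale) are assumptions brought in for illustration, and real files differ slightly because some tensors are stored at other precisions.

```python
def estimated_size_gib(parameter_count, bits_per_weight):
    """Approximate size of the quantized weights in GiB (2**30 bytes)."""
    size_bytes = parameter_count * bits_per_weight / 8
    return size_bytes / 2**30


# Assumed figures, not taken from the tutorial:
MISTRAL_7B_PARAMS = 7.24e9      # approximate parameter count
Q8_0_BITS_PER_WEIGHT = 8.5      # 32 int8 weights + one 16-bit scale per block

size = estimated_size_gib(MISTRAL_7B_PARAMS, Q8_0_BITS_PER_WEIGHT)
print(f"Estimated weight size: {size:.1f} GiB")  # prints "Estimated weight size: 7.2 GiB"
```

The result is in line with the 7.2 GB shown in the model selector above. The model also needs extra memory for context and runtime buffers, so treat this number as a lower bound rather than a guarantee that the model will run.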
    Have you managed to run some models that way? Which GGUF model is your favourite? Let me know and stay tuned.

    About Author
    p.kaczmarek2
    Moderator Smart Home
    p.kaczmarek2 wrote 14612 posts with rating 12630, helped 655 times. Been with us since 2014.
  • #2 21592563
    jmchiejr
    Level 1  
    Posts: 1
    After the upload completes, can I delete the downloaded file or is Ollama/OpenWebUI using the downloaded file in that location?
  • #3 21592605
    p.kaczmarek2
    Moderator Smart Home
    Posts: 14612
    Help: 655
    Rate: 12630
    Ollama can't even access this file directly; it's uploaded (copied) there. You can remove the source file.

FAQ

TL;DR: In 3 steps, you can add a 7B GGUF model to Ollama Open-WebUI: download it from Hugging Face, upload it in Settings, then select it on the main page. As the tutorial puts it, "be very patient" during import because the page shows poor progress feedback. This FAQ is for local AI users who need to run GGUF models not listed in the Open-WebUI download panel. [#21044053]

Why it matters: Manual GGUF import lets you run compatible local LLMs in Open-WebUI even when the built-in download list does not offer them.

Option | How you get the model | Format/path mentioned | Main limitation from the thread
Built-in Open-WebUI download panel | Download directly in the web UI | Only models listed there | Not all models are available
Manual GGUF import | Download externally from Hugging Face, then upload in Settings | GGUF file uploaded through the experimental form | Import shows little or no visible progress

Key insight: Open-WebUI does not rely on the original GGUF file after upload. The upload copies the file, so you can delete the source file once import finishes. [#21592605]

Quick Facts

  • GGUF is the successor to GGML and was released on 21 August 2023 for GPT-like model inference. [#21044053]
  • The example workflow uses TheBloke/Mistral-7B-Instruct-v0.2-GGUF and selects the file mistral-7b-instruct-v0.2.Q8_0.gguf. [#21044053]
  • The tutorial states that GGUF models can run on CPU and GPU, so hardware support is the deciding factor. [#21044053]
  • Open-WebUI exposes an experimental GGUF upload form inside Settings → Models, and the page may appear stalled because it lacks a good progress display. [#21044053]

How do I download a GGUF model from Hugging Face and import it into Ollama Open-WebUI step by step?

Download the GGUF file on Hugging Face, upload it in Open-WebUI Settings, then select it on the main page.
  1. Open the Hugging Face model list and filter for GGUF.
  2. Download a file such as mistral-7b-instruct-v0.2.Q8_0.gguf.
  3. In Open-WebUI, go to Settings → Models, use the experimental GGUF upload form, wait for import to finish, and then choose the new model on the main page. [#21044053]

What is GGUF, and how is it used for running GPT-like models locally?

"GGUF is a file format that stores GPT-like models for inference, with extensibility, stability, and versatility." The thread says GGUF stands for GPT-Generated Unified Format, succeeded GGML, and was released on 21 August 2023. In this workflow, you download a .gguf model file and import it into Open-WebUI so Ollama/Open-WebUI can run it locally on supported hardware. [#21044053]

How does GGUF compare with GGML for local LLM inference in Open-WebUI and Ollama?

GGUF is the newer format in this thread, while GGML is its predecessor. The post states that GGUF succeeded GGML on 21 August 2023 and highlights GGUF’s extensibility, stability, and versatility. For local inference in Open-WebUI and Ollama, the tutorial focuses on GGUF because Open-WebUI offers an experimental GGUF import path for manual model uploads. [#21044053]

Where in Open-WebUI do I find the experimental GGUF upload form for importing a model manually?

You find it in Open-WebUI under Settings, inside the Models section. The tutorial shows opening Settings first, then locating a GGUF upload form in Models. That form is the manual import path for .gguf files downloaded outside the built-in Open-WebUI model panel. [#21044053]

What should I do if a model I want on Hugging Face is not available through the Open-WebUI download panel?

Download the model externally from Hugging Face in GGUF format and import it manually. The thread explains that not all models appear in the Open-WebUI web panel, so the workaround is to browse Hugging Face, filter for GGUF, download the .gguf file, and upload it through the experimental GGUF import form in Settings → Models. [#21044053]

Which GGUF file variant should I choose on Hugging Face, such as mistral-7b-instruct-v0.2.Q8_0.gguf?

Choose a GGUF file that matches your intended model, then import that exact .gguf file. The tutorial’s concrete example selects mistral-7b-instruct-v0.2.Q8_0.gguf from TheBloke’s Mistral-7B-Instruct-v0.2-GGUF page. The thread does not give a rule for comparing variants, so its only explicit guidance is the demonstrated file choice and the reminder that your hardware must support the model. [#21044053]
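When a repository lists many variants, it can help to pull the quantization tag out of each file name so you can see at a glance what you are choosing between. The sketch below assumes the `<model>.<tag>.gguf` naming used in the example file; other uploaders may name files differently, and the pattern covers only common tags such as `Q8_0`, `Q4_K_M` or `F16`. The helper `quantization_tag` is hypothetical.

```python
import re

# Matches a trailing ".<tag>.gguf" where the tag looks like Q8_0, Q4_K_M, IQ2_XS, F16...
_QUANT_TAG = re.compile(r"\.((?:I?Q\d\w*)|F16|F32|BF16)\.gguf$", re.IGNORECASE)


def quantization_tag(filename):
    """Return the quantization tag from a GGUF file name, or None if absent."""
    match = _QUANT_TAG.search(filename)
    return match.group(1).upper() if match else None
```

For the tutorial's file, `quantization_tag("mistral-7b-instruct-v0.2.Q8_0.gguf")` returns `"Q8_0"`.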

Why does the GGUF import in Open-WebUI seem to hang without a visible progress bar, and how long should I wait?

It seems to hang because the page currently lacks a good progress display during upload and import. The tutorial explicitly warns, “Don’t close this page,” and says you need to be very patient. It gives no fixed duration in minutes, so the safe action is to keep the page open until the new model appears as selectable on the main page. [#21044053]

What happens to the original downloaded GGUF file after upload, and how does Open-WebUI or Ollama store the imported model?

The original downloaded file is not used in place after upload. The follow-up reply states that Ollama cannot access that source file directly because the file is uploaded, meaning copied into the application workflow. In practical terms, Open-WebUI/Ollama uses the imported copy, not the original file sitting in your download folder. [#21592605]

When is it safe to delete the GGUF file I downloaded from Hugging Face after importing it into Open-WebUI?

It is safe to delete the downloaded GGUF file after the upload has completed. The thread’s follow-up answer is direct: the file is uploaded, copied there, and Ollama cannot access the original source file directly. That means you can remove the local download once the import has finished successfully. [#21592605]

How do I select and run an imported GGUF model on the main page of Open-WebUI after the upload finishes?

Open the main page and choose the newly imported model from the model selector. The tutorial shows that, after import completes, the new model becomes available for selection on the main page. Once selected, the model is ready to run immediately in Open-WebUI. [#21044053]

What hardware limitations determine whether a GGUF model will run on my CPU or GPU in Ollama Open-WebUI?

Hardware support determines whether the model will run at all. The thread states that GGUF models can run on both CPU and GPU, but adds a clear condition: you can run any GGUF model only as long as your hardware supports it. The post does not give RAM, VRAM, or speed thresholds, so unsupported hardware is the key limiting case described. [#21044053]

Why are some Hugging Face models unavailable in GGUF format, and how do I filter the model list to find compatible ones?

Some models are unavailable because not every Hugging Face model is published in GGUF format. The tutorial says this directly and then shows the fix: go to huggingface.co/models and filter the entries by GGUF. That filter narrows the list to compatible files you can download and upload into Open-WebUI manually. [#21044053]

What is Open-WebUI, and how does it work with Ollama for running local multimodal LLMs?

Open-WebUI is the web interface in this workflow for running local multimodal large language models and managing model imports. The tutorial describes Open-WebUI as “a great tool for running locally multimodal Large Language Models,” then shows it working with Ollama by importing a GGUF model through Settings and running it from the main page. [#21044053]

What are the most common reasons a GGUF model import fails in Open-WebUI, and how can I troubleshoot them?

The thread points to three common failure points: the model is not published in GGUF format, the import is interrupted, or the hardware cannot run the model. First, confirm the model exists in GGUF format on Hugging Face. Second, use the experimental GGUF form in Settings → Models and do not close the page during import, because progress feedback is poor. Third, if import finishes but the model still will not run, check whether your hardware supports that model. [#21044053]

How does importing a GGUF model manually compare with downloading a model directly through the Open-WebUI web panel?

Manual GGUF import is the fallback when the built-in Open-WebUI panel does not list the model you want. The direct panel is simpler because it downloads from inside Open-WebUI, but it covers only listed models. Manual import takes extra steps—download on Hugging Face, upload in Settings, wait through limited progress feedback—but it expands access to compatible GGUF models. [#21044053]
Generated by the language model.