OpenWebUI, being a good all-in-one hub for working with neural networks, can also generate images. Not by itself, of course, but it lets even an unprepared user write a prompt and get a picture, much as you would in ChatGPT or GigaChat. This can be done through the OpenAI API or via interfaces to Stable Diffusion or Flux models such as stable-diffusion-webui (aka Automatic1111) or ComfyUI.
Adding this feature is not particularly complicated, but as usual there are nuances. Below we walk through the process step by step and explain how OpenWebUI behaves once the feature is enabled. We will be adding ComfyUI and the Stable Diffusion 3.5 Medium model.
Install ComfyUI
You can install ComfyUI on the same server where OpenWebUI is deployed or on a different one: either way, you will access it over the API by IP address. In our case, we install it on the same Ubuntu 22.04 server where our OpenWebUI Docker image is running. We follow the official instructions and assume you already have Python 3.10 or newer installed.
- Clone the git repository (don’t forget to install git itself first):
git clone https://github.com/comfyanonymous/ComfyUI.git
- Install PyTorch. We show the Nvidia GPU case; for AMD, Intel, and Apple M1/M2 and newer, see the project wiki.
pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu121
- Go to the ComfyUI directory and install the remaining dependencies:
pip install -r requirements.txt
Next, you need to download the model. Go to the ComfyUI/models/checkpoints directory and run:
wget https://huggingface.co/Comfy-Org/stable-diffusion-3.5-fp8/resolve/main/sd3.5_medium_incl_clips_t5xxlfp8scaled.safetensors
This downloads the model with the CLIP text encoders already built in and weights in fp8. But nothing stops you from downloading and configuring any of the supported models instead.
In theory, that’s all. Go back to the ComfyUI directory and check that it works:
python3 main.py
After this, you can open the web interface at http://localhost:8188.
But if you install ComfyUI on a remote server, you won’t be able to open anything via an external IP this way. To do that, run ComfyUI with the --listen <server IP address> parameter, for example:
python3 main.py --listen <address>
Then you can open http://<server IP>:8188 and get access.
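Before wiring anything into OpenWebUI, you can verify the server is reachable from another machine. ComfyUI exposes a small REST API; a minimal reachability sketch against its /system_stats endpoint (the host address here is just a placeholder):

```python
import json
import urllib.request

def stats_url(host: str, port: int = 8188) -> str:
    """URL of ComfyUI's /system_stats endpoint, handy for a reachability check."""
    return f"http://{host}:{port}/system_stats"

url = stats_url("127.0.0.1")
print(url)  # http://127.0.0.1:8188/system_stats

# With ComfyUI running, this returns device and version info:
# with urllib.request.urlopen(url, timeout=5) as resp:
#     print(json.load(resp))
```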
Next, you need to add a workflow for the SD3.5 model. One is easy to find via a web search, but in OpenWebUI a slightly different workflow works better than the one offered by the ComfyUI developers. Simply drag the json file onto the open ComfyUI web interface.
We check the generation, and if everything works, we proceed to adding ComfyUI to OpenWebUI.
We attach ComfyUI to OpenWebUI
We need a few preparatory steps. First, export the workflow in API format, otherwise you will only get errors instead of images in OpenWebUI. To do this:
- If you have the old menu, click Save (API Format) in it and save a new json file with the workflow; this is what you will upload to OpenWebUI. If you have the new menu, as in our screenshot, select Workflow - Export (API).
- Next, you need to note the IDs of the nodes with the various elements (we are interested in the prompt itself, the model, the image size, and the node with the generation parameters). To do this, go to Settings, and in the LiteGraph submenu set Node ID Icon Mode to Show all.
Nodes will now display their ID as #<ID>. We only need the node with the positive prompt; in our case its ID is 16.
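Under the hood, the API-format export is a flat JSON object keyed by node ID, and OpenWebUI substitutes your prompt into the text input of the node whose ID you give it. A minimal sketch of that substitution (the node fragment below is trimmed and hypothetical; real exports contain many more nodes and fields):

```python
import json

# Trimmed, hypothetical fragment of an API-format workflow export:
# each top-level key is a node ID; node 16 holds the positive prompt.
workflow_api = json.loads("""
{
  "16": {"class_type": "CLIPTextEncode",
         "inputs": {"text": "placeholder", "clip": ["4", 1]}},
  "40": {"class_type": "CLIPTextEncode",
         "inputs": {"text": "", "clip": ["4", 1]}}
}
""")

def set_prompt(workflow: dict, node_id: str, prompt: str) -> dict:
    """Substitute the user's prompt into the given node's text input."""
    workflow[node_id]["inputs"]["text"] = prompt
    return workflow

set_prompt(workflow_api, "16", "a red fox in the snow, photorealistic")
print(workflow_api["16"]["inputs"]["text"])  # a red fox in the snow, photorealistic
```

This is why the node ID matters: point OpenWebUI at the wrong node and your prompt ends up somewhere it has no effect.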
Now open the OpenWebUI web interface. Go to Settings (click on the user name in the lower left corner), then Admin Settings, then Images. Set Image Generation Engine to ComfyUI.
In the ComfyUI Base URL field, enter the IP address and port on which you opened ComfyUI earlier. Click the icon with the “back and forth” arrows to check the connection to the server.
Next, click the Click here to upload workflow.json file button and upload our exported API workflow. Don’t mix them up: it must be the API export, not the original!
After that, in the ComfyUI Workflow Nodes table, enter our node ID, 16, for prompt*. At the top, enable Image Generation (Experimental) and, immediately below it, Image Prompt Generation.
Scroll back down. In Set Default Model, choose our model sd3.5_medium_incl_clips_t5xxlfp8scaled.safetensors. In Set Image Size, set 1024x1024 or another resolution supported by the model:
1:1 - 1024 x 1024
5:4 - 1152 x 896
3:2 - 1216 x 832
16:9 - 1344 x 768
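All of these resolutions keep roughly the same pixel budget (about one megapixel) and are multiples of 64, the latent-grid step commonly required by Stable Diffusion family models (our assumption for why exactly these sizes are listed). A quick sanity check over the list above:

```python
# The resolutions from the list above: each stays near a ~1 MP budget
# and both dimensions are multiples of 64.
SIZES = {"1:1": (1024, 1024), "5:4": (1152, 896),
         "3:2": (1216, 832), "16:9": (1344, 768)}

for ratio, (w, h) in SIZES.items():
    assert w % 64 == 0 and h % 64 == 0, ratio
    print(f"{ratio}: {w}x{h} = {w * h / 1e6:.2f} MP")
```

If you want a custom aspect ratio, picking dimensions that pass the same two checks is a reasonable starting point.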
In Set Steps, set the number of sampling steps. We recommend 20 to 40, depending on the power of your video adapter. Be sure to click Save in the lower right corner to save the settings!
Checking the operation of image generation
Starting with the 0.5.x branch, OpenWebUI has two ways to generate images. Let’s look at both of them.
Method 1. Generate images directly from a prompt. To do this, click the Image button at the bottom of the chat input line and enter the prompt for the image. A warning will appear above the input line saying that you are in Generate an Image mode.
After sending a request, you will receive an image from ComfyUI after some time.
Until you click the Image button again, you will stay in this direct image generation mode.
Method 2. Generate based on the neural network’s response. In this mode, you chat with the model as usual; for example, you can ask it to write a prompt or improve one. After receiving a response, click the Generate Image icon in the icon bar at the bottom of the response.
OpenWebUI will also send a request to ComfyUI and display the image for you.
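In both modes OpenWebUI does essentially the same thing: it fills your prompt and settings into the API-format workflow and queues it on ComfyUI’s /prompt endpoint. A rough sketch of that exchange, simplified from what OpenWebUI actually sends (the one-node workflow here is a hypothetical fragment):

```python
import json
import urllib.request

COMFYUI_URL = "http://localhost:8188"  # your ComfyUI base URL

def queue_workflow(workflow: dict, base_url: str = COMFYUI_URL) -> urllib.request.Request:
    """Build the POST request that queues a workflow on ComfyUI's /prompt endpoint."""
    payload = json.dumps({"prompt": workflow}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/prompt",
        data=payload,
        headers={"Content-Type": "application/json"},
    )

req = queue_workflow({"16": {"class_type": "CLIPTextEncode",
                             "inputs": {"text": "a red fox", "clip": ["4", 1]}}})
print(req.full_url)  # http://localhost:8188/prompt

# Actually sending it requires a running ComfyUI; the response includes
# a prompt_id you can poll via /history/<prompt_id>:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["prompt_id"])
```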
What have we forgotten?
An important point: in the current configuration, ComfyUI will shut down as soon as you close the command line from which you launched it. To avoid this, you need to set ComfyUI up to start automatically. On Linux, the easiest way is a systemd service.
To do this, log in as root and create a file comfyui.service in the directory /etc/systemd/system/ (yes, we know it is more correct to place it in /usr/lib/systemd/system and let enable/disable manage the symlink) with the following contents:
[Unit]
Description=ComfyUI Service
After=network-online.target
[Service]
ExecStart=python3 /root/ComfyUI/main.py --listen 248.255.56.23
Restart=always
RestartSec=3
Environment="PATH=/sbin:/bin:/usr/sbin:/usr/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/usr/local/cuda/bin"
[Install]
WantedBy=default.target
Then reload systemd and enable and start the service:
systemctl daemon-reload
systemctl enable --now comfyui
Check that everything works by running the command
systemctl status comfyui
As you can see, adding image generation support to OpenWebUI is not difficult if you follow the steps carefully. In the same way, you can add other models or use Automatic1111 instead of ComfyUI. As a result, you expand what you can do with neural network models and get an open-source replacement for proprietary solutions.