Gradio, a Python library for creating web UIs for machine learning models, is enjoying growing popularity. In particular, its seamless integration with Hugging Face Spaces lets developers quickly make their models available online free of charge. But what happens when you need more control over the execution environment or want to run the application on your own hardware? This is where Docker comes into play.
Docker enables the containerization of applications, which standardizes the execution environment and ensures portability between different systems. This offers numerous advantages for Gradio applications:
Consistency: Docker ensures that the Gradio app always functions the same regardless of the deployment target, as the application and its dependencies are packaged together.
Portability: Containers can be easily moved between different systems and cloud environments.
Scalability: Docker combines well with orchestration systems like Kubernetes, allowing the application to be scaled as needed.
Dockerizing a Gradio app is relatively straightforward. The following example illustrates the process:
1. Create a Gradio App: First, a simple Gradio app is needed. A file named app.py could contain the following code:
import gradio as gr

def greet(name):
    # Return a greeting for the entered name
    return f"Hello {name}!"

demo = gr.Interface(fn=greet, inputs="text", outputs="text")
demo.launch()
2. Create a Dockerfile: The Dockerfile defines how the app is built and run within the Docker container. In the same directory as app.py, create a file named Dockerfile with the following content:
# Use a slim Python base image to keep the container small
FROM python:3.8-slim
WORKDIR /usr/src/app
# Copy the application code into the image
COPY . .
# Install Gradio without caching pip's downloads, keeping the image small
RUN pip install --no-cache-dir gradio
# Gradio's default port
EXPOSE 7860
# Bind to all interfaces so the app is reachable from outside the container
ENV GRADIO_SERVER_NAME="0.0.0.0"
CMD ["python", "app.py"]
3. Build and Run the Docker Container: With the Dockerfile in place, build the image and start a container:
docker build -t gradio-app .
docker run -p 7860:7860 gradio-app
The Gradio app should now be accessible at http://localhost:7860.
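For longer-running deployments, the container can also be started in the background; a sketch using standard Docker flags:

docker run -d --name gradio-app -p 7860:7860 gradio-app

The -d flag detaches the container, and --name makes it easier to inspect or stop later (docker stop gradio-app).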
There are a few things to keep in mind when running Gradio applications in Docker:
Server Name and Port: The environment variable GRADIO_SERVER_NAME="0.0.0.0" and the port mapping -p 7860:7860 are necessary for the app to be accessible from outside the container, since Gradio binds to 127.0.0.1 by default.
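Alternatively, the same behavior can be configured in code rather than through the environment; a minimal sketch of app.py using launch() parameters:

import gradio as gr

demo = gr.Interface(fn=lambda name: f"Hello {name}!", inputs="text", outputs="text")
# Bind to all interfaces and pin Gradio's default port explicitly
demo.launch(server_name="0.0.0.0", server_port=7860)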
Multiple Replicas: When running multiple replicas, for example on AWS ECS, "stickiness" with sessionAffinity: ClientIP is important to ensure that requests from the same user are always routed to the same instance. This is crucial for the correct processing of Gradio events.
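The exact mechanism depends on the platform; sessionAffinity, for instance, is a setting on a Kubernetes Service. A minimal sketch, assuming the pods carry the (hypothetical) label app: gradio-app:

apiVersion: v1
kind: Service
metadata:
  name: gradio-app
spec:
  selector:
    app: gradio-app          # hypothetical pod label
  sessionAffinity: ClientIP  # route each client IP to the same pod
  ports:
    - port: 80
      targetPort: 7860       # Gradio's default port inside the container

On AWS ECS, the equivalent is enabling stickiness on the load balancer's target group.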
Reverse Proxy: When using a reverse proxy like Nginx, appropriate configuration is necessary. Gradio provides a guide for this.
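As a rough sketch of such a configuration, assuming the container is reachable locally on port 7860 (example.com is a placeholder; see Gradio's guide for the full setup):

server {
    listen 80;
    server_name example.com;  # placeholder domain

    location / {
        proxy_pass http://127.0.0.1:7860/;
        proxy_set_header Host $host;
        # Gradio streams events over WebSockets/SSE, so forward upgrade headers
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}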
Hugging Face Spaces offers a simple way to host Gradio apps permanently and free of charge. Deployment is done either through the Hugging Face web interface or programmatically via the huggingface_hub library.
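A minimal sketch of the programmatic route, assuming a valid login (e.g. via huggingface-cli login); the Space ID your-username/demo-space is a placeholder:

from huggingface_hub import HfApi

api = HfApi()
# Create a new Space configured for the Gradio SDK
api.create_repo(repo_id="your-username/demo-space", repo_type="space", space_sdk="gradio")
# Upload the app file; Spaces runs app.py automatically
api.upload_file(
    path_or_fileobj="app.py",
    path_in_repo="app.py",
    repo_id="your-username/demo-space",
    repo_type="space",
)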
Additionally, Gradio allows direct integration with Hugging Face Inference Endpoints, letting developers create demos for models on the Hub without installing the models locally. Specifying the model name and the src="models" argument in gr.load() is sufficient.
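As an illustration, a minimal sketch (the model ID "gpt2" is just an example):

import gradio as gr

# Build a demo straight from a Hub model; inference runs on Hugging Face's side
demo = gr.load("gpt2", src="models")
demo.launch()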
The combination of Gradio and Docker gives developers more flexibility and control when deploying AI applications. Docker enables execution on one's own hardware and integration into existing infrastructure. Together with Hugging Face Spaces, this forms a powerful ecosystem for developing and deploying machine learning models.