Self-Hosting DeepSeek AI Models on AWS EC2 with Docker, Ollama, and Nginx

DeepSeek’s AI models have gained significant attention, but privacy concerns regarding their web UI have led many to seek alternative solutions. This guide demonstrates how to deploy a DeepSeek model on an AWS EC2 instance using Docker, Ollama, and Nginx, giving you complete control over your AI environment.

Why Self-Host DeepSeek?

Self-hosting offers enhanced privacy and control over your AI environment, eliminating concerns associated with using third-party web UIs.

Prerequisites

  • An AWS account
  • Basic knowledge of Docker, Ollama, and Nginx

Steps

  1. Launch an EC2 Instance:
    • Select Amazon Linux as the operating system.
    • Choose an appropriate instance type. A GPU-backed g4dn instance is ideal for inference, but a memory-optimized r5.xlarge works as a CPU-only alternative for the smaller models.
    • Create or select an existing key pair.
    • Allow HTTPS and HTTP traffic.
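    • If you prefer the AWS CLI to the console, the launch can be scripted. This is a minimal sketch; the AMI ID, key-pair name, and security-group ID are placeholders you must replace with your own values:
      ```bash
      # Launch a memory-optimized r5.xlarge running Amazon Linux
      # (ami-xxxxxxxx, my-key, and sg-xxxxxxxx are placeholders)
      aws ec2 run-instances \
        --image-id ami-xxxxxxxx \
        --instance-type r5.xlarge \
        --key-name my-key \
        --security-group-ids sg-xxxxxxxx \
        --count 1
      ```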
  2. Connect to Your EC2 Server:
    • Use SSH to connect to your EC2 instance.
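    • For example, with a key pair named my-key.pem (a placeholder) and the instance's public IP taken from the EC2 console; ec2-user is the default user on Amazon Linux:
      ```bash
      # Restrict key permissions, then connect (replace the IP with your own)
      chmod 400 my-key.pem
      ssh -i my-key.pem ec2-user@203.0.113.10
      ```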
  3. Install Docker:
    • Update the installed packages: sudo yum update -y
    • Install Docker: sudo yum install docker -y
    • Add your user to the docker group: sudo usermod -aG docker $USER
    • Enable and start the Docker service:
      • sudo systemctl enable docker
      • sudo systemctl start docker
    • Change the permissions of the Docker socket so the group change takes effect without logging out and back in: sudo chmod 666 /var/run/docker.sock
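    • To confirm Docker is installed and the daemon is reachable before moving on, a quick sanity check:
      ```bash
      # Should print client and server versions without a permissions error
      docker version
      # Optional end-to-end test using the standard hello-world image
      docker run --rm hello-world
      ```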
  4. Deploy Ollama with Docker:
    • Run the following command to deploy Ollama in detached mode:
      ```bash
      docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
      ```
    • Verify that the Ollama container is running: docker ps
    • Check that Ollama responds: curl http://localhost:11434 (it should reply with "Ollama is running")
  5. Install DeepSeek Models:
    • Use Ollama to install the desired DeepSeek models. Since Ollama runs inside a container, the pull commands go through docker exec. For example, to install the 1.5B and 7B Qwen-distilled DeepSeek-R1 models:
      ```bash
      docker exec -it ollama ollama pull deepseek-r1:1.5b
      docker exec -it ollama ollama pull deepseek-r1:7b
      ```
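    • To verify the downloads completed, list the models Ollama now has available (assuming the container is named ollama as in step 4):
      ```bash
      # Both deepseek-r1 tags should appear in the output
      docker exec -it ollama ollama list
      ```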
  6. Run the Ollama Web UI:
    • Deploy the Ollama web UI (the project has since been renamed Open WebUI) using Docker:
      ```bash
      docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway -v ollama-webui:/app/data --name ollama-webui ghcr.io/ollama-webui/ollama-webui:latest
      ```
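    • Before putting Nginx in front of it, you can confirm the web UI is serving on port 3000 (a simple header check; the response details may vary by version):
      ```bash
      # Expect an HTTP 200 from the web UI
      curl -I http://localhost:3000
      ```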
  7. Configure Nginx:
    • Install Nginx: sudo yum install nginx -y
    • Edit the Nginx configuration file (/etc/nginx/nginx.conf) and add the following server block within the http block:
      ```nginx
      server {
          listen 80;
          server_name your_domain.com;  # Replace with your domain or public IP

          location / {
              proxy_pass http://localhost:3000;
              proxy_set_header Host $host;
              proxy_set_header X-Real-IP $remote_addr;
          }
      }
      ```
    • Test the Nginx configuration: sudo nginx -t
    • Enable and start Nginx:
      • sudo systemctl enable nginx
      • sudo systemctl start nginx
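    • With Nginx running, a quick check from the instance that the proxy forwards requests to the web UI:
      ```bash
      # Requests to port 80 should now reach the web UI through Nginx
      curl -I http://localhost
      ```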
  8. Testing Your Deployment:
    • Open your EC2 instance's public IP address in a web browser, making sure the URL starts with http:// rather than https:// (this setup does not configure a TLS certificate).
    • Sign up on the Ollama web UI.
    • Interact with the installed DeepSeek models through the chat interface, or query them directly over the API, as shown below.
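
The web UI is the most convenient interface, but you can also script against the models through Ollama's REST API. Here is a minimal example that queries the 1.5B model installed in step 5; the prompt is arbitrary:

```bash
# Ask the model a question via Ollama's /api/generate endpoint
curl http://localhost:11434/api/generate -d '{
  "model": "deepseek-r1:1.5b",
  "prompt": "Explain what a reverse proxy does in one sentence.",
  "stream": false
}'
```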

Conclusion

By following these steps, you can successfully deploy DeepSeek AI models on an AWS EC2 instance, ensuring privacy and complete control over your AI environment. While this setup can be further optimized with higher-parameter models and GPU-enabled instances, it provides a solid foundation for self-hosting DeepSeek.
