
🚀 Ultimate Guide: Deploying DeepSeek 14B on Google Cloud (T4 GPU)

This guide provides a detailed, step-by-step walkthrough for deploying DeepSeek 14B on a Google Cloud VM with an NVIDIA T4 GPU, from project setup to serving the model through a Flask API. It includes troubleshooting tips and performance notes for a smooth deployment.


📌 Step 1: Google Cloud Setup

1.1 Sign Up for Google Cloud & Enable Billing

  1. Go to Google Cloud Console and create an account.
  2. Set Up Billing:
    • Navigate to Billing → Create Billing Account.
    • Attach it to your new project.
    • Google provides free credits for new users.

1.2 Create a New Project

  1. In Google Cloud Console, click Select a Project → New Project.
  2. Name it (e.g., deepseek-ai).
  3. Click Create.
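
If you prefer the command line, the same project can be created with gcloud. Note that project IDs must be globally unique, so deepseek-ai may need a suffix:

gcloud projects create deepseek-ai
gcloud config set project deepseek-ai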

1.3 Enable Compute Engine API

  1. Navigate to APIs & Services → Enable APIs.
  2. Search for Compute Engine API and enable it.
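
The API can also be enabled from the command line:

gcloud services enable compute.googleapis.com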

📌 Step 2: Create Google Cloud VM with T4 GPU

2.1 Create a Virtual Machine Instance

  1. Go to Compute Engine → VM Instances → Create Instance.
  2. Configure instance settings:
    • Name: deepseek-t4
    • Region: us-central1
    • Machine Type: n1-standard-8 (8 vCPUs, 30GB RAM)
    • GPU: 1 x NVIDIA T4
    • Boot Disk: Ubuntu 22.04 LTS, 100GB SSD
  3. Click Create and wait for the VM to start.
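
Equivalently, the instance can be created with a single gcloud command. Two assumptions here: the zone us-central1-a (any us-central1 zone with T4 availability works), and that your project has GPU quota — new projects often start with a GPU quota of 0 and need a quota-increase request first. GPU instances also require a TERMINATE maintenance policy:

gcloud compute instances create deepseek-t4 \
    --zone=us-central1-a \
    --machine-type=n1-standard-8 \
    --accelerator=type=nvidia-tesla-t4,count=1 \
    --image-family=ubuntu-2204-lts \
    --image-project=ubuntu-os-cloud \
    --boot-disk-size=100GB \
    --boot-disk-type=pd-ssd \
    --maintenance-policy=TERMINATE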

2.2 Connect to the VM via SSH

gcloud compute ssh deepseek-t4 --zone=us-central1-a

📌 Step 3: Install Required Dependencies

3.1 Update the System & Install the NVIDIA Driver and CUDA Toolkit


sudo apt update && sudo apt upgrade -y
sudo apt install -y nvidia-driver-535   # the driver must be installed before nvidia-smi works
sudo reboot                             # reconnect via SSH after the reboot
sudo apt install -y nvidia-cuda-toolkit
nvidia-smi                              # should show the T4 and the driver version

3.2 Install Python & Virtual Environment


sudo apt install -y python3 python3-pip python3-venv
python3 -m venv deepseek-env
source deepseek-env/bin/activate

3.3 Install Required Python Packages


pip install torch torchvision transformers accelerate flask bitsandbytes
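
The script in Step 4 reads a Hugging Face access token from the environment. The DeepSeek-R1 distill models are public, so a token is not strictly required, but if you use one, export it before running the script (the value below is a placeholder):

export HUGGINGFACE_TOKEN=hf_your_token_here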

📌 Step 4: Load DeepSeek 14B

4.1 Full Python Code for Running the Model

A 14B-parameter model in float16 needs roughly 28GB of VRAM, more than the T4's 16GB, so the script below loads the model in 4-bit precision with bitsandbytes (installed in Step 3.3). Expect the first run to download around 30GB of weights, which is why the VM was provisioned with a 100GB disk.


import os

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_name = "deepseek-ai/DeepSeek-R1-Distill-Qwen-14B"
huggingface_token = os.getenv("HUGGINGFACE_TOKEN")  # optional for this public model

# DeepSeek 14B in float16 needs ~28GB of VRAM, more than the T4's 16GB,
# so load the weights in 4-bit via bitsandbytes instead.
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(model_name, token=huggingface_token)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    token=huggingface_token,
    device_map="auto",
    quantization_config=quant_config,
)

prompt = "Explain artificial intelligence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=200,  # counts only newly generated tokens, unlike max_length
    pad_token_id=tokenizer.eos_token_id,
)

response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print("\nResponse:", response)
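
📌 Step 5: Serve the Model via a Flask API

To expose the model over HTTP with Flask (installed in Step 3.3), append a small server to the Step 4 script so that model and tokenizer are already defined. This is a minimal sketch: the /generate route, port 5000, and the app.py filename are illustrative choices, not fixed by anything above.

# app.py — assumes the model-loading code from Step 4 appears above this block
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/generate", methods=["POST"])
def generate():
    prompt = request.json.get("prompt", "")
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(
        **inputs,
        max_new_tokens=200,
        pad_token_id=tokenizer.eos_token_id,
    )
    text = tokenizer.decode(outputs[0], skip_special_tokens=True)
    return jsonify({"response": text})

if __name__ == "__main__":
    # 0.0.0.0 listens on all interfaces; open port 5000 in a GCP firewall rule
    # before calling the API from outside the VM
    app.run(host="0.0.0.0", port=5000)

Test it from your own machine (replace <VM_EXTERNAL_IP> with the instance's external IP):

curl -X POST http://<VM_EXTERNAL_IP>:5000/generate \
    -H "Content-Type: application/json" \
    -d '{"prompt": "Explain artificial intelligence."}'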
✅ Your AI chatbot is now live! 🎉 Next steps: put the endpoint behind HTTPS and a production WSGI server (e.g., gunicorn) before exposing it publicly, or scale up to a larger GPU for faster responses.