[TRACK4_001] Deployment Fundamentals
📚 Deployment Fundamentals
[VIDEO-016] Deployment Fundamentals
Track: 4 - Production Mastery Module: 1 Duration: 12 minutes Level requirement: 46 XP reward: 400 XP
---
Scene 1: Welcome to Production (0:00-1:30)
[Visual]: Multi-Agent Architect badge evolving into production environment [Animation]: Development environment transforming into production infrastructure
[Audio/Script]:
"Congratulations, Multi-Agent Architect. You can build autonomous AI systems.>
But there's a world of difference between 'it works on my machine' and 'it runs in production.'>
Production means:
- 24/7 availability
- Handling thousands of requests
- Security and compliance
- Monitoring and alerting
- Cost optimization>
Welcome to Track 4: Production Mastery. The final frontier."
[Lower third]: "Track 4: Production Mastery | Level 46"
---
Scene 2: Development vs Production (1:30-3:30)
[Visual]: Side-by-side comparison [Animation]: Dev environment disasters vs production resilience
[Audio/Script]:
"Let's understand what changes in production:"
[Comparison Table]:
| Aspect | Development | Production |
|------------------|--------------------|--------------------|
| Errors | You see them | Users see them |
| Failures | Restart and retry | Must auto-recover |
| Scale | Single user | Thousands concurrent|
| Security | Trust yourself | Trust nobody |
| Costs | Minimal | Real money |
| Uptime | When you're working | Always |
| Data | Test data | Real, sensitive data|
| Debugging | Print statements | Structured logging |
[Audio/Script]:
"In development, mistakes are learning opportunities.>
In production, mistakes cost money, reputation, and trust.>
The good news: Everything we learned applies. We just need to add resilience."
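The "print statements vs structured logging" row above is worth making concrete. A minimal sketch of structured JSON logging using only the standard library (the `JsonFormatter` class and `"agent"` logger name are illustrative, not part of any framework):

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON line, easy for log pipelines to parse."""
    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })

handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("agent")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info("task completed")  # emits {"level": "INFO", "logger": "agent", "message": "task completed"}
```

In production you would typically reach for a library like `structlog` or `python-json-logger`, but the principle is the same: one machine-parseable record per event, not free-form prints.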
---
Scene 3: Deployment Architecture (3:30-6:00)
[Visual]: Production architecture diagram [Animation]: Components connecting
[Audio/Script]:
"A production AI system has multiple layers:"
[Diagram]:
┌─────────────────────────────────────────────────────────────────┐
│ LOAD BALANCER │
│ (nginx, HAProxy) │
└─────────────────────────┬───────────────────────────────────────┘
│
┌─────────────────────────▼───────────────────────────────────────┐
│ API GATEWAY │
│ (Authentication, Rate Limiting) │
└─────────────────────────┬───────────────────────────────────────┘
│
┌─────────────────┼─────────────────┐
│ │ │
┌───────▼──────┐ ┌───────▼──────┐ ┌───────▼──────┐
│ Agent 1 │ │ Agent 2 │ │ Agent 3 │
│ (Replica 1) │ │ (Replica 2) │ │ (Replica 3) │
└───────┬──────┘ └───────┬──────┘ └───────┬──────┘
│ │ │
└─────────────────┼─────────────────┘
│
┌─────────────────────────▼───────────────────────────────────────┐
│ SHARED SERVICES │
│ (Database, Cache, Message Queue, Storage) │
└─────────────────────────────────────────────────────────────────┘
[Audio/Script]:
"Key components:>
Load Balancer: Distributes traffic across replicas
API Gateway: Handles auth, rate limiting, routing
Agent Replicas: Multiple copies for high availability
Shared Services: Stateless agents, stateful services"
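Load balancers like nginx and HAProxy implement traffic distribution natively, but the core idea is simple enough to sketch. A toy round-robin selector (the replica addresses are hypothetical placeholders):

```python
from itertools import cycle

# Hypothetical replica addresses, matching the three agents in the diagram
replicas = ["agent-1:8000", "agent-2:8000", "agent-3:8000"]
next_replica = cycle(replicas)

def route_request() -> str:
    """Pick the next replica in strict round-robin order."""
    return next(next_replica)

assignments = [route_request() for _ in range(6)]
# Each replica receives every third request, so load spreads evenly
```

Real load balancers add health-aware routing on top of this: a replica that fails its health check is removed from rotation, which is exactly why the service wrapper later in this video exposes `/health`.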
---
Scene 4: Configuration Management (6:00-8:00)
[Visual]: Configuration files and environment variables [Animation]: Configs being loaded and validated
[Audio/Script]:
"Never hardcode. Use configuration management."
[Demo - Configuration]:
from dataclasses import dataclass
from typing import Optional
import os
import yaml
from pathlib import Path

@dataclass
class DatabaseConfig:
    host: str
    port: int
    database: str
    user: str
    password: str  # From environment only
    pool_size: int = 10
    timeout_seconds: int = 30

@dataclass
class AgentConfig:
    model: str
    max_tokens: int
    temperature: float
    timeout_seconds: int
    max_retries: int

@dataclass
class ProductionConfig:
    """Complete production configuration"""
    # Environment
    environment: str  # development, staging, production
    # Database
    database: DatabaseConfig
    # AI Agents
    agent: AgentConfig
    # API
    api_port: int
    api_workers: int
    rate_limit_per_minute: int
    # Monitoring
    log_level: str
    metrics_enabled: bool
    tracing_enabled: bool

    @classmethod
    def load(cls, config_path: Optional[str] = None) -> "ProductionConfig":
        """Load configuration from file and environment"""
        # Load base config from file
        if config_path and Path(config_path).exists():
            with open(config_path) as f:
                file_config = yaml.safe_load(f) or {}
        else:
            file_config = {}
        db_file = file_config.get("database", {})
        # Environment overrides
        env = os.environ.get("ENVIRONMENT", "development")
        return cls(
            environment=env,
            database=DatabaseConfig(
                host=os.environ.get("DB_HOST", db_file.get("host", "localhost")),
                port=int(os.environ.get("DB_PORT", db_file.get("port", 5432))),
                database=os.environ.get("DB_NAME", db_file.get("database", "agents")),
                user=os.environ.get("DB_USER", db_file.get("user", "postgres")),
                password=os.environ["DB_PASSWORD"],  # Required from environment
                pool_size=int(os.environ.get("DB_POOL_SIZE", 10)),
            ),
            agent=AgentConfig(
                model=os.environ.get("AGENT_MODEL", "claude-sonnet-4-20250514"),
                max_tokens=int(os.environ.get("AGENT_MAX_TOKENS", 4096)),
                temperature=float(os.environ.get("AGENT_TEMPERATURE", 0.7)),
                timeout_seconds=int(os.environ.get("AGENT_TIMEOUT", 120)),
                max_retries=int(os.environ.get("AGENT_RETRIES", 3)),
            ),
            api_port=int(os.environ.get("API_PORT", 8000)),
            api_workers=int(os.environ.get("API_WORKERS", 4)),
            rate_limit_per_minute=int(os.environ.get("RATE_LIMIT", 100)),
            log_level=os.environ.get("LOG_LEVEL", "INFO"),
            metrics_enabled=os.environ.get("METRICS_ENABLED", "true").lower() == "true",
            tracing_enabled=os.environ.get("TRACING_ENABLED", "false").lower() == "true",
        )
# config.yaml example
"""
database:
  host: db.example.com
  port: 5432
  database: agents_prod
  user: agent_service

agent:
  model: claude-sonnet-4-20250514
  max_tokens: 4096
  temperature: 0.7
  timeout_seconds: 120
  max_retries: 3

api:
  port: 8000
  workers: 4
  rate_limit_per_minute: 100

logging:
  level: INFO
  format: json

monitoring:
  metrics_enabled: true
  tracing_enabled: true
"""
---
Scene 5: Service Wrapper (8:00-10:00)
[Visual]: Agent wrapped in production service [Animation]: Service handling requests
[Audio/Script]:
"Wrap your agents in a production-ready service."
[Demo - Service Wrapper]:
from fastapi import FastAPI, HTTPException, Depends
from fastapi.middleware.cors import CORSMiddleware
from pydantic import BaseModel
from typing import Dict, Any, Optional
import asyncio
import time
import uuid
import uvicorn

# Request/Response models
class TaskRequest(BaseModel):
    task: str
    context: Optional[Dict[str, Any]] = None
    priority: str = "normal"
    timeout_seconds: int = 120

class TaskResponse(BaseModel):
    task_id: str
    status: str
    result: Optional[Dict[str, Any]] = None
    error: Optional[str] = None
    duration_ms: int

# Service
class AgentService:
    """Production-ready agent service"""

    def __init__(self, config: ProductionConfig):
        self.config = config
        self.app = FastAPI(
            title="AI Agent Service",
            version="1.0.0",
            docs_url="/docs" if config.environment != "production" else None
        )
        self._setup_middleware()
        self._setup_routes()

    def _setup_middleware(self):
        """Configure middleware"""
        # CORS
        self.app.add_middleware(
            CORSMiddleware,
            allow_origins=["*"] if self.config.environment == "development" else [],
            allow_methods=["GET", "POST"],
            allow_headers=["*"],
        )

    def _setup_routes(self):
        """Configure API routes"""

        @self.app.get("/health")
        async def health_check():
            """Health check endpoint for load balancers"""
            return {"status": "healthy", "environment": self.config.environment}

        @self.app.get("/ready")
        async def readiness_check():
            """Readiness check - are we ready to serve traffic?"""
            # Check dependencies
            db_ok = await self._check_database()
            return {
                "ready": db_ok,
                "checks": {"database": db_ok}
            }

        @self.app.post("/api/v1/task", response_model=TaskResponse)
        async def execute_task(request: TaskRequest):
            """Execute an agent task"""
            task_id = str(uuid.uuid4())[:8]
            start_time = time.time()
            try:
                # Apply timeout
                result = await asyncio.wait_for(
                    self._execute_task(request.task, request.context or {}),
                    timeout=request.timeout_seconds
                )
                duration_ms = int((time.time() - start_time) * 1000)
                return TaskResponse(
                    task_id=task_id,
                    status="completed",
                    result=result,
                    duration_ms=duration_ms
                )
            except asyncio.TimeoutError:
                duration_ms = int((time.time() - start_time) * 1000)
                return TaskResponse(
                    task_id=task_id,
                    status="timeout",
                    error=f"Task exceeded {request.timeout_seconds}s timeout",
                    duration_ms=duration_ms
                )
            except Exception as e:
                duration_ms = int((time.time() - start_time) * 1000)
                return TaskResponse(
                    task_id=task_id,
                    status="error",
                    error=str(e),
                    duration_ms=duration_ms
                )

    async def _execute_task(self, task: str, context: Dict) -> Dict:
        """Execute the agent task"""
        # Your agent execution logic here
        return {"output": "Task completed"}

    async def _check_database(self) -> bool:
        """Check database connectivity"""
        try:
            # Perform a simple query, e.g. SELECT 1
            return True
        except Exception:
            return False

    def run(self):
        """Run the service"""
        # Note: uvicorn only honors workers > 1 when given an import string
        # like "main:app"; passing the app object runs a single worker.
        uvicorn.run(
            self.app,
            host="0.0.0.0",
            port=self.config.api_port,
            workers=self.config.api_workers,
            log_level=self.config.log_level.lower()
        )

if __name__ == "__main__":
    config = ProductionConfig.load("config.yaml")
    service = AgentService(config)
    service.run()
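The timeout handling in `execute_task` above is the most important resilience pattern in the wrapper, and it works standalone. A self-contained sketch of the same `asyncio.wait_for` pattern (the `slow_task` coroutine is a stand-in for a real agent call):

```python
import asyncio

async def slow_task() -> str:
    """Stand-in for an agent call that takes too long."""
    await asyncio.sleep(1.0)
    return "done"

async def run_with_timeout(timeout_seconds: float) -> str:
    """Return the task result, or "timeout" if the deadline passes first."""
    try:
        return await asyncio.wait_for(slow_task(), timeout=timeout_seconds)
    except asyncio.TimeoutError:
        # wait_for cancels the underlying task before raising
        return "timeout"

status = asyncio.run(run_with_timeout(0.05))
# status == "timeout": the 1-second task was cancelled after 50 ms
```

The key property: `asyncio.wait_for` cancels the pending coroutine when the deadline expires, so a stuck agent call cannot pin a worker forever. Without it, one hung LLM request can quietly consume capacity until the whole service degrades.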
---
Scene 6: Process Management (10:00-11:30)
[Visual]: Process supervisor managing services [Animation]: Processes being monitored and restarted
[Audio/Script]:
"Production services need process management."
[Demo - Systemd Service]:
# /etc/systemd/system/agent-service.service
[Unit]
Description=AI Agent Service
After=network.target postgresql.service
Wants=postgresql.service

[Service]
Type=simple
User=agent
Group=agent
WorkingDirectory=/opt/agent-service

# Environment
EnvironmentFile=/opt/agent-service/.env
Environment=PYTHONUNBUFFERED=1

# Execution
ExecStart=/opt/agent-service/venv/bin/python -m uvicorn main:app --host 0.0.0.0 --port 8000 --workers 4
ExecReload=/bin/kill -HUP $MAINPID

# Restart policy
Restart=always
RestartSec=5

# Resource limits
MemoryMax=4G
CPUQuota=200%

# Logging
StandardOutput=journal
StandardError=journal
SyslogIdentifier=agent-service

[Install]
WantedBy=multi-user.target
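One detail worth knowing: when you run `systemctl stop` or `systemctl restart`, systemd sends the service SIGTERM and waits before escalating to SIGKILL. A well-behaved service catches SIGTERM and drains in-flight work instead of dying mid-request. A minimal stdlib sketch of the pattern (the `shutting_down` flag and handler name are illustrative; in the FastAPI service, uvicorn installs similar handlers for you):

```python
import signal

shutting_down = False

def handle_sigterm(signum, frame):
    """Flag the service for graceful shutdown instead of exiting mid-request."""
    global shutting_down
    shutting_down = True

signal.signal(signal.SIGTERM, handle_sigterm)

# Simulate systemd stopping the unit
signal.raise_signal(signal.SIGTERM)
assert shutting_down  # in-flight requests can now drain before exit
```

In a real service, the request loop checks this flag: stop accepting new work, finish what is in flight, then exit cleanly so `Restart=always` brings up a fresh process.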
[Demo - Commands]:
# Enable and start service
sudo systemctl enable agent-service
sudo systemctl start agent-service

# Check status
sudo systemctl status agent-service

# View logs
sudo journalctl -u agent-service -f

# Restart on deploy
sudo systemctl restart agent-service
---
Scene 7: Challenge Time (11:30-12:00)
[Visual]: Challenge specification [Animation]: XP reward display
[Audio/Script]:
"Your challenge: Deploy your multi-agent system as a production service.>
Requirements:
1. Configuration management with environment variables
2. FastAPI service wrapper with health checks
3. Systemd service file for process management
4. Basic API endpoint for task execution>
Complete this for 700 XP and the 'Deployment Ready' badge.>
Next: Scaling your AI systems for massive load."
---
Post-Video Challenge
Challenge ID: TRACK4_001_CHALLENGE Type: Code + Deployment Instructions:
Task 1: Create configuration system
claude "Create a production configuration system:
1. config.py with dataclass configs
2. Load from YAML + environment overrides
3. Validate required settings
4. Support dev/staging/prod environments"

Task 2: Create service wrapper
claude "Create a FastAPI service wrapper:
1. /health and /ready endpoints
2. /api/v1/task POST endpoint
3. Request/response models
4. Timeout handling
5. Error responses"

Task 3: Create systemd service
claude "Create systemd service file:
1. Service definition
2. Environment file support
3. Restart policy
4. Resource limits
5. Logging to journal"

Task 4: Test deployment
# Start the service
sudo systemctl start agent-service

# Test health endpoint
curl http://localhost:8000/health

# Test task endpoint
curl -X POST http://localhost:8000/api/v1/task \
  -H "Content-Type: application/json" \
  -d '{"task": "Hello, world!"}'

Rewards:
- XP: 700 (400 base + 300 challenge)
- Achievement: "Deployment Ready"
SEO Metadata
Alt-text: Production deployment fundamentals for AI agents - configuration management, service wrappers, process management, systemd deployment.
Tags: production AI, deployment, configuration management, FastAPI, systemd, AI service, production environment
Keywords: deploy ai agents, production ai service, ai configuration management, fastapi ai, systemd ai service, production deployment