📚 Deployment Fundamentals

🎯 Level 46+ ⭐ 700 XP ⏱️ 12 min

[VIDEO-016] Deployment Fundamentals

Track: 4 - Production Mastery | Module: 1 | Duration: 12 minutes | Level requirement: 46 | XP reward: 400 XP

---

Scene 1: Welcome to Production (0:00-1:30)

[Visual]: Multi-Agent Architect badge evolving into production environment [Animation]: Development environment transforming into production infrastructure

[Audio/Script]:

"Congratulations, Multi-Agent Architect. You can build autonomous AI systems.
>
But there's a world of difference between 'it works on my machine' and 'it runs in production.'
>
Production means:
- 24/7 availability
- Handling thousands of requests
- Security and compliance
- Monitoring and alerting
- Cost optimization
>
Welcome to Track 4: Production Mastery. The final frontier."

[Lower third]: "Track 4: Production Mastery | Level 46"

---

Scene 2: Development vs Production (1:30-3:30)

[Visual]: Side-by-side comparison [Animation]: Dev environment disasters vs production resilience

[Audio/Script]:

"Let's understand what changes in production:"

[Comparison Table]:

| Aspect           | Development         | Production          |
|------------------|--------------------|--------------------|
| Errors           | You see them        | Users see them      |
| Failures         | Restart and retry   | Must auto-recover   |
| Scale            | Single user         | Thousands concurrent|
| Security         | Trust yourself      | Trust nobody        |
| Costs            | Minimal             | Real money          |
| Uptime           | When you're working | Always              |
| Data             | Test data           | Real, sensitive data|
| Debugging        | Print statements    | Structured logging  |

[Audio/Script]:

"In development, mistakes are learning opportunities.
>
In production, mistakes cost money, reputation, and trust.
>
The good news: Everything we learned applies. We just need to add resilience."
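The last row of the table is worth making concrete: in production, print statements give way to structured logging that machines can search and alert on. A minimal sketch using the standard `logging` module with a hand-rolled JSON formatter (the formatter class is illustrative; many teams use a library such as python-json-logger instead):

import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON line (illustrative)."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            "time": self.formatTime(record),
        })


handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("agent-service")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info("task completed")  # -> {"level": "INFO", "logger": "agent-service", "message": "task completed", ...}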

---

Scene 3: Deployment Architecture (3:30-6:00)

[Visual]: Production architecture diagram [Animation]: Components connecting

[Audio/Script]:

"A production AI system has multiple layers:"

[Diagram]:

┌─────────────────────────────────────────────────────────────────┐
│                      LOAD BALANCER                               │
│                    (nginx, HAProxy)                              │
└─────────────────────────┬───────────────────────────────────────┘
                          │
┌─────────────────────────▼───────────────────────────────────────┐
│                      API GATEWAY                                 │
│              (Authentication, Rate Limiting)                     │
└─────────────────────────┬───────────────────────────────────────┘
                          │
        ┌─────────────────┼─────────────────┐
        │                 │                 │
┌───────▼──────┐  ┌───────▼──────┐  ┌───────▼──────┐
│   Agent 1    │  │   Agent 2    │  │   Agent 3    │
│  (Replica 1) │  │  (Replica 2) │  │  (Replica 3) │
└───────┬──────┘  └───────┬──────┘  └───────┬──────┘
        │                 │                 │
        └─────────────────┼─────────────────┘
                          │
┌─────────────────────────▼───────────────────────────────────────┐
│                    SHARED SERVICES                               │
│         (Database, Cache, Message Queue, Storage)               │
└─────────────────────────────────────────────────────────────────┘

[Audio/Script]:

"Key components:
>
Load Balancer: Distributes traffic across replicas
API Gateway: Handles auth, rate limiting, routing
Agent Replicas: Multiple copies for high availability
Shared Services: Agents stay stateless; state lives in the database, cache, and queue"
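To make the top of the diagram concrete, here is a minimal load-balancer sketch for nginx (named in the diagram), assuming three replicas behind one upstream. The file path, upstream name, IP addresses, and timeout are all illustrative placeholders:

# Hypothetical /etc/nginx/conf.d/agent.conf
upstream agent_replicas {
    least_conn;                 # send new requests to the least-busy replica
    server 10.0.0.11:8000;
    server 10.0.0.12:8000;
    server 10.0.0.13:8000;
}

server {
    listen 80;

    location / {
        proxy_pass http://agent_replicas;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_read_timeout 130s;   # allow long-running agent tasks
    }
}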

---

Scene 4: Configuration Management (6:00-8:00)

[Visual]: Configuration files and environment variables [Animation]: Configs being loaded and validated

[Audio/Script]:

"Never hardcode. Use configuration management."

[Demo - Configuration]:

from dataclasses import dataclass
from typing import Optional
import os
import yaml
from pathlib import Path

@dataclass
class DatabaseConfig:
    host: str
    port: int
    database: str
    user: str
    password: str  # From environment only
    pool_size: int = 10
    timeout_seconds: int = 30


@dataclass
class AgentConfig:
    model: str
    max_tokens: int
    temperature: float
    timeout_seconds: int
    max_retries: int


@dataclass
class ProductionConfig:
    """Complete production configuration"""

    # Environment
    environment: str  # development, staging, production

    # Database
    database: DatabaseConfig

    # AI Agents
    agent: AgentConfig

    # API
    api_port: int
    api_workers: int
    rate_limit_per_minute: int

    # Monitoring
    log_level: str
    metrics_enabled: bool
    tracing_enabled: bool

    @classmethod
    def load(cls, config_path: Optional[str] = None) -> "ProductionConfig":
        """Load configuration from file and environment"""

        # Load base config from file
        if config_path and Path(config_path).exists():
            with open(config_path) as f:
                file_config = yaml.safe_load(f)
        else:
            file_config = {}

        # Environment overrides
        env = os.environ.get("ENVIRONMENT", "development")

        return cls(
            environment=env,
            database=DatabaseConfig(
                host=os.environ.get("DB_HOST", file_config.get("database", {}).get("host", "localhost")),
                port=int(os.environ.get("DB_PORT", file_config.get("database", {}).get("port", 5432))),
                database=os.environ.get("DB_NAME", file_config.get("database", {}).get("database", "agents")),
                user=os.environ.get("DB_USER", file_config.get("database", {}).get("user", "postgres")),
                password=os.environ["DB_PASSWORD"],  # Required from environment
                pool_size=int(os.environ.get("DB_POOL_SIZE", 10)),
            ),
            agent=AgentConfig(
                model=os.environ.get("AGENT_MODEL", "claude-sonnet-4-20250514"),
                max_tokens=int(os.environ.get("AGENT_MAX_TOKENS", 4096)),
                temperature=float(os.environ.get("AGENT_TEMPERATURE", 0.7)),
                timeout_seconds=int(os.environ.get("AGENT_TIMEOUT", 120)),
                max_retries=int(os.environ.get("AGENT_RETRIES", 3)),
            ),
            api_port=int(os.environ.get("API_PORT", 8000)),
            api_workers=int(os.environ.get("API_WORKERS", 4)),
            rate_limit_per_minute=int(os.environ.get("RATE_LIMIT", 100)),
            log_level=os.environ.get("LOG_LEVEL", "INFO"),
            metrics_enabled=os.environ.get("METRICS_ENABLED", "true").lower() == "true",
            tracing_enabled=os.environ.get("TRACING_ENABLED", "false").lower() == "true",
        )

[Example - config.yaml]:

""" database: host: db.example.com port: 5432 database: agents_prod user: agent_service

agent: model: claude-sonnet-4-20250514 max_tokens: 4096 temperature: 0.7 timeout_seconds: 120 max_retries: 3

api: port: 8000 workers: 4 rate_limit_per_minute: 100

logging: level: INFO format: json

monitoring: metrics_enabled: true tracing_enabled: true """
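Loading is only half the job: bad settings should fail at startup, not at request time. A minimal validation sketch, assuming a standalone `validate_config` helper; the function name and the specific checks are illustrative, not part of the config code above:

class ConfigError(ValueError):
    """Raised when the loaded configuration is invalid."""


def validate_config(config: "ProductionConfig") -> None:
    """Fail fast on obviously bad settings (checks are illustrative)."""
    if config.environment not in ("development", "staging", "production"):
        raise ConfigError(f"Unknown environment: {config.environment!r}")
    if not 1 <= config.api_port <= 65535:
        raise ConfigError(f"api_port out of range: {config.api_port}")
    if not 0.0 <= config.agent.temperature <= 1.0:
        raise ConfigError(f"agent.temperature must be between 0 and 1, got {config.agent.temperature}")
    if config.database.pool_size < 1:
        raise ConfigError("database.pool_size must be at least 1")


# Usage at startup:
# config = ProductionConfig.load("config.yaml")
# validate_config(config)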

---

Scene 5: Service Wrapper (8:00-10:00)

[Visual]: Agent wrapped in production service [Animation]: Service handling requests

[Audio/Script]:

"Wrap your agents in a production-ready service."

[Demo - Service Wrapper]:

from fastapi import FastAPI, HTTPException, Depends
from fastapi.middleware.cors import CORSMiddleware
from pydantic import BaseModel
from typing import Dict, Any, Optional
import uvicorn
import asyncio

# Request/Response models

class TaskRequest(BaseModel):
    task: str
    context: Optional[Dict[str, Any]] = None
    priority: str = "normal"
    timeout_seconds: int = 120


class TaskResponse(BaseModel):
    task_id: str
    status: str
    result: Optional[Dict[str, Any]] = None
    error: Optional[str] = None
    duration_ms: int


# Service

class AgentService:
    """Production-ready agent service"""

    def __init__(self, config: ProductionConfig):
        self.config = config
        self.app = FastAPI(
            title="AI Agent Service",
            version="1.0.0",
            docs_url="/docs" if config.environment != "production" else None,
        )
        self._setup_middleware()
        self._setup_routes()

    def _setup_middleware(self):
        """Configure middleware"""
        # CORS: wide open in development, locked down elsewhere
        self.app.add_middleware(
            CORSMiddleware,
            allow_origins=["*"] if self.config.environment == "development" else [],
            allow_methods=["GET", "POST"],
            allow_headers=["*"],
        )

    def _setup_routes(self):
        """Configure API routes"""

        @self.app.get("/health")
        async def health_check():
            """Health check endpoint for load balancers"""
            return {"status": "healthy", "environment": self.config.environment}

        @self.app.get("/ready")
        async def readiness_check():
            """Readiness check - are we ready to serve traffic?"""
            # Check dependencies
            db_ok = await self._check_database()
            return {"ready": db_ok, "checks": {"database": db_ok}}

        @self.app.post("/api/v1/task", response_model=TaskResponse)
        async def execute_task(request: TaskRequest):
            """Execute an agent task"""
            import time
            import uuid

            task_id = str(uuid.uuid4())[:8]
            start_time = time.time()

            try:
                # Apply timeout
                result = await asyncio.wait_for(
                    self._execute_task(request.task, request.context or {}),
                    timeout=request.timeout_seconds,
                )
                duration_ms = int((time.time() - start_time) * 1000)
                return TaskResponse(
                    task_id=task_id,
                    status="completed",
                    result=result,
                    duration_ms=duration_ms,
                )

            except asyncio.TimeoutError:
                duration_ms = int((time.time() - start_time) * 1000)
                return TaskResponse(
                    task_id=task_id,
                    status="timeout",
                    error=f"Task exceeded {request.timeout_seconds}s timeout",
                    duration_ms=duration_ms,
                )

            except Exception as e:
                duration_ms = int((time.time() - start_time) * 1000)
                return TaskResponse(
                    task_id=task_id,
                    status="error",
                    error=str(e),
                    duration_ms=duration_ms,
                )

    async def _execute_task(self, task: str, context: Dict) -> Dict:
        """Execute the agent task"""
        # Your agent execution logic here
        return {"output": "Task completed"}

    async def _check_database(self) -> bool:
        """Check database connectivity"""
        try:
            # Perform simple query
            return True
        except Exception:
            return False

    def run(self):
        """Run the service"""
        # Note: uvicorn needs an import string (e.g. "main:app") rather than
        # an app object to use multiple workers
        uvicorn.run(
            self.app,
            host="0.0.0.0",
            port=self.config.api_port,
            workers=self.config.api_workers,
            log_level=self.config.log_level.lower(),
        )


if __name__ == "__main__":
    config = ProductionConfig.load("config.yaml")
    service = AgentService(config)
    service.run()
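The config already carries `rate_limit_per_minute`, and Scene 3 places rate limiting at the API gateway. If you also want a basic in-process fallback, here is a minimal sketch using a sliding-window counter keyed by client IP; in the `AgentService` above it would be registered inside `_setup_middleware`. The constant, the counter dict, and the 429 payload are illustrative, and a real deployment would enforce limits at the gateway or in a shared store such as Redis:

import time
from collections import defaultdict

from fastapi import FastAPI, Request
from fastapi.responses import JSONResponse

app = FastAPI()
RATE_LIMIT_PER_MINUTE = 100  # would come from ProductionConfig in practice
_hits: dict[str, list[float]] = defaultdict(list)  # client IP -> request timestamps


@app.middleware("http")
async def rate_limit(request: Request, call_next):
    """Reject clients that exceed the per-minute budget (single-process only)."""
    client_ip = request.client.host if request.client else "unknown"
    now = time.time()
    # Keep only timestamps from the last 60 seconds
    _hits[client_ip] = [t for t in _hits[client_ip] if now - t < 60]
    if len(_hits[client_ip]) >= RATE_LIMIT_PER_MINUTE:
        return JSONResponse(status_code=429, content={"error": "rate limit exceeded"})
    _hits[client_ip].append(now)
    return await call_next(request)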

---

Scene 6: Process Management (10:00-11:30)

[Visual]: Process supervisor managing services [Animation]: Processes being monitored and restarted

[Audio/Script]:

"Production services need process management."

[Demo - Systemd Service]:

# /etc/systemd/system/agent-service.service

[Unit]
Description=AI Agent Service
After=network.target postgresql.service
Wants=postgresql.service

[Service]
Type=simple
User=agent
Group=agent
WorkingDirectory=/opt/agent-service

# Environment
EnvironmentFile=/opt/agent-service/.env
Environment=PYTHONUNBUFFERED=1

# Execution
ExecStart=/opt/agent-service/venv/bin/python -m uvicorn main:app --host 0.0.0.0 --port 8000 --workers 4
ExecReload=/bin/kill -HUP $MAINPID

# Restart policy
Restart=always
RestartSec=5

# Resource limits
MemoryMax=4G
CPUQuota=200%

# Logging
StandardOutput=journal
StandardError=journal
SyslogIdentifier=agent-service

[Install]
WantedBy=multi-user.target
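The unit file pulls its secrets and overrides from `EnvironmentFile=/opt/agent-service/.env`. A sketch of what that file might contain, using the variable names from the configuration loader; all values are placeholders, and the file should never be committed to version control:

# /opt/agent-service/.env (placeholder values)
ENVIRONMENT=production
DB_HOST=db.example.com
DB_PORT=5432
DB_NAME=agents_prod
DB_USER=agent_service
# DB_PASSWORD is required by ProductionConfig.load()
DB_PASSWORD=change-me
AGENT_MODEL=claude-sonnet-4-20250514
API_PORT=8000
API_WORKERS=4
RATE_LIMIT=100
LOG_LEVEL=INFO
METRICS_ENABLED=true
TRACING_ENABLED=true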

[Demo - Commands]:

# Enable and start service
sudo systemctl enable agent-service
sudo systemctl start agent-service

# Check status
sudo systemctl status agent-service

# View logs
sudo journalctl -u agent-service -f

# Restart on deploy
sudo systemctl restart agent-service

---

Scene 7: Challenge Time (11:30-12:00)

[Visual]: Challenge specification [Animation]: XP reward display

[Audio/Script]:

"Your challenge: Deploy your multi-agent system as a production service.
>
Requirements:
1. Configuration management with environment variables
2. FastAPI service wrapper with health checks
3. Systemd service file for process management
4. Basic API endpoint for task execution
>
Complete this for 700 XP and the 'Deployment Ready' badge.
>
Next: Scaling your AI systems for massive load."

---

Post-Video Challenge

Challenge ID: TRACK4_001_CHALLENGE Type: Code + Deployment Instructions:

Task 1: Create configuration system

claude "Create a production configuration system:
1. config.py with dataclass configs
2. Load from YAML + environment overrides
3. Validate required settings
4. Support dev/staging/prod environments"

Task 2: Create service wrapper

claude "Create a FastAPI service wrapper:
1. /health and /ready endpoints
2. /api/v1/task POST endpoint
3. Request/response models
4. Timeout handling
5. Error responses"

Task 3: Create systemd service

claude "Create systemd service file:
1. Service definition
2. Environment file support
3. Restart policy
4. Resource limits
5. Logging to journal"

Task 4: Test deployment

# Start the service
sudo systemctl start agent-service

# Test health endpoint
curl http://localhost:8000/health

# Test task endpoint
curl -X POST http://localhost:8000/api/v1/task \
  -H "Content-Type: application/json" \
  -d '{"task": "Hello, world!"}'
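If you prefer a scriptable check over curl, here is a minimal smoke-test sketch using httpx (assumed to be installed); the base URL and payload mirror the curl commands above:

import httpx

BASE_URL = "http://localhost:8000"  # adjust if the service runs elsewhere


def smoke_test() -> None:
    """Hit the health and task endpoints and fail loudly on surprises."""
    health = httpx.get(f"{BASE_URL}/health", timeout=5.0)
    assert health.status_code == 200, health.text
    assert health.json()["status"] == "healthy"

    task = httpx.post(
        f"{BASE_URL}/api/v1/task",
        json={"task": "Hello, world!"},
        timeout=130.0,  # slightly above the default task timeout
    )
    assert task.status_code == 200, task.text
    body = task.json()
    assert body["status"] in ("completed", "timeout", "error")
    print("Smoke test passed:", body)


if __name__ == "__main__":
    smoke_test()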

Rewards:

  • XP: 700 (400 base + 300 challenge)
  • Achievement: "Deployment Ready"
---

SEO Metadata

Alt-text: Production deployment fundamentals for AI agents - configuration management, service wrappers, process management, systemd deployment.

Tags: production AI, deployment, configuration management, FastAPI, systemd, AI service, production environment

Keywords: deploy ai agents, production ai service, ai configuration management, fastapi ai, systemd ai service, production deployment

Last modified: Wednesday, 10 December 2025, 1:05 AM