Performance and Scaling
Guidelines for scaling Pixashot deployments, including CPU and memory optimization, worker configuration, and performance tuning.
This guide covers scaling considerations and performance optimization for Pixashot deployments.
Resource Requirements
Base Requirements Per Instance
- CPU: 1 core minimum
- RAM: 2GB minimum
- Storage: 1GB base + temporary storage
- Network: Reliable outbound connection
Scaling Formula
Workers = MIN(CPU_CORES * 2, AVAILABLE_MEMORY_GB)
Concurrent_Requests = Workers * 10
Memory Scaling
Memory Per Request Type
Standard Screenshot: ~50MB
Full Page: ~100MB
PDF Generation: ~150MB
Complex Interactions: ~200MB
Instance Sizing
# Light load (<10 req/min)
--memory=2g --workers=2
# Medium load (10-50 req/min)
--memory=4g --workers=4
# Heavy load (50+ req/min)
--memory=8g --workers=8
Worker Configuration
Environment Variables
# Light usage
export WORKERS=2
export MAX_REQUESTS=500
export KEEP_ALIVE=300
# Medium usage
export WORKERS=4
export MAX_REQUESTS=1000
export KEEP_ALIVE=300
# Heavy usage
export WORKERS=8
export MAX_REQUESTS=2000
export KEEP_ALIVE=300
Performance Optimization
Browser Context
{
"block_media": true, // Reduce memory usage
"wait_for_network": "mostly_idle", // Faster than "idle"
"pixel_density": 1.0, // Lower for large viewports
"format": "jpeg", // More efficient than PNG
"image_quality": 85 // Balance quality vs size
}
Caching
# Enable response caching
export CACHE_MAX_SIZE=1000 # Adjust based on memory
export RATE_LIMIT_ENABLED=true # Prevent overload
export RATE_LIMIT_CAPTURE="5 per second"
Cloud Run Auto-scaling
Configuration
# Optimal scaling settings
gcloud run deploy pixashot \
--min-instances=1 \
--max-instances=10 \
--concurrency=80 \
--cpu=2 \
--memory=4Gi \
--timeout=300
Load Balancing
- Set appropriate
min-instances
for steady traffic - Configure
max-instances
based on peak load - Use
concurrency
to control requests per instance - Monitor cold start performance
Monitoring Metrics
Key metrics to track:
- Memory usage per request
- CPU utilization
- Response times
- Error rates
- Cold start frequency
- Cache hit rates
Health Endpoints
# Monitor these endpoints
/health # Overall health
/health/live # Liveness check
/health/ready # Readiness check
Scaling Checklist
Resource Monitoring
- Configure CPU alerts
- Set memory usage alerts
- Monitor network latency
- Track error rates
Performance Tuning
- Optimize worker count
- Configure request limits
- Enable caching
- Set appropriate timeouts
Load Testing
- Test concurrent requests
- Measure memory usage
- Monitor response times
- Verify error handling
Best Practices
Resource Management
- Start with conservative limits
- Monitor actual usage
- Scale gradually
- Implement proper cleanup
Error Handling
- Set appropriate timeouts
- Implement retry logic
- Monitor failure rates
- Log errors comprehensively
Performance
- Use efficient formats
- Optimize wait strategies
- Enable caching
- Monitor response times
For detailed monitoring configuration, see the Monitoring Guide.
Get the Latest Updates