diff --git a/api/core/workflow/docs/WORKER_POOL_CONFIG.md b/api/core/workflow/docs/WORKER_POOL_CONFIG.md deleted file mode 100644 index db4cf3b6d6..0000000000 --- a/api/core/workflow/docs/WORKER_POOL_CONFIG.md +++ /dev/null @@ -1,173 +0,0 @@ -# GraphEngine Worker Pool Configuration - -## Overview - -The GraphEngine now supports **dynamic worker pool management** to optimize performance and resource usage. Instead of a fixed 10-worker pool, the engine can: - -1. **Start with optimal worker count** based on graph complexity -1. **Scale up** when workload increases -1. **Scale down** when workers are idle -1. **Respect configurable min/max limits** - -## Benefits - -- **Resource Efficiency**: Uses fewer workers for simple sequential workflows -- **Better Performance**: Scales up for parallel-heavy workflows -- **Gevent Optimization**: Works efficiently with Gevent's greenlet model -- **Memory Savings**: Reduces memory footprint for simple workflows - -## Configuration - -### Configuration Variables (via dify_config) - -| Variable | Default | Description | -|----------|---------|-------------| -| `GRAPH_ENGINE_MIN_WORKERS` | 1 | Minimum number of workers per engine | -| `GRAPH_ENGINE_MAX_WORKERS` | 10 | Maximum number of workers per engine | -| `GRAPH_ENGINE_SCALE_UP_THRESHOLD` | 3 | Queue depth that triggers scale up | -| `GRAPH_ENGINE_SCALE_DOWN_IDLE_TIME` | 5.0 | Seconds of idle time before scaling down | - -### Example Configurations - -#### Low-Resource Environment - -```bash -export GRAPH_ENGINE_MIN_WORKERS=1 -export GRAPH_ENGINE_MAX_WORKERS=3 -export GRAPH_ENGINE_SCALE_UP_THRESHOLD=2 -export GRAPH_ENGINE_SCALE_DOWN_IDLE_TIME=3.0 -``` - -#### High-Performance Environment - -```bash -export GRAPH_ENGINE_MIN_WORKERS=2 -export GRAPH_ENGINE_MAX_WORKERS=20 -export GRAPH_ENGINE_SCALE_UP_THRESHOLD=5 -export GRAPH_ENGINE_SCALE_DOWN_IDLE_TIME=10.0 -``` - -#### Default (Balanced) - -```bash -# Uses defaults: min=1, max=10, threshold=3, idle_time=5.0 -``` - -## How It Works - -### Initial Worker Calculation - -The engine analyzes the graph structure at startup: - -- **Sequential graphs** (no branches): 1 worker -- **Limited parallelism** (few branches): 2 workers -- **Moderate parallelism**: 3 workers -- **High parallelism** (many branches): 5 workers - -### Dynamic Scaling - -During execution: - -1. **Scale Up** triggers when: - - - Queue depth exceeds `SCALE_UP_THRESHOLD` - - All workers are busy and queue has items - - Not at `MAX_WORKERS` limit - -1. **Scale Down** triggers when: - - - Worker idle for more than `SCALE_DOWN_IDLE_TIME` seconds - - Above `MIN_WORKERS` limit - -### Gevent Compatibility - -Since Gevent patches threading to use greenlets: - -- Workers are lightweight coroutines, not OS threads -- Dynamic scaling has minimal overhead -- Can efficiently handle many concurrent workers - -## Migration Guide - -### Before (Fixed 10 Workers) - -```python -# Every GraphEngine instance created 10 workers -# Resource waste for simple workflows -# No adaptation to workload -``` - -### After (Dynamic Workers) - -```python -# GraphEngine creates 1-5 initial workers based on graph -# Scales up/down based on workload -# Configurable via environment variables -``` - -### Backward Compatibility - -The default configuration (`max=10`) maintains compatibility with existing deployments. To get the old behavior exactly: - -```bash -export GRAPH_ENGINE_MIN_WORKERS=10 -export GRAPH_ENGINE_MAX_WORKERS=10 -``` - -## Performance Impact - -### Memory Usage - -- **Simple workflows**: ~80% reduction (1 vs 10 workers) -- **Complex workflows**: Similar or slightly better - -### Execution Time - -- **Sequential workflows**: No change -- **Parallel workflows**: Improved with proper scaling -- **Bursty workloads**: Better adaptation - -### Example Metrics - -| Workflow Type | Old (10 workers) | New (Dynamic) | Improvement | -|--------------|------------------|---------------|-------------| -| Sequential | 10 workers idle | 1 worker active | 90% fewer workers | -| 3-way parallel | 7 workers idle | 3 workers active | 70% fewer workers | -| Heavy parallel | 10 workers busy | 10+ workers (scales up) | Better throughput | - -## Monitoring - -Log messages indicate scaling activity: - -```shell -INFO: GraphEngine initialized with 2 workers (min: 1, max: 10) -INFO: Scaled up workers: 2 -> 3 (queue_depth: 4) -INFO: Scaled down workers: 3 -> 2 (removed 1 idle workers) -``` - -## Best Practices - -1. **Start with defaults** - They work well for most cases -1. **Monitor queue depth** - Adjust `SCALE_UP_THRESHOLD` if queues back up -1. **Consider workload patterns**: - - Bursty: Lower `SCALE_DOWN_IDLE_TIME` - - Steady: Higher `SCALE_DOWN_IDLE_TIME` -1. **Test with your workloads** - Measure and tune - -## Troubleshooting - -### Workers not scaling up - -- Check `GRAPH_ENGINE_MAX_WORKERS` limit -- Verify queue depth exceeds threshold -- Check logs for scaling messages - -### Workers scaling down too quickly - -- Increase `GRAPH_ENGINE_SCALE_DOWN_IDLE_TIME` -- Consider workload patterns - -### Out of memory - -- Reduce `GRAPH_ENGINE_MAX_WORKERS` -- Check for memory leaks in nodes