dify/api/configs/feature
echooffx 34793e0d92 feat(api): add scheduled cleanup task for dataset_queries
The dataset_queries table grows without bound because every RAG retrieval
and hit-test inserts a row. This adds a configurable Celery Beat task
(clean_dataset_queries_task) that deletes rows older than a retention
period (default 60 days) in batches, gated by ENABLE_CLEAN_DATASET_QUERIES_TASK.
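
A minimal sketch of the batched delete described above, not the task's actual
code. It uses the standard-library sqlite3 module in place of the project's
database layer, and it assumes an integer primary key `id` and a `created_at`
column stored as Unix seconds. The function name `delete_expired_queries` and
its parameters are hypothetical.

```python
import sqlite3
import time


def delete_expired_queries(conn, retention_days, batch_size=1000, now=None):
    """Delete dataset_queries rows older than the retention window, in batches.

    Returns the total number of rows deleted. Schema and names are assumed
    for illustration only.
    """
    if now is None:
        now = int(time.time())
    cutoff = now - retention_days * 86400  # rows created before this are expired

    total_deleted = 0
    while True:
        # Delete at most batch_size rows per statement so each transaction
        # stays short and locks are held only briefly.
        cursor = conn.execute(
            "DELETE FROM dataset_queries WHERE id IN ("
            "  SELECT id FROM dataset_queries WHERE created_at < ? LIMIT ?"
            ")",
            (cutoff, batch_size),
        )
        conn.commit()
        if cursor.rowcount == 0:
            break  # nothing left to delete
        total_deleted += cursor.rowcount
    return total_deleted
```

Committing after every batch keeps each transaction small, at the cost of the
cleanup as a whole not being atomic. An interrupted run simply resumes on the
next schedule.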

The effective retention is max(config, PLAN_SANDBOX_CLEAN_DAY_SETTING), so it
never drops below the sandbox setting. This avoids breaking
clean_unused_datasets_task, which reads DatasetQuery.created_at.
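
The clamp can be sketched as a one-line helper. The function name
`effective_retention_days` and the value 30 used in the example are
hypothetical. Only the max(...) rule comes from this commit.

```python
def effective_retention_days(configured_days: int, sandbox_clean_days: int) -> int:
    """Return the retention actually applied by the cleanup task.

    The result never drops below sandbox_clean_days, so another task that
    looks back that many days still finds the rows it needs.
    """
    return max(configured_days, sandbox_clean_days)


# A configured value below the floor is raised to the floor (30 is assumed).
short = effective_retention_days(7, 30)
# A configured value above the floor is used unchanged.
default = effective_retention_days(60, 30)
```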

Also adds a created_at index on dataset_queries via an Alembic migration
to keep the delete scan performant as the table grows.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-30 16:04:02 +08:00
hosted_service feat: credit pool (#30720) 2026-01-08 13:17:30 +08:00
__init__.py feat(api): add scheduled cleanup task for dataset_queries 2026-04-30 16:04:02 +08:00