dify/api/core/rag/retrieval
GitHub Contributor 3b8559521d fix: use isolated session in _on_query to prevent premature commit
The _on_query method was calling db.session.commit() on the Flask-scoped
SQLAlchemy session, which committed all pending dirty state from the
current request, not just the DatasetQuery audit rows.

This broke transaction isolation: if the downstream workflow failed, the
subsequent db.session.rollback() could not revert the already-committed
modifications (e.g. token deductions, partial node executions), leaving
dirty data in the database.

The same file already demonstrates the correct pattern in
_on_retrieval_end, which uses sessionmaker(bind=db.engine).begin() with an
independent session. This change applies the same approach to _on_query.
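
The difference between the two approaches can be sketched with a toy model.
This is not SQLAlchemy and not the code from dataset_retrieval.py: ToyStore,
ToySession and run_request are hypothetical names, and the model only assumes
that commit() writes out everything pending on a session. In the real code,
the isolated session would come from sessionmaker(bind=db.engine).begin().

```python
# Toy model only: ToyStore and ToySession are hypothetical stand-ins,
# not SQLAlchemy classes.
class ToyStore:
    """Holds committed rows, playing the role of the database."""

    def __init__(self):
        self.rows = []


class ToySession:
    """A unit of work: pending rows reach the store only on commit()."""

    def __init__(self, store):
        self.store = store
        self.pending = []

    def add(self, row):
        self.pending.append(row)

    def commit(self):
        # Commit writes out *everything* pending on this session.
        self.store.rows.extend(self.pending)
        self.pending.clear()

    def rollback(self):
        # Rollback can only discard what has not been committed yet.
        self.pending.clear()


def run_request(isolated):
    store = ToyStore()
    request_session = ToySession(store)      # plays the role of db.session
    request_session.add("token_deduction")   # pending workflow state

    # Audit write, as in _on_query: either reuse the request session
    # (the old behaviour) or open an independent one (the fix).
    audit_session = ToySession(store) if isolated else request_session
    audit_session.add("dataset_query_audit")
    audit_session.commit()

    # The downstream workflow fails, so the request rolls back.
    request_session.rollback()
    return store.rows


shared_result = run_request(isolated=False)
isolated_result = run_request(isolated=True)
print(shared_result)    # ['token_deduction', 'dataset_query_audit']
print(isolated_result)  # ['dataset_query_audit']
```

With the shared session, the token deduction survives the rollback because
the audit commit already wrote it out. With the isolated session, only the
audit row is committed.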

Additionally fixed a latent bug where dataset_queries.add_all() was called
inside the loop on every iteration, re-adding previously accumulated rows.
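
A minimal sketch of the loop issue, assuming a hypothetical RecordingSession
that simply counts the objects handed to add_all(). It does not model how a
real SQLAlchemy session tracks objects it already knows about; it only shows
how many times rows are passed in each variant.

```python
class RecordingSession:
    """Hypothetical stand-in that records every object passed to add_all()."""

    def __init__(self):
        self.added = []

    def add_all(self, objects):
        self.added.extend(objects)


def add_inside_loop(queries):
    session = RecordingSession()
    dataset_queries = []
    for q in queries:
        dataset_queries.append({"content": q})
        # Passes the whole accumulated list again on every iteration.
        session.add_all(dataset_queries)
    return len(session.added)


def add_after_loop(queries):
    session = RecordingSession()
    dataset_queries = [{"content": q} for q in queries]
    # Passes each row exactly once, after the list is complete.
    session.add_all(dataset_queries)
    return len(session.added)


queries = ["a", "b", "c"]
inside_count = add_inside_loop(queries)  # 1 + 2 + 3 = 6 objects passed
after_count = add_after_loop(queries)    # 3 objects passed
print(inside_count, after_count)
```

For n queries the in-loop variant passes n(n+1)/2 objects instead of n.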

Fixes #37886
2026-06-26 04:09:16 +08:00
output_parser refactor: replace bare dict with typed annotations in core rag module (#35097) 2026-04-14 06:16:16 +00:00
router chore: Remove pyright in favor of pyrefly (#36154) 2026-05-14 05:49:08 +00:00
__init__.py FEAT: NEW WORKFLOW ENGINE (#3160) 2024-04-08 18:51:46 +08:00
dataset_retrieval.py fix: use isolated session in _on_query to prevent premature commit 2026-06-26 04:09:16 +08:00
retrieval_methods.py fix RetrievalMethod StrEnum (#26768) 2025-10-13 10:29:37 +08:00
template_prompts.py fix: use only supported operators in metadata filter system prompts (#19195) 2025-05-03 20:08:08 +08:00