fix: use isolated session in _on_query to prevent premature commit

The _on_query method called db.session.commit() on the Flask-scoped SQLAlchemy session, which committed all pending dirty state from the current request, not just the DatasetQuery audit rows. This broke transaction isolation: if the downstream workflow failed, the subsequent db.session.rollback() could not revert the already-committed modifications (e.g. token deductions, partial node executions), leaving inconsistent data in the database.

The same file already demonstrates the correct pattern in _on_retrieval_end, which uses sessionmaker(bind=db.engine).begin() with an independent session. This change applies the same approach to _on_query.

It also fixes a latent bug where db.session.add_all(dataset_queries) was called inside the loop on every iteration, re-adding previously accumulated rows.

Fixes #37886
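
The isolation argument above can be illustrated with a minimal, self-contained sketch. It is not the project's code: the Account and AuditRow models, the run_demo helper, and the file-backed SQLite database are hypothetical stand-ins, and a plain Session plays the role of the request-scoped db.session. It assumes SQLAlchemy 1.4 or newer, where sessionmaker.begin() is available.

```python
import os
import tempfile

from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base, sessionmaker

Base = declarative_base()


class Account(Base):  # hypothetical stand-in for request-owned state
    __tablename__ = "accounts"
    id = Column(Integer, primary_key=True)
    tokens = Column(Integer, nullable=False)


class AuditRow(Base):  # hypothetical stand-in for DatasetQuery
    __tablename__ = "audit_rows"
    id = Column(Integer, primary_key=True)
    content = Column(String, nullable=False)


def run_demo() -> tuple[int, int]:
    # File-backed SQLite so that each session uses its own connection.
    db_path = os.path.join(tempfile.mkdtemp(), "demo.db")
    engine = create_engine(f"sqlite:///{db_path}")
    Base.metadata.create_all(engine)

    # Seed one account in its own short transaction.
    with sessionmaker(bind=engine).begin() as setup:
        setup.add(Account(id=1, tokens=100))

    # This session plays the role of the request-scoped db.session.
    request_session = Session(engine)
    account = request_session.get(Account, 1)
    account.tokens -= 30  # pending change owned by the request

    # Build all audit rows first, then persist them once in an
    # independent session that commits when the block exits.
    audit_rows = [AuditRow(content=q) for q in ("first query", "second query")]
    if audit_rows:
        with sessionmaker(bind=engine).begin() as session:
            session.add_all(audit_rows)

    # Simulate a downstream failure: the request rolls back its own work.
    request_session.rollback()
    request_session.close()

    # Read back the final state with a fresh session.
    with Session(engine) as check:
        tokens = check.get(Account, 1).tokens
        audit_count = check.query(AuditRow).count()
    engine.dispose()
    return tokens, audit_count
```

In this sketch the audit rows survive the rollback while the pending token deduction is discarded, because the audit write never touches the request's session. Calling commit() on request_session instead would also have persisted the deduction, which is the failure mode described above.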
2026-06-26 06:41:10 +08:00 · 2026-06-26 04:09:16 +08:00 · 2026-06-26 04:09:16 +08:00 · 3b8559521d
commit 3b8559521d
parent a246dc8b17
1 changed file with 6 additions and 3 deletions
--- a/api/core/rag/retrieval/dataset_retrieval.py
+++ b/api/core/rag/retrieval/dataset_retrieval.py
@@ -1030,6 +1030,9 @@ class DatasetRetrieval:
    ):
        """
        Persist dataset query audit rows for retrieval requests.
+
+        Uses an independent session to avoid committing the request-scoped
+        db.session, which would break transaction isolation for the caller.
        """
        if not query and not attachment_ids:
            return
@@ -1059,9 +1062,9 @@ class DatasetRetrieval:
                    created_by=created_by,
                )
                dataset_queries.append(dataset_query)
-            if dataset_queries:
-                db.session.add_all(dataset_queries)
-        db.session.commit()
+        if dataset_queries:
+            with sessionmaker(bind=db.engine).begin() as session:
+                session.add_all(dataset_queries)

    def _retriever(
        self,