Merge branch 'main' of github.com:langgenius/dify into p363

This commit is contained in:
hjlarry 2026-03-01 18:15:04 +08:00
commit 1c36665ca9
291 changed files with 7803 additions and 6342 deletions

View File

@ -0,0 +1,168 @@
---
name: backend-code-review
description: Review backend code for quality, security, maintainability, and best practices based on established checklist rules. Use when the user requests a review, analysis, or improvement of backend files (e.g., `.py`) under the `api/` directory. Do NOT use for frontend files (e.g., `.tsx`, `.ts`, `.js`). Supports pending-change review, code snippets review, and file-focused review.
---
# Backend Code Review
## When to use this skill
Use this skill whenever the user asks to **review, analyze, or improve** backend code (e.g., `.py`) under the `api/` directory. Supports the following review modes:
- **Pending-change review**: when the user asks to review current changes (inspect staged/working-tree files slated for commit to get the changes).
- **Code snippets review**: when the user pastes code snippets (e.g., a function/class/module excerpt) into the chat and asks for a review.
- **File-focused review**: when the user points to specific files and asks for a review of those files (one file or a small, explicit set of files, e.g., `api/...`, `api/app.py`).
Do NOT use this skill when:
- The request is about frontend code or UI (e.g., `.tsx`, `.ts`, `.js`, `web/`).
- The user is not asking for a review/analysis/improvement of backend code.
- The scope is not under `api/` (unless the user explicitly asks to review backend-related changes outside `api/`).
## How to use this skill
Follow these steps when using this skill:
1. **Identify the review mode** (pending-change vs snippet vs file-focused) based on the user's input. Keep the scope tight: review only what the user provided or explicitly referenced.
2. Follow the rules defined in **Checklist** to perform the review. If no Checklist rule matches, fall back to the **General Review Rules** and perform a best-effort review.
3. Compose the final output strictly following the **Required Output Format**.
Notes when using this skill:
- Always include actionable fixes or suggestions (including possible code snippets).
- Use best-effort `File:Line` references when a file path and line numbers are available; otherwise, use the most specific identifier you can.
## Checklist
- db schema design: if the review scope includes code/files under `api/models/` or `api/migrations/`, follow [references/db-schema-rule.md](references/db-schema-rule.md) to perform the review
- architecture: if the review scope involves controller/service/core-domain/libs/model layering, dependency direction, or moving responsibilities across modules, follow [references/architecture-rule.md](references/architecture-rule.md) to perform the review
- repositories abstraction: if the review scope contains table/model operations (e.g., `select(...)`, `session.execute(...)`, joins, CRUD) and is not under `api/repositories`, `api/core/repositories`, or `api/extensions/*/repositories/`, follow [references/repositories-rule.md](references/repositories-rule.md) to perform the review
- sqlalchemy patterns: if the review scope involves SQLAlchemy session/query usage, db transaction/crud usage, or raw SQL usage, follow [references/sqlalchemy-rule.md](references/sqlalchemy-rule.md) to perform the review
## General Review Rules
### 1. Security Review
Check for:
- SQL injection vulnerabilities
- Server-Side Request Forgery (SSRF)
- Command injection
- Insecure deserialization
- Hardcoded secrets/credentials
- Improper authentication/authorization
- Insecure direct object references
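The SQL-injection item above is easiest to recognize as a before/after pair. A minimal, self-contained sketch using the stdlib `sqlite3` driver (the table and function names are illustrative, not from this codebase):

```python
import sqlite3

def find_user_unsafe(conn: sqlite3.Connection, name: str) -> list:
    # Vulnerable: attacker-controlled input is spliced into the SQL string.
    return conn.execute(f"SELECT id FROM users WHERE name = '{name}'").fetchall()

def find_user_safe(conn: sqlite3.Connection, name: str) -> list:
    # Parameterized: the driver treats the value strictly as data, never as SQL.
    return conn.execute("SELECT id FROM users WHERE name = ?", (name,)).fetchall()
```

With a payload like `' OR '1'='1`, the unsafe variant returns every row while the parameterized variant returns none; the same principle applies to SQLAlchemy `text()` bind parameters.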
### 2. Performance Review
Check for:
- N+1 queries
- Missing database indexes
- Memory leaks
- Blocking operations in async code
- Missing caching opportunities
### 3. Code Quality Review
Check for:
- Code forward compatibility
- Code duplication (DRY violations)
- Functions doing too much (SRP violations)
- Deep nesting / complex conditionals
- Magic numbers/strings
- Poor naming
- Missing error handling
- Incomplete type coverage
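Several of these checks (magic numbers, deep nesting, missing error handling) often appear together; a small before/after sketch with invented names, not code from the repository:

```python
# Before: magic numbers, needless nesting, and a silent None failure path.
def ship_legacy(order):
    if order:
        if order["total"] > 100:
            return 0.0
        else:
            return 9.99
    return None

# After: named constants and guard clauses make intent and failures explicit.
FREE_SHIPPING_THRESHOLD = 100
FLAT_SHIPPING_FEE = 9.99

def shipping_fee(order: dict) -> float:
    if not order:
        raise ValueError("order is required")
    if order["total"] > FREE_SHIPPING_THRESHOLD:
        return 0.0
    return FLAT_SHIPPING_FEE
```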
### 4. Testing Review
Check for:
- Missing test coverage for new code
- Tests that don't test behavior
- Flaky test patterns
- Missing edge cases
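A behavior-focused test with explicit edge cases might look like this minimal sketch (the `truncate` helper is hypothetical, introduced only to show the testing style):

```python
def truncate(text: str, limit: int) -> str:
    """Truncate text to at most `limit` characters, appending an ellipsis when cut."""
    if limit <= 0:
        return ""
    if len(text) <= limit:
        return text
    return text[: limit - 1] + "…"

def test_truncate():
    # Asserts on observable behavior, including boundary and degenerate inputs.
    assert truncate("hello", 10) == "hello"        # no-op when under the limit
    assert truncate("hello world", 6) == "hello…"  # cut result stays within the limit
    assert truncate("hello", 5) == "hello"         # exact-limit boundary
    assert truncate("hello", 0) == ""              # degenerate limit
```

Tests that only assert a function "returns something" or mirror its internals would fail this check; each assertion here pins a distinct behavioral contract.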
## Required Output Format
When this skill is invoked, the response must follow exactly one of the two templates:
### Template A (any findings)
```markdown
# Code Review Summary
Found <X> critical issues that need to be fixed:
## 🔴 Critical (Must Fix)
### 1. <brief description of the issue>
FilePath: <path> line <line>
<relevant code snippet or pointer>
#### Explanation
<detailed explanation and references of the issue>
#### Suggested Fix
1. <brief description of suggested fix>
2. <code example> (optional, omit if not applicable)
---
... (repeat for each critical issue) ...
Found <Y> suggestions for improvement:
## 🟡 Suggestions (Should Consider)
### 1. <brief description of the suggestion>
FilePath: <path> line <line>
<relevant code snippet or pointer>
#### Explanation
<detailed explanation and references of the suggestion>
#### Suggested Fix
1. <brief description of suggested fix>
2. <code example> (optional, omit if not applicable)
---
... (repeat for each suggestion) ...
Found <Z> optional nits:
## 🟢 Nits (Optional)
### 1. <brief description of the nit>
FilePath: <path> line <line>
<relevant code snippet or pointer>
#### Explanation
<explanation and references of the optional nit>
#### Suggested Fix
- <minor suggestions>
---
... (repeat for each nit) ...
## ✅ What's Good
- <Positive feedback on good patterns>
```
- If there are no critical issues, suggestions, optional nits, or good points, omit that section.
- If a category has more than 10 items, summarize it as "Found 10+ critical issues/suggestions/optional nits" and output only the first 10 items.
- Don't compress the blank lines between sections; keep them as-is for readability.
- If any issue requires code changes, append a brief follow-up question after the structured output asking whether the user wants to apply the fix(es). For example: "Would you like me to use the Suggested fix(es) to address these issues?"
### Template B (no issues)
```markdown
# Code Review Summary
✅ No issues found.
```

View File

@ -0,0 +1,91 @@
# Rule Catalog — Architecture
## Scope
- Covers: controller/service/core-domain/libs/model layering, dependency direction, responsibility placement, observability-friendly flow.
## Rules
### Keep business logic out of controllers
- Category: maintainability
- Severity: critical
- Description: Controllers should parse input, call services, and return serialized responses. Business decisions inside controllers make behavior hard to reuse and test.
- Suggested fix: Move domain/business logic into the service or core/domain layer. Keep controller handlers thin and orchestration-focused.
- Example:
- Bad:
```python
@bp.post("/apps/<app_id>/publish")
def publish_app(app_id: str):
payload = request.get_json() or {}
if payload.get("force") and current_user.role != "admin":
raise ValueError("only admin can force publish")
app = App.query.get(app_id)
app.status = "published"
db.session.commit()
return {"result": "ok"}
```
- Good:
```python
@bp.post("/apps/<app_id>/publish")
def publish_app(app_id: str):
payload = PublishRequest.model_validate(request.get_json() or {})
app_service.publish_app(app_id=app_id, force=payload.force, actor_id=current_user.id)
return {"result": "ok"}
```
### Preserve layer dependency direction
- Category: best practices
- Severity: critical
- Description: Controllers may depend on services, and services may depend on core/domain abstractions. Reversing this direction (for example, core importing controller/web modules) creates cycles and leaks transport concerns into domain code.
- Suggested fix: Extract shared contracts into core/domain or service-level modules and make upper layers depend on lower, not the reverse.
- Example:
- Bad:
```python
# core/policy/publish_policy.py
from controllers.console.app import request_context
def can_publish() -> bool:
return request_context.current_user.is_admin
```
- Good:
```python
# core/policy/publish_policy.py
def can_publish(role: str) -> bool:
return role == "admin"
# service layer adapts web/user context to domain input
allowed = can_publish(role=current_user.role)
```
### Keep libs business-agnostic
- Category: maintainability
- Severity: critical
- Description: Modules under `api/libs/` should remain reusable, business-agnostic building blocks. They must not encode product/domain-specific rules, workflow orchestration, or business decisions.
- Suggested fix:
- If business logic appears in `api/libs/`, extract it into the appropriate `services/` or `core/` module and keep `libs` focused on generic, cross-cutting helpers.
- Keep `libs` dependencies clean: avoid importing service/controller/domain-specific modules into `api/libs/`.
- Example:
- Bad:
```python
# api/libs/conversation_filter.py
from services.conversation_service import ConversationService
def should_archive_conversation(conversation, tenant_id: str) -> bool:
# Domain policy and service dependency are leaking into libs.
service = ConversationService()
if service.has_paid_plan(tenant_id):
return conversation.idle_days > 90
return conversation.idle_days > 30
```
- Good:
```python
# api/libs/datetime_utils.py (business-agnostic helper)
def older_than_days(idle_days: int, threshold_days: int) -> bool:
return idle_days > threshold_days
# services/conversation_service.py (business logic stays in service/core)
from libs.datetime_utils import older_than_days
def should_archive_conversation(conversation, tenant_id: str) -> bool:
threshold_days = 90 if has_paid_plan(tenant_id) else 30
return older_than_days(conversation.idle_days, threshold_days)
```

View File

@ -0,0 +1,157 @@
# Rule Catalog — DB Schema Design
## Scope
- Covers: model/base inheritance, schema boundaries in model properties, tenant-aware schema design, index redundancy checks, dialect portability in models, and cross-database compatibility in migrations.
- Does NOT cover: session lifecycle, transaction boundaries, and query execution patterns (handled by `sqlalchemy-rule.md`).
## Rules
### Do not query other tables inside `@property`
- Category: maintainability, performance
- Severity: critical
- Description: A model `@property` must not open sessions or query other tables. This hides dependencies across models, tightly couples schema objects to data access, and can cause N+1 query explosions when iterating collections.
- Suggested fix:
- Keep model properties pure and local to already-loaded fields.
- Move cross-table data fetching to service/repository methods.
- For list/batch reads, fetch required related data explicitly (join/preload/bulk query) before rendering derived values.
- Example:
- Bad:
```python
class Conversation(TypeBase):
__tablename__ = "conversations"
@property
def app_name(self) -> str:
with Session(db.engine, expire_on_commit=False) as session:
app = session.execute(select(App).where(App.id == self.app_id)).scalar_one()
return app.name
```
- Good:
```python
class Conversation(TypeBase):
__tablename__ = "conversations"
@property
def display_title(self) -> str:
return self.name or "Untitled"
# Service/repository layer performs explicit batch fetch for related App rows.
```
### Prefer including `tenant_id` in model definitions
- Category: maintainability
- Severity: suggestion
- Description: In multi-tenant domains, include `tenant_id` in schema definitions whenever the entity belongs to tenant-owned data. This improves data isolation safety and keeps future partitioning/sharding strategies practical as data volume grows.
- Suggested fix:
- Add a `tenant_id` column and ensure related unique/index constraints include tenant dimension when applicable.
- Propagate `tenant_id` through service/repository contracts to keep access paths tenant-aware.
- Exception: if a table is explicitly designed as non-tenant-scoped global metadata, document that design decision clearly.
- Example:
- Bad:
```python
from sqlalchemy.orm import Mapped
class Dataset(TypeBase):
__tablename__ = "datasets"
id: Mapped[str] = mapped_column(StringUUID, primary_key=True)
name: Mapped[str] = mapped_column(sa.String(255), nullable=False)
```
- Good:
```python
from sqlalchemy.orm import Mapped
class Dataset(TypeBase):
__tablename__ = "datasets"
id: Mapped[str] = mapped_column(StringUUID, primary_key=True)
tenant_id: Mapped[str] = mapped_column(StringUUID, nullable=False, index=True)
name: Mapped[str] = mapped_column(sa.String(255), nullable=False)
```
### Detect and avoid duplicate/redundant indexes
- Category: performance
- Severity: suggestion
- Description: Review index definitions for leftmost-prefix redundancy. For example, index `(a, b, c)` can safely cover most lookups for `(a, b)`. Keeping both may increase write overhead and can mislead the optimizer into suboptimal execution plans.
- Suggested fix:
- Before adding an index, compare against existing composite indexes by leftmost-prefix rules.
- Drop or avoid creating redundant prefixes unless there is a proven query-pattern need.
- Apply the same review standard in both model `__table_args__` and migration index DDL.
- Example:
- Bad:
```python
__table_args__ = (
sa.Index("idx_msg_tenant_app", "tenant_id", "app_id"),
sa.Index("idx_msg_tenant_app_created", "tenant_id", "app_id", "created_at"),
)
```
- Good:
```python
__table_args__ = (
# Keep the wider index unless profiling proves a dedicated short index is needed.
sa.Index("idx_msg_tenant_app_created", "tenant_id", "app_id", "created_at"),
)
```
### Avoid PostgreSQL-only dialect usage in models; wrap in `models.types`
- Category: maintainability
- Severity: critical
- Description: Model/schema definitions should avoid PostgreSQL-only constructs directly in business models. When database-specific behavior is required, encapsulate it in `api/models/types.py` using both PostgreSQL and MySQL dialect implementations, then consume that abstraction from model code.
- Suggested fix:
- Do not directly place dialect-only types/operators in model columns when a portable wrapper can be used.
- Add or extend wrappers in `models.types` (for example, `AdjustedJSON`, `LongText`, `BinaryData`) to normalize behavior across PostgreSQL and MySQL.
- Example:
- Bad:
```python
from sqlalchemy.dialects.postgresql import JSONB
from sqlalchemy.orm import Mapped
class ToolConfig(TypeBase):
__tablename__ = "tool_configs"
config: Mapped[dict] = mapped_column(JSONB, nullable=False)
```
- Good:
```python
from sqlalchemy.orm import Mapped
from models.types import AdjustedJSON
class ToolConfig(TypeBase):
__tablename__ = "tool_configs"
config: Mapped[dict] = mapped_column(AdjustedJSON(), nullable=False)
```
### Guard migration incompatibilities with dialect checks and shared types
- Category: maintainability
- Severity: critical
- Description: Migration scripts under `api/migrations/versions/` must account for PostgreSQL/MySQL incompatibilities explicitly. For dialect-sensitive DDL or defaults, branch on the active dialect (for example, `conn.dialect.name == "postgresql"`), and prefer reusable compatibility abstractions from `models.types` where applicable.
- Suggested fix:
- In migration upgrades/downgrades, bind connection and branch by dialect for incompatible SQL fragments.
- Reuse `models.types` wrappers in column definitions when that keeps behavior aligned with runtime models.
- Avoid one-dialect-only migration logic unless there is a documented, deliberate compatibility exception.
- Example:
- Bad:
```python
with op.batch_alter_table("dataset_keyword_tables") as batch_op:
batch_op.add_column(
sa.Column(
"data_source_type",
sa.String(255),
server_default=sa.text("'database'::character varying"),
nullable=False,
)
)
```
- Good:
```python
def _is_pg(conn) -> bool:
return conn.dialect.name == "postgresql"
conn = op.get_bind()
default_expr = sa.text("'database'::character varying") if _is_pg(conn) else sa.text("'database'")
with op.batch_alter_table("dataset_keyword_tables") as batch_op:
batch_op.add_column(
sa.Column("data_source_type", sa.String(255), server_default=default_expr, nullable=False)
)
```

View File

@ -0,0 +1,61 @@
# Rule Catalog - Repositories Abstraction
## Scope
- Covers: when to reuse existing repository abstractions, when to introduce new repositories, and how to preserve dependency direction between service/core and infrastructure implementations.
- Does NOT cover: SQLAlchemy session lifecycle and query-shape specifics (handled by `sqlalchemy-rule.md`), and table schema/migration design (handled by `db-schema-rule.md`).
## Rules
### Introduce repositories abstraction
- Category: maintainability
- Severity: suggestion
- Description: If a table/model already has a repository abstraction, all reads/writes/queries for that table should use the existing repository. If no repository exists, introduce one only when complexity justifies it, such as large/high-volume tables, repeated complex query logic, or likely storage-strategy variation.
- Suggested fix:
- First check `api/repositories`, `api/core/repositories`, and `api/extensions/*/repositories/` to verify whether the table/model already has a repository abstraction. If it exists, route all operations through it and add missing repository methods instead of bypassing it with ad-hoc SQLAlchemy access.
- If no repository exists, add one only when complexity warrants it (for example, repeated complex queries, large data domains, or multiple storage strategies), while preserving dependency direction (service/core depends on abstraction; infra provides implementation).
- Example:
- Bad:
```python
# Existing repository is ignored and service uses ad-hoc table queries.
class AppService:
def archive_app(self, app_id: str, tenant_id: str) -> None:
app = self.session.execute(
select(App).where(App.id == app_id, App.tenant_id == tenant_id)
).scalar_one()
app.archived = True
self.session.commit()
```
- Good:
```python
# Case A: Existing repository must be reused for all table operations.
class AppService:
def archive_app(self, app_id: str, tenant_id: str) -> None:
app = self.app_repo.get_by_id(app_id=app_id, tenant_id=tenant_id)
app.archived = True
self.app_repo.save(app)
# If the query is missing, extend the existing abstraction.
active_apps = self.app_repo.list_active_for_tenant(tenant_id=tenant_id)
```
- Bad:
```python
# No repository exists, but large-domain query logic is scattered in service code.
class ConversationService:
def list_recent_for_app(self, app_id: str, tenant_id: str, limit: int) -> list[Conversation]:
...
# many filters/joins/pagination variants duplicated across services
```
- Good:
```python
# Case B: Introduce repository for large/complex domains or storage variation.
class ConversationRepository(Protocol):
def list_recent_for_app(self, app_id: str, tenant_id: str, limit: int) -> list[Conversation]: ...
class SqlAlchemyConversationRepository:
def list_recent_for_app(self, app_id: str, tenant_id: str, limit: int) -> list[Conversation]:
...
class ConversationService:
def __init__(self, conversation_repo: ConversationRepository):
self.conversation_repo = conversation_repo
```

View File

@ -0,0 +1,139 @@
# Rule Catalog — SQLAlchemy Patterns
## Scope
- Covers: SQLAlchemy session and transaction lifecycle, query construction, tenant scoping, raw SQL boundaries, and write-path concurrency safeguards.
- Does NOT cover: table/model schema and migration design details (handled by `db-schema-rule.md`).
## Rules
### Use Session context manager with explicit transaction control behavior
- Category: best practices
- Severity: critical
- Description: Session and transaction lifecycle must be explicit and bounded on write paths. Missing commits can silently drop intended updates, while ad-hoc or long-lived transactions increase contention, lock duration, and deadlock risk.
- Suggested fix:
- Use **explicit `session.commit()`** after completing a related write unit.
- Or use **`session.begin()` context manager** for automatic commit/rollback on a scoped block.
- Keep transaction windows short: avoid network I/O, heavy computation, or unrelated work inside the transaction.
- Example:
- Bad:
```python
# Missing commit: write may never be persisted.
with Session(db.engine, expire_on_commit=False) as session:
run = session.get(WorkflowRun, run_id)
run.status = "cancelled"
# Long transaction: external I/O inside a DB transaction.
with Session(db.engine, expire_on_commit=False) as session, session.begin():
run = session.get(WorkflowRun, run_id)
run.status = "cancelled"
call_external_api()
```
- Good:
```python
# Option 1: explicit commit.
with Session(db.engine, expire_on_commit=False) as session:
run = session.get(WorkflowRun, run_id)
run.status = "cancelled"
session.commit()
# Option 2: scoped transaction with automatic commit/rollback.
with Session(db.engine, expire_on_commit=False) as session, session.begin():
run = session.get(WorkflowRun, run_id)
run.status = "cancelled"
# Keep non-DB work outside transaction scope.
call_external_api()
```
### Enforce tenant_id scoping on shared-resource queries
- Category: security
- Severity: critical
- Description: Reads and writes against shared tables must be scoped by `tenant_id` to prevent cross-tenant data leakage or corruption.
- Suggested fix: Add `tenant_id` predicate to all tenant-owned entity queries and propagate tenant context through service/repository interfaces.
- Example:
- Bad:
```python
stmt = select(Workflow).where(Workflow.id == workflow_id)
workflow = session.execute(stmt).scalar_one_or_none()
```
- Good:
```python
stmt = select(Workflow).where(
Workflow.id == workflow_id,
Workflow.tenant_id == tenant_id,
)
workflow = session.execute(stmt).scalar_one_or_none()
```
### Prefer SQLAlchemy expressions over raw SQL by default
- Category: maintainability
- Severity: suggestion
- Description: Raw SQL should be exceptional. ORM/Core expressions are easier to evolve, safer to compose, and more consistent with the codebase.
- Suggested fix: Rewrite straightforward raw SQL into SQLAlchemy `select/update/delete` expressions; keep raw SQL only when required by clear technical constraints.
- Example:
- Bad:
```python
row = session.execute(
text("SELECT * FROM workflows WHERE id = :id AND tenant_id = :tenant_id"),
{"id": workflow_id, "tenant_id": tenant_id},
).first()
```
- Good:
```python
stmt = select(Workflow).where(
Workflow.id == workflow_id,
Workflow.tenant_id == tenant_id,
)
row = session.execute(stmt).scalar_one_or_none()
```
### Protect write paths with concurrency safeguards
- Category: quality
- Severity: critical
- Description: Multi-writer paths without explicit concurrency control can silently overwrite data. Choose the safeguard based on contention level, lock scope, and throughput cost instead of defaulting to one strategy.
- Suggested fix:
- **Optimistic locking**: Use when contention is usually low and retries are acceptable. Add a version (or updated_at) guard in `WHERE` and treat `rowcount == 0` as a conflict.
- **Redis distributed lock**: Use when the critical section spans multiple steps/processes (or includes non-DB side effects) and you need cross-worker mutual exclusion.
- **SELECT ... FOR UPDATE**: Use when contention is high on the same rows and strict in-transaction serialization is required. Keep transactions short to reduce lock wait/deadlock risk.
- In all cases, scope by `tenant_id` and verify affected row counts for conditional writes.
- Example:
- Bad:
```python
# No tenant scope, no conflict detection, and no lock on a contested write path.
session.execute(update(WorkflowRun).where(WorkflowRun.id == run_id).values(status="cancelled"))
session.commit() # silently overwrites concurrent updates
```
- Good:
```python
# 1) Optimistic lock (low contention, retry on conflict)
result = session.execute(
update(WorkflowRun)
.where(
WorkflowRun.id == run_id,
WorkflowRun.tenant_id == tenant_id,
WorkflowRun.version == expected_version,
)
.values(status="cancelled", version=WorkflowRun.version + 1)
)
if result.rowcount == 0:
raise WorkflowStateConflictError("stale version, retry")
# 2) Redis distributed lock (cross-worker critical section)
lock_name = f"workflow_run_lock:{tenant_id}:{run_id}"
with redis_client.lock(lock_name, timeout=20):
session.execute(
update(WorkflowRun)
.where(WorkflowRun.id == run_id, WorkflowRun.tenant_id == tenant_id)
.values(status="cancelled")
)
session.commit()
# 3) Pessimistic lock with SELECT ... FOR UPDATE (high contention)
run = session.execute(
select(WorkflowRun)
.where(WorkflowRun.id == run_id, WorkflowRun.tenant_id == tenant_id)
.with_for_update()
).scalar_one()
run.status = "cancelled"
session.commit()
```

View File

@ -0,0 +1 @@
../../.agents/skills/backend-code-review

View File

@ -77,14 +77,7 @@ jobs:
}
const body = diff.trim()
? `### Pyrefly Diff
<details>
<summary>base → PR</summary>
\`\`\`diff
${diff}
\`\`\`
</details>`
? '### Pyrefly Diff\n<details>\n<summary>base → PR</summary>\n\n```diff\n' + diff + '\n```\n</details>'
: '### Pyrefly Diff\nNo changes detected.';
await github.rest.issues.createComment({

View File

@ -74,14 +74,16 @@ jobs:
}
const body = diff.trim()
? `### Pyrefly Diff
<details>
<summary>base → PR</summary>
\`\`\`diff
${diff}
\`\`\`
</details>`
? [
'### Pyrefly Diff',
'<details>',
'<summary>base → PR</summary>',
'',
'```diff',
diff,
'```',
'</details>',
].join('\n')
: '### Pyrefly Diff\nNo changes detected.';
await github.rest.issues.createComment({

View File

@ -3,14 +3,22 @@ name: Web Tests
on:
workflow_call:
permissions:
contents: read
concurrency:
group: web-tests-${{ github.head_ref || github.run_id }}
cancel-in-progress: true
jobs:
test:
name: Web Tests
name: Web Tests (${{ matrix.shardIndex }}/${{ matrix.shardTotal }})
runs-on: ubuntu-latest
strategy:
fail-fast: false
matrix:
shardIndex: [1, 2, 3, 4]
shardTotal: [4]
defaults:
run:
shell: bash
@ -39,7 +47,58 @@ jobs:
run: pnpm install --frozen-lockfile
- name: Run tests
run: pnpm test:ci
run: pnpm vitest run --reporter=blob --shard=${{ matrix.shardIndex }}/${{ matrix.shardTotal }} --coverage
- name: Upload blob report
if: ${{ !cancelled() }}
uses: actions/upload-artifact@v6
with:
name: blob-report-${{ matrix.shardIndex }}
path: web/.vitest-reports/*
include-hidden-files: true
retention-days: 1
merge-reports:
name: Merge Test Reports
if: ${{ !cancelled() }}
needs: [test]
runs-on: ubuntu-latest
defaults:
run:
shell: bash
working-directory: ./web
steps:
- name: Checkout code
uses: actions/checkout@v6
with:
persist-credentials: false
- name: Install pnpm
uses: pnpm/action-setup@v4
with:
package_json_file: web/package.json
run_install: false
- name: Setup Node.js
uses: actions/setup-node@v6
with:
node-version: 24
cache: pnpm
cache-dependency-path: ./web/pnpm-lock.yaml
- name: Install dependencies
run: pnpm install --frozen-lockfile
- name: Download blob reports
uses: actions/download-artifact@v6
with:
path: web/.vitest-reports
pattern: blob-report-*
merge-multiple: true
- name: Merge reports
run: pnpm vitest --merge-reports --coverage --silent=passed-only
- name: Coverage Summary
if: always()

View File

@ -50,7 +50,6 @@ forbidden_modules =
allow_indirect_imports = True
ignore_imports =
core.workflow.nodes.agent.agent_node -> extensions.ext_database
core.workflow.nodes.datasource.datasource_node -> extensions.ext_database
core.workflow.nodes.knowledge_index.knowledge_index_node -> extensions.ext_database
core.workflow.nodes.llm.file_saver -> extensions.ext_database
core.workflow.nodes.llm.llm_utils -> extensions.ext_database
@ -106,15 +105,10 @@ ignore_imports =
core.workflow.nodes.agent.agent_node -> core.model_manager
core.workflow.nodes.agent.agent_node -> core.provider_manager
core.workflow.nodes.agent.agent_node -> core.tools.tool_manager
core.workflow.nodes.datasource.datasource_node -> models.model
core.workflow.nodes.datasource.datasource_node -> models.tools
core.workflow.nodes.datasource.datasource_node -> services.datasource_provider_service
core.workflow.nodes.document_extractor.node -> core.helper.ssrf_proxy
core.workflow.nodes.http_request.node -> core.tools.tool_file_manager
core.workflow.nodes.iteration.iteration_node -> core.app.workflow.node_factory
core.workflow.nodes.knowledge_index.knowledge_index_node -> core.rag.index_processor.index_processor_factory
core.workflow.nodes.llm.llm_utils -> configs
core.workflow.nodes.llm.llm_utils -> core.app.entities.app_invoke_entities
core.workflow.nodes.llm.llm_utils -> core.model_manager
core.workflow.nodes.llm.protocols -> core.model_manager
core.workflow.nodes.llm.llm_utils -> core.model_runtime.model_providers.__base.large_language_model
@ -133,36 +127,21 @@ ignore_imports =
core.workflow.nodes.human_input.human_input_node -> core.app.entities.app_invoke_entities
core.workflow.nodes.knowledge_index.knowledge_index_node -> core.app.entities.app_invoke_entities
core.workflow.nodes.knowledge_retrieval.knowledge_retrieval_node -> core.app.app_config.entities
core.workflow.nodes.llm.node -> core.app.entities.app_invoke_entities
core.workflow.nodes.parameter_extractor.parameter_extractor_node -> core.app.entities.app_invoke_entities
core.workflow.nodes.parameter_extractor.parameter_extractor_node -> core.prompt.advanced_prompt_transform
core.workflow.nodes.parameter_extractor.parameter_extractor_node -> core.prompt.simple_prompt_transform
core.workflow.nodes.parameter_extractor.parameter_extractor_node -> core.model_runtime.model_providers.__base.large_language_model
core.workflow.nodes.question_classifier.question_classifier_node -> core.app.entities.app_invoke_entities
core.workflow.nodes.question_classifier.question_classifier_node -> core.prompt.advanced_prompt_transform
core.workflow.nodes.question_classifier.question_classifier_node -> core.prompt.simple_prompt_transform
core.workflow.nodes.start.entities -> core.app.app_config.entities
core.workflow.nodes.start.start_node -> core.app.app_config.entities
core.workflow.workflow_entry -> core.app.apps.exc
core.workflow.workflow_entry -> core.app.entities.app_invoke_entities
core.workflow.workflow_entry -> core.app.workflow.node_factory
core.workflow.nodes.datasource.datasource_node -> core.datasource.datasource_manager
core.workflow.nodes.datasource.datasource_node -> core.datasource.utils.message_transformer
core.workflow.nodes.llm.llm_utils -> core.entities.provider_entities
core.workflow.nodes.parameter_extractor.parameter_extractor_node -> core.model_manager
core.workflow.nodes.question_classifier.question_classifier_node -> core.model_manager
core.workflow.nodes.llm.llm_utils -> core.variables.segments
core.workflow.nodes.loop.entities -> core.variables.types
core.workflow.nodes.tool.tool_node -> core.tools.utils.message_transformer
core.workflow.nodes.tool.tool_node -> models
core.workflow.nodes.agent.agent_node -> models.model
core.workflow.nodes.code.code_node -> core.helper.code_executor.code_node_provider
core.workflow.nodes.code.code_node -> core.helper.code_executor.javascript.javascript_code_provider
core.workflow.nodes.code.code_node -> core.helper.code_executor.python3.python3_code_provider
core.workflow.nodes.code.entities -> core.helper.code_executor.code_executor
core.workflow.nodes.datasource.datasource_node -> core.variables.variables
core.workflow.nodes.http_request.executor -> core.helper.ssrf_proxy
core.workflow.nodes.http_request.node -> core.helper.ssrf_proxy
core.workflow.nodes.llm.file_saver -> core.helper.ssrf_proxy
core.workflow.nodes.llm.node -> core.helper.code_executor
core.workflow.nodes.template_transform.template_renderer -> core.helper.code_executor.code_executor
@ -190,57 +169,7 @@ ignore_imports =
core.workflow.nodes.llm.file_saver -> core.tools.signature
core.workflow.nodes.llm.file_saver -> core.tools.tool_file_manager
core.workflow.nodes.tool.tool_node -> core.tools.errors
core.workflow.conversation_variable_updater -> core.variables
core.workflow.graph_engine.entities.commands -> core.variables.variables
core.workflow.nodes.agent.agent_node -> core.variables.segments
core.workflow.nodes.answer.answer_node -> core.variables
core.workflow.nodes.code.code_node -> core.variables.segments
core.workflow.nodes.code.code_node -> core.variables.types
core.workflow.nodes.code.entities -> core.variables.types
core.workflow.nodes.datasource.datasource_node -> core.variables.segments
core.workflow.nodes.document_extractor.node -> core.variables
core.workflow.nodes.document_extractor.node -> core.variables.segments
core.workflow.nodes.http_request.executor -> core.variables.segments
core.workflow.nodes.http_request.node -> core.variables.segments
core.workflow.nodes.human_input.entities -> core.variables.consts
core.workflow.nodes.iteration.iteration_node -> core.variables
core.workflow.nodes.iteration.iteration_node -> core.variables.segments
core.workflow.nodes.iteration.iteration_node -> core.variables.variables
core.workflow.nodes.knowledge_retrieval.knowledge_retrieval_node -> core.variables
core.workflow.nodes.knowledge_retrieval.knowledge_retrieval_node -> core.variables.segments
core.workflow.nodes.list_operator.node -> core.variables
core.workflow.nodes.list_operator.node -> core.variables.segments
core.workflow.nodes.llm.node -> core.variables
core.workflow.nodes.loop.loop_node -> core.variables
core.workflow.nodes.parameter_extractor.entities -> core.variables.types
core.workflow.nodes.parameter_extractor.exc -> core.variables.types
core.workflow.nodes.parameter_extractor.parameter_extractor_node -> core.variables.types
core.workflow.nodes.tool.tool_node -> core.variables.segments
core.workflow.nodes.tool.tool_node -> core.variables.variables
core.workflow.nodes.trigger_webhook.node -> core.variables.types
core.workflow.nodes.trigger_webhook.node -> core.variables.variables
core.workflow.nodes.variable_aggregator.entities -> core.variables.types
core.workflow.nodes.variable_aggregator.variable_aggregator_node -> core.variables.segments
core.workflow.nodes.variable_assigner.common.helpers -> core.variables
core.workflow.nodes.variable_assigner.common.helpers -> core.variables.consts
core.workflow.nodes.variable_assigner.common.helpers -> core.variables.types
core.workflow.nodes.variable_assigner.v1.node -> core.variables
core.workflow.nodes.variable_assigner.v2.helpers -> core.variables
core.workflow.nodes.variable_assigner.v2.node -> core.variables
core.workflow.nodes.variable_assigner.v2.node -> core.variables.consts
core.workflow.runtime.graph_runtime_state_protocol -> core.variables.segments
core.workflow.runtime.read_only_wrappers -> core.variables.segments
core.workflow.runtime.variable_pool -> core.variables
core.workflow.runtime.variable_pool -> core.variables.consts
core.workflow.runtime.variable_pool -> core.variables.segments
core.workflow.runtime.variable_pool -> core.variables.variables
core.workflow.utils.condition.processor -> core.variables
core.workflow.utils.condition.processor -> core.variables.segments
core.workflow.variable_loader -> core.variables
core.workflow.variable_loader -> core.variables.consts
core.workflow.workflow_type_encoder -> core.variables
core.workflow.nodes.agent.agent_node -> extensions.ext_database
core.workflow.nodes.datasource.datasource_node -> extensions.ext_database
core.workflow.nodes.knowledge_index.knowledge_index_node -> extensions.ext_database
core.workflow.nodes.llm.file_saver -> extensions.ext_database
core.workflow.nodes.llm.llm_utils -> extensions.ext_database


@ -42,7 +42,7 @@ The scripts resolve paths relative to their location, so you can run them from a
1. Set up your application by visiting `http://localhost:3000`.
1. Optional: start the worker service (async tasks, runs from `api`).
1. Start the worker service (async and scheduler tasks, runs from `api`).
```bash
./dev/start-worker
@ -54,86 +54,6 @@ The scripts resolve paths relative to their location, so you can run them from a
./dev/start-beat
```
### Manual commands
<details>
<summary>Show manual setup and run steps</summary>
These commands assume you start from the repository root.
1. Start the docker-compose stack.
The backend requires middleware, including PostgreSQL, Redis, and Weaviate, which can be started together using `docker-compose`.
```bash
cp docker/middleware.env.example docker/middleware.env
# Use mysql or another vector database profile if you are not using postgres/weaviate.
docker compose -f docker/docker-compose.middleware.yaml --profile postgresql --profile weaviate -p dify up -d
```
1. Copy env files.
```bash
cp api/.env.example api/.env
cp web/.env.example web/.env.local
```
1. Install UV if needed.
```bash
pip install uv
# Or on macOS
brew install uv
```
1. Install API dependencies.
```bash
cd api
uv sync --group dev
```
1. Install web dependencies.
```bash
cd web
pnpm install
cd ..
```
1. Start backend (runs migrations first, in a new terminal).
```bash
cd api
uv run flask db upgrade
uv run flask run --host 0.0.0.0 --port=5001 --debug
```
1. Start Dify [web](../web) service (in a new terminal).
```bash
cd web
pnpm dev:inspect
```
1. Set up your application by visiting `http://localhost:3000`.
1. Optional: start the worker service (async tasks, in a new terminal).
```bash
cd api
uv run celery -A app.celery worker -P threads -c 2 --loglevel INFO -Q api_token,dataset,priority_dataset,priority_pipeline,pipeline,mail,ops_trace,app_deletion,plugin,workflow_storage,conversation,workflow,schedule_poller,schedule_executor,triggered_workflow_dispatcher,trigger_refresh_executor,retention
```
1. Optional: start Celery Beat (scheduled tasks, in a new terminal).
```bash
cd api
uv run celery -A app.celery beat
```
</details>
### Environment notes
> [!IMPORTANT]

File diff suppressed because one or more lines are too long


@ -15,11 +15,11 @@ from controllers.console.app.error import (
from controllers.console.app.wraps import get_app_model
from controllers.console.wraps import account_initialization_required, edit_permission_required, setup_required
from controllers.web.error import InvalidArgumentError, NotFoundError
from core.variables.segment_group import SegmentGroup
from core.variables.segments import ArrayFileSegment, FileSegment, Segment
from core.variables.types import SegmentType
from core.workflow.constants import CONVERSATION_VARIABLE_NODE_ID, SYSTEM_VARIABLE_NODE_ID
from core.workflow.file import helpers as file_helpers
from core.workflow.variables.segment_group import SegmentGroup
from core.workflow.variables.segments import ArrayFileSegment, FileSegment, Segment
from core.workflow.variables.types import SegmentType
from extensions.ext_database import db
from factories.file_factory import build_from_mapping, build_from_mappings
from factories.variable_factory import build_segment_with_type


@ -21,8 +21,8 @@ from controllers.console.app.workflow_draft_variable import (
from controllers.console.datasets.wraps import get_rag_pipeline
from controllers.console.wraps import account_initialization_required, setup_required
from controllers.web.error import InvalidArgumentError, NotFoundError
from core.variables.types import SegmentType
from core.workflow.constants import CONVERSATION_VARIABLE_NODE_ID, SYSTEM_VARIABLE_NODE_ID
from core.workflow.variables.types import SegmentType
from extensions.ext_database import db
from factories.file_factory import build_from_mapping, build_from_mappings
from factories.variable_factory import build_segment_with_type


@ -36,9 +36,9 @@ ERROR_MSG_INVALID_ENCRYPTED_DATA = "Invalid encrypted data"
ERROR_MSG_INVALID_ENCRYPTED_CODE = "Invalid encrypted code"
def account_initialization_required(view: Callable[P, R]):
def account_initialization_required(view: Callable[P, R]) -> Callable[P, R]:
@wraps(view)
def decorated(*args: P.args, **kwargs: P.kwargs):
def decorated(*args: P.args, **kwargs: P.kwargs) -> R:
# check account initialization
current_user, _ = current_account_with_tenant()
if current_user.status == AccountStatus.UNINITIALIZED:
@ -214,9 +214,9 @@ def cloud_utm_record(view: Callable[P, R]):
return decorated
def setup_required(view: Callable[P, R]):
def setup_required(view: Callable[P, R]) -> Callable[P, R]:
@wraps(view)
def decorated(*args: P.args, **kwargs: P.kwargs):
def decorated(*args: P.args, **kwargs: P.kwargs) -> R:
# check setup
if (
dify_config.EDITION == "SELF_HOSTED"


@ -25,7 +25,6 @@ from core.app.workflow.layers.persistence import PersistenceWorkflowInfo, Workfl
from core.db.session_factory import session_factory
from core.moderation.base import ModerationError
from core.moderation.input_moderation import InputModeration
from core.variables.variables import Variable
from core.workflow.enums import WorkflowType
from core.workflow.graph_engine.command_channels.redis_channel import RedisChannel
from core.workflow.graph_engine.layers.base import GraphEngineLayer
@ -34,6 +33,7 @@ from core.workflow.repositories.workflow_node_execution_repository import Workfl
from core.workflow.runtime import GraphRuntimeState, VariablePool
from core.workflow.system_variable import SystemVariable
from core.workflow.variable_loader import VariableLoader
from core.workflow.variables.variables import Variable
from core.workflow.workflow_entry import WorkflowEntry
from extensions.ext_database import db
from extensions.ext_redis import redis_client


@ -669,16 +669,14 @@ class AdvancedChatAppGenerateTaskPipeline(GraphRuntimeStateSupport):
) -> Generator[StreamResponse, None, None]:
"""Handle retriever resources events."""
self._message_cycle_manager.handle_retriever_resources(event)
return
yield # Make this a generator
yield from ()
def _handle_annotation_reply_event(
self, event: QueueAnnotationReplyEvent, **kwargs
) -> Generator[StreamResponse, None, None]:
"""Handle annotation reply events."""
self._message_cycle_manager.handle_annotation_reply(event)
return
yield # Make this a generator
yield from ()
def _handle_message_replace_event(
self, event: QueueMessageReplaceEvent, **kwargs
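The `yield from ()` replacement above is the conventional way to turn a side-effect-only method into a generator that yields nothing, avoiding the dead `yield` after `return`. A minimal sketch of the idiom (names here are illustrative):

```python
from collections.abc import Generator

handled: list[str] = []

def handle_retriever_resources(event: str) -> Generator[str, None, None]:
    # Perform the side effect, then terminate without yielding.
    handled.append(event)
    # `yield from ()` makes this function a generator; unlike an
    # unreachable bare `yield`, it never produces a value.
    yield from ()

assert list(handle_retriever_resources("resources")) == []
assert handled == ["resources"]
```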


@ -122,7 +122,7 @@ class AppQueueManager(ABC):
"""Attach the live graph runtime state reference for downstream consumers."""
self._graph_runtime_state = graph_runtime_state
def publish(self, event: AppQueueEvent, pub_from: PublishFrom):
def publish(self, event: AppQueueEvent, pub_from: PublishFrom) -> None:
"""
Publish event to queue
:param event:


@ -49,7 +49,6 @@ from core.plugin.impl.datasource import PluginDatasourceManager
from core.tools.entities.tool_entities import ToolProviderType
from core.tools.tool_manager import ToolManager
from core.trigger.trigger_manager import TriggerManager
from core.variables.segments import ArrayFileSegment, FileSegment, Segment
from core.workflow.entities.pause_reason import HumanInputRequired
from core.workflow.entities.workflow_start_reason import WorkflowStartReason
from core.workflow.enums import (
@ -62,6 +61,7 @@ from core.workflow.enums import (
from core.workflow.file import FILE_MODEL_IDENTITY, File
from core.workflow.runtime import GraphRuntimeState
from core.workflow.system_variable import SystemVariable
from core.workflow.variables.segments import ArrayFileSegment, FileSegment, Segment
from core.workflow.workflow_entry import WorkflowEntry
from core.workflow.workflow_type_encoder import WorkflowRuntimeTypeConverter
from extensions.ext_database import db


@ -11,7 +11,6 @@ from core.app.entities.app_invoke_entities import (
)
from core.app.workflow.layers.persistence import PersistenceWorkflowInfo, WorkflowPersistenceLayer
from core.app.workflow.node_factory import DifyNodeFactory
from core.variables.variables import RAGPipelineVariable, RAGPipelineVariableInput
from core.workflow.entities.graph_init_params import GraphInitParams
from core.workflow.enums import WorkflowType
from core.workflow.graph import Graph
@ -21,6 +20,7 @@ from core.workflow.repositories.workflow_node_execution_repository import Workfl
from core.workflow.runtime import GraphRuntimeState, VariablePool
from core.workflow.system_variable import SystemVariable
from core.workflow.variable_loader import VariableLoader
from core.workflow.variables.variables import RAGPipelineVariable, RAGPipelineVariableInput
from core.workflow.workflow_entry import WorkflowEntry
from extensions.ext_database import db
from models.dataset import Document, Pipeline


@ -1,12 +1,12 @@
import logging
from core.variables import VariableBase
from core.workflow.constants import CONVERSATION_VARIABLE_NODE_ID
from core.workflow.conversation_variable_updater import ConversationVariableUpdater
from core.workflow.enums import NodeType
from core.workflow.graph_engine.layers.base import GraphEngineLayer
from core.workflow.graph_events import GraphEngineEvent, NodeRunSucceededEvent
from core.workflow.nodes.variable_assigner.common import helpers as common_helpers
from core.workflow.variables import VariableBase
logger = logging.getLogger(__name__)


@ -83,14 +83,21 @@ def fetch_model_config(
raise ModelNotExistError(f"Model {node_data_model.name} not exist.")
provider_model.raise_for_status()
stop: list[str] = []
if "stop" in node_data_model.completion_params:
stop = node_data_model.completion_params.pop("stop")
completion_params = dict(node_data_model.completion_params)
stop = completion_params.pop("stop", [])
if not isinstance(stop, list):
stop = []
model_schema = model_instance.model_type_instance.get_model_schema(node_data_model.name, credentials)
if not model_schema:
raise ModelNotExistError(f"Model {node_data_model.name} not exist.")
model_instance.provider = node_data_model.provider
model_instance.model_name = node_data_model.name
model_instance.credentials = credentials
model_instance.parameters = completion_params
model_instance.stop = tuple(stop)
return model_instance, ModelConfigWithCredentialsEntity(
provider=node_data_model.provider,
model=node_data_model.name,
@ -98,6 +105,6 @@ def fetch_model_config(
mode=node_data_model.mode,
provider_model_bundle=provider_model_bundle,
credentials=credentials,
parameters=node_data_model.completion_params,
parameters=completion_params,
stop=stop,
)
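The fix above copies `completion_params` before popping `stop`, so the node's shared config mapping is no longer mutated in place. A sketch of the hazard and the fix, with illustrative data:

```python
node_params = {"temperature": 0.7, "stop": ["\n\n"]}

# Buggy variant (commented out): pops directly from the shared
# mapping, silently mutating the node's configuration.
# stop = node_params.pop("stop")

# Fixed: work on a shallow copy, with a default and a type guard.
completion_params = dict(node_params)
stop = completion_params.pop("stop", [])
if not isinstance(stop, list):
    stop = []

assert node_params == {"temperature": 0.7, "stop": ["\n\n"]}  # original untouched
assert completion_params == {"temperature": 0.7}
assert stop == ["\n\n"]
```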


@ -1,13 +1,20 @@
from collections.abc import Mapping
from typing import TYPE_CHECKING, Any, final
from typing import TYPE_CHECKING, Any, cast, final
from typing_extensions import override
from configs import dify_config
from core.app.llm.model_access import build_dify_model_access
from core.helper.code_executor.code_executor import CodeExecutionError, CodeExecutor
from core.helper.code_executor.code_node_provider import CodeNodeProvider
from core.datasource.datasource_manager import DatasourceManager
from core.helper.code_executor.code_executor import (
CodeExecutionError,
CodeExecutor,
)
from core.helper.ssrf_proxy import ssrf_proxy
from core.model_manager import ModelInstance
from core.model_runtime.entities.model_entities import ModelType
from core.model_runtime.model_providers.__base.large_language_model import LargeLanguageModel
from core.prompt.entities.advanced_prompt_entities import MemoryConfig
from core.rag.retrieval.dataset_retrieval import DatasetRetrieval
from core.tools.tool_file_manager import ToolFileManager
from core.workflow.entities.graph_config import NodeConfigDict
@ -18,10 +25,15 @@ from core.workflow.nodes.base.node import Node
from core.workflow.nodes.code.code_node import CodeNode, WorkflowCodeExecutor
from core.workflow.nodes.code.entities import CodeLanguage
from core.workflow.nodes.code.limits import CodeNodeLimits
from core.workflow.nodes.datasource import DatasourceNode
from core.workflow.nodes.document_extractor import DocumentExtractorNode, UnstructuredApiConfig
from core.workflow.nodes.http_request import HttpRequestNode, build_http_request_config
from core.workflow.nodes.knowledge_retrieval.knowledge_retrieval_node import KnowledgeRetrievalNode
from core.workflow.nodes.llm import llm_utils
from core.workflow.nodes.llm.entities import ModelConfig
from core.workflow.nodes.llm.exc import LLMModeRequiredError, ModelNotExistError
from core.workflow.nodes.llm.node import LLMNode
from core.workflow.nodes.llm.protocols import PromptMessageMemory
from core.workflow.nodes.node_mapping import LATEST_VERSION, NODE_TYPE_CLASSES_MAPPING
from core.workflow.nodes.parameter_extractor.parameter_extractor_node import ParameterExtractorNode
from core.workflow.nodes.question_classifier.question_classifier_node import QuestionClassifierNode
@ -70,7 +82,6 @@ class DifyNodeFactory(NodeFactory):
self.graph_init_params = graph_init_params
self.graph_runtime_state = graph_runtime_state
self._code_executor: WorkflowCodeExecutor = DefaultWorkflowCodeExecutor()
self._code_providers: tuple[type[CodeNodeProvider], ...] = CodeNode.default_code_providers()
self._code_limits = CodeNodeLimits(
max_string_length=dify_config.CODE_MAX_STRING_LENGTH,
max_number=dify_config.CODE_MAX_NUMBER,
@ -142,7 +153,6 @@ class DifyNodeFactory(NodeFactory):
graph_init_params=self.graph_init_params,
graph_runtime_state=self.graph_runtime_state,
code_executor=self._code_executor,
code_providers=self._code_providers,
code_limits=self._code_limits,
)
@ -169,6 +179,8 @@ class DifyNodeFactory(NodeFactory):
)
if node_type == NodeType.LLM:
model_instance = self._build_model_instance_for_llm_node(node_data)
memory = self._build_memory_for_llm_node(node_data=node_data, model_instance=model_instance)
return LLMNode(
id=node_id,
config=node_config,
@ -176,6 +188,17 @@ class DifyNodeFactory(NodeFactory):
graph_runtime_state=self.graph_runtime_state,
credentials_provider=self._llm_credentials_provider,
model_factory=self._llm_model_factory,
model_instance=model_instance,
memory=memory,
)
if node_type == NodeType.DATASOURCE:
return DatasourceNode(
id=node_id,
config=node_config,
graph_init_params=self.graph_init_params,
graph_runtime_state=self.graph_runtime_state,
datasource_manager=DatasourceManager,
)
if node_type == NodeType.KNOWLEDGE_RETRIEVAL:
@ -197,6 +220,7 @@ class DifyNodeFactory(NodeFactory):
)
if node_type == NodeType.QUESTION_CLASSIFIER:
model_instance = self._build_model_instance_for_llm_node(node_data)
return QuestionClassifierNode(
id=node_id,
config=node_config,
@ -204,9 +228,11 @@ class DifyNodeFactory(NodeFactory):
graph_runtime_state=self.graph_runtime_state,
credentials_provider=self._llm_credentials_provider,
model_factory=self._llm_model_factory,
model_instance=model_instance,
)
if node_type == NodeType.PARAMETER_EXTRACTOR:
model_instance = self._build_model_instance_for_llm_node(node_data)
return ParameterExtractorNode(
id=node_id,
config=node_config,
@ -214,6 +240,7 @@ class DifyNodeFactory(NodeFactory):
graph_runtime_state=self.graph_runtime_state,
credentials_provider=self._llm_credentials_provider,
model_factory=self._llm_model_factory,
model_instance=model_instance,
)
return node_class(
@ -222,3 +249,55 @@ class DifyNodeFactory(NodeFactory):
graph_init_params=self.graph_init_params,
graph_runtime_state=self.graph_runtime_state,
)
def _build_model_instance_for_llm_node(self, node_data: Mapping[str, Any]) -> ModelInstance:
node_data_model = ModelConfig.model_validate(node_data["model"])
if not node_data_model.mode:
raise LLMModeRequiredError("LLM mode is required.")
credentials = self._llm_credentials_provider.fetch(node_data_model.provider, node_data_model.name)
model_instance = self._llm_model_factory.init_model_instance(node_data_model.provider, node_data_model.name)
provider_model_bundle = model_instance.provider_model_bundle
provider_model = provider_model_bundle.configuration.get_provider_model(
model=node_data_model.name,
model_type=ModelType.LLM,
)
if provider_model is None:
raise ModelNotExistError(f"Model {node_data_model.name} not exist.")
provider_model.raise_for_status()
completion_params = dict(node_data_model.completion_params)
stop = completion_params.pop("stop", [])
if not isinstance(stop, list):
stop = []
model_schema = model_instance.model_type_instance.get_model_schema(node_data_model.name, credentials)
if not model_schema:
raise ModelNotExistError(f"Model {node_data_model.name} not exist.")
model_instance.provider = node_data_model.provider
model_instance.model_name = node_data_model.name
model_instance.credentials = credentials
model_instance.parameters = completion_params
model_instance.stop = tuple(stop)
model_instance.model_type_instance = cast(LargeLanguageModel, model_instance.model_type_instance)
return model_instance
def _build_memory_for_llm_node(
self,
*,
node_data: Mapping[str, Any],
model_instance: ModelInstance,
) -> PromptMessageMemory | None:
raw_memory_config = node_data.get("memory")
if raw_memory_config is None:
return None
node_memory = MemoryConfig.model_validate(raw_memory_config)
return llm_utils.fetch_memory(
variable_pool=self.graph_runtime_state.variable_pool,
app_id=self.graph_init_params.app_id,
node_data_memory=node_memory,
model_instance=model_instance,
)
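The factory changes above move model-instance and memory construction out of the nodes and into `DifyNodeFactory`, which then injects them through the constructor. A minimal sketch of that constructor-injection pattern, using hypothetical stand-in classes rather than the real node API:

```python
from dataclasses import dataclass

@dataclass
class FakeModelInstance:
    # Stand-in for the real ModelInstance.
    provider: str
    name: str

class FakeLLMNode:
    def __init__(self, node_id: str, model_instance: FakeModelInstance):
        self.node_id = node_id
        # Injected dependency: the node no longer builds its own model.
        self.model_instance = model_instance

class FakeNodeFactory:
    def create(self, node_id: str, node_data: dict) -> FakeLLMNode:
        model = node_data["model"]
        # Build the dependency once, centrally, then inject it.
        instance = FakeModelInstance(provider=model["provider"], name=model["name"])
        return FakeLLMNode(node_id, instance)

factory = FakeNodeFactory()
node = factory.create("llm-1", {"model": {"provider": "openai", "name": "gpt-4o"}})
```

Centralizing construction this way lets the factory validate provider/model config in one place and makes nodes easier to unit-test with fake instances.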


@ -1,16 +1,39 @@
import logging
from collections.abc import Generator
from threading import Lock
from typing import Any, cast
from sqlalchemy import select
import contexts
from core.datasource.__base.datasource_plugin import DatasourcePlugin
from core.datasource.__base.datasource_provider import DatasourcePluginProviderController
from core.datasource.entities.datasource_entities import DatasourceProviderType
from core.datasource.entities.datasource_entities import (
DatasourceMessage,
DatasourceProviderType,
GetOnlineDocumentPageContentRequest,
OnlineDriveDownloadFileRequest,
)
from core.datasource.errors import DatasourceProviderNotFoundError
from core.datasource.local_file.local_file_provider import LocalFileDatasourcePluginProviderController
from core.datasource.online_document.online_document_plugin import OnlineDocumentDatasourcePlugin
from core.datasource.online_document.online_document_provider import OnlineDocumentDatasourcePluginProviderController
from core.datasource.online_drive.online_drive_plugin import OnlineDriveDatasourcePlugin
from core.datasource.online_drive.online_drive_provider import OnlineDriveDatasourcePluginProviderController
from core.datasource.utils.message_transformer import DatasourceFileMessageTransformer
from core.datasource.website_crawl.website_crawl_provider import WebsiteCrawlDatasourcePluginProviderController
from core.db.session_factory import session_factory
from core.plugin.impl.datasource import PluginDatasourceManager
from core.workflow.entities.workflow_node_execution import WorkflowNodeExecutionStatus
from core.workflow.enums import WorkflowNodeExecutionMetadataKey
from core.workflow.file import File
from core.workflow.file.enums import FileTransferMethod, FileType
from core.workflow.node_events import NodeRunResult, StreamChunkEvent, StreamCompletedEvent
from core.workflow.repositories.datasource_manager_protocol import DatasourceParameter, OnlineDriveDownloadFileParam
from factories import file_factory
from models.model import UploadFile
from models.tools import ToolFile
from services.datasource_provider_service import DatasourceProviderService
logger = logging.getLogger(__name__)
@ -103,3 +126,238 @@ class DatasourceManager:
tenant_id,
datasource_type,
).get_datasource(datasource_name)
@classmethod
def get_icon_url(cls, provider_id: str, tenant_id: str, datasource_name: str, datasource_type: str) -> str:
datasource_runtime = cls.get_datasource_runtime(
provider_id=provider_id,
datasource_name=datasource_name,
tenant_id=tenant_id,
datasource_type=DatasourceProviderType.value_of(datasource_type),
)
return datasource_runtime.get_icon_url(tenant_id)
@classmethod
def stream_online_results(
cls,
*,
user_id: str,
datasource_name: str,
datasource_type: str,
provider_id: str,
tenant_id: str,
provider: str,
plugin_id: str,
credential_id: str,
datasource_param: DatasourceParameter | None = None,
online_drive_request: OnlineDriveDownloadFileParam | None = None,
) -> Generator[DatasourceMessage, None, Any]:
"""
Pull-based streaming of domain messages from datasource plugins.
Returns a generator that yields DatasourceMessage items and ends by returning a minimal final payload.
Only ONLINE_DOCUMENT and ONLINE_DRIVE are streamable here; other types are handled by nodes directly.
"""
ds_type = DatasourceProviderType.value_of(datasource_type)
runtime = cls.get_datasource_runtime(
provider_id=provider_id,
datasource_name=datasource_name,
tenant_id=tenant_id,
datasource_type=ds_type,
)
dsp_service = DatasourceProviderService()
credentials = dsp_service.get_datasource_credentials(
tenant_id=tenant_id,
provider=provider,
plugin_id=plugin_id,
credential_id=credential_id,
)
if ds_type == DatasourceProviderType.ONLINE_DOCUMENT:
doc_runtime = cast(OnlineDocumentDatasourcePlugin, runtime)
if credentials:
doc_runtime.runtime.credentials = credentials
if datasource_param is None:
raise ValueError("datasource_param is required for ONLINE_DOCUMENT streaming")
inner_gen: Generator[DatasourceMessage, None, None] = doc_runtime.get_online_document_page_content(
user_id=user_id,
datasource_parameters=GetOnlineDocumentPageContentRequest(
workspace_id=datasource_param.workspace_id,
page_id=datasource_param.page_id,
type=datasource_param.type,
),
provider_type=ds_type,
)
elif ds_type == DatasourceProviderType.ONLINE_DRIVE:
drive_runtime = cast(OnlineDriveDatasourcePlugin, runtime)
if credentials:
drive_runtime.runtime.credentials = credentials
if online_drive_request is None:
raise ValueError("online_drive_request is required for ONLINE_DRIVE streaming")
inner_gen = drive_runtime.online_drive_download_file(
user_id=user_id,
request=OnlineDriveDownloadFileRequest(
id=online_drive_request.id,
bucket=online_drive_request.bucket,
),
provider_type=ds_type,
)
else:
raise ValueError(f"Unsupported datasource type for streaming: {ds_type}")
# Bridge through to caller while preserving generator return contract
yield from inner_gen
# No structured final data here; node/adapter will assemble outputs
return {}
@classmethod
def stream_node_events(
cls,
*,
node_id: str,
user_id: str,
datasource_name: str,
datasource_type: str,
provider_id: str,
tenant_id: str,
provider: str,
plugin_id: str,
credential_id: str,
parameters_for_log: dict[str, Any],
datasource_info: dict[str, Any],
variable_pool: Any,
datasource_param: DatasourceParameter | None = None,
online_drive_request: OnlineDriveDownloadFileParam | None = None,
) -> Generator[StreamChunkEvent | StreamCompletedEvent, None, None]:
ds_type = DatasourceProviderType.value_of(datasource_type)
messages = cls.stream_online_results(
user_id=user_id,
datasource_name=datasource_name,
datasource_type=datasource_type,
provider_id=provider_id,
tenant_id=tenant_id,
provider=provider,
plugin_id=plugin_id,
credential_id=credential_id,
datasource_param=datasource_param,
online_drive_request=online_drive_request,
)
transformed = DatasourceFileMessageTransformer.transform_datasource_invoke_messages(
messages=messages, user_id=user_id, tenant_id=tenant_id, conversation_id=None
)
variables: dict[str, Any] = {}
file_out: File | None = None
for message in transformed:
mtype = message.type
if mtype in {
DatasourceMessage.MessageType.IMAGE_LINK,
DatasourceMessage.MessageType.BINARY_LINK,
DatasourceMessage.MessageType.IMAGE,
}:
wanted_ds_type = ds_type in {
DatasourceProviderType.ONLINE_DRIVE,
DatasourceProviderType.ONLINE_DOCUMENT,
}
if wanted_ds_type and isinstance(message.message, DatasourceMessage.TextMessage):
url = message.message.text
datasource_file_id = str(url).split("/")[-1].split(".")[0]
with session_factory.create_session() as session:
stmt = select(ToolFile).where(
ToolFile.id == datasource_file_id, ToolFile.tenant_id == tenant_id
)
datasource_file = session.scalar(stmt)
if not datasource_file:
raise ValueError(
f"ToolFile not found for file_id={datasource_file_id}, tenant_id={tenant_id}"
)
mime_type = datasource_file.mimetype
if datasource_file is not None:
mapping = {
"tool_file_id": datasource_file_id,
"type": file_factory.get_file_type_by_mime_type(mime_type),
"transfer_method": FileTransferMethod.TOOL_FILE,
"url": url,
}
file_out = file_factory.build_from_mapping(mapping=mapping, tenant_id=tenant_id)
elif mtype == DatasourceMessage.MessageType.TEXT:
assert isinstance(message.message, DatasourceMessage.TextMessage)
yield StreamChunkEvent(selector=[node_id, "text"], chunk=message.message.text, is_final=False)
elif mtype == DatasourceMessage.MessageType.LINK:
assert isinstance(message.message, DatasourceMessage.TextMessage)
yield StreamChunkEvent(
selector=[node_id, "text"], chunk=f"Link: {message.message.text}\n", is_final=False
)
elif mtype == DatasourceMessage.MessageType.VARIABLE:
assert isinstance(message.message, DatasourceMessage.VariableMessage)
name = message.message.variable_name
value = message.message.variable_value
if message.message.stream:
assert isinstance(value, str), "stream variable_value must be str"
variables[name] = variables.get(name, "") + value
yield StreamChunkEvent(selector=[node_id, name], chunk=value, is_final=False)
else:
variables[name] = value
elif mtype == DatasourceMessage.MessageType.FILE:
if ds_type == DatasourceProviderType.ONLINE_DRIVE and message.meta:
f = message.meta.get("file")
if isinstance(f, File):
file_out = f
else:
pass
yield StreamChunkEvent(selector=[node_id, "text"], chunk="", is_final=True)
if ds_type == DatasourceProviderType.ONLINE_DRIVE and file_out is not None:
variable_pool.add([node_id, "file"], file_out)
if ds_type == DatasourceProviderType.ONLINE_DOCUMENT:
yield StreamCompletedEvent(
node_run_result=NodeRunResult(
status=WorkflowNodeExecutionStatus.SUCCEEDED,
inputs=parameters_for_log,
metadata={WorkflowNodeExecutionMetadataKey.DATASOURCE_INFO: datasource_info},
outputs={**variables},
)
)
else:
yield StreamCompletedEvent(
node_run_result=NodeRunResult(
status=WorkflowNodeExecutionStatus.SUCCEEDED,
inputs=parameters_for_log,
metadata={WorkflowNodeExecutionMetadataKey.DATASOURCE_INFO: datasource_info},
outputs={
"file": file_out,
"datasource_type": ds_type,
},
)
)
@classmethod
def get_upload_file_by_id(cls, file_id: str, tenant_id: str) -> File:
with session_factory.create_session() as session:
upload_file = (
session.query(UploadFile).where(UploadFile.id == file_id, UploadFile.tenant_id == tenant_id).first()
)
if not upload_file:
raise ValueError(f"UploadFile not found for file_id={file_id}, tenant_id={tenant_id}")
file_info = File(
id=upload_file.id,
filename=upload_file.name,
extension="." + upload_file.extension,
mime_type=upload_file.mime_type,
tenant_id=tenant_id,
type=FileType.CUSTOM,
transfer_method=FileTransferMethod.LOCAL_FILE,
remote_url=upload_file.source_url,
related_id=upload_file.id,
size=upload_file.size,
storage_key=upload_file.key,
url=upload_file.source_url,
)
return file_info
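`stream_online_results` above yields messages and ends with `return {}`; in Python, a generator's return value travels in `StopIteration.value` and is captured by a delegating `yield from` expression, which is the "generator return contract" the bridge comment refers to. A minimal sketch of that mechanism:

```python
from collections.abc import Generator

def stream_messages() -> Generator[str, None, dict]:
    yield "chunk-1"
    yield "chunk-2"
    return {}  # final payload, delivered via StopIteration.value

def consume() -> tuple[list[str], dict]:
    chunks: list[str] = []

    def driver() -> Generator[str, None, dict]:
        # `yield from` forwards each chunk to the caller and
        # evaluates to the inner generator's return value.
        final = yield from stream_messages()
        return final

    gen = driver()
    try:
        while True:
            chunks.append(next(gen))
    except StopIteration as stop:
        return chunks, stop.value

assert consume() == (["chunk-1", "chunk-2"], {})
```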


@ -379,4 +379,11 @@ class OnlineDriveDownloadFileRequest(BaseModel):
"""
id: str = Field(..., description="The id of the file")
bucket: str | None = Field(None, description="The name of the bucket")
bucket: str = Field("", description="The name of the bucket")
@field_validator("bucket", mode="before")
@classmethod
def _coerce_bucket(cls, v) -> str:
if v is None:
return ""
return str(v)
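The validator above coerces a legacy `None` bucket into an empty string before field validation runs, letting the field tighten from `str | None` to plain `str`. A minimal sketch of the same pydantic v2 pattern (assumes pydantic is installed; model name is illustrative):

```python
from pydantic import BaseModel, Field, field_validator

class DownloadFileRequest(BaseModel):
    id: str = Field(..., description="The id of the file")
    bucket: str = Field("", description="The name of the bucket")

    @field_validator("bucket", mode="before")
    @classmethod
    def _coerce_bucket(cls, v: object) -> str:
        # mode="before" runs prior to type validation, so a None from
        # older callers maps to "" instead of failing the str check.
        if v is None:
            return ""
        return str(v)

assert DownloadFileRequest(id="f1", bucket=None).bucket == ""
assert DownloadFileRequest(id="f2").bucket == ""
```

This keeps the public API backward compatible while downstream code can rely on `bucket` always being a string.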


@ -1,6 +1,5 @@
import logging
from collections.abc import Mapping
from enum import StrEnum
from threading import Lock
from typing import Any
@ -14,6 +13,7 @@ from core.helper.code_executor.jinja2.jinja2_transformer import Jinja2TemplateTr
from core.helper.code_executor.python3.python3_transformer import Python3TemplateTransformer
from core.helper.code_executor.template_transformer import TemplateTransformer
from core.helper.http_client_pooling import get_pooled_http_client
from core.workflow.nodes.code.entities import CodeLanguage
logger = logging.getLogger(__name__)
code_execution_endpoint_url = URL(str(dify_config.CODE_EXECUTION_ENDPOINT))
@ -40,12 +40,6 @@ class CodeExecutionResponse(BaseModel):
data: Data
class CodeLanguage(StrEnum):
PYTHON3 = "python3"
JINJA2 = "jinja2"
JAVASCRIPT = "javascript"
def _build_code_executor_client() -> httpx.Client:
return httpx.Client(
verify=CODE_EXECUTION_SSL_VERIFY,


@ -5,7 +5,7 @@ from base64 import b64encode
from collections.abc import Mapping
from typing import Any
from core.variables.utils import dumps_with_segments
from core.workflow.variables.utils import dumps_with_segments
class TemplateTransformer(ABC):

View File

@ -1,5 +1,5 @@
import logging
from collections.abc import Callable, Generator, Iterable, Sequence
from collections.abc import Callable, Generator, Iterable, Mapping, Sequence
from typing import IO, Any, Literal, Optional, Union, cast, overload
from configs import dify_config
@ -38,6 +38,9 @@ class ModelInstance:
self.model_name = model
self.provider = provider_model_bundle.configuration.provider.provider
self.credentials = self._fetch_credentials_from_bundle(provider_model_bundle, model)
# Runtime LLM invocation fields.
self.parameters: Mapping[str, Any] = {}
self.stop: Sequence[str] = ()
self.model_type_instance = self.provider_model_bundle.model_type_instance
self.load_balancing_manager = self._get_load_balancing_manager(
configuration=provider_model_bundle.configuration,

View File

@ -83,19 +83,21 @@ def _merge_tool_call_delta(
tool_call.function.arguments += delta.function.arguments
def _build_llm_result_from_first_chunk(
def _build_llm_result_from_chunks(
model: str,
prompt_messages: Sequence[PromptMessage],
chunks: Iterator[LLMResultChunk],
) -> LLMResult:
"""
Build a single `LLMResult` from the first returned chunk.
Build a single `LLMResult` by accumulating all returned chunks.
This is used for `stream=False` because the plugin side may still implement the response via a chunked stream.
Some models only support streaming output (e.g. Qwen3 open-source edition)
and the plugin side may still implement the response via a chunked stream,
so all chunks must be consumed and concatenated into a single ``LLMResult``.
Note:
This function always drains the `chunks` iterator after reading the first chunk to ensure any underlying
streaming resources are released (e.g., HTTP connections owned by the plugin runtime).
The ``usage`` is taken from the last chunk that carries it, which is the
typical convention for streaming responses (the final chunk contains the
aggregated token counts).
"""
content = ""
content_list: list[PromptMessageContentUnionTypes] = []
@ -104,24 +106,27 @@ def _build_llm_result_from_first_chunk(
tools_calls: list[AssistantPromptMessage.ToolCall] = []
try:
first_chunk = next(chunks, None)
if first_chunk is not None:
if isinstance(first_chunk.delta.message.content, str):
content += first_chunk.delta.message.content
elif isinstance(first_chunk.delta.message.content, list):
content_list.extend(first_chunk.delta.message.content)
for chunk in chunks:
if isinstance(chunk.delta.message.content, str):
content += chunk.delta.message.content
elif isinstance(chunk.delta.message.content, list):
content_list.extend(chunk.delta.message.content)
if first_chunk.delta.message.tool_calls:
_increase_tool_call(first_chunk.delta.message.tool_calls, tools_calls)
if chunk.delta.message.tool_calls:
_increase_tool_call(chunk.delta.message.tool_calls, tools_calls)
usage = first_chunk.delta.usage or LLMUsage.empty_usage()
system_fingerprint = first_chunk.system_fingerprint
if chunk.delta.usage:
usage = chunk.delta.usage
if chunk.system_fingerprint:
system_fingerprint = chunk.system_fingerprint
except Exception:
logger.exception("Error while consuming non-stream plugin chunk iterator.")
raise
finally:
try:
for _ in chunks:
pass
except Exception:
logger.debug("Failed to drain non-stream plugin chunk iterator.", exc_info=True)
# Drain any remaining chunks to release underlying streaming resources (e.g. HTTP connections).
close = getattr(chunks, "close", None)
if callable(close):
close()
return LLMResult(
model=model,
@ -174,7 +179,7 @@ def _normalize_non_stream_plugin_result(
) -> LLMResult:
if isinstance(result, LLMResult):
return result
return _build_llm_result_from_first_chunk(model=model, prompt_messages=prompt_messages, chunks=result)
return _build_llm_result_from_chunks(model=model, prompt_messages=prompt_messages, chunks=result)
def _increase_tool_call(
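The accumulate-all-chunks pattern above can be sketched independently of the plugin types; a minimal version, assuming chunks are plain strings and the iterator may own a closable resource:

```python
from collections.abc import Iterator


def accumulate_chunks(chunks: Iterator[str]) -> str:
    """Concatenate every chunk into one result, as _build_llm_result_from_chunks
    does for models that only support streaming output."""
    parts: list[str] = []
    try:
        for chunk in chunks:
            parts.append(chunk)
    finally:
        # Close the iterator if it supports it, releasing any underlying
        # streaming resources (e.g. an HTTP connection held by a generator).
        close = getattr(chunks, "close", None)
        if callable(close):
            close()
    return "".join(parts)
```

Calling `close()` in `finally` covers the early-exit path: generators run their own `finally` blocks when closed, so resources are released even if iteration stops midway.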

View File

@ -14,10 +14,9 @@ class BaseTraceInstance(ABC):
Base trace instance for ops trace services
"""
@abstractmethod
def __init__(self, trace_config: BaseTracingConfig):
"""
Abstract initializer for the trace instance.
Initializer for the trace instance.
Distribute trace tasks by matching entities
"""
self.trace_config = trace_config

View File

@ -41,8 +41,8 @@ logger = logging.getLogger(__name__)
class OpsTraceProviderConfigMap(collections.UserDict[str, dict[str, Any]]):
def __getitem__(self, provider: str) -> dict[str, Any]:
match provider:
def __getitem__(self, key: str) -> dict[str, Any]:
match key:
case TracingProviderEnum.LANGFUSE:
from core.ops.entities.config_entity import LangfuseConfig
from core.ops.langfuse_trace.langfuse_trace import LangFuseDataTrace
@ -149,7 +149,7 @@ class OpsTraceProviderConfigMap(collections.UserDict[str, dict[str, Any]]):
}
case _:
raise KeyError(f"Unsupported tracing provider: {provider}")
raise KeyError(f"Unsupported tracing provider: {key}")
provider_config_map = OpsTraceProviderConfigMap()

View File

@ -4,6 +4,7 @@ from typing import cast
from core.app.entities.app_invoke_entities import ModelConfigWithCredentialsEntity
from core.helper.code_executor.jinja2.jinja2_formatter import Jinja2Formatter
from core.memory.token_buffer_memory import TokenBufferMemory
from core.model_manager import ModelInstance
from core.model_runtime.entities import (
AssistantPromptMessage,
PromptMessage,
@ -44,7 +45,8 @@ class AdvancedPromptTransform(PromptTransform):
context: str | None,
memory_config: MemoryConfig | None,
memory: TokenBufferMemory | None,
model_config: ModelConfigWithCredentialsEntity,
model_config: ModelConfigWithCredentialsEntity | None = None,
model_instance: ModelInstance | None = None,
image_detail_config: ImagePromptMessageContent.DETAIL | None = None,
) -> list[PromptMessage]:
prompt_messages = []
@ -59,6 +61,7 @@ class AdvancedPromptTransform(PromptTransform):
memory_config=memory_config,
memory=memory,
model_config=model_config,
model_instance=model_instance,
image_detail_config=image_detail_config,
)
elif isinstance(prompt_template, list) and all(isinstance(item, ChatModelMessage) for item in prompt_template):
@ -71,6 +74,7 @@ class AdvancedPromptTransform(PromptTransform):
memory_config=memory_config,
memory=memory,
model_config=model_config,
model_instance=model_instance,
image_detail_config=image_detail_config,
)
@ -85,7 +89,8 @@ class AdvancedPromptTransform(PromptTransform):
context: str | None,
memory_config: MemoryConfig | None,
memory: TokenBufferMemory | None,
model_config: ModelConfigWithCredentialsEntity,
model_config: ModelConfigWithCredentialsEntity | None = None,
model_instance: ModelInstance | None = None,
image_detail_config: ImagePromptMessageContent.DETAIL | None = None,
) -> list[PromptMessage]:
"""
@ -111,6 +116,7 @@ class AdvancedPromptTransform(PromptTransform):
parser=parser,
prompt_inputs=prompt_inputs,
model_config=model_config,
model_instance=model_instance,
)
if query:
@ -146,7 +152,8 @@ class AdvancedPromptTransform(PromptTransform):
context: str | None,
memory_config: MemoryConfig | None,
memory: TokenBufferMemory | None,
model_config: ModelConfigWithCredentialsEntity,
model_config: ModelConfigWithCredentialsEntity | None = None,
model_instance: ModelInstance | None = None,
image_detail_config: ImagePromptMessageContent.DETAIL | None = None,
) -> list[PromptMessage]:
"""
@ -198,8 +205,13 @@ class AdvancedPromptTransform(PromptTransform):
prompt_message_contents: list[PromptMessageContentUnionTypes] = []
if memory and memory_config:
prompt_messages = self._append_chat_histories(memory, memory_config, prompt_messages, model_config)
prompt_messages = self._append_chat_histories(
memory,
memory_config,
prompt_messages,
model_config=model_config,
model_instance=model_instance,
)
if files and query is not None:
for file in files:
prompt_message_contents.append(
@ -276,7 +288,8 @@ class AdvancedPromptTransform(PromptTransform):
role_prefix: MemoryConfig.RolePrefix,
parser: PromptTemplateParser,
prompt_inputs: Mapping[str, str],
model_config: ModelConfigWithCredentialsEntity,
model_config: ModelConfigWithCredentialsEntity | None = None,
model_instance: ModelInstance | None = None,
) -> Mapping[str, str]:
prompt_inputs = dict(prompt_inputs)
if "#histories#" in parser.variable_keys:
@ -286,7 +299,11 @@ class AdvancedPromptTransform(PromptTransform):
prompt_inputs = {k: inputs[k] for k in parser.variable_keys if k in inputs}
tmp_human_message = UserPromptMessage(content=parser.format(prompt_inputs))
rest_tokens = self._calculate_rest_token([tmp_human_message], model_config)
rest_tokens = self._calculate_rest_token(
[tmp_human_message],
model_config=model_config,
model_instance=model_instance,
)
histories = self._get_history_messages_from_memory(
memory=memory,

View File

@ -41,7 +41,7 @@ class AgentHistoryPromptTransform(PromptTransform):
if not self.memory:
return prompt_messages
max_token_limit = self._calculate_rest_token(self.prompt_messages, self.model_config)
max_token_limit = self._calculate_rest_token(self.prompt_messages, model_config=self.model_config)
model_type_instance = self.model_config.provider_model_bundle.model_type_instance
model_type_instance = cast(LargeLanguageModel, model_type_instance)

View File

@ -4,45 +4,83 @@ from core.app.entities.app_invoke_entities import ModelConfigWithCredentialsEnti
from core.memory.token_buffer_memory import TokenBufferMemory
from core.model_manager import ModelInstance
from core.model_runtime.entities.message_entities import PromptMessage
from core.model_runtime.entities.model_entities import ModelPropertyKey
from core.model_runtime.entities.model_entities import AIModelEntity, ModelPropertyKey
from core.prompt.entities.advanced_prompt_entities import MemoryConfig
class PromptTransform:
def _resolve_model_runtime(
self,
*,
model_config: ModelConfigWithCredentialsEntity | None = None,
model_instance: ModelInstance | None = None,
) -> tuple[ModelInstance, AIModelEntity]:
if model_instance is None:
if model_config is None:
raise ValueError("Either model_config or model_instance must be provided.")
model_instance = ModelInstance(
provider_model_bundle=model_config.provider_model_bundle, model=model_config.model
)
model_instance.credentials = model_config.credentials
model_instance.parameters = model_config.parameters
model_instance.stop = model_config.stop
model_schema = model_instance.model_type_instance.get_model_schema(
model=model_instance.model_name,
credentials=model_instance.credentials,
)
if model_schema is None:
if model_config is None:
raise ValueError("Model schema not found for the provided model instance.")
model_schema = model_config.model_schema
return model_instance, model_schema
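The resolution order in `_resolve_model_runtime` (prefer an explicit instance, fall back to building one from the config, and fall back to the config's schema when the instance cannot supply one) can be sketched with stand-in types; the dataclasses below are illustrative, not the real entities:

```python
from dataclasses import dataclass


@dataclass
class FakeConfig:
    model: str
    schema: str


@dataclass
class FakeInstance:
    model: str
    schema: "str | None" = None


def resolve_runtime(
    *,
    config: "FakeConfig | None" = None,
    instance: "FakeInstance | None" = None,
) -> "tuple[FakeInstance, str]":
    # Prefer the caller-supplied instance; otherwise derive one from the config.
    if instance is None:
        if config is None:
            raise ValueError("Either model_config or model_instance must be provided.")
        instance = FakeInstance(model=config.model, schema=config.schema)
    # An instance without a schema falls back to the config's schema.
    schema = instance.schema
    if schema is None:
        if config is None:
            raise ValueError("Model schema not found for the provided model instance.")
        schema = config.schema
    return instance, schema
```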
def _append_chat_histories(
self,
memory: TokenBufferMemory,
memory_config: MemoryConfig,
prompt_messages: list[PromptMessage],
model_config: ModelConfigWithCredentialsEntity,
*,
model_config: ModelConfigWithCredentialsEntity | None = None,
model_instance: ModelInstance | None = None,
) -> list[PromptMessage]:
rest_tokens = self._calculate_rest_token(prompt_messages, model_config)
rest_tokens = self._calculate_rest_token(
prompt_messages,
model_config=model_config,
model_instance=model_instance,
)
histories = self._get_history_messages_list_from_memory(memory, memory_config, rest_tokens)
prompt_messages.extend(histories)
return prompt_messages
def _calculate_rest_token(
self, prompt_messages: list[PromptMessage], model_config: ModelConfigWithCredentialsEntity
self,
prompt_messages: list[PromptMessage],
*,
model_config: ModelConfigWithCredentialsEntity | None = None,
model_instance: ModelInstance | None = None,
) -> int:
model_instance, model_schema = self._resolve_model_runtime(
model_config=model_config,
model_instance=model_instance,
)
model_parameters = model_instance.parameters
rest_tokens = 2000
model_context_tokens = model_config.model_schema.model_properties.get(ModelPropertyKey.CONTEXT_SIZE)
model_context_tokens = model_schema.model_properties.get(ModelPropertyKey.CONTEXT_SIZE)
if model_context_tokens:
model_instance = ModelInstance(
provider_model_bundle=model_config.provider_model_bundle, model=model_config.model
)
curr_message_tokens = model_instance.get_llm_num_tokens(prompt_messages)
max_tokens = 0
for parameter_rule in model_config.model_schema.parameter_rules:
for parameter_rule in model_schema.parameter_rules:
if parameter_rule.name == "max_tokens" or (
parameter_rule.use_template and parameter_rule.use_template == "max_tokens"
):
max_tokens = (
model_config.parameters.get(parameter_rule.name)
or model_config.parameters.get(parameter_rule.use_template or "")
model_parameters.get(parameter_rule.name)
or model_parameters.get(parameter_rule.use_template or "")
) or 0
rest_tokens = model_context_tokens - max_tokens - curr_message_tokens
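The budget arithmetic in the hunk above is: remaining tokens = context window minus the reserved `max_tokens` minus the tokens already consumed by the prompt, with a 2000-token default when the context size is unknown. As a standalone sketch (the concrete numbers in the usage are assumptions for illustration):

```python
def calculate_rest_tokens(
    context_size: "int | None",
    max_tokens: int,
    prompt_tokens: int,
    default: int = 2000,
) -> int:
    # Unknown or zero context size: keep the conservative default budget,
    # matching the `if model_context_tokens:` guard above.
    if not context_size:
        return default
    return context_size - max_tokens - prompt_tokens
```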

View File

@ -252,7 +252,7 @@ class SimplePromptTransform(PromptTransform):
if memory:
tmp_human_message = UserPromptMessage(content=prompt)
rest_tokens = self._calculate_rest_token([tmp_human_message], model_config)
rest_tokens = self._calculate_rest_token([tmp_human_message], model_config=model_config)
histories = self._get_history_messages_from_memory(
memory=memory,
memory_config=MemoryConfig(

View File

@ -75,15 +75,15 @@ class BaseIndexProcessor(ABC):
multimodal_documents: list[AttachmentDocument] | None = None,
with_keywords: bool = True,
**kwargs,
):
) -> None:
raise NotImplementedError
@abstractmethod
def clean(self, dataset: Dataset, node_ids: list[str] | None, with_keywords: bool = True, **kwargs):
def clean(self, dataset: Dataset, node_ids: list[str] | None, with_keywords: bool = True, **kwargs) -> None:
raise NotImplementedError
@abstractmethod
def index(self, dataset: Dataset, document: DatasetDocument, chunks: Any):
def index(self, dataset: Dataset, document: DatasetDocument, chunks: Any) -> None:
raise NotImplementedError
@abstractmethod

View File

@ -115,7 +115,7 @@ class ParagraphIndexProcessor(BaseIndexProcessor):
multimodal_documents: list[AttachmentDocument] | None = None,
with_keywords: bool = True,
**kwargs,
):
) -> None:
if dataset.indexing_technique == "high_quality":
vector = Vector(dataset)
vector.create(documents)
@ -130,7 +130,7 @@ class ParagraphIndexProcessor(BaseIndexProcessor):
else:
keyword.add_texts(documents)
def clean(self, dataset: Dataset, node_ids: list[str] | None, with_keywords: bool = True, **kwargs):
def clean(self, dataset: Dataset, node_ids: list[str] | None, with_keywords: bool = True, **kwargs) -> None:
# Note: Summary indexes are now disabled (not deleted) when segments are disabled.
# This method is called for actual deletion scenarios (e.g., when segment is deleted).
# For disable operations, disable_summaries_for_segments is called directly in the task.
@ -196,7 +196,7 @@ class ParagraphIndexProcessor(BaseIndexProcessor):
docs.append(doc)
return docs
def index(self, dataset: Dataset, document: DatasetDocument, chunks: Any):
def index(self, dataset: Dataset, document: DatasetDocument, chunks: Any) -> None:
documents: list[Any] = []
all_multimodal_documents: list[Any] = []
if isinstance(chunks, list):
@ -469,7 +469,7 @@ class ParagraphIndexProcessor(BaseIndexProcessor):
if not isinstance(result, LLMResult):
raise ValueError("Expected LLMResult when stream=False")
summary_content = getattr(result.message, "content", "")
summary_content = result.message.get_text_content()
usage = result.usage
# Deduct quota for summary generation (same as workflow nodes)

View File

@ -126,7 +126,7 @@ class ParentChildIndexProcessor(BaseIndexProcessor):
multimodal_documents: list[AttachmentDocument] | None = None,
with_keywords: bool = True,
**kwargs,
):
) -> None:
if dataset.indexing_technique == "high_quality":
vector = Vector(dataset)
for document in documents:
@ -139,7 +139,7 @@ class ParentChildIndexProcessor(BaseIndexProcessor):
if multimodal_documents and dataset.is_multimodal:
vector.create_multimodal(multimodal_documents)
def clean(self, dataset: Dataset, node_ids: list[str] | None, with_keywords: bool = True, **kwargs):
def clean(self, dataset: Dataset, node_ids: list[str] | None, with_keywords: bool = True, **kwargs) -> None:
# node_ids is segment's node_ids
# Note: Summary indexes are now disabled (not deleted) when segments are disabled.
# This method is called for actual deletion scenarios (e.g., when segment is deleted).
@ -272,7 +272,7 @@ class ParentChildIndexProcessor(BaseIndexProcessor):
child_nodes.append(child_document)
return child_nodes
def index(self, dataset: Dataset, document: DatasetDocument, chunks: Any):
def index(self, dataset: Dataset, document: DatasetDocument, chunks: Any) -> None:
parent_childs = ParentChildStructureChunk.model_validate(chunks)
documents = []
for parent_child in parent_childs.parent_child_chunks:

View File

@ -139,14 +139,14 @@ class QAIndexProcessor(BaseIndexProcessor):
multimodal_documents: list[AttachmentDocument] | None = None,
with_keywords: bool = True,
**kwargs,
):
) -> None:
if dataset.indexing_technique == "high_quality":
vector = Vector(dataset)
vector.create(documents)
if multimodal_documents and dataset.is_multimodal:
vector.create_multimodal(multimodal_documents)
def clean(self, dataset: Dataset, node_ids: list[str] | None, with_keywords: bool = True, **kwargs):
def clean(self, dataset: Dataset, node_ids: list[str] | None, with_keywords: bool = True, **kwargs) -> None:
# Note: Summary indexes are now disabled (not deleted) when segments are disabled.
# This method is called for actual deletion scenarios (e.g., when segment is deleted).
# For disable operations, disable_summaries_for_segments is called directly in the task.
@ -206,7 +206,7 @@ class QAIndexProcessor(BaseIndexProcessor):
docs.append(doc)
return docs
def index(self, dataset: Dataset, document: DatasetDocument, chunks: Any):
def index(self, dataset: Dataset, document: DatasetDocument, chunks: Any) -> None:
qa_chunks = QAStructureChunk.model_validate(chunks)
documents = []
for qa_chunk in qa_chunks.qa_chunks:

View File

@ -1,7 +1,7 @@
import abc
from typing import Protocol
from core.variables import VariableBase
from core.workflow.variables import VariableBase
class ConversationVariableUpdater(Protocol):

View File

@ -11,7 +11,7 @@ from typing import Any
from pydantic import BaseModel, Field
from core.variables.variables import Variable
from core.workflow.variables.variables import Variable
class CommandType(StrEnum):

View File

@ -25,7 +25,6 @@ from core.tools.entities.tool_entities import (
)
from core.tools.tool_manager import ToolManager
from core.tools.utils.message_transformer import ToolFileMessageTransformer
from core.variables.segments import ArrayFileSegment, StringSegment
from core.workflow.enums import (
NodeType,
SystemVariableKey,
@ -44,6 +43,7 @@ from core.workflow.nodes.agent.entities import AgentNodeData, AgentOldVersionMod
from core.workflow.nodes.base.node import Node
from core.workflow.nodes.base.variable_template_parser import VariableTemplateParser
from core.workflow.runtime import VariablePool
from core.workflow.variables.segments import ArrayFileSegment, StringSegment
from extensions.ext_database import db
from factories import file_factory
from factories.agent_factory import get_plugin_agent_strategy

View File

@ -1,13 +1,13 @@
from collections.abc import Mapping, Sequence
from typing import Any
from core.variables import ArrayFileSegment, FileSegment, Segment
from core.workflow.enums import NodeExecutionType, NodeType, WorkflowNodeExecutionStatus
from core.workflow.node_events import NodeRunResult
from core.workflow.nodes.answer.entities import AnswerNodeData
from core.workflow.nodes.base.node import Node
from core.workflow.nodes.base.template import Template
from core.workflow.nodes.base.variable_template_parser import VariableTemplateParser
from core.workflow.variables import ArrayFileSegment, FileSegment, Segment
class AnswerNode(Node[AnswerNodeData]):

View File

@ -1,17 +1,15 @@
from collections.abc import Mapping, Sequence
from decimal import Decimal
from typing import TYPE_CHECKING, Any, ClassVar, Protocol, cast
from textwrap import dedent
from typing import TYPE_CHECKING, Any, Protocol, cast
from core.helper.code_executor.code_node_provider import CodeNodeProvider
from core.helper.code_executor.javascript.javascript_code_provider import JavascriptCodeProvider
from core.helper.code_executor.python3.python3_code_provider import Python3CodeProvider
from core.variables.segments import ArrayFileSegment
from core.variables.types import SegmentType
from core.workflow.enums import NodeType, WorkflowNodeExecutionStatus
from core.workflow.node_events import NodeRunResult
from core.workflow.nodes.base.node import Node
from core.workflow.nodes.code.entities import CodeLanguage, CodeNodeData
from core.workflow.nodes.code.limits import CodeNodeLimits
from core.workflow.variables.segments import ArrayFileSegment
from core.workflow.variables.types import SegmentType
from .exc import (
CodeNodeError,
@ -36,12 +34,44 @@ class WorkflowCodeExecutor(Protocol):
def is_execution_error(self, error: Exception) -> bool: ...
def _build_default_config(*, language: CodeLanguage, code: str) -> Mapping[str, object]:
return {
"type": "code",
"config": {
"variables": [
{"variable": "arg1", "value_selector": []},
{"variable": "arg2", "value_selector": []},
],
"code_language": language,
"code": code,
"outputs": {"result": {"type": "string", "children": None}},
},
}
_DEFAULT_CODE_BY_LANGUAGE: Mapping[CodeLanguage, str] = {
CodeLanguage.PYTHON3: dedent(
"""
def main(arg1: str, arg2: str):
return {
"result": arg1 + arg2,
}
"""
),
CodeLanguage.JAVASCRIPT: dedent(
"""
function main({arg1, arg2}) {
return {
result: arg1 + arg2
}
}
"""
),
}
class CodeNode(Node[CodeNodeData]):
node_type = NodeType.CODE
_DEFAULT_CODE_PROVIDERS: ClassVar[tuple[type[CodeNodeProvider], ...]] = (
Python3CodeProvider,
JavascriptCodeProvider,
)
_limits: CodeNodeLimits
def __init__(
@ -52,7 +82,6 @@ class CodeNode(Node[CodeNodeData]):
graph_runtime_state: "GraphRuntimeState",
*,
code_executor: WorkflowCodeExecutor,
code_providers: Sequence[type[CodeNodeProvider]] | None = None,
code_limits: CodeNodeLimits,
) -> None:
super().__init__(
@ -62,9 +91,6 @@ class CodeNode(Node[CodeNodeData]):
graph_runtime_state=graph_runtime_state,
)
self._code_executor: WorkflowCodeExecutor = code_executor
self._code_providers: tuple[type[CodeNodeProvider], ...] = (
tuple(code_providers) if code_providers else self._DEFAULT_CODE_PROVIDERS
)
self._limits = code_limits
@classmethod
@ -78,15 +104,10 @@ class CodeNode(Node[CodeNodeData]):
if filters:
code_language = cast(CodeLanguage, filters.get("code_language", CodeLanguage.PYTHON3))
code_provider: type[CodeNodeProvider] = next(
provider for provider in cls._DEFAULT_CODE_PROVIDERS if provider.is_accept_language(code_language)
)
return code_provider.get_default_config()
@classmethod
def default_code_providers(cls) -> tuple[type[CodeNodeProvider], ...]:
return cls._DEFAULT_CODE_PROVIDERS
default_code = _DEFAULT_CODE_BY_LANGUAGE.get(code_language)
if default_code is None:
raise CodeNodeError(f"Unsupported code language: {code_language}")
return _build_default_config(language=code_language, code=default_code)
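The provider classes are replaced with a plain language-to-code lookup plus a config builder; the shape of that lookup can be sketched as follows (simplified, names assumed):

```python
from textwrap import dedent

_DEFAULTS: "dict[str, str]" = {
    "python3": dedent(
        """
        def main(arg1: str, arg2: str):
            return {"result": arg1 + arg2}
        """
    ),
}


def get_default_config(language: str) -> dict:
    # Unknown languages fail fast instead of silently falling back.
    code = _DEFAULTS.get(language)
    if code is None:
        raise ValueError(f"Unsupported code language: {language}")
    return {"type": "code", "config": {"code_language": language, "code": code}}
```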
@classmethod
def version(cls) -> str:
@ -108,7 +129,6 @@ class CodeNode(Node[CodeNodeData]):
variables[variable_name] = variable.to_object() if variable else None
# Run code
try:
_ = self._select_code_provider(code_language)
result = self._code_executor.execute(
language=code_language,
code=code,
@ -130,12 +150,6 @@ class CodeNode(Node[CodeNodeData]):
return NodeRunResult(status=WorkflowNodeExecutionStatus.SUCCEEDED, inputs=variables, outputs=result)
def _select_code_provider(self, code_language: CodeLanguage) -> type[CodeNodeProvider]:
for provider in self._code_providers:
if provider.is_accept_language(code_language):
return provider
raise CodeNodeError(f"Unsupported code language: {code_language}")
def _check_string(self, value: str | None, variable: str) -> str | None:
"""
Check string

View File

@ -1,11 +1,18 @@
from enum import StrEnum
from typing import Annotated, Literal
from pydantic import AfterValidator, BaseModel
from core.helper.code_executor.code_executor import CodeLanguage
from core.variables.types import SegmentType
from core.workflow.nodes.base import BaseNodeData
from core.workflow.nodes.base.entities import VariableSelector
from core.workflow.variables.types import SegmentType
class CodeLanguage(StrEnum):
PYTHON3 = "python3"
JINJA2 = "jinja2"
JAVASCRIPT = "javascript"
_ALLOWED_OUTPUT_FROM_CODE = frozenset(
[

View File

@ -1,40 +1,26 @@
from collections.abc import Generator, Mapping, Sequence
from typing import Any, cast
from typing import TYPE_CHECKING, Any
from sqlalchemy import select
from sqlalchemy.orm import Session
from core.datasource.entities.datasource_entities import (
DatasourceMessage,
DatasourceParameter,
DatasourceProviderType,
GetOnlineDocumentPageContentRequest,
OnlineDriveDownloadFileRequest,
)
from core.datasource.online_document.online_document_plugin import OnlineDocumentDatasourcePlugin
from core.datasource.online_drive.online_drive_plugin import OnlineDriveDatasourcePlugin
from core.datasource.utils.message_transformer import DatasourceFileMessageTransformer
from core.datasource.entities.datasource_entities import DatasourceProviderType
from core.plugin.impl.exc import PluginDaemonClientSideError
from core.variables.segments import ArrayAnySegment
from core.variables.variables import ArrayAnyVariable
from core.workflow.entities.workflow_node_execution import WorkflowNodeExecutionStatus
from core.workflow.enums import NodeExecutionType, NodeType, SystemVariableKey
from core.workflow.file import File
from core.workflow.file.enums import FileTransferMethod, FileType
from core.workflow.node_events import NodeRunResult, StreamChunkEvent, StreamCompletedEvent
from core.workflow.node_events import NodeRunResult, StreamCompletedEvent
from core.workflow.nodes.base.node import Node
from core.workflow.nodes.base.variable_template_parser import VariableTemplateParser
from core.workflow.nodes.tool.exc import ToolFileError
from core.workflow.runtime import VariablePool
from extensions.ext_database import db
from factories import file_factory
from models.model import UploadFile
from models.tools import ToolFile
from services.datasource_provider_service import DatasourceProviderService
from core.workflow.repositories.datasource_manager_protocol import (
DatasourceManagerProtocol,
DatasourceParameter,
OnlineDriveDownloadFileParam,
)
from ...entities.workflow_node_execution import WorkflowNodeExecutionMetadataKey
from .entities import DatasourceNodeData
from .exc import DatasourceNodeError, DatasourceParameterError
from .exc import DatasourceNodeError
if TYPE_CHECKING:
from core.workflow.entities import GraphInitParams
from core.workflow.runtime import GraphRuntimeState
class DatasourceNode(Node[DatasourceNodeData]):
@ -45,6 +31,22 @@ class DatasourceNode(Node[DatasourceNodeData]):
node_type = NodeType.DATASOURCE
execution_type = NodeExecutionType.ROOT
def __init__(
self,
id: str,
config: Mapping[str, Any],
graph_init_params: "GraphInitParams",
graph_runtime_state: "GraphRuntimeState",
datasource_manager: DatasourceManagerProtocol,
):
super().__init__(
id=id,
config=config,
graph_init_params=graph_init_params,
graph_runtime_state=graph_runtime_state,
)
self.datasource_manager = datasource_manager
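Injecting the manager through the constructor lets tests substitute any object satisfying the protocol; a minimal sketch of that structural-typing pattern (the single method below is a simplified stand-in for the real `DatasourceManagerProtocol`):

```python
from typing import Protocol


class ManagerProtocol(Protocol):
    def get_icon_url(self, provider_id: str) -> str: ...


class FakeManager:
    # Satisfies ManagerProtocol structurally; no inheritance needed.
    def get_icon_url(self, provider_id: str) -> str:
        return f"https://icons.example/{provider_id}.png"


class NodeUnderTest:
    def __init__(self, manager: ManagerProtocol) -> None:
        self.manager = manager


node = NodeUnderTest(FakeManager())
```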
def _run(self) -> Generator:
"""
Run the datasource node
@ -52,84 +54,69 @@ class DatasourceNode(Node[DatasourceNodeData]):
node_data = self.node_data
variable_pool = self.graph_runtime_state.variable_pool
datasource_type_segement = variable_pool.get(["sys", SystemVariableKey.DATASOURCE_TYPE])
if not datasource_type_segement:
datasource_type_segment = variable_pool.get(["sys", SystemVariableKey.DATASOURCE_TYPE])
if not datasource_type_segment:
raise DatasourceNodeError("Datasource type is not set")
datasource_type = str(datasource_type_segement.value) if datasource_type_segement.value else None
datasource_info_segement = variable_pool.get(["sys", SystemVariableKey.DATASOURCE_INFO])
if not datasource_info_segement:
datasource_type = str(datasource_type_segment.value) if datasource_type_segment.value else None
datasource_info_segment = variable_pool.get(["sys", SystemVariableKey.DATASOURCE_INFO])
if not datasource_info_segment:
raise DatasourceNodeError("Datasource info is not set")
datasource_info_value = datasource_info_segement.value
datasource_info_value = datasource_info_segment.value
if not isinstance(datasource_info_value, dict):
raise DatasourceNodeError("Invalid datasource info format")
datasource_info: dict[str, Any] = datasource_info_value
# get datasource runtime
from core.datasource.datasource_manager import DatasourceManager
if datasource_type is None:
raise DatasourceNodeError("Datasource type is not set")
datasource_type = DatasourceProviderType.value_of(datasource_type)
provider_id = f"{node_data.plugin_id}/{node_data.provider_name}"
datasource_runtime = DatasourceManager.get_datasource_runtime(
provider_id=f"{node_data.plugin_id}/{node_data.provider_name}",
datasource_info["icon"] = self.datasource_manager.get_icon_url(
provider_id=provider_id,
datasource_name=node_data.datasource_name or "",
tenant_id=self.tenant_id,
datasource_type=datasource_type,
datasource_type=datasource_type.value,
)
datasource_info["icon"] = datasource_runtime.get_icon_url(self.tenant_id)
parameters_for_log = datasource_info
try:
datasource_provider_service = DatasourceProviderService()
credentials = datasource_provider_service.get_datasource_credentials(
tenant_id=self.tenant_id,
provider=node_data.provider_name,
plugin_id=node_data.plugin_id,
credential_id=datasource_info.get("credential_id", ""),
)
match datasource_type:
case DatasourceProviderType.ONLINE_DOCUMENT:
datasource_runtime = cast(OnlineDocumentDatasourcePlugin, datasource_runtime)
if credentials:
datasource_runtime.runtime.credentials = credentials
                online_document_result: Generator[DatasourceMessage, None, None] = (
                    datasource_runtime.get_online_document_page_content(
                        user_id=self.user_id,
                        datasource_parameters=GetOnlineDocumentPageContentRequest(
                            workspace_id=datasource_info.get("workspace_id", ""),
                            page_id=datasource_info.get("page", {}).get("page_id", ""),
                            type=datasource_info.get("page", {}).get("type", ""),
                        ),
                        provider_type=datasource_type,
                    )
                )
                yield from self._transform_message(
                    messages=online_document_result,
                    parameters_for_log=parameters_for_log,
                    datasource_info=datasource_info,
                )
            case DatasourceProviderType.ONLINE_DRIVE:
                datasource_runtime = cast(OnlineDriveDatasourcePlugin, datasource_runtime)
                if credentials:
                    datasource_runtime.runtime.credentials = credentials
                online_drive_result: Generator[DatasourceMessage, None, None] = (
                    datasource_runtime.online_drive_download_file(
                        user_id=self.user_id,
                        request=OnlineDriveDownloadFileRequest(
                            id=datasource_info.get("id", ""),
                            bucket=datasource_info.get("bucket"),
                        ),
                        provider_type=datasource_type,
                    )
                )
                yield from self._transform_datasource_file_message(
                    messages=online_drive_result,
                    parameters_for_log=parameters_for_log,
                    datasource_info=datasource_info,
                    variable_pool=variable_pool,
                    datasource_type=datasource_type,
                )
            case DatasourceProviderType.ONLINE_DOCUMENT | DatasourceProviderType.ONLINE_DRIVE:
                # Build typed request objects
                datasource_parameters = None
                if datasource_type == DatasourceProviderType.ONLINE_DOCUMENT:
                    datasource_parameters = DatasourceParameter(
                        workspace_id=datasource_info.get("workspace_id", ""),
                        page_id=datasource_info.get("page", {}).get("page_id", ""),
                        type=datasource_info.get("page", {}).get("type", ""),
                    )
                online_drive_request = None
                if datasource_type == DatasourceProviderType.ONLINE_DRIVE:
                    online_drive_request = OnlineDriveDownloadFileParam(
                        id=datasource_info.get("id", ""),
                        bucket=datasource_info.get("bucket", ""),
                    )
                credential_id = datasource_info.get("credential_id", "")
                yield from self.datasource_manager.stream_node_events(
                    node_id=self._node_id,
                    user_id=self.user_id,
                    datasource_name=node_data.datasource_name or "",
                    datasource_type=datasource_type.value,
                    provider_id=provider_id,
                    tenant_id=self.tenant_id,
                    provider=node_data.provider_name,
                    plugin_id=node_data.plugin_id,
                    credential_id=credential_id,
                    datasource_param=datasource_parameters,
                    online_drive_request=online_drive_request,
                )
case DatasourceProviderType.WEBSITE_CRAWL:
yield StreamCompletedEvent(
@@ -147,23 +134,9 @@ class DatasourceNode(Node[DatasourceNodeData]):
related_id = datasource_info.get("related_id")
if not related_id:
raise DatasourceNodeError("File does not exist")
        upload_file = db.session.query(UploadFile).where(UploadFile.id == related_id).first()
        if not upload_file:
            raise ValueError("Invalid upload file Info")
        file_info = File(
            id=upload_file.id,
            filename=upload_file.name,
            extension="." + upload_file.extension,
            mime_type=upload_file.mime_type,
            tenant_id=self.tenant_id,
            type=FileType.CUSTOM,
            transfer_method=FileTransferMethod.LOCAL_FILE,
            remote_url=upload_file.source_url,
            related_id=upload_file.id,
            size=upload_file.size,
            storage_key=upload_file.key,
            url=upload_file.source_url,
        )
        file_info = self.datasource_manager.get_upload_file_by_id(
            file_id=related_id, tenant_id=self.tenant_id
        )
variable_pool.add([self._node_id, "file"], file_info)
# variable_pool.add([self.node_id, "file"], file_info.to_dict())
@@ -201,55 +174,6 @@ class DatasourceNode(Node[DatasourceNodeData]):
)
)
def _generate_parameters(
self,
*,
datasource_parameters: Sequence[DatasourceParameter],
variable_pool: VariablePool,
node_data: DatasourceNodeData,
for_log: bool = False,
) -> dict[str, Any]:
"""
        Generate parameters based on the given datasource parameters, variable pool, and node data.
        Args:
            datasource_parameters (Sequence[DatasourceParameter]): The list of datasource parameters.
            variable_pool (VariablePool): The variable pool containing the variables.
            node_data (DatasourceNodeData): The data associated with the datasource node.
        Returns:
            dict[str, Any]: A dictionary containing the generated parameters.
"""
datasource_parameters_dictionary = {parameter.name: parameter for parameter in datasource_parameters}
result: dict[str, Any] = {}
if node_data.datasource_parameters:
for parameter_name in node_data.datasource_parameters:
parameter = datasource_parameters_dictionary.get(parameter_name)
if not parameter:
result[parameter_name] = None
continue
datasource_input = node_data.datasource_parameters[parameter_name]
if datasource_input.type == "variable":
variable = variable_pool.get(datasource_input.value)
if variable is None:
raise DatasourceParameterError(f"Variable {datasource_input.value} does not exist")
parameter_value = variable.value
elif datasource_input.type in {"mixed", "constant"}:
segment_group = variable_pool.convert_template(str(datasource_input.value))
parameter_value = segment_group.log if for_log else segment_group.text
else:
raise DatasourceParameterError(f"Unknown datasource input type '{datasource_input.type}'")
result[parameter_name] = parameter_value
return result
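The resolution logic above distinguishes three input kinds: a `variable` input is looked up directly in the variable pool, while `mixed` and `constant` inputs are rendered as templates. A minimal standalone sketch of that dispatch (the `resolve_parameter` name and the use of `str.format` in place of the pool's template converter are illustrative, not part of the codebase):

```python
from typing import Any


def resolve_parameter(
    input_type: str,
    value: Any,
    variables: dict[str, Any],
) -> Any:
    # "variable": the value is a selector resolved against the pool
    if input_type == "variable":
        if value not in variables:
            raise ValueError(f"Variable {value} does not exist")
        return variables[value]
    # "mixed"/"constant": the value is a template rendered with pool contents
    if input_type in {"mixed", "constant"}:
        return str(value).format(**variables)
    raise ValueError(f"Unknown datasource input type '{input_type}'")
```

Unknown input types fail loudly rather than defaulting to `None`, mirroring the `DatasourceParameterError` raised in the node code.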
def _fetch_files(self, variable_pool: VariablePool) -> list[File]:
variable = variable_pool.get(["sys", SystemVariableKey.FILES])
assert isinstance(variable, ArrayAnyVariable | ArrayAnySegment)
return list(variable.value) if variable else []
@classmethod
def _extract_variable_selector_to_variable_mapping(
cls,
@@ -287,206 +211,6 @@ class DatasourceNode(Node[DatasourceNodeData]):
return result
def _transform_message(
self,
messages: Generator[DatasourceMessage, None, None],
parameters_for_log: dict[str, Any],
datasource_info: dict[str, Any],
) -> Generator:
"""
Convert DatasourceMessages into stream events (text chunks, files, JSON, and variables)
"""
# transform message and handle file storage
message_stream = DatasourceFileMessageTransformer.transform_datasource_invoke_messages(
messages=messages,
user_id=self.user_id,
tenant_id=self.tenant_id,
conversation_id=None,
)
text = ""
files: list[File] = []
json: list[dict | list] = []
variables: dict[str, Any] = {}
for message in message_stream:
match message.type:
case (
DatasourceMessage.MessageType.IMAGE_LINK
| DatasourceMessage.MessageType.BINARY_LINK
| DatasourceMessage.MessageType.IMAGE
):
assert isinstance(message.message, DatasourceMessage.TextMessage)
url = message.message.text
transfer_method = FileTransferMethod.TOOL_FILE
datasource_file_id = str(url).split("/")[-1].split(".")[0]
with Session(db.engine) as session:
stmt = select(ToolFile).where(ToolFile.id == datasource_file_id)
datasource_file = session.scalar(stmt)
if datasource_file is None:
raise ToolFileError(f"Tool file {datasource_file_id} does not exist")
mapping = {
"tool_file_id": datasource_file_id,
"type": file_factory.get_file_type_by_mime_type(datasource_file.mimetype),
"transfer_method": transfer_method,
"url": url,
}
file = file_factory.build_from_mapping(
mapping=mapping,
tenant_id=self.tenant_id,
)
files.append(file)
case DatasourceMessage.MessageType.BLOB:
# get tool file id
assert isinstance(message.message, DatasourceMessage.TextMessage)
assert message.meta
datasource_file_id = message.message.text.split("/")[-1].split(".")[0]
with Session(db.engine) as session:
stmt = select(ToolFile).where(ToolFile.id == datasource_file_id)
datasource_file = session.scalar(stmt)
if datasource_file is None:
raise ToolFileError(f"Datasource file {datasource_file_id} does not exist")
mapping = {
"tool_file_id": datasource_file_id,
"transfer_method": FileTransferMethod.TOOL_FILE,
}
files.append(
file_factory.build_from_mapping(
mapping=mapping,
tenant_id=self.tenant_id,
)
)
case DatasourceMessage.MessageType.TEXT:
assert isinstance(message.message, DatasourceMessage.TextMessage)
text += message.message.text
yield StreamChunkEvent(
selector=[self._node_id, "text"],
chunk=message.message.text,
is_final=False,
)
case DatasourceMessage.MessageType.JSON:
assert isinstance(message.message, DatasourceMessage.JsonMessage)
json.append(message.message.json_object)
case DatasourceMessage.MessageType.LINK:
assert isinstance(message.message, DatasourceMessage.TextMessage)
stream_text = f"Link: {message.message.text}\n"
text += stream_text
yield StreamChunkEvent(
selector=[self._node_id, "text"],
chunk=stream_text,
is_final=False,
)
case DatasourceMessage.MessageType.VARIABLE:
assert isinstance(message.message, DatasourceMessage.VariableMessage)
variable_name = message.message.variable_name
variable_value = message.message.variable_value
if message.message.stream:
if not isinstance(variable_value, str):
raise ValueError("When 'stream' is True, 'variable_value' must be a string.")
if variable_name not in variables:
variables[variable_name] = ""
variables[variable_name] += variable_value
yield StreamChunkEvent(
selector=[self._node_id, variable_name],
chunk=variable_value,
is_final=False,
)
else:
variables[variable_name] = variable_value
case DatasourceMessage.MessageType.FILE:
assert message.meta is not None
files.append(message.meta["file"])
case (
DatasourceMessage.MessageType.BLOB_CHUNK
| DatasourceMessage.MessageType.LOG
| DatasourceMessage.MessageType.RETRIEVER_RESOURCES
):
pass
# mark the end of the stream
yield StreamChunkEvent(
selector=[self._node_id, "text"],
chunk="",
is_final=True,
)
yield StreamCompletedEvent(
node_run_result=NodeRunResult(
status=WorkflowNodeExecutionStatus.SUCCEEDED,
outputs={**variables},
metadata={
WorkflowNodeExecutionMetadataKey.DATASOURCE_INFO: datasource_info,
},
inputs=parameters_for_log,
)
)
@classmethod
def version(cls) -> str:
return "1"
def _transform_datasource_file_message(
self,
messages: Generator[DatasourceMessage, None, None],
parameters_for_log: dict[str, Any],
datasource_info: dict[str, Any],
variable_pool: VariablePool,
datasource_type: DatasourceProviderType,
) -> Generator:
"""
Convert DatasourceMessages into a downloaded file stored in the variable pool
"""
# transform message and handle file storage
message_stream = DatasourceFileMessageTransformer.transform_datasource_invoke_messages(
messages=messages,
user_id=self.user_id,
tenant_id=self.tenant_id,
conversation_id=None,
)
file = None
for message in message_stream:
if message.type == DatasourceMessage.MessageType.BINARY_LINK:
assert isinstance(message.message, DatasourceMessage.TextMessage)
url = message.message.text
transfer_method = FileTransferMethod.TOOL_FILE
datasource_file_id = str(url).split("/")[-1].split(".")[0]
with Session(db.engine) as session:
stmt = select(ToolFile).where(ToolFile.id == datasource_file_id)
datasource_file = session.scalar(stmt)
if datasource_file is None:
raise ToolFileError(f"Tool file {datasource_file_id} does not exist")
mapping = {
"tool_file_id": datasource_file_id,
"type": file_factory.get_file_type_by_mime_type(datasource_file.mimetype),
"transfer_method": transfer_method,
"url": url,
}
file = file_factory.build_from_mapping(
mapping=mapping,
tenant_id=self.tenant_id,
)
if file:
variable_pool.add([self._node_id, "file"], file)
yield StreamCompletedEvent(
node_run_result=NodeRunResult(
status=WorkflowNodeExecutionStatus.SUCCEEDED,
inputs=parameters_for_log,
metadata={WorkflowNodeExecutionMetadataKey.DATASOURCE_INFO: datasource_info},
outputs={
"file": file,
"datasource_type": datasource_type,
},
)
)

View File

@@ -21,12 +21,12 @@ from docx.table import Table
from docx.text.paragraph import Paragraph
from core.helper import ssrf_proxy
from core.variables import ArrayFileSegment
from core.variables.segments import ArrayStringSegment, FileSegment
from core.workflow.enums import NodeType, WorkflowNodeExecutionStatus
from core.workflow.file import File, FileTransferMethod, file_manager
from core.workflow.node_events import NodeRunResult
from core.workflow.nodes.base.node import Node
from core.workflow.variables import ArrayFileSegment
from core.workflow.variables.segments import ArrayStringSegment, FileSegment
from .entities import DocumentExtractorNodeData, UnstructuredApiConfig
from .exc import DocumentExtractorError, FileDownloadError, TextExtractionError, UnsupportedFileTypeError

View File

@@ -10,11 +10,9 @@ from urllib.parse import urlencode, urlparse
import httpx
from json_repair import repair_json
from core.helper.ssrf_proxy import ssrf_proxy
from core.variables.segments import ArrayFileSegment, FileSegment
from core.workflow.file.enums import FileTransferMethod
from core.workflow.file.file_manager import file_manager as default_file_manager
from core.workflow.runtime import VariablePool
from core.workflow.variables.segments import ArrayFileSegment, FileSegment
from ..protocols import FileManagerProtocol, HttpClientProtocol
from .entities import (
@@ -81,8 +79,8 @@ class Executor:
http_request_config: HttpRequestNodeConfig,
max_retries: int | None = None,
ssl_verify: bool | None = None,
http_client: HttpClientProtocol | None = None,
file_manager: FileManagerProtocol | None = None,
http_client: HttpClientProtocol,
file_manager: FileManagerProtocol,
):
self._http_request_config = http_request_config
# If authorization API key is present, convert the API key using the variable pool
@@ -116,8 +114,8 @@
self.max_retries = (
max_retries if max_retries is not None else self._http_request_config.ssrf_default_max_retries
)
self._http_client = http_client or ssrf_proxy
self._file_manager = file_manager or default_file_manager
self._http_client = http_client
self._file_manager = file_manager
# init template
self.variable_pool = variable_pool
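The hunks above make `http_client` and `file_manager` required constructor arguments instead of optional parameters that fall back to module-level defaults (`ssrf_proxy`, `default_file_manager`). A sketch of the resulting dependency-injection pattern, with a simplified protocol shape assumed for illustration:

```python
from typing import Protocol


class HttpClientProtocol(Protocol):
    def get(self, url: str) -> str: ...


class FakeHttpClient:
    """A stub that satisfies the protocol structurally; handy in tests."""

    def get(self, url: str) -> str:
        return f"stubbed response for {url}"


class Executor:
    # Dependencies are required keyword arguments: callers must inject them
    # explicitly, so the wiring is visible and there is no hidden global fallback.
    def __init__(self, *, http_client: HttpClientProtocol):
        self._http_client = http_client

    def fetch(self, url: str) -> str:
        return self._http_client.get(url)


executor = Executor(http_client=FakeHttpClient())
```

Removing the `or ssrf_proxy` style fallbacks trades a little construction convenience for explicitness: a missing dependency now fails at construction time rather than silently using a global.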

View File

@@ -3,18 +3,15 @@ import mimetypes
from collections.abc import Callable, Mapping, Sequence
from typing import TYPE_CHECKING, Any
from core.helper.ssrf_proxy import ssrf_proxy
from core.tools.tool_file_manager import ToolFileManager
from core.variables.segments import ArrayFileSegment
from core.workflow.enums import NodeType, WorkflowNodeExecutionStatus
from core.workflow.file import File, FileTransferMethod
from core.workflow.file.file_manager import file_manager as default_file_manager
from core.workflow.node_events import NodeRunResult
from core.workflow.nodes.base import variable_template_parser
from core.workflow.nodes.base.entities import VariableSelector
from core.workflow.nodes.base.node import Node
from core.workflow.nodes.http_request.executor import Executor
from core.workflow.nodes.protocols import FileManagerProtocol, HttpClientProtocol
from core.workflow.nodes.protocols import FileManagerProtocol, HttpClientProtocol, ToolFileManagerProtocol
from core.workflow.variables.segments import ArrayFileSegment
from factories import file_factory
from .config import build_http_request_config, resolve_http_request_config
@@ -45,9 +42,9 @@ class HttpRequestNode(Node[HttpRequestNodeData]):
graph_runtime_state: "GraphRuntimeState",
*,
http_request_config: HttpRequestNodeConfig,
http_client: HttpClientProtocol | None = None,
tool_file_manager_factory: Callable[[], ToolFileManager] = ToolFileManager,
file_manager: FileManagerProtocol | None = None,
http_client: HttpClientProtocol,
tool_file_manager_factory: Callable[[], ToolFileManagerProtocol],
file_manager: FileManagerProtocol,
) -> None:
super().__init__(
id=id,
@@ -55,10 +52,11 @@ class HttpRequestNode(Node[HttpRequestNodeData]):
graph_init_params=graph_init_params,
graph_runtime_state=graph_runtime_state,
)
self._http_request_config = http_request_config
self._http_client = http_client or ssrf_proxy
self._http_client = http_client
self._tool_file_manager_factory = tool_file_manager_factory
self._file_manager = file_manager or default_file_manager
self._file_manager = file_manager
@classmethod
def get_default_config(cls, filters: Mapping[str, object] | None = None) -> Mapping[str, object]:

View File

@@ -10,10 +10,10 @@ from typing import Annotated, Any, ClassVar, Literal, Self
from pydantic import BaseModel, Field, field_validator, model_validator
from core.variables.consts import SELECTORS_LENGTH
from core.workflow.nodes.base import BaseNodeData
from core.workflow.nodes.base.variable_template_parser import VariableTemplateParser
from core.workflow.runtime import VariablePool
from core.workflow.variables.consts import SELECTORS_LENGTH
from .enums import ButtonStyle, DeliveryMethodType, EmailRecipientType, FormInputType, PlaceholderType, TimeoutUnit

View File

@@ -7,9 +7,6 @@ from typing import TYPE_CHECKING, Any, NewType, cast
from typing_extensions import TypeIs
from core.model_runtime.entities.llm_entities import LLMUsage
from core.variables import IntegerVariable, NoneSegment
from core.variables.segments import ArrayAnySegment, ArraySegment
from core.variables.variables import Variable
from core.workflow.constants import CONVERSATION_VARIABLE_NODE_ID
from core.workflow.enums import (
NodeExecutionType,
@@ -36,6 +33,9 @@ from core.workflow.nodes.base import LLMUsageTrackingMixin
from core.workflow.nodes.base.node import Node
from core.workflow.nodes.iteration.entities import ErrorHandleMode, IterationNodeData
from core.workflow.runtime import VariablePool
from core.workflow.variables import IntegerVariable, NoneSegment
from core.workflow.variables.segments import ArrayAnySegment, ArraySegment
from core.workflow.variables.variables import Variable
from libs.datetime_utils import naive_utc_now
from .exc import (

View File

@@ -5,12 +5,6 @@ from typing import TYPE_CHECKING, Any, Literal
from core.app.app_config.entities import DatasetRetrieveConfigEntity
from core.model_runtime.entities.llm_entities import LLMUsage
from core.model_runtime.utils.encoders import jsonable_encoder
from core.variables import (
ArrayFileSegment,
FileSegment,
StringSegment,
)
from core.variables.segments import ArrayObjectSegment
from core.workflow.entities import GraphInitParams
from core.workflow.enums import (
NodeType,
@@ -22,6 +16,12 @@ from core.workflow.nodes.base import LLMUsageTrackingMixin
from core.workflow.nodes.base.node import Node
from core.workflow.nodes.llm.file_saver import FileSaverImpl, LLMFileSaver
from core.workflow.repositories.rag_retrieval_protocol import KnowledgeRetrievalRequest, RAGRetrievalProtocol, Source
from core.workflow.variables import (
ArrayFileSegment,
FileSegment,
StringSegment,
)
from core.workflow.variables.segments import ArrayObjectSegment
from .entities import KnowledgeRetrievalNodeData
from .exc import (

View File

@@ -1,12 +1,12 @@
from collections.abc import Callable, Sequence
from typing import Any, TypeAlias, TypeVar
from core.variables import ArrayFileSegment, ArrayNumberSegment, ArrayStringSegment
from core.variables.segments import ArrayAnySegment, ArrayBooleanSegment, ArraySegment
from core.workflow.enums import NodeType, WorkflowNodeExecutionStatus
from core.workflow.file import File
from core.workflow.node_events import NodeRunResult
from core.workflow.nodes.base.node import Node
from core.workflow.variables import ArrayFileSegment, ArrayNumberSegment, ArrayStringSegment
from core.workflow.variables.segments import ArrayAnySegment, ArrayBooleanSegment, ArraySegment
from .entities import FilterOperator, ListOperatorNodeData, Order
from .exc import InvalidConditionError, InvalidFilterValueError, InvalidKeyError, ListOperatorError

View File

@@ -5,21 +5,17 @@ from sqlalchemy import select, update
from sqlalchemy.orm import Session
from configs import dify_config
from core.app.entities.app_invoke_entities import ModelConfigWithCredentialsEntity
from core.entities.provider_entities import ProviderQuotaType, QuotaUnit
from core.memory.token_buffer_memory import TokenBufferMemory
from core.model_manager import ModelInstance
from core.model_runtime.entities.llm_entities import LLMUsage
from core.model_runtime.entities.model_entities import ModelType
from core.model_runtime.entities.model_entities import AIModelEntity
from core.model_runtime.model_providers.__base.large_language_model import LargeLanguageModel
from core.prompt.entities.advanced_prompt_entities import MemoryConfig
from core.variables.segments import ArrayAnySegment, ArrayFileSegment, FileSegment, NoneSegment, StringSegment
from core.workflow.enums import SystemVariableKey
from core.workflow.file.models import File
from core.workflow.nodes.llm.entities import ModelConfig
from core.workflow.nodes.llm.exc import LLMModeRequiredError, ModelNotExistError
from core.workflow.nodes.llm.protocols import CredentialsProvider, ModelFactory
from core.workflow.runtime import VariablePool
from core.workflow.variables.segments import ArrayAnySegment, ArrayFileSegment, FileSegment, NoneSegment, StringSegment
from extensions.ext_database import db
from libs.datetime_utils import naive_utc_now
from models.model import Conversation
@@ -29,46 +25,14 @@ from models.provider_ids import ModelProviderID
from .exc import InvalidVariableTypeError
def fetch_model_config(
    *,
    node_data_model: ModelConfig,
    credentials_provider: CredentialsProvider,
    model_factory: ModelFactory,
) -> tuple[ModelInstance, ModelConfigWithCredentialsEntity]:
    if not node_data_model.mode:
        raise LLMModeRequiredError("LLM mode is required.")
    credentials = credentials_provider.fetch(node_data_model.provider, node_data_model.name)
    model_instance = model_factory.init_model_instance(node_data_model.provider, node_data_model.name)
    provider_model_bundle = model_instance.provider_model_bundle
    provider_model = provider_model_bundle.configuration.get_provider_model(
        model=node_data_model.name,
        model_type=ModelType.LLM,
    )
    if provider_model is None:
        raise ModelNotExistError(f"Model {node_data_model.name} not exist.")
    provider_model.raise_for_status()
    stop: list[str] = []
    if "stop" in node_data_model.completion_params:
        stop = node_data_model.completion_params.pop("stop")
    model_schema = model_instance.model_type_instance.get_model_schema(node_data_model.name, credentials)
    if not model_schema:
        raise ModelNotExistError(f"Model {node_data_model.name} not exist.")
    model_instance.model_type_instance = cast(LargeLanguageModel, model_instance.model_type_instance)
    return model_instance, ModelConfigWithCredentialsEntity(
        provider=node_data_model.provider,
        model=node_data_model.name,
        model_schema=model_schema,
        mode=node_data_model.mode,
        provider_model_bundle=provider_model_bundle,
        credentials=credentials,
        parameters=node_data_model.completion_params,
        stop=stop,
    )
def fetch_model_schema(*, model_instance: ModelInstance) -> AIModelEntity:
    model_schema = cast(LargeLanguageModel, model_instance.model_type_instance).get_model_schema(
        model_instance.model_name,
        model_instance.credentials,
    )
    if not model_schema:
        raise ValueError(f"Model schema not found for {model_instance.model_name}")
    return model_schema
def fetch_files(variable_pool: VariablePool, selector: Sequence[str]) -> Sequence["File"]:

View File

@@ -11,11 +11,9 @@ from typing import TYPE_CHECKING, Any, Literal
from sqlalchemy import select
from core.app.entities.app_invoke_entities import ModelConfigWithCredentialsEntity
from core.helper.code_executor import CodeExecutor, CodeLanguage
from core.llm_generator.output_parser.errors import OutputParserError
from core.llm_generator.output_parser.structured_output import invoke_llm_with_structured_output
from core.memory.token_buffer_memory import TokenBufferMemory
from core.model_manager import ModelInstance
from core.model_runtime.entities import (
ImagePromptMessageContent,
@@ -38,20 +36,12 @@ from core.model_runtime.entities.message_entities import (
SystemPromptMessage,
UserPromptMessage,
)
from core.model_runtime.entities.model_entities import AIModelEntity, ModelFeature, ModelPropertyKey
from core.model_runtime.entities.model_entities import ModelFeature, ModelPropertyKey
from core.model_runtime.utils.encoders import jsonable_encoder
from core.prompt.entities.advanced_prompt_entities import CompletionModelPromptTemplate, MemoryConfig
from core.prompt.utils.prompt_message_util import PromptMessageUtil
from core.rag.entities.citation_metadata import RetrievalSourceMetadata
from core.tools.signature import sign_upload_file
from core.variables import (
ArrayFileSegment,
ArraySegment,
FileSegment,
NoneSegment,
ObjectSegment,
StringSegment,
)
from core.workflow.constants import SYSTEM_VARIABLE_NODE_ID
from core.workflow.entities import GraphInitParams
from core.workflow.enums import (
@@ -72,8 +62,16 @@ from core.workflow.node_events import (
from core.workflow.nodes.base.entities import VariableSelector
from core.workflow.nodes.base.node import Node
from core.workflow.nodes.base.variable_template_parser import VariableTemplateParser
from core.workflow.nodes.llm.protocols import CredentialsProvider, ModelFactory
from core.workflow.nodes.llm.protocols import CredentialsProvider, ModelFactory, PromptMessageMemory
from core.workflow.runtime import VariablePool
from core.workflow.variables import (
ArrayFileSegment,
ArraySegment,
FileSegment,
NoneSegment,
ObjectSegment,
StringSegment,
)
from extensions.ext_database import db
from models.dataset import SegmentAttachmentBinding
from models.model import UploadFile
@@ -83,7 +81,6 @@ from .entities import (
LLMNodeChatModelMessage,
LLMNodeCompletionModelPromptTemplate,
LLMNodeData,
ModelConfig,
)
from .exc import (
InvalidContextStructureError,
@@ -116,6 +113,8 @@ class LLMNode(Node[LLMNodeData]):
_llm_file_saver: LLMFileSaver
_credentials_provider: CredentialsProvider
_model_factory: ModelFactory
_model_instance: ModelInstance
_memory: PromptMessageMemory | None
def __init__(
self,
@@ -126,6 +125,8 @@ class LLMNode(Node[LLMNodeData]):
*,
credentials_provider: CredentialsProvider,
model_factory: ModelFactory,
model_instance: ModelInstance,
memory: PromptMessageMemory | None = None,
llm_file_saver: LLMFileSaver | None = None,
):
super().__init__(
@@ -139,6 +140,8 @@ class LLMNode(Node[LLMNodeData]):
self._credentials_provider = credentials_provider
self._model_factory = model_factory
self._model_instance = model_instance
self._memory = memory
if llm_file_saver is None:
llm_file_saver = FileSaverImpl(
@@ -202,29 +205,12 @@ class LLMNode(Node[LLMNodeData]):
node_inputs["#context_files#"] = [file.model_dump() for file in context_files]
# fetch model config
model_instance, model_config = self._fetch_model_config(
node_data_model=self.node_data.model,
)
model_name = getattr(model_instance, "model_name", None)
if not isinstance(model_name, str):
model_name = model_config.model
model_provider = getattr(model_instance, "provider", None)
if not isinstance(model_provider, str):
model_provider = model_config.provider
model_schema = model_instance.model_type_instance.get_model_schema(
model_name,
model_instance.credentials,
)
if not model_schema:
raise ValueError(f"Model schema not found for {model_name}")
model_instance = self._model_instance
model_name = model_instance.model_name
model_provider = model_instance.provider
model_stop = model_instance.stop
# fetch memory
memory = llm_utils.fetch_memory(
variable_pool=variable_pool,
app_id=self.app_id,
node_data_memory=self.node_data.memory,
model_instance=model_instance,
)
memory = self._memory
query: str | None = None
if self.node_data.memory:
@@ -240,9 +226,7 @@ class LLMNode(Node[LLMNodeData]):
context=context,
memory=memory,
model_instance=model_instance,
model_schema=model_schema,
model_parameters=self.node_data.model.completion_params,
stop=model_config.stop,
stop=model_stop,
prompt_template=self.node_data.prompt_template,
memory_config=self.node_data.memory,
vision_enabled=self.node_data.vision.enabled,
@@ -254,7 +238,6 @@ class LLMNode(Node[LLMNodeData]):
# handle invoke result
generator = LLMNode.invoke_llm(
node_data_model=self.node_data.model,
model_instance=model_instance,
prompt_messages=prompt_messages,
stop=stop,
@@ -371,7 +354,6 @@ class LLMNode(Node[LLMNodeData]):
@staticmethod
def invoke_llm(
*,
node_data_model: ModelConfig,
model_instance: ModelInstance,
prompt_messages: Sequence[PromptMessage],
stop: Sequence[str] | None = None,
@@ -384,11 +366,10 @@ class LLMNode(Node[LLMNodeData]):
node_type: NodeType,
reasoning_format: Literal["separated", "tagged"] = "tagged",
) -> Generator[NodeEventBase | LLMStructuredOutput, None, None]:
model_schema = model_instance.model_type_instance.get_model_schema(
node_data_model.name, model_instance.credentials
)
if not model_schema:
raise ValueError(f"Model schema not found for {node_data_model.name}")
model_parameters = model_instance.parameters
invoke_model_parameters = dict(model_parameters)
model_schema = llm_utils.fetch_model_schema(model_instance=model_instance)
if structured_output_enabled:
output_schema = LLMNode.fetch_structured_output_schema(
@@ -402,7 +383,7 @@ class LLMNode(Node[LLMNodeData]):
model_instance=model_instance,
prompt_messages=prompt_messages,
json_schema=output_schema,
model_parameters=node_data_model.completion_params,
model_parameters=invoke_model_parameters,
stop=list(stop or []),
stream=True,
user=user_id,
@@ -412,7 +393,7 @@ class LLMNode(Node[LLMNodeData]):
invoke_result = model_instance.invoke_llm(
prompt_messages=list(prompt_messages),
model_parameters=node_data_model.completion_params,
model_parameters=invoke_model_parameters,
stop=list(stop or []),
stream=True,
user=user_id,
@@ -771,33 +752,14 @@ class LLMNode(Node[LLMNodeData]):
return None
def _fetch_model_config(
self,
*,
node_data_model: ModelConfig,
) -> tuple[ModelInstance, ModelConfigWithCredentialsEntity]:
model, model_config_with_cred = llm_utils.fetch_model_config(
node_data_model=node_data_model,
credentials_provider=self._credentials_provider,
model_factory=self._model_factory,
)
completion_params = model_config_with_cred.parameters
model_config_with_cred.parameters = completion_params
# NOTE(-LAN-): This line modify the `self.node_data.model`, which is used in `_invoke_llm()`.
node_data_model.completion_params = completion_params
return model, model_config_with_cred
@staticmethod
def fetch_prompt_messages(
*,
sys_query: str | None = None,
sys_files: Sequence[File],
context: str | None = None,
memory: TokenBufferMemory | None = None,
memory: PromptMessageMemory | None = None,
model_instance: ModelInstance,
model_schema: AIModelEntity,
model_parameters: Mapping[str, Any],
prompt_template: Sequence[LLMNodeChatModelMessage] | LLMNodeCompletionModelPromptTemplate,
stop: Sequence[str] | None = None,
memory_config: MemoryConfig | None = None,
@@ -808,6 +770,7 @@ class LLMNode(Node[LLMNodeData]):
context_files: list[File] | None = None,
) -> tuple[Sequence[PromptMessage], Sequence[str] | None]:
prompt_messages: list[PromptMessage] = []
model_schema = llm_utils.fetch_model_schema(model_instance=model_instance)
if isinstance(prompt_template, list):
# For chat model
@@ -826,8 +789,6 @@ class LLMNode(Node[LLMNodeData]):
memory=memory,
memory_config=memory_config,
model_instance=model_instance,
model_schema=model_schema,
model_parameters=model_parameters,
)
# Extend prompt_messages with memory messages
prompt_messages.extend(memory_messages)
@@ -865,8 +826,6 @@ class LLMNode(Node[LLMNodeData]):
memory=memory,
memory_config=memory_config,
model_instance=model_instance,
model_schema=model_schema,
model_parameters=model_parameters,
)
# Insert histories into the prompt
prompt_content = prompt_messages[0].content
@@ -1316,23 +1275,23 @@ def _calculate_rest_token(
*,
prompt_messages: list[PromptMessage],
model_instance: ModelInstance,
model_schema: AIModelEntity,
model_parameters: Mapping[str, Any],
) -> int:
rest_tokens = 2000
runtime_model_schema = llm_utils.fetch_model_schema(model_instance=model_instance)
runtime_model_parameters = model_instance.parameters
model_context_tokens = model_schema.model_properties.get(ModelPropertyKey.CONTEXT_SIZE)
model_context_tokens = runtime_model_schema.model_properties.get(ModelPropertyKey.CONTEXT_SIZE)
if model_context_tokens:
curr_message_tokens = model_instance.get_llm_num_tokens(prompt_messages)
max_tokens = 0
for parameter_rule in model_schema.parameter_rules:
for parameter_rule in runtime_model_schema.parameter_rules:
if parameter_rule.name == "max_tokens" or (
parameter_rule.use_template and parameter_rule.use_template == "max_tokens"
):
max_tokens = (
model_parameters.get(parameter_rule.name)
or model_parameters.get(str(parameter_rule.use_template))
runtime_model_parameters.get(parameter_rule.name)
or runtime_model_parameters.get(str(parameter_rule.use_template))
or 0
)
@@ -1344,11 +1303,9 @@ def _calculate_rest_token(
def _handle_memory_chat_mode(
*,
memory: TokenBufferMemory | None,
memory: PromptMessageMemory | None,
memory_config: MemoryConfig | None,
model_instance: ModelInstance,
model_schema: AIModelEntity,
model_parameters: Mapping[str, Any],
) -> Sequence[PromptMessage]:
memory_messages: Sequence[PromptMessage] = []
# Get messages from memory for chat model
@@ -1356,8 +1313,6 @@ def _handle_memory_chat_mode(
rest_tokens = _calculate_rest_token(
prompt_messages=[],
model_instance=model_instance,
model_schema=model_schema,
model_parameters=model_parameters,
)
memory_messages = memory.get_history_prompt_messages(
max_token_limit=rest_tokens,
@@ -1368,11 +1323,9 @@ def _handle_memory_chat_mode(
def _handle_memory_completion_mode(
*,
memory: TokenBufferMemory | None,
memory: PromptMessageMemory | None,
memory_config: MemoryConfig | None,
model_instance: ModelInstance,
model_schema: AIModelEntity,
model_parameters: Mapping[str, Any],
) -> str:
memory_text = ""
# Get history text from memory for completion model
@@ -1380,20 +1333,51 @@ def _handle_memory_completion_mode(
rest_tokens = _calculate_rest_token(
prompt_messages=[],
model_instance=model_instance,
model_schema=model_schema,
model_parameters=model_parameters,
)
if not memory_config.role_prefix:
raise MemoryRolePrefixRequiredError("Memory role prefix is required for completion model.")
memory_text = memory.get_history_prompt_text(
memory_messages = memory.get_history_prompt_messages(
max_token_limit=rest_tokens,
message_limit=memory_config.window.size if memory_config.window.enabled else None,
)
memory_text = _convert_history_messages_to_text(
history_messages=memory_messages,
human_prefix=memory_config.role_prefix.user,
ai_prefix=memory_config.role_prefix.assistant,
)
return memory_text
def _convert_history_messages_to_text(
*,
history_messages: Sequence[PromptMessage],
human_prefix: str,
ai_prefix: str,
) -> str:
string_messages: list[str] = []
for message in history_messages:
if message.role == PromptMessageRole.USER:
role = human_prefix
elif message.role == PromptMessageRole.ASSISTANT:
role = ai_prefix
else:
continue
if isinstance(message.content, list):
content_parts = []
for content in message.content:
if isinstance(content, TextPromptMessageContent):
content_parts.append(content.data)
elif isinstance(content, ImagePromptMessageContent):
content_parts.append("[image]")
inner_msg = "\n".join(content_parts)
string_messages.append(f"{role}: {inner_msg}")
else:
string_messages.append(f"{role}: {message.content}")
return "\n".join(string_messages)
def _handle_completion_template(
*,
template: LLMNodeCompletionModelPromptTemplate,


@@ -1,8 +1,10 @@
from __future__ import annotations
from collections.abc import Sequence
from typing import Any, Protocol
from core.model_manager import ModelInstance
from core.model_runtime.entities import PromptMessage
class CredentialsProvider(Protocol):
@@ -19,3 +21,13 @@ class ModelFactory(Protocol):
def init_model_instance(self, provider_name: str, model_name: str) -> ModelInstance:
"""Create a model instance that is ready for schema lookup and invocation."""
...
class PromptMessageMemory(Protocol):
"""Port for loading memory as prompt messages for LLM nodes."""
def get_history_prompt_messages(
self, max_token_limit: int = 2000, message_limit: int | None = None
) -> Sequence[PromptMessage]:
"""Return historical prompt messages constrained by token/message limits."""
...


@@ -3,9 +3,9 @@ from typing import Annotated, Any, Literal
from pydantic import AfterValidator, BaseModel, Field, field_validator
from core.variables.types import SegmentType
from core.workflow.nodes.base import BaseLoopNodeData, BaseLoopState, BaseNodeData
from core.workflow.utils.condition.entities import Condition
from core.workflow.variables.types import SegmentType
_VALID_VAR_TYPE = frozenset(
[


@@ -6,7 +6,6 @@ from datetime import datetime
from typing import TYPE_CHECKING, Any, Literal, cast
from core.model_runtime.entities.llm_entities import LLMUsage
from core.variables import Segment, SegmentType
from core.workflow.enums import (
NodeExecutionType,
NodeType,
@@ -31,6 +30,7 @@ from core.workflow.nodes.base import LLMUsageTrackingMixin
from core.workflow.nodes.base.node import Node
from core.workflow.nodes.loop.entities import LoopCompletedReason, LoopNodeData, LoopVariableData
from core.workflow.utils.condition.processor import ConditionProcessor
from core.workflow.variables import Segment, SegmentType
from factories.variable_factory import TypeMismatchError, build_segment_with_type, segment_to_variable
from libs.datetime_utils import naive_utc_now


@@ -8,9 +8,9 @@ from pydantic import (
)
from core.prompt.entities.advanced_prompt_entities import MemoryConfig
from core.variables.types import SegmentType
from core.workflow.nodes.base import BaseNodeData
from core.workflow.nodes.llm.entities import ModelConfig, VisionConfig
from core.workflow.variables.types import SegmentType
_OLD_BOOL_TYPE_NAME = "bool"
_OLD_SELECT_TYPE_NAME = "select"


@@ -1,6 +1,6 @@
from typing import Any
from core.variables.types import SegmentType
from core.workflow.variables.types import SegmentType
class ParameterExtractorNodeError(ValueError):


@@ -5,7 +5,6 @@ import uuid
from collections.abc import Mapping, Sequence
from typing import TYPE_CHECKING, Any, cast
from core.app.entities.app_invoke_entities import ModelConfigWithCredentialsEntity
from core.memory.token_buffer_memory import TokenBufferMemory
from core.model_manager import ModelInstance
from core.model_runtime.entities import ImagePromptMessageContent
@@ -25,14 +24,14 @@ from core.prompt.advanced_prompt_transform import AdvancedPromptTransform
from core.prompt.entities.advanced_prompt_entities import ChatModelMessage, CompletionModelPromptTemplate
from core.prompt.simple_prompt_transform import ModelMode
from core.prompt.utils.prompt_message_util import PromptMessageUtil
from core.variables.types import ArrayValidation, SegmentType
from core.workflow.enums import NodeType, WorkflowNodeExecutionMetadataKey, WorkflowNodeExecutionStatus
from core.workflow.file import File
from core.workflow.node_events import NodeRunResult
from core.workflow.nodes.base import variable_template_parser
from core.workflow.nodes.base.node import Node
from core.workflow.nodes.llm import ModelConfig, llm_utils
from core.workflow.nodes.llm import llm_utils
from core.workflow.runtime import VariablePool
from core.workflow.variables.types import ArrayValidation, SegmentType
from factories.variable_factory import build_segment_with_type
from .entities import ParameterExtractorNodeData
@@ -95,8 +94,7 @@ class ParameterExtractorNode(Node[ParameterExtractorNodeData]):
node_type = NodeType.PARAMETER_EXTRACTOR
_model_instance: ModelInstance | None = None
_model_config: ModelConfigWithCredentialsEntity | None = None
_model_instance: ModelInstance
_credentials_provider: "CredentialsProvider"
_model_factory: "ModelFactory"
@@ -109,6 +107,7 @@ class ParameterExtractorNode(Node[ParameterExtractorNodeData]):
*,
credentials_provider: "CredentialsProvider",
model_factory: "ModelFactory",
model_instance: ModelInstance,
) -> None:
super().__init__(
id=id,
@@ -118,6 +117,7 @@ class ParameterExtractorNode(Node[ParameterExtractorNodeData]):
)
self._credentials_provider = credentials_provider
self._model_factory = model_factory
self._model_instance = model_instance
@classmethod
def get_default_config(cls, filters: Mapping[str, object] | None = None) -> Mapping[str, object]:
@@ -155,18 +155,14 @@ class ParameterExtractorNode(Node[ParameterExtractorNodeData]):
else []
)
model_instance, model_config = self._fetch_model_config(node_data.model)
model_instance = self._model_instance
if not isinstance(model_instance.model_type_instance, LargeLanguageModel):
raise InvalidModelTypeError("Model is not a Large Language Model")
llm_model = model_instance.model_type_instance
model_schema = llm_model.get_model_schema(
model=model_config.model,
credentials=model_config.credentials,
)
if not model_schema:
raise ModelSchemaNotFoundError("Model schema not found")
try:
model_schema = llm_utils.fetch_model_schema(model_instance=model_instance)
except ValueError as exc:
raise ModelSchemaNotFoundError("Model schema not found") from exc
# fetch memory
memory = llm_utils.fetch_memory(
variable_pool=variable_pool,
@@ -184,7 +180,7 @@ class ParameterExtractorNode(Node[ParameterExtractorNodeData]):
node_data=node_data,
query=query,
variable_pool=self.graph_runtime_state.variable_pool,
model_config=model_config,
model_instance=model_instance,
memory=memory,
files=files,
vision_detail=node_data.vision.configs.detail,
@@ -195,7 +191,7 @@ class ParameterExtractorNode(Node[ParameterExtractorNodeData]):
data=node_data,
query=query,
variable_pool=self.graph_runtime_state.variable_pool,
model_config=model_config,
model_instance=model_instance,
memory=memory,
files=files,
vision_detail=node_data.vision.configs.detail,
@@ -211,24 +207,23 @@ class ParameterExtractorNode(Node[ParameterExtractorNodeData]):
}
process_data = {
"model_mode": model_config.mode,
"model_mode": node_data.model.mode,
"prompts": PromptMessageUtil.prompt_messages_to_prompt_for_saving(
model_mode=model_config.mode, prompt_messages=prompt_messages
model_mode=node_data.model.mode, prompt_messages=prompt_messages
),
"usage": None,
"function": {} if not prompt_message_tools else jsonable_encoder(prompt_message_tools[0]),
"tool_call": None,
"model_provider": model_config.provider,
"model_name": model_config.model,
"model_provider": model_instance.provider,
"model_name": model_instance.model_name,
}
try:
text, usage, tool_call = self._invoke(
node_data_model=node_data.model,
model_instance=model_instance,
prompt_messages=prompt_messages,
tools=prompt_message_tools,
stop=model_config.stop,
stop=model_instance.stop,
)
process_data["usage"] = jsonable_encoder(usage)
process_data["tool_call"] = jsonable_encoder(tool_call)
@@ -290,17 +285,16 @@ class ParameterExtractorNode(Node[ParameterExtractorNodeData]):
def _invoke(
self,
node_data_model: ModelConfig,
model_instance: ModelInstance,
prompt_messages: list[PromptMessage],
tools: list[PromptMessageTool],
stop: list[str],
stop: Sequence[str],
) -> tuple[str, LLMUsage, AssistantPromptMessage.ToolCall | None]:
invoke_result = model_instance.invoke_llm(
prompt_messages=prompt_messages,
model_parameters=node_data_model.completion_params,
model_parameters=dict(model_instance.parameters),
tools=tools,
stop=stop,
stop=list(stop),
stream=False,
user=self.user_id,
)
@@ -324,7 +318,7 @@ class ParameterExtractorNode(Node[ParameterExtractorNodeData]):
node_data: ParameterExtractorNodeData,
query: str,
variable_pool: VariablePool,
model_config: ModelConfigWithCredentialsEntity,
model_instance: ModelInstance,
memory: TokenBufferMemory | None,
files: Sequence[File],
vision_detail: ImagePromptMessageContent.DETAIL | None = None,
@@ -337,7 +331,13 @@ class ParameterExtractorNode(Node[ParameterExtractorNodeData]):
)
prompt_transform = AdvancedPromptTransform(with_variable_tmpl=True)
rest_token = self._calculate_rest_token(node_data, query, variable_pool, model_config, "")
rest_token = self._calculate_rest_token(
node_data=node_data,
query=query,
variable_pool=variable_pool,
model_instance=model_instance,
context="",
)
prompt_template = self._get_function_calling_prompt_template(
node_data, query, variable_pool, memory, rest_token
)
@@ -349,7 +349,7 @@ class ParameterExtractorNode(Node[ParameterExtractorNodeData]):
context="",
memory_config=node_data.memory,
memory=None,
model_config=model_config,
model_instance=model_instance,
image_detail_config=vision_detail,
)
@@ -406,7 +406,7 @@ class ParameterExtractorNode(Node[ParameterExtractorNodeData]):
data: ParameterExtractorNodeData,
query: str,
variable_pool: VariablePool,
model_config: ModelConfigWithCredentialsEntity,
model_instance: ModelInstance,
memory: TokenBufferMemory | None,
files: Sequence[File],
vision_detail: ImagePromptMessageContent.DETAIL | None = None,
@@ -421,7 +421,7 @@ class ParameterExtractorNode(Node[ParameterExtractorNodeData]):
node_data=data,
query=query,
variable_pool=variable_pool,
model_config=model_config,
model_instance=model_instance,
memory=memory,
files=files,
vision_detail=vision_detail,
@@ -431,7 +431,7 @@ class ParameterExtractorNode(Node[ParameterExtractorNodeData]):
node_data=data,
query=query,
variable_pool=variable_pool,
model_config=model_config,
model_instance=model_instance,
memory=memory,
files=files,
vision_detail=vision_detail,
@@ -444,7 +444,7 @@ class ParameterExtractorNode(Node[ParameterExtractorNodeData]):
node_data: ParameterExtractorNodeData,
query: str,
variable_pool: VariablePool,
model_config: ModelConfigWithCredentialsEntity,
model_instance: ModelInstance,
memory: TokenBufferMemory | None,
files: Sequence[File],
vision_detail: ImagePromptMessageContent.DETAIL | None = None,
@@ -454,7 +454,11 @@ class ParameterExtractorNode(Node[ParameterExtractorNodeData]):
"""
prompt_transform = AdvancedPromptTransform(with_variable_tmpl=True)
rest_token = self._calculate_rest_token(
node_data=node_data, query=query, variable_pool=variable_pool, model_config=model_config, context=""
node_data=node_data,
query=query,
variable_pool=variable_pool,
model_instance=model_instance,
context="",
)
prompt_template = self._get_prompt_engineering_prompt_template(
node_data=node_data, query=query, variable_pool=variable_pool, memory=memory, max_token_limit=rest_token
@@ -467,7 +471,7 @@ class ParameterExtractorNode(Node[ParameterExtractorNodeData]):
context="",
memory_config=node_data.memory,
memory=memory,
model_config=model_config,
model_instance=model_instance,
image_detail_config=vision_detail,
)
@@ -478,7 +482,7 @@ class ParameterExtractorNode(Node[ParameterExtractorNodeData]):
node_data: ParameterExtractorNodeData,
query: str,
variable_pool: VariablePool,
model_config: ModelConfigWithCredentialsEntity,
model_instance: ModelInstance,
memory: TokenBufferMemory | None,
files: Sequence[File],
vision_detail: ImagePromptMessageContent.DETAIL | None = None,
@@ -488,7 +492,11 @@ class ParameterExtractorNode(Node[ParameterExtractorNodeData]):
"""
prompt_transform = AdvancedPromptTransform(with_variable_tmpl=True)
rest_token = self._calculate_rest_token(
node_data=node_data, query=query, variable_pool=variable_pool, model_config=model_config, context=""
node_data=node_data,
query=query,
variable_pool=variable_pool,
model_instance=model_instance,
context="",
)
prompt_template = self._get_prompt_engineering_prompt_template(
node_data=node_data,
@@ -508,7 +516,7 @@ class ParameterExtractorNode(Node[ParameterExtractorNodeData]):
context="",
memory_config=node_data.memory,
memory=None,
model_config=model_config,
model_instance=model_instance,
image_detail_config=vision_detail,
)
@@ -769,21 +777,16 @@ class ParameterExtractorNode(Node[ParameterExtractorNodeData]):
node_data: ParameterExtractorNodeData,
query: str,
variable_pool: VariablePool,
model_config: ModelConfigWithCredentialsEntity,
model_instance: ModelInstance,
context: str | None,
) -> int:
try:
model_schema = llm_utils.fetch_model_schema(model_instance=model_instance)
except ValueError as exc:
raise ModelSchemaNotFoundError("Model schema not found") from exc
prompt_transform = AdvancedPromptTransform(with_variable_tmpl=True)
model_instance, model_config = self._fetch_model_config(node_data.model)
if not isinstance(model_instance.model_type_instance, LargeLanguageModel):
raise InvalidModelTypeError("Model is not a Large Language Model")
llm_model = model_instance.model_type_instance
model_schema = llm_model.get_model_schema(model_config.model, model_config.credentials)
if not model_schema:
raise ModelSchemaNotFoundError("Model schema not found")
if set(model_schema.features or []) & {ModelFeature.MULTI_TOOL_CALL, ModelFeature.MULTI_TOOL_CALL}:
if set(model_schema.features or []) & {ModelFeature.TOOL_CALL, ModelFeature.MULTI_TOOL_CALL}:
prompt_template = self._get_function_calling_prompt_template(node_data, query, variable_pool, None, 2000)
else:
prompt_template = self._get_prompt_engineering_prompt_template(node_data, query, variable_pool, None, 2000)
@@ -796,27 +799,28 @@ class ParameterExtractorNode(Node[ParameterExtractorNodeData]):
context=context,
memory_config=node_data.memory,
memory=None,
model_config=model_config,
model_instance=model_instance,
)
rest_tokens = 2000
model_context_tokens = model_config.model_schema.model_properties.get(ModelPropertyKey.CONTEXT_SIZE)
model_context_tokens = model_schema.model_properties.get(ModelPropertyKey.CONTEXT_SIZE)
if model_context_tokens:
model_type_instance = model_config.provider_model_bundle.model_type_instance
model_type_instance = cast(LargeLanguageModel, model_type_instance)
model_type_instance = cast(LargeLanguageModel, model_instance.model_type_instance)
curr_message_tokens = (
model_type_instance.get_num_tokens(model_config.model, model_config.credentials, prompt_messages) + 1000
model_type_instance.get_num_tokens(
model_instance.model_name, model_instance.credentials, prompt_messages
)
+ 1000
) # add 1000 to ensure tool call messages
max_tokens = 0
for parameter_rule in model_config.model_schema.parameter_rules:
for parameter_rule in model_schema.parameter_rules:
if parameter_rule.name == "max_tokens" or (
parameter_rule.use_template and parameter_rule.use_template == "max_tokens"
):
max_tokens = (
model_config.parameters.get(parameter_rule.name)
or model_config.parameters.get(parameter_rule.use_template or "")
model_instance.parameters.get(parameter_rule.name)
or model_instance.parameters.get(parameter_rule.use_template or "")
) or 0
rest_tokens = model_context_tokens - max_tokens - curr_message_tokens
@@ -824,21 +828,6 @@ class ParameterExtractorNode(Node[ParameterExtractorNodeData]):
return rest_tokens
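The budget arithmetic in `_calculate_rest_token` above boils down to: context window, minus the reserved completion tokens (`max_tokens`), minus the current prompt tokens plus a 1000-token reserve for tool-call messages. A sketch with invented numbers (the real method also falls back to a default of 2000 when the model reports no context size):

```python
def remaining_prompt_budget(
    context_size: int,
    max_completion_tokens: int,
    prompt_tokens: int,
    tool_call_reserve: int = 1000,  # the node adds 1000 "to ensure tool call messages"
) -> int:
    """Tokens left for memory/history after reserving completion and tool-call room."""
    curr_message_tokens = prompt_tokens + tool_call_reserve
    return context_size - max_completion_tokens - curr_message_tokens


# invented numbers: an 8k-context model, max_tokens=1024, a 500-token prompt
print(remaining_prompt_budget(8192, 1024, 500))
```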
def _fetch_model_config(
self, node_data_model: ModelConfig
) -> tuple[ModelInstance, ModelConfigWithCredentialsEntity]:
"""
Fetch model config.
"""
if not self._model_instance or not self._model_config:
self._model_instance, self._model_config = llm_utils.fetch_model_config(
node_data_model=node_data_model,
credentials_provider=self._credentials_provider,
model_factory=self._model_factory,
)
return self._model_instance, self._model_config
@classmethod
def _extract_variable_selector_to_variable_mapping(
cls,


@@ -27,3 +27,16 @@ class HttpClientProtocol(Protocol):
class FileManagerProtocol(Protocol):
def download(self, f: File, /) -> bytes: ...
class ToolFileManagerProtocol(Protocol):
def create_file_by_raw(
self,
*,
user_id: str,
tenant_id: str,
conversation_id: str | None,
file_binary: bytes,
mimetype: str,
filename: str | None = None,
) -> Any: ...


@@ -3,12 +3,10 @@ import re
from collections.abc import Mapping, Sequence
from typing import TYPE_CHECKING, Any
from core.app.entities.app_invoke_entities import ModelConfigWithCredentialsEntity
from core.memory.token_buffer_memory import TokenBufferMemory
from core.model_manager import ModelInstance
from core.model_runtime.entities import LLMUsage, ModelPropertyKey, PromptMessageRole
from core.model_runtime.utils.encoders import jsonable_encoder
from core.prompt.advanced_prompt_transform import AdvancedPromptTransform
from core.prompt.simple_prompt_transform import ModelMode
from core.prompt.utils.prompt_message_util import PromptMessageUtil
from core.workflow.entities import GraphInitParams
@@ -22,7 +20,12 @@ from core.workflow.node_events import ModelInvokeCompletedEvent, NodeRunResult
from core.workflow.nodes.base.entities import VariableSelector
from core.workflow.nodes.base.node import Node
from core.workflow.nodes.base.variable_template_parser import VariableTemplateParser
from core.workflow.nodes.llm import LLMNode, LLMNodeChatModelMessage, LLMNodeCompletionModelPromptTemplate, llm_utils
from core.workflow.nodes.llm import (
LLMNode,
LLMNodeChatModelMessage,
LLMNodeCompletionModelPromptTemplate,
llm_utils,
)
from core.workflow.nodes.llm.file_saver import FileSaverImpl, LLMFileSaver
from core.workflow.nodes.llm.protocols import CredentialsProvider, ModelFactory
from libs.json_in_md_parser import parse_and_check_json_markdown
@@ -52,6 +55,7 @@ class QuestionClassifierNode(Node[QuestionClassifierNodeData]):
_llm_file_saver: LLMFileSaver
_credentials_provider: "CredentialsProvider"
_model_factory: "ModelFactory"
_model_instance: ModelInstance
def __init__(
self,
@@ -62,6 +66,7 @@ class QuestionClassifierNode(Node[QuestionClassifierNodeData]):
*,
credentials_provider: "CredentialsProvider",
model_factory: "ModelFactory",
model_instance: ModelInstance,
llm_file_saver: LLMFileSaver | None = None,
):
super().__init__(
@@ -75,6 +80,7 @@ class QuestionClassifierNode(Node[QuestionClassifierNodeData]):
self._credentials_provider = credentials_provider
self._model_factory = model_factory
self._model_instance = model_instance
if llm_file_saver is None:
llm_file_saver = FileSaverImpl(
@@ -95,18 +101,8 @@ class QuestionClassifierNode(Node[QuestionClassifierNodeData]):
variable = variable_pool.get(node_data.query_variable_selector) if node_data.query_variable_selector else None
query = variable.value if variable else None
variables = {"query": query}
# fetch model config
model_instance, model_config = llm_utils.fetch_model_config(
node_data_model=node_data.model,
credentials_provider=self._credentials_provider,
model_factory=self._model_factory,
)
model_schema = model_instance.model_type_instance.get_model_schema(
model_instance.model_name,
model_instance.credentials,
)
if not model_schema:
raise ValueError(f"Model schema not found for {model_instance.model_name}")
# fetch model instance
model_instance = self._model_instance
# fetch memory
memory = llm_utils.fetch_memory(
variable_pool=variable_pool,
@@ -131,7 +127,7 @@ class QuestionClassifierNode(Node[QuestionClassifierNodeData]):
rest_token = self._calculate_rest_token(
node_data=node_data,
query=query or "",
model_config=model_config,
model_instance=model_instance,
context="",
)
prompt_template = self._get_prompt_template(
@@ -149,9 +145,7 @@ class QuestionClassifierNode(Node[QuestionClassifierNodeData]):
sys_query="",
memory=memory,
model_instance=model_instance,
model_schema=model_schema,
model_parameters=node_data.model.completion_params,
stop=model_config.stop,
stop=model_instance.stop,
sys_files=files,
vision_enabled=node_data.vision.enabled,
vision_detail=node_data.vision.configs.detail,
@@ -166,7 +160,6 @@ class QuestionClassifierNode(Node[QuestionClassifierNodeData]):
try:
# handle invoke result
generator = LLMNode.invoke_llm(
node_data_model=node_data.model,
model_instance=model_instance,
prompt_messages=prompt_messages,
stop=stop,
@@ -205,14 +198,14 @@ class QuestionClassifierNode(Node[QuestionClassifierNodeData]):
category_name = classes_map[category_id_result]
category_id = category_id_result
process_data = {
"model_mode": model_config.mode,
"model_mode": node_data.model.mode,
"prompts": PromptMessageUtil.prompt_messages_to_prompt_for_saving(
model_mode=model_config.mode, prompt_messages=prompt_messages
model_mode=node_data.model.mode, prompt_messages=prompt_messages
),
"usage": jsonable_encoder(usage),
"finish_reason": finish_reason,
"model_provider": model_config.provider,
"model_name": model_config.model,
"model_provider": model_instance.provider,
"model_name": model_instance.model_name,
}
outputs = {
"class_name": category_name,
@@ -285,39 +278,40 @@ class QuestionClassifierNode(Node[QuestionClassifierNodeData]):
self,
node_data: QuestionClassifierNodeData,
query: str,
model_config: ModelConfigWithCredentialsEntity,
model_instance: ModelInstance,
context: str | None,
) -> int:
prompt_transform = AdvancedPromptTransform(with_variable_tmpl=True)
model_schema = llm_utils.fetch_model_schema(model_instance=model_instance)
prompt_template = self._get_prompt_template(node_data, query, None, 2000)
prompt_messages = prompt_transform.get_prompt(
prompt_messages, _ = LLMNode.fetch_prompt_messages(
prompt_template=prompt_template,
inputs={},
query="",
files=[],
sys_query="",
sys_files=[],
context=context,
memory_config=node_data.memory,
memory=None,
model_config=model_config,
model_instance=model_instance,
stop=model_instance.stop,
memory_config=node_data.memory,
vision_enabled=False,
vision_detail=node_data.vision.configs.detail,
variable_pool=self.graph_runtime_state.variable_pool,
jinja2_variables=[],
)
rest_tokens = 2000
model_context_tokens = model_config.model_schema.model_properties.get(ModelPropertyKey.CONTEXT_SIZE)
model_context_tokens = model_schema.model_properties.get(ModelPropertyKey.CONTEXT_SIZE)
if model_context_tokens:
model_instance = ModelInstance(
provider_model_bundle=model_config.provider_model_bundle, model=model_config.model
)
curr_message_tokens = model_instance.get_llm_num_tokens(prompt_messages)
max_tokens = 0
for parameter_rule in model_config.model_schema.parameter_rules:
for parameter_rule in model_schema.parameter_rules:
if parameter_rule.name == "max_tokens" or (
parameter_rule.use_template and parameter_rule.use_template == "max_tokens"
):
max_tokens = (
model_config.parameters.get(parameter_rule.name)
or model_config.parameters.get(parameter_rule.use_template or "")
model_instance.parameters.get(parameter_rule.name)
or model_instance.parameters.get(parameter_rule.use_template or "")
) or 0
rest_tokens = model_context_tokens - max_tokens - curr_message_tokens


@@ -11,8 +11,6 @@ from core.tools.entities.tool_entities import ToolInvokeMessage, ToolParameter
from core.tools.errors import ToolInvokeError
from core.tools.tool_engine import ToolEngine
from core.tools.utils.message_transformer import ToolFileMessageTransformer
from core.variables.segments import ArrayAnySegment, ArrayFileSegment
from core.variables.variables import ArrayAnyVariable
from core.workflow.enums import (
NodeType,
SystemVariableKey,
@@ -23,6 +21,8 @@ from core.workflow.file import File, FileTransferMethod
from core.workflow.node_events import NodeEventBase, NodeRunResult, StreamChunkEvent, StreamCompletedEvent
from core.workflow.nodes.base.node import Node
from core.workflow.nodes.base.variable_template_parser import VariableTemplateParser
from core.workflow.variables.segments import ArrayAnySegment, ArrayFileSegment
from core.workflow.variables.variables import ArrayAnyVariable
from extensions.ext_database import db
from factories import file_factory
from models import ToolFile


@@ -2,14 +2,14 @@ import logging
from collections.abc import Mapping
from typing import Any
from core.variables.types import SegmentType
from core.variables.variables import FileVariable
from core.workflow.constants import SYSTEM_VARIABLE_NODE_ID
from core.workflow.entities.workflow_node_execution import WorkflowNodeExecutionStatus
from core.workflow.enums import NodeExecutionType, NodeType
from core.workflow.file import FileTransferMethod
from core.workflow.node_events import NodeRunResult
from core.workflow.nodes.base.node import Node
from core.workflow.variables.types import SegmentType
from core.workflow.variables.variables import FileVariable
from factories import file_factory
from factories.variable_factory import build_segment_with_type


@@ -1,7 +1,7 @@
from pydantic import BaseModel
from core.variables.types import SegmentType
from core.workflow.nodes.base import BaseNodeData
from core.workflow.variables.types import SegmentType
class AdvancedSettings(BaseModel):


@@ -1,10 +1,10 @@
from collections.abc import Mapping
from core.variables.segments import Segment
from core.workflow.enums import NodeType, WorkflowNodeExecutionStatus
from core.workflow.node_events import NodeRunResult
from core.workflow.nodes.base.node import Node
from core.workflow.nodes.variable_aggregator.entities import VariableAggregatorNodeData
from core.workflow.variables.segments import Segment
class VariableAggregatorNode(Node[VariableAggregatorNodeData]):


@@ -3,9 +3,9 @@ from typing import Any, TypeVar
from pydantic import BaseModel
from core.variables import Segment
from core.variables.consts import SELECTORS_LENGTH
from core.variables.types import SegmentType
from core.workflow.variables import Segment
from core.workflow.variables.consts import SELECTORS_LENGTH
from core.workflow.variables.types import SegmentType
# Use double underscore (`__`) prefix for internal variables
# to minimize risk of collision with user-defined variable names.


@@ -1,7 +1,6 @@
from collections.abc import Mapping, Sequence
from typing import TYPE_CHECKING, Any
from core.variables import SegmentType, VariableBase
from core.workflow.constants import CONVERSATION_VARIABLE_NODE_ID
from core.workflow.entities import GraphInitParams
from core.workflow.enums import NodeType, WorkflowNodeExecutionStatus
@@ -9,6 +8,7 @@ from core.workflow.node_events import NodeRunResult
from core.workflow.nodes.base.node import Node
from core.workflow.nodes.variable_assigner.common import helpers as common_helpers
from core.workflow.nodes.variable_assigner.common.exc import VariableOperatorNodeError
from core.workflow.variables import SegmentType, VariableBase
from .node_data import VariableAssignerData, WriteMode


@@ -1,6 +1,6 @@
from typing import Any
from core.variables import SegmentType
from core.workflow.variables import SegmentType
from .enums import Operation


@@ -2,14 +2,14 @@ import json
from collections.abc import Mapping, MutableMapping, Sequence
from typing import TYPE_CHECKING, Any
from core.variables import SegmentType, VariableBase
from core.variables.consts import SELECTORS_LENGTH
from core.workflow.constants import CONVERSATION_VARIABLE_NODE_ID
from core.workflow.enums import NodeType, WorkflowNodeExecutionStatus
from core.workflow.node_events import NodeRunResult
from core.workflow.nodes.base.node import Node
from core.workflow.nodes.variable_assigner.common import helpers as common_helpers
from core.workflow.nodes.variable_assigner.common.exc import VariableOperatorNodeError
from core.workflow.variables import SegmentType, VariableBase
from core.workflow.variables.consts import SELECTORS_LENGTH
from . import helpers
from .entities import VariableAssignerNodeData, VariableOperationItem


@@ -0,0 +1,50 @@
from collections.abc import Generator
from typing import Any, Protocol
from pydantic import BaseModel
from core.workflow.file import File
from core.workflow.node_events import StreamChunkEvent, StreamCompletedEvent
class DatasourceParameter(BaseModel):
workspace_id: str
page_id: str
type: str
class OnlineDriveDownloadFileParam(BaseModel):
id: str
bucket: str
class DatasourceFinal(BaseModel):
data: dict[str, Any] | None = None
class DatasourceManagerProtocol(Protocol):
@classmethod
def get_icon_url(cls, provider_id: str, tenant_id: str, datasource_name: str, datasource_type: str) -> str: ...
@classmethod
def stream_node_events(
cls,
*,
node_id: str,
user_id: str,
datasource_name: str,
datasource_type: str,
provider_id: str,
tenant_id: str,
provider: str,
plugin_id: str,
credential_id: str,
parameters_for_log: dict[str, Any],
datasource_info: dict[str, Any],
variable_pool: Any,
datasource_param: DatasourceParameter | None = None,
online_drive_request: OnlineDriveDownloadFileParam | None = None,
) -> Generator[StreamChunkEvent | StreamCompletedEvent, None, None]: ...
@classmethod
def get_upload_file_by_id(cls, file_id: str, tenant_id: str) -> File: ...


@@ -2,8 +2,8 @@ from collections.abc import Mapping, Sequence
from typing import Any, Protocol
from core.model_runtime.entities.llm_entities import LLMUsage
from core.variables.segments import Segment
from core.workflow.system_variable import SystemVariableReadOnlyView
from core.workflow.variables.segments import Segment
class ReadOnlyVariablePool(Protocol):


@@ -5,8 +5,8 @@ from copy import deepcopy
from typing import Any
from core.model_runtime.entities.llm_entities import LLMUsage
from core.variables.segments import Segment
from core.workflow.system_variable import SystemVariableReadOnlyView
from core.workflow.variables.segments import Segment
from .graph_runtime_state import GraphRuntimeState
from .variable_pool import VariablePool


@@ -8,10 +8,6 @@ from typing import Annotated, Any, Union, cast
from pydantic import BaseModel, Field
from core.variables import Segment, SegmentGroup, VariableBase
from core.variables.consts import SELECTORS_LENGTH
from core.variables.segments import FileSegment, ObjectSegment
from core.variables.variables import RAGPipelineVariableInput, Variable
from core.workflow.constants import (
CONVERSATION_VARIABLE_NODE_ID,
ENVIRONMENT_VARIABLE_NODE_ID,
@@ -20,6 +16,10 @@ from core.workflow.constants import (
)
from core.workflow.file import File, FileAttribute, file_manager
from core.workflow.system_variable import SystemVariable
from core.workflow.variables import Segment, SegmentGroup, VariableBase
from core.workflow.variables.consts import SELECTORS_LENGTH
from core.workflow.variables.segments import FileSegment, ObjectSegment
from core.workflow.variables.variables import RAGPipelineVariableInput, Variable
from factories import variable_factory
VariableValue = Union[str, int, float, dict[str, object], list[object], File]


@@ -2,10 +2,10 @@ import json
from collections.abc import Mapping, Sequence
from typing import Literal, NamedTuple
from core.variables import ArrayFileSegment
from core.variables.segments import ArrayBooleanSegment, BooleanSegment
from core.workflow.file import FileAttribute, file_manager
from core.workflow.runtime import VariablePool
from core.workflow.variables import ArrayFileSegment
from core.workflow.variables.segments import ArrayBooleanSegment, BooleanSegment
from .entities import Condition, SubCondition, SupportedComparisonOperator


@@ -2,9 +2,9 @@ import abc
from collections.abc import Mapping, Sequence
from typing import Any, Protocol
from core.variables import VariableBase
from core.variables.consts import SELECTORS_LENGTH
from core.workflow.runtime import VariablePool
from core.workflow.variables import VariableBase
from core.workflow.variables.consts import SELECTORS_LENGTH
class VariableLoader(Protocol):


@@ -4,8 +4,6 @@ from uuid import uuid4
from pydantic import BaseModel, Discriminator, Field, Tag
from core.helper import encrypter
from .segments import (
ArrayAnySegment,
ArrayBooleanSegment,
@@ -27,6 +25,14 @@ from .segments import (
from .types import SegmentType
def _obfuscated_token(token: str) -> str:
if not token:
return token
if len(token) <= 8:
return "*" * 20
return token[:6] + "*" * 12 + token[-2:]
class VariableBase(Segment):
"""
A variable is a segment that has a name.
@@ -86,7 +92,7 @@ class SecretVariable(StringVariable):
@property
def log(self) -> str:
return encrypter.obfuscated_token(self.value)
return _obfuscated_token(self.value)
class NoneVariable(NoneSegment, VariableBase):

View File

@@ -4,8 +4,8 @@ from typing import Any, overload
from pydantic import BaseModel
from core.variables import Segment
from core.workflow.file.models import File
from core.workflow.variables import Segment
class WorkflowRuntimeTypeConverter:

View File

@@ -9,11 +9,11 @@ from opentelemetry.trace import Span
from opentelemetry.trace.status import Status, StatusCode
from pydantic import BaseModel
from core.variables import Segment
from core.workflow.enums import NodeType
from core.workflow.file.models import File
from core.workflow.graph_events import GraphNodeEventBase
from core.workflow.nodes.base.node import Node
from core.workflow.variables import Segment
from extensions.otel.semconv.gen_ai import ChainAttributes, GenAIAttributes

View File

@@ -8,9 +8,9 @@ from typing import Any
from opentelemetry.trace import Span
from core.variables import Segment
from core.workflow.graph_events import GraphNodeEventBase
from core.workflow.nodes.base.node import Node
from core.workflow.variables import Segment
from extensions.otel.parser.base import DefaultNodeOTelParser, safe_json_dumps
from extensions.otel.semconv.gen_ai import RetrieverAttributes

View File

@@ -3,8 +3,13 @@ from typing import Any, cast
from uuid import uuid4
from configs import dify_config
from core.variables.exc import VariableError
from core.variables.segments import (
from core.workflow.constants import (
CONVERSATION_VARIABLE_NODE_ID,
ENVIRONMENT_VARIABLE_NODE_ID,
)
from core.workflow.file import File
from core.workflow.variables.exc import VariableError
from core.workflow.variables.segments import (
ArrayAnySegment,
ArrayBooleanSegment,
ArrayFileSegment,
@@ -21,8 +26,8 @@ from core.variables.segments import (
Segment,
StringSegment,
)
from core.variables.types import SegmentType
from core.variables.variables import (
from core.workflow.variables.types import SegmentType
from core.workflow.variables.variables import (
ArrayAnyVariable,
ArrayBooleanVariable,
ArrayFileVariable,
@@ -39,11 +44,6 @@ from core.variables.variables import (
StringVariable,
VariableBase,
)
from core.workflow.constants import (
CONVERSATION_VARIABLE_NODE_ID,
ENVIRONMENT_VARIABLE_NODE_ID,
)
from core.workflow.file import File
class UnsupportedSegmentTypeError(Exception):

View File

@@ -1,7 +1,7 @@
from typing import TypedDict
from core.variables.segments import Segment
from core.variables.types import SegmentType
from core.workflow.variables.segments import Segment
from core.workflow.variables.types import SegmentType
class _VarTypedDict(TypedDict, total=False):

View File

@@ -1,7 +1,7 @@
from flask_restx import fields
from core.helper import encrypter
from core.variables import SecretVariable, SegmentType, VariableBase
from core.workflow.variables import SecretVariable, SegmentType, VariableBase
from fields.member_fields import simple_account_fields
from libs.helper import TimestampField

View File

@@ -13,6 +13,8 @@ from libs.token import check_csrf_token
from models import Account
if TYPE_CHECKING:
from flask.typing import ResponseReturnValue
from models.model import EndUser
@@ -38,7 +40,7 @@ P = ParamSpec("P")
R = TypeVar("R")
def login_required(func: Callable[P, R]):
def login_required(func: Callable[P, R]) -> Callable[P, R | ResponseReturnValue]:
"""
If you decorate a view with this, it will ensure that the current user is
logged in and authenticated before calling the actual view. (If they are
@@ -73,7 +75,7 @@ def login_required(func: Callable[P, R]):
"""
@wraps(func)
def decorated_view(*args: P.args, **kwargs: P.kwargs):
def decorated_view(*args: P.args, **kwargs: P.kwargs) -> R | ResponseReturnValue:
if request.method in EXEMPT_METHODS or dify_config.LOGIN_DISABLED:
pass
elif current_user is not None and not current_user.is_authenticated:

View File

@@ -22,8 +22,6 @@ from sqlalchemy import (
from sqlalchemy.orm import Mapped, declared_attr, mapped_column
from typing_extensions import deprecated
from core.variables import utils as variable_utils
from core.variables.variables import FloatVariable, IntegerVariable, StringVariable
from core.workflow.constants import (
CONVERSATION_VARIABLE_NODE_ID,
SYSTEM_VARIABLE_NODE_ID,
@@ -33,6 +31,8 @@ from core.workflow.entities.pause_reason import HumanInputRequired, PauseReason,
from core.workflow.enums import NodeType, WorkflowExecutionStatus
from core.workflow.file.constants import maybe_file_object
from core.workflow.file.models import File
from core.workflow.variables import utils as variable_utils
from core.workflow.variables.variables import FloatVariable, IntegerVariable, StringVariable
from extensions.ext_storage import Storage
from factories.variable_factory import TypeMismatchError, build_segment_with_type
from libs.datetime_utils import naive_utc_now
@@ -46,7 +46,7 @@ if TYPE_CHECKING:
from constants import DEFAULT_FILE_NUMBER_LIMITS, HIDDEN_VALUE
from core.helper import encrypter
from core.variables import SecretVariable, Segment, SegmentType, VariableBase
from core.workflow.variables import SecretVariable, Segment, SegmentType, VariableBase
from factories import variable_factory
from libs import helper

View File

@@ -289,6 +289,12 @@ class AccountService:
TenantService.create_owner_tenant_if_not_exist(account=account)
# Enterprise-only: best-effort add the account to the default workspace (does not switch current workspace).
if dify_config.ENTERPRISE_ENABLED:
from services.enterprise.enterprise_service import try_join_default_workspace
try_join_default_workspace(str(account.id))
return account
@staticmethod
@@ -1407,6 +1413,12 @@ class RegisterService:
tenant_was_created.send(tenant)
db.session.commit()
# Enterprise-only: best-effort add the account to the default workspace (does not switch current workspace).
if dify_config.ENTERPRISE_ENABLED:
from services.enterprise.enterprise_service import try_join_default_workspace
try_join_default_workspace(str(account.id))
except WorkSpaceNotAllowedCreateError:
db.session.rollback()
logger.exception("Register failed")

View File

@@ -10,7 +10,7 @@ from configs import dify_config
from core.app.entities.app_invoke_entities import InvokeFrom
from core.db.session_factory import session_factory
from core.llm_generator.llm_generator import LLMGenerator
from core.variables.types import SegmentType
from core.workflow.variables.types import SegmentType
from extensions.ext_database import db
from factories import variable_factory
from libs.datetime_utils import naive_utc_now
@@ -180,6 +180,14 @@ class ConversationService:
@classmethod
def delete(cls, app_model: App, conversation_id: str, user: Union[Account, EndUser] | None):
"""
Delete a conversation only if it belongs to the given user and app context.
Raises:
ConversationNotExistsError: When the conversation is not visible to the current user.
"""
conversation = cls.get_conversation(app_model, conversation_id, user)
try:
logger.info(
"Initiating conversation deletion for app_name %s, conversation_id: %s",
@@ -187,10 +195,10 @@
conversation_id,
)
db.session.query(Conversation).where(Conversation.id == conversation_id).delete(synchronize_session=False)
db.session.delete(conversation)
db.session.commit()
delete_conversation_related_data.delay(conversation_id)
delete_conversation_related_data.delay(conversation.id)
except Exception as e:
db.session.rollback()

View File

@@ -1,7 +1,7 @@
from sqlalchemy import select
from sqlalchemy.orm import Session, sessionmaker
from core.variables.variables import VariableBase
from core.workflow.variables.variables import VariableBase
from models import ConversationVariable

View File

@@ -39,6 +39,9 @@ class BaseRequest:
endpoint: str,
json: Any | None = None,
params: Mapping[str, Any] | None = None,
*,
timeout: float | httpx.Timeout | None = None,
raise_for_status: bool = False,
) -> Any:
headers = {"Content-Type": "application/json", cls.secret_key_header: cls.secret_key}
url = f"{cls.base_url}{endpoint}"
@@ -53,7 +56,16 @@
logger.debug("Failed to generate traceparent header", exc_info=True)
with httpx.Client(mounts=mounts) as client:
response = client.request(method, url, json=json, params=params, headers=headers)
# IMPORTANT:
# - In httpx, passing timeout=None disables timeouts (infinite) and overrides the library default.
# - To preserve httpx's default timeout behavior for existing call sites, only pass the kwarg when set.
request_kwargs: dict[str, Any] = {"json": json, "params": params, "headers": headers}
if timeout is not None:
request_kwargs["timeout"] = timeout
response = client.request(method, url, **request_kwargs)
if raise_for_status:
response.raise_for_status()
return response.json()

View File

@@ -1,9 +1,16 @@
import logging
import uuid
from datetime import datetime
from pydantic import BaseModel, Field
from pydantic import BaseModel, ConfigDict, Field, model_validator
from configs import dify_config
from services.enterprise.base import EnterpriseRequest
logger = logging.getLogger(__name__)
DEFAULT_WORKSPACE_JOIN_TIMEOUT_SECONDS = 1.0
class WebAppSettings(BaseModel):
access_mode: str = Field(
@@ -30,6 +37,55 @@ class WorkspacePermission(BaseModel):
)
class DefaultWorkspaceJoinResult(BaseModel):
"""
Result of ensuring an account is a member of the enterprise default workspace.
- joined=True is idempotent (already a member also returns True)
- joined=False means enterprise default workspace is not configured or invalid/archived
"""
workspace_id: str = Field(default="", alias="workspaceId")
joined: bool
message: str
model_config = ConfigDict(extra="forbid", populate_by_name=True)
@model_validator(mode="after")
def _check_workspace_id_when_joined(self) -> "DefaultWorkspaceJoinResult":
if self.joined and not self.workspace_id:
raise ValueError("workspace_id must be non-empty when joined is True")
return self
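The `model_validator` above enforces that a successful join always carries a workspace id. The same invariant can be sketched without pydantic as a plain dataclass, with `__post_init__` playing the validator's role (this is an illustration of the rule, not the shipped model):

```python
from dataclasses import dataclass

@dataclass
class DefaultWorkspaceJoinResult:
    joined: bool
    message: str
    workspace_id: str = ""

    def __post_init__(self) -> None:
        # Same invariant as the pydantic model_validator in the diff:
        # a successful join must name the workspace that was joined.
        if self.joined and not self.workspace_id:
            raise ValueError("workspace_id must be non-empty when joined is True")

ok = DefaultWorkspaceJoinResult(joined=True, message="ok", workspace_id="ws-1")
print(ok.workspace_id)  # → ws-1

skipped = DefaultWorkspaceJoinResult(joined=False, message="default workspace not configured")
print(skipped.workspace_id == "")  # → True
```

Validating at construction time means a `joined=True` result with no workspace id can never propagate into the logging calls below.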
def try_join_default_workspace(account_id: str) -> None:
"""
Enterprise-only side-effect: ensure account is a member of the default workspace.
This is a best-effort integration. Failures must not block user registration.
"""
if not dify_config.ENTERPRISE_ENABLED:
return
try:
result = EnterpriseService.join_default_workspace(account_id=account_id)
if result.joined:
logger.info(
"Joined enterprise default workspace for account %s (workspace_id=%s)",
account_id,
result.workspace_id,
)
else:
logger.info(
"Skipped joining enterprise default workspace for account %s (message=%s)",
account_id,
result.message,
)
except Exception:
logger.warning("Failed to join enterprise default workspace for account %s", account_id, exc_info=True)
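`try_join_default_workspace` is a best-effort wrapper: any failure is logged with a stack trace and swallowed, so registration never blocks on the enterprise API. The pattern in isolation (names here are illustrative, not from the diff):

```python
import logging

logger = logging.getLogger(__name__)

def best_effort(side_effect, account_id: str) -> None:
    # Mirrors try_join_default_workspace in the diff: the side effect must
    # never propagate. Failures are logged with exc_info and then ignored.
    try:
        side_effect(account_id)
    except Exception:
        logger.warning("Best-effort side effect failed for %s", account_id, exc_info=True)

def flaky_join(_account_id: str) -> None:
    raise RuntimeError("enterprise API unavailable")

best_effort(flaky_join, "acct-1")
print("registration continues")  # → registration continues
```

Catching broad `Exception` is deliberate here: the caller has already committed the account, so the only safe behavior is to log and move on.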
class EnterpriseService:
@classmethod
def get_info(cls):
@@ -39,6 +95,34 @@
def get_workspace_info(cls, tenant_id: str):
return EnterpriseRequest.send_request("GET", f"/workspace/{tenant_id}/info")
@classmethod
def join_default_workspace(cls, *, account_id: str) -> DefaultWorkspaceJoinResult:
"""
Call enterprise inner API to add an account to the default workspace.
NOTE: EnterpriseRequest.base_url is expected to already include the `/inner/api` prefix,
so the endpoint here is `/default-workspace/members`.
"""
# Ensure we are sending a UUID-shaped string (enterprise side validates too).
try:
uuid.UUID(account_id)
except ValueError as e:
raise ValueError(f"account_id must be a valid UUID: {account_id}") from e
data = EnterpriseRequest.send_request(
"POST",
"/default-workspace/members",
json={"account_id": account_id},
timeout=DEFAULT_WORKSPACE_JOIN_TIMEOUT_SECONDS,
raise_for_status=True,
)
if not isinstance(data, dict):
raise ValueError("Invalid response format from enterprise default workspace API")
if "joined" not in data or "message" not in data:
raise ValueError("Invalid response payload from enterprise default workspace API")
return DefaultWorkspaceJoinResult.model_validate(data)
@classmethod
def get_app_sso_settings_last_update_time(cls) -> datetime:
data = EnterpriseRequest.send_request("GET", "/sso/app/last-update-time")

View File

@@ -36,7 +36,6 @@ from core.rag.entities.event import (
)
from core.repositories.factory import DifyCoreRepositoryFactory
from core.repositories.sqlalchemy_workflow_node_execution_repository import SQLAlchemyWorkflowNodeExecutionRepository
from core.variables.variables import VariableBase
from core.workflow.entities.workflow_node_execution import (
WorkflowNodeExecution,
WorkflowNodeExecutionStatus,
@@ -52,6 +51,7 @@ from core.workflow.nodes.node_mapping import LATEST_VERSION, NODE_TYPE_CLASSES_M
from core.workflow.repositories.workflow_node_execution_repository import OrderConfig
from core.workflow.runtime import VariablePool
from core.workflow.system_variable import SystemVariable
from core.workflow.variables.variables import VariableBase
from core.workflow.workflow_entry import WorkflowEntry
from extensions.ext_database import db
from libs.infinite_scroll_pagination import InfiniteScrollPagination

View File

@@ -16,9 +16,9 @@ from werkzeug.exceptions import RequestEntityTooLarge
from configs import dify_config
from core.app.entities.app_invoke_entities import InvokeFrom
from core.tools.tool_file_manager import ToolFileManager
from core.variables.types import SegmentType
from core.workflow.enums import NodeType
from core.workflow.file.models import FileTransferMethod
from core.workflow.variables.types import SegmentType
from enums.quota_type import QuotaType
from extensions.ext_database import db
from extensions.ext_redis import redis_client
