Migrate from direct TraceQueueManager.add_trace_task calls to TelemetryFacade.emit
with TelemetryEvent abstraction. This reduces CE code invasion by consolidating
telemetry logic in core/telemetry/ with a single guard in ops_trace_manager.py.
The model_provider field in prompt generation traces was being incorrectly
extracted by parsing the model name (e.g., 'deepseek-chat'), which resulted
in an empty string when the model name didn't contain a '/' character.
Now extracts the provider directly from the model_config parameter, with
a fallback to the old parsing logic for backward compatibility.
Changes:
- Update _emit_prompt_generation_trace to accept model_config parameter
- Extract provider from model_config.get('provider') when available
- Update all 6 call sites to pass model_config
- Maintain backward compatibility with fallback logic
- get_ops_trace_instance was trying to query App table with storage_id format "tenant-{uuid}"
- This caused psycopg2.errors.InvalidTextRepresentation when app_id is None
- Added early return for tenant-prefixed storage identifiers to skip App lookup
- Enterprise telemetry still works correctly with these storage IDs
- The no_variable=false code path in generate_rule_config was missing trace emission
- Added timing wrapper and _emit_prompt_generation_trace call to ensure metrics/logs are captured
- Trace now emitted on both success and failure cases for consistency with no_variable=true path
- Forward kwargs to message_trace to preserve node_execution_id
- Add node_execution_id extraction to all trace methods
- Add app_id parameter to prompt generation API endpoints
- Enable app_id tracing for rule_generate, code_generate, and structured_output operations
Remove app_id parameter from three endpoints and update trace manager to use
tenant_id as storage identifier when app_id is unavailable. This allows
standalone prompt generation utilities to emit telemetry.
Changes:
- controllers/console/app/generator.py: Remove app_id=None from 3 endpoints
(RuleGenerateApi, RuleCodeGenerateApi, RuleStructuredOutputGenerateApi)
- core/ops/ops_trace_manager.py: Use tenant_id fallback in send_to_celery
- Extract tenant_id from task.kwargs when app_id is None
- Use 'tenant-{tenant_id}' format as storage identifier
- Skip traces only if neither app_id nor tenant_id available
The trace metadata still contains the actual tenant_id, so enterprise
telemetry correctly emits metrics and logs grouped by tenant.
Remove app_id=None from three prompt generation endpoints that lack proper
app context. These standalone utilities only have tenant_id available, so
we don't pass app_id at all rather than passing incomplete information.
Affected endpoints:
- /rule-generate (RuleGenerateApi)
- /code-generate (RuleCodeGenerateApi)
- /structured-output-generate (RuleStructuredOutputGenerateApi)
- Add PromptGenerationTraceInfo trace entity with operation_type field
- Implement telemetry for rule-generate, code-generate, structured-output, instruction-modify operations
- Emit metrics: tokens (total/input/output), duration histogram, requests counter, errors counter
- Emit structured logs with model info and operation context
- Content redaction controlled by ENTERPRISE_INCLUDE_CONTENT env var
- Fix user_id propagation in TraceTask kwargs
- Fix latency calculation when llm_result is None
No spans exported - metrics and logs only for lightweight observability.
- Create dedicated MeterProvider instance (independent from ext_otel.py)
- Add create_metric_exporter() to _ExporterFactory with HTTP/gRPC support
- Enterprise metrics now work without requiring standard OTEL to be enabled
- Add MeterProvider shutdown to cleanup lifecycle
- Update module docstring to reflect full independence (Tracer, Logger, Meter)
Only export structured telemetry logs, not all application logs. The attach_log_handler method now attaches to the 'dify.telemetry' logger instead of the root logger.
- Add ENTERPRISE_OTLP_PROTOCOL config (http/grpc, default: http)
- Introduce _ExporterFactory class for protocol-agnostic exporter creation
- Support both HTTP and gRPC OTLP endpoints for traces and logs
- Refactor endpoint path handling into factory methods
- Add ENTERPRISE_OTEL_LOGS_ENABLED and ENTERPRISE_OTLP_LOGS_ENDPOINT config options
- Implement EnterpriseLoggingHandler for log record translation with trace/span ID parsing
- Add LoggerProvider and BatchLogRecordProcessor for OTLP log export
- Correlate telemetry logs with spans via span_id_source parameter
- Attach log handler during enterprise telemetry initialization
The backend part of the human in the loop (HITL) feature and relevant architecture / workflow engine changes.
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: -LAN- <laipz8200@outlook.com>
Co-authored-by: 盐粒 Yanli <yanli@dify.ai>
Co-authored-by: CrabSAMA <40541269+CrabSAMA@users.noreply.github.com>
Co-authored-by: Stephen Zhou <38493346+hyoban@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: yihong <zouzou0208@gmail.com>
Co-authored-by: Joel <iamjoel007@gmail.com>