After force-logout due to license expiry, the login page calls
/system-features without auth. The license block was gated behind
is_authenticated, so the frontend always saw status='none' instead
of the actual expiry status. Split the guard so license.status and
expired_at are always returned while workspace usage details remain
auth-gated.
Cache license status for 10 minutes to reduce HTTP calls to enterprise API.
Only caches license status, not full system features.
Changes:
- Add EnterpriseService.get_cached_license_status() method
- Cache key: enterprise:license:status
- TTL: 600 seconds (10 minutes)
- Graceful degradation: falls back to API call if Redis fails
Performance improvement:
- Before: HTTP call (~50-200ms) on every API request
- After: Redis lookup (~1ms) on cached requests
- Reduces load on enterprise service by ~99%
Extend license middleware to also block webapp API (/api/*) when
enterprise license is expired/inactive/lost.
Changes:
- Check both /console/api and /api endpoints
- Add webapp-specific exempt paths:
- /api/passport (webapp authentication)
- /api/login, /api/logout, /api/oauth
- /api/forgot-password
- /api/system-features (webapp needs this to check license status)
This ensures both console users and webapp users are blocked when
license expires, maintaining consistent enforcement across all APIs.
Add before_request middleware that validates enterprise license status
for all /console/api endpoints when ENTERPRISE_ENABLED is true.
Behavior:
- Checks license status before each console API request
- Returns 403 with clear error message when license is expired/inactive/lost
- Exempts auth endpoints (login, oauth, forgot-password, etc.)
- Exempts /console/api/features so frontend can fetch license status
- Gracefully handles errors to avoid service disruption
This ensures all business APIs are blocked when license expires,
addressing the issue where APIs remained callable after expiry.
When enterprise API returns 403/404, the response contains error JSON
instead of expected data structure. Code was accessing fields directly
causing KeyError → 500 Internal Server Error.
Changes:
- Add enterprise-specific error classes (EnterpriseAPIError, etc.)
- Implement centralized error validation in EnterpriseRequest.send_request()
- Extract error messages from API responses (message/error/detail fields)
- Raise domain-specific errors based on HTTP status codes
- Preserve backward compatibility with raise_for_status parameter
This prevents KeyError crashes and returns proper HTTP error codes
(403/404) instead of 500 errors.
The CE safety commit (8a3485454a) converted module-level dicts to lazy
functions but forgot to update __init__.py, which still imported the
now-deleted TRACE_TASK_TO_CASE constant causing an ImportError at startup.
Add get_trace_task_to_case() to gateway.py as a lazy public wrapper
(inverse of _get_case_to_trace_task) and update __init__.py to call it.
- Move enqueue_draft_node_execution_trace import inside call site in workflow_service.py
- Make gateway.py enterprise type imports lazy (routing dicts built on first call)
- Restore typed ModelConfig in llm_generator method signatures (revert dict regression)
- Fix generate_structured_output using wrong key model_parameters -> completion_params
- Replace unsafe cast(str, msg.content) with get_text_content() across llm_generator
- Remove duplicated payload classes from generator.py, import from core.llm_generator.entities
- Gate _lookup_app_and_workspace_names and credential lookups in ops_trace_manager behind is_enterprise_telemetry_enabled()
Move gen_ai.usage.*, gen_ai.request.model, gen_ai.provider.name, and
gen_ai.user.id from companion-log-only to span attributes on workflow
and node execution spans.
These are small scalars with no size risk. Having them on spans enables
filtering and grouping in trace UIs (Tempo, Jaeger, Datadog) without
requiring a cross-signal join to companion logs.
Data dictionary updated: span tables gain the new fields; companion log
'additional attributes' tables trimmed to only list fields not already
covered by 'All span attributes'.
Workflow RT: replace float(info.workflow_run_elapsed_time) with
(end_time - start_time).total_seconds() using workflow_run.created_at and
workflow_run.finished_at. The elapsed_time DB field defaults to 0 and can
be stale if the workflow_storage Celery task has not committed yet when the
trace fires. Wall-clock timestamps are more reliable; elapsed_time is kept
as fallback.
Message RT: change end_time from created_at + provider_response_latency to
message.updated_at when updated_at > created_at. The pipeline explicitly
sets message.updated_at = naive_utc_now() at the moment the LLM response
is complete, making it the canonical response-complete timestamp.
Falls back to the latency-based calculation for error/aborted messages.
Add should_include_content() helper to extensions/otel/parser/base.py that
returns True in CE (no behaviour change) and respects ENTERPRISE_INCLUDE_CONTENT
in EE. Gate all content-bearing span attributes in LLM, retrieval, tool, and
default node parsers so that gen_ai.completion, gen_ai.prompt, retrieval.document,
tool call arguments/results, and node input/output values are suppressed when
ENTERPRISE_ENABLED=True and ENTERPRISE_INCLUDE_CONTENT=False.
- Add _lookup_llm_credential_info() to query Provider/ProviderModel tables
- Lookup LLM credentials when tool credential_id is null
- Fall back to provider-level credential if no model-specific credential
- Extract model_provider/model_name from process_data (LLM nodes store
model info there, not in execution_metadata)
- Add invoke_from to node execution trace metadata dict
- Add credential_id to node execution trace metadata dict
- Add conversation_id to metadata after message_id lookup
- Add tool_name to tool_info dict in tool node
- Extract model_provider/model_name from process_data (LLM nodes store
model info there, not in execution_metadata)
- Add invoke_from to node execution trace metadata dict
- Add credential_id to node execution trace metadata dict
- Add conversation_id to metadata after message_id lookup
- Add tool_name to tool_info dict in tool node
Fixes Celery worker error where process_enterprise_telemetry task
was unregistered despite being dispatched from the app.
Added conditional import when ENTERPRISE_TELEMETRY_ENABLED=true
to ensure the task is available in the worker process.
Resolves: KeyError 'tasks.enterprise_telemetry_task.process_enterprise_telemetry'
- Add resolved_parent_context property to BaseTraceInfo for reusable parent context extraction
- Refactor enterprise_trace.py to use property instead of duplicated dict plucking (~19 lines eliminated)
- Fix UUID validation in exporter.py with specific error logging for invalid trace correlation IDs
- Add error isolation in event_handlers.py to prevent telemetry failures from breaking user operations
- Replace pickle-based payload_fallback with JSON storage rehydration for security
- Update TelemetryEnvelope to use Pydantic v2 ConfigDict with extra='forbid'
- Update tests to reflect contract changes and new error handling behavior