dify/api
QuantumGhost 349c3cf7b8
feat(api): Add image multimodal support for LLMNode (#17372)
Enhance `LLMNode` with multimodal capability, introducing support for
image outputs.

This implementation extracts base64-encoded images from LLM responses,
saves them to the storage service, and records the file metadata in the
`ToolFile` table. In conversations, these images are rendered as
markdown-based inline images.
Additionally, the images are included in the LLMNode's output as
file variables, enabling subsequent nodes in the workflow to utilize them.
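
The flow described above (decode the base64 payload, store the bytes, record metadata, render a markdown image) could look roughly like the sketch below. It is not the code from this commit: `SavedImage`, `save_base64_image`, `to_markdown`, the dict used as storage, and the `/files` URL prefix are all hypothetical stand-ins for the real storage service and the `ToolFile` table.

```python
import base64
import hashlib
from dataclasses import dataclass


@dataclass
class SavedImage:
    """Hypothetical metadata record; stands in for a row in a file table."""
    key: str
    mime_type: str
    size: int


def save_base64_image(data_b64: str, mime_type: str, storage: dict[str, bytes]) -> SavedImage:
    """Decode a base64 image payload and store the bytes under a content-derived key."""
    raw = base64.b64decode(data_b64, validate=True)
    extension = mime_type.split("/")[-1]
    key = f"{hashlib.sha256(raw).hexdigest()[:16]}.{extension}"
    storage[key] = raw  # a plain dict stands in for the real storage service
    return SavedImage(key=key, mime_type=mime_type, size=len(raw))


def to_markdown(image: SavedImage, base_url: str = "/files") -> str:
    """Render the stored image as a markdown inline image (URL prefix is assumed)."""
    return f"![image]({base_url}/{image.key})"
```

The returned `SavedImage` plays the role of the file variable that later workflow nodes would receive.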

To integrate file outputs into workflows, adjustments to the frontend code
are necessary.

For multimodal output functionality, updates to related model configurations
are required. Currently, this capability has been applied exclusively to
Google's Gemini models.

Close #15814.

Signed-off-by: -LAN- <laipz8200@outlook.com>
Co-authored-by: -LAN- <laipz8200@outlook.com>
2025-04-30 17:28:02 +08:00
.idea fix nltk averaged_perceptron_tagger download and fix score limit is none (#7582) 2024-08-26 15:14:05 +08:00
.vscode feat/enhance the multi-modal support (#8818) 2024-10-21 10:43:49 +08:00
configs feat: add AWS Managed IAM auth for OpenSearch vector DB (#18963) 2025-04-29 15:10:08 +08:00
constants feat(api): Add image multimodal support for LLMNode (#17372) 2025-04-30 17:28:02 +08:00
contexts feat: Add caching mechanism for plugin model schemas (#14898) 2025-03-04 18:02:06 +08:00
controllers feat(api): Add image multimodal support for LLMNode (#17372) 2025-04-30 17:28:02 +08:00
core feat(api): Add image multimodal support for LLMNode (#17372) 2025-04-30 17:28:02 +08:00
docker add MAX_TASK_PRE_CHILD for celery (#18985) 2025-04-28 17:06:00 +08:00
events Remove dead code (#17899) 2025-04-11 20:33:52 +08:00
extensions [Observability][Bugfix] Fix expected an instance of Token, got None error in OpenTelemetry (#18934) 2025-04-28 10:31:13 +08:00
factories refactor: rename plugin manager to plugin client and rename path from manager to impl (#18876) 2025-04-27 14:22:25 +08:00
fields Resolves #18536 Retreive conversation variables (#18581) 2025-04-25 11:52:25 +08:00
libs Update login.py (#15320) 2025-03-10 09:49:14 +08:00
migrations Enhance Code Consistency Across Repository with `.editorconfig` (#19023) 2025-04-29 18:04:33 +08:00
models feat(api): Add image multimodal support for LLMNode (#17372) 2025-04-30 17:28:02 +08:00
schedule Fix function's name mismatch (#16681) 2025-03-25 10:25:15 +08:00
services fix(api): resolve external knowledge API error due to excessive URL validation (#19003) 2025-04-29 22:32:38 +08:00
tasks refactor: Refactors repository imports structure (#18901) 2025-04-27 17:29:03 +08:00
templates Enhance Code Consistency Across Repository with `.editorconfig` (#19023) 2025-04-29 18:04:33 +08:00
tests feat(api): Add image multimodal support for LLMNode (#17372) 2025-04-30 17:28:02 +08:00
.dockerignore Enhance Code Consistency Across Repository with `.editorconfig` (#19023) 2025-04-29 18:04:33 +08:00
.env.example [Lindorm VDB] Add the QUERY_TIMEOUT parameter to force the search query to fail. (#18613) 2025-04-25 09:42:58 +08:00
.ruff.toml chore(api): enhance ruff rules to disallow dangerous functions and modules (#16461) 2025-03-21 17:49:35 +08:00
Dockerfile build: introduce uv as Python package manager (#16317) 2025-04-15 16:16:49 +08:00
README.md chore: merge lint dependency group into dev group of python packages (#18088) 2025-04-15 20:50:06 +08:00
app.py fix(app.py): if condition (#12314) 2025-01-03 01:36:23 +08:00
app_factory.py [Observability][Bugfix] Fix expected an instance of Token, got None error in OpenTelemetry (#18934) 2025-04-28 10:31:13 +08:00
commands.py Enhance Code Consistency Across Repository with `.editorconfig` (#19023) 2025-04-29 18:04:33 +08:00
dify_app.py refactor: assembling the app features in modular way (#9129) 2024-11-30 23:05:22 +08:00
mypy.ini Remove the useless excluded item in mypy.ini (#16777) 2025-03-26 09:02:45 +08:00
pyproject.toml immediately return initialed tiktokenizer instance and remove dead code in usage of tiktokenizer (#17957) 2025-04-30 16:07:20 +08:00
pytest.ini [Unit Test] Generate coverage number for UT (#18106) 2025-04-16 11:55:37 +08:00
uv.lock immediately return initialed tiktokenizer instance and remove dead code in usage of tiktokenizer (#17957) 2025-04-30 16:07:20 +08:00

README.md

Dify Backend API

Usage

[!IMPORTANT]

In the v1.3.0 release, uv replaced Poetry as the package manager for the Dify API backend service.

  1. Start the docker-compose stack

    The backend requires some middleware, including PostgreSQL, Redis, and Weaviate, which can be started together using docker-compose.

    cd ../docker
    cp middleware.env.example middleware.env
    # change the profile to another vector database if you are not using Weaviate
    docker compose -f docker-compose.middleware.yaml --profile weaviate -p dify up -d
    cd ../api
    
  2. Copy .env.example to .env

    cp .env.example .env 
    
  3. Generate a SECRET_KEY in the .env file.

    bash for Linux

    sed -i "/^SECRET_KEY=/c\SECRET_KEY=$(openssl rand -base64 42)" .env
    

    bash for Mac

    secret_key=$(openssl rand -base64 42)
    sed -i '' "/^SECRET_KEY=/c\\
    SECRET_KEY=${secret_key}" .env
    
  4. Create environment.

    The Dify API service uses uv to manage dependencies. First, install the uv package manager if you don't already have it.

    pip install uv
    # Or on macOS
    brew install uv
    
  5. Install dependencies

    uv sync --dev
    
  6. Run migrations

    Before the first launch, migrate the database to the latest version.

    uv run flask db upgrade
    
  7. Start backend

    uv run flask run --host 0.0.0.0 --port=5001 --debug
    
  8. Start the Dify web service.

  9. Set up your application by visiting http://localhost:3000.

  10. If you need to handle and debug async tasks (e.g. dataset importing and document indexing), start the worker service.

    uv run celery -A app.celery worker -P gevent -c 1 --loglevel INFO -Q dataset,generation,mail,ops_trace,app_deletion
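
Step 3 needs different sed invocations on Linux and macOS. As a portable alternative, the same edit can be sketched in Python using only the standard library. The helper name `set_secret_key` is hypothetical and not part of the Dify codebase. Unlike the sed commands, this version appends a `SECRET_KEY=` line if none exists.

```python
import base64
import secrets
from pathlib import Path


def set_secret_key(env_path: Path) -> str:
    """Replace the SECRET_KEY line in a dotenv-style file with a fresh random key."""
    # 42 random bytes, base64-encoded, mirroring `openssl rand -base64 42`.
    key = base64.b64encode(secrets.token_bytes(42)).decode("ascii")
    lines = env_path.read_text().splitlines()
    replaced = False
    for i, line in enumerate(lines):
        if line.startswith("SECRET_KEY="):
            lines[i] = f"SECRET_KEY={key}"
            replaced = True
    if not replaced:
        # Assumption: appending is acceptable when the line is missing.
        lines.append(f"SECRET_KEY={key}")
    env_path.write_text("\n".join(lines) + "\n")
    return key
```

From the api directory, calling `set_secret_key(Path(".env"))` after step 2 would update the copied file in place.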

Testing

  1. Install dependencies for both the backend and the test environment

    uv sync --dev
    
  2. Run the tests locally, using the mocked system environment variables defined in the tool.pytest_env section of pyproject.toml

    uv run -P api bash dev/pytest/pytest_all_tests.sh