+
+Dify is an open-source LLM app development platform. Its intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production. Here's a list of its core features:
+
+
+**1. Workflow**:
+  Build and test powerful AI workflows on a visual canvas, leveraging all of the following features and beyond.
+
+
+ https://github.com/langgenius/dify/assets/13230914/356df23e-1604-483d-80a6-9517ece318aa
+
+
+
+**2. Comprehensive model support**:
+  Seamless integration with hundreds of proprietary and open-source LLMs from dozens of providers and self-hosted solutions, covering GPT, Mistral, Llama3, and any OpenAI-API-compatible models. A full list of supported providers can be found [here](https://docs.dify.ai/getting-started/readme/model-providers).
+
+
+
+
+**3. Prompt IDE**:
+  Intuitive interface for crafting prompts, comparing model performance, and adding additional features such as text-to-speech to a chat-based app.
+
+**4. RAG Pipeline**:
+  Extensive RAG capabilities that cover everything from document ingestion to retrieval, with out-of-box support for text extraction from PDFs, PPTs, and other common document formats.
+
+**5. Agent capabilities**:
+  You can define agents based on LLM Function Calling or ReAct, and add pre-built or custom tools for the agent. Dify provides 50+ built-in tools for AI agents, such as Google Search, DALL·E, Stable Diffusion and WolframAlpha.
+
+**6. LLMOps**:
+  Monitor and analyze application logs and performance over time. You can continuously improve prompts, datasets, and models based on production data and annotations.
+
+**7. Backend as a Service**:
+  All of Dify's offerings come with corresponding APIs, so you can effortlessly integrate Dify into your own business logic; a minimal call sketch follows.
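+
+For instance, a published chat app can be driven entirely over HTTP. Here is a minimal sketch, assuming an app API key stored in `DIFY_API_KEY` and the hosted endpoint (self-hosted installs serve the same routes under your own domain):
+
+```bash
+# hypothetical values: substitute your own API key and query
+curl -X POST 'https://api.dify.ai/v1/chat-messages' \
+  --header "Authorization: Bearer $DIFY_API_KEY" \
+  --header 'Content-Type: application/json' \
+  --data-raw '{
+    "inputs": {},
+    "query": "What can you do?",
+    "response_mode": "blocking",
+    "user": "demo-user"
+  }'
+```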
+
+
+## Feature Comparison
+
+| Feature | Dify.AI | LangChain | Flowise | OpenAI Assistants API |
+| --- | --- | --- | --- | --- |
+| Programming approach | API + app-oriented | Python code | App-oriented | API-oriented |
+| Supported LLMs | Rich variety | Rich variety | Rich variety | OpenAI only |
+| RAG engine | ✅ | ✅ | ✅ | ✅ |
+| Agent | ✅ | ✅ | ❌ | ✅ |
+| Workflow | ✅ | ❌ | ✅ | ❌ |
+| Observability | ✅ | ✅ | ❌ | ❌ |
+| Enterprise features (SSO/access control) | ✅ | ❌ | ❌ | ❌ |
+| Local deployment | ✅ | ✅ | ✅ | ❌ |
+
+## Using Dify
+
+- **Cloud**
+We host a [Dify Cloud](https://dify.ai) service for anyone to try with zero setup. It provides all the capabilities of the self-deployed version, and includes 200 free GPT-4 calls in the sandbox plan.
+
+- **Self-hosting Dify Community Edition**
+Quickly get Dify running in your environment with this [starter guide](#quick-start).
+Use our [documentation](https://docs.dify.ai) for further references and more in-depth instructions.
+
+- **Dify for enterprise / organizations**
+We provide additional enterprise-centric features. [Log your questions for us through this chatbot](https://udify.app/chat/22L1zSxg6yW1cWQg) or [send us an email](mailto:business@dify.ai?subject=[GitHub]Business%20License%20Inquiry) to discuss enterprise needs.
+  > For startups and small businesses using AWS, check out [Dify Premium on AWS Marketplace](https://aws.amazon.com/marketplace/pp/prodview-t22mebxzwjhu6) and deploy it to your own AWS VPC with one click. It's an affordable AMI offering with the option to create apps with custom logo and branding.
+
+
+## Staying ahead
+
+Star Dify on GitHub and be instantly notified of new releases.
+
+
+
+
+
+## Quick start
+> Before installing Dify, make sure your machine meets the following minimum system requirements:
+>
+>- CPU >= 2 cores
+>- RAM >= 4 GiB
+
+
+
+The easiest way to start the Dify server is to run our [docker-compose.yml](docker/docker-compose.yaml) file. Before running the installation command, make sure that [Docker](https://docs.docker.com/get-docker/) and [Docker Compose](https://docs.docker.com/compose/install/) are installed on your machine:
+
+```bash
+cd docker
+cp .env.example .env
+docker compose up -d
+```
+
+After running, you can access the Dify dashboard in your browser at [http://localhost/install](http://localhost/install) and start the initialization process.
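+
+To sanity-check the deployment before opening the browser, you can inspect the containers and probe the endpoint. A minimal sketch, assuming the default port mapping:
+
+```bash
+cd docker
+docker compose ps    # every service should report a running/healthy state
+curl -sSf -o /dev/null http://localhost/install && echo "Dify console is reachable"
+```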
+
+> If you'd like to contribute to Dify or do additional development, refer to our [guide to deploying from source code](https://docs.dify.ai/getting-started/install-self-hosted/local-source-code).
+
+## Next steps
+
+If you need to customize the configuration, please refer to the comments in our [.env.example](docker/.env.example) file and update the corresponding values in your `.env` file. Additionally, you might need to make adjustments to the `docker-compose.yaml` file itself, such as changing image versions, port mappings, or volume mounts, based on your specific deployment environment and requirements. After making any changes, please re-run `docker compose up -d`. You can find the full list of available environment variables [here](https://docs.dify.ai/getting-started/install-self-hosted/environments).
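+
+For example, moving the web console off port 80 might look like the sketch below; the variable name `EXPOSE_NGINX_PORT` is taken from `.env.example`, so verify it against your version before relying on it:
+
+```bash
+cd docker
+# expose the console on 8080 instead of the default 80
+sed -i 's/^EXPOSE_NGINX_PORT=.*/EXPOSE_NGINX_PORT=8080/' .env
+docker compose up -d    # recreate the affected containers with the new value
+```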
+
+If you'd like to configure a highly-available setup, there are community-contributed [Helm Charts](https://helm.sh/) and YAML files which allow Dify to be deployed on Kubernetes; a minimal install sketch follows the list below.
+
+- [Helm Chart by @LeoQuote](https://github.com/douban/charts/tree/master/charts/dify)
+- [Helm Chart by @BorisPolonsky](https://github.com/BorisPolonsky/dify-helm)
+- [YAML file by @Winson-030](https://github.com/Winson-030/dify-kubernetes)
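+
+As a rough sketch of a chart-based install (the repository URL, release name, and namespace here are assumptions; follow each chart's own README for the authoritative steps):
+
+```bash
+helm repo add douban https://douban.github.io/charts/
+helm repo update
+helm install dify douban/dify --namespace dify --create-namespace
+```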
+
+#### Using Terraform for deployment
+
+Deploy Dify to a cloud platform with a single click using [terraform](https://www.terraform.io/); a minimal usage sketch follows the provider lists below.
+
+##### Azure Global
+- [Azure Terraform by @nikawang](https://github.com/nikawang/dify-azure-terraform)
+
+##### Google Cloud
+- [Google Cloud Terraform by @sotazum](https://github.com/DeNA/dify-google-cloud-terraform)
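+
+A typical flow with one of these modules might look like the following sketch (the module layout and required variables are assumptions; consult each repository's README):
+
+```bash
+git clone https://github.com/nikawang/dify-azure-terraform
+cd dify-azure-terraform
+terraform init     # download providers and modules
+terraform plan     # review the resources to be created
+terraform apply    # provision Dify on Azure
+```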
+
+## Contributing
+
+For those who'd like to contribute code, see our [Contribution Guide](https://github.com/langgenius/dify/blob/main/CONTRIBUTING.md).
+At the same time, please consider supporting Dify by sharing it on social media and at events and conferences.
+
+> We are looking for contributors to help translate Dify into languages other than Mandarin and English. If you are interested in helping, please see the [i18n README](https://github.com/langgenius/dify/blob/main/web/i18n/README.md) for more information, and leave us a comment in the `global-users` channel of our [Discord Community Server](https://discord.gg/8Tpq4AcN9c).
+
+**Contributors**
+
+
+
+
+
+## Community & contact
+
+* [GitHub Discussions](https://github.com/langgenius/dify/discussions). Best for: sharing feedback and asking questions.
+* [GitHub Issues](https://github.com/langgenius/dify/issues). Best for: bugs you encounter using Dify.AI, and feature proposals. See our [Contribution Guide](https://github.com/langgenius/dify/blob/main/CONTRIBUTING.md).
+* [Discord](https://discord.gg/FngNHpbcY7). Best for: sharing your applications and hanging out with the community.
+* [X(Twitter)](https://twitter.com/dify_ai). Best for: sharing your applications and hanging out with the community.
+
+## Star history
+
+[Star History Chart](https://star-history.com/#langgenius/dify&Date)
+
+## Security disclosure
+
+To protect your privacy, please avoid posting security issues on GitHub. Instead, send your questions to security@dify.ai and we will provide you with a more detailed answer.
+
+## License
+
+This repository is available under the [Dify Open Source License](LICENSE), which is essentially Apache 2.0 with a few additional restrictions.
\ No newline at end of file
diff --git a/api/.env.example b/api/.env.example
index 184a811c51..79d6ffdf6a 100644
--- a/api/.env.example
+++ b/api/.env.example
@@ -120,7 +120,7 @@ SUPABASE_URL=your-server-url
WEB_API_CORS_ALLOW_ORIGINS=http://127.0.0.1:3000,*
CONSOLE_CORS_ALLOW_ORIGINS=http://127.0.0.1:3000,*
-# Vector database configuration, support: weaviate, qdrant, milvus, myscale, relyt, pgvecto_rs, pgvector, pgvector, chroma, opensearch, tidb_vector, vikingdb, upstash
+# Vector database configuration, support: weaviate, qdrant, milvus, myscale, relyt, pgvecto_rs, pgvector, chroma, opensearch, tidb_vector, couchbase, vikingdb, upstash
VECTOR_STORE=weaviate
# Weaviate configuration
@@ -136,6 +136,13 @@ QDRANT_CLIENT_TIMEOUT=20
QDRANT_GRPC_ENABLED=false
QDRANT_GRPC_PORT=6334
+# Couchbase configuration
+COUCHBASE_CONNECTION_STRING=127.0.0.1
+COUCHBASE_USER=Administrator
+COUCHBASE_PASSWORD=password
+COUCHBASE_BUCKET_NAME=Embeddings
+COUCHBASE_SCOPE_NAME=_default
+
# Milvus configuration
MILVUS_URI=http://127.0.0.1:19530
MILVUS_TOKEN=
@@ -195,6 +202,20 @@ TIDB_VECTOR_USER=xxx.root
TIDB_VECTOR_PASSWORD=xxxxxx
TIDB_VECTOR_DATABASE=dify
+# TiDB on Qdrant configuration
+TIDB_ON_QDRANT_URL=http://127.0.0.1
+TIDB_ON_QDRANT_API_KEY=dify
+TIDB_ON_QDRANT_CLIENT_TIMEOUT=20
+TIDB_ON_QDRANT_GRPC_ENABLED=false
+TIDB_ON_QDRANT_GRPC_PORT=6334
+TIDB_PUBLIC_KEY=dify
+TIDB_PRIVATE_KEY=dify
+TIDB_API_URL=http://127.0.0.1
+TIDB_IAM_API_URL=http://127.0.0.1
+TIDB_REGION=regions/aws-us-east-1
+TIDB_PROJECT_ID=dify
+TIDB_SPEND_LIMIT=100
+
# Chroma configuration
CHROMA_HOST=127.0.0.1
CHROMA_PORT=8000
@@ -242,6 +263,14 @@ VIKINGDB_SCHEMA=http
VIKINGDB_CONNECTION_TIMEOUT=30
VIKINGDB_SOCKET_TIMEOUT=30
+# OceanBase Vector configuration
+OCEANBASE_VECTOR_HOST=127.0.0.1
+OCEANBASE_VECTOR_PORT=2881
+OCEANBASE_VECTOR_USER=root@test
+OCEANBASE_VECTOR_PASSWORD=
+OCEANBASE_VECTOR_DATABASE=test
+OCEANBASE_MEMORY_LIMIT=6G
+
# Upload configuration
UPLOAD_FILE_SIZE_LIMIT=15
UPLOAD_FILE_BATCH_LIMIT=5
diff --git a/api/commands.py b/api/commands.py
index 720a4447da..10122ceb3d 100644
--- a/api/commands.py
+++ b/api/commands.py
@@ -278,6 +278,8 @@ def migrate_knowledge_vector_database():
VectorType.BAIDU,
VectorType.VIKINGDB,
VectorType.UPSTASH,
+ VectorType.COUCHBASE,
+ VectorType.OCEANBASE,
}
page = 1
while True:
diff --git a/api/configs/feature/__init__.py b/api/configs/feature/__init__.py
index 0fa926038d..a8a4170f67 100644
--- a/api/configs/feature/__init__.py
+++ b/api/configs/feature/__init__.py
@@ -10,6 +10,7 @@ from pydantic import (
PositiveInt,
computed_field,
)
+from pydantic_extra_types.timezone_name import TimeZoneName
from pydantic_settings import BaseSettings
from configs.feature.hosted_service import HostedServiceConfig
@@ -339,8 +340,9 @@ class LoggingConfig(BaseSettings):
default=None,
)
- LOG_TZ: Optional[str] = Field(
- description="Timezone for log timestamps (e.g., 'America/New_York')",
+ LOG_TZ: Optional[TimeZoneName] = Field(
+        description="Timezone for log timestamps. Allowed values are IANA Time Zone Database names"
+        " (e.g., 'America/New_York')",
default=None,
)
diff --git a/api/configs/middleware/__init__.py b/api/configs/middleware/__init__.py
index 3d68e29d0e..38bb804613 100644
--- a/api/configs/middleware/__init__.py
+++ b/api/configs/middleware/__init__.py
@@ -17,9 +17,11 @@ from configs.middleware.storage.tencent_cos_storage_config import TencentCloudCO
from configs.middleware.storage.volcengine_tos_storage_config import VolcengineTOSStorageConfig
from configs.middleware.vdb.analyticdb_config import AnalyticdbConfig
from configs.middleware.vdb.chroma_config import ChromaConfig
+from configs.middleware.vdb.couchbase_config import CouchbaseConfig
from configs.middleware.vdb.elasticsearch_config import ElasticsearchConfig
from configs.middleware.vdb.milvus_config import MilvusConfig
from configs.middleware.vdb.myscale_config import MyScaleConfig
+from configs.middleware.vdb.oceanbase_config import OceanBaseVectorConfig
from configs.middleware.vdb.opensearch_config import OpenSearchConfig
from configs.middleware.vdb.oracle_config import OracleConfig
from configs.middleware.vdb.pgvector_config import PGVectorConfig
@@ -251,9 +253,11 @@ class MiddlewareConfig(
TiDBVectorConfig,
WeaviateConfig,
ElasticsearchConfig,
+ CouchbaseConfig,
InternalTestConfig,
VikingDBConfig,
UpstashConfig,
TidbOnQdrantConfig,
+ OceanBaseVectorConfig,
):
pass
diff --git a/api/configs/middleware/vdb/couchbase_config.py b/api/configs/middleware/vdb/couchbase_config.py
new file mode 100644
index 0000000000..391089ec6e
--- /dev/null
+++ b/api/configs/middleware/vdb/couchbase_config.py
@@ -0,0 +1,34 @@
+from typing import Optional
+
+from pydantic import BaseModel, Field
+
+
+class CouchbaseConfig(BaseModel):
+ """
+ Couchbase configs
+ """
+
+ COUCHBASE_CONNECTION_STRING: Optional[str] = Field(
+ description="COUCHBASE connection string",
+ default=None,
+ )
+
+ COUCHBASE_USER: Optional[str] = Field(
+ description="COUCHBASE user",
+ default=None,
+ )
+
+ COUCHBASE_PASSWORD: Optional[str] = Field(
+ description="COUCHBASE password",
+ default=None,
+ )
+
+ COUCHBASE_BUCKET_NAME: Optional[str] = Field(
+ description="COUCHBASE bucket name",
+ default=None,
+ )
+
+ COUCHBASE_SCOPE_NAME: Optional[str] = Field(
+ description="COUCHBASE scope name",
+ default=None,
+ )
diff --git a/api/configs/middleware/vdb/oceanbase_config.py b/api/configs/middleware/vdb/oceanbase_config.py
new file mode 100644
index 0000000000..87427af960
--- /dev/null
+++ b/api/configs/middleware/vdb/oceanbase_config.py
@@ -0,0 +1,35 @@
+from typing import Optional
+
+from pydantic import Field, PositiveInt
+from pydantic_settings import BaseSettings
+
+
+class OceanBaseVectorConfig(BaseSettings):
+ """
+ Configuration settings for OceanBase Vector database
+ """
+
+ OCEANBASE_VECTOR_HOST: Optional[str] = Field(
+ description="Hostname or IP address of the OceanBase Vector server (e.g. 'localhost')",
+ default=None,
+ )
+
+ OCEANBASE_VECTOR_PORT: Optional[PositiveInt] = Field(
+ description="Port number on which the OceanBase Vector server is listening (default is 2881)",
+ default=2881,
+ )
+
+ OCEANBASE_VECTOR_USER: Optional[str] = Field(
+ description="Username for authenticating with the OceanBase Vector database",
+ default=None,
+ )
+
+ OCEANBASE_VECTOR_PASSWORD: Optional[str] = Field(
+ description="Password for authenticating with the OceanBase Vector database",
+ default=None,
+ )
+
+ OCEANBASE_VECTOR_DATABASE: Optional[str] = Field(
+ description="Name of the OceanBase Vector database to connect to",
+ default=None,
+ )
diff --git a/api/configs/middleware/vdb/tidb_on_qdrant_config.py b/api/configs/middleware/vdb/tidb_on_qdrant_config.py
index 98268798ef..d2625af264 100644
--- a/api/configs/middleware/vdb/tidb_on_qdrant_config.py
+++ b/api/configs/middleware/vdb/tidb_on_qdrant_config.py
@@ -63,3 +63,8 @@ class TidbOnQdrantConfig(BaseSettings):
description="Tidb project id",
default=None,
)
+
+ TIDB_SPEND_LIMIT: Optional[int] = Field(
+ description="Tidb spend limit",
+ default=100,
+ )
diff --git a/api/configs/packaging/__init__.py b/api/configs/packaging/__init__.py
index 389a64f53e..3dc87e3058 100644
--- a/api/configs/packaging/__init__.py
+++ b/api/configs/packaging/__init__.py
@@ -9,7 +9,7 @@ class PackagingInfo(BaseSettings):
CURRENT_VERSION: str = Field(
description="Dify version",
- default="0.10.1",
+ default="0.10.2",
)
COMMIT_SHA: str = Field(
diff --git a/api/controllers/console/datasets/datasets.py b/api/controllers/console/datasets/datasets.py
index c8022efb8b..4f4d186edd 100644
--- a/api/controllers/console/datasets/datasets.py
+++ b/api/controllers/console/datasets/datasets.py
@@ -628,6 +628,7 @@ class DatasetRetrievalSettingApi(Resource):
| VectorType.BAIDU
| VectorType.VIKINGDB
| VectorType.UPSTASH
+ | VectorType.OCEANBASE
):
return {"retrieval_method": [RetrievalMethod.SEMANTIC_SEARCH.value]}
case (
@@ -640,6 +641,7 @@ class DatasetRetrievalSettingApi(Resource):
| VectorType.ELASTICSEARCH
| VectorType.PGVECTOR
| VectorType.TIDB_ON_QDRANT
+ | VectorType.COUCHBASE
):
return {
"retrieval_method": [
@@ -668,6 +670,7 @@ class DatasetRetrievalSettingMockApi(Resource):
| VectorType.BAIDU
| VectorType.VIKINGDB
| VectorType.UPSTASH
+ | VectorType.OCEANBASE
):
return {"retrieval_method": [RetrievalMethod.SEMANTIC_SEARCH.value]}
case (
@@ -678,6 +681,7 @@ class DatasetRetrievalSettingMockApi(Resource):
| VectorType.MYSCALE
| VectorType.ORACLE
| VectorType.ELASTICSEARCH
+ | VectorType.COUCHBASE
| VectorType.PGVECTOR
):
return {
diff --git a/api/controllers/service_api/dataset/document.py b/api/controllers/service_api/dataset/document.py
index fb48a6c76c..0a0a38c4c6 100644
--- a/api/controllers/service_api/dataset/document.py
+++ b/api/controllers/service_api/dataset/document.py
@@ -230,7 +230,7 @@ class DocumentUpdateByFileApi(DatasetApiResource):
except ProviderTokenNotInitError as ex:
raise ProviderNotInitializeError(ex.description)
document = documents[0]
- documents_and_batch_fields = {"document": marshal(document, document_fields), "batch": batch}
+ documents_and_batch_fields = {"document": marshal(document, document_fields), "batch": document.batch}
return documents_and_batch_fields, 200
diff --git a/api/core/agent/base_agent_runner.py b/api/core/agent/base_agent_runner.py
index 514dcfbd68..507455c176 100644
--- a/api/core/agent/base_agent_runner.py
+++ b/api/core/agent/base_agent_runner.py
@@ -165,6 +165,12 @@ class BaseAgentRunner(AppRunner):
continue
parameter_type = parameter.type.as_normal_type()
+ if parameter.type in {
+ ToolParameter.ToolParameterType.SYSTEM_FILES,
+ ToolParameter.ToolParameterType.FILE,
+ ToolParameter.ToolParameterType.FILES,
+ }:
+ continue
enum = []
if parameter.type == ToolParameter.ToolParameterType.SELECT:
enum = [option.value for option in parameter.options]
@@ -250,6 +256,12 @@ class BaseAgentRunner(AppRunner):
continue
parameter_type = parameter.type.as_normal_type()
+ if parameter.type in {
+ ToolParameter.ToolParameterType.SYSTEM_FILES,
+ ToolParameter.ToolParameterType.FILE,
+ ToolParameter.ToolParameterType.FILES,
+ }:
+ continue
enum = []
if parameter.type == ToolParameter.ToolParameterType.SELECT:
enum = [option.value for option in parameter.options]
diff --git a/api/core/file/file_manager.py b/api/core/file/file_manager.py
index 0c6ce8ce75..b69d7a74c0 100644
--- a/api/core/file/file_manager.py
+++ b/api/core/file/file_manager.py
@@ -76,8 +76,16 @@ def to_prompt_message_content(f: File, /):
def download(f: File, /):
- upload_file = file_repository.get_upload_file(session=db.session(), file=f)
- return _download_file_content(upload_file.key)
+ if f.transfer_method == FileTransferMethod.TOOL_FILE:
+ tool_file = file_repository.get_tool_file(session=db.session(), file=f)
+ return _download_file_content(tool_file.file_key)
+ elif f.transfer_method == FileTransferMethod.LOCAL_FILE:
+ upload_file = file_repository.get_upload_file(session=db.session(), file=f)
+ return _download_file_content(upload_file.key)
+ # remote file
+ response = ssrf_proxy.get(f.remote_url, follow_redirects=True)
+ response.raise_for_status()
+ return response.content
def _download_file_content(path: str, /):
diff --git a/api/core/model_runtime/model_providers/gitee_ai/_assets/Gitee-AI-Logo-full.svg b/api/core/model_runtime/model_providers/gitee_ai/_assets/Gitee-AI-Logo-full.svg
new file mode 100644
index 0000000000..f9738b585b
--- /dev/null
+++ b/api/core/model_runtime/model_providers/gitee_ai/_assets/Gitee-AI-Logo-full.svg
@@ -0,0 +1,6 @@
+
diff --git a/api/core/model_runtime/model_providers/gitee_ai/_assets/Gitee-AI-Logo.svg b/api/core/model_runtime/model_providers/gitee_ai/_assets/Gitee-AI-Logo.svg
new file mode 100644
index 0000000000..1f51187f19
--- /dev/null
+++ b/api/core/model_runtime/model_providers/gitee_ai/_assets/Gitee-AI-Logo.svg
@@ -0,0 +1,3 @@
+
diff --git a/api/core/model_runtime/model_providers/gitee_ai/_common.py b/api/core/model_runtime/model_providers/gitee_ai/_common.py
new file mode 100644
index 0000000000..0750f3b75d
--- /dev/null
+++ b/api/core/model_runtime/model_providers/gitee_ai/_common.py
@@ -0,0 +1,47 @@
+from dashscope.common.error import (
+ AuthenticationError,
+ InvalidParameter,
+ RequestFailure,
+ ServiceUnavailableError,
+ UnsupportedHTTPMethod,
+ UnsupportedModel,
+)
+
+from core.model_runtime.errors.invoke import (
+ InvokeAuthorizationError,
+ InvokeBadRequestError,
+ InvokeConnectionError,
+ InvokeError,
+ InvokeRateLimitError,
+ InvokeServerUnavailableError,
+)
+
+
+class _CommonGiteeAI:
+ @property
+ def _invoke_error_mapping(self) -> dict[type[InvokeError], list[type[Exception]]]:
+ """
+ Map model invoke error to unified error
+ The key is the error type thrown to the caller
+ The value is the error type thrown by the model,
+ which needs to be converted into a unified error type for the caller.
+
+ :return: Invoke error mapping
+ """
+ return {
+ InvokeConnectionError: [
+ RequestFailure,
+ ],
+ InvokeServerUnavailableError: [
+ ServiceUnavailableError,
+ ],
+ InvokeRateLimitError: [],
+ InvokeAuthorizationError: [
+ AuthenticationError,
+ ],
+ InvokeBadRequestError: [
+ InvalidParameter,
+ UnsupportedModel,
+ UnsupportedHTTPMethod,
+ ],
+ }
diff --git a/api/core/model_runtime/model_providers/gitee_ai/gitee_ai.py b/api/core/model_runtime/model_providers/gitee_ai/gitee_ai.py
new file mode 100644
index 0000000000..ca67594ce4
--- /dev/null
+++ b/api/core/model_runtime/model_providers/gitee_ai/gitee_ai.py
@@ -0,0 +1,25 @@
+import logging
+
+from core.model_runtime.entities.model_entities import ModelType
+from core.model_runtime.errors.validate import CredentialsValidateFailedError
+from core.model_runtime.model_providers.__base.model_provider import ModelProvider
+
+logger = logging.getLogger(__name__)
+
+
+class GiteeAIProvider(ModelProvider):
+ def validate_provider_credentials(self, credentials: dict) -> None:
+ """
+ Validate provider credentials
+ if validate failed, raise exception
+
+ :param credentials: provider credentials, credentials form defined in `provider_credential_schema`.
+ """
+ try:
+ model_instance = self.get_model_instance(ModelType.LLM)
+ model_instance.validate_credentials(model="Qwen2-7B-Instruct", credentials=credentials)
+ except CredentialsValidateFailedError as ex:
+ raise ex
+ except Exception as ex:
+ logger.exception(f"{self.get_provider_schema().provider} credentials validate failed")
+ raise ex
diff --git a/api/core/model_runtime/model_providers/gitee_ai/gitee_ai.yaml b/api/core/model_runtime/model_providers/gitee_ai/gitee_ai.yaml
new file mode 100644
index 0000000000..7f7d0f2e53
--- /dev/null
+++ b/api/core/model_runtime/model_providers/gitee_ai/gitee_ai.yaml
@@ -0,0 +1,35 @@
+provider: gitee_ai
+label:
+ en_US: Gitee AI
+ zh_Hans: Gitee AI
+description:
+ en_US: 快速体验大模型,领先探索 AI 开源世界
+ zh_Hans: 快速体验大模型,领先探索 AI 开源世界
+icon_small:
+ en_US: Gitee-AI-Logo.svg
+icon_large:
+ en_US: Gitee-AI-Logo-full.svg
+help:
+ title:
+ en_US: Get your token from Gitee AI
+ zh_Hans: 从 Gitee AI 获取 token
+ url:
+ en_US: https://ai.gitee.com/dashboard/settings/tokens
+supported_model_types:
+ - llm
+ - text-embedding
+ - rerank
+ - speech2text
+ - tts
+configurate_methods:
+ - predefined-model
+provider_credential_schema:
+ credential_form_schemas:
+ - variable: api_key
+ label:
+ en_US: API Key
+ type: secret-input
+ required: true
+ placeholder:
+ zh_Hans: 在此输入您的 API Key
+ en_US: Enter your API Key
diff --git a/api/core/model_runtime/model_providers/gitee_ai/llm/Qwen2-72B-Instruct.yaml b/api/core/model_runtime/model_providers/gitee_ai/llm/Qwen2-72B-Instruct.yaml
new file mode 100644
index 0000000000..0348438a75
--- /dev/null
+++ b/api/core/model_runtime/model_providers/gitee_ai/llm/Qwen2-72B-Instruct.yaml
@@ -0,0 +1,105 @@
+model: Qwen2-72B-Instruct
+label:
+ zh_Hans: Qwen2-72B-Instruct
+ en_US: Qwen2-72B-Instruct
+model_type: llm
+features:
+ - agent-thought
+model_properties:
+ mode: chat
+ context_size: 6400
+parameter_rules:
+ - name: stream
+ use_template: boolean
+ label:
+ en_US: "Stream"
+ zh_Hans: "流式"
+ type: boolean
+ default: true
+ required: true
+ help:
+ en_US: "Whether to return the results in batches through streaming. If set to true, the generated text will be pushed to the user in real time during the generation process."
+ zh_Hans: "是否通过流式分批返回结果。如果设置为 true,生成过程中实时地向用户推送每一部分生成的文本。"
+
+ - name: max_tokens
+ use_template: max_tokens
+ label:
+ en_US: "Max Tokens"
+ zh_Hans: "最大Token数"
+ type: int
+ default: 512
+ min: 1
+ required: true
+ help:
+ en_US: "The maximum number of tokens that can be generated by the model varies depending on the model."
+ zh_Hans: "模型可生成的最大 token 个数,不同模型上限不同。"
+
+ - name: temperature
+ use_template: temperature
+ label:
+ en_US: "Temperature"
+ zh_Hans: "采样温度"
+ type: float
+ default: 0.7
+ min: 0.0
+ max: 1.0
+ precision: 1
+ required: true
+ help:
+ en_US: "The randomness of the sampling temperature control output. The temperature value is within the range of [0.0, 1.0]. The higher the value, the more random and creative the output; the lower the value, the more stable it is. It is recommended to adjust either top_p or temperature parameters according to your needs to avoid adjusting both at the same time."
+ zh_Hans: "采样温度控制输出的随机性。温度值在 [0.0, 1.0] 范围内,值越高,输出越随机和创造性;值越低,输出越稳定。建议根据需求调整 top_p 或 temperature 参数,避免同时调整两者。"
+
+ - name: top_p
+ use_template: top_p
+ label:
+ en_US: "Top P"
+ zh_Hans: "Top P"
+ type: float
+ default: 0.7
+ min: 0.0
+ max: 1.0
+ precision: 1
+ required: true
+ help:
+ en_US: "The value range of the sampling method is [0.0, 1.0]. The top_p value determines that the model selects tokens from the top p% of candidate words with the highest probability; when top_p is 0, this parameter is invalid. It is recommended to adjust either top_p or temperature parameters according to your needs to avoid adjusting both at the same time."
+ zh_Hans: "采样方法的取值范围为 [0.0,1.0]。top_p 值确定模型从概率最高的前p%的候选词中选取 tokens;当 top_p 为 0 时,此参数无效。建议根据需求调整 top_p 或 temperature 参数,避免同时调整两者。"
+
+ - name: top_k
+ use_template: top_k
+ label:
+ en_US: "Top K"
+ zh_Hans: "Top K"
+ type: int
+ default: 50
+ min: 0
+ max: 100
+ required: true
+ help:
+ en_US: "The value range is [0,100], which limits the model to only select from the top k words with the highest probability when choosing the next word at each step. The larger the value, the more diverse text generation will be."
+ zh_Hans: "取值范围为 [0,100],限制模型在每一步选择下一个词时,只从概率最高的前 k 个词中选取。数值越大,文本生成越多样。"
+
+ - name: frequency_penalty
+ use_template: frequency_penalty
+ label:
+ en_US: "Frequency Penalty"
+ zh_Hans: "频率惩罚"
+ type: float
+ default: 0
+ min: -1.0
+ max: 1.0
+ precision: 1
+ required: false
+ help:
+ en_US: "Used to adjust the frequency of repeated content in automatically generated text. Positive numbers reduce repetition, while negative numbers increase repetition. After setting this parameter, if a word has already appeared in the text, the model will decrease the probability of choosing that word for subsequent generation."
+ zh_Hans: "用于调整自动生成文本中重复内容的频率。正数减少重复,负数增加重复。设置此参数后,如果一个词在文本中已经出现过,模型在后续生成中选择该词的概率会降低。"
+
+ - name: user
+ use_template: text
+ label:
+ en_US: "User"
+ zh_Hans: "用户"
+ type: string
+ required: false
+ help:
+ en_US: "Used to track and differentiate conversation requests from different users."
+ zh_Hans: "用于追踪和区分不同用户的对话请求。"
diff --git a/api/core/model_runtime/model_providers/gitee_ai/llm/Qwen2-7B-Instruct.yaml b/api/core/model_runtime/model_providers/gitee_ai/llm/Qwen2-7B-Instruct.yaml
new file mode 100644
index 0000000000..ba1ad788f5
--- /dev/null
+++ b/api/core/model_runtime/model_providers/gitee_ai/llm/Qwen2-7B-Instruct.yaml
@@ -0,0 +1,105 @@
+model: Qwen2-7B-Instruct
+label:
+ zh_Hans: Qwen2-7B-Instruct
+ en_US: Qwen2-7B-Instruct
+model_type: llm
+features:
+ - agent-thought
+model_properties:
+ mode: chat
+ context_size: 32768
+parameter_rules:
+ - name: stream
+ use_template: boolean
+ label:
+ en_US: "Stream"
+ zh_Hans: "流式"
+ type: boolean
+ default: true
+ required: true
+ help:
+ en_US: "Whether to return the results in batches through streaming. If set to true, the generated text will be pushed to the user in real time during the generation process."
+ zh_Hans: "是否通过流式分批返回结果。如果设置为 true,生成过程中实时地向用户推送每一部分生成的文本。"
+
+ - name: max_tokens
+ use_template: max_tokens
+ label:
+ en_US: "Max Tokens"
+ zh_Hans: "最大Token数"
+ type: int
+ default: 512
+ min: 1
+ required: true
+ help:
+ en_US: "The maximum number of tokens that can be generated by the model varies depending on the model."
+ zh_Hans: "模型可生成的最大 token 个数,不同模型上限不同。"
+
+ - name: temperature
+ use_template: temperature
+ label:
+ en_US: "Temperature"
+ zh_Hans: "采样温度"
+ type: float
+ default: 0.7
+ min: 0.0
+ max: 1.0
+ precision: 1
+ required: true
+ help:
+ en_US: "The randomness of the sampling temperature control output. The temperature value is within the range of [0.0, 1.0]. The higher the value, the more random and creative the output; the lower the value, the more stable it is. It is recommended to adjust either top_p or temperature parameters according to your needs to avoid adjusting both at the same time."
+ zh_Hans: "采样温度控制输出的随机性。温度值在 [0.0, 1.0] 范围内,值越高,输出越随机和创造性;值越低,输出越稳定。建议根据需求调整 top_p 或 temperature 参数,避免同时调整两者。"
+
+ - name: top_p
+ use_template: top_p
+ label:
+ en_US: "Top P"
+ zh_Hans: "Top P"
+ type: float
+ default: 0.7
+ min: 0.0
+ max: 1.0
+ precision: 1
+ required: true
+ help:
+ en_US: "The value range of the sampling method is [0.0, 1.0]. The top_p value determines that the model selects tokens from the top p% of candidate words with the highest probability; when top_p is 0, this parameter is invalid. It is recommended to adjust either top_p or temperature parameters according to your needs to avoid adjusting both at the same time."
+ zh_Hans: "采样方法的取值范围为 [0.0,1.0]。top_p 值确定模型从概率最高的前p%的候选词中选取 tokens;当 top_p 为 0 时,此参数无效。建议根据需求调整 top_p 或 temperature 参数,避免同时调整两者。"
+
+ - name: top_k
+ use_template: top_k
+ label:
+ en_US: "Top K"
+ zh_Hans: "Top K"
+ type: int
+ default: 50
+ min: 0
+ max: 100
+ required: true
+ help:
+ en_US: "The value range is [0,100], which limits the model to only select from the top k words with the highest probability when choosing the next word at each step. The larger the value, the more diverse text generation will be."
+ zh_Hans: "取值范围为 [0,100],限制模型在每一步选择下一个词时,只从概率最高的前 k 个词中选取。数值越大,文本生成越多样。"
+
+ - name: frequency_penalty
+ use_template: frequency_penalty
+ label:
+ en_US: "Frequency Penalty"
+ zh_Hans: "频率惩罚"
+ type: float
+ default: 0
+ min: -1.0
+ max: 1.0
+ precision: 1
+ required: false
+ help:
+ en_US: "Used to adjust the frequency of repeated content in automatically generated text. Positive numbers reduce repetition, while negative numbers increase repetition. After setting this parameter, if a word has already appeared in the text, the model will decrease the probability of choosing that word for subsequent generation."
+ zh_Hans: "用于调整自动生成文本中重复内容的频率。正数减少重复,负数增加重复。设置此参数后,如果一个词在文本中已经出现过,模型在后续生成中选择该词的概率会降低。"
+
+ - name: user
+ use_template: text
+ label:
+ en_US: "User"
+ zh_Hans: "用户"
+ type: string
+ required: false
+ help:
+ en_US: "Used to track and differentiate conversation requests from different users."
+ zh_Hans: "用于追踪和区分不同用户的对话请求。"
diff --git a/api/core/model_runtime/model_providers/gitee_ai/llm/Yi-1.5-34B-Chat.yaml b/api/core/model_runtime/model_providers/gitee_ai/llm/Yi-1.5-34B-Chat.yaml
new file mode 100644
index 0000000000..f7260c987b
--- /dev/null
+++ b/api/core/model_runtime/model_providers/gitee_ai/llm/Yi-1.5-34B-Chat.yaml
@@ -0,0 +1,105 @@
+model: Yi-1.5-34B-Chat
+label:
+ zh_Hans: Yi-1.5-34B-Chat
+ en_US: Yi-1.5-34B-Chat
+model_type: llm
+features:
+ - agent-thought
+model_properties:
+ mode: chat
+ context_size: 4096
+parameter_rules:
+ - name: stream
+ use_template: boolean
+ label:
+ en_US: "Stream"
+ zh_Hans: "流式"
+ type: boolean
+ default: true
+ required: true
+ help:
+ en_US: "Whether to return the results in batches through streaming. If set to true, the generated text will be pushed to the user in real time during the generation process."
+ zh_Hans: "是否通过流式分批返回结果。如果设置为 true,生成过程中实时地向用户推送每一部分生成的文本。"
+
+ - name: max_tokens
+ use_template: max_tokens
+ label:
+ en_US: "Max Tokens"
+ zh_Hans: "最大Token数"
+ type: int
+ default: 512
+ min: 1
+ required: true
+ help:
+ en_US: "The maximum number of tokens that can be generated by the model varies depending on the model."
+ zh_Hans: "模型可生成的最大 token 个数,不同模型上限不同。"
+
+ - name: temperature
+ use_template: temperature
+ label:
+ en_US: "Temperature"
+ zh_Hans: "采样温度"
+ type: float
+ default: 0.7
+ min: 0.0
+ max: 1.0
+ precision: 1
+ required: true
+ help:
+ en_US: "The randomness of the sampling temperature control output. The temperature value is within the range of [0.0, 1.0]. The higher the value, the more random and creative the output; the lower the value, the more stable it is. It is recommended to adjust either top_p or temperature parameters according to your needs to avoid adjusting both at the same time."
+ zh_Hans: "采样温度控制输出的随机性。温度值在 [0.0, 1.0] 范围内,值越高,输出越随机和创造性;值越低,输出越稳定。建议根据需求调整 top_p 或 temperature 参数,避免同时调整两者。"
+
+ - name: top_p
+ use_template: top_p
+ label:
+ en_US: "Top P"
+ zh_Hans: "Top P"
+ type: float
+ default: 0.7
+ min: 0.0
+ max: 1.0
+ precision: 1
+ required: true
+ help:
+ en_US: "The value range of the sampling method is [0.0, 1.0]. The top_p value determines that the model selects tokens from the top p% of candidate words with the highest probability; when top_p is 0, this parameter is invalid. It is recommended to adjust either top_p or temperature parameters according to your needs to avoid adjusting both at the same time."
+ zh_Hans: "采样方法的取值范围为 [0.0,1.0]。top_p 值确定模型从概率最高的前p%的候选词中选取 tokens;当 top_p 为 0 时,此参数无效。建议根据需求调整 top_p 或 temperature 参数,避免同时调整两者。"
+
+ - name: top_k
+ use_template: top_k
+ label:
+ en_US: "Top K"
+ zh_Hans: "Top K"
+ type: int
+ default: 50
+ min: 0
+ max: 100
+ required: true
+ help:
+ en_US: "The value range is [0,100], which limits the model to only select from the top k words with the highest probability when choosing the next word at each step. The larger the value, the more diverse text generation will be."
+ zh_Hans: "取值范围为 [0,100],限制模型在每一步选择下一个词时,只从概率最高的前 k 个词中选取。数值越大,文本生成越多样。"
+
+ - name: frequency_penalty
+ use_template: frequency_penalty
+ label:
+ en_US: "Frequency Penalty"
+ zh_Hans: "频率惩罚"
+ type: float
+ default: 0
+ min: -1.0
+ max: 1.0
+ precision: 1
+ required: false
+ help:
+ en_US: "Used to adjust the frequency of repeated content in automatically generated text. Positive numbers reduce repetition, while negative numbers increase repetition. After setting this parameter, if a word has already appeared in the text, the model will decrease the probability of choosing that word for subsequent generation."
+ zh_Hans: "用于调整自动生成文本中重复内容的频率。正数减少重复,负数增加重复。设置此参数后,如果一个词在文本中已经出现过,模型在后续生成中选择该词的概率会降低。"
+
+ - name: user
+ use_template: text
+ label:
+ en_US: "User"
+ zh_Hans: "用户"
+ type: string
+ required: false
+ help:
+ en_US: "Used to track and differentiate conversation requests from different users."
+ zh_Hans: "用于追踪和区分不同用户的对话请求。"
diff --git a/api/core/model_runtime/model_providers/gitee_ai/llm/_position.yaml b/api/core/model_runtime/model_providers/gitee_ai/llm/_position.yaml
new file mode 100644
index 0000000000..21f6120742
--- /dev/null
+++ b/api/core/model_runtime/model_providers/gitee_ai/llm/_position.yaml
@@ -0,0 +1,7 @@
+- Qwen2-7B-Instruct
+- Qwen2-72B-Instruct
+- Yi-1.5-34B-Chat
+- glm-4-9b-chat
+- deepseek-coder-33B-instruct-chat
+- deepseek-coder-33B-instruct-completions
+- codegeex4-all-9b
diff --git a/api/core/model_runtime/model_providers/gitee_ai/llm/codegeex4-all-9b.yaml b/api/core/model_runtime/model_providers/gitee_ai/llm/codegeex4-all-9b.yaml
new file mode 100644
index 0000000000..8632cd92ab
--- /dev/null
+++ b/api/core/model_runtime/model_providers/gitee_ai/llm/codegeex4-all-9b.yaml
@@ -0,0 +1,105 @@
+model: codegeex4-all-9b
+label:
+ zh_Hans: codegeex4-all-9b
+ en_US: codegeex4-all-9b
+model_type: llm
+features:
+ - agent-thought
+model_properties:
+ mode: chat
+ context_size: 40960
+parameter_rules:
+ - name: stream
+ use_template: boolean
+ label:
+ en_US: "Stream"
+ zh_Hans: "流式"
+ type: boolean
+ default: true
+ required: true
+ help:
+ en_US: "Whether to return the results in batches through streaming. If set to true, the generated text will be pushed to the user in real time during the generation process."
+ zh_Hans: "是否通过流式分批返回结果。如果设置为 true,生成过程中实时地向用户推送每一部分生成的文本。"
+
+ - name: max_tokens
+ use_template: max_tokens
+ label:
+ en_US: "Max Tokens"
+ zh_Hans: "最大Token数"
+ type: int
+ default: 512
+ min: 1
+ required: true
+ help:
+ en_US: "The maximum number of tokens that can be generated by the model varies depending on the model."
+ zh_Hans: "模型可生成的最大 token 个数,不同模型上限不同。"
+
+ - name: temperature
+ use_template: temperature
+ label:
+ en_US: "Temperature"
+ zh_Hans: "采样温度"
+ type: float
+ default: 0.7
+ min: 0.0
+ max: 1.0
+ precision: 1
+ required: true
+ help:
+ en_US: "The randomness of the sampling temperature control output. The temperature value is within the range of [0.0, 1.0]. The higher the value, the more random and creative the output; the lower the value, the more stable it is. It is recommended to adjust either top_p or temperature parameters according to your needs to avoid adjusting both at the same time."
+ zh_Hans: "采样温度控制输出的随机性。温度值在 [0.0, 1.0] 范围内,值越高,输出越随机和创造性;值越低,输出越稳定。建议根据需求调整 top_p 或 temperature 参数,避免同时调整两者。"
+
+ - name: top_p
+ use_template: top_p
+ label:
+ en_US: "Top P"
+ zh_Hans: "Top P"
+ type: float
+ default: 0.7
+ min: 0.0
+ max: 1.0
+ precision: 1
+ required: true
+ help:
+ en_US: "The value range of the sampling method is [0.0, 1.0]. The top_p value determines that the model selects tokens from the top p% of candidate words with the highest probability; when top_p is 0, this parameter is invalid. It is recommended to adjust either top_p or temperature parameters according to your needs to avoid adjusting both at the same time."
+ zh_Hans: "采样方法的取值范围为 [0.0,1.0]。top_p 值确定模型从概率最高的前p%的候选词中选取 tokens;当 top_p 为 0 时,此参数无效。建议根据需求调整 top_p 或 temperature 参数,避免同时调整两者。"
+
+ - name: top_k
+ use_template: top_k
+ label:
+ en_US: "Top K"
+ zh_Hans: "Top K"
+ type: int
+ default: 50
+ min: 0
+ max: 100
+ required: true
+ help:
+ en_US: "The value range is [0,100], which limits the model to only select from the top k words with the highest probability when choosing the next word at each step. The larger the value, the more diverse text generation will be."
+ zh_Hans: "取值范围为 [0,100],限制模型在每一步选择下一个词时,只从概率最高的前 k 个词中选取。数值越大,文本生成越多样。"
+
+ - name: frequency_penalty
+ use_template: frequency_penalty
+ label:
+ en_US: "Frequency Penalty"
+ zh_Hans: "频率惩罚"
+ type: float
+ default: 0
+ min: -1.0
+ max: 1.0
+ precision: 1
+ required: false
+ help:
+ en_US: "Used to adjust the frequency of repeated content in automatically generated text. Positive numbers reduce repetition, while negative numbers increase repetition. After setting this parameter, if a word has already appeared in the text, the model will decrease the probability of choosing that word for subsequent generation."
+ zh_Hans: "用于调整自动生成文本中重复内容的频率。正数减少重复,负数增加重复。设置此参数后,如果一个词在文本中已经出现过,模型在后续生成中选择该词的概率会降低。"
+
+ - name: user
+ use_template: text
+ label:
+ en_US: "User"
+ zh_Hans: "用户"
+ type: string
+ required: false
+ help:
+ en_US: "Used to track and differentiate conversation requests from different users."
+ zh_Hans: "用于追踪和区分不同用户的对话请求。"
diff --git a/api/core/model_runtime/model_providers/gitee_ai/llm/deepseek-coder-33B-instruct-chat.yaml b/api/core/model_runtime/model_providers/gitee_ai/llm/deepseek-coder-33B-instruct-chat.yaml
new file mode 100644
index 0000000000..2ac00761d5
--- /dev/null
+++ b/api/core/model_runtime/model_providers/gitee_ai/llm/deepseek-coder-33B-instruct-chat.yaml
@@ -0,0 +1,105 @@
+model: deepseek-coder-33B-instruct-chat
+label:
+ zh_Hans: deepseek-coder-33B-instruct-chat
+ en_US: deepseek-coder-33B-instruct-chat
+model_type: llm
+features:
+ - agent-thought
+model_properties:
+ mode: chat
+ context_size: 9000
+parameter_rules:
+ - name: stream
+ use_template: boolean
+ label:
+ en_US: "Stream"
+ zh_Hans: "流式"
+ type: boolean
+ default: true
+ required: true
+ help:
+ en_US: "Whether to return the results in batches through streaming. If set to true, the generated text will be pushed to the user in real time during the generation process."
+ zh_Hans: "是否通过流式分批返回结果。如果设置为 true,生成过程中实时地向用户推送每一部分生成的文本。"
+
+ - name: max_tokens
+ use_template: max_tokens
+ label:
+ en_US: "Max Tokens"
+ zh_Hans: "最大Token数"
+ type: int
+ default: 512
+ min: 1
+ required: true
+ help:
+ en_US: "The maximum number of tokens that can be generated by the model varies depending on the model."
+ zh_Hans: "模型可生成的最大 token 个数,不同模型上限不同。"
+
+ - name: temperature
+ use_template: temperature
+ label:
+ en_US: "Temperature"
+ zh_Hans: "采样温度"
+ type: float
+ default: 0.7
+ min: 0.0
+ max: 1.0
+ precision: 1
+ required: true
+ help:
+ en_US: "The randomness of the sampling temperature control output. The temperature value is within the range of [0.0, 1.0]. The higher the value, the more random and creative the output; the lower the value, the more stable it is. It is recommended to adjust either top_p or temperature parameters according to your needs to avoid adjusting both at the same time."
+ zh_Hans: "采样温度控制输出的随机性。温度值在 [0.0, 1.0] 范围内,值越高,输出越随机和创造性;值越低,输出越稳定。建议根据需求调整 top_p 或 temperature 参数,避免同时调整两者。"
+
+ - name: top_p
+ use_template: top_p
+ label:
+ en_US: "Top P"
+ zh_Hans: "Top P"
+ type: float
+ default: 0.7
+ min: 0.0
+ max: 1.0
+ precision: 1
+ required: true
+ help:
+ en_US: "The value range of the sampling method is [0.0, 1.0]. The top_p value determines that the model selects tokens from the top p% of candidate words with the highest probability; when top_p is 0, this parameter is invalid. It is recommended to adjust either top_p or temperature parameters according to your needs to avoid adjusting both at the same time."
+ zh_Hans: "采样方法的取值范围为 [0.0,1.0]。top_p 值确定模型从概率最高的前p%的候选词中选取 tokens;当 top_p 为 0 时,此参数无效。建议根据需求调整 top_p 或 temperature 参数,避免同时调整两者。"
+
+ - name: top_k
+ use_template: top_k
+ label:
+ en_US: "Top K"
+ zh_Hans: "Top K"
+ type: int
+ default: 50
+ min: 0
+ max: 100
+ required: true
+ help:
+ en_US: "The value range is [0,100], which limits the model to only select from the top k words with the highest probability when choosing the next word at each step. The larger the value, the more diverse text generation will be."
+ zh_Hans: "取值范围为 [0,100],限制模型在每一步选择下一个词时,只从概率最高的前 k 个词中选取。数值越大,文本生成越多样。"
+
+ - name: frequency_penalty
+ use_template: frequency_penalty
+ label:
+ en_US: "Frequency Penalty"
+ zh_Hans: "频率惩罚"
+ type: float
+ default: 0
+ min: -1.0
+ max: 1.0
+ precision: 1
+ required: false
+ help:
+ en_US: "Used to adjust the frequency of repeated content in automatically generated text. Positive numbers reduce repetition, while negative numbers increase repetition. After setting this parameter, if a word has already appeared in the text, the model will decrease the probability of choosing that word for subsequent generation."
+ zh_Hans: "用于调整自动生成文本中重复内容的频率。正数减少重复,负数增加重复。设置此参数后,如果一个词在文本中已经出现过,模型在后续生成中选择该词的概率会降低。"
+
+ - name: user
+ use_template: text
+ label:
+ en_US: "User"
+ zh_Hans: "用户"
+ type: string
+ required: false
+ help:
+ en_US: "Used to track and differentiate conversation requests from different users."
+ zh_Hans: "用于追踪和区分不同用户的对话请求。"
diff --git a/api/core/model_runtime/model_providers/gitee_ai/llm/deepseek-coder-33B-instruct-completions.yaml b/api/core/model_runtime/model_providers/gitee_ai/llm/deepseek-coder-33B-instruct-completions.yaml
new file mode 100644
index 0000000000..7c364d89f7
--- /dev/null
+++ b/api/core/model_runtime/model_providers/gitee_ai/llm/deepseek-coder-33B-instruct-completions.yaml
@@ -0,0 +1,91 @@
+model: deepseek-coder-33B-instruct-completions
+label:
+ zh_Hans: deepseek-coder-33B-instruct-completions
+ en_US: deepseek-coder-33B-instruct-completions
+model_type: llm
+features:
+ - agent-thought
+model_properties:
+ mode: completion
+ context_size: 9000
+parameter_rules:
+ - name: stream
+ use_template: boolean
+ label:
+ en_US: "Stream"
+ zh_Hans: "流式"
+ type: boolean
+ default: true
+ required: true
+ help:
+ en_US: "Whether to return the results in batches through streaming. If set to true, the generated text will be pushed to the user in real time during the generation process."
+ zh_Hans: "是否通过流式分批返回结果。如果设置为 true,生成过程中实时地向用户推送每一部分生成的文本。"
+
+ - name: max_tokens
+ use_template: max_tokens
+ label:
+ en_US: "Max Tokens"
+ zh_Hans: "最大Token数"
+ type: int
+ default: 512
+ min: 1
+ required: true
+ help:
+ en_US: "The maximum number of tokens that can be generated by the model varies depending on the model."
+ zh_Hans: "模型可生成的最大 token 个数,不同模型上限不同。"
+
+ - name: temperature
+ use_template: temperature
+ label:
+ en_US: "Temperature"
+ zh_Hans: "采样温度"
+ type: float
+ default: 0.7
+ min: 0.0
+ max: 1.0
+ precision: 1
+ required: true
+ help:
+ en_US: "The randomness of the sampling temperature control output. The temperature value is within the range of [0.0, 1.0]. The higher the value, the more random and creative the output; the lower the value, the more stable it is. It is recommended to adjust either top_p or temperature parameters according to your needs to avoid adjusting both at the same time."
+ zh_Hans: "采样温度控制输出的随机性。温度值在 [0.0, 1.0] 范围内,值越高,输出越随机和创造性;值越低,输出越稳定。建议根据需求调整 top_p 或 temperature 参数,避免同时调整两者。"
+
+ - name: top_p
+ use_template: top_p
+ label:
+ en_US: "Top P"
+ zh_Hans: "Top P"
+ type: float
+ default: 0.7
+ min: 0.0
+ max: 1.0
+ precision: 1
+ required: true
+ help:
+ en_US: "The value range of the sampling method is [0.0, 1.0]. The top_p value determines that the model selects tokens from the top p% of candidate words with the highest probability; when top_p is 0, this parameter is invalid. It is recommended to adjust either top_p or temperature parameters according to your needs to avoid adjusting both at the same time."
+ zh_Hans: "采样方法的取值范围为 [0.0,1.0]。top_p 值确定模型从概率最高的前p%的候选词中选取 tokens;当 top_p 为 0 时,此参数无效。建议根据需求调整 top_p 或 temperature 参数,避免同时调整两者。"
+
+ - name: frequency_penalty
+ use_template: frequency_penalty
+ label:
+ en_US: "Frequency Penalty"
+ zh_Hans: "频率惩罚"
+ type: float
+ default: 0
+ min: -1.0
+ max: 1.0
+ precision: 1
+ required: false
+ help:
+ en_US: "Used to adjust the frequency of repeated content in automatically generated text. Positive numbers reduce repetition, while negative numbers increase repetition. After setting this parameter, if a word has already appeared in the text, the model will decrease the probability of choosing that word for subsequent generation."
+ zh_Hans: "用于调整自动生成文本中重复内容的频率。正数减少重复,负数增加重复。设置此参数后,如果一个词在文本中已经出现过,模型在后续生成中选择该词的概率会降低。"
+
+ - name: user
+ use_template: text
+ label:
+ en_US: "User"
+ zh_Hans: "用户"
+ type: string
+ required: false
+ help:
+ en_US: "Used to track and differentiate conversation requests from different users."
+ zh_Hans: "用于追踪和区分不同用户的对话请求。"
diff --git a/api/core/model_runtime/model_providers/gitee_ai/llm/glm-4-9b-chat.yaml b/api/core/model_runtime/model_providers/gitee_ai/llm/glm-4-9b-chat.yaml
new file mode 100644
index 0000000000..2afe1cf959
--- /dev/null
+++ b/api/core/model_runtime/model_providers/gitee_ai/llm/glm-4-9b-chat.yaml
@@ -0,0 +1,105 @@
+model: glm-4-9b-chat
+label:
+ zh_Hans: glm-4-9b-chat
+ en_US: glm-4-9b-chat
+model_type: llm
+features:
+ - agent-thought
+model_properties:
+ mode: chat
+ context_size: 32768
+parameter_rules:
+ - name: stream
+ use_template: boolean
+ label:
+ en_US: "Stream"
+ zh_Hans: "流式"
+ type: boolean
+ default: true
+ required: true
+ help:
+ en_US: "Whether to return the results in batches through streaming. If set to true, the generated text will be pushed to the user in real time during the generation process."
+ zh_Hans: "是否通过流式分批返回结果。如果设置为 true,生成过程中实时地向用户推送每一部分生成的文本。"
+
+ - name: max_tokens
+ use_template: max_tokens
+ label:
+ en_US: "Max Tokens"
+ zh_Hans: "最大Token数"
+ type: int
+ default: 512
+ min: 1
+ required: true
+ help:
+ en_US: "The maximum number of tokens that can be generated by the model varies depending on the model."
+ zh_Hans: "模型可生成的最大 token 个数,不同模型上限不同。"
+
+ - name: temperature
+ use_template: temperature
+ label:
+ en_US: "Temperature"
+ zh_Hans: "采样温度"
+ type: float
+ default: 0.7
+ min: 0.0
+ max: 1.0
+ precision: 1
+ required: true
+ help:
+ en_US: "The randomness of the sampling temperature control output. The temperature value is within the range of [0.0, 1.0]. The higher the value, the more random and creative the output; the lower the value, the more stable it is. It is recommended to adjust either top_p or temperature parameters according to your needs to avoid adjusting both at the same time."
+ zh_Hans: "采样温度控制输出的随机性。温度值在 [0.0, 1.0] 范围内,值越高,输出越随机和创造性;值越低,输出越稳定。建议根据需求调整 top_p 或 temperature 参数,避免同时调整两者。"
+
+ - name: top_p
+ use_template: top_p
+ label:
+ en_US: "Top P"
+ zh_Hans: "Top P"
+ type: float
+ default: 0.7
+ min: 0.0
+ max: 1.0
+ precision: 1
+ required: true
+ help:
+ en_US: "The value range of the sampling method is [0.0, 1.0]. The top_p value determines that the model selects tokens from the top p% of candidate words with the highest probability; when top_p is 0, this parameter is invalid. It is recommended to adjust either top_p or temperature parameters according to your needs to avoid adjusting both at the same time."
+ zh_Hans: "采样方法的取值范围为 [0.0,1.0]。top_p 值确定模型从概率最高的前p%的候选词中选取 tokens;当 top_p 为 0 时,此参数无效。建议根据需求调整 top_p 或 temperature 参数,避免同时调整两者。"
+
+ - name: top_k
+ use_template: top_k
+ label:
+ en_US: "Top K"
+ zh_Hans: "Top K"
+ type: int
+ default: 50
+ min: 0
+ max: 100
+ required: true
+ help:
+ en_US: "The value range is [0,100], which limits the model to only select from the top k words with the highest probability when choosing the next word at each step. The larger the value, the more diverse text generation will be."
+ zh_Hans: "取值范围为 [0,100],限制模型在每一步选择下一个词时,只从概率最高的前 k 个词中选取。数值越大,文本生成越多样。"
+
+ - name: frequency_penalty
+ use_template: frequency_penalty
+ label:
+ en_US: "Frequency Penalty"
+ zh_Hans: "频率惩罚"
+ type: float
+ default: 0
+ min: -1.0
+ max: 1.0
+ precision: 1
+ required: false
+ help:
+ en_US: "Used to adjust the frequency of repeated content in automatically generated text. Positive numbers reduce repetition, while negative numbers increase repetition. After setting this parameter, if a word has already appeared in the text, the model will decrease the probability of choosing that word for subsequent generation."
+ zh_Hans: "用于调整自动生成文本中重复内容的频率。正数减少重复,负数增加重复。设置此参数后,如果一个词在文本中已经出现过,模型在后续生成中选择该词的概率会降低。"
+
+ - name: user
+ use_template: text
+ label:
+ en_US: "User"
+ zh_Hans: "用户"
+ type: string
+ required: false
+ help:
+ en_US: "Used to track and differentiate conversation requests from different users."
+ zh_Hans: "用于追踪和区分不同用户的对话请求。"
diff --git a/api/core/model_runtime/model_providers/gitee_ai/llm/llm.py b/api/core/model_runtime/model_providers/gitee_ai/llm/llm.py
new file mode 100644
index 0000000000..b65db6f665
--- /dev/null
+++ b/api/core/model_runtime/model_providers/gitee_ai/llm/llm.py
@@ -0,0 +1,47 @@
+from collections.abc import Generator
+from typing import Optional, Union
+
+from core.model_runtime.entities.llm_entities import LLMMode, LLMResult
+from core.model_runtime.entities.message_entities import (
+ PromptMessage,
+ PromptMessageTool,
+)
+from core.model_runtime.model_providers.openai_api_compatible.llm.llm import OAIAPICompatLargeLanguageModel
+
+
+class GiteeAILargeLanguageModel(OAIAPICompatLargeLanguageModel):
+ MODEL_TO_IDENTITY: dict[str, str] = {
+ "Yi-1.5-34B-Chat": "Yi-34B-Chat",
+ "deepseek-coder-33B-instruct-completions": "deepseek-coder-33B-instruct",
+ "deepseek-coder-33B-instruct-chat": "deepseek-coder-33B-instruct",
+ }
+
+ def _invoke(
+ self,
+ model: str,
+ credentials: dict,
+ prompt_messages: list[PromptMessage],
+ model_parameters: dict,
+ tools: Optional[list[PromptMessageTool]] = None,
+ stop: Optional[list[str]] = None,
+ stream: bool = True,
+ user: Optional[str] = None,
+ ) -> Union[LLMResult, Generator]:
+ self._add_custom_parameters(credentials, model, model_parameters)
+ return super()._invoke(model, credentials, prompt_messages, model_parameters, tools, stop, stream)
+
+ def validate_credentials(self, model: str, credentials: dict) -> None:
+ self._add_custom_parameters(credentials, model, None)
+ super().validate_credentials(model, credentials)
+
+ @staticmethod
+ def _add_custom_parameters(credentials: dict, model: str, model_parameters: dict) -> None:
+ if model is None:
+ model = "bge-large-zh-v1.5"
+
+ model_identity = GiteeAILargeLanguageModel.MODEL_TO_IDENTITY.get(model, model)
+ credentials["endpoint_url"] = f"https://ai.gitee.com/api/serverless/{model_identity}/"
+ if model.endswith("completions"):
+ credentials["mode"] = LLMMode.COMPLETION.value
+ else:
+ credentials["mode"] = LLMMode.CHAT.value
diff --git a/api/core/model_runtime/model_providers/gitee_ai/rerank/__init__.py b/api/core/model_runtime/model_providers/gitee_ai/rerank/__init__.py
new file mode 100644
index 0000000000..e69de29bb2
diff --git a/api/core/model_runtime/model_providers/gitee_ai/rerank/_position.yaml b/api/core/model_runtime/model_providers/gitee_ai/rerank/_position.yaml
new file mode 100644
index 0000000000..83162fd338
--- /dev/null
+++ b/api/core/model_runtime/model_providers/gitee_ai/rerank/_position.yaml
@@ -0,0 +1 @@
+- bge-reranker-v2-m3
diff --git a/api/core/model_runtime/model_providers/gitee_ai/rerank/bge-reranker-v2-m3.yaml b/api/core/model_runtime/model_providers/gitee_ai/rerank/bge-reranker-v2-m3.yaml
new file mode 100644
index 0000000000..f0681641e1
--- /dev/null
+++ b/api/core/model_runtime/model_providers/gitee_ai/rerank/bge-reranker-v2-m3.yaml
@@ -0,0 +1,4 @@
+model: bge-reranker-v2-m3
+model_type: rerank
+model_properties:
+ context_size: 1024
diff --git a/api/core/model_runtime/model_providers/gitee_ai/rerank/rerank.py b/api/core/model_runtime/model_providers/gitee_ai/rerank/rerank.py
new file mode 100644
index 0000000000..231345c2f4
--- /dev/null
+++ b/api/core/model_runtime/model_providers/gitee_ai/rerank/rerank.py
@@ -0,0 +1,128 @@
+from typing import Optional
+
+import httpx
+
+from core.model_runtime.entities.common_entities import I18nObject
+from core.model_runtime.entities.model_entities import AIModelEntity, FetchFrom, ModelPropertyKey, ModelType
+from core.model_runtime.entities.rerank_entities import RerankDocument, RerankResult
+from core.model_runtime.errors.invoke import (
+ InvokeAuthorizationError,
+ InvokeBadRequestError,
+ InvokeConnectionError,
+ InvokeError,
+ InvokeRateLimitError,
+ InvokeServerUnavailableError,
+)
+from core.model_runtime.errors.validate import CredentialsValidateFailedError
+from core.model_runtime.model_providers.__base.rerank_model import RerankModel
+
+
+class GiteeAIRerankModel(RerankModel):
+ """
+ Model class for rerank model.
+ """
+
+ def _invoke(
+ self,
+ model: str,
+ credentials: dict,
+ query: str,
+ docs: list[str],
+ score_threshold: Optional[float] = None,
+ top_n: Optional[int] = None,
+ user: Optional[str] = None,
+ ) -> RerankResult:
+ """
+ Invoke rerank model
+
+ :param model: model name
+ :param credentials: model credentials
+ :param query: search query
+ :param docs: docs for reranking
+ :param score_threshold: score threshold
+ :param top_n: top n documents to return
+ :param user: unique user id
+ :return: rerank result
+ """
+ if len(docs) == 0:
+ return RerankResult(model=model, docs=[])
+
+ base_url = credentials.get("base_url", "https://ai.gitee.com/api/serverless")
+ base_url = base_url.removesuffix("/")
+
+ try:
+ body = {"model": model, "query": query, "documents": docs}
+ if top_n is not None:
+ body["top_n"] = top_n
+ response = httpx.post(
+ f"{base_url}/{model}/rerank",
+ json=body,
+ headers={"Authorization": f"Bearer {credentials.get('api_key')}"},
+ )
+
+ response.raise_for_status()
+ results = response.json()
+
+ rerank_documents = []
+ for result in results["results"]:
+ rerank_document = RerankDocument(
+ index=result["index"],
+ text=result["document"]["text"],
+ score=result["relevance_score"],
+ )
+ if score_threshold is None or result["relevance_score"] >= score_threshold:
+ rerank_documents.append(rerank_document)
+ return RerankResult(model=model, docs=rerank_documents)
+ except httpx.HTTPStatusError as e:
+ raise InvokeServerUnavailableError(str(e))
+
+ def validate_credentials(self, model: str, credentials: dict) -> None:
+ """
+ Validate model credentials
+
+ :param model: model name
+ :param credentials: model credentials
+ :return:
+ """
+ try:
+ self._invoke(
+ model=model,
+ credentials=credentials,
+ query="What is the capital of the United States?",
+ docs=[
+ "Carson City is the capital city of the American state of Nevada. At the 2010 United States "
+ "Census, Carson City had a population of 55,274.",
+ "The Commonwealth of the Northern Mariana Islands is a group of islands in the Pacific Ocean that "
+ "are a political division controlled by the United States. Its capital is Saipan.",
+ ],
+ score_threshold=0.01,
+ )
+ except Exception as ex:
+ raise CredentialsValidateFailedError(str(ex))
+
+ @property
+ def _invoke_error_mapping(self) -> dict[type[InvokeError], list[type[Exception]]]:
+ """
+ Map model invoke error to unified error
+ """
+ return {
+ InvokeConnectionError: [httpx.ConnectError],
+ InvokeServerUnavailableError: [httpx.RemoteProtocolError],
+ InvokeRateLimitError: [],
+ InvokeAuthorizationError: [httpx.HTTPStatusError],
+ InvokeBadRequestError: [httpx.RequestError],
+ }
+
+ def get_customizable_model_schema(self, model: str, credentials: dict) -> AIModelEntity:
+ """
+ generate custom model entities from credentials
+ """
+ entity = AIModelEntity(
+ model=model,
+ label=I18nObject(en_US=model),
+ model_type=ModelType.RERANK,
+ fetch_from=FetchFrom.CUSTOMIZABLE_MODEL,
+ model_properties={ModelPropertyKey.CONTEXT_SIZE: int(credentials.get("context_size"))},
+ )
+
+ return entity
diff --git a/api/core/model_runtime/model_providers/gitee_ai/speech2text/__init__.py b/api/core/model_runtime/model_providers/gitee_ai/speech2text/__init__.py
new file mode 100644
index 0000000000..e69de29bb2
diff --git a/api/core/model_runtime/model_providers/gitee_ai/speech2text/_position.yaml b/api/core/model_runtime/model_providers/gitee_ai/speech2text/_position.yaml
new file mode 100644
index 0000000000..8e9b47598b
--- /dev/null
+++ b/api/core/model_runtime/model_providers/gitee_ai/speech2text/_position.yaml
@@ -0,0 +1,2 @@
+- whisper-base
+- whisper-large
diff --git a/api/core/model_runtime/model_providers/gitee_ai/speech2text/speech2text.py b/api/core/model_runtime/model_providers/gitee_ai/speech2text/speech2text.py
new file mode 100644
index 0000000000..5597f5b43e
--- /dev/null
+++ b/api/core/model_runtime/model_providers/gitee_ai/speech2text/speech2text.py
@@ -0,0 +1,53 @@
+import os
+from typing import IO, Optional
+
+import requests
+
+from core.model_runtime.errors.invoke import InvokeBadRequestError
+from core.model_runtime.errors.validate import CredentialsValidateFailedError
+from core.model_runtime.model_providers.__base.speech2text_model import Speech2TextModel
+from core.model_runtime.model_providers.gitee_ai._common import _CommonGiteeAI
+
+
+class GiteeAISpeech2TextModel(_CommonGiteeAI, Speech2TextModel):
+ """
+    Model class for the Gitee AI speech-to-text model (OpenAI-compatible).
+ """
+
+ def _invoke(self, model: str, credentials: dict, file: IO[bytes], user: Optional[str] = None) -> str:
+ """
+ Invoke speech2text model
+
+ :param model: model name
+ :param credentials: model credentials
+ :param file: audio file
+ :param user: unique user id
+ :return: text for given audio file
+ """
+ # doc: https://ai.gitee.com/docs/openapi/serverless#tag/serverless/POST/{service}/speech-to-text
+
+ endpoint_url = f"https://ai.gitee.com/api/serverless/{model}/speech-to-text"
+ files = [("file", file)]
+ _, file_ext = os.path.splitext(file.name)
+ headers = {"Content-Type": f"audio/{file_ext}", "Authorization": f"Bearer {credentials.get('api_key')}"}
+ response = requests.post(endpoint_url, headers=headers, files=files)
+ if response.status_code != 200:
+ raise InvokeBadRequestError(response.text)
+ response_data = response.json()
+ return response_data["text"]
+
+ def validate_credentials(self, model: str, credentials: dict) -> None:
+ """
+ Validate model credentials
+
+ :param model: model name
+ :param credentials: model credentials
+ :return:
+ """
+ try:
+ audio_file_path = self._get_demo_file_path()
+
+ with open(audio_file_path, "rb") as audio_file:
+ self._invoke(model, credentials, audio_file)
+ except Exception as ex:
+ raise CredentialsValidateFailedError(str(ex))
diff --git a/api/core/model_runtime/model_providers/gitee_ai/speech2text/whisper-base.yaml b/api/core/model_runtime/model_providers/gitee_ai/speech2text/whisper-base.yaml
new file mode 100644
index 0000000000..a50bf5fc2d
--- /dev/null
+++ b/api/core/model_runtime/model_providers/gitee_ai/speech2text/whisper-base.yaml
@@ -0,0 +1,5 @@
+model: whisper-base
+model_type: speech2text
+model_properties:
+ file_upload_limit: 1
+ supported_file_extensions: flac,mp3,mp4,mpeg,mpga,m4a,ogg,wav,webm
diff --git a/api/core/model_runtime/model_providers/gitee_ai/speech2text/whisper-large.yaml b/api/core/model_runtime/model_providers/gitee_ai/speech2text/whisper-large.yaml
new file mode 100644
index 0000000000..1be7b1a391
--- /dev/null
+++ b/api/core/model_runtime/model_providers/gitee_ai/speech2text/whisper-large.yaml
@@ -0,0 +1,5 @@
+model: whisper-large
+model_type: speech2text
+model_properties:
+ file_upload_limit: 1
+ supported_file_extensions: flac,mp3,mp4,mpeg,mpga,m4a,ogg,wav,webm
diff --git a/api/core/model_runtime/model_providers/gitee_ai/text_embedding/_position.yaml b/api/core/model_runtime/model_providers/gitee_ai/text_embedding/_position.yaml
new file mode 100644
index 0000000000..e8abe6440d
--- /dev/null
+++ b/api/core/model_runtime/model_providers/gitee_ai/text_embedding/_position.yaml
@@ -0,0 +1,3 @@
+- bge-large-zh-v1.5
+- bge-small-zh-v1.5
+- bge-m3
diff --git a/api/core/model_runtime/model_providers/gitee_ai/text_embedding/bge-large-zh-v1.5.yaml b/api/core/model_runtime/model_providers/gitee_ai/text_embedding/bge-large-zh-v1.5.yaml
new file mode 100644
index 0000000000..9e3ca76e88
--- /dev/null
+++ b/api/core/model_runtime/model_providers/gitee_ai/text_embedding/bge-large-zh-v1.5.yaml
@@ -0,0 +1,8 @@
+model: bge-large-zh-v1.5
+label:
+ zh_Hans: bge-large-zh-v1.5
+ en_US: bge-large-zh-v1.5
+model_type: text-embedding
+model_properties:
+ context_size: 200000
+ max_chunks: 20
diff --git a/api/core/model_runtime/model_providers/gitee_ai/text_embedding/bge-m3.yaml b/api/core/model_runtime/model_providers/gitee_ai/text_embedding/bge-m3.yaml
new file mode 100644
index 0000000000..a7a99a98a3
--- /dev/null
+++ b/api/core/model_runtime/model_providers/gitee_ai/text_embedding/bge-m3.yaml
@@ -0,0 +1,8 @@
+model: bge-m3
+label:
+ zh_Hans: bge-m3
+ en_US: bge-m3
+model_type: text-embedding
+model_properties:
+ context_size: 200000
+ max_chunks: 20
diff --git a/api/core/model_runtime/model_providers/gitee_ai/text_embedding/bge-small-zh-v1.5.yaml b/api/core/model_runtime/model_providers/gitee_ai/text_embedding/bge-small-zh-v1.5.yaml
new file mode 100644
index 0000000000..bd760408fa
--- /dev/null
+++ b/api/core/model_runtime/model_providers/gitee_ai/text_embedding/bge-small-zh-v1.5.yaml
@@ -0,0 +1,8 @@
+model: bge-small-zh-v1.5
+label:
+ zh_Hans: bge-small-zh-v1.5
+ en_US: bge-small-zh-v1.5
+model_type: text-embedding
+model_properties:
+ context_size: 200000
+ max_chunks: 20
diff --git a/api/core/model_runtime/model_providers/gitee_ai/text_embedding/text_embedding.py b/api/core/model_runtime/model_providers/gitee_ai/text_embedding/text_embedding.py
new file mode 100644
index 0000000000..b833c5652c
--- /dev/null
+++ b/api/core/model_runtime/model_providers/gitee_ai/text_embedding/text_embedding.py
@@ -0,0 +1,31 @@
+from typing import Optional
+
+from core.entities.embedding_type import EmbeddingInputType
+from core.model_runtime.entities.text_embedding_entities import TextEmbeddingResult
+from core.model_runtime.model_providers.openai_api_compatible.text_embedding.text_embedding import (
+ OAICompatEmbeddingModel,
+)
+
+
+class GiteeAIEmbeddingModel(OAICompatEmbeddingModel):
+ def _invoke(
+ self,
+ model: str,
+ credentials: dict,
+ texts: list[str],
+ user: Optional[str] = None,
+ input_type: EmbeddingInputType = EmbeddingInputType.DOCUMENT,
+ ) -> TextEmbeddingResult:
+ self._add_custom_parameters(credentials, model)
+ return super()._invoke(model, credentials, texts, user, input_type)
+
+ def validate_credentials(self, model: str, credentials: dict) -> None:
+ self._add_custom_parameters(credentials, None)
+ super().validate_credentials(model, credentials)
+
+ @staticmethod
+ def _add_custom_parameters(credentials: dict, model: str) -> None:
+ if model is None:
+ model = "bge-m3"
+
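+        # Each Gitee AI model is served from its own OpenAI-compatible endpoint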
+ credentials["endpoint_url"] = f"https://ai.gitee.com/api/serverless/{model}/v1/"
diff --git a/api/core/model_runtime/model_providers/gitee_ai/tts/ChatTTS.yaml b/api/core/model_runtime/model_providers/gitee_ai/tts/ChatTTS.yaml
new file mode 100644
index 0000000000..940391dfab
--- /dev/null
+++ b/api/core/model_runtime/model_providers/gitee_ai/tts/ChatTTS.yaml
@@ -0,0 +1,11 @@
+model: ChatTTS
+model_type: tts
+model_properties:
+ default_voice: 'default'
+ voices:
+ - mode: 'default'
+ name: 'Default'
+ language: [ 'zh-Hans', 'en-US', 'de-DE', 'fr-FR', 'es-ES', 'it-IT', 'th-TH', 'id-ID' ]
+ word_limit: 3500
+ audio_type: 'mp3'
+ max_workers: 5
diff --git a/api/core/model_runtime/model_providers/gitee_ai/tts/FunAudioLLM-CosyVoice-300M.yaml b/api/core/model_runtime/model_providers/gitee_ai/tts/FunAudioLLM-CosyVoice-300M.yaml
new file mode 100644
index 0000000000..8fc5734801
--- /dev/null
+++ b/api/core/model_runtime/model_providers/gitee_ai/tts/FunAudioLLM-CosyVoice-300M.yaml
@@ -0,0 +1,11 @@
+model: FunAudioLLM-CosyVoice-300M
+model_type: tts
+model_properties:
+ default_voice: 'default'
+ voices:
+ - mode: 'default'
+ name: 'Default'
+ language: [ 'zh-Hans', 'en-US', 'de-DE', 'fr-FR', 'es-ES', 'it-IT', 'th-TH', 'id-ID' ]
+ word_limit: 3500
+ audio_type: 'mp3'
+ max_workers: 5
diff --git a/api/core/model_runtime/model_providers/gitee_ai/tts/__init__.py b/api/core/model_runtime/model_providers/gitee_ai/tts/__init__.py
new file mode 100644
index 0000000000..e69de29bb2
diff --git a/api/core/model_runtime/model_providers/gitee_ai/tts/_position.yaml b/api/core/model_runtime/model_providers/gitee_ai/tts/_position.yaml
new file mode 100644
index 0000000000..13c6ec8454
--- /dev/null
+++ b/api/core/model_runtime/model_providers/gitee_ai/tts/_position.yaml
@@ -0,0 +1,4 @@
+- speecht5_tts
+- ChatTTS
+- fish-speech-1.2-sft
+- FunAudioLLM-CosyVoice-300M
diff --git a/api/core/model_runtime/model_providers/gitee_ai/tts/fish-speech-1.2-sft.yaml b/api/core/model_runtime/model_providers/gitee_ai/tts/fish-speech-1.2-sft.yaml
new file mode 100644
index 0000000000..93cc28bc9d
--- /dev/null
+++ b/api/core/model_runtime/model_providers/gitee_ai/tts/fish-speech-1.2-sft.yaml
@@ -0,0 +1,11 @@
+model: fish-speech-1.2-sft
+model_type: tts
+model_properties:
+ default_voice: 'default'
+ voices:
+ - mode: 'default'
+ name: 'Default'
+ language: [ 'zh-Hans', 'en-US', 'de-DE', 'fr-FR', 'es-ES', 'it-IT', 'th-TH', 'id-ID' ]
+ word_limit: 3500
+ audio_type: 'mp3'
+ max_workers: 5
diff --git a/api/core/model_runtime/model_providers/gitee_ai/tts/speecht5_tts.yaml b/api/core/model_runtime/model_providers/gitee_ai/tts/speecht5_tts.yaml
new file mode 100644
index 0000000000..f9c843bd41
--- /dev/null
+++ b/api/core/model_runtime/model_providers/gitee_ai/tts/speecht5_tts.yaml
@@ -0,0 +1,11 @@
+model: speecht5_tts
+model_type: tts
+model_properties:
+ default_voice: 'default'
+ voices:
+ - mode: 'default'
+ name: 'Default'
+ language: [ 'zh-Hans', 'en-US', 'de-DE', 'fr-FR', 'es-ES', 'it-IT', 'th-TH', 'id-ID' ]
+ word_limit: 3500
+ audio_type: 'mp3'
+ max_workers: 5
diff --git a/api/core/model_runtime/model_providers/gitee_ai/tts/tts.py b/api/core/model_runtime/model_providers/gitee_ai/tts/tts.py
new file mode 100644
index 0000000000..ed2bd5b13d
--- /dev/null
+++ b/api/core/model_runtime/model_providers/gitee_ai/tts/tts.py
@@ -0,0 +1,79 @@
+from typing import Any, Optional
+
+import requests
+
+from core.model_runtime.errors.invoke import InvokeBadRequestError
+from core.model_runtime.errors.validate import CredentialsValidateFailedError
+from core.model_runtime.model_providers.__base.tts_model import TTSModel
+from core.model_runtime.model_providers.gitee_ai._common import _CommonGiteeAI
+
+
+class GiteeAIText2SpeechModel(_CommonGiteeAI, TTSModel):
+ """
+    Model class for the Gitee AI text-to-speech model.
+ """
+
+ def _invoke(
+ self, model: str, tenant_id: str, credentials: dict, content_text: str, voice: str, user: Optional[str] = None
+    ) -> Any:
+ """
+ _invoke text2speech model
+
+ :param model: model name
+ :param tenant_id: user tenant id
+ :param credentials: model credentials
+        :param content_text: text content to be converted to speech
+        :param voice: model timbre
+        :param user: unique user id
+        :return: generator yielding the synthesized audio in chunks
+ """
+ return self._tts_invoke_streaming(model=model, credentials=credentials, content_text=content_text, voice=voice)
+
+ def validate_credentials(self, model: str, credentials: dict) -> None:
+ """
+ validate credentials text2speech model
+
+ :param model: model name
+ :param credentials: model credentials
+        :return:
+ """
+ try:
+ self._tts_invoke_streaming(
+ model=model,
+ credentials=credentials,
+ content_text="Hello Dify!",
+ voice=self._get_model_default_voice(model, credentials),
+ )
+ except Exception as ex:
+ raise CredentialsValidateFailedError(str(ex))
+
+    def _tts_invoke_streaming(self, model: str, credentials: dict, content_text: str, voice: str) -> Any:
+ """
+ _tts_invoke_streaming text2speech model
+ :param model: model name
+ :param credentials: model credentials
+        :param content_text: text content to be converted to speech
+        :param voice: model timbre
+        :return: generator yielding the synthesized audio in chunks
+ """
+ try:
+ # doc: https://ai.gitee.com/docs/openapi/serverless#tag/serverless/POST/{service}/text-to-speech
+ endpoint_url = "https://ai.gitee.com/api/serverless/" + model + "/text-to-speech"
+
+ headers = {"Content-Type": "application/json"}
+ api_key = credentials.get("api_key")
+ if api_key:
+ headers["Authorization"] = f"Bearer {api_key}"
+
+ payload = {"inputs": content_text}
+ response = requests.post(endpoint_url, headers=headers, json=payload)
+
+ if response.status_code != 200:
+ raise InvokeBadRequestError(response.text)
+
+ data = response.content
+
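+            # Stream the audio back to the caller in 1 KiB chunks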
+ for i in range(0, len(data), 1024):
+ yield data[i : i + 1024]
+ except Exception as ex:
+ raise InvokeBadRequestError(str(ex))
diff --git a/api/core/model_runtime/model_providers/google/llm/llm.py b/api/core/model_runtime/model_providers/google/llm/llm.py
index e686ad08d9..b1b07a611b 100644
--- a/api/core/model_runtime/model_providers/google/llm/llm.py
+++ b/api/core/model_runtime/model_providers/google/llm/llm.py
@@ -116,26 +116,33 @@ class GoogleLargeLanguageModel(LargeLanguageModel):
:param tools: tool messages
:return: glm tools
"""
- return glm.Tool(
- function_declarations=[
- glm.FunctionDeclaration(
- name=tool.name,
- parameters=glm.Schema(
- type=glm.Type.OBJECT,
- properties={
- key: {
- "type_": value.get("type", "string").upper(),
- "description": value.get("description", ""),
- "enum": value.get("enum", []),
- }
- for key, value in tool.parameters.get("properties", {}).items()
- },
- required=tool.parameters.get("required", []),
- ),
+ function_declarations = []
+ for tool in tools:
+ properties = {}
+ for key, value in tool.parameters.get("properties", {}).items():
+ properties[key] = {
+ "type_": glm.Type.STRING,
+ "description": value.get("description", ""),
+ "enum": value.get("enum", []),
+ }
+
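+            # An empty OBJECT schema is rejected by the Gemini API, so omit parameters when the tool declares none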
+ if properties:
+ parameters = glm.Schema(
+ type=glm.Type.OBJECT,
+ properties=properties,
+ required=tool.parameters.get("required", []),
)
- for tool in tools
- ]
- )
+ else:
+ parameters = None
+
+ function_declaration = glm.FunctionDeclaration(
+ name=tool.name,
+ parameters=parameters,
+ description=tool.description,
+ )
+ function_declarations.append(function_declaration)
+
+ return glm.Tool(function_declarations=function_declarations)
def validate_credentials(self, model: str, credentials: dict) -> None:
"""
diff --git a/api/core/rag/datasource/vdb/couchbase/__init__.py b/api/core/rag/datasource/vdb/couchbase/__init__.py
new file mode 100644
index 0000000000..e69de29bb2
diff --git a/api/core/rag/datasource/vdb/couchbase/couchbase_vector.py b/api/core/rag/datasource/vdb/couchbase/couchbase_vector.py
new file mode 100644
index 0000000000..3f88d2ca2b
--- /dev/null
+++ b/api/core/rag/datasource/vdb/couchbase/couchbase_vector.py
@@ -0,0 +1,378 @@
+import json
+import logging
+import time
+import uuid
+from datetime import timedelta
+from typing import Any
+
+from couchbase import search
+from couchbase.auth import PasswordAuthenticator
+from couchbase.cluster import Cluster
+from couchbase.management.search import SearchIndex
+
+# needed for options -- cluster, timeout, SQL++ (N1QL) query, etc.
+from couchbase.options import ClusterOptions, SearchOptions
+from couchbase.vector_search import VectorQuery, VectorSearch
+from flask import current_app
+from pydantic import BaseModel, model_validator
+
+from core.rag.datasource.vdb.vector_base import BaseVector
+from core.rag.datasource.vdb.vector_factory import AbstractVectorFactory
+from core.rag.datasource.vdb.vector_type import VectorType
+from core.rag.embedding.embedding_base import Embeddings
+from core.rag.models.document import Document
+from extensions.ext_redis import redis_client
+from models.dataset import Dataset
+
+logger = logging.getLogger(__name__)
+
+
+class CouchbaseConfig(BaseModel):
+ connection_string: str
+ user: str
+ password: str
+ bucket_name: str
+ scope_name: str
+
+ @model_validator(mode="before")
+ @classmethod
+ def validate_config(cls, values: dict) -> dict:
+ if not values.get("connection_string"):
+ raise ValueError("config COUCHBASE_CONNECTION_STRING is required")
+ if not values.get("user"):
+ raise ValueError("config COUCHBASE_USER is required")
+ if not values.get("password"):
+ raise ValueError("config COUCHBASE_PASSWORD is required")
+ if not values.get("bucket_name"):
+ raise ValueError("config COUCHBASE_PASSWORD is required")
+ if not values.get("scope_name"):
+ raise ValueError("config COUCHBASE_SCOPE_NAME is required")
+ return values
+
+
+class CouchbaseVector(BaseVector):
+ def __init__(self, collection_name: str, config: CouchbaseConfig):
+ super().__init__(collection_name)
+ self._client_config = config
+
+ """Connect to couchbase"""
+
+ auth = PasswordAuthenticator(config.user, config.password)
+ options = ClusterOptions(auth)
+ self._cluster = Cluster(config.connection_string, options)
+ self._bucket = self._cluster.bucket(config.bucket_name)
+ self._scope = self._bucket.scope(config.scope_name)
+ self._bucket_name = config.bucket_name
+ self._scope_name = config.scope_name
+
+ # Wait until the cluster is ready for use.
+ self._cluster.wait_until_ready(timedelta(seconds=5))
+
+ def create(self, texts: list[Document], embeddings: list[list[float]], **kwargs):
+ index_id = str(uuid.uuid4()).replace("-", "")
+ self._create_collection(uuid=index_id, vector_length=len(embeddings[0]))
+ self.add_texts(texts, embeddings)
+
+ def _create_collection(self, vector_length: int, uuid: str):
+ lock_name = "vector_indexing_lock_{}".format(self._collection_name)
+ with redis_client.lock(lock_name, timeout=20):
+ collection_exist_cache_key = "vector_indexing_{}".format(self._collection_name)
+ if redis_client.get(collection_exist_cache_key):
+ return
+ if self._collection_exists(self._collection_name):
+ return
+ manager = self._bucket.collections()
+ manager.create_collection(self._client_config.scope_name, self._collection_name)
+
+ index_manager = self._scope.search_indexes()
+
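+            # Template FTS index definition; the name, uuid, vector dims and type mapping are patched below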
+ index_definition = json.loads("""
+{
+ "type": "fulltext-index",
+ "name": "Embeddings._default.Vector_Search",
+ "uuid": "26d4db528e78b716",
+ "sourceType": "gocbcore",
+ "sourceName": "Embeddings",
+ "sourceUUID": "2242e4a25b4decd6650c9c7b3afa1dbf",
+ "planParams": {
+ "maxPartitionsPerPIndex": 1024,
+ "indexPartitions": 1
+ },
+ "params": {
+ "doc_config": {
+ "docid_prefix_delim": "",
+ "docid_regexp": "",
+ "mode": "scope.collection.type_field",
+ "type_field": "type"
+ },
+ "mapping": {
+ "analysis": { },
+ "default_analyzer": "standard",
+ "default_datetime_parser": "dateTimeOptional",
+ "default_field": "_all",
+ "default_mapping": {
+ "dynamic": true,
+ "enabled": true
+ },
+ "default_type": "_default",
+ "docvalues_dynamic": false,
+ "index_dynamic": true,
+ "store_dynamic": true,
+ "type_field": "_type",
+ "types": {
+ "collection_name": {
+ "dynamic": true,
+ "enabled": true,
+ "properties": {
+ "embedding": {
+ "dynamic": false,
+ "enabled": true,
+ "fields": [
+ {
+ "dims": 1536,
+ "index": true,
+ "name": "embedding",
+ "similarity": "dot_product",
+ "type": "vector",
+ "vector_index_optimized_for": "recall"
+ }
+ ]
+ },
+ "metadata": {
+ "dynamic": true,
+ "enabled": true
+ },
+ "text": {
+ "dynamic": false,
+ "enabled": true,
+ "fields": [
+ {
+ "index": true,
+ "name": "text",
+ "store": true,
+ "type": "text"
+ }
+ ]
+ }
+ }
+ }
+ }
+ },
+ "store": {
+ "indexType": "scorch",
+ "segmentVersion": 16
+ }
+ },
+ "sourceParams": { }
+ }
+""")
+ index_definition["name"] = self._collection_name + "_search"
+ index_definition["uuid"] = uuid
+ index_definition["params"]["mapping"]["types"]["collection_name"]["properties"]["embedding"]["fields"][0][
+ "dims"
+ ] = vector_length
+ index_definition["params"]["mapping"]["types"][self._scope_name + "." + self._collection_name] = (
+ index_definition["params"]["mapping"]["types"].pop("collection_name")
+ )
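+            # Brief pauses give the cluster time to register the new collection and index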
+ time.sleep(2)
+ index_manager.upsert_index(
+ SearchIndex(
+ index_definition["name"],
+ params=index_definition["params"],
+ source_name=self._bucket_name,
+ ),
+ )
+ time.sleep(1)
+
+ redis_client.set(collection_exist_cache_key, 1, ex=3600)
+
+ def _collection_exists(self, name: str):
+ scope_collection_map: dict[str, Any] = {}
+
+ # Get a list of all scopes in the bucket
+ for scope in self._bucket.collections().get_all_scopes():
+ scope_collection_map[scope.name] = []
+
+ # Get a list of all the collections in the scope
+ for collection in scope.collections:
+ scope_collection_map[scope.name].append(collection.name)
+
+ # Check if the collection exists in the scope
+ return self._collection_name in scope_collection_map[self._scope_name]
+
+ def get_type(self) -> str:
+ return VectorType.COUCHBASE
+
+ def add_texts(self, documents: list[Document], embeddings: list[list[float]], **kwargs):
+ uuids = self._get_uuids(documents)
+ texts = [d.page_content for d in documents]
+ metadatas = [d.metadata for d in documents]
+
+ doc_ids = []
+
+        documents_to_insert = [
+            {"text": text, "embedding": vector, "metadata": metadata}
+            for text, vector, metadata in zip(texts, embeddings, metadatas)
+        ]
+        for doc, doc_id in zip(documents_to_insert, uuids):
+            self._scope.collection(self._collection_name).upsert(doc_id, doc)
+
+ doc_ids.extend(uuids)
+
+ return doc_ids
+
+ def text_exists(self, id: str) -> bool:
+ # Use a parameterized query for safety and correctness
+ query = f"""
+ SELECT COUNT(1) AS count FROM
+ `{self._client_config.bucket_name}`.{self._client_config.scope_name}.{self._collection_name}
+ WHERE META().id = $doc_id
+ """
+ # Pass the id as a parameter to the query
+ result = self._cluster.query(query, named_parameters={"doc_id": id}).execute()
+ for row in result:
+ return row["count"] > 0
+ return False # Return False if no rows are returned
+
+ def delete_by_ids(self, ids: list[str]) -> None:
+ query = f"""
+ DELETE FROM `{self._bucket_name}`.{self._client_config.scope_name}.{self._collection_name}
+ WHERE META().id IN $doc_ids;
+ """
+ try:
+ self._cluster.query(query, named_parameters={"doc_ids": ids}).execute()
+ except Exception as e:
+ logger.error(e)
+
+ def delete_by_document_id(self, document_id: str):
+ query = f"""
+ DELETE FROM
+ `{self._client_config.bucket_name}`.{self._client_config.scope_name}.{self._collection_name}
+ WHERE META().id = $doc_id;
+ """
+ self._cluster.query(query, named_parameters={"doc_id": document_id}).execute()
+
+ # def get_ids_by_metadata_field(self, key: str, value: str):
+ # query = f"""
+ # SELECT id FROM
+ # `{self._client_config.bucket_name}`.{self._client_config.scope_name}.{self._collection_name}
+ # WHERE `metadata.{key}` = $value;
+ # """
+ # result = self._cluster.query(query, named_parameters={'value':value})
+ # return [row['id'] for row in result.rows()]
+
+ def delete_by_metadata_field(self, key: str, value: str) -> None:
+ query = f"""
+ DELETE FROM `{self._client_config.bucket_name}`.{self._client_config.scope_name}.{self._collection_name}
+ WHERE metadata.{key} = $value;
+ """
+ self._cluster.query(query, named_parameters={"value": value}).execute()
+
+ def search_by_vector(self, query_vector: list[float], **kwargs: Any) -> list[Document]:
+ top_k = kwargs.get("top_k", 5)
+ score_threshold = kwargs.get("score_threshold") or 0.0
+
+ search_req = search.SearchRequest.create(
+ VectorSearch.from_vector_query(
+ VectorQuery(
+ "embedding",
+ query_vector,
+ top_k,
+ )
+ )
+ )
+ try:
+ search_iter = self._scope.search(
+ self._collection_name + "_search",
+ search_req,
+ SearchOptions(limit=top_k, collections=[self._collection_name], fields=["*"]),
+ )
+
+ docs = []
+ # Parse the results
+ for row in search_iter.rows():
+ text = row.fields.pop("text")
+ metadata = self._format_metadata(row.fields)
+ score = row.score
+ metadata["score"] = score
+ doc = Document(page_content=text, metadata=metadata)
+ if score >= score_threshold:
+ docs.append(doc)
+ except Exception as e:
+ raise ValueError(f"Search failed with error: {e}")
+
+ return docs
+
+ def search_by_full_text(self, query: str, **kwargs: Any) -> list[Document]:
+ top_k = kwargs.get("top_k", 2)
+ try:
+ CBrequest = search.SearchRequest.create(search.QueryStringQuery("text:" + query))
+ search_iter = self._scope.search(
+ self._collection_name + "_search", CBrequest, SearchOptions(limit=top_k, fields=["*"])
+ )
+
+ docs = []
+ for row in search_iter.rows():
+ text = row.fields.pop("text")
+ metadata = self._format_metadata(row.fields)
+ score = row.score
+ metadata["score"] = score
+ doc = Document(page_content=text, metadata=metadata)
+ docs.append(doc)
+
+ except Exception as e:
+ raise ValueError(f"Search failed with error: {e}")
+
+ return docs
+
+ def delete(self):
+ manager = self._bucket.collections()
+ scopes = manager.get_all_scopes()
+
+ for scope in scopes:
+ for collection in scope.collections:
+ if collection.name == self._collection_name:
+ manager.drop_collection("_default", self._collection_name)
+
+ def _format_metadata(self, row_fields: dict[str, Any]) -> dict[str, Any]:
+ """Helper method to format the metadata from the Couchbase Search API.
+ Args:
+ row_fields (Dict[str, Any]): The fields to format.
+
+ Returns:
+ Dict[str, Any]: The formatted metadata.
+ """
+ metadata = {}
+ for key, value in row_fields.items():
+ # Couchbase Search returns the metadata key with a prefix
+ # `metadata.` We remove it to get the original metadata key
+ if key.startswith("metadata"):
+ new_key = key.split("metadata" + ".")[-1]
+ metadata[new_key] = value
+ else:
+ metadata[key] = value
+
+ return metadata
+
+
+class CouchbaseVectorFactory(AbstractVectorFactory):
+ def init_vector(self, dataset: Dataset, attributes: list, embeddings: Embeddings) -> CouchbaseVector:
+ if dataset.index_struct_dict:
+ class_prefix: str = dataset.index_struct_dict["vector_store"]["class_prefix"]
+ collection_name = class_prefix
+ else:
+ dataset_id = dataset.id
+ collection_name = Dataset.gen_collection_name_by_id(dataset_id)
+ dataset.index_struct = json.dumps(self.gen_index_struct_dict(VectorType.COUCHBASE, collection_name))
+
+ config = current_app.config
+ return CouchbaseVector(
+ collection_name=collection_name,
+ config=CouchbaseConfig(
+ connection_string=config.get("COUCHBASE_CONNECTION_STRING"),
+ user=config.get("COUCHBASE_USER"),
+ password=config.get("COUCHBASE_PASSWORD"),
+ bucket_name=config.get("COUCHBASE_BUCKET_NAME"),
+ scope_name=config.get("COUCHBASE_SCOPE_NAME"),
+ ),
+ )
diff --git a/api/core/rag/datasource/vdb/elasticsearch/elasticsearch_vector.py b/api/core/rag/datasource/vdb/elasticsearch/elasticsearch_vector.py
index 052a187225..c62042af80 100644
--- a/api/core/rag/datasource/vdb/elasticsearch/elasticsearch_vector.py
+++ b/api/core/rag/datasource/vdb/elasticsearch/elasticsearch_vector.py
@@ -142,7 +142,7 @@ class ElasticSearchVector(BaseVector):
def search_by_full_text(self, query: str, **kwargs: Any) -> list[Document]:
query_str = {"match": {Field.CONTENT_KEY.value: query}}
- results = self._client.search(index=self._collection_name, query=query_str)
+ results = self._client.search(index=self._collection_name, query=query_str, size=kwargs.get("top_k", 4))
docs = []
for hit in results["hits"]["hits"]:
docs.append(
diff --git a/api/core/rag/datasource/vdb/oceanbase/__init__.py b/api/core/rag/datasource/vdb/oceanbase/__init__.py
new file mode 100644
index 0000000000..e69de29bb2
diff --git a/api/core/rag/datasource/vdb/oceanbase/oceanbase_vector.py b/api/core/rag/datasource/vdb/oceanbase/oceanbase_vector.py
new file mode 100644
index 0000000000..8dd26a073b
--- /dev/null
+++ b/api/core/rag/datasource/vdb/oceanbase/oceanbase_vector.py
@@ -0,0 +1,209 @@
+import json
+import logging
+import math
+from typing import Any
+
+from pydantic import BaseModel, model_validator
+from pyobvector import VECTOR, ObVecClient
+from sqlalchemy import JSON, Column, String, func
+from sqlalchemy.dialects.mysql import LONGTEXT
+
+from configs import dify_config
+from core.rag.datasource.vdb.vector_base import BaseVector
+from core.rag.datasource.vdb.vector_factory import AbstractVectorFactory
+from core.rag.datasource.vdb.vector_type import VectorType
+from core.rag.embedding.embedding_base import Embeddings
+from core.rag.models.document import Document
+from extensions.ext_redis import redis_client
+from models.dataset import Dataset
+
+logger = logging.getLogger(__name__)
+
+DEFAULT_OCEANBASE_HNSW_BUILD_PARAM = {"M": 16, "efConstruction": 256}
+DEFAULT_OCEANBASE_HNSW_SEARCH_PARAM = {"efSearch": 64}
+OCEANBASE_SUPPORTED_VECTOR_INDEX_TYPE = "HNSW"
+DEFAULT_OCEANBASE_VECTOR_METRIC_TYPE = "l2"
+
+
+class OceanBaseVectorConfig(BaseModel):
+ host: str
+ port: int
+ user: str
+ password: str
+ database: str
+
+ @model_validator(mode="before")
+ @classmethod
+ def validate_config(cls, values: dict) -> dict:
+ if not values["host"]:
+ raise ValueError("config OCEANBASE_VECTOR_HOST is required")
+ if not values["port"]:
+ raise ValueError("config OCEANBASE_VECTOR_PORT is required")
+ if not values["user"]:
+ raise ValueError("config OCEANBASE_VECTOR_USER is required")
+ if not values["database"]:
+ raise ValueError("config OCEANBASE_VECTOR_DATABASE is required")
+ return values
+
+
+class OceanBaseVector(BaseVector):
+ def __init__(self, collection_name: str, config: OceanBaseVectorConfig):
+ super().__init__(collection_name)
+ self._config = config
+ self._hnsw_ef_search = -1
+ self._client = ObVecClient(
+ uri=f"{self._config.host}:{self._config.port}",
+ user=self._config.user,
+ password=self._config.password,
+ db_name=self._config.database,
+ )
+
+ def get_type(self) -> str:
+ return VectorType.OCEANBASE
+
+ def create(self, texts: list[Document], embeddings: list[list[float]], **kwargs):
+ self._vec_dim = len(embeddings[0])
+ self._create_collection()
+ self.add_texts(texts, embeddings)
+
+ def _create_collection(self) -> None:
+ lock_name = "vector_indexing_lock_" + self._collection_name
+ with redis_client.lock(lock_name, timeout=20):
+ collection_exist_cache_key = "vector_indexing_" + self._collection_name
+ if redis_client.get(collection_exist_cache_key):
+ return
+
+ if self._client.check_table_exists(self._collection_name):
+ return
+
+ self.delete()
+
+ cols = [
+ Column("id", String(36), primary_key=True, autoincrement=False),
+ Column("vector", VECTOR(self._vec_dim)),
+ Column("text", LONGTEXT),
+ Column("metadata", JSON),
+ ]
+ vidx_params = self._client.prepare_index_params()
+ vidx_params.add_index(
+ field_name="vector",
+ index_type=OCEANBASE_SUPPORTED_VECTOR_INDEX_TYPE,
+ index_name="vector_index",
+ metric_type=DEFAULT_OCEANBASE_VECTOR_METRIC_TYPE,
+ params=DEFAULT_OCEANBASE_HNSW_BUILD_PARAM,
+ )
+
+ self._client.create_table_with_index_params(
+ table_name=self._collection_name,
+ columns=cols,
+ vidxs=vidx_params,
+ )
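+        # Vector indexes require ob_vector_memory_limit_percentage > 0; try to enable it when it is unset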
+ vals = []
+ params = self._client.perform_raw_text_sql("SHOW PARAMETERS LIKE '%ob_vector_memory_limit_percentage%'")
+ for row in params:
+ val = int(row[6])
+ vals.append(val)
+        if len(vals) == 0:
+            raise ValueError("ob_vector_memory_limit_percentage not found in parameters.")
+ if any(val == 0 for val in vals):
+ try:
+ self._client.perform_raw_text_sql("ALTER SYSTEM SET ob_vector_memory_limit_percentage = 30")
+ except Exception as e:
+ raise Exception(
+ "Failed to set ob_vector_memory_limit_percentage. "
+ + "Maybe the database user has insufficient privilege.",
+ e,
+ )
+ redis_client.set(collection_exist_cache_key, 1, ex=3600)
+
+ def add_texts(self, documents: list[Document], embeddings: list[list[float]], **kwargs):
+ ids = self._get_uuids(documents)
+ for id, doc, emb in zip(ids, documents, embeddings):
+ self._client.insert(
+ table_name=self._collection_name,
+ data={
+ "id": id,
+ "vector": emb,
+ "text": doc.page_content,
+ "metadata": doc.metadata,
+ },
+ )
+
+ def text_exists(self, id: str) -> bool:
+ cur = self._client.get(table_name=self._collection_name, id=id)
+ return cur.rowcount != 0
+
+ def delete_by_ids(self, ids: list[str]) -> None:
+ self._client.delete(table_name=self._collection_name, ids=ids)
+
+ def get_ids_by_metadata_field(self, key: str, value: str) -> list[str]:
+ cur = self._client.get(
+ table_name=self._collection_name,
+ where_clause=f"metadata->>'$.{key}' = '{value}'",
+ output_column_name=["id"],
+ )
+ return [row[0] for row in cur]
+
+ def delete_by_metadata_field(self, key: str, value: str) -> None:
+ ids = self.get_ids_by_metadata_field(key, value)
+ self.delete_by_ids(ids)
+
+ def search_by_full_text(self, query: str, **kwargs: Any) -> list[Document]:
+ return []
+
+ def search_by_vector(self, query_vector: list[float], **kwargs: Any) -> list[Document]:
+ ef_search = kwargs.get("ef_search", self._hnsw_ef_search)
+ if ef_search != self._hnsw_ef_search:
+ self._client.set_ob_hnsw_ef_search(ef_search)
+ self._hnsw_ef_search = ef_search
+ topk = kwargs.get("top_k", 10)
+ cur = self._client.ann_search(
+ table_name=self._collection_name,
+ vec_column_name="vector",
+ vec_data=query_vector,
+ topk=topk,
+ distance_func=func.l2_distance,
+ output_column_names=["text", "metadata"],
+ with_dist=True,
+ )
+ docs = []
+ for text, metadata, distance in cur:
+ metadata = json.loads(metadata)
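+            # Map L2 distance to a 0-1 relevance score (assumes roughly normalized embeddings)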
+ metadata["score"] = 1 - distance / math.sqrt(2)
+ docs.append(
+ Document(
+ page_content=text,
+ metadata=metadata,
+ )
+ )
+ return docs
+
+ def delete(self) -> None:
+ self._client.drop_table_if_exist(self._collection_name)
+
+
+class OceanBaseVectorFactory(AbstractVectorFactory):
+ def init_vector(
+ self,
+ dataset: Dataset,
+ attributes: list,
+ embeddings: Embeddings,
+ ) -> BaseVector:
+ if dataset.index_struct_dict:
+ class_prefix: str = dataset.index_struct_dict["vector_store"]["class_prefix"]
+ collection_name = class_prefix.lower()
+ else:
+ dataset_id = dataset.id
+ collection_name = Dataset.gen_collection_name_by_id(dataset_id).lower()
+ dataset.index_struct = json.dumps(self.gen_index_struct_dict(VectorType.OCEANBASE, collection_name))
+ return OceanBaseVector(
+ collection_name,
+ OceanBaseVectorConfig(
+ host=dify_config.OCEANBASE_VECTOR_HOST,
+ port=dify_config.OCEANBASE_VECTOR_PORT,
+ user=dify_config.OCEANBASE_VECTOR_USER,
+ password=(dify_config.OCEANBASE_VECTOR_PASSWORD or ""),
+ database=dify_config.OCEANBASE_VECTOR_DATABASE,
+ ),
+ )
diff --git a/api/core/rag/datasource/vdb/tidb_on_qdrant/tidb_service.py b/api/core/rag/datasource/vdb/tidb_on_qdrant/tidb_service.py
index f10d6339ee..0cd2a46460 100644
--- a/api/core/rag/datasource/vdb/tidb_on_qdrant/tidb_service.py
+++ b/api/core/rag/datasource/vdb/tidb_on_qdrant/tidb_service.py
@@ -4,6 +4,7 @@ import uuid
import requests
from requests.auth import HTTPDigestAuth
+from configs import dify_config
from extensions.ext_database import db
from extensions.ext_redis import redis_client
from models.dataset import TidbAuthBinding
@@ -208,7 +209,7 @@ class TidbService:
}
spending_limit = {
- "monthly": 10,
+ "monthly": dify_config.TIDB_SPEND_LIMIT,
}
password = str(uuid.uuid4()).replace("-", "")[:16]
display_name = str(uuid.uuid4()).replace("-", "")
diff --git a/api/core/rag/datasource/vdb/vector_factory.py b/api/core/rag/datasource/vdb/vector_factory.py
index 59a5aadacd..c8cb007ae8 100644
--- a/api/core/rag/datasource/vdb/vector_factory.py
+++ b/api/core/rag/datasource/vdb/vector_factory.py
@@ -114,6 +114,10 @@ class Vector:
from core.rag.datasource.vdb.analyticdb.analyticdb_vector import AnalyticdbVectorFactory
return AnalyticdbVectorFactory
+ case VectorType.COUCHBASE:
+ from core.rag.datasource.vdb.couchbase.couchbase_vector import CouchbaseVectorFactory
+
+ return CouchbaseVectorFactory
case VectorType.BAIDU:
from core.rag.datasource.vdb.baidu.baidu_vector import BaiduVectorFactory
@@ -130,6 +134,10 @@ class Vector:
from core.rag.datasource.vdb.tidb_on_qdrant.tidb_on_qdrant_vector import TidbOnQdrantVectorFactory
return TidbOnQdrantVectorFactory
+ case VectorType.OCEANBASE:
+ from core.rag.datasource.vdb.oceanbase.oceanbase_vector import OceanBaseVectorFactory
+
+ return OceanBaseVectorFactory
case _:
raise ValueError(f"Vector store {vector_type} is not supported.")
diff --git a/api/core/rag/datasource/vdb/vector_type.py b/api/core/rag/datasource/vdb/vector_type.py
index 3b6df94f78..e3b37ece88 100644
--- a/api/core/rag/datasource/vdb/vector_type.py
+++ b/api/core/rag/datasource/vdb/vector_type.py
@@ -16,7 +16,9 @@ class VectorType(str, Enum):
TENCENT = "tencent"
ORACLE = "oracle"
ELASTICSEARCH = "elasticsearch"
+ COUCHBASE = "couchbase"
BAIDU = "baidu"
VIKINGDB = "vikingdb"
UPSTASH = "upstash"
TIDB_ON_QDRANT = "tidb_on_qdrant"
+ OCEANBASE = "oceanbase"
diff --git a/api/core/rag/rerank/rerank_type.py b/api/core/rag/rerank/rerank_type.py
index d4894e3cc6..d71eb2daa8 100644
--- a/api/core/rag/rerank/rerank_type.py
+++ b/api/core/rag/rerank/rerank_type.py
@@ -1,6 +1,6 @@
from enum import Enum
-class RerankMode(Enum):
+class RerankMode(str, Enum):
RERANKING_MODEL = "reranking_model"
WEIGHTED_SCORE = "weighted_score"
diff --git a/api/core/rag/retrieval/dataset_retrieval.py b/api/core/rag/retrieval/dataset_retrieval.py
index 3455cdc3c4..7a5bf39fa6 100644
--- a/api/core/rag/retrieval/dataset_retrieval.py
+++ b/api/core/rag/retrieval/dataset_retrieval.py
@@ -22,6 +22,7 @@ from core.rag.datasource.keyword.jieba.jieba_keyword_table_handler import JiebaK
from core.rag.datasource.retrieval_service import RetrievalService
from core.rag.entities.context_entities import DocumentContext
from core.rag.models.document import Document
+from core.rag.rerank.rerank_type import RerankMode
from core.rag.retrieval.retrieval_methods import RetrievalMethod
from core.rag.retrieval.router.multi_dataset_function_call_router import FunctionCallMultiDatasetRouter
from core.rag.retrieval.router.multi_dataset_react_route import ReactMultiDatasetRouter
@@ -361,10 +362,39 @@ class DatasetRetrieval:
reranking_enable: bool = True,
message_id: Optional[str] = None,
):
+ if not available_datasets:
+ return []
threads = []
all_documents = []
dataset_ids = [dataset.id for dataset in available_datasets]
- index_type = None
+ index_type_check = all(
+ item.indexing_technique == available_datasets[0].indexing_technique for item in available_datasets
+ )
+ if not index_type_check and (not reranking_enable or reranking_mode != RerankMode.RERANKING_MODEL):
+ raise ValueError(
+ "The configured knowledge base list have different indexing technique, please set reranking model."
+ )
+ index_type = available_datasets[0].indexing_technique
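+        # Weighted-score reranking only works when every dataset shares the same embedding model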
+ if index_type == "high_quality":
+ embedding_model_check = all(
+ item.embedding_model == available_datasets[0].embedding_model for item in available_datasets
+ )
+ embedding_model_provider_check = all(
+ item.embedding_model_provider == available_datasets[0].embedding_model_provider
+ for item in available_datasets
+ )
+ if (
+ reranking_enable
+ and reranking_mode == "weighted_score"
+ and (not embedding_model_check or not embedding_model_provider_check)
+ ):
+ raise ValueError(
+ "The configured knowledge base list have different embedding model, please set reranking model."
+ )
+ if reranking_enable and reranking_mode == RerankMode.WEIGHTED_SCORE:
+ weights["vector_setting"]["embedding_provider_name"] = available_datasets[0].embedding_model_provider
+ weights["vector_setting"]["embedding_model_name"] = available_datasets[0].embedding_model
+
for dataset in available_datasets:
index_type = dataset.indexing_technique
retrieval_thread = threading.Thread(
diff --git a/api/core/tools/provider/builtin/baidu_translate/_assets/icon.png b/api/core/tools/provider/builtin/baidu_translate/_assets/icon.png
new file mode 100644
index 0000000000..8eb8f21513
Binary files /dev/null and b/api/core/tools/provider/builtin/baidu_translate/_assets/icon.png differ
diff --git a/api/core/tools/provider/builtin/baidu_translate/_baidu_translate_tool_base.py b/api/core/tools/provider/builtin/baidu_translate/_baidu_translate_tool_base.py
new file mode 100644
index 0000000000..ce907c3c61
--- /dev/null
+++ b/api/core/tools/provider/builtin/baidu_translate/_baidu_translate_tool_base.py
@@ -0,0 +1,11 @@
+from hashlib import md5
+
+
+class BaiduTranslateToolBase:
+ def _get_sign(self, appid, secret, salt, query):
+ """
+ get baidu translate sign
+ """
+ # concatenate the string in the order of appid+q+salt+secret
+        sign_str = appid + query + salt + secret
+        return md5(sign_str.encode("utf-8")).hexdigest()
diff --git a/api/core/tools/provider/builtin/baidu_translate/baidu_translate.py b/api/core/tools/provider/builtin/baidu_translate/baidu_translate.py
new file mode 100644
index 0000000000..cccd2f8c8f
--- /dev/null
+++ b/api/core/tools/provider/builtin/baidu_translate/baidu_translate.py
@@ -0,0 +1,17 @@
+from typing import Any
+
+from core.tools.errors import ToolProviderCredentialValidationError
+from core.tools.provider.builtin.baidu_translate.tools.translate import BaiduTranslateTool
+from core.tools.provider.builtin_tool_provider import BuiltinToolProviderController
+
+
+class BaiduTranslateProvider(BuiltinToolProviderController):
+ def _validate_credentials(self, credentials: dict[str, Any]) -> None:
+ try:
+ BaiduTranslateTool().fork_tool_runtime(
+ runtime={
+ "credentials": credentials,
+ }
+ ).invoke(user_id="", tool_parameters={"q": "这是一段测试文本", "from": "auto", "to": "en"})
+ except Exception as e:
+ raise ToolProviderCredentialValidationError(str(e))
diff --git a/api/core/tools/provider/builtin/baidu_translate/baidu_translate.yaml b/api/core/tools/provider/builtin/baidu_translate/baidu_translate.yaml
new file mode 100644
index 0000000000..06dadeeefc
--- /dev/null
+++ b/api/core/tools/provider/builtin/baidu_translate/baidu_translate.yaml
@@ -0,0 +1,39 @@
+identity:
+ author: Xiao Ley
+ name: baidu_translate
+ label:
+ en_US: Baidu Translate
+ zh_Hans: 百度翻译
+ description:
+ en_US: Translate text using Baidu
+ zh_Hans: 使用百度进行翻译
+ icon: icon.png
+ tags:
+ - utilities
+credentials_for_provider:
+ appid:
+ type: secret-input
+ required: true
+ label:
+ en_US: Baidu translate appid
+ zh_Hans: Baidu translate appid
+ placeholder:
+ en_US: Please input your Baidu translate appid
+ zh_Hans: 请输入你的百度翻译 appid
+ help:
+ en_US: Get your Baidu translate appid from Baidu translate
+ zh_Hans: 从百度翻译开放平台获取你的 appid
+ url: https://api.fanyi.baidu.com
+ secret:
+ type: secret-input
+ required: true
+ label:
+ en_US: Baidu translate secret
+ zh_Hans: Baidu translate secret
+ placeholder:
+ en_US: Please input your Baidu translate secret
+ zh_Hans: 请输入你的百度翻译 secret
+ help:
+ en_US: Get your Baidu translate secret from Baidu translate
+ zh_Hans: 从百度翻译开放平台获取你的 secret
+ url: https://api.fanyi.baidu.com
diff --git a/api/core/tools/provider/builtin/baidu_translate/tools/fieldtranslate.py b/api/core/tools/provider/builtin/baidu_translate/tools/fieldtranslate.py
new file mode 100644
index 0000000000..bce259f31d
--- /dev/null
+++ b/api/core/tools/provider/builtin/baidu_translate/tools/fieldtranslate.py
@@ -0,0 +1,78 @@
+import random
+from hashlib import md5
+from typing import Any, Union
+
+import requests
+
+from core.tools.entities.tool_entities import ToolInvokeMessage
+from core.tools.provider.builtin.baidu_translate._baidu_translate_tool_base import BaiduTranslateToolBase
+from core.tools.tool.builtin_tool import BuiltinTool
+
+
+class BaiduFieldTranslateTool(BuiltinTool, BaiduTranslateToolBase):
+ def _invoke(
+ self,
+ user_id: str,
+ tool_parameters: dict[str, Any],
+ ) -> Union[ToolInvokeMessage, list[ToolInvokeMessage]]:
+ """
+ invoke tools
+ """
+ BAIDU_FIELD_TRANSLATE_URL = "https://fanyi-api.baidu.com/api/trans/vip/fieldtranslate"
+
+ appid = self.runtime.credentials.get("appid", "")
+ if not appid:
+ raise ValueError("invalid baidu translate appid")
+
+ secret = self.runtime.credentials.get("secret", "")
+ if not secret:
+ raise ValueError("invalid baidu translate secret")
+
+ q = tool_parameters.get("q", "")
+ if not q:
+ raise ValueError("Please input text to translate")
+
+ from_ = tool_parameters.get("from", "")
+ if not from_:
+ raise ValueError("Please select source language")
+
+ to = tool_parameters.get("to", "")
+ if not to:
+ raise ValueError("Please select destination language")
+
+ domain = tool_parameters.get("domain", "")
+ if not domain:
+ raise ValueError("Please select domain")
+
+ salt = str(random.randint(32768, 16777215))
+ sign = self._get_sign(appid, secret, salt, q, domain)
+
+ headers = {"Content-Type": "application/x-www-form-urlencoded"}
+ params = {
+ "q": q,
+ "from": from_,
+ "to": to,
+ "appid": appid,
+ "salt": salt,
+ "domain": domain,
+ "sign": sign,
+ "needIntervene": 1,
+ }
+ try:
+ response = requests.post(BAIDU_FIELD_TRANSLATE_URL, headers=headers, data=params)
+ result = response.json()
+
+ if "trans_result" in result:
+ result_text = result["trans_result"][0]["dst"]
+ else:
+ result_text = f'{result["error_code"]}: {result["error_msg"]}'
+
+ return self.create_text_message(str(result_text))
+ except requests.RequestException as e:
+ raise ValueError(f"Translation service error: {e}")
+ except Exception:
+ raise ValueError("Translation service error, please check the network")
+
+ def _get_sign(self, appid, secret, salt, query, domain):
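+        # Field translation signs over appid+q+salt+domain+secret (domain is included, unlike the base sign)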
+        sign_str = appid + query + salt + domain + secret
+        return md5(sign_str.encode("utf-8")).hexdigest()
diff --git a/api/core/tools/provider/builtin/baidu_translate/tools/fieldtranslate.yaml b/api/core/tools/provider/builtin/baidu_translate/tools/fieldtranslate.yaml
new file mode 100644
index 0000000000..de51fddbae
--- /dev/null
+++ b/api/core/tools/provider/builtin/baidu_translate/tools/fieldtranslate.yaml
@@ -0,0 +1,123 @@
+identity:
+ name: field_translate
+ author: Xiao Ley
+ label:
+ en_US: Field translate
+ zh_Hans: 百度领域翻译
+description:
+ human:
+    en_US: A tool for Baidu field translation (currently, the "novel" and "wiki" domains only support Chinese-to-English; if the direction is set to English-to-Chinese, the general translation result is returned by default).
+    zh_Hans: 百度领域翻译,提供多种领域的文本翻译(目前“网络文学领域”和“人文社科领域”仅支持中到英,如设置语言方向为英到中,则默认输出通用翻译结果)
+  llm: A tool for Baidu field translation
+parameters:
+ - name: q
+ type: string
+ required: true
+ label:
+ en_US: Text content
+ zh_Hans: 文本内容
+ human_description:
+ en_US: Text content to be translated
+ zh_Hans: 需要翻译的文本内容
+ llm_description: Text content to be translated
+ form: llm
+ - name: from
+ type: select
+ required: true
+ label:
+ en_US: source language
+ zh_Hans: 源语言
+ human_description:
+ en_US: The source language of the input text
+ zh_Hans: 输入的文本的源语言
+ default: auto
+ form: form
+ options:
+ - value: auto
+ label:
+ en_US: auto
+ zh_Hans: 自动检测
+ - value: zh
+ label:
+ en_US: Chinese
+ zh_Hans: 中文
+ - value: en
+ label:
+ en_US: English
+ zh_Hans: 英语
+ - name: to
+ type: select
+ required: true
+ label:
+ en_US: destination language
+ zh_Hans: 目标语言
+ human_description:
+ en_US: The destination language of the input text
+ zh_Hans: 输入文本的目标语言
+ default: en
+ form: form
+ options:
+ - value: zh
+ label:
+ en_US: Chinese
+ zh_Hans: 中文
+ - value: en
+ label:
+ en_US: English
+ zh_Hans: 英语
+ - name: domain
+ type: select
+ required: true
+ label:
+ en_US: domain
+ zh_Hans: 领域
+ human_description:
+ en_US: The domain of the input text
+ zh_Hans: 输入文本的领域
+ default: novel
+ form: form
+ options:
+ - value: it
+ label:
+ en_US: it
+ zh_Hans: 信息技术领域
+ - value: finance
+ label:
+ en_US: finance
+ zh_Hans: 金融财经领域
+ - value: machinery
+ label:
+ en_US: machinery
+ zh_Hans: 机械制造领域
+ - value: senimed
+ label:
+ en_US: senimed
+ zh_Hans: 生物医药领域
+ - value: novel
+ label:
+ en_US: novel (only support Chinese to English translation)
+ zh_Hans: 网络文学领域(仅支持中到英)
+ - value: academic
+ label:
+ en_US: academic
+ zh_Hans: 学术论文领域
+ - value: aerospace
+ label:
+ en_US: aerospace
+ zh_Hans: 航空航天领域
+ - value: wiki
+ label:
+ en_US: wiki (only support Chinese to English translation)
+ zh_Hans: 人文社科领域(仅支持中到英)
+ - value: news
+ label:
+ en_US: news
+        zh_Hans: 新闻资讯领域
+ - value: law
+ label:
+ en_US: law
+ zh_Hans: 法律法规领域
+ - value: contract
+ label:
+ en_US: contract
+ zh_Hans: 合同领域
diff --git a/api/core/tools/provider/builtin/baidu_translate/tools/language.py b/api/core/tools/provider/builtin/baidu_translate/tools/language.py
new file mode 100644
index 0000000000..3bbaee88b3
--- /dev/null
+++ b/api/core/tools/provider/builtin/baidu_translate/tools/language.py
@@ -0,0 +1,95 @@
+import random
+from typing import Any, Union
+
+import requests
+
+from core.tools.entities.tool_entities import ToolInvokeMessage
+from core.tools.provider.builtin.baidu_translate._baidu_translate_tool_base import BaiduTranslateToolBase
+from core.tools.tool.builtin_tool import BuiltinTool
+
+
+class BaiduLanguageTool(BuiltinTool, BaiduTranslateToolBase):
+ def _invoke(
+ self,
+ user_id: str,
+ tool_parameters: dict[str, Any],
+ ) -> Union[ToolInvokeMessage, list[ToolInvokeMessage]]:
+ """
+ invoke tools
+ """
+ BAIDU_LANGUAGE_URL = "https://fanyi-api.baidu.com/api/trans/vip/language"
+
+ appid = self.runtime.credentials.get("appid", "")
+ if not appid:
+ raise ValueError("invalid baidu translate appid")
+
+ secret = self.runtime.credentials.get("secret", "")
+ if not secret:
+ raise ValueError("invalid baidu translate secret")
+
+ q = tool_parameters.get("q", "")
+ if not q:
+ raise ValueError("Please input text to translate")
+
+ description_language = tool_parameters.get("description_language", "English")
+
+ salt = str(random.randint(32768, 16777215))
+ sign = self._get_sign(appid, secret, salt, q)
+
+ headers = {"Content-Type": "application/x-www-form-urlencoded"}
+ params = {
+ "q": q,
+ "appid": appid,
+ "salt": salt,
+ "sign": sign,
+ }
+
+ try:
+ response = requests.post(BAIDU_LANGUAGE_URL, params=params, headers=headers)
+ result = response.json()
+ if "error_code" not in result:
+ raise ValueError("Translation service error, please check the network")
+
+ result_text = ""
+ if result["error_code"] != 0:
+ result_text = f'{result["error_code"]}: {result["error_msg"]}'
+ else:
+ result_text = result["data"]["src"]
+ result_text = self.mapping_result(description_language, result_text)
+
+ return self.create_text_message(result_text)
+ except requests.RequestException as e:
+ raise ValueError(f"Translation service error: {e}")
+ except Exception:
+ raise ValueError("Translation service error, please check the network")
+
+ def mapping_result(self, description_language: str, result: str) -> str:
+ """
+ mapping result
+ """
+ mapping = {
+ "English": {
+ "zh": "Chinese",
+ "en": "English",
+ "jp": "Japanese",
+ "kor": "Korean",
+ "th": "Thai",
+ "vie": "Vietnamese",
+ "ru": "Russian",
+ },
+ "Chinese": {
+ "zh": "中文",
+ "en": "英文",
+ "jp": "日文",
+ "kor": "韩文",
+ "th": "泰语",
+ "vie": "越南语",
+ "ru": "俄语",
+ },
+ }
+
+ language_mapping = mapping.get(description_language)
+ if not language_mapping:
+ return result
+
+ return language_mapping.get(result, result)
diff --git a/api/core/tools/provider/builtin/baidu_translate/tools/language.yaml b/api/core/tools/provider/builtin/baidu_translate/tools/language.yaml
new file mode 100644
index 0000000000..60cca2e288
--- /dev/null
+++ b/api/core/tools/provider/builtin/baidu_translate/tools/language.yaml
@@ -0,0 +1,43 @@
+identity:
+ name: language
+ author: Xiao Ley
+ label:
+ en_US: Baidu Language
+ zh_Hans: 百度语种识别
+description:
+ human:
+ en_US: A tool for Baidu Language, support Chinese, English, Japanese, Korean, Thai, Vietnamese and Russian
+ zh_Hans: 使用百度进行语种识别,支持的语种:中文、英语、日语、韩语、泰语、越南语和俄语
+ llm: A tool for Baidu Language
+parameters:
+ - name: q
+ type: string
+ required: true
+ label:
+ en_US: Text content
+ zh_Hans: 文本内容
+ human_description:
+ en_US: Text content to be recognized
+ zh_Hans: 需要识别语言的文本内容
+ llm_description: Text content to be recognized
+ form: llm
+ - name: description_language
+ type: select
+ required: true
+ label:
+ en_US: Description language
+ zh_Hans: 描述语言
+ human_description:
+        en_US: The language used to describe the recognition result
+ zh_Hans: 描述识别结果所用的语言
+ default: Chinese
+ form: form
+ options:
+ - value: Chinese
+ label:
+ en_US: Chinese
+ zh_Hans: 中文
+ - value: English
+ label:
+ en_US: English
+ zh_Hans: 英语
diff --git a/api/core/tools/provider/builtin/baidu_translate/tools/translate.py b/api/core/tools/provider/builtin/baidu_translate/tools/translate.py
new file mode 100644
index 0000000000..7cd816a3bc
--- /dev/null
+++ b/api/core/tools/provider/builtin/baidu_translate/tools/translate.py
@@ -0,0 +1,67 @@
+import random
+from typing import Any, Union
+
+import requests
+
+from core.tools.entities.tool_entities import ToolInvokeMessage
+from core.tools.provider.builtin.baidu_translate._baidu_translate_tool_base import BaiduTranslateToolBase
+from core.tools.tool.builtin_tool import BuiltinTool
+
+
+class BaiduTranslateTool(BuiltinTool, BaiduTranslateToolBase):
+ def _invoke(
+ self,
+ user_id: str,
+ tool_parameters: dict[str, Any],
+ ) -> Union[ToolInvokeMessage, list[ToolInvokeMessage]]:
+ """
+ invoke tools
+ """
+ BAIDU_TRANSLATE_URL = "https://fanyi-api.baidu.com/api/trans/vip/translate"
+
+ appid = self.runtime.credentials.get("appid", "")
+ if not appid:
+ raise ValueError("invalid baidu translate appid")
+
+ secret = self.runtime.credentials.get("secret", "")
+ if not secret:
+ raise ValueError("invalid baidu translate secret")
+
+ q = tool_parameters.get("q", "")
+ if not q:
+ raise ValueError("Please input text to translate")
+
+ from_ = tool_parameters.get("from", "")
+ if not from_:
+ raise ValueError("Please select source language")
+
+ to = tool_parameters.get("to", "")
+ if not to:
+ raise ValueError("Please select destination language")
+
+ salt = str(random.randint(32768, 16777215))
+ sign = self._get_sign(appid, secret, salt, q)
+
+ headers = {"Content-Type": "application/x-www-form-urlencoded"}
+ params = {
+ "q": q,
+ "from": from_,
+ "to": to,
+ "appid": appid,
+ "salt": salt,
+ "sign": sign,
+ }
+ try:
+ response = requests.post(BAIDU_TRANSLATE_URL, params=params, headers=headers)
+ result = response.json()
+
+ if "trans_result" in result:
+ result_text = result["trans_result"][0]["dst"]
+ else:
+ result_text = f'{result["error_code"]}: {result["error_msg"]}'
+
+ return self.create_text_message(str(result_text))
+ except requests.RequestException as e:
+ raise ValueError(f"Translation service error: {e}")
+ except Exception:
+ raise ValueError("Translation service error, please check the network")
diff --git a/api/core/tools/provider/builtin/baidu_translate/tools/translate.yaml b/api/core/tools/provider/builtin/baidu_translate/tools/translate.yaml
new file mode 100644
index 0000000000..c8ff32cb6b
--- /dev/null
+++ b/api/core/tools/provider/builtin/baidu_translate/tools/translate.yaml
@@ -0,0 +1,275 @@
+identity:
+ name: translate
+ author: Xiao Ley
+ label:
+ en_US: Translate
+ zh_Hans: 百度翻译
+description:
+ human:
+ en_US: A tool for Baidu Translate
+ zh_Hans: 百度翻译
+ llm: A tool for Baidu Translate
+parameters:
+ - name: q
+ type: string
+ required: true
+ label:
+ en_US: Text content
+ zh_Hans: 文本内容
+ human_description:
+ en_US: Text content to be translated
+ zh_Hans: 需要翻译的文本内容
+ llm_description: Text content to be translated
+ form: llm
+ - name: from
+ type: select
+ required: true
+ label:
+ en_US: source language
+ zh_Hans: 源语言
+ human_description:
+ en_US: The source language of the input text
+ zh_Hans: 输入的文本的源语言
+ default: auto
+ form: form
+ options:
+ - value: auto
+ label:
+ en_US: auto
+ zh_Hans: 自动检测
+ - value: zh
+ label:
+ en_US: Chinese
+ zh_Hans: 中文
+ - value: en
+ label:
+ en_US: English
+ zh_Hans: 英语
+ - value: cht
+ label:
+ en_US: Traditional Chinese
+ zh_Hans: 繁体中文
+ - value: yue
+ label:
+ en_US: Cantonese
+ zh_Hans: 粤语
+ - value: wyw
+ label:
+ en_US: Classical Chinese
+ zh_Hans: 文言文
+ - value: jp
+ label:
+ en_US: Japanese
+ zh_Hans: 日语
+ - value: kor
+ label:
+ en_US: Korean
+ zh_Hans: 韩语
+ - value: fra
+ label:
+ en_US: French
+ zh_Hans: 法语
+ - value: spa
+ label:
+ en_US: Spanish
+ zh_Hans: 西班牙语
+ - value: th
+ label:
+ en_US: Thai
+ zh_Hans: 泰语
+ - value: ara
+ label:
+ en_US: Arabic
+ zh_Hans: 阿拉伯语
+ - value: ru
+ label:
+ en_US: Russian
+ zh_Hans: 俄语
+ - value: pt
+ label:
+ en_US: Portuguese
+ zh_Hans: 葡萄牙语
+ - value: de
+ label:
+ en_US: German
+ zh_Hans: 德语
+ - value: it
+ label:
+ en_US: Italian
+ zh_Hans: 意大利语
+ - value: el
+ label:
+ en_US: Greek
+ zh_Hans: 希腊语
+ - value: nl
+ label:
+ en_US: Dutch
+ zh_Hans: 荷兰语
+ - value: pl
+ label:
+ en_US: Polish
+ zh_Hans: 波兰语
+ - value: bul
+ label:
+ en_US: Bulgarian
+ zh_Hans: 保加利亚语
+ - value: est
+ label:
+ en_US: Estonian
+ zh_Hans: 爱沙尼亚语
+ - value: dan
+ label:
+ en_US: Danish
+ zh_Hans: 丹麦语
+ - value: fin
+ label:
+ en_US: Finnish
+ zh_Hans: 芬兰语
+ - value: cs
+ label:
+ en_US: Czech
+ zh_Hans: 捷克语
+ - value: rom
+ label:
+ en_US: Romanian
+ zh_Hans: 罗马尼亚语
+ - value: slo
+ label:
+ en_US: Slovenian
+ zh_Hans: 斯洛文尼亚语
+ - value: swe
+ label:
+ en_US: Swedish
+ zh_Hans: 瑞典语
+ - value: hu
+ label:
+ en_US: Hungarian
+ zh_Hans: 匈牙利语
+ - value: vie
+ label:
+ en_US: Vietnamese
+ zh_Hans: 越南语
+ - name: to
+ type: select
+ required: true
+ label:
+ en_US: Destination language
+ zh_Hans: 目标语言
+ human_description:
+ en_US: The destination language of the input text
+ zh_Hans: 输入文本的目标语言
+ default: en
+ form: form
+ options:
+ - value: zh
+ label:
+ en_US: Chinese
+ zh_Hans: 中文
+ - value: en
+ label:
+ en_US: English
+ zh_Hans: 英语
+ - value: cht
+ label:
+ en_US: Traditional Chinese
+ zh_Hans: 繁体中文
+ - value: yue
+ label:
+ en_US: Cantonese
+ zh_Hans: 粤语
+ - value: wyw
+ label:
+ en_US: Classical Chinese
+ zh_Hans: 文言文
+ - value: jp
+ label:
+ en_US: Japanese
+ zh_Hans: 日语
+ - value: kor
+ label:
+ en_US: Korean
+ zh_Hans: 韩语
+ - value: fra
+ label:
+ en_US: French
+ zh_Hans: 法语
+ - value: spa
+ label:
+ en_US: Spanish
+ zh_Hans: 西班牙语
+ - value: th
+ label:
+ en_US: Thai
+ zh_Hans: 泰语
+ - value: ara
+ label:
+ en_US: Arabic
+ zh_Hans: 阿拉伯语
+ - value: ru
+ label:
+ en_US: Russian
+ zh_Hans: 俄语
+ - value: pt
+ label:
+ en_US: Portuguese
+ zh_Hans: 葡萄牙语
+ - value: de
+ label:
+ en_US: German
+ zh_Hans: 德语
+ - value: it
+ label:
+ en_US: Italian
+ zh_Hans: 意大利语
+ - value: el
+ label:
+ en_US: Greek
+ zh_Hans: 希腊语
+ - value: nl
+ label:
+ en_US: Dutch
+ zh_Hans: 荷兰语
+ - value: pl
+ label:
+ en_US: Polish
+ zh_Hans: 波兰语
+ - value: bul
+ label:
+ en_US: Bulgarian
+ zh_Hans: 保加利亚语
+ - value: est
+ label:
+ en_US: Estonian
+ zh_Hans: 爱沙尼亚语
+ - value: dan
+ label:
+ en_US: Danish
+ zh_Hans: 丹麦语
+ - value: fin
+ label:
+ en_US: Finnish
+ zh_Hans: 芬兰语
+ - value: cs
+ label:
+ en_US: Czech
+ zh_Hans: 捷克语
+ - value: rom
+ label:
+ en_US: Romanian
+ zh_Hans: 罗马尼亚语
+ - value: slo
+ label:
+ en_US: Slovenian
+ zh_Hans: 斯洛文尼亚语
+ - value: swe
+ label:
+ en_US: Swedish
+ zh_Hans: 瑞典语
+ - value: hu
+ label:
+ en_US: Hungarian
+ zh_Hans: 匈牙利语
+ - value: vie
+ label:
+ en_US: Vietnamese
+ zh_Hans: 越南语
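
For reference, a parameter dict matching the schema above: `q` is filled by the LLM, while `from` and `to` are form fields defaulting to `auto` and `en` (values here are purely illustrative):

```python
tool_parameters = {
    "q": "你好，世界",  # text to translate, supplied by the LLM
    "from": "auto",  # source language, auto-detected by default
    "to": "en",  # destination language
}
```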
diff --git a/api/core/tools/provider/builtin/vectorizer/tools/test_data.py b/api/core/tools/provider/builtin/vectorizer/tools/test_data.py
deleted file mode 100644
index 8effa9818a..0000000000
--- a/api/core/tools/provider/builtin/vectorizer/tools/test_data.py
+++ /dev/null
@@ -1 +0,0 @@
-VECTORIZER_ICON_PNG = "iVBORw0KGgoAAAANSUhEUgAAAGAAAABgCAYAAADimHc4AAAACXBIWXMAACxLAAAsSwGlPZapAAAAAXNSR0IArs4c6QAAAARnQU1BAACxjwv8YQUAAAboSURBVHgB7Z09bBxFFMffRoAvcQqbguBUxu4wCUikMCZ0TmQK4NLQJCJOlQIkokgEGhQ7NCFIKEhQuIqNnIaGMxRY2GVwmlggDHS+pIHELmIXMTEULPP3eeXz7e7szO7MvE1ufpKV03nuNn7/mfcxH7tEHo/H42lXgqwG1bGw65+/aTQM6K0gpJdCoi7ypCIMui5s9Qv9R1OVTqrVxoL1jPbpvH4hrIp/rnmj5+YOhTQ++1kwmdZgT9ovRi6EF4Xhv/XGL0Sv6OLXYMu0BokjYOSDcBQfJI8xhKFP/HAlqCW8v5vqubBr8yn6maCexxiIDR376LnWmBBzQZtPEvx+L3mMAleOZKb1/XgM2EOnyWMFZJKt78UEQKpJHisk2TYmgM967JFk2z3kYcULwIwXgBkvADNeAGa8AMw8Qcwc6N55/eAh0cYmGaOzQtR/kOhQX+M6+/c23r+3RlT/i2ipTrSyRqw4F+CwMMbgANHQwG7jRywLw/wqDDNzI79xYPjqa2L262jjtYzaT0QT3xEbsck4MXUakgWOvUx08liy0ZPYEKNhel4Y6AZpgR7/8Tvq1wEQ+sMJN6Nh9kqwy+bWYwAM8elZovNv6xmlU7iLs280RNO9ls51os/h/8eBVQEig8Dt5OXUsNrno2tluZw0cI3qUXKONQHy9sYkVHqnjntLA2LnFTAv1gSA+zBhfIDvkfVO/B4xRgWZn4fbe2WAnGJFAAxn03+I7PtUXdzE90Sjl4ne+6L4d5nCigAyYyHPn7tFdPN30uJwX/qI6jtISkQZFVLdhd9SrtNPTrFSB6QZBAaYntsptpAyfvk+KYOCamVR/XrNtLqepduiFnkh3g4iIw6YLAhlOJmKwB9zaarhApr/MPREjAZVisSU1s/KYsGzhmKXClYEWLm/8xpV7btXhcv5I7lt2vtJFA3q/T07r1HopdG5l5xhxQVdn28YFn8kBJCBOZmiPHio1m5QuJzlu9ntXApgZwSsNYJslvGjtjrfm8Sq4neceFUtz3dZCzwW09Gqo2hreuPN7HZRnNqa1BP1x8lhczVNK+zT0TqkjYAF4e7Okxoo2PZX5K4IrhNpb/P8FTK2S1+TcUq1HpBFmquJYo1qEYU6RVarJE0c2ooL7C5IRwBZ5nJ9joyRtk5hA3YBdHqWzG1gBKgE/bzMaK5LqMIugKrbUDHu59/YWVRBsWhrsYZdANV5HBUXYGNlC9dFBW8LdgH6FQVYUnQvkQgm3NH8YuO7bM4LsWZBfT3qRY9OxRyJgJRz+Ij+FDPEQ1C3GVMiWAVQ7f31u/ncytxi4wdZTbRGgdcHnpYLD/FcwSrAoOKizfKfVAiIF4kBMPK+Opfe1iWsMUB1BJh2BRgBabSNAOiFqkXYbcNFUF9P+u82FGdWTcEmgGrvh0FUppB1kC073muXEaDq/21kIjLxV9tFAC7/n5X6tkUM0PH/dcP+P0v41fvkFBYBVHs/MD0CDmVsOzEdb7JgEYDT/8uq4rpj44NSjwDTc/CyzV1gxbH7Ac4F0PH/S4ZHAOaFZLiY+2nFuQA6/t9kQMTCz1CG66tbWvWS4VwAVf9vugAbel6efqrsYbKBcwFeVNz8ajobyTppw2F84FQAnfl/kwER6wJZcWdBc7e2KZwKoOP/TVakWb0f7md+kVhwOwI0BDCFyq42rt4PSiuAiRGAEXdK4ZQlV+8HTgVwefwHvR7nhbOA0FwBGDgTIM/Z3SLXUj2hOW1wR10eSrs7Ou9eTB3jo/dzuh/gTABdn35c8dhpM3BxOmeTuXs/cDoCdDY4qe7l32pbaZxL1jF+GXo/cLotBcWVTiZU3T7RMn8rHiijW9FgauP4Ef1TLdhHWgacCgAj6tYCqGKjU/DNbqxIkMYZNs7MpxmnLuhmwYJna1dbdzHjY42hDL4/wqkA6HWuDkAngRH0iYVjRkVwnoZO/0gsuLwpkw7OBcAtwlwvfESHxctmfMBSiOG0oStj4HCF7T3+RWARwIU7QK/HbWlqls52mYJtezqMj3v34C5VOveFy8Ll4QoTsJ8Txp0RsW8/Os2im2LCtSC1RIqLw3RldTVplOKkPEYDhMAPqttnune2rzTv5Y+WKdEem2ixkWqZYSeDSUp3qwIYNOrR7cBjcbOORxkvADNeAGa8AMx4AZjxAjATf5Ab0Tp5rJBk2/iD3PAwYo8Vkmyb9CjDGfLYIaCp1rdiAnT8S5PeDVkgoDuVCsWeJxwToHZ163m3Z8hjloDGk54vn5gFbT/5eZw8phifvZz8XPlA9qmRj8JRCumi+OkljzbbrvxM0qPMm9rIqY6FXZubVBUinMbzcP3jbuXA6Mh2kMx07KPJJLfj8Xg8Hg/4H+KfFYb2WM4MAAAAAElFTkSuQmCC" # noqa: E501
diff --git a/api/core/tools/provider/builtin/vectorizer/tools/vectorizer.py b/api/core/tools/provider/builtin/vectorizer/tools/vectorizer.py
index 4bd601c0bd..c722cd36c8 100644
--- a/api/core/tools/provider/builtin/vectorizer/tools/vectorizer.py
+++ b/api/core/tools/provider/builtin/vectorizer/tools/vectorizer.py
@@ -1,11 +1,12 @@
-from base64 import b64decode
from typing import Any, Union
from httpx import post
+from core.file.enums import FileType
+from core.file.file_manager import download
+from core.tools.entities.common_entities import I18nObject
from core.tools.entities.tool_entities import ToolInvokeMessage, ToolParameter
-from core.tools.errors import ToolProviderCredentialValidationError
-from core.tools.provider.builtin.vectorizer.tools.test_data import VECTORIZER_ICON_PNG
+from core.tools.errors import ToolParameterValidationError
from core.tools.tool.builtin_tool import BuiltinTool
@@ -16,30 +17,30 @@ class VectorizerTool(BuiltinTool):
"""
invoke tools
"""
- api_key_name = self.runtime.credentials.get("api_key_name", None)
- api_key_value = self.runtime.credentials.get("api_key_value", None)
+ api_key_name = self.runtime.credentials.get("api_key_name")
+ api_key_value = self.runtime.credentials.get("api_key_value")
mode = tool_parameters.get("mode", "test")
- if mode == "production":
- mode = "preview"
-
- if not api_key_name or not api_key_value:
- raise ToolProviderCredentialValidationError("Please input api key name and value")
+ # image file for workflow mode
+ image = tool_parameters.get("image")
+ if image and image.type != FileType.IMAGE:
+ raise ToolParameterValidationError("Not a valid image")
+ # image_id for agent mode
image_id = tool_parameters.get("image_id", "")
- if not image_id:
- return self.create_text_message("Please input image id")
- if image_id.startswith("__test_"):
- image_binary = b64decode(VECTORIZER_ICON_PNG)
- else:
+ if image_id:
image_binary = self.get_variable_file(self.VariableKey.IMAGE)
if not image_binary:
return self.create_text_message("Image not found, please request user to generate image firstly.")
+ elif image:
+ image_binary = download(image)
+ else:
+ raise ToolParameterValidationError("Please provide either image or image_id")
response = post(
"https://vectorizer.ai/api/v1/vectorize",
+ data={"mode": mode},
files={"image": image_binary},
- data={"mode": mode} if mode == "test" else {},
auth=(api_key_name, api_key_value),
timeout=30,
)
@@ -59,11 +60,23 @@ class VectorizerTool(BuiltinTool):
return [
ToolParameter.get_simple_instance(
name="image_id",
- llm_description=f"the image id that you want to vectorize, \
- and the image id should be specified in \
+ llm_description=f"the image_id that you want to vectorize, \
+ and the image_id should be specified in \
{[i.name for i in self.list_default_image_variables()]}",
type=ToolParameter.ToolParameterType.SELECT,
- required=True,
+ required=False,
options=[i.name for i in self.list_default_image_variables()],
- )
+ ),
+ ToolParameter(
+ name="image",
+ label=I18nObject(en_US="image", zh_Hans="image"),
+ human_description=I18nObject(
+ en_US="The image to be converted.",
+ zh_Hans="要转换的图片。",
+ ),
+ type=ToolParameter.ToolParameterType.FILE,
+ form=ToolParameter.ToolParameterForm.LLM,
+ llm_description="you should not input this parameter. just input the image_id.",
+ required=False,
+ ),
]
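
Stripped of the tool plumbing, the Vectorizer.AI request that `_invoke` now issues reduces to the call below; credentials and the file path are placeholders, and `mode="test"` is the free integration-testing mode:

```python
from httpx import post

with open("input.png", "rb") as f:
    response = post(
        "https://vectorizer.ai/api/v1/vectorize",
        data={"mode": "test"},  # free test mode, no subscription required
        files={"image": f.read()},
        auth=("api_key_name", "api_key_value"),  # placeholder credentials
        timeout=30,
    )
svg_bytes = response.content  # SVG payload on success
```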
diff --git a/api/core/tools/provider/builtin/vectorizer/tools/vectorizer.yaml b/api/core/tools/provider/builtin/vectorizer/tools/vectorizer.yaml
index 4b4fb9e245..0afd1c201f 100644
--- a/api/core/tools/provider/builtin/vectorizer/tools/vectorizer.yaml
+++ b/api/core/tools/provider/builtin/vectorizer/tools/vectorizer.yaml
@@ -4,14 +4,21 @@ identity:
label:
en_US: Vectorizer.AI
zh_Hans: Vectorizer.AI
- pt_BR: Vectorizer.AI
description:
human:
en_US: Convert your PNG and JPG images to SVG vectors quickly and easily. Fully automatically. Using AI.
zh_Hans: 一个将 PNG 和 JPG 图像快速轻松地转换为 SVG 矢量图的工具。
- pt_BR: Convert your PNG and JPG images to SVG vectors quickly and easily. Fully automatically. Using AI.
llm: A tool for converting images to SVG vectors. you should input the image id as the input of this tool. the image id can be got from parameters.
parameters:
+ - name: image
+ type: file
+ label:
+ en_US: image
+ human_description:
+ en_US: The image to be converted.
+ zh_Hans: 要转换的图片。
+ llm_description: you should not input this parameter. just input the image_id.
+ form: llm
- name: mode
type: select
required: true
@@ -20,19 +27,15 @@ parameters:
label:
en_US: production
zh_Hans: 生产模式
- pt_BR: production
- value: test
label:
en_US: test
zh_Hans: 测试模式
- pt_BR: test
default: test
label:
en_US: Mode
zh_Hans: 模式
- pt_BR: Mode
human_description:
en_US: It is free to integrate with and test out the API in test mode, no subscription required.
zh_Hans: 在测试模式下,可以免费测试API。
- pt_BR: It is free to integrate with and test out the API in test mode, no subscription required.
form: form
diff --git a/api/core/tools/provider/builtin/vectorizer/vectorizer.py b/api/core/tools/provider/builtin/vectorizer/vectorizer.py
index 3b868572f9..8140348723 100644
--- a/api/core/tools/provider/builtin/vectorizer/vectorizer.py
+++ b/api/core/tools/provider/builtin/vectorizer/vectorizer.py
@@ -1,5 +1,7 @@
from typing import Any
+from core.file import File
+from core.file.enums import FileTransferMethod, FileType
from core.tools.errors import ToolProviderCredentialValidationError
from core.tools.provider.builtin.vectorizer.tools.vectorizer import VectorizerTool
from core.tools.provider.builtin_tool_provider import BuiltinToolProviderController
@@ -7,6 +9,12 @@ from core.tools.provider.builtin_tool_provider import BuiltinToolProviderControl
class VectorizerProvider(BuiltinToolProviderController):
def _validate_credentials(self, credentials: dict[str, Any]) -> None:
+ test_img = File(
+ tenant_id="__test_123",
+ remote_url="https://cloud.dify.ai/logo/logo-site.png",
+ type=FileType.IMAGE,
+ transfer_method=FileTransferMethod.REMOTE_URL,
+ )
try:
VectorizerTool().fork_tool_runtime(
runtime={
@@ -14,7 +22,7 @@ class VectorizerProvider(BuiltinToolProviderController):
}
).invoke(
user_id="",
- tool_parameters={"mode": "test", "image_id": "__test_123"},
+ tool_parameters={"mode": "test", "image": test_img},
)
except Exception as e:
raise ToolProviderCredentialValidationError(str(e))
diff --git a/api/core/tools/provider/builtin/vectorizer/vectorizer.yaml b/api/core/tools/provider/builtin/vectorizer/vectorizer.yaml
index 1257f8d285..94dae20876 100644
--- a/api/core/tools/provider/builtin/vectorizer/vectorizer.yaml
+++ b/api/core/tools/provider/builtin/vectorizer/vectorizer.yaml
@@ -4,11 +4,9 @@ identity:
label:
en_US: Vectorizer.AI
zh_Hans: Vectorizer.AI
- pt_BR: Vectorizer.AI
description:
en_US: Convert your PNG and JPG images to SVG vectors quickly and easily. Fully automatically. Using AI.
zh_Hans: 一个将 PNG 和 JPG 图像快速轻松地转换为 SVG 矢量图的工具。
- pt_BR: Convert your PNG and JPG images to SVG vectors quickly and easily. Fully automatically. Using AI.
icon: icon.png
tags:
- productivity
@@ -20,15 +18,12 @@ credentials_for_provider:
label:
en_US: Vectorizer.AI API Key name
zh_Hans: Vectorizer.AI API Key name
- pt_BR: Vectorizer.AI API Key name
placeholder:
en_US: Please input your Vectorizer.AI ApiKey name
zh_Hans: 请输入你的 Vectorizer.AI ApiKey name
- pt_BR: Please input your Vectorizer.AI ApiKey name
help:
en_US: Get your Vectorizer.AI API Key from Vectorizer.AI.
zh_Hans: 从 Vectorizer.AI 获取您的 Vectorizer.AI API Key。
- pt_BR: Get your Vectorizer.AI API Key from Vectorizer.AI.
url: https://vectorizer.ai/api
api_key_value:
type: secret-input
@@ -36,12 +31,9 @@ credentials_for_provider:
label:
en_US: Vectorizer.AI API Key
zh_Hans: Vectorizer.AI API Key
- pt_BR: Vectorizer.AI API Key
placeholder:
en_US: Please input your Vectorizer.AI ApiKey
zh_Hans: 请输入你的 Vectorizer.AI ApiKey
- pt_BR: Please input your Vectorizer.AI ApiKey
help:
en_US: Get your Vectorizer.AI API Key from Vectorizer.AI.
zh_Hans: 从 Vectorizer.AI 获取您的 Vectorizer.AI API Key。
- pt_BR: Get your Vectorizer.AI API Key from Vectorizer.AI.
diff --git a/api/core/tools/tool_manager.py b/api/core/tools/tool_manager.py
index 9e984732b7..63f7775164 100644
--- a/api/core/tools/tool_manager.py
+++ b/api/core/tools/tool_manager.py
@@ -242,11 +242,15 @@ class ToolManager:
parameters = tool_entity.get_all_runtime_parameters()
for parameter in parameters:
# check file types
- if parameter.type in {
- ToolParameter.ToolParameterType.SYSTEM_FILES,
- ToolParameter.ToolParameterType.FILE,
- ToolParameter.ToolParameterType.FILES,
- }:
+ if (
+ parameter.type
+ in {
+ ToolParameter.ToolParameterType.SYSTEM_FILES,
+ ToolParameter.ToolParameterType.FILE,
+ ToolParameter.ToolParameterType.FILES,
+ }
+ and parameter.required
+ ):
raise ValueError(f"file type parameter {parameter.name} not supported in agent")
if parameter.form == ToolParameter.ToolParameterForm.FORM:
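
The relaxed guard, isolated as a sketch: a file-typed parameter now blocks agent use only when it is also required, which is exactly what lets the vectorizer's new optional `image` parameter through:

```python
from core.tools.entities.tool_entities import ToolParameter


def rejects_in_agent(parameter: ToolParameter) -> bool:
    # Mirrors the condition above: file-typed AND required -> unusable in agent mode.
    file_types = {
        ToolParameter.ToolParameterType.SYSTEM_FILES,
        ToolParameter.ToolParameterType.FILE,
        ToolParameter.ToolParameterType.FILES,
    }
    return parameter.type in file_types and parameter.required
```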
diff --git a/api/core/workflow/nodes/answer/answer_stream_generate_router.py b/api/core/workflow/nodes/answer/answer_stream_generate_router.py
index bc4b056148..96e24a7db3 100644
--- a/api/core/workflow/nodes/answer/answer_stream_generate_router.py
+++ b/api/core/workflow/nodes/answer/answer_stream_generate_router.py
@@ -153,6 +153,7 @@ class AnswerStreamGeneratorRouter:
NodeType.IF_ELSE,
NodeType.QUESTION_CLASSIFIER,
NodeType.ITERATION,
+ NodeType.CONVERSATION_VARIABLE_ASSIGNER,
}:
answer_dependencies[answer_node_id].append(source_node_id)
else:
diff --git a/api/core/workflow/nodes/http_request/entities.py b/api/core/workflow/nodes/http_request/entities.py
index dec76a277e..36ded104c1 100644
--- a/api/core/workflow/nodes/http_request/entities.py
+++ b/api/core/workflow/nodes/http_request/entities.py
@@ -94,7 +94,7 @@ class Response:
@property
def is_file(self):
content_type = self.content_type
- content_disposition = self.response.headers.get("Content-Disposition", "")
+ content_disposition = self.response.headers.get("content-disposition", "")
return "attachment" in content_disposition or (
not any(non_file in content_type for non_file in NON_FILE_CONTENT_TYPES)
@@ -103,7 +103,7 @@ class Response:
@property
def content_type(self) -> str:
- return self.headers.get("Content-Type", "")
+ return self.headers.get("content-type", "")
@property
def text(self) -> str:
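
The lowercase keys are safe either way: `httpx.Headers` is case-insensitive, so this change merely aligns the lookups with httpx's normalized form:

```python
import httpx

headers = httpx.Headers({"Content-Type": "application/pdf"})
# Both spellings resolve to the same entry.
assert headers.get("content-type") == headers.get("Content-Type") == "application/pdf"
```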
diff --git a/api/core/workflow/nodes/http_request/executor.py b/api/core/workflow/nodes/http_request/executor.py
index 0270d7e0fd..6872478299 100644
--- a/api/core/workflow/nodes/http_request/executor.py
+++ b/api/core/workflow/nodes/http_request/executor.py
@@ -33,7 +33,7 @@ class Executor:
params: Mapping[str, str] | None
content: str | bytes | None
data: Mapping[str, Any] | None
- files: Mapping[str, bytes] | None
+ files: Mapping[str, tuple[str | None, bytes, str]] | None
json: Any
headers: dict[str, str]
auth: HttpRequestNodeAuthorization
@@ -141,7 +141,11 @@ class Executor:
files = {k: self.variable_pool.get_file(selector) for k, selector in file_selectors.items()}
files = {k: v for k, v in files.items() if v is not None}
files = {k: variable.value for k, variable in files.items()}
- files = {k: file_manager.download(v) for k, v in files.items() if v.related_id is not None}
+ files = {
+ k: (v.filename, file_manager.download(v), v.mime_type or "application/octet-stream")
+ for k, v in files.items()
+ if v.related_id is not None
+ }
self.data = form_data
self.files = files
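
The new `files` shape is httpx's multipart tuple form `(filename, content, content_type)`, so the remote server receives a real filename and MIME type instead of anonymous bytes:

```python
import httpx

# (filename, content, content_type): the three-tuple form httpx accepts for multipart.
files = {"file": ("report.pdf", b"%PDF-1.7 ...", "application/pdf")}
response = httpx.post("https://example.com/upload", files=files)  # illustrative URL
```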
diff --git a/api/core/workflow/nodes/http_request/node.py b/api/core/workflow/nodes/http_request/node.py
index 483d0e2b7e..a037bee665 100644
--- a/api/core/workflow/nodes/http_request/node.py
+++ b/api/core/workflow/nodes/http_request/node.py
@@ -142,10 +142,11 @@ class HttpRequestNode(BaseNode[HttpRequestNodeData]):
Extract files from response
"""
files = []
+ is_file = response.is_file
content_type = response.content_type
content = response.content
- if content_type:
+ if is_file and content_type:
# extract filename from url
filename = path.basename(url)
# extract extension if possible
diff --git a/api/core/workflow/nodes/llm/node.py b/api/core/workflow/nodes/llm/node.py
index abf77f3339..472587cb03 100644
--- a/api/core/workflow/nodes/llm/node.py
+++ b/api/core/workflow/nodes/llm/node.py
@@ -327,7 +327,7 @@ class LLMNode(BaseNode[LLMNodeData]):
if variable is None:
raise ValueError(f"Variable {variable_selector.variable} not found")
if isinstance(variable, NoneSegment):
- continue
+ inputs[variable_selector.variable] = ""
+ continue
inputs[variable_selector.variable] = variable.to_object()
memory = node_data.memory
diff --git a/api/extensions/ext_logging.py b/api/extensions/ext_logging.py
index 56b1d6bd28..0fa832f420 100644
--- a/api/extensions/ext_logging.py
+++ b/api/extensions/ext_logging.py
@@ -1,8 +1,10 @@
import logging
import os
import sys
+from datetime import datetime
from logging.handlers import RotatingFileHandler
+import pytz
from flask import Flask
from configs import dify_config
@@ -30,16 +32,10 @@ def init_app(app: Flask):
handlers=log_handlers,
force=True,
)
+
log_tz = dify_config.LOG_TZ
if log_tz:
- from datetime import datetime
-
- import pytz
-
- timezone = pytz.timezone(log_tz)
-
- def time_converter(seconds):
- return datetime.utcfromtimestamp(seconds).astimezone(timezone).timetuple()
-
for handler in logging.root.handlers:
- handler.formatter.converter = time_converter
+ handler.formatter.converter = lambda seconds: (
+ datetime.fromtimestamp(seconds, tz=pytz.UTC).astimezone(pytz.timezone(log_tz)).timetuple()
+ )
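
A standalone check of the converter wiring; the timezone name is only an example of what `LOG_TZ` may hold (an IANA name such as `America/New_York`):

```python
import logging
from datetime import datetime

import pytz

formatter = logging.Formatter("%(asctime)s %(message)s")
formatter.converter = lambda seconds: (
    datetime.fromtimestamp(seconds, tz=pytz.UTC)
    .astimezone(pytz.timezone("America/New_York"))
    .timetuple()
)
record = logging.LogRecord("demo", logging.INFO, __file__, 0, "hello", None, None)
print(formatter.format(record))  # asctime rendered in the configured timezone
```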
diff --git a/api/poetry.lock b/api/poetry.lock
index 618dbb4033..5b581b9965 100644
--- a/api/poetry.lock
+++ b/api/poetry.lock
@@ -1,4 +1,4 @@
-# This file is automatically @generated by Poetry 1.8.3 and should not be changed by hand.
+# This file is automatically @generated by Poetry 1.8.2 and should not be changed by hand.
[[package]]
name = "aiohappyeyeballs"
@@ -932,6 +932,10 @@ files = [
{file = "Brotli-1.1.0-cp310-cp310-musllinux_1_1_i686.whl", hash = "sha256:a37b8f0391212d29b3a91a799c8e4a2855e0576911cdfb2515487e30e322253d"},
{file = "Brotli-1.1.0-cp310-cp310-musllinux_1_1_ppc64le.whl", hash = "sha256:e84799f09591700a4154154cab9787452925578841a94321d5ee8fb9a9a328f0"},
{file = "Brotli-1.1.0-cp310-cp310-musllinux_1_1_x86_64.whl", hash = "sha256:f66b5337fa213f1da0d9000bc8dc0cb5b896b726eefd9c6046f699b169c41b9e"},
+ {file = "Brotli-1.1.0-cp310-cp310-musllinux_1_2_aarch64.whl", hash = "sha256:5dab0844f2cf82be357a0eb11a9087f70c5430b2c241493fc122bb6f2bb0917c"},
+ {file = "Brotli-1.1.0-cp310-cp310-musllinux_1_2_i686.whl", hash = "sha256:e4fe605b917c70283db7dfe5ada75e04561479075761a0b3866c081d035b01c1"},
+ {file = "Brotli-1.1.0-cp310-cp310-musllinux_1_2_ppc64le.whl", hash = "sha256:1e9a65b5736232e7a7f91ff3d02277f11d339bf34099a56cdab6a8b3410a02b2"},
+ {file = "Brotli-1.1.0-cp310-cp310-musllinux_1_2_x86_64.whl", hash = "sha256:58d4b711689366d4a03ac7957ab8c28890415e267f9b6589969e74b6e42225ec"},
{file = "Brotli-1.1.0-cp310-cp310-win32.whl", hash = "sha256:be36e3d172dc816333f33520154d708a2657ea63762ec16b62ece02ab5e4daf2"},
{file = "Brotli-1.1.0-cp310-cp310-win_amd64.whl", hash = "sha256:0c6244521dda65ea562d5a69b9a26120769b7a9fb3db2fe9545935ed6735b128"},
{file = "Brotli-1.1.0-cp311-cp311-macosx_10_9_universal2.whl", hash = "sha256:a3daabb76a78f829cafc365531c972016e4aa8d5b4bf60660ad8ecee19df7ccc"},
@@ -944,8 +948,14 @@ files = [
{file = "Brotli-1.1.0-cp311-cp311-musllinux_1_1_i686.whl", hash = "sha256:19c116e796420b0cee3da1ccec3b764ed2952ccfcc298b55a10e5610ad7885f9"},
{file = "Brotli-1.1.0-cp311-cp311-musllinux_1_1_ppc64le.whl", hash = "sha256:510b5b1bfbe20e1a7b3baf5fed9e9451873559a976c1a78eebaa3b86c57b4265"},
{file = "Brotli-1.1.0-cp311-cp311-musllinux_1_1_x86_64.whl", hash = "sha256:a1fd8a29719ccce974d523580987b7f8229aeace506952fa9ce1d53a033873c8"},
+ {file = "Brotli-1.1.0-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:c247dd99d39e0338a604f8c2b3bc7061d5c2e9e2ac7ba9cc1be5a69cb6cd832f"},
+ {file = "Brotli-1.1.0-cp311-cp311-musllinux_1_2_i686.whl", hash = "sha256:1b2c248cd517c222d89e74669a4adfa5577e06ab68771a529060cf5a156e9757"},
+ {file = "Brotli-1.1.0-cp311-cp311-musllinux_1_2_ppc64le.whl", hash = "sha256:2a24c50840d89ded6c9a8fdc7b6ed3692ed4e86f1c4a4a938e1e92def92933e0"},
+ {file = "Brotli-1.1.0-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:f31859074d57b4639318523d6ffdca586ace54271a73ad23ad021acd807eb14b"},
{file = "Brotli-1.1.0-cp311-cp311-win32.whl", hash = "sha256:39da8adedf6942d76dc3e46653e52df937a3c4d6d18fdc94a7c29d263b1f5b50"},
{file = "Brotli-1.1.0-cp311-cp311-win_amd64.whl", hash = "sha256:aac0411d20e345dc0920bdec5548e438e999ff68d77564d5e9463a7ca9d3e7b1"},
+ {file = "Brotli-1.1.0-cp312-cp312-macosx_10_13_universal2.whl", hash = "sha256:32d95b80260d79926f5fab3c41701dbb818fde1c9da590e77e571eefd14abe28"},
+ {file = "Brotli-1.1.0-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:b760c65308ff1e462f65d69c12e4ae085cff3b332d894637f6273a12a482d09f"},
{file = "Brotli-1.1.0-cp312-cp312-macosx_10_9_universal2.whl", hash = "sha256:316cc9b17edf613ac76b1f1f305d2a748f1b976b033b049a6ecdfd5612c70409"},
{file = "Brotli-1.1.0-cp312-cp312-macosx_10_9_x86_64.whl", hash = "sha256:caf9ee9a5775f3111642d33b86237b05808dafcd6268faa492250e9b78046eb2"},
{file = "Brotli-1.1.0-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:70051525001750221daa10907c77830bc889cb6d865cc0b813d9db7fefc21451"},
@@ -956,8 +966,24 @@ files = [
{file = "Brotli-1.1.0-cp312-cp312-musllinux_1_1_i686.whl", hash = "sha256:4093c631e96fdd49e0377a9c167bfd75b6d0bad2ace734c6eb20b348bc3ea180"},
{file = "Brotli-1.1.0-cp312-cp312-musllinux_1_1_ppc64le.whl", hash = "sha256:7e4c4629ddad63006efa0ef968c8e4751c5868ff0b1c5c40f76524e894c50248"},
{file = "Brotli-1.1.0-cp312-cp312-musllinux_1_1_x86_64.whl", hash = "sha256:861bf317735688269936f755fa136a99d1ed526883859f86e41a5d43c61d8966"},
+ {file = "Brotli-1.1.0-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:87a3044c3a35055527ac75e419dfa9f4f3667a1e887ee80360589eb8c90aabb9"},
+ {file = "Brotli-1.1.0-cp312-cp312-musllinux_1_2_i686.whl", hash = "sha256:c5529b34c1c9d937168297f2c1fde7ebe9ebdd5e121297ff9c043bdb2ae3d6fb"},
+ {file = "Brotli-1.1.0-cp312-cp312-musllinux_1_2_ppc64le.whl", hash = "sha256:ca63e1890ede90b2e4454f9a65135a4d387a4585ff8282bb72964fab893f2111"},
+ {file = "Brotli-1.1.0-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:e79e6520141d792237c70bcd7a3b122d00f2613769ae0cb61c52e89fd3443839"},
{file = "Brotli-1.1.0-cp312-cp312-win32.whl", hash = "sha256:5f4d5ea15c9382135076d2fb28dde923352fe02951e66935a9efaac8f10e81b0"},
{file = "Brotli-1.1.0-cp312-cp312-win_amd64.whl", hash = "sha256:906bc3a79de8c4ae5b86d3d75a8b77e44404b0f4261714306e3ad248d8ab0951"},
+ {file = "Brotli-1.1.0-cp313-cp313-macosx_10_13_universal2.whl", hash = "sha256:8bf32b98b75c13ec7cf774164172683d6e7891088f6316e54425fde1efc276d5"},
+ {file = "Brotli-1.1.0-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:7bc37c4d6b87fb1017ea28c9508b36bbcb0c3d18b4260fcdf08b200c74a6aee8"},
+ {file = "Brotli-1.1.0-cp313-cp313-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:3c0ef38c7a7014ffac184db9e04debe495d317cc9c6fb10071f7fefd93100a4f"},
+ {file = "Brotli-1.1.0-cp313-cp313-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:91d7cc2a76b5567591d12c01f019dd7afce6ba8cba6571187e21e2fc418ae648"},
+ {file = "Brotli-1.1.0-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:a93dde851926f4f2678e704fadeb39e16c35d8baebd5252c9fd94ce8ce68c4a0"},
+ {file = "Brotli-1.1.0-cp313-cp313-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:f0db75f47be8b8abc8d9e31bc7aad0547ca26f24a54e6fd10231d623f183d089"},
+ {file = "Brotli-1.1.0-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:6967ced6730aed543b8673008b5a391c3b1076d834ca438bbd70635c73775368"},
+ {file = "Brotli-1.1.0-cp313-cp313-musllinux_1_2_i686.whl", hash = "sha256:7eedaa5d036d9336c95915035fb57422054014ebdeb6f3b42eac809928e40d0c"},
+ {file = "Brotli-1.1.0-cp313-cp313-musllinux_1_2_ppc64le.whl", hash = "sha256:d487f5432bf35b60ed625d7e1b448e2dc855422e87469e3f450aa5552b0eb284"},
+ {file = "Brotli-1.1.0-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:832436e59afb93e1836081a20f324cb185836c617659b07b129141a8426973c7"},
+ {file = "Brotli-1.1.0-cp313-cp313-win32.whl", hash = "sha256:43395e90523f9c23a3d5bdf004733246fba087f2948f87ab28015f12359ca6a0"},
+ {file = "Brotli-1.1.0-cp313-cp313-win_amd64.whl", hash = "sha256:9011560a466d2eb3f5a6e4929cf4a09be405c64154e12df0dd72713f6500e32b"},
{file = "Brotli-1.1.0-cp36-cp36m-macosx_10_9_x86_64.whl", hash = "sha256:a090ca607cbb6a34b0391776f0cb48062081f5f60ddcce5d11838e67a01928d1"},
{file = "Brotli-1.1.0-cp36-cp36m-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:2de9d02f5bda03d27ede52e8cfe7b865b066fa49258cbab568720aa5be80a47d"},
{file = "Brotli-1.1.0-cp36-cp36m-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:2333e30a5e00fe0fe55903c8832e08ee9c3b1382aacf4db26664a16528d51b4b"},
@@ -967,6 +993,10 @@ files = [
{file = "Brotli-1.1.0-cp36-cp36m-musllinux_1_1_i686.whl", hash = "sha256:fd5f17ff8f14003595ab414e45fce13d073e0762394f957182e69035c9f3d7c2"},
{file = "Brotli-1.1.0-cp36-cp36m-musllinux_1_1_ppc64le.whl", hash = "sha256:069a121ac97412d1fe506da790b3e69f52254b9df4eb665cd42460c837193354"},
{file = "Brotli-1.1.0-cp36-cp36m-musllinux_1_1_x86_64.whl", hash = "sha256:e93dfc1a1165e385cc8239fab7c036fb2cd8093728cbd85097b284d7b99249a2"},
+ {file = "Brotli-1.1.0-cp36-cp36m-musllinux_1_2_aarch64.whl", hash = "sha256:aea440a510e14e818e67bfc4027880e2fb500c2ccb20ab21c7a7c8b5b4703d75"},
+ {file = "Brotli-1.1.0-cp36-cp36m-musllinux_1_2_i686.whl", hash = "sha256:6974f52a02321b36847cd19d1b8e381bf39939c21efd6ee2fc13a28b0d99348c"},
+ {file = "Brotli-1.1.0-cp36-cp36m-musllinux_1_2_ppc64le.whl", hash = "sha256:a7e53012d2853a07a4a79c00643832161a910674a893d296c9f1259859a289d2"},
+ {file = "Brotli-1.1.0-cp36-cp36m-musllinux_1_2_x86_64.whl", hash = "sha256:d7702622a8b40c49bffb46e1e3ba2e81268d5c04a34f460978c6b5517a34dd52"},
{file = "Brotli-1.1.0-cp36-cp36m-win32.whl", hash = "sha256:a599669fd7c47233438a56936988a2478685e74854088ef5293802123b5b2460"},
{file = "Brotli-1.1.0-cp36-cp36m-win_amd64.whl", hash = "sha256:d143fd47fad1db3d7c27a1b1d66162e855b5d50a89666af46e1679c496e8e579"},
{file = "Brotli-1.1.0-cp37-cp37m-macosx_10_9_x86_64.whl", hash = "sha256:11d00ed0a83fa22d29bc6b64ef636c4552ebafcef57154b4ddd132f5638fbd1c"},
@@ -978,6 +1008,10 @@ files = [
{file = "Brotli-1.1.0-cp37-cp37m-musllinux_1_1_i686.whl", hash = "sha256:919e32f147ae93a09fe064d77d5ebf4e35502a8df75c29fb05788528e330fe74"},
{file = "Brotli-1.1.0-cp37-cp37m-musllinux_1_1_ppc64le.whl", hash = "sha256:23032ae55523cc7bccb4f6a0bf368cd25ad9bcdcc1990b64a647e7bbcce9cb5b"},
{file = "Brotli-1.1.0-cp37-cp37m-musllinux_1_1_x86_64.whl", hash = "sha256:224e57f6eac61cc449f498cc5f0e1725ba2071a3d4f48d5d9dffba42db196438"},
+ {file = "Brotli-1.1.0-cp37-cp37m-musllinux_1_2_aarch64.whl", hash = "sha256:cb1dac1770878ade83f2ccdf7d25e494f05c9165f5246b46a621cc849341dc01"},
+ {file = "Brotli-1.1.0-cp37-cp37m-musllinux_1_2_i686.whl", hash = "sha256:3ee8a80d67a4334482d9712b8e83ca6b1d9bc7e351931252ebef5d8f7335a547"},
+ {file = "Brotli-1.1.0-cp37-cp37m-musllinux_1_2_ppc64le.whl", hash = "sha256:5e55da2c8724191e5b557f8e18943b1b4839b8efc3ef60d65985bcf6f587dd38"},
+ {file = "Brotli-1.1.0-cp37-cp37m-musllinux_1_2_x86_64.whl", hash = "sha256:d342778ef319e1026af243ed0a07c97acf3bad33b9f29e7ae6a1f68fd083e90c"},
{file = "Brotli-1.1.0-cp37-cp37m-win32.whl", hash = "sha256:587ca6d3cef6e4e868102672d3bd9dc9698c309ba56d41c2b9c85bbb903cdb95"},
{file = "Brotli-1.1.0-cp37-cp37m-win_amd64.whl", hash = "sha256:2954c1c23f81c2eaf0b0717d9380bd348578a94161a65b3a2afc62c86467dd68"},
{file = "Brotli-1.1.0-cp38-cp38-macosx_10_9_universal2.whl", hash = "sha256:efa8b278894b14d6da122a72fefcebc28445f2d3f880ac59d46c90f4c13be9a3"},
@@ -990,6 +1024,10 @@ files = [
{file = "Brotli-1.1.0-cp38-cp38-musllinux_1_1_i686.whl", hash = "sha256:1ab4fbee0b2d9098c74f3057b2bc055a8bd92ccf02f65944a241b4349229185a"},
{file = "Brotli-1.1.0-cp38-cp38-musllinux_1_1_ppc64le.whl", hash = "sha256:141bd4d93984070e097521ed07e2575b46f817d08f9fa42b16b9b5f27b5ac088"},
{file = "Brotli-1.1.0-cp38-cp38-musllinux_1_1_x86_64.whl", hash = "sha256:fce1473f3ccc4187f75b4690cfc922628aed4d3dd013d047f95a9b3919a86596"},
+ {file = "Brotli-1.1.0-cp38-cp38-musllinux_1_2_aarch64.whl", hash = "sha256:d2b35ca2c7f81d173d2fadc2f4f31e88cc5f7a39ae5b6db5513cf3383b0e0ec7"},
+ {file = "Brotli-1.1.0-cp38-cp38-musllinux_1_2_i686.whl", hash = "sha256:af6fa6817889314555aede9a919612b23739395ce767fe7fcbea9a80bf140fe5"},
+ {file = "Brotli-1.1.0-cp38-cp38-musllinux_1_2_ppc64le.whl", hash = "sha256:2feb1d960f760a575dbc5ab3b1c00504b24caaf6986e2dc2b01c09c87866a943"},
+ {file = "Brotli-1.1.0-cp38-cp38-musllinux_1_2_x86_64.whl", hash = "sha256:4410f84b33374409552ac9b6903507cdb31cd30d2501fc5ca13d18f73548444a"},
{file = "Brotli-1.1.0-cp38-cp38-win32.whl", hash = "sha256:db85ecf4e609a48f4b29055f1e144231b90edc90af7481aa731ba2d059226b1b"},
{file = "Brotli-1.1.0-cp38-cp38-win_amd64.whl", hash = "sha256:3d7954194c36e304e1523f55d7042c59dc53ec20dd4e9ea9d151f1b62b4415c0"},
{file = "Brotli-1.1.0-cp39-cp39-macosx_10_9_universal2.whl", hash = "sha256:5fb2ce4b8045c78ebbc7b8f3c15062e435d47e7393cc57c25115cfd49883747a"},
@@ -1002,6 +1040,10 @@ files = [
{file = "Brotli-1.1.0-cp39-cp39-musllinux_1_1_i686.whl", hash = "sha256:949f3b7c29912693cee0afcf09acd6ebc04c57af949d9bf77d6101ebb61e388c"},
{file = "Brotli-1.1.0-cp39-cp39-musllinux_1_1_ppc64le.whl", hash = "sha256:89f4988c7203739d48c6f806f1e87a1d96e0806d44f0fba61dba81392c9e474d"},
{file = "Brotli-1.1.0-cp39-cp39-musllinux_1_1_x86_64.whl", hash = "sha256:de6551e370ef19f8de1807d0a9aa2cdfdce2e85ce88b122fe9f6b2b076837e59"},
+ {file = "Brotli-1.1.0-cp39-cp39-musllinux_1_2_aarch64.whl", hash = "sha256:0737ddb3068957cf1b054899b0883830bb1fec522ec76b1098f9b6e0f02d9419"},
+ {file = "Brotli-1.1.0-cp39-cp39-musllinux_1_2_i686.whl", hash = "sha256:4f3607b129417e111e30637af1b56f24f7a49e64763253bbc275c75fa887d4b2"},
+ {file = "Brotli-1.1.0-cp39-cp39-musllinux_1_2_ppc64le.whl", hash = "sha256:6c6e0c425f22c1c719c42670d561ad682f7bfeeef918edea971a79ac5252437f"},
+ {file = "Brotli-1.1.0-cp39-cp39-musllinux_1_2_x86_64.whl", hash = "sha256:494994f807ba0b92092a163a0a283961369a65f6cbe01e8891132b7a320e61eb"},
{file = "Brotli-1.1.0-cp39-cp39-win32.whl", hash = "sha256:f0d8a7a6b5983c2496e364b969f0e526647a06b075d034f3297dc66f3b360c64"},
{file = "Brotli-1.1.0-cp39-cp39-win_amd64.whl", hash = "sha256:cdad5b9014d83ca68c25d2e9444e28e967ef16e80f6b436918c700c117a85467"},
{file = "Brotli-1.1.0.tar.gz", hash = "sha256:81de08ac11bcb85841e440c13611c00b67d3bf82698314928d0b676362546724"},
@@ -1801,6 +1843,46 @@ requests = ">=2.8"
six = "*"
xmltodict = "*"
+[[package]]
+name = "couchbase"
+version = "4.3.3"
+description = "Python Client for Couchbase"
+optional = false
+python-versions = ">=3.7"
+files = [
+ {file = "couchbase-4.3.3-cp310-cp310-macosx_10_15_x86_64.whl", hash = "sha256:d8069e4f01332859d56cca597874645c914699162b3979d1b432f0dfc186b124"},
+ {file = "couchbase-4.3.3-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:1caa6cfef49c785b35b1702102f718227f351df87bba2694b9334520c41e9eb5"},
+ {file = "couchbase-4.3.3-cp310-cp310-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:f4a9a65c44935249fa078fb90a3c28ea71da9d2d5889fcd514b12d0538010ae0"},
+ {file = "couchbase-4.3.3-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:4f144b8c482c18283d8e419b844630d41f3249b07d43d40b5e3535444e57d0fb"},
+ {file = "couchbase-4.3.3-cp310-cp310-musllinux_1_1_x86_64.whl", hash = "sha256:1c534fba6fdc7cf47eed9dee8a57d1e9eb867bf008574e321fa380a77cebf32f"},
+ {file = "couchbase-4.3.3-cp310-cp310-win_amd64.whl", hash = "sha256:b841be06e0e4370b69ebef6bca3409c378186f7d6e964cd645ba18e97216c022"},
+ {file = "couchbase-4.3.3-cp311-cp311-macosx_10_15_x86_64.whl", hash = "sha256:eee7a73b3acbdc78ae314fddf7f975b3c9e05df07df255f4dcc878939a2abae0"},
+ {file = "couchbase-4.3.3-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:53417cafcf90ff4e2fd81ebba2a08b7ad56f17160d1c5019ad3b09c758aeb363"},
+ {file = "couchbase-4.3.3-cp311-cp311-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:0cefd13bea8b0f150f1b9d27fd7614f971f77419b31817781d26ba315ed658bb"},
+ {file = "couchbase-4.3.3-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:78fa1054d7740e2fe38fce0a2aab4e9a2d30263d894e0615ee5df297f02f59a3"},
+ {file = "couchbase-4.3.3-cp311-cp311-musllinux_1_1_x86_64.whl", hash = "sha256:eb093899cfad5a7472258a9b6a57775dbf23a6e0180241507ba89ce3ab241e41"},
+ {file = "couchbase-4.3.3-cp311-cp311-win_amd64.whl", hash = "sha256:f7cfbdc699af5715f49365ffbb05a6a7366a534c0d7161edf270ad3e735a6c5d"},
+ {file = "couchbase-4.3.3-cp312-cp312-macosx_10_15_x86_64.whl", hash = "sha256:58352cae9b8affdaa2ac012e0a03c8c2632ee6297a878232888b4e0360d0d5df"},
+ {file = "couchbase-4.3.3-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:728e7e3b5e1682706cb9d63993d289226d02a25089527b8ecb4e3889dabc38cf"},
+ {file = "couchbase-4.3.3-cp312-cp312-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:73014bf098cf14187a39cc13453e0d859c1d54568df28f69cc308a9a5f24feb2"},
+ {file = "couchbase-4.3.3-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:a743375804068ae01b73c916bfca738764c8c12f381bb399ef04e784935856a1"},
+ {file = "couchbase-4.3.3-cp312-cp312-musllinux_1_1_x86_64.whl", hash = "sha256:394c122cfe02a76a99e7d5178e64129f6da49843225e78d8629abcab556c24af"},
+ {file = "couchbase-4.3.3-cp312-cp312-win_amd64.whl", hash = "sha256:bf85d7a5cda548d9801614651206068b4445fa37972e62b14d7521a958198693"},
+ {file = "couchbase-4.3.3-cp38-cp38-macosx_10_15_x86_64.whl", hash = "sha256:92d23c9cedd571631070791f2afee0e3d7d8c9ce1bf2ea6e9a4f2fdbc37a0f1e"},
+ {file = "couchbase-4.3.3-cp38-cp38-macosx_11_0_arm64.whl", hash = "sha256:38c42eb29a73cce2998ae5df45bd61b16dce9765d3bff968ec5cf6a622faa291"},
+ {file = "couchbase-4.3.3-cp38-cp38-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:afed137bf0edc642d7b201b6ab7b1e7117bb4c8eac6b2f253cc6e106f334a2a1"},
+ {file = "couchbase-4.3.3-cp38-cp38-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:954d991377d47883aaf903934c5d0f19577680a2abf80d3ce5bb9b3c80991fc7"},
+ {file = "couchbase-4.3.3-cp38-cp38-musllinux_1_1_x86_64.whl", hash = "sha256:d5552b9fa684630698dc98d6f3b1082540634c1b7ad5bf53b843b5da57b0169c"},
+ {file = "couchbase-4.3.3-cp38-cp38-win_amd64.whl", hash = "sha256:f88f2b7e0c894f7237d9f3fb5c46abc44b8151a97b3ca8e75f57d23ebf59f9da"},
+ {file = "couchbase-4.3.3-cp39-cp39-macosx_10_15_x86_64.whl", hash = "sha256:769e1e2367ea1d4de181fcd4b4e353e9abef97d15b581a6c5aea49ece3dc7d59"},
+ {file = "couchbase-4.3.3-cp39-cp39-macosx_11_0_arm64.whl", hash = "sha256:47f59a0b35ffce060583fd11f98f049f3b70701cf14aab9ac092594aca486aeb"},
+ {file = "couchbase-4.3.3-cp39-cp39-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:440bb93d611827ba0ea2403c6f204fe931467a6cb5811f0e03bf1779204ef843"},
+ {file = "couchbase-4.3.3-cp39-cp39-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:cdb4dde62e1d41c0b8707121ab68fa78b7a1508541bd48fc850be396f91bc8d9"},
+ {file = "couchbase-4.3.3-cp39-cp39-musllinux_1_1_x86_64.whl", hash = "sha256:7f8cf45f317b39cc19db5c67b565662f08d6c90305b3aa14e04bc22707258213"},
+ {file = "couchbase-4.3.3-cp39-cp39-win_amd64.whl", hash = "sha256:c97d48ad486c8f201b4482d5594258f949369cb44792ed148d5159a3d12ae21b"},
+ {file = "couchbase-4.3.3.tar.gz", hash = "sha256:27808500551564b39b46943cf3daab572694889c1eb638425d363edb48b20da7"},
+]
+
[[package]]
name = "coverage"
version = "7.2.7"
@@ -6850,6 +6932,19 @@ files = [
{file = "pyarrow-17.0.0-cp312-cp312-win_amd64.whl", hash = "sha256:392bc9feabc647338e6c89267635e111d71edad5fcffba204425a7c8d13610d7"},
{file = "pyarrow-17.0.0-cp38-cp38-macosx_10_15_x86_64.whl", hash = "sha256:af5ff82a04b2171415f1410cff7ebb79861afc5dae50be73ce06d6e870615204"},
{file = "pyarrow-17.0.0-cp38-cp38-macosx_11_0_arm64.whl", hash = "sha256:edca18eaca89cd6382dfbcff3dd2d87633433043650c07375d095cd3517561d8"},
+ {file = "pyarrow-17.0.0-cp38-cp38-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:7c7916bff914ac5d4a8fe25b7a25e432ff921e72f6f2b7547d1e325c1ad9d155"},
+ {file = "pyarrow-17.0.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:f553ca691b9e94b202ff741bdd40f6ccb70cdd5fbf65c187af132f1317de6145"},
+ {file = "pyarrow-17.0.0-cp38-cp38-manylinux_2_28_aarch64.whl", hash = "sha256:0cdb0e627c86c373205a2f94a510ac4376fdc523f8bb36beab2e7f204416163c"},
+ {file = "pyarrow-17.0.0-cp38-cp38-manylinux_2_28_x86_64.whl", hash = "sha256:d7d192305d9d8bc9082d10f361fc70a73590a4c65cf31c3e6926cd72b76bc35c"},
+ {file = "pyarrow-17.0.0-cp38-cp38-win_amd64.whl", hash = "sha256:02dae06ce212d8b3244dd3e7d12d9c4d3046945a5933d28026598e9dbbda1fca"},
+ {file = "pyarrow-17.0.0-cp39-cp39-macosx_10_15_x86_64.whl", hash = "sha256:13d7a460b412f31e4c0efa1148e1d29bdf18ad1411eb6757d38f8fbdcc8645fb"},
+ {file = "pyarrow-17.0.0-cp39-cp39-macosx_11_0_arm64.whl", hash = "sha256:9b564a51fbccfab5a04a80453e5ac6c9954a9c5ef2890d1bcf63741909c3f8df"},
+ {file = "pyarrow-17.0.0-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:32503827abbc5aadedfa235f5ece8c4f8f8b0a3cf01066bc8d29de7539532687"},
+ {file = "pyarrow-17.0.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:a155acc7f154b9ffcc85497509bcd0d43efb80d6f733b0dc3bb14e281f131c8b"},
+ {file = "pyarrow-17.0.0-cp39-cp39-manylinux_2_28_aarch64.whl", hash = "sha256:dec8d129254d0188a49f8a1fc99e0560dc1b85f60af729f47de4046015f9b0a5"},
+ {file = "pyarrow-17.0.0-cp39-cp39-manylinux_2_28_x86_64.whl", hash = "sha256:a48ddf5c3c6a6c505904545c25a4ae13646ae1f8ba703c4df4a1bfe4f4006bda"},
+ {file = "pyarrow-17.0.0-cp39-cp39-win_amd64.whl", hash = "sha256:42bf93249a083aca230ba7e2786c5f673507fa97bbd9725a1e2754715151a204"},
+ {file = "pyarrow-17.0.0.tar.gz", hash = "sha256:4beca9521ed2c0921c1023e68d097d0299b62c362639ea315572a58f3f50fd28"},
]
[package.dependencies]
@@ -7216,6 +7311,22 @@ files = [
ed25519 = ["PyNaCl (>=1.4.0)"]
rsa = ["cryptography"]
+[[package]]
+name = "pyobvector"
+version = "0.1.6"
+description = "A python SDK for OceanBase Vector Store, based on SQLAlchemy, compatible with Milvus API."
+optional = false
+python-versions = "<4.0,>=3.9"
+files = [
+ {file = "pyobvector-0.1.6-py3-none-any.whl", hash = "sha256:0d700e865a85b4716b9a03384189e49288cd9d5f3cef88aed4740bc82d5fd136"},
+ {file = "pyobvector-0.1.6.tar.gz", hash = "sha256:05551addcac8c596992d5e38b480c83ca3481c6cfc6f56a1a1bddfb2e6ae037e"},
+]
+
+[package.dependencies]
+numpy = ">=1.26.0,<2.0.0"
+pymysql = ">=1.1.1,<2.0.0"
+sqlalchemy = ">=2.0.32,<3.0.0"
+
[[package]]
name = "pyopenssl"
version = "24.2.1"
@@ -8624,6 +8735,11 @@ files = [
{file = "scikit_learn-1.5.2-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:f60021ec1574e56632be2a36b946f8143bf4e5e6af4a06d85281adc22938e0dd"},
{file = "scikit_learn-1.5.2-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:394397841449853c2290a32050382edaec3da89e35b3e03d6cc966aebc6a8ae6"},
{file = "scikit_learn-1.5.2-cp312-cp312-win_amd64.whl", hash = "sha256:57cc1786cfd6bd118220a92ede80270132aa353647684efa385a74244a41e3b1"},
+ {file = "scikit_learn-1.5.2-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:e9a702e2de732bbb20d3bad29ebd77fc05a6b427dc49964300340e4c9328b3f5"},
+ {file = "scikit_learn-1.5.2-cp313-cp313-macosx_12_0_arm64.whl", hash = "sha256:b0768ad641981f5d3a198430a1d31c3e044ed2e8a6f22166b4d546a5116d7908"},
+ {file = "scikit_learn-1.5.2-cp313-cp313-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:178ddd0a5cb0044464fc1bfc4cca5b1833bfc7bb022d70b05db8530da4bb3dd3"},
+ {file = "scikit_learn-1.5.2-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:f7284ade780084d94505632241bf78c44ab3b6f1e8ccab3d2af58e0e950f9c12"},
+ {file = "scikit_learn-1.5.2-cp313-cp313-win_amd64.whl", hash = "sha256:b7b0f9a0b1040830d38c39b91b3a44e1b643f4b36e36567b80b7c6bd2202a27f"},
{file = "scikit_learn-1.5.2-cp39-cp39-macosx_10_9_x86_64.whl", hash = "sha256:757c7d514ddb00ae249832fe87100d9c73c6ea91423802872d9e74970a0e40b9"},
{file = "scikit_learn-1.5.2-cp39-cp39-macosx_12_0_arm64.whl", hash = "sha256:52788f48b5d8bca5c0736c175fa6bdaab2ef00a8f536cda698db61bd89c551c1"},
{file = "scikit_learn-1.5.2-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:643964678f4b5fbdc95cbf8aec638acc7aa70f5f79ee2cdad1eec3df4ba6ead8"},
@@ -10866,4 +10982,4 @@ cffi = ["cffi (>=1.11)"]
[metadata]
lock-version = "2.0"
python-versions = ">=3.10,<3.13"
-content-hash = "1b268122d3d4771ba219f0e983322e0454b7b8644dba35da38d7d950d489e1ba"
+content-hash = "ef927b98c33d704d680e08db0e5c7d9a4e05454c66fcd6a5f656a65eb08e886b"
diff --git a/api/pyproject.toml b/api/pyproject.toml
index a549601535..ee7cf4d618 100644
--- a/api/pyproject.toml
+++ b/api/pyproject.toml
@@ -239,6 +239,7 @@ alibabacloud_gpdb20160503 = "~3.8.0"
alibabacloud_tea_openapi = "~0.3.9"
chromadb = "0.5.1"
clickhouse-connect = "~0.7.16"
+couchbase = "~4.3.0"
elasticsearch = "8.14.0"
opensearch-py = "2.4.0"
oracledb = "~2.2.1"
@@ -246,6 +247,7 @@ pgvecto-rs = { version = "~0.2.1", extras = ['sqlalchemy'] }
pgvector = "0.2.5"
pymilvus = "~2.4.4"
pymochow = "1.3.1"
+pyobvector = "~0.1.6"
qdrant-client = "1.7.3"
tcvectordb = "1.3.2"
tidb-vector = "0.0.9"
diff --git a/api/pytest.ini b/api/pytest.ini
index dcca08e2e5..a23a4b3f3d 100644
--- a/api/pytest.ini
+++ b/api/pytest.ini
@@ -27,3 +27,4 @@ env =
XINFERENCE_GENERATION_MODEL_UID = generate
XINFERENCE_RERANK_MODEL_UID = rerank
XINFERENCE_SERVER_URL = http://a.abc.com:11451
+ GITEE_AI_API_KEY = aaaaaaaaaaaaaaaaaaaa
diff --git a/api/services/external_knowledge_service.py b/api/services/external_knowledge_service.py
index 4efdf8d7db..b49738c61c 100644
--- a/api/services/external_knowledge_service.py
+++ b/api/services/external_knowledge_service.py
@@ -6,6 +6,8 @@ from typing import Any, Optional, Union
import httpx
import validators
+from constants import HIDDEN_VALUE
+
# from tasks.external_document_indexing_task import external_document_indexing_task
from core.helper import ssrf_proxy
from extensions.ext_database import db
@@ -68,7 +70,7 @@ class ExternalDatasetService:
endpoint = f"{settings['endpoint']}/retrieval"
api_key = settings["api_key"]
- if not validators.url(endpoint):
+ if not validators.url(endpoint, simple_host=True):
raise ValueError(f"invalid endpoint: {endpoint}")
try:
response = httpx.post(endpoint, headers={"Authorization": f"Bearer {api_key}"})
@@ -92,6 +94,8 @@ class ExternalDatasetService:
).first()
if external_knowledge_api is None:
raise ValueError("api template not found")
+ if args.get("settings") and args.get("settings").get("api_key") == HIDDEN_VALUE:
+ args.get("settings")["api_key"] = external_knowledge_api.settings_dict.get("api_key")
external_knowledge_api.name = args.get("name")
external_knowledge_api.description = args.get("description", "")
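
Judging from its use here, `HIDDEN_VALUE` is the sentinel the API layer sends back in place of stored secrets, so an update payload that echoes it means "keep the existing key". The round-trip, sketched with the field names used above:

```python
# Update payload as it arrives back from the UI after a read-modify-write cycle:
settings = {"endpoint": "https://kb.example.com/retrieval", "api_key": HIDDEN_VALUE}

# Echoing the sentinel preserves the stored secret instead of overwriting it:
if settings.get("api_key") == HIDDEN_VALUE:
    settings["api_key"] = external_knowledge_api.settings_dict.get("api_key")
```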
diff --git a/api/tests/integration_tests/.env.example b/api/tests/integration_tests/.env.example
index 2d52399d29..6791cd891b 100644
--- a/api/tests/integration_tests/.env.example
+++ b/api/tests/integration_tests/.env.example
@@ -83,3 +83,6 @@ VOLC_EMBEDDING_ENDPOINT_ID=
# 360 AI Credentials
ZHINAO_API_KEY=
+
+# Gitee AI Credentials
+GITEE_AI_API_KEY=
diff --git a/api/tests/integration_tests/model_runtime/gitee_ai/__init__.py b/api/tests/integration_tests/model_runtime/gitee_ai/__init__.py
new file mode 100644
index 0000000000..e69de29bb2
diff --git a/api/tests/integration_tests/model_runtime/gitee_ai/test_llm.py b/api/tests/integration_tests/model_runtime/gitee_ai/test_llm.py
new file mode 100644
index 0000000000..753c52ce31
--- /dev/null
+++ b/api/tests/integration_tests/model_runtime/gitee_ai/test_llm.py
@@ -0,0 +1,132 @@
+import os
+from collections.abc import Generator
+
+import pytest
+
+from core.model_runtime.entities.llm_entities import LLMResult, LLMResultChunk, LLMResultChunkDelta
+from core.model_runtime.entities.message_entities import (
+ AssistantPromptMessage,
+ PromptMessageTool,
+ SystemPromptMessage,
+ UserPromptMessage,
+)
+from core.model_runtime.entities.model_entities import AIModelEntity
+from core.model_runtime.errors.validate import CredentialsValidateFailedError
+from core.model_runtime.model_providers.gitee_ai.llm.llm import GiteeAILargeLanguageModel
+
+
+def test_predefined_models():
+ model = GiteeAILargeLanguageModel()
+ model_schemas = model.predefined_models()
+
+ assert len(model_schemas) >= 1
+ assert isinstance(model_schemas[0], AIModelEntity)
+
+
+def test_validate_credentials_for_chat_model():
+ model = GiteeAILargeLanguageModel()
+
+ with pytest.raises(CredentialsValidateFailedError):
+ # model name set to gpt-3.5-turbo because of mocking
+ model.validate_credentials(model="gpt-3.5-turbo", credentials={"api_key": "invalid_key"})
+
+ model.validate_credentials(
+ model="Qwen2-7B-Instruct",
+ credentials={"api_key": os.environ.get("GITEE_AI_API_KEY")},
+ )
+
+
+def test_invoke_chat_model():
+ model = GiteeAILargeLanguageModel()
+
+ result = model.invoke(
+ model="Qwen2-7B-Instruct",
+ credentials={"api_key": os.environ.get("GITEE_AI_API_KEY")},
+ prompt_messages=[
+ SystemPromptMessage(
+ content="You are a helpful AI assistant.",
+ ),
+ UserPromptMessage(content="Hello World!"),
+ ],
+ model_parameters={
+ "temperature": 0.0,
+ "top_p": 1.0,
+ "presence_penalty": 0.0,
+ "frequency_penalty": 0.0,
+ "max_tokens": 10,
+ "stream": False,
+ },
+ stop=["How"],
+ stream=False,
+ user="foo",
+ )
+
+ assert isinstance(result, LLMResult)
+ assert len(result.message.content) > 0
+
+
+def test_invoke_stream_chat_model():
+ model = GiteeAILargeLanguageModel()
+
+ result = model.invoke(
+ model="Qwen2-7B-Instruct",
+ credentials={"api_key": os.environ.get("GITEE_AI_API_KEY")},
+ prompt_messages=[
+ SystemPromptMessage(
+ content="You are a helpful AI assistant.",
+ ),
+ UserPromptMessage(content="Hello World!"),
+ ],
+ model_parameters={"temperature": 0.0, "max_tokens": 100, "stream": False},
+ stream=True,
+ user="foo",
+ )
+
+ assert isinstance(result, Generator)
+
+ for chunk in result:
+ assert isinstance(chunk, LLMResultChunk)
+ assert isinstance(chunk.delta, LLMResultChunkDelta)
+ assert isinstance(chunk.delta.message, AssistantPromptMessage)
+ assert len(chunk.delta.message.content) > 0 if chunk.delta.finish_reason is None else True
+ if chunk.delta.finish_reason is not None:
+ assert chunk.delta.usage is not None
+
+
+def test_get_num_tokens():
+ model = GiteeAILargeLanguageModel()
+
+ num_tokens = model.get_num_tokens(
+ model="Qwen2-7B-Instruct",
+ credentials={"api_key": os.environ.get("GITEE_AI_API_KEY")},
+ prompt_messages=[UserPromptMessage(content="Hello World!")],
+ )
+
+ assert num_tokens == 10
+
+ num_tokens = model.get_num_tokens(
+ model="Qwen2-7B-Instruct",
+ credentials={"api_key": os.environ.get("GITEE_AI_API_KEY")},
+ prompt_messages=[
+ SystemPromptMessage(
+ content="You are a helpful AI assistant.",
+ ),
+ UserPromptMessage(content="Hello World!"),
+ ],
+ tools=[
+ PromptMessageTool(
+ name="get_weather",
+ description="Determine weather in my location",
+ parameters={
+ "type": "object",
+ "properties": {
+ "location": {"type": "string", "description": "The city and state e.g. San Francisco, CA"},
+ "unit": {"type": "string", "enum": ["c", "f"]},
+ },
+ "required": ["location"],
+ },
+ ),
+ ],
+ )
+
+ assert num_tokens == 77
diff --git a/api/tests/integration_tests/model_runtime/gitee_ai/test_provider.py b/api/tests/integration_tests/model_runtime/gitee_ai/test_provider.py
new file mode 100644
index 0000000000..f12ed54a45
--- /dev/null
+++ b/api/tests/integration_tests/model_runtime/gitee_ai/test_provider.py
@@ -0,0 +1,15 @@
+import os
+
+import pytest
+
+from core.model_runtime.errors.validate import CredentialsValidateFailedError
+from core.model_runtime.model_providers.gitee_ai.gitee_ai import GiteeAIProvider
+
+
+def test_validate_provider_credentials():
+ provider = GiteeAIProvider()
+
+ with pytest.raises(CredentialsValidateFailedError):
+ provider.validate_provider_credentials(credentials={"api_key": "invalid_key"})
+
+ provider.validate_provider_credentials(credentials={"api_key": os.environ.get("GITEE_AI_API_KEY")})
diff --git a/api/tests/integration_tests/model_runtime/gitee_ai/test_rerank.py b/api/tests/integration_tests/model_runtime/gitee_ai/test_rerank.py
new file mode 100644
index 0000000000..0e5914a61f
--- /dev/null
+++ b/api/tests/integration_tests/model_runtime/gitee_ai/test_rerank.py
@@ -0,0 +1,47 @@
+import os
+
+import pytest
+
+from core.model_runtime.entities.rerank_entities import RerankResult
+from core.model_runtime.errors.validate import CredentialsValidateFailedError
+from core.model_runtime.model_providers.gitee_ai.rerank.rerank import GiteeAIRerankModel
+
+
+def test_validate_credentials():
+ model = GiteeAIRerankModel()
+
+ with pytest.raises(CredentialsValidateFailedError):
+ model.validate_credentials(
+ model="bge-reranker-v2-m3",
+ credentials={"api_key": "invalid_key"},
+ )
+
+ model.validate_credentials(
+ model="bge-reranker-v2-m3",
+ credentials={
+ "api_key": os.environ.get("GITEE_AI_API_KEY"),
+ },
+ )
+
+
+def test_invoke_model():
+ model = GiteeAIRerankModel()
+ result = model.invoke(
+ model="bge-reranker-v2-m3",
+ credentials={
+ "api_key": os.environ.get("GITEE_AI_API_KEY"),
+ },
+ query="What is the capital of the United States?",
+ docs=[
+ "Carson City is the capital city of the American state of Nevada. At the 2010 United States "
+ "Census, Carson City had a population of 55,274.",
+ "The Commonwealth of the Northern Mariana Islands is a group of islands in the Pacific Ocean that "
+ "are a political division controlled by the United States. Its capital is Saipan.",
+ ],
+ top_n=1,
+ score_threshold=0.01,
+ )
+
+ assert isinstance(result, RerankResult)
+ assert len(result.docs) == 1
+ assert result.docs[0].score >= 0.01
diff --git a/api/tests/integration_tests/model_runtime/gitee_ai/test_speech2text.py b/api/tests/integration_tests/model_runtime/gitee_ai/test_speech2text.py
new file mode 100644
index 0000000000..4a01453fdd
--- /dev/null
+++ b/api/tests/integration_tests/model_runtime/gitee_ai/test_speech2text.py
@@ -0,0 +1,45 @@
+import os
+
+import pytest
+
+from core.model_runtime.errors.validate import CredentialsValidateFailedError
+from core.model_runtime.model_providers.gitee_ai.speech2text.speech2text import GiteeAISpeech2TextModel
+
+
+def test_validate_credentials():
+ model = GiteeAISpeech2TextModel()
+
+ with pytest.raises(CredentialsValidateFailedError):
+ model.validate_credentials(
+ model="whisper-base",
+ credentials={"api_key": "invalid_key"},
+ )
+
+ model.validate_credentials(
+ model="whisper-base",
+ credentials={"api_key": os.environ.get("GITEE_AI_API_KEY")},
+ )
+
+
+def test_invoke_model():
+ model = GiteeAISpeech2TextModel()
+
+ # Get the directory of the current file
+ current_dir = os.path.dirname(os.path.abspath(__file__))
+
+ # Get assets directory
+ assets_dir = os.path.join(os.path.dirname(current_dir), "assets")
+
+ # Construct the path to the audio file
+ audio_file_path = os.path.join(assets_dir, "audio.mp3")
+
+ # Invoke while the handle is still open; binding the file object outside
+ # the `with` block would hand the model a closed file.
+ with open(audio_file_path, "rb") as audio_file:
+     result = model.invoke(
+         model="whisper-base", credentials={"api_key": os.environ.get("GITEE_AI_API_KEY")}, file=audio_file
+     )
+
+ assert isinstance(result, str)
+ assert result == "1 2 3 4 5 6 7 8 9 10"
diff --git a/api/tests/integration_tests/model_runtime/gitee_ai/test_text_embedding.py b/api/tests/integration_tests/model_runtime/gitee_ai/test_text_embedding.py
new file mode 100644
index 0000000000..34648f0bc8
--- /dev/null
+++ b/api/tests/integration_tests/model_runtime/gitee_ai/test_text_embedding.py
@@ -0,0 +1,46 @@
+import os
+
+import pytest
+
+from core.model_runtime.entities.text_embedding_entities import TextEmbeddingResult
+from core.model_runtime.errors.validate import CredentialsValidateFailedError
+from core.model_runtime.model_providers.gitee_ai.text_embedding.text_embedding import GiteeAIEmbeddingModel
+
+
+def test_validate_credentials():
+ model = GiteeAIEmbeddingModel()
+
+ with pytest.raises(CredentialsValidateFailedError):
+ model.validate_credentials(model="bge-large-zh-v1.5", credentials={"api_key": "invalid_key"})
+
+ model.validate_credentials(model="bge-large-zh-v1.5", credentials={"api_key": os.environ.get("GITEE_AI_API_KEY")})
+
+
+def test_invoke_model():
+ model = GiteeAIEmbeddingModel()
+
+ result = model.invoke(
+ model="bge-large-zh-v1.5",
+ credentials={
+ "api_key": os.environ.get("GITEE_AI_API_KEY"),
+ },
+ texts=["hello", "world"],
+ user="user",
+ )
+
+ assert isinstance(result, TextEmbeddingResult)
+ assert len(result.embeddings) == 2
+
+
+def test_get_num_tokens():
+ model = GiteeAIEmbeddingModel()
+
+ num_tokens = model.get_num_tokens(
+ model="bge-large-zh-v1.5",
+ credentials={
+ "api_key": os.environ.get("GITEE_AI_API_KEY"),
+ },
+ texts=["hello", "world"],
+ )
+
+ assert num_tokens == 2
diff --git a/api/tests/integration_tests/model_runtime/gitee_ai/test_tts.py b/api/tests/integration_tests/model_runtime/gitee_ai/test_tts.py
new file mode 100644
index 0000000000..9f18161a7b
--- /dev/null
+++ b/api/tests/integration_tests/model_runtime/gitee_ai/test_tts.py
@@ -0,0 +1,23 @@
+import os
+
+from core.model_runtime.model_providers.gitee_ai.tts.tts import GiteeAIText2SpeechModel
+
+
+def test_invoke_model():
+ model = GiteeAIText2SpeechModel()
+
+ result = model.invoke(
+ model="speecht5_tts",
+ tenant_id="test",
+ credentials={
+ "api_key": os.environ.get("GITEE_AI_API_KEY"),
+ },
+ content_text="Hello, world!",
+ voice="",
+ )
+
+ content = b""
+ for chunk in result:
+ content += chunk
+
+ assert content != b""
diff --git a/api/tests/integration_tests/vdb/couchbase/__init__.py b/api/tests/integration_tests/vdb/couchbase/__init__.py
new file mode 100644
index 0000000000..e69de29bb2
diff --git a/api/tests/integration_tests/vdb/couchbase/test_couchbase.py b/api/tests/integration_tests/vdb/couchbase/test_couchbase.py
new file mode 100644
index 0000000000..d76c34ba0e
--- /dev/null
+++ b/api/tests/integration_tests/vdb/couchbase/test_couchbase.py
@@ -0,0 +1,50 @@
+import subprocess
+import time
+
+from core.rag.datasource.vdb.couchbase.couchbase_vector import CouchbaseConfig, CouchbaseVector
+from tests.integration_tests.vdb.test_vector_store import (
+ AbstractVectorTest,
+ get_example_text,
+ setup_mock_redis,
+)
+
+
+def wait_for_healthy_container(service_name="couchbase-server", timeout=300):
+ start_time = time.time()
+ while time.time() - start_time < timeout:
+ result = subprocess.run(
+ ["docker", "inspect", "--format", "{{.State.Health.Status}}", service_name], capture_output=True, text=True
+ )
+ if result.stdout.strip() == "healthy":
+ print(f"{service_name} is healthy!")
+ return True
+ else:
+ print(f"Waiting for {service_name} to be healthy...")
+ time.sleep(10)
+ raise TimeoutError(f"{service_name} did not become healthy in time")
+
+
+class CouchbaseTest(AbstractVectorTest):
+ def __init__(self):
+ super().__init__()
+ self.vector = CouchbaseVector(
+ collection_name=self.collection_name,
+ config=CouchbaseConfig(
+ connection_string="couchbase://127.0.0.1",
+ user="Administrator",
+ password="password",
+ bucket_name="Embeddings",
+ scope_name="_default",
+ ),
+ )
+
+ def search_by_vector(self):
+ # brief sleep to ensure document is indexed
+ time.sleep(5)
+ hits_by_vector = self.vector.search_by_vector(query_vector=self.example_embedding)
+ assert len(hits_by_vector) == 1
+
+
+def test_couchbase(setup_mock_redis):
+ wait_for_healthy_container("couchbase-server", timeout=60)
+ CouchbaseTest().run_all_tests()
diff --git a/api/tests/integration_tests/vdb/oceanbase/__init__.py b/api/tests/integration_tests/vdb/oceanbase/__init__.py
new file mode 100644
index 0000000000..e69de29bb2
diff --git a/api/tests/integration_tests/vdb/oceanbase/test_oceanbase.py b/api/tests/integration_tests/vdb/oceanbase/test_oceanbase.py
new file mode 100644
index 0000000000..ebcb134168
--- /dev/null
+++ b/api/tests/integration_tests/vdb/oceanbase/test_oceanbase.py
@@ -0,0 +1,71 @@
+from unittest.mock import MagicMock, patch
+
+import pytest
+
+from core.rag.datasource.vdb.oceanbase.oceanbase_vector import (
+ OceanBaseVector,
+ OceanBaseVectorConfig,
+)
+from tests.integration_tests.vdb.test_vector_store import (
+ AbstractVectorTest,
+ get_example_text,
+ setup_mock_redis,
+)
+
+
+@pytest.fixture
+def oceanbase_vector():
+ return OceanBaseVector(
+ "dify_test_collection",
+ config=OceanBaseVectorConfig(
+ host="127.0.0.1",
+ port="2881",
+ user="root@test",
+ database="test",
+ password="test",
+ ),
+ )
+
+
+class OceanBaseVectorTest(AbstractVectorTest):
+ def __init__(self, vector: OceanBaseVector):
+ super().__init__()
+ self.vector = vector
+
+ def search_by_vector(self):
+ hits_by_vector = self.vector.search_by_vector(query_vector=self.example_embedding)
+ assert len(hits_by_vector) == 0
+
+ def search_by_full_text(self):
+ hits_by_full_text = self.vector.search_by_full_text(query=get_example_text())
+ assert len(hits_by_full_text) == 0
+
+ def text_exists(self):
+ exist = self.vector.text_exists(self.example_doc_id)
+ assert exist is True
+
+ def get_ids_by_metadata_field(self):
+ ids = self.vector.get_ids_by_metadata_field(key="document_id", value=self.example_doc_id)
+ assert len(ids) == 0
+
+
+@pytest.fixture
+def setup_mock_oceanbase_client():
+ with patch("core.rag.datasource.vdb.oceanbase.oceanbase_vector.ObVecClient", new_callable=MagicMock) as mock_client:
+ yield mock_client
+
+
+@pytest.fixture
+def setup_mock_oceanbase_vector(oceanbase_vector):
+ with patch.object(oceanbase_vector, "_client"):
+ yield oceanbase_vector
+
+
+def test_oceanbase_vector(
+ setup_mock_redis,
+ setup_mock_oceanbase_client,
+ setup_mock_oceanbase_vector,
+ oceanbase_vector,
+):
+ OceanBaseVectorTest(oceanbase_vector).run_all_tests()
diff --git a/api/tests/integration_tests/workflow/nodes/test_http.py b/api/tests/integration_tests/workflow/nodes/test_http.py
index 9eea63f722..0da6622658 100644
--- a/api/tests/integration_tests/workflow/nodes/test_http.py
+++ b/api/tests/integration_tests/workflow/nodes/test_http.py
@@ -430,3 +430,37 @@ def test_multi_colons_parse(setup_http_mock):
assert urlencode({"Redirect": "http://example2.com"}) in result.process_data.get("request", "")
assert 'form-data; name="Redirect"\r\n\r\nhttp://example6.com' in result.process_data.get("request", "")
# assert "http://example3.com" == resp.get("headers", {}).get("referer")
+
+
+def test_image_file(monkeypatch):
+ from types import SimpleNamespace
+
+ monkeypatch.setattr(
+ "core.tools.tool_file_manager.ToolFileManager.create_file_by_raw",
+ lambda *args, **kwargs: SimpleNamespace(id="1"),
+ )
+
+ node = init_http_node(
+ config={
+ "id": "1",
+ "data": {
+ "title": "http",
+ "desc": "",
+ "method": "get",
+ "url": "https://cloud.dify.ai/logo/logo-site.png",
+ "authorization": {
+ "type": "no-auth",
+ "config": None,
+ },
+ "params": "",
+ "headers": "",
+ "body": None,
+ },
+ }
+ )
+
+ result = node._run()
+ assert result.process_data is not None
+ assert result.outputs is not None
+ resp = result.outputs
+ assert len(resp.get("files", [])) == 1
diff --git a/api/tests/unit_tests/core/workflow/nodes/test_http_request_node.py b/api/tests/unit_tests/core/workflow/nodes/test_http_request_node.py
index 2a5fda48b1..720037d05f 100644
--- a/api/tests/unit_tests/core/workflow/nodes/test_http_request_node.py
+++ b/api/tests/unit_tests/core/workflow/nodes/test_http_request_node.py
@@ -192,7 +192,7 @@ def test_http_request_node_form_with_file(monkeypatch):
def attr_checker(*args, **kwargs):
assert kwargs["data"] == {"name": "test"}
- assert kwargs["files"] == {"file": b"test"}
+ assert kwargs["files"] == {"file": (None, b"test", "application/octet-stream")}
return httpx.Response(200, content=b"")
monkeypatch.setattr(
diff --git a/api/tests/unit_tests/oss/__mock/aliyun_oss.py b/api/tests/unit_tests/oss/__mock/aliyun_oss.py
new file mode 100644
index 0000000000..27e1c0ad85
--- /dev/null
+++ b/api/tests/unit_tests/oss/__mock/aliyun_oss.py
@@ -0,0 +1,100 @@
+import os
+import posixpath
+from unittest.mock import MagicMock
+
+import pytest
+from _pytest.monkeypatch import MonkeyPatch
+from oss2 import Bucket
+from oss2.models import GetObjectResult, PutObjectResult
+
+from tests.unit_tests.oss.__mock.base import (
+ get_example_bucket,
+ get_example_data,
+ get_example_filename,
+ get_example_filepath,
+ get_example_folder,
+)
+
+
+class MockResponse:
+ def __init__(self, status, headers, request_id):
+ self.status = status
+ self.headers = headers
+ self.request_id = request_id
+
+
+class MockAliyunOssClass:
+ def __init__(
+ self,
+ auth,
+ endpoint,
+ bucket_name,
+ is_cname=False,
+ session=None,
+ connect_timeout=None,
+ app_name="",
+ enable_crc=True,
+ proxies=None,
+ region=None,
+ cloudbox_id=None,
+ is_path_style=False,
+ is_verify_object_strict=True,
+ ):
+ self.bucket_name = get_example_bucket()
+ self.key = posixpath.join(get_example_folder(), get_example_filename())
+ self.content = get_example_data()
+ self.filepath = get_example_filepath()
+ self.resp = MockResponse(
+ 200,
+ {
+ "etag": "ee8de918d05640145b18f70f4c3aa602",
+ "x-oss-version-id": "CAEQNhiBgMDJgZCA0BYiIDc4MGZjZGI2OTBjOTRmNTE5NmU5NmFhZjhjYmY0****",
+ },
+ "request_id",
+ )
+
+ def put_object(self, key, data, headers=None, progress_callback=None):
+ assert key == self.key
+ assert data == self.content
+ return PutObjectResult(self.resp)
+
+ def get_object(self, key, byte_range=None, headers=None, progress_callback=None, process=None, params=None):
+ assert key == self.key
+
+ get_object_output = MagicMock(GetObjectResult)
+ get_object_output.read.return_value = self.content
+ return get_object_output
+
+ def get_object_to_file(
+ self, key, filename, byte_range=None, headers=None, progress_callback=None, process=None, params=None
+ ):
+ assert key == self.key
+ assert filename == self.filepath
+
+ def object_exists(self, key, headers=None):
+ assert key == self.key
+ return True
+
+ def delete_object(self, key, params=None, headers=None):
+ assert key == self.key
+ self.resp.headers["x-oss-delete-marker"] = True
+ return self.resp
+
+
+MOCK = os.getenv("MOCK_SWITCH", "false").lower() == "true"
+
+
+@pytest.fixture
+def setup_aliyun_oss_mock(monkeypatch: MonkeyPatch):
+ if MOCK:
+ monkeypatch.setattr(Bucket, "__init__", MockAliyunOssClass.__init__)
+ monkeypatch.setattr(Bucket, "put_object", MockAliyunOssClass.put_object)
+ monkeypatch.setattr(Bucket, "get_object", MockAliyunOssClass.get_object)
+ monkeypatch.setattr(Bucket, "get_object_to_file", MockAliyunOssClass.get_object_to_file)
+ monkeypatch.setattr(Bucket, "object_exists", MockAliyunOssClass.object_exists)
+ monkeypatch.setattr(Bucket, "delete_object", MockAliyunOssClass.delete_object)
+
+ yield
+
+ if MOCK:
+ monkeypatch.undo()
diff --git a/api/tests/unit_tests/oss/__mock/tencent_cos.py b/api/tests/unit_tests/oss/__mock/tencent_cos.py
new file mode 100644
index 0000000000..5189b68e87
--- /dev/null
+++ b/api/tests/unit_tests/oss/__mock/tencent_cos.py
@@ -0,0 +1,81 @@
+import os
+from unittest.mock import MagicMock
+
+import pytest
+from _pytest.monkeypatch import MonkeyPatch
+from qcloud_cos import CosS3Client
+from qcloud_cos.streambody import StreamBody
+
+from tests.unit_tests.oss.__mock.base import (
+ get_example_bucket,
+ get_example_data,
+ get_example_filename,
+ get_example_filepath,
+)
+
+
+class MockTencentCosClass:
+ def __init__(self, conf, retry=1, session=None):
+ self.bucket_name = get_example_bucket()
+ self.key = get_example_filename()
+ self.content = get_example_data()
+ self.filepath = get_example_filepath()
+ self.resp = {
+ "ETag": "ee8de918d05640145b18f70f4c3aa602",
+ "Server": "tencent-cos",
+ "x-cos-hash-crc64ecma": 16749565679157681890,
+ "x-cos-request-id": "NWU5MDNkYzlfNjRiODJhMDlfMzFmYzhfMTFm****",
+ }
+
+ def put_object(self, Bucket, Body, Key, EnableMD5=False, **kwargs): # noqa: N803
+ assert Bucket == self.bucket_name
+ assert Key == self.key
+ assert Body == self.content
+ return self.resp
+
+ def get_object(self, Bucket, Key, KeySimplifyCheck=True, **kwargs): # noqa: N803
+ assert Bucket == self.bucket_name
+ assert Key == self.key
+
+ mock_stream_body = MagicMock(StreamBody)
+ mock_raw_stream = MagicMock()
+ mock_stream_body.get_raw_stream.return_value = mock_raw_stream
+ mock_raw_stream.read.return_value = self.content
+
+ mock_stream_body.get_stream_to_file = MagicMock()
+
+ def chunk_generator(chunk_size=2):
+ for i in range(0, len(self.content), chunk_size):
+ yield self.content[i : i + chunk_size]
+
+ mock_stream_body.get_stream.return_value = chunk_generator(chunk_size=4096)
+ return {"Body": mock_stream_body}
+
+ def object_exists(self, Bucket, Key): # noqa: N803
+ assert Bucket == self.bucket_name
+ assert Key == self.key
+ return True
+
+ def delete_object(self, Bucket, Key, **kwargs): # noqa: N803
+ assert Bucket == self.bucket_name
+ assert Key == self.key
+ self.resp.update({"x-cos-delete-marker": True})
+ return self.resp
+
+
+MOCK = os.getenv("MOCK_SWITCH", "false").lower() == "true"
+
+
+@pytest.fixture
+def setup_tencent_cos_mock(monkeypatch: MonkeyPatch):
+ if MOCK:
+ monkeypatch.setattr(CosS3Client, "__init__", MockTencentCosClass.__init__)
+ monkeypatch.setattr(CosS3Client, "put_object", MockTencentCosClass.put_object)
+ monkeypatch.setattr(CosS3Client, "get_object", MockTencentCosClass.get_object)
+ monkeypatch.setattr(CosS3Client, "object_exists", MockTencentCosClass.object_exists)
+ monkeypatch.setattr(CosS3Client, "delete_object", MockTencentCosClass.delete_object)
+
+ yield
+
+ if MOCK:
+ monkeypatch.undo()
diff --git a/api/tests/unit_tests/oss/aliyun_oss/aliyun_oss/__init__.py b/api/tests/unit_tests/oss/aliyun_oss/aliyun_oss/__init__.py
new file mode 100644
index 0000000000..e69de29bb2
diff --git a/api/tests/unit_tests/oss/aliyun_oss/aliyun_oss/test_aliyun_oss.py b/api/tests/unit_tests/oss/aliyun_oss/aliyun_oss/test_aliyun_oss.py
new file mode 100644
index 0000000000..65d31352bd
--- /dev/null
+++ b/api/tests/unit_tests/oss/aliyun_oss/aliyun_oss/test_aliyun_oss.py
@@ -0,0 +1,22 @@
+from unittest.mock import MagicMock, patch
+
+import pytest
+from oss2 import Auth
+
+from extensions.storage.aliyun_oss_storage import AliyunOssStorage
+from tests.unit_tests.oss.__mock.aliyun_oss import setup_aliyun_oss_mock
+from tests.unit_tests.oss.__mock.base import (
+ BaseStorageTest,
+ get_example_bucket,
+ get_example_folder,
+)
+
+
+class TestAliyunOss(BaseStorageTest):
+ @pytest.fixture(autouse=True)
+ def setup_method(self, setup_aliyun_oss_mock):
+ """Executed before each test method."""
+ with patch.object(Auth, "__init__", return_value=None):
+ self.storage = AliyunOssStorage()
+ self.storage.bucket_name = get_example_bucket()
+ self.storage.folder = get_example_folder()
diff --git a/api/tests/unit_tests/oss/tencent_cos/__init__.py b/api/tests/unit_tests/oss/tencent_cos/__init__.py
new file mode 100644
index 0000000000..e69de29bb2
diff --git a/api/tests/unit_tests/oss/tencent_cos/test_tencent_cos.py b/api/tests/unit_tests/oss/tencent_cos/test_tencent_cos.py
new file mode 100644
index 0000000000..303f0493bd
--- /dev/null
+++ b/api/tests/unit_tests/oss/tencent_cos/test_tencent_cos.py
@@ -0,0 +1,20 @@
+from unittest.mock import patch
+
+import pytest
+from qcloud_cos import CosConfig
+
+from extensions.storage.tencent_cos_storage import TencentCosStorage
+from tests.unit_tests.oss.__mock.base import (
+ BaseStorageTest,
+ get_example_bucket,
+)
+from tests.unit_tests.oss.__mock.tencent_cos import setup_tencent_cos_mock
+
+
+class TestTencentCos(BaseStorageTest):
+ @pytest.fixture(autouse=True)
+ def setup_method(self, setup_tencent_cos_mock):
+ """Executed before each test method."""
+ with patch.object(CosConfig, "__init__", return_value=None):
+ self.storage = TencentCosStorage()
+ self.storage.bucket_name = get_example_bucket()
diff --git a/api/tests/unit_tests/oss/volcengine_tos/test_volcengine_tos.py b/api/tests/unit_tests/oss/volcengine_tos/test_volcengine_tos.py
index 9f8aa158a9..5afbc9e8b4 100644
--- a/api/tests/unit_tests/oss/volcengine_tos/test_volcengine_tos.py
+++ b/api/tests/unit_tests/oss/volcengine_tos/test_volcengine_tos.py
@@ -1,5 +1,3 @@
-from collections.abc import Generator
-
import pytest
from tos import TosClientV2
diff --git a/dev/pytest/pytest_vdb.sh b/dev/pytest/pytest_vdb.sh
index 579da6a30e..02a9f49279 100755
--- a/dev/pytest/pytest_vdb.sh
+++ b/dev/pytest/pytest_vdb.sh
@@ -11,4 +11,6 @@ pytest api/tests/integration_tests/vdb/chroma \
api/tests/integration_tests/vdb/vikingdb \
api/tests/integration_tests/vdb/baidu \
api/tests/integration_tests/vdb/tcvectordb \
- api/tests/integration_tests/vdb/upstash
\ No newline at end of file
+ api/tests/integration_tests/vdb/upstash \
+ api/tests/integration_tests/vdb/couchbase \
+ api/tests/integration_tests/vdb/oceanbase \
diff --git a/docker-legacy/docker-compose.yaml b/docker-legacy/docker-compose.yaml
index 17b788ff81..e3f1c3b761 100644
--- a/docker-legacy/docker-compose.yaml
+++ b/docker-legacy/docker-compose.yaml
@@ -2,7 +2,7 @@ version: '3'
services:
# API service
api:
- image: langgenius/dify-api:0.10.1
+ image: langgenius/dify-api:0.10.2
restart: always
environment:
# Startup mode, 'api' starts the API server.
@@ -227,7 +227,7 @@ services:
# worker service
# The Celery worker for processing the queue.
worker:
- image: langgenius/dify-api:0.10.1
+ image: langgenius/dify-api:0.10.2
restart: always
environment:
CONSOLE_WEB_URL: ''
@@ -396,7 +396,7 @@ services:
# Frontend web application.
web:
- image: langgenius/dify-web:0.10.1
+ image: langgenius/dify-web:0.10.2
restart: always
environment:
# The base URL of console application api server, refers to the Console base URL of WEB service if console domain is
diff --git a/docker/.env.example b/docker/.env.example
index 49ce48a20d..ef2f331c11 100644
--- a/docker/.env.example
+++ b/docker/.env.example
@@ -375,7 +375,7 @@ SUPABASE_URL=your-server-url
# ------------------------------
# The type of vector store to use.
-# Supported values are `weaviate`, `qdrant`, `milvus`, `myscale`, `relyt`, `pgvector`, `pgvecto-rs`, `chroma`, `opensearch`, `tidb_vector`, `oracle`, `tencent`, `elasticsearch`, `analyticdb`, `vikingdb`.
+# Supported values are `weaviate`, `qdrant`, `milvus`, `myscale`, `relyt`, `pgvector`, `pgvecto-rs`, `chroma`, `opensearch`, `tidb_vector`, `oracle`, `tencent`, `elasticsearch`, `analyticdb`, `couchbase`, `vikingdb`, `oceanbase`.
VECTOR_STORE=weaviate
# The Weaviate endpoint URL. Only available when VECTOR_STORE is `weaviate`.
@@ -414,6 +414,14 @@ MYSCALE_PASSWORD=
MYSCALE_DATABASE=dify
MYSCALE_FTS_PARAMS=
+# Couchbase configurations, only available when VECTOR_STORE is `couchbase`
+# The connection string must include hostname defined in the docker-compose file (couchbase-server in this case)
+COUCHBASE_CONNECTION_STRING=couchbase://couchbase-server
+COUCHBASE_USER=Administrator
+COUCHBASE_PASSWORD=password
+COUCHBASE_BUCKET_NAME=Embeddings
+COUCHBASE_SCOPE_NAME=_default
+
# pgvector configurations, only available when VECTOR_STORE is `pgvector`
PGVECTOR_HOST=pgvector
PGVECTOR_PORT=5432
@@ -447,6 +455,20 @@ TIDB_VECTOR_USER=xxx.root
TIDB_VECTOR_PASSWORD=xxxxxx
TIDB_VECTOR_DATABASE=dify
+# Tidb on qdrant configuration, only available when VECTOR_STORE is `tidb_on_qdrant`
+TIDB_ON_QDRANT_URL=http://127.0.0.1
+TIDB_ON_QDRANT_API_KEY=dify
+TIDB_ON_QDRANT_CLIENT_TIMEOUT=20
+TIDB_ON_QDRANT_GRPC_ENABLED=false
+TIDB_ON_QDRANT_GRPC_PORT=6334
+TIDB_PUBLIC_KEY=dify
+TIDB_PRIVATE_KEY=dify
+TIDB_API_URL=http://127.0.0.1
+TIDB_IAM_API_URL=http://127.0.0.1
+TIDB_REGION=regions/aws-us-east-1
+TIDB_PROJECT_ID=dify
+TIDB_SPEND_LIMIT=100
+
# Chroma configuration, only available when VECTOR_STORE is `chroma`
CHROMA_HOST=127.0.0.1
CHROMA_PORT=8000
@@ -509,6 +531,14 @@ VIKINGDB_SCHEMA=http
VIKINGDB_CONNECTION_TIMEOUT=30
VIKINGDB_SOCKET_TIMEOUT=30
+# OceanBase Vector configuration, only available when VECTOR_STORE is `oceanbase`
+OCEANBASE_VECTOR_HOST=oceanbase-vector
+OCEANBASE_VECTOR_PORT=2881
+OCEANBASE_VECTOR_USER=root@test
+OCEANBASE_VECTOR_PASSWORD=
+OCEANBASE_VECTOR_DATABASE=test
+OCEANBASE_MEMORY_LIMIT=6G
+
# ------------------------------
# Knowledge Configuration
# ------------------------------
diff --git a/docker/couchbase-server/Dockerfile b/docker/couchbase-server/Dockerfile
new file mode 100644
index 0000000000..bd8af64150
--- /dev/null
+++ b/docker/couchbase-server/Dockerfile
@@ -0,0 +1,4 @@
+FROM couchbase/server:latest AS stage_base
+# FROM couchbase:latest AS stage_base
+COPY init-cbserver.sh /opt/couchbase/init/
+RUN chmod +x /opt/couchbase/init/init-cbserver.sh
\ No newline at end of file
diff --git a/docker/couchbase-server/init-cbserver.sh b/docker/couchbase-server/init-cbserver.sh
new file mode 100755
index 0000000000..e66bc18530
--- /dev/null
+++ b/docker/couchbase-server/init-cbserver.sh
@@ -0,0 +1,44 @@
+#!/bin/bash
+# used to start couchbase server - can't get around this as docker compose only allows you to start one command - so we have to start couchbase like the standard couchbase Dockerfile would
+# https://github.com/couchbase/docker/blob/master/enterprise/couchbase-server/7.2.0/Dockerfile#L88
+
+/entrypoint.sh couchbase-server &
+
+# track if setup is complete so we don't try to setup again
+FILE=/opt/couchbase/init/setupComplete.txt
+
+if ! [ -f "$FILE" ]; then
+ # used to automatically create the cluster based on environment variables
+ # https://docs.couchbase.com/server/current/cli/cbcli/couchbase-cli-cluster-init.html
+
+ echo $COUCHBASE_ADMINISTRATOR_USERNAME ":" $COUCHBASE_ADMINISTRATOR_PASSWORD
+
+ sleep 20s
+ /opt/couchbase/bin/couchbase-cli cluster-init -c 127.0.0.1 \
+ --cluster-username $COUCHBASE_ADMINISTRATOR_USERNAME \
+ --cluster-password $COUCHBASE_ADMINISTRATOR_PASSWORD \
+ --services data,index,query,fts \
+ --cluster-ramsize $COUCHBASE_RAM_SIZE \
+ --cluster-index-ramsize $COUCHBASE_INDEX_RAM_SIZE \
+ --cluster-eventing-ramsize $COUCHBASE_EVENTING_RAM_SIZE \
+ --cluster-fts-ramsize $COUCHBASE_FTS_RAM_SIZE \
+ --index-storage-setting default
+
+ sleep 2s
+
+ # used to auto create the bucket based on environment variables
+ # https://docs.couchbase.com/server/current/cli/cbcli/couchbase-cli-bucket-create.html
+
+ /opt/couchbase/bin/couchbase-cli bucket-create -c localhost:8091 \
+ --username $COUCHBASE_ADMINISTRATOR_USERNAME \
+ --password $COUCHBASE_ADMINISTRATOR_PASSWORD \
+ --bucket $COUCHBASE_BUCKET \
+ --bucket-ramsize $COUCHBASE_BUCKET_RAMSIZE \
+ --bucket-type couchbase
+
+ # create file so we know that the cluster is setup and don't run the setup again
+ touch $FILE
+fi
+ # docker compose will stop the container from running unless we do this
+ # known issue and workaround
+ tail -f /dev/null
diff --git a/docker/docker-compose.yaml b/docker/docker-compose.yaml
index d43bd5e2d1..06c99b5eab 100644
--- a/docker/docker-compose.yaml
+++ b/docker/docker-compose.yaml
@@ -110,6 +110,11 @@ x-shared-env: &shared-api-worker-env
QDRANT_CLIENT_TIMEOUT: ${QDRANT_CLIENT_TIMEOUT:-20}
QDRANT_GRPC_ENABLED: ${QDRANT_GRPC_ENABLED:-false}
QDRANT_GRPC_PORT: ${QDRANT_GRPC_PORT:-6334}
+ COUCHBASE_CONNECTION_STRING: ${COUCHBASE_CONNECTION_STRING:-couchbase://couchbase-server}
+ COUCHBASE_USER: ${COUCHBASE_USER:-Administrator}
+ COUCHBASE_PASSWORD: ${COUCHBASE_PASSWORD:-password}
+ COUCHBASE_BUCKET_NAME: ${COUCHBASE_BUCKET_NAME:-Embeddings}
+ COUCHBASE_SCOPE_NAME: ${COUCHBASE_SCOPE_NAME:-_default}
MILVUS_URI: ${MILVUS_URI:-http://127.0.0.1:19530}
MILVUS_TOKEN: ${MILVUS_TOKEN:-}
MILVUS_USER: ${MILVUS_USER:-root}
@@ -135,6 +140,18 @@ x-shared-env: &shared-api-worker-env
TIDB_VECTOR_USER: ${TIDB_VECTOR_USER:-}
TIDB_VECTOR_PASSWORD: ${TIDB_VECTOR_PASSWORD:-}
TIDB_VECTOR_DATABASE: ${TIDB_VECTOR_DATABASE:-dify}
+ TIDB_ON_QDRANT_URL: ${TIDB_ON_QDRANT_URL:-http://127.0.0.1}
+ TIDB_ON_QDRANT_API_KEY: ${TIDB_ON_QDRANT_API_KEY:-dify}
+ TIDB_ON_QDRANT_CLIENT_TIMEOUT: ${TIDB_ON_QDRANT_CLIENT_TIMEOUT:-20}
+ TIDB_ON_QDRANT_GRPC_ENABLED: ${TIDB_ON_QDRANT_GRPC_ENABLED:-false}
+ TIDB_ON_QDRANT_GRPC_PORT: ${TIDB_ON_QDRANT_GRPC_PORT:-6334}
+ TIDB_PUBLIC_KEY: ${TIDB_PUBLIC_KEY:-dify}
+ TIDB_PRIVATE_KEY: ${TIDB_PRIVATE_KEY:-dify}
+ TIDB_API_URL: ${TIDB_API_URL:-http://127.0.0.1}
+ TIDB_IAM_API_URL: ${TIDB_IAM_API_URL:-http://127.0.0.1}
+ TIDB_REGION: ${TIDB_REGION:-regions/aws-us-east-1}
+ TIDB_PROJECT_ID: ${TIDB_PROJECT_ID:-dify}
+ TIDB_SPEND_LIMIT: ${TIDB_SPEND_LIMIT:-100}
ORACLE_HOST: ${ORACLE_HOST:-oracle}
ORACLE_PORT: ${ORACLE_PORT:-1521}
ORACLE_USER: ${ORACLE_USER:-dify}
@@ -238,11 +255,17 @@ x-shared-env: &shared-api-worker-env
POSITION_PROVIDER_INCLUDES: ${POSITION_PROVIDER_INCLUDES:-}
POSITION_PROVIDER_EXCLUDES: ${POSITION_PROVIDER_EXCLUDES:-}
MAX_VARIABLE_SIZE: ${MAX_VARIABLE_SIZE:-204800}
+ OCEANBASE_VECTOR_HOST: ${OCEANBASE_VECTOR_HOST:-oceanbase-vector}
+ OCEANBASE_VECTOR_PORT: ${OCEANBASE_VECTOR_PORT:-2881}
+ OCEANBASE_VECTOR_USER: ${OCEANBASE_VECTOR_USER:-root@test}
+ OCEANBASE_VECTOR_PASSWORD: ${OCEANBASE_VECTOR_PASSWORD:-""}
+ OCEANBASE_VECTOR_DATABASE: ${OCEANBASE_VECTOR_DATABASE:-test}
+ OCEANBASE_MEMORY_LIMIT: ${OCEANBASE_MEMORY_LIMIT:-6G}
services:
# API service
api:
- image: langgenius/dify-api:0.10.1
+ image: langgenius/dify-api:0.10.2
restart: always
environment:
# Use the shared environment variables.
@@ -262,7 +285,7 @@ services:
# worker service
# The Celery worker for processing the queue.
worker:
- image: langgenius/dify-api:0.10.1
+ image: langgenius/dify-api:0.10.2
restart: always
environment:
# Use the shared environment variables.
@@ -281,7 +304,7 @@ services:
# Frontend web application.
web:
- image: langgenius/dify-web:0.10.1
+ image: langgenius/dify-web:0.10.2
restart: always
environment:
CONSOLE_API_URL: ${CONSOLE_API_URL:-}
@@ -475,6 +498,39 @@ services:
environment:
QDRANT_API_KEY: ${QDRANT_API_KEY:-difyai123456}
+ # The Couchbase vector store.
+ couchbase-server:
+ build: ./couchbase-server
+ profiles:
+ - couchbase
+ restart: always
+ environment:
+ - CLUSTER_NAME=dify_search
+ - COUCHBASE_ADMINISTRATOR_USERNAME=${COUCHBASE_USER:-Administrator}
+ - COUCHBASE_ADMINISTRATOR_PASSWORD=${COUCHBASE_PASSWORD:-password}
+ - COUCHBASE_BUCKET=${COUCHBASE_BUCKET_NAME:-Embeddings}
+ - COUCHBASE_BUCKET_RAMSIZE=512
+ - COUCHBASE_RAM_SIZE=2048
+ - COUCHBASE_EVENTING_RAM_SIZE=512
+ - COUCHBASE_INDEX_RAM_SIZE=512
+ - COUCHBASE_FTS_RAM_SIZE=1024
+ hostname: couchbase-server
+ container_name: couchbase-server
+ working_dir: /opt/couchbase
+ stdin_open: true
+ tty: true
+ entrypoint: [""]
+ command: sh -c "/opt/couchbase/init/init-cbserver.sh"
+ volumes:
+ - ./volumes/couchbase/data:/opt/couchbase/var/lib/couchbase/data
+ healthcheck:
+ # ensure bucket was created before proceeding
+ test: [ "CMD-SHELL", "curl -s -f -u Administrator:password http://localhost:8091/pools/default/buckets | grep -q '\\[{' || exit 1" ]
+ interval: 10s
+ retries: 10
+ start_period: 30s
+ timeout: 10s
+
# The pgvector vector database.
pgvector:
image: pgvector/pgvector:pg16
@@ -532,6 +588,18 @@ services:
CHROMA_SERVER_AUTHN_PROVIDER: ${CHROMA_SERVER_AUTHN_PROVIDER:-chromadb.auth.token_authn.TokenAuthenticationServerProvider}
IS_PERSISTENT: ${CHROMA_IS_PERSISTENT:-TRUE}
+ # OceanBase vector database
+ oceanbase-vector:
+ image: quay.io/oceanbase/oceanbase-ce:4.3.3.0-100000142024101215
+ profiles:
+ - oceanbase-vector
+ restart: always
+ volumes:
+ - ./volumes/oceanbase/data:/root/ob
+ - ./volumes/oceanbase/conf:/root/.obd/cluster
+ environment:
+ OB_MEMORY_LIMIT: ${OCEANBASE_MEMORY_LIMIT:-6G}
+
# Oracle vector database
oracle:
image: container-registry.oracle.com/database/free:latest
diff --git a/web/app/components/base/markdown-blocks/button.tsx b/web/app/components/base/markdown-blocks/button.tsx
new file mode 100644
index 0000000000..56647b3bbe
--- /dev/null
+++ b/web/app/components/base/markdown-blocks/button.tsx
@@ -0,0 +1,22 @@
+import { useChatContext } from '@/app/components/base/chat/chat/context'
+import Button from '@/app/components/base/button'
+import cn from '@/utils/classnames'
+
+const MarkdownButton = ({ node }: any) => {
+ const { onSend } = useChatContext()
+ const variant = node.properties.dataVariant
+ const message = node.properties.dataMessage
+ const size = node.properties.dataSize
+
+  return (
+    <Button
+      variant={variant}
+      size={size}
+      className={cn('!h-8 !px-3 select-none')}
+      onClick={() => onSend?.(message)}
+    >
+      <span className='text-[13px]'>{node.children[0]?.value || ''}</span>
+    </Button>
+  )
+}
+MarkdownButton.displayName = 'MarkdownButton'
+
+export default MarkdownButton
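For context, a minimal sketch of how the extracted `MarkdownButton` is driven (the reply string and its attribute values are hypothetical; the `data-*` attribute names follow the `node.properties` reads above):

```tsx
// Hypothetical model reply; rehype camel-cases data-* attributes into the
// node.properties.dataVariant / dataSize / dataMessage fields read above.
const reply = 'Need anything else? <button data-variant="primary" data-size="small" data-message="Show me the docs">Show docs</button>'

// Rendered through <Markdown content={reply} />, clicking the button
// calls onSend?.('Show me the docs') via the chat context.
```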
diff --git a/web/app/components/base/markdown-blocks/form.tsx b/web/app/components/base/markdown-blocks/form.tsx
new file mode 100644
index 0000000000..f87f2dcd91
--- /dev/null
+++ b/web/app/components/base/markdown-blocks/form.tsx
@@ -0,0 +1,137 @@
+import Button from '@/app/components/base/button'
+import Input from '@/app/components/base/input'
+import Textarea from '@/app/components/base/textarea'
+import { useChatContext } from '@/app/components/base/chat/chat/context'
+
+enum DATA_FORMAT {
+ TEXT = 'text',
+ JSON = 'json',
+}
+enum SUPPORTED_TAGS {
+ LABEL = 'label',
+ INPUT = 'input',
+ TEXTAREA = 'textarea',
+ BUTTON = 'button',
+}
+enum SUPPORTED_TYPES {
+ TEXT = 'text',
+ PASSWORD = 'password',
+ EMAIL = 'email',
+ NUMBER = 'number',
+}
+const MarkdownForm = ({ node }: any) => {
+ // const supportedTypes = ['text', 'password', 'email', 'number']
+ //
+ const { onSend } = useChatContext()
+
+ const getFormValues = (children: any) => {
+ const formValues: { [key: string]: any } = {}
+ children.forEach((child: any) => {
+ if (child.tagName === SUPPORTED_TAGS.INPUT)
+ formValues[child.properties.name] = child.properties.value
+ if (child.tagName === SUPPORTED_TAGS.TEXTAREA)
+ formValues[child.properties.name] = child.properties.value
+ })
+ return formValues
+ }
+ const onSubmit = (e: any) => {
+ e.preventDefault()
+ const format = node.properties.dataFormat || DATA_FORMAT.TEXT
+ const result = getFormValues(node.children)
+ if (format === DATA_FORMAT.JSON) {
+ onSend?.(JSON.stringify(result))
+ }
+ else {
+ const textResult = Object.entries(result)
+ .map(([key, value]) => `${key}: ${value}`)
+ .join('\n')
+ onSend?.(textResult)
+ }
+ }
+  return (
+    <form
+      autoComplete="off"
+      className='flex flex-col self-stretch'
+      onSubmit={(e: any) => {
+        e.preventDefault()
+        e.stopPropagation()
+      }}
+    >
+      {node.children.filter((i: any) => i.type === 'element').map((child: any, index: number) => {
+        if (child.tagName === SUPPORTED_TAGS.LABEL) {
+          return (
+            <label key={index} htmlFor={child.properties.htmlFor}>
+              {child.children[0]?.value || ''}
+            </label>
+          )
+        }
+        if (child.tagName === SUPPORTED_TAGS.INPUT && Object.values(SUPPORTED_TYPES).includes(child.properties.type)) {
+          return (
+            <Input
+              key={index}
+              type={child.properties.type}
+              name={child.properties.name}
+              placeholder={child.properties.placeholder}
+              value={child.properties.value}
+              onChange={(e: any) => {
+                child.properties.value = e.target.value
+              }}
+            />
+          )
+        }
+        if (child.tagName === SUPPORTED_TAGS.TEXTAREA) {
+          return (
+            <Textarea
+              key={index}
+              name={child.properties.name}
+              placeholder={child.properties.placeholder}
+              value={child.properties.value}
+              onChange={(e: any) => {
+                child.properties.value = e.target.value
+              }}
+            />
+          )
+        }
+        if (child.tagName === SUPPORTED_TAGS.BUTTON) {
+          return (
+            <Button variant='primary' className='mt-4' key={index} onClick={onSubmit}>
+              <span>{child.children[0]?.value || ''}</span>
+            </Button>
+          )
+        }
+        return <p key={index}>{child.children[0]?.value || ''}</p>
+      })}
+    </form>
+  )
+}
+MarkdownForm.displayName = 'MarkdownForm'
+export default MarkdownForm
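Similarly, a sketch of the form markup this component consumes (hypothetical values; `data-format`, the input `type`s, and the submit button follow the `DATA_FORMAT`, `SUPPORTED_TYPES`, and `SUPPORTED_TAGS` handling above):

```tsx
// Hypothetical model reply embedding a form. With data-format="json" the
// collected values are sent as JSON.stringify(...); otherwise they are
// joined as "key: value" lines (see onSubmit above).
const reply = `
<form data-format="json">
  <label for="email">Email</label>
  <input type="email" name="email" placeholder="you@example.com" />
  <textarea name="feedback" placeholder="Tell us more"></textarea>
  <button data-size="small" data-variant="primary">Submit</button>
</form>`

// <Markdown content={reply} /> -> MarkdownForm -> onSend?.('{"email":"...","feedback":"..."}')
```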
diff --git a/web/app/components/base/markdown.tsx b/web/app/components/base/markdown.tsx
index 58868a0eb1..d9112507a7 100644
--- a/web/app/components/base/markdown.tsx
+++ b/web/app/components/base/markdown.tsx
@@ -20,7 +20,8 @@ import { useChatContext } from '@/app/components/base/chat/chat/context'
import VideoGallery from '@/app/components/base/video-gallery'
import AudioGallery from '@/app/components/base/audio-gallery'
import SVGRenderer from '@/app/components/base/svg-gallery'
-import Button from '@/app/components/base/button'
+import MarkdownButton from '@/app/components/base/markdown-blocks/button'
+import MarkdownForm from '@/app/components/base/markdown-blocks/form'
// Available language https://github.com/react-syntax-highlighter/react-syntax-highlighter/blob/master/AVAILABLE_LANGUAGES_HLJS.MD
const capitalizationLanguageNameMap: Record<string, string> = {
@@ -240,22 +241,6 @@ const Link = ({ node, ...props }: any) => {
}
}
-const MarkdownButton = ({ node }: any) => {
- const { onSend } = useChatContext()
- const variant = node.properties.dataVariant
- const message = node.properties.dataMessage
- const size = node.properties.dataSize
-
-  return (
-    <Button
-      variant={variant}
-      size={size}
-      className={cn('!h-8 !px-3 select-none')}
-      onClick={() => onSend?.(message)}
-    >
-      <span className='text-[13px]'>{node.children[0]?.value || ''}</span>
-    </Button>
-  )
-}
-MarkdownButton.displayName = 'MarkdownButton'
-
export function Markdown(props: { content: string; className?: string }) {
const latexContent = preprocessLaTeX(props.content)
return (
@@ -288,6 +273,7 @@ export function Markdown(props: { content: string; className?: string }) {
a: Link,
p: Paragraph,
button: MarkdownButton,
+ form: MarkdownForm,
}}
linkTarget='_blank'
>
diff --git a/web/app/components/datasets/create/step-two/escape.ts b/web/app/components/datasets/create/step-two/escape.ts
index 098f43bc7f..2e1c3a9d73 100644
--- a/web/app/components/datasets/create/step-two/escape.ts
+++ b/web/app/components/datasets/create/step-two/escape.ts
@@ -3,7 +3,7 @@ function escape(input: string): string {
return ''
const res = input
- .replaceAll('\\', '\\\\')
+ // .replaceAll('\\', '\\\\') // This would add too many backslashes
.replaceAll('\0', '\\0')
.replaceAll('\b', '\\b')
.replaceAll('\f', '\\f')
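The dropped backslash pass is easiest to see with a concrete input; an illustrative snippet (not part of the diff):

```ts
// With the first pass enabled, every backslash the user typed was doubled,
// so already-escaped sequences and Windows paths gained extra backslashes.
const input = String.raw`C:\Users\me`          // user sees: C:\Users\me
const escaped = input.replaceAll('\\', '\\\\') // becomes:  C:\\Users\\me
console.log(escaped)                           // too many backslashes
```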
diff --git a/web/app/components/workflow/nodes/_base/hooks/use-node-help-link.ts b/web/app/components/workflow/nodes/_base/hooks/use-node-help-link.ts
index b5fe9554da..2ecdf101d2 100644
--- a/web/app/components/workflow/nodes/_base/hooks/use-node-help-link.ts
+++ b/web/app/components/workflow/nodes/_base/hooks/use-node-help-link.ts
@@ -23,8 +23,8 @@ export const useNodeHelpLink = (nodeType: BlockEnum) => {
[BlockEnum.Code]: 'code',
[BlockEnum.TemplateTransform]: 'template',
[BlockEnum.VariableAssigner]: 'variable-assigner',
- [BlockEnum.VariableAggregator]: 'variable-assigner',
- [BlockEnum.Assigner]: 'variable-assignment',
+ [BlockEnum.VariableAggregator]: 'variable-aggregator',
+ [BlockEnum.Assigner]: 'variable-assigner',
[BlockEnum.Iteration]: 'iteration',
[BlockEnum.IterationStart]: 'iteration',
[BlockEnum.ParameterExtractor]: 'parameter-extractor',
@@ -46,8 +46,8 @@ export const useNodeHelpLink = (nodeType: BlockEnum) => {
[BlockEnum.Code]: 'code',
[BlockEnum.TemplateTransform]: 'template',
[BlockEnum.VariableAssigner]: 'variable-assigner',
- [BlockEnum.VariableAggregator]: 'variable-assigner',
- [BlockEnum.Assigner]: 'variable-assignment',
+ [BlockEnum.VariableAggregator]: 'variable-aggregator',
+ [BlockEnum.Assigner]: 'variable-assigner',
[BlockEnum.Iteration]: 'iteration',
[BlockEnum.IterationStart]: 'iteration',
[BlockEnum.ParameterExtractor]: 'parameter-extractor',
diff --git a/web/hooks/use-refresh-token.ts b/web/hooks/use-refresh-token.ts
index 293f3159de..53dc4faf00 100644
--- a/web/hooks/use-refresh-token.ts
+++ b/web/hooks/use-refresh-token.ts
@@ -41,6 +41,7 @@ const useRefreshToken = () => {
return new Error('No access token or refresh token found')
}
if (localStorage?.getItem('is_refreshing') === '1') {
+ clearTimeout(timer.current)
timer.current = setTimeout(() => {
getNewAccessToken()
}, 1000)
@@ -61,12 +62,14 @@ const useRefreshToken = () => {
localStorage?.setItem('console_token', access_token)
localStorage?.setItem('refresh_token', refresh_token)
const newTokenExpireTime = getExpireTime(access_token)
+ clearTimeout(timer.current)
timer.current = setTimeout(() => {
getNewAccessToken()
}, newTokenExpireTime - advanceTime.current - getCurrentTimeStamp())
}
else {
const newTokenExpireTime = getExpireTime(currentAccessToken)
+ clearTimeout(timer.current)
timer.current = setTimeout(() => {
getNewAccessToken()
}, newTokenExpireTime - advanceTime.current - getCurrentTimeStamp())
@@ -74,8 +77,15 @@ const useRefreshToken = () => {
return null
}, [getExpireTime, getCurrentTimeStamp, handleError])
+ const handleVisibilityChange = useCallback(() => {
+ if (document.visibilityState === 'visible')
+ getNewAccessToken()
+ }, [getNewAccessToken])
+
useEffect(() => {
+ window.addEventListener('visibilitychange', handleVisibilityChange)
return () => {
+ window.removeEventListener('visibilitychange', handleVisibilityChange)
clearTimeout(timer.current)
localStorage?.removeItem('is_refreshing')
}
diff --git a/web/i18n/ja-JP/app-annotation.ts b/web/i18n/ja-JP/app-annotation.ts
index 6c6c98cdd0..f34d8d2acd 100644
--- a/web/i18n/ja-JP/app-annotation.ts
+++ b/web/i18n/ja-JP/app-annotation.ts
@@ -9,6 +9,8 @@ const translation = {
table: {
header: {
question: '質問',
+ match: 'マッチ',
+ response: '応答',
answer: '回答',
createdAt: '作成日時',
hits: 'ヒット数',
diff --git a/web/i18n/ja-JP/app-debug.ts b/web/i18n/ja-JP/app-debug.ts
index 0ba4c35b0d..620d9b2f55 100644
--- a/web/i18n/ja-JP/app-debug.ts
+++ b/web/i18n/ja-JP/app-debug.ts
@@ -150,7 +150,7 @@ const translation = {
title: '会話履歴',
description: '会話の役割に接頭辞名を設定します',
tip: '会話履歴は有効になっていません。上記のプロンプトに <histories> を追加してください。',
- learnMore: '詳細',
+ learnMore: '詳細を見る',
editModal: {
title: '会話役割名の編集',
userPrefix: 'ユーザー接頭辞',
@@ -163,6 +163,7 @@ const translation = {
moderation: {
title: 'コンテンツのモデレーション',
description: 'モデレーションAPIを使用するか、機密語リストを維持することで、モデルの出力を安全にします。',
+ contentEnableLabel: 'コンテンツモデレーションを有効にする',
allEnabled: '入力/出力コンテンツが有効になっています',
inputEnabled: '入力コンテンツが有効になっています',
outputEnabled: '出力コンテンツが有効になっています',
@@ -198,6 +199,25 @@ const translation = {
},
},
},
+ fileUpload: {
+ title: 'ファイル アップロード',
+ description: 'チャットの入力ボックスは画像やドキュメントやその他のファイルのアップロードをサポートします。',
+ supportedTypes: 'サポートされるファイルのタイプ',
+ numberLimit: '最大アップロード数',
+ modalTitle: 'ファイル アップロード設定',
+ },
+ imageUpload: {
+ title: '画像アップロード',
+ description: '画像アップロードをサポートする',
+ supportedTypes: 'サポートされるファイルのタイプ',
+ numberLimit: '最大アップロード数',
+ modalTitle: '画像アップロード設定',
+ },
+ bar: {
+ empty: 'Webアプリのユーザーエクスペリエンスを強化する機能を有効にする',
+ enableText: '有効な機能',
+ manage: '管理',
+ },
},
codegen: {
title: 'コードジェネレーター',
@@ -278,6 +298,10 @@ const translation = {
waitForBatchResponse: 'バッチタスクへの応答が完了するまでお待ちください。',
notSelectModel: 'モデルを選択してください',
waitForImgUpload: '画像のアップロードが完了するまでお待ちください',
+ waitForFileUpload: 'ファイルのアップロードが完了するまでお待ちください',
+ },
+ warningMessage: {
+ timeoutExceeded: 'タイムアウトのため結果が表示されません。完全な結果を手に入れるには、ログを参照してください。',
},
chatSubTitle: '手順',
completionSubTitle: '接頭辞プロンプト',
@@ -319,6 +343,8 @@ const translation = {
'paragraph': '段落',
'select': '選択',
'number': '数値',
+ 'single-file': '単一ファイル',
+ 'multi-files': 'ファイルリスト',
'notSet': '設定されていません。プレフィックスのプロンプトで {{input}} を入力してみてください。',
'stringTitle': 'フォームテキストボックスオプション',
'maxLength': '最大長',
@@ -330,6 +356,31 @@ const translation = {
'inputPlaceholder': '入力してください',
'content': 'コンテンツ',
'required': '必須',
+ 'file': {
+ supportFileTypes: 'サポートされたファイルタイプ',
+ image: {
+ name: '画像',
+ },
+ audio: {
+ name: '音声',
+ },
+ document: {
+ name: 'ドキュメント',
+ },
+ video: {
+ name: '映像',
+ },
+ custom: {
+ name: '他のファイルタイプ',
+ description: '他のファイルタイプを指定する。',
+ createPlaceholder: '+ 拡張子, 例:.doc',
+ },
+ },
+ 'uploadFileTypes': 'アップロードされたファイルのタイプ',
+ 'localUpload': 'ローカル アップロード',
+ 'both': '両方',
+ 'maxNumberOfUploads': 'アップロードの最大数',
+ 'maxNumberTip': 'ドキュメント < {{docLimit}}, 画像 < {{imgLimit}}, 音声 < {{audioLimit}}, 映像 < {{videoLimit}}',
'errorMsg': {
varNameRequired: '変数名は必須です',
labelNameRequired: 'ラベル名は必須です',
@@ -341,6 +392,7 @@ const translation = {
vision: {
name: 'ビジョン',
description: 'ビジョンを有効にすると、モデルが画像を受け取り、それに関する質問に答えることができます。',
+ onlySupportVisionModelTip: 'ビジョンモデルのみをサポート',
settings: '設定',
visionSettings: {
title: 'ビジョン設定',
@@ -369,7 +421,7 @@ const translation = {
voice: '音声',
autoPlay: '自動再生',
autoPlayEnabled: '開ける',
- autoPlayDisabled: '關閉',
+ autoPlayDisabled: '閉じる',
},
},
openingStatement: {
@@ -408,6 +460,7 @@ const translation = {
run: '実行',
},
result: '出力テキスト',
+ noResult: '出力はここに表示されます。',
datasetConfig: {
settingTitle: 'リトリーバル設定',
knowledgeTip: 'ナレッジを追加するには「+」ボタンをクリックしてください',
diff --git a/web/i18n/ja-JP/dataset-settings.ts b/web/i18n/ja-JP/dataset-settings.ts
index 1eb3dabb74..f0b8c76a24 100644
--- a/web/i18n/ja-JP/dataset-settings.ts
+++ b/web/i18n/ja-JP/dataset-settings.ts
@@ -24,7 +24,7 @@ const translation = {
embeddingModelTipLink: '設定',
retrievalSetting: {
title: '検索設定',
- learnMore: '詳細を学ぶ',
+ learnMore: '詳細を見る',
description: ' 検索方法についての詳細',
longDescription: ' 検索方法についての詳細については、いつでもナレッジの設定で変更できます。',
},
diff --git a/web/i18n/ja-JP/dataset.ts b/web/i18n/ja-JP/dataset.ts
index d995509a3f..f15f0dfb1a 100644
--- a/web/i18n/ja-JP/dataset.ts
+++ b/web/i18n/ja-JP/dataset.ts
@@ -101,7 +101,7 @@ const translation = {
end: '.次に、対応するナレッジIDを見つけて、左側のフォームに入力します。すべての情報が正しい場合は、接続ボタンをクリックした後、ナレッジベースの検索テストに自動的にジャンプします。',
},
title: '外部ナレッジベースに接続する方法',
- learnMore: '詳細情報',
+ learnMore: '詳細を見る',
},
connectHelper: {
helper2: '取得機能のみがサポートされています',
diff --git a/web/i18n/ja-JP/workflow.ts b/web/i18n/ja-JP/workflow.ts
index 632e5712e5..b6c7786081 100644
--- a/web/i18n/ja-JP/workflow.ts
+++ b/web/i18n/ja-JP/workflow.ts
@@ -19,6 +19,10 @@ const translation = {
goBackToEdit: '編集に戻る',
conversationLog: '会話ログ',
features: '機能',
+ featuresDescription: 'Webアプリのユーザーエクスペリエンスを強化する',
+ ImageUploadLegacyTip: '開始フォームでファイルタイプ変数を作成できるようになりました。まもなく、画像アップロード機能のサポートは終了いたします。',
+ fileUploadTip: '画像アップロード機能がファイルのアップロード機能にアップグレードされました。',
+ featuresDocLink: '詳細を見る',
debugAndPreview: 'プレビュー',
restart: '再起動',
currentDraft: '現在の下書き',
@@ -55,7 +59,7 @@ const translation = {
viewOnly: '表示のみ',
showRunHistory: '実行履歴を表示',
enableJinja: 'Jinjaテンプレートのサポートを有効にする',
- learnMore: '詳細を学ぶ',
+ learnMore: '詳細を見る',
copy: 'コピー',
duplicate: '複製',
addBlock: 'ブロックを追加',
@@ -95,10 +99,6 @@ const translation = {
addParallelNode: '並列ノードを追加',
parallel: '並列',
branch: 'ブランチ',
- fileUploadTip: '画像のアップロード機能がファイルのアップロードにアップグレードされました。',
- featuresDocLink: '詳細情報',
- ImageUploadLegacyTip: 'これで、開始フォームでファイルタイプ変数を作成できるようになりました。今後、画像のアップロード機能のサポートは終了いたします。',
- featuresDescription: 'Webアプリのユーザーエクスペリエンスを強化',
},
env: {
envPanelTitle: '環境変数',
@@ -229,8 +229,8 @@ const translation = {
'iteration-start': 'イテレーション開始',
'iteration': 'イテレーション',
'parameter-extractor': 'パラメーター抽出',
- 'document-extractor': 'ドキュメントエクストラクター',
- 'list-operator': 'リスト演算子',
+ 'document-extractor': 'テキスト抽出ツール',
+ 'list-operator': 'リスト処理',
},
blocksAbout: {
'start': 'ワークフローの開始に必要なパラメータを定義します',
@@ -248,7 +248,7 @@ const translation = {
'variable-aggregator': '複数のブランチの変数を1つの変数に集約し、下流のノードに対して統一された設定を行います。',
'iteration': 'リストオブジェクトに対して複数のステップを実行し、すべての結果が出力されるまで繰り返します。',
'parameter-extractor': '自然言語からツールの呼び出しやHTTPリクエストのための構造化されたパラメーターを抽出するためにLLMを使用します。',
- 'document-extractor': 'アップロードされたドキュメントを LLM で簡単に理解できるテキスト コンテンツに解析するために使用されます。',
+ 'document-extractor': 'アップロードされたドキュメントを LLM で簡単に理解できるテキストのコンテンツに解析するために使用されます。',
'list-operator': '配列のコンテンツをフィルタリングまたはソートするために使用されます。',
},
operator: {
@@ -405,7 +405,7 @@ const translation = {
writeLabel: '書き込みタイムアウト',
writePlaceholder: '書き込みタイムアウトを秒で入力',
},
- type: '種類',
+ type: 'タイプ',
binaryFileVariable: 'バイナリファイル変数',
},
code: {
@@ -443,21 +443,21 @@ const translation = {
'null': 'null',
'not null': 'nullでない',
'regex match': '正規表現マッチ',
- 'in': 'で',
- 'not exists': '存在しません',
- 'exists': '存在',
+ 'in': '含まれている',
'not in': '含まれていない',
'all of': 'すべての',
+ 'exists': '存在します',
+ 'not exists': '存在しません',
},
enterValue: '値を入力',
addCondition: '条件を追加',
conditionNotSetup: '条件が設定されていません',
selectVariable: '変数を選択...',
optionName: {
- audio: 'オーディオ',
+ audio: '音声',
localUpload: 'ローカルアップロード',
image: '画像',
- video: 'ビデオ',
+ video: '映像',
doc: 'ドキュメント',
url: 'URL',
},
@@ -583,7 +583,7 @@ const translation = {
text: '抽出されたテキスト',
},
inputVar: '入力変数',
- learnMore: '詳細情報',
+ learnMore: '詳細を見る',
supportFileTypes: 'サポートするファイルタイプ: {{types}}。',
},
listFilter: {
@@ -593,13 +593,13 @@ const translation = {
result: 'フィルター結果',
},
limit: 'トップN',
- asc: 'ASCの',
+ asc: 'ASC',
filterCondition: 'フィルター条件',
filterConditionKey: 'フィルター条件キー',
- orderBy: '注文順',
+ orderBy: '並べる順番',
filterConditionComparisonValue: 'フィルター条件の値',
- selectVariableKeyPlaceholder: 'サブ変数キーの選択',
- filterConditionComparisonOperator: 'フィルター条件比較演算子',
+ selectVariableKeyPlaceholder: 'サブ変数キーを選択する',
+ filterConditionComparisonOperator: 'フィルター条件の比較オペレーター',
inputVar: '入力変数',
desc: 'DESC',
},
diff --git a/web/package.json b/web/package.json
index 208f714075..a1ebb26eea 100644
--- a/web/package.json
+++ b/web/package.json
@@ -1,6 +1,6 @@
{
"name": "dify-web",
- "version": "0.10.1",
+ "version": "0.10.2",
"private": true,
"engines": {
"node": ">=18.17.0"