merge main

StyleZhang 2024-10-30 11:20:20 +08:00
commit 0f4291bd36
127 changed files with 4376 additions and 149 deletions

View File

@ -78,7 +78,7 @@ jobs:
- name: Run Workflow
run: poetry run -C api bash dev/pytest/pytest_workflow.sh
- - name: Set up Vector Stores (Weaviate, Qdrant, PGVector, Milvus, PgVecto-RS, Chroma, MyScale, ElasticSearch)
+ - name: Set up Vector Stores (Weaviate, Qdrant, PGVector, Milvus, PgVecto-RS, Chroma, MyScale, ElasticSearch, Couchbase)
uses: hoverkraft-tech/compose-action@v2.0.0
with:
compose-file: |
@ -86,6 +86,7 @@ jobs:
services: |
weaviate
qdrant
couchbase-server
etcd
minio
milvus-standalone

View File

@ -7,5 +7,7 @@ yq eval '.services["milvus-standalone"].ports += ["19530:19530"]' -i docker/dock
yq eval '.services.pgvector.ports += ["5433:5432"]' -i docker/docker-compose.yaml
yq eval '.services["pgvecto-rs"].ports += ["5431:5432"]' -i docker/docker-compose.yaml
yq eval '.services["elasticsearch"].ports += ["9200:9200"]' -i docker/docker-compose.yaml
yq eval '.services.couchbase-server.ports += ["8091-8096:8091-8096"]' -i docker/docker-compose.yaml
yq eval '.services.couchbase-server.ports += ["11210:11210"]' -i docker/docker-compose.yaml
echo "Ports exposed for sandbox, weaviate, qdrant, chroma, milvus, pgvector, pgvecto-rs, elasticsearch"
echo "Ports exposed for sandbox, weaviate, qdrant, chroma, milvus, pgvector, pgvecto-rs, elasticsearch, couchbase"

.gitignore (vendored): 2 changes
View File

@ -173,6 +173,8 @@ docker/volumes/myscale/log/*
docker/volumes/unstructured/*
docker/volumes/pgvector/data/*
docker/volumes/pgvecto_rs/data/*
docker/volumes/couchbase/*
docker/volumes/oceanbase/*
docker/nginx/conf.d/default.conf
docker/nginx/ssl/*

README_PT.md (new file): 241 lines
View File

@ -0,0 +1,241 @@
![cover-v5-optimized](https://github.com/langgenius/dify/assets/13230914/f9e19af5-61ba-4119-b926-d10c4c06ebab)
<p align="center">
📌 <a href="https://dify.ai/blog/introducing-dify-workflow-file-upload-a-demo-on-ai-podcast">Introduzindo o Dify Workflow com Upload de Arquivo: Recrie o Podcast Google NotebookLM</a>
</p>
<p align="center">
<a href="https://cloud.dify.ai">Dify Cloud</a> ·
<a href="https://docs.dify.ai/getting-started/install-self-hosted">Auto-hospedagem</a> ·
<a href="https://docs.dify.ai">Documentação</a> ·
<a href="https://udify.app/chat/22L1zSxg6yW1cWQg">Consultas empresariais</a>
</p>
<p align="center">
<a href="https://dify.ai" target="_blank">
<img alt="Static Badge" src="https://img.shields.io/badge/Product-F04438"></a>
<a href="https://dify.ai/pricing" target="_blank">
<img alt="Static Badge" src="https://img.shields.io/badge/free-pricing?logo=free&color=%20%23155EEF&label=pricing&labelColor=%20%23528bff"></a>
<a href="https://discord.gg/FngNHpbcY7" target="_blank">
<img src="https://img.shields.io/discord/1082486657678311454?logo=discord&labelColor=%20%235462eb&logoColor=%20%23f5f5f5&color=%20%235462eb"
alt="chat on Discord"></a>
<a href="https://twitter.com/intent/follow?screen_name=dify_ai" target="_blank">
<img src="https://img.shields.io/twitter/follow/dify_ai?logo=X&color=%20%23f5f5f5"
alt="follow on X(Twitter)"></a>
<a href="https://hub.docker.com/u/langgenius" target="_blank">
<img alt="Docker Pulls" src="https://img.shields.io/docker/pulls/langgenius/dify-web?labelColor=%20%23FDB062&color=%20%23f79009"></a>
<a href="https://github.com/langgenius/dify/graphs/commit-activity" target="_blank">
<img alt="Commits last month" src="https://img.shields.io/github/commit-activity/m/langgenius/dify?labelColor=%20%2332b583&color=%20%2312b76a"></a>
<a href="https://github.com/langgenius/dify/" target="_blank">
<img alt="Issues closed" src="https://img.shields.io/github/issues-search?query=repo%3Alanggenius%2Fdify%20is%3Aclosed&label=issues%20closed&labelColor=%20%237d89b0&color=%20%235d6b98"></a>
<a href="https://github.com/langgenius/dify/discussions/" target="_blank">
<img alt="Discussion posts" src="https://img.shields.io/github/discussions/langgenius/dify?labelColor=%20%239b8afb&color=%20%237a5af8"></a>
</p>
<p align="center">
<a href="./README.md"><img alt="README em Inglês" src="https://img.shields.io/badge/English-d9d9d9"></a>
<a href="./README_CN.md"><img alt="简体中文版自述文件" src="https://img.shields.io/badge/简体中文-d9d9d9"></a>
<a href="./README_JA.md"><img alt="日本語のREADME" src="https://img.shields.io/badge/日本語-d9d9d9"></a>
<a href="./README_ES.md"><img alt="README em Espanhol" src="https://img.shields.io/badge/Español-d9d9d9"></a>
<a href="./README_FR.md"><img alt="README em Francês" src="https://img.shields.io/badge/Français-d9d9d9"></a>
<a href="./README_KL.md"><img alt="README tlhIngan Hol" src="https://img.shields.io/badge/Klingon-d9d9d9"></a>
<a href="./README_KR.md"><img alt="README em Coreano" src="https://img.shields.io/badge/한국어-d9d9d9"></a>
<a href="./README_AR.md"><img alt="README em Árabe" src="https://img.shields.io/badge/العربية-d9d9d9"></a>
<a href="./README_TR.md"><img alt="README em Turco" src="https://img.shields.io/badge/Türkçe-d9d9d9"></a>
<a href="./README_VI.md"><img alt="README em Vietnamita" src="https://img.shields.io/badge/Ti%E1%BA%BFng%20Vi%E1%BB%87t-d9d9d9"></a>
<a href="./README_PT.md"><img alt="README em Português - BR" src="https://img.shields.io/badge/Portugu%C3%AAs-BR?style=flat&label=BR&color=d9d9d9"></a>
</p>
Dify é uma plataforma de desenvolvimento de aplicativos LLM de código aberto. Sua interface intuitiva combina workflow de IA, pipeline RAG, capacidades de agente, gerenciamento de modelos, recursos de observabilidade e muito mais, permitindo que você vá rapidamente do protótipo à produção. Aqui está uma lista das principais funcionalidades:
</br> </br>
**1. Workflow**:
Construa e teste workflows poderosos de IA em uma interface visual, aproveitando todos os recursos a seguir e muito mais.
https://github.com/langgenius/dify/assets/13230914/356df23e-1604-483d-80a6-9517ece318aa
**2. Suporte abrangente a modelos**:
Integração perfeita com centenas de LLMs proprietários e de código aberto de diversas provedoras e soluções auto-hospedadas, abrangendo GPT, Mistral, Llama3 e qualquer modelo compatível com a API da OpenAI. A lista completa de provedores suportados pode ser encontrada [aqui](https://docs.dify.ai/getting-started/readme/model-providers).
![providers-v5](https://github.com/langgenius/dify/assets/13230914/5a17bdbe-097a-4100-8363-40255b70f6e3)
**3. IDE de Prompt**:
Interface intuitiva para criação de prompts, comparação de desempenho de modelos e adição de recursos como conversão de texto para fala em um aplicativo baseado em chat.
**4. Pipeline RAG**:
Extensas capacidades de RAG que cobrem desde a ingestão de documentos até a recuperação, com suporte nativo para extração de texto de PDFs, PPTs e outros formatos de documentos comuns.
**5. Capacidades de agente**:
Você pode definir agentes com base em LLM Function Calling ou ReAct e adicionar ferramentas pré-construídas ou personalizadas para o agente. O Dify oferece mais de 50 ferramentas integradas para agentes de IA, como Google Search, DALL·E, Stable Diffusion e WolframAlpha.
**6. LLMOps**:
Monitore e analise os registros e o desempenho do aplicativo ao longo do tempo. É possível melhorar continuamente prompts, conjuntos de dados e modelos com base nos dados de produção e anotações.
**7. Backend como Serviço**:
Todos os recursos do Dify vêm com APIs correspondentes, permitindo que você integre o Dify sem esforço na lógica de negócios da sua empresa.
## Comparação de recursos
<table style="width: 100%;">
<tr>
<th align="center">Recurso</th>
<th align="center">Dify.AI</th>
<th align="center">LangChain</th>
<th align="center">Flowise</th>
<th align="center">OpenAI Assistants API</th>
</tr>
<tr>
<td align="center">Abordagem de Programação</td>
<td align="center">Orientada a API + Aplicativo</td>
<td align="center">Código Python</td>
<td align="center">Orientada a Aplicativo</td>
<td align="center">Orientada a API</td>
</tr>
<tr>
<td align="center">LLMs Suportados</td>
<td align="center">Variedade Rica</td>
<td align="center">Variedade Rica</td>
<td align="center">Variedade Rica</td>
<td align="center">Apenas OpenAI</td>
</tr>
<tr>
<td align="center">RAG Engine</td>
<td align="center">✅</td>
<td align="center">✅</td>
<td align="center">✅</td>
<td align="center">✅</td>
</tr>
<tr>
<td align="center">Agente</td>
<td align="center">✅</td>
<td align="center">✅</td>
<td align="center">❌</td>
<td align="center">✅</td>
</tr>
<tr>
<td align="center">Workflow</td>
<td align="center">✅</td>
<td align="center">❌</td>
<td align="center">✅</td>
<td align="center">❌</td>
</tr>
<tr>
<td align="center">Observabilidade</td>
<td align="center">✅</td>
<td align="center">✅</td>
<td align="center">❌</td>
<td align="center">❌</td>
</tr>
<tr>
<td align="center">Recursos Empresariais (SSO/Controle de Acesso)</td>
<td align="center">✅</td>
<td align="center">❌</td>
<td align="center">❌</td>
<td align="center">❌</td>
</tr>
<tr>
<td align="center">Implantação Local</td>
<td align="center">✅</td>
<td align="center">✅</td>
<td align="center">✅</td>
<td align="center">❌</td>
</tr>
</table>
## Usando o Dify
- **Nuvem </br>**
Oferecemos o serviço [Dify Cloud](https://dify.ai) para qualquer pessoa experimentar sem nenhuma configuração. Ele fornece todas as funcionalidades da versão auto-hospedada, incluindo 200 chamadas GPT-4 gratuitas no plano sandbox.
- **Auto-hospedagem do Dify Community Edition</br>**
Configure rapidamente o Dify no seu ambiente com este [guia inicial](#quick-start).
Use nossa [documentação](https://docs.dify.ai) para referências adicionais e instruções mais detalhadas.
- **Dify para empresas/organizações</br>**
Oferecemos recursos adicionais voltados para empresas. [Envie suas perguntas através deste chatbot](https://udify.app/chat/22L1zSxg6yW1cWQg) ou [envie-nos um e-mail](mailto:business@dify.ai?subject=[GitHub]Business%20License%20Inquiry) para discutir necessidades empresariais. </br>
> Para startups e pequenas empresas que utilizam AWS, confira o [Dify Premium no AWS Marketplace](https://aws.amazon.com/marketplace/pp/prodview-t22mebxzwjhu6) e implemente no seu próprio AWS VPC com um clique. É uma oferta AMI acessível com a opção de criar aplicativos com logotipo e marca personalizados.
## Mantendo-se atualizado
Dê uma estrela no Dify no GitHub e seja notificado imediatamente sobre novos lançamentos.
![star-us](https://github.com/langgenius/dify/assets/13230914/b823edc1-6388-4e25-ad45-2f6b187adbb4)
## Início rápido
> Antes de instalar o Dify, certifique-se de que sua máquina atenda aos seguintes requisitos mínimos de sistema:
>
>- CPU >= 2 Núcleos
>- RAM >= 4 GiB
</br>
A maneira mais fácil de iniciar o servidor Dify é executar nosso arquivo [docker-compose.yml](docker/docker-compose.yaml). Antes de rodar o comando de instalação, certifique-se de que o [Docker](https://docs.docker.com/get-docker/) e o [Docker Compose](https://docs.docker.com/compose/install/) estão instalados na sua máquina:
```bash
cd docker
cp .env.example .env
docker compose up -d
```
Após a execução, você pode acessar o painel do Dify no navegador em [http://localhost/install](http://localhost/install) e iniciar o processo de inicialização.
> Se você deseja contribuir com o Dify ou fazer desenvolvimento adicional, consulte nosso [guia para implantar a partir do código fonte](https://docs.dify.ai/getting-started/install-self-hosted/local-source-code).
## Próximos passos
Se precisar personalizar a configuração, consulte os comentários no nosso arquivo [.env.example](docker/.env.example) e atualize os valores correspondentes no seu arquivo `.env`. Além disso, talvez seja necessário fazer ajustes no próprio arquivo `docker-compose.yaml`, como alterar versões de imagem, mapeamentos de portas ou montagens de volumes, com base no seu ambiente de implantação específico e nas suas necessidades. Após fazer quaisquer alterações, execute novamente `docker-compose up -d`. Você pode encontrar a lista completa de variáveis de ambiente disponíveis [aqui](https://docs.dify.ai/getting-started/install-self-hosted/environments).
Se deseja configurar uma instalação de alta disponibilidade, há [Helm Charts](https://helm.sh/) e arquivos YAML contribuídos pela comunidade que permitem a implantação do Dify no Kubernetes.
- [Helm Chart de @LeoQuote](https://github.com/douban/charts/tree/master/charts/dify)
- [Helm Chart de @BorisPolonsky](https://github.com/BorisPolonsky/dify-helm)
- [Arquivo YAML de @Winson-030](https://github.com/Winson-030/dify-kubernetes)
#### Usando o Terraform para Implantação
Implante o Dify na Plataforma Cloud com um único clique usando [terraform](https://www.terraform.io/)
##### Azure Global
- [Azure Terraform por @nikawang](https://github.com/nikawang/dify-azure-terraform)
##### Google Cloud
- [Google Cloud Terraform por @sotazum](https://github.com/DeNA/dify-google-cloud-terraform)
## Contribuindo
Para aqueles que desejam contribuir com código, veja nosso [Guia de Contribuição](https://github.com/langgenius/dify/blob/main/CONTRIBUTING.md).
Ao mesmo tempo, considere apoiar o Dify compartilhando-o nas redes sociais e em eventos e conferências.
> Estamos buscando contribuidores para ajudar na tradução do Dify para idiomas além de Mandarim e Inglês. Se você tiver interesse em ajudar, consulte o [README i18n](https://github.com/langgenius/dify/blob/main/web/i18n/README.md) para mais informações e deixe-nos um comentário no canal `global-users` em nosso [Servidor da Comunidade no Discord](https://discord.gg/8Tpq4AcN9c).
**Contribuidores**
<a href="https://github.com/langgenius/dify/graphs/contributors">
<img src="https://contrib.rocks/image?repo=langgenius/dify" />
</a>
## Comunidade e contato
* [Discussões no GitHub](https://github.com/langgenius/dify/discussions). Melhor para: compartilhar feedback e fazer perguntas.
* [Problemas no GitHub](https://github.com/langgenius/dify/issues). Melhor para: relatar bugs encontrados no Dify.AI e propor novos recursos. Veja nosso [Guia de Contribuição](https://github.com/langgenius/dify/blob/main/CONTRIBUTING.md).
* [Discord](https://discord.gg/FngNHpbcY7). Melhor para: compartilhar suas aplicações e interagir com a comunidade.
* [X(Twitter)](https://twitter.com/dify_ai). Melhor para: compartilhar suas aplicações e interagir com a comunidade.
## Histórico de estrelas
[![Gráfico de Histórico de Estrelas](https://api.star-history.com/svg?repos=langgenius/dify&type=Date)](https://star-history.com/#langgenius/dify&Date)
## Divulgação de segurança
Para proteger sua privacidade, evite postar problemas de segurança no GitHub. Em vez disso, envie suas perguntas para security@dify.ai e forneceremos uma resposta mais detalhada.
## Licença
Este repositório está disponível sob a [Licença de Código Aberto Dify](LICENSE), que é essencialmente Apache 2.0 com algumas restrições adicionais.

View File

@ -120,7 +120,7 @@ SUPABASE_URL=your-server-url
WEB_API_CORS_ALLOW_ORIGINS=http://127.0.0.1:3000,*
CONSOLE_CORS_ALLOW_ORIGINS=http://127.0.0.1:3000,*
- # Vector database configuration, support: weaviate, qdrant, milvus, myscale, relyt, pgvecto_rs, pgvector, pgvector, chroma, opensearch, tidb_vector, vikingdb, upstash
+ # Vector database configuration, support: weaviate, qdrant, milvus, myscale, relyt, pgvecto_rs, pgvector, pgvector, chroma, opensearch, tidb_vector, couchbase, vikingdb, upstash
VECTOR_STORE=weaviate
# Weaviate configuration
@ -136,6 +136,13 @@ QDRANT_CLIENT_TIMEOUT=20
QDRANT_GRPC_ENABLED=false
QDRANT_GRPC_PORT=6334
#Couchbase configuration
COUCHBASE_CONNECTION_STRING=127.0.0.1
COUCHBASE_USER=Administrator
COUCHBASE_PASSWORD=password
COUCHBASE_BUCKET_NAME=Embeddings
COUCHBASE_SCOPE_NAME=_default
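As a sanity check for these values, a minimal connection sketch with the Couchbase Python SDK (assuming the `couchbase` 4.x package, which is not shown in this hunk) looks like:

```python
from couchbase.auth import PasswordAuthenticator
from couchbase.cluster import Cluster
from couchbase.options import ClusterOptions

# Values mirror the .env defaults above; "couchbase://" is the usual
# scheme for a plain (non-TLS) connection string.
auth = PasswordAuthenticator("Administrator", "password")
cluster = Cluster("couchbase://127.0.0.1", ClusterOptions(auth))
bucket = cluster.bucket("Embeddings")      # COUCHBASE_BUCKET_NAME
scope = bucket.scope("_default")           # COUCHBASE_SCOPE_NAME
```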
# Milvus configuration
MILVUS_URI=http://127.0.0.1:19530
MILVUS_TOKEN=
@ -195,6 +202,20 @@ TIDB_VECTOR_USER=xxx.root
TIDB_VECTOR_PASSWORD=xxxxxx
TIDB_VECTOR_DATABASE=dify
# Tidb on qdrant configuration
TIDB_ON_QDRANT_URL=http://127.0.0.1
TIDB_ON_QDRANT_API_KEY=dify
TIDB_ON_QDRANT_CLIENT_TIMEOUT=20
TIDB_ON_QDRANT_GRPC_ENABLED=false
TIDB_ON_QDRANT_GRPC_PORT=6334
TIDB_PUBLIC_KEY=dify
TIDB_PRIVATE_KEY=dify
TIDB_API_URL=http://127.0.0.1
TIDB_IAM_API_URL=http://127.0.0.1
TIDB_REGION=regions/aws-us-east-1
TIDB_PROJECT_ID=dify
TIDB_SPEND_LIMIT=100
# Chroma configuration
CHROMA_HOST=127.0.0.1
CHROMA_PORT=8000
@ -242,6 +263,14 @@ VIKINGDB_SCHEMA=http
VIKINGDB_CONNECTION_TIMEOUT=30
VIKINGDB_SOCKET_TIMEOUT=30
# OceanBase Vector configuration
OCEANBASE_VECTOR_HOST=127.0.0.1
OCEANBASE_VECTOR_PORT=2881
OCEANBASE_VECTOR_USER=root@test
OCEANBASE_VECTOR_PASSWORD=
OCEANBASE_VECTOR_DATABASE=test
OCEANBASE_MEMORY_LIMIT=6G
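OceanBase speaks the MySQL wire protocol, so these settings can be smoke-tested with any MySQL client; a minimal sketch using `pymysql` (an assumption for illustration, not a dependency added by this commit):

```python
import pymysql

# Defaults mirror the block above; "root@test" follows OceanBase's
# user@tenant login convention.
conn = pymysql.connect(
    host="127.0.0.1",
    port=2881,
    user="root@test",
    password="",
    database="test",
)
with conn.cursor() as cur:
    cur.execute("SELECT version()")
    print(cur.fetchone())
```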
# Upload configuration
UPLOAD_FILE_SIZE_LIMIT=15
UPLOAD_FILE_BATCH_LIMIT=5

View File

@ -278,6 +278,8 @@ def migrate_knowledge_vector_database():
VectorType.BAIDU,
VectorType.VIKINGDB,
VectorType.UPSTASH,
VectorType.COUCHBASE,
VectorType.OCEANBASE,
}
page = 1
while True:

View File

@ -10,6 +10,7 @@ from pydantic import (
PositiveInt,
computed_field,
)
from pydantic_extra_types.timezone_name import TimeZoneName
from pydantic_settings import BaseSettings
from configs.feature.hosted_service import HostedServiceConfig
@ -339,8 +340,9 @@ class LoggingConfig(BaseSettings):
default=None,
)
-    LOG_TZ: Optional[str] = Field(
-        description="Timezone for log timestamps (e.g., 'America/New_York')",
+    LOG_TZ: Optional[TimeZoneName] = Field(
+        description="Timezone for log timestamps. Allowed timezone values can be referred to IANA Time Zone Database,"
+        " e.g., 'America/New_York')",
default=None,
)
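Switching the annotation from `str` to `TimeZoneName` means invalid zone names are now rejected when settings load, instead of failing later at logging setup. A minimal sketch of the behavior, assuming `pydantic-extra-types` with pydantic v2:

```python
from pydantic import ValidationError
from pydantic_extra_types.timezone_name import TimeZoneName
from pydantic_settings import BaseSettings

class Demo(BaseSettings):
    LOG_TZ: TimeZoneName | None = None

Demo(LOG_TZ="America/New_York")    # accepted: a valid IANA zone name
try:
    Demo(LOG_TZ="Mars/Olympus")    # rejected at load time
except ValidationError as e:
    print(e.errors()[0]["type"])
```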

View File

@ -17,9 +17,11 @@ from configs.middleware.storage.tencent_cos_storage_config import TencentCloudCO
from configs.middleware.storage.volcengine_tos_storage_config import VolcengineTOSStorageConfig
from configs.middleware.vdb.analyticdb_config import AnalyticdbConfig
from configs.middleware.vdb.chroma_config import ChromaConfig
from configs.middleware.vdb.couchbase_config import CouchbaseConfig
from configs.middleware.vdb.elasticsearch_config import ElasticsearchConfig
from configs.middleware.vdb.milvus_config import MilvusConfig
from configs.middleware.vdb.myscale_config import MyScaleConfig
from configs.middleware.vdb.oceanbase_config import OceanBaseVectorConfig
from configs.middleware.vdb.opensearch_config import OpenSearchConfig
from configs.middleware.vdb.oracle_config import OracleConfig
from configs.middleware.vdb.pgvector_config import PGVectorConfig
@ -251,9 +253,11 @@ class MiddlewareConfig(
TiDBVectorConfig,
WeaviateConfig,
ElasticsearchConfig,
CouchbaseConfig,
InternalTestConfig,
VikingDBConfig,
UpstashConfig,
TidbOnQdrantConfig,
OceanBaseVectorConfig,
):
pass

View File

@ -0,0 +1,34 @@
from typing import Optional
from pydantic import BaseModel, Field
class CouchbaseConfig(BaseModel):
"""
Couchbase configs
"""
COUCHBASE_CONNECTION_STRING: Optional[str] = Field(
description="COUCHBASE connection string",
default=None,
)
COUCHBASE_USER: Optional[str] = Field(
description="COUCHBASE user",
default=None,
)
COUCHBASE_PASSWORD: Optional[str] = Field(
description="COUCHBASE password",
default=None,
)
COUCHBASE_BUCKET_NAME: Optional[str] = Field(
description="COUCHBASE bucket name",
default=None,
)
COUCHBASE_SCOPE_NAME: Optional[str] = Field(
description="COUCHBASE scope name",
default=None,
)

View File

@ -0,0 +1,35 @@
from typing import Optional
from pydantic import Field, PositiveInt
from pydantic_settings import BaseSettings
class OceanBaseVectorConfig(BaseSettings):
"""
Configuration settings for OceanBase Vector database
"""
OCEANBASE_VECTOR_HOST: Optional[str] = Field(
description="Hostname or IP address of the OceanBase Vector server (e.g. 'localhost')",
default=None,
)
OCEANBASE_VECTOR_PORT: Optional[PositiveInt] = Field(
description="Port number on which the OceanBase Vector server is listening (default is 2881)",
default=2881,
)
OCEANBASE_VECTOR_USER: Optional[str] = Field(
description="Username for authenticating with the OceanBase Vector database",
default=None,
)
OCEANBASE_VECTOR_PASSWORD: Optional[str] = Field(
description="Password for authenticating with the OceanBase Vector database",
default=None,
)
OCEANBASE_VECTOR_DATABASE: Optional[str] = Field(
description="Name of the OceanBase Vector database to connect to",
default=None,
)

View File

@ -63,3 +63,8 @@ class TidbOnQdrantConfig(BaseSettings):
description="Tidb project id",
default=None,
)
TIDB_SPEND_LIMIT: Optional[int] = Field(
description="Tidb spend limit",
default=100,
)

View File

@ -9,7 +9,7 @@ class PackagingInfo(BaseSettings):
CURRENT_VERSION: str = Field(
description="Dify version",
default="0.10.1",
default="0.10.2",
)
COMMIT_SHA: str = Field(

View File

@ -628,6 +628,7 @@ class DatasetRetrievalSettingApi(Resource):
| VectorType.BAIDU
| VectorType.VIKINGDB
| VectorType.UPSTASH
| VectorType.OCEANBASE
):
return {"retrieval_method": [RetrievalMethod.SEMANTIC_SEARCH.value]}
case (
@ -640,6 +641,7 @@ class DatasetRetrievalSettingApi(Resource):
| VectorType.ELASTICSEARCH
| VectorType.PGVECTOR
| VectorType.TIDB_ON_QDRANT
| VectorType.COUCHBASE
):
return {
"retrieval_method": [
@ -668,6 +670,7 @@ class DatasetRetrievalSettingMockApi(Resource):
| VectorType.BAIDU
| VectorType.VIKINGDB
| VectorType.UPSTASH
| VectorType.OCEANBASE
):
return {"retrieval_method": [RetrievalMethod.SEMANTIC_SEARCH.value]}
case (
@ -678,6 +681,7 @@ class DatasetRetrievalSettingMockApi(Resource):
| VectorType.MYSCALE
| VectorType.ORACLE
| VectorType.ELASTICSEARCH
| VectorType.COUCHBASE
| VectorType.PGVECTOR
):
return {

View File

@ -230,7 +230,7 @@ class DocumentUpdateByFileApi(DatasetApiResource):
except ProviderTokenNotInitError as ex:
raise ProviderNotInitializeError(ex.description)
document = documents[0]
-        documents_and_batch_fields = {"document": marshal(document, document_fields), "batch": batch}
+        documents_and_batch_fields = {"document": marshal(document, document_fields), "batch": document.batch}
return documents_and_batch_fields, 200

View File

@ -165,6 +165,12 @@ class BaseAgentRunner(AppRunner):
continue
parameter_type = parameter.type.as_normal_type()
if parameter.type in {
ToolParameter.ToolParameterType.SYSTEM_FILES,
ToolParameter.ToolParameterType.FILE,
ToolParameter.ToolParameterType.FILES,
}:
continue
enum = []
if parameter.type == ToolParameter.ToolParameterType.SELECT:
enum = [option.value for option in parameter.options]
@ -250,6 +256,12 @@ class BaseAgentRunner(AppRunner):
continue
parameter_type = parameter.type.as_normal_type()
if parameter.type in {
ToolParameter.ToolParameterType.SYSTEM_FILES,
ToolParameter.ToolParameterType.FILE,
ToolParameter.ToolParameterType.FILES,
}:
continue
enum = []
if parameter.type == ToolParameter.ToolParameterType.SELECT:
enum = [option.value for option in parameter.options]

View File

@ -76,8 +76,16 @@ def to_prompt_message_content(f: File, /):
 def download(f: File, /):
-    upload_file = file_repository.get_upload_file(session=db.session(), file=f)
-    return _download_file_content(upload_file.key)
+    if f.transfer_method == FileTransferMethod.TOOL_FILE:
+        tool_file = file_repository.get_tool_file(session=db.session(), file=f)
+        return _download_file_content(tool_file.file_key)
+    elif f.transfer_method == FileTransferMethod.LOCAL_FILE:
+        upload_file = file_repository.get_upload_file(session=db.session(), file=f)
+        return _download_file_content(upload_file.key)
     # remote file
     response = ssrf_proxy.get(f.remote_url, follow_redirects=True)
     response.raise_for_status()
     return response.content
def _download_file_content(path: str, /):

File diff suppressed because one or more lines are too long

[New image file: 9.8 KiB]

View File

@ -0,0 +1,3 @@
<svg width="40" height="40" viewBox="0 0 40 40" fill="none" xmlns="http://www.w3.org/2000/svg">
<path fill-rule="evenodd" clip-rule="evenodd" d="M25.132 24.3947C25.497 25.7527 25.8984 27.1413 26.3334 28.5834C26.7302 29.8992 25.5459 30.4167 25.0752 29.1758C24.571 27.8466 24.0885 26.523 23.6347 25.1729C21.065 26.4654 18.5025 27.5424 15.5961 28.7541C16.7581 33.0256 17.8309 36.5984 19.4952 39.9935C19.4953 39.9936 19.4953 39.9937 19.4954 39.9938C19.6631 39.9979 19.8313 40 20 40C31.0457 40 40 31.0457 40 20C40 16.0335 38.8453 12.3366 36.8537 9.22729C31.6585 9.69534 27.0513 10.4562 22.8185 11.406C22.8882 12.252 22.9677 13.0739 23.0555 13.855C23.3824 16.7604 23.9112 19.5281 24.6137 22.3836C27.0581 21.2848 29.084 20.3225 30.6816 19.522C32.2154 18.7535 33.6943 18.7062 31.2018 20.6594C29.0388 22.1602 27.0644 23.3566 25.132 24.3947ZM36.1559 8.20846C33.0001 3.89184 28.1561 0.887462 22.5955 0.166882C22.4257 2.86234 22.4785 6.26344 22.681 9.50447C26.7473 8.88859 31.1721 8.46032 36.1559 8.20846ZM19.9369 9.73661e-05C19.7594 2.92694 19.8384 6.65663 20.19 9.91293C17.3748 10.4109 14.7225 11.0064 12.1592 11.7038C12.0486 10.4257 11.9927 9.25764 11.9927 8.24178C11.9927 7.5054 11.3957 6.90844 10.6593 6.90844C9.92296 6.90844 9.32601 7.5054 9.32601 8.24178C9.32601 9.47868 9.42873 10.898 9.61402 12.438C8.33567 12.8278 7.07397 13.2443 5.81918 13.688C5.12493 13.9336 4.76118 14.6954 5.0067 15.3896C5.25223 16.0839 6.01406 16.4476 6.7083 16.2021C7.7931 15.8185 8.88482 15.4388 9.98927 15.0659C10.5222 18.3344 11.3344 21.9428 12.2703 25.4156C12.4336 26.0218 12.6062 26.6262 12.7863 27.2263C9.34168 28.4135 5.82612 29.3782 2.61128 29.8879C0.949407 26.9716 0 23.5967 0 20C0 8.97534 8.92023 0.0341108 19.9369 9.73661e-05ZM4.19152 32.2527C7.45069 36.4516 12.3458 39.3173 17.9204 39.8932C16.5916 37.455 14.9338 33.717 13.5405 29.5901C10.4404 30.7762 7.25883 31.6027 4.19152 32.2527ZM22.9735 23.1135C22.1479 20.41 21.4462 17.5441 20.9225 14.277C20.746 13.5841 20.5918 12.8035 20.4593 11.9636C17.6508 12.6606 14.9992 13.4372 12.4356 14.2598C12.8479 17.4766 13.5448 21.1334 14.5118 24.7218C14.662 25.2792 14.8081 25.8248 14.9514 26.3594L14.9516 26.3603L14.9524 26.3634L14.9526 26.3639L14.973 26.4401C16.1833 25.9872 17.3746 25.5123 18.53 25.0259C20.1235 24.3552 21.6051 23.7165 22.9735 23.1135Z" fill="#141519"/>
</svg>

[New image file (SVG above): 2.2 KiB]

View File

@ -0,0 +1,47 @@
from dashscope.common.error import (
AuthenticationError,
InvalidParameter,
RequestFailure,
ServiceUnavailableError,
UnsupportedHTTPMethod,
UnsupportedModel,
)
from core.model_runtime.errors.invoke import (
InvokeAuthorizationError,
InvokeBadRequestError,
InvokeConnectionError,
InvokeError,
InvokeRateLimitError,
InvokeServerUnavailableError,
)
class _CommonGiteeAI:
@property
def _invoke_error_mapping(self) -> dict[type[InvokeError], list[type[Exception]]]:
"""
Map model invoke error to unified error
The key is the error type thrown to the caller
The value is the error type thrown by the model,
which needs to be converted into a unified error type for the caller.
:return: Invoke error mapping
"""
return {
InvokeConnectionError: [
RequestFailure,
],
InvokeServerUnavailableError: [
ServiceUnavailableError,
],
InvokeRateLimitError: [],
InvokeAuthorizationError: [
AuthenticationError,
],
InvokeBadRequestError: [
InvalidParameter,
UnsupportedModel,
UnsupportedHTTPMethod,
],
}
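The model-runtime base classes consume this mapping to translate raw SDK exceptions into Dify's unified invoke errors; a hypothetical illustration of that lookup (the `translate` helper below is mine, not part of the commit):

```python
# Hypothetical helper showing how the mapping above can be applied.
def translate(err: Exception, mapping: dict[type, list[type]]) -> type:
    for unified, raw_types in mapping.items():
        if any(isinstance(err, raw) for raw in raw_types):
            return unified
    return InvokeError  # fallback for unmapped exceptions

# e.g. a dashscope AuthenticationError would resolve to InvokeAuthorizationError.
```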

View File

@ -0,0 +1,25 @@
import logging
from core.model_runtime.entities.model_entities import ModelType
from core.model_runtime.errors.validate import CredentialsValidateFailedError
from core.model_runtime.model_providers.__base.model_provider import ModelProvider
logger = logging.getLogger(__name__)
class GiteeAIProvider(ModelProvider):
def validate_provider_credentials(self, credentials: dict) -> None:
"""
Validate provider credentials
if validate failed, raise exception
:param credentials: provider credentials, credentials form defined in `provider_credential_schema`.
"""
try:
model_instance = self.get_model_instance(ModelType.LLM)
model_instance.validate_credentials(model="Qwen2-7B-Instruct", credentials=credentials)
except CredentialsValidateFailedError as ex:
raise ex
except Exception as ex:
logger.exception(f"{self.get_provider_schema().provider} credentials validate failed")
raise ex

View File

@ -0,0 +1,35 @@
provider: gitee_ai
label:
en_US: Gitee AI
zh_Hans: Gitee AI
description:
en_US: Quickly experience large models and explore the open-source AI world ahead of the curve
zh_Hans: 快速体验大模型,领先探索 AI 开源世界
icon_small:
en_US: Gitee-AI-Logo.svg
icon_large:
en_US: Gitee-AI-Logo-full.svg
help:
title:
en_US: Get your token from Gitee AI
zh_Hans: 从 Gitee AI 获取 token
url:
en_US: https://ai.gitee.com/dashboard/settings/tokens
supported_model_types:
- llm
- text-embedding
- rerank
- speech2text
- tts
configurate_methods:
- predefined-model
provider_credential_schema:
credential_form_schemas:
- variable: api_key
label:
en_US: API Key
type: secret-input
required: true
placeholder:
zh_Hans: 在此输入您的 API Key
en_US: Enter your API Key

View File

@ -0,0 +1,105 @@
model: Qwen2-72B-Instruct
label:
zh_Hans: Qwen2-72B-Instruct
en_US: Qwen2-72B-Instruct
model_type: llm
features:
- agent-thought
model_properties:
mode: chat
context_size: 6400
parameter_rules:
- name: stream
use_template: boolean
label:
en_US: "Stream"
zh_Hans: "流式"
type: boolean
default: true
required: true
help:
en_US: "Whether to return the results in batches through streaming. If set to true, the generated text will be pushed to the user in real time during the generation process."
zh_Hans: "是否通过流式分批返回结果。如果设置为 true生成过程中实时地向用户推送每一部分生成的文本。"
- name: max_tokens
use_template: max_tokens
label:
en_US: "Max Tokens"
zh_Hans: "最大Token数"
type: int
default: 512
min: 1
required: true
help:
en_US: "The maximum number of tokens that can be generated by the model varies depending on the model."
zh_Hans: "模型可生成的最大 token 个数,不同模型上限不同。"
- name: temperature
use_template: temperature
label:
en_US: "Temperature"
zh_Hans: "采样温度"
type: float
default: 0.7
min: 0.0
max: 1.0
precision: 1
required: true
help:
en_US: "The randomness of the sampling temperature control output. The temperature value is within the range of [0.0, 1.0]. The higher the value, the more random and creative the output; the lower the value, the more stable it is. It is recommended to adjust either top_p or temperature parameters according to your needs to avoid adjusting both at the same time."
zh_Hans: "采样温度控制输出的随机性。温度值在 [0.0, 1.0] 范围内,值越高,输出越随机和创造性;值越低,输出越稳定。建议根据需求调整 top_p 或 temperature 参数,避免同时调整两者。"
- name: top_p
use_template: top_p
label:
en_US: "Top P"
zh_Hans: "Top P"
type: float
default: 0.7
min: 0.0
max: 1.0
precision: 1
required: true
help:
en_US: "The value range of the sampling method is [0.0, 1.0]. The top_p value determines that the model selects tokens from the top p% of candidate words with the highest probability; when top_p is 0, this parameter is invalid. It is recommended to adjust either top_p or temperature parameters according to your needs to avoid adjusting both at the same time."
zh_Hans: "采样方法的取值范围为 [0.0,1.0]。top_p 值确定模型从概率最高的前p%的候选词中选取 tokens当 top_p 为 0 时,此参数无效。建议根据需求调整 top_p 或 temperature 参数,避免同时调整两者。"
- name: top_k
use_template: top_k
label:
en_US: "Top K"
zh_Hans: "Top K"
type: int
default: 50
min: 0
max: 100
required: true
help:
en_US: "The value range is [0,100], which limits the model to only select from the top k words with the highest probability when choosing the next word at each step. The larger the value, the more diverse text generation will be."
zh_Hans: "取值范围为 [0,100],限制模型在每一步选择下一个词时,只从概率最高的前 k 个词中选取。数值越大,文本生成越多样。"
- name: frequency_penalty
use_template: frequency_penalty
label:
en_US: "Frequency Penalty"
zh_Hans: "频率惩罚"
type: float
default: 0
min: -1.0
max: 1.0
precision: 1
required: false
help:
en_US: "Used to adjust the frequency of repeated content in automatically generated text. Positive numbers reduce repetition, while negative numbers increase repetition. After setting this parameter, if a word has already appeared in the text, the model will decrease the probability of choosing that word for subsequent generation."
zh_Hans: "用于调整自动生成文本中重复内容的频率。正数减少重复,负数增加重复。设置此参数后,如果一个词在文本中已经出现过,模型在后续生成中选择该词的概率会降低。"
- name: user
use_template: text
label:
en_US: "User"
zh_Hans: "用户"
type: string
required: false
help:
en_US: "Used to track and differentiate conversation requests from different users."
zh_Hans: "用于追踪和区分不同用户的对话请求。"

View File

@ -0,0 +1,105 @@
model: Qwen2-7B-Instruct
label:
zh_Hans: Qwen2-7B-Instruct
en_US: Qwen2-7B-Instruct
model_type: llm
features:
- agent-thought
model_properties:
mode: chat
context_size: 32768
parameter_rules:
- name: stream
use_template: boolean
label:
en_US: "Stream"
zh_Hans: "流式"
type: boolean
default: true
required: true
help:
en_US: "Whether to return the results in batches through streaming. If set to true, the generated text will be pushed to the user in real time during the generation process."
zh_Hans: "是否通过流式分批返回结果。如果设置为 true生成过程中实时地向用户推送每一部分生成的文本。"
- name: max_tokens
use_template: max_tokens
label:
en_US: "Max Tokens"
zh_Hans: "最大Token数"
type: int
default: 512
min: 1
required: true
help:
en_US: "The maximum number of tokens that can be generated by the model varies depending on the model."
zh_Hans: "模型可生成的最大 token 个数,不同模型上限不同。"
- name: temperature
use_template: temperature
label:
en_US: "Temperature"
zh_Hans: "采样温度"
type: float
default: 0.7
min: 0.0
max: 1.0
precision: 1
required: true
help:
en_US: "The randomness of the sampling temperature control output. The temperature value is within the range of [0.0, 1.0]. The higher the value, the more random and creative the output; the lower the value, the more stable it is. It is recommended to adjust either top_p or temperature parameters according to your needs to avoid adjusting both at the same time."
zh_Hans: "采样温度控制输出的随机性。温度值在 [0.0, 1.0] 范围内,值越高,输出越随机和创造性;值越低,输出越稳定。建议根据需求调整 top_p 或 temperature 参数,避免同时调整两者。"
- name: top_p
use_template: top_p
label:
en_US: "Top P"
zh_Hans: "Top P"
type: float
default: 0.7
min: 0.0
max: 1.0
precision: 1
required: true
help:
en_US: "The value range of the sampling method is [0.0, 1.0]. The top_p value determines that the model selects tokens from the top p% of candidate words with the highest probability; when top_p is 0, this parameter is invalid. It is recommended to adjust either top_p or temperature parameters according to your needs to avoid adjusting both at the same time."
zh_Hans: "采样方法的取值范围为 [0.0,1.0]。top_p 值确定模型从概率最高的前p%的候选词中选取 tokens当 top_p 为 0 时,此参数无效。建议根据需求调整 top_p 或 temperature 参数,避免同时调整两者。"
- name: top_k
use_template: top_k
label:
en_US: "Top K"
zh_Hans: "Top K"
type: int
default: 50
min: 0
max: 100
required: true
help:
en_US: "The value range is [0,100], which limits the model to only select from the top k words with the highest probability when choosing the next word at each step. The larger the value, the more diverse text generation will be."
zh_Hans: "取值范围为 [0,100],限制模型在每一步选择下一个词时,只从概率最高的前 k 个词中选取。数值越大,文本生成越多样。"
- name: frequency_penalty
use_template: frequency_penalty
label:
en_US: "Frequency Penalty"
zh_Hans: "频率惩罚"
type: float
default: 0
min: -1.0
max: 1.0
precision: 1
required: false
help:
en_US: "Used to adjust the frequency of repeated content in automatically generated text. Positive numbers reduce repetition, while negative numbers increase repetition. After setting this parameter, if a word has already appeared in the text, the model will decrease the probability of choosing that word for subsequent generation."
zh_Hans: "用于调整自动生成文本中重复内容的频率。正数减少重复,负数增加重复。设置此参数后,如果一个词在文本中已经出现过,模型在后续生成中选择该词的概率会降低。"
- name: user
use_template: text
label:
en_US: "User"
zh_Hans: "用户"
type: string
required: false
help:
en_US: "Used to track and differentiate conversation requests from different users."
zh_Hans: "用于追踪和区分不同用户的对话请求。"

View File

@ -0,0 +1,105 @@
model: Yi-1.5-34B-Chat
label:
zh_Hans: Yi-1.5-34B-Chat
en_US: Yi-1.5-34B-Chat
model_type: llm
features:
- agent-thought
model_properties:
mode: chat
context_size: 4096
parameter_rules:
- name: stream
use_template: boolean
label:
en_US: "Stream"
zh_Hans: "流式"
type: boolean
default: true
required: true
help:
en_US: "Whether to return the results in batches through streaming. If set to true, the generated text will be pushed to the user in real time during the generation process."
zh_Hans: "是否通过流式分批返回结果。如果设置为 true生成过程中实时地向用户推送每一部分生成的文本。"
- name: max_tokens
use_template: max_tokens
label:
en_US: "Max Tokens"
zh_Hans: "最大Token数"
type: int
default: 512
min: 1
required: true
help:
en_US: "The maximum number of tokens that can be generated by the model varies depending on the model."
zh_Hans: "模型可生成的最大 token 个数,不同模型上限不同。"
- name: temperature
use_template: temperature
label:
en_US: "Temperature"
zh_Hans: "采样温度"
type: float
default: 0.7
min: 0.0
max: 1.0
precision: 1
required: true
help:
en_US: "The randomness of the sampling temperature control output. The temperature value is within the range of [0.0, 1.0]. The higher the value, the more random and creative the output; the lower the value, the more stable it is. It is recommended to adjust either top_p or temperature parameters according to your needs to avoid adjusting both at the same time."
zh_Hans: "采样温度控制输出的随机性。温度值在 [0.0, 1.0] 范围内,值越高,输出越随机和创造性;值越低,输出越稳定。建议根据需求调整 top_p 或 temperature 参数,避免同时调整两者。"
- name: top_p
use_template: top_p
label:
en_US: "Top P"
zh_Hans: "Top P"
type: float
default: 0.7
min: 0.0
max: 1.0
precision: 1
required: true
help:
en_US: "The value range of the sampling method is [0.0, 1.0]. The top_p value determines that the model selects tokens from the top p% of candidate words with the highest probability; when top_p is 0, this parameter is invalid. It is recommended to adjust either top_p or temperature parameters according to your needs to avoid adjusting both at the same time."
zh_Hans: "采样方法的取值范围为 [0.0,1.0]。top_p 值确定模型从概率最高的前p%的候选词中选取 tokens当 top_p 为 0 时,此参数无效。建议根据需求调整 top_p 或 temperature 参数,避免同时调整两者。"
- name: top_k
use_template: top_k
label:
en_US: "Top K"
zh_Hans: "Top K"
type: int
default: 50
min: 0
max: 100
required: true
help:
en_US: "The value range is [0,100], which limits the model to only select from the top k words with the highest probability when choosing the next word at each step. The larger the value, the more diverse text generation will be."
zh_Hans: "取值范围为 [0,100],限制模型在每一步选择下一个词时,只从概率最高的前 k 个词中选取。数值越大,文本生成越多样。"
- name: frequency_penalty
use_template: frequency_penalty
label:
en_US: "Frequency Penalty"
zh_Hans: "频率惩罚"
type: float
default: 0
min: -1.0
max: 1.0
precision: 1
required: false
help:
en_US: "Used to adjust the frequency of repeated content in automatically generated text. Positive numbers reduce repetition, while negative numbers increase repetition. After setting this parameter, if a word has already appeared in the text, the model will decrease the probability of choosing that word for subsequent generation."
zh_Hans: "用于调整自动生成文本中重复内容的频率。正数减少重复,负数增加重复。设置此参数后,如果一个词在文本中已经出现过,模型在后续生成中选择该词的概率会降低。"
- name: user
use_template: text
label:
en_US: "User"
zh_Hans: "用户"
type: string
required: false
help:
en_US: "Used to track and differentiate conversation requests from different users."
zh_Hans: "用于追踪和区分不同用户的对话请求。"

View File

@ -0,0 +1,7 @@
- Qwen2-7B-Instruct
- Qwen2-72B-Instruct
- Yi-1.5-34B-Chat
- glm-4-9b-chat
- deepseek-coder-33B-instruct-chat
- deepseek-coder-33B-instruct-completions
- codegeex4-all-9b

View File

@ -0,0 +1,105 @@
model: codegeex4-all-9b
label:
zh_Hans: codegeex4-all-9b
en_US: codegeex4-all-9b
model_type: llm
features:
- agent-thought
model_properties:
mode: chat
context_size: 40960
parameter_rules:
- name: stream
use_template: boolean
label:
en_US: "Stream"
zh_Hans: "流式"
type: boolean
default: true
required: true
help:
en_US: "Whether to return the results in batches through streaming. If set to true, the generated text will be pushed to the user in real time during the generation process."
zh_Hans: "是否通过流式分批返回结果。如果设置为 true生成过程中实时地向用户推送每一部分生成的文本。"
- name: max_tokens
use_template: max_tokens
label:
en_US: "Max Tokens"
zh_Hans: "最大Token数"
type: int
default: 512
min: 1
required: true
help:
en_US: "The maximum number of tokens that can be generated by the model varies depending on the model."
zh_Hans: "模型可生成的最大 token 个数,不同模型上限不同。"
- name: temperature
use_template: temperature
label:
en_US: "Temperature"
zh_Hans: "采样温度"
type: float
default: 0.7
min: 0.0
max: 1.0
precision: 1
required: true
help:
en_US: "The randomness of the sampling temperature control output. The temperature value is within the range of [0.0, 1.0]. The higher the value, the more random and creative the output; the lower the value, the more stable it is. It is recommended to adjust either top_p or temperature parameters according to your needs to avoid adjusting both at the same time."
zh_Hans: "采样温度控制输出的随机性。温度值在 [0.0, 1.0] 范围内,值越高,输出越随机和创造性;值越低,输出越稳定。建议根据需求调整 top_p 或 temperature 参数,避免同时调整两者。"
- name: top_p
use_template: top_p
label:
en_US: "Top P"
zh_Hans: "Top P"
type: float
default: 0.7
min: 0.0
max: 1.0
precision: 1
required: true
help:
en_US: "The value range of the sampling method is [0.0, 1.0]. The top_p value determines that the model selects tokens from the top p% of candidate words with the highest probability; when top_p is 0, this parameter is invalid. It is recommended to adjust either top_p or temperature parameters according to your needs to avoid adjusting both at the same time."
zh_Hans: "采样方法的取值范围为 [0.0,1.0]。top_p 值确定模型从概率最高的前p%的候选词中选取 tokens当 top_p 为 0 时,此参数无效。建议根据需求调整 top_p 或 temperature 参数,避免同时调整两者。"
- name: top_k
use_template: top_k
label:
en_US: "Top K"
zh_Hans: "Top K"
type: int
default: 50
min: 0
max: 100
required: true
help:
en_US: "The value range is [0,100], which limits the model to only select from the top k words with the highest probability when choosing the next word at each step. The larger the value, the more diverse text generation will be."
zh_Hans: "取值范围为 [0,100],限制模型在每一步选择下一个词时,只从概率最高的前 k 个词中选取。数值越大,文本生成越多样。"
- name: frequency_penalty
use_template: frequency_penalty
label:
en_US: "Frequency Penalty"
zh_Hans: "频率惩罚"
type: float
default: 0
min: -1.0
max: 1.0
precision: 1
required: false
help:
en_US: "Used to adjust the frequency of repeated content in automatically generated text. Positive numbers reduce repetition, while negative numbers increase repetition. After setting this parameter, if a word has already appeared in the text, the model will decrease the probability of choosing that word for subsequent generation."
zh_Hans: "用于调整自动生成文本中重复内容的频率。正数减少重复,负数增加重复。设置此参数后,如果一个词在文本中已经出现过,模型在后续生成中选择该词的概率会降低。"
- name: user
use_template: text
label:
en_US: "User"
zh_Hans: "用户"
type: string
required: false
help:
en_US: "Used to track and differentiate conversation requests from different users."
zh_Hans: "用于追踪和区分不同用户的对话请求。"

View File

@ -0,0 +1,105 @@
model: deepseek-coder-33B-instruct-chat
label:
zh_Hans: deepseek-coder-33B-instruct-chat
en_US: deepseek-coder-33B-instruct-chat
model_type: llm
features:
- agent-thought
model_properties:
mode: chat
context_size: 9000
parameter_rules:
- name: stream
use_template: boolean
label:
en_US: "Stream"
zh_Hans: "流式"
type: boolean
default: true
required: true
help:
en_US: "Whether to return the results in batches through streaming. If set to true, the generated text will be pushed to the user in real time during the generation process."
zh_Hans: "是否通过流式分批返回结果。如果设置为 true生成过程中实时地向用户推送每一部分生成的文本。"
- name: max_tokens
use_template: max_tokens
label:
en_US: "Max Tokens"
zh_Hans: "最大Token数"
type: int
default: 512
min: 1
required: true
help:
en_US: "The maximum number of tokens that can be generated by the model varies depending on the model."
zh_Hans: "模型可生成的最大 token 个数,不同模型上限不同。"
- name: temperature
use_template: temperature
label:
en_US: "Temperature"
zh_Hans: "采样温度"
type: float
default: 0.7
min: 0.0
max: 1.0
precision: 1
required: true
help:
en_US: "The randomness of the sampling temperature control output. The temperature value is within the range of [0.0, 1.0]. The higher the value, the more random and creative the output; the lower the value, the more stable it is. It is recommended to adjust either top_p or temperature parameters according to your needs to avoid adjusting both at the same time."
zh_Hans: "采样温度控制输出的随机性。温度值在 [0.0, 1.0] 范围内,值越高,输出越随机和创造性;值越低,输出越稳定。建议根据需求调整 top_p 或 temperature 参数,避免同时调整两者。"
- name: top_p
use_template: top_p
label:
en_US: "Top P"
zh_Hans: "Top P"
type: float
default: 0.7
min: 0.0
max: 1.0
precision: 1
required: true
help:
en_US: "The value range of the sampling method is [0.0, 1.0]. The top_p value determines that the model selects tokens from the top p% of candidate words with the highest probability; when top_p is 0, this parameter is invalid. It is recommended to adjust either top_p or temperature parameters according to your needs to avoid adjusting both at the same time."
zh_Hans: "采样方法的取值范围为 [0.0,1.0]。top_p 值确定模型从概率最高的前p%的候选词中选取 tokens当 top_p 为 0 时,此参数无效。建议根据需求调整 top_p 或 temperature 参数,避免同时调整两者。"
- name: top_k
use_template: top_k
label:
en_US: "Top K"
zh_Hans: "Top K"
type: int
default: 50
min: 0
max: 100
required: true
help:
en_US: "The value range is [0,100], which limits the model to only select from the top k words with the highest probability when choosing the next word at each step. The larger the value, the more diverse text generation will be."
zh_Hans: "取值范围为 [0,100],限制模型在每一步选择下一个词时,只从概率最高的前 k 个词中选取。数值越大,文本生成越多样。"
- name: frequency_penalty
use_template: frequency_penalty
label:
en_US: "Frequency Penalty"
zh_Hans: "频率惩罚"
type: float
default: 0
min: -1.0
max: 1.0
precision: 1
required: false
help:
en_US: "Used to adjust the frequency of repeated content in automatically generated text. Positive numbers reduce repetition, while negative numbers increase repetition. After setting this parameter, if a word has already appeared in the text, the model will decrease the probability of choosing that word for subsequent generation."
zh_Hans: "用于调整自动生成文本中重复内容的频率。正数减少重复,负数增加重复。设置此参数后,如果一个词在文本中已经出现过,模型在后续生成中选择该词的概率会降低。"
- name: user
use_template: text
label:
en_US: "User"
zh_Hans: "用户"
type: string
required: false
help:
en_US: "Used to track and differentiate conversation requests from different users."
zh_Hans: "用于追踪和区分不同用户的对话请求。"

View File

@ -0,0 +1,91 @@
model: deepseek-coder-33B-instruct-completions
label:
zh_Hans: deepseek-coder-33B-instruct-completions
en_US: deepseek-coder-33B-instruct-completions
model_type: llm
features:
- agent-thought
model_properties:
mode: completion
context_size: 9000
parameter_rules:
- name: stream
use_template: boolean
label:
en_US: "Stream"
zh_Hans: "流式"
type: boolean
default: true
required: true
help:
en_US: "Whether to return the results in batches through streaming. If set to true, the generated text will be pushed to the user in real time during the generation process."
zh_Hans: "是否通过流式分批返回结果。如果设置为 true生成过程中实时地向用户推送每一部分生成的文本。"
- name: max_tokens
use_template: max_tokens
label:
en_US: "Max Tokens"
zh_Hans: "最大Token数"
type: int
default: 512
min: 1
required: true
help:
en_US: "The maximum number of tokens that can be generated by the model varies depending on the model."
zh_Hans: "模型可生成的最大 token 个数,不同模型上限不同。"
- name: temperature
use_template: temperature
label:
en_US: "Temperature"
zh_Hans: "采样温度"
type: float
default: 0.7
min: 0.0
max: 1.0
precision: 1
required: true
help:
en_US: "The randomness of the sampling temperature control output. The temperature value is within the range of [0.0, 1.0]. The higher the value, the more random and creative the output; the lower the value, the more stable it is. It is recommended to adjust either top_p or temperature parameters according to your needs to avoid adjusting both at the same time."
zh_Hans: "采样温度控制输出的随机性。温度值在 [0.0, 1.0] 范围内,值越高,输出越随机和创造性;值越低,输出越稳定。建议根据需求调整 top_p 或 temperature 参数,避免同时调整两者。"
- name: top_p
use_template: top_p
label:
en_US: "Top P"
zh_Hans: "Top P"
type: float
default: 0.7
min: 0.0
max: 1.0
precision: 1
required: true
help:
en_US: "The value range of the sampling method is [0.0, 1.0]. The top_p value determines that the model selects tokens from the top p% of candidate words with the highest probability; when top_p is 0, this parameter is invalid. It is recommended to adjust either top_p or temperature parameters according to your needs to avoid adjusting both at the same time."
zh_Hans: "采样方法的取值范围为 [0.0,1.0]。top_p 值确定模型从概率最高的前p%的候选词中选取 tokens当 top_p 为 0 时,此参数无效。建议根据需求调整 top_p 或 temperature 参数,避免同时调整两者。"
- name: frequency_penalty
use_template: frequency_penalty
label:
en_US: "Frequency Penalty"
zh_Hans: "频率惩罚"
type: float
default: 0
min: -1.0
max: 1.0
precision: 1
required: false
help:
en_US: "Used to adjust the frequency of repeated content in automatically generated text. Positive numbers reduce repetition, while negative numbers increase repetition. After setting this parameter, if a word has already appeared in the text, the model will decrease the probability of choosing that word for subsequent generation."
zh_Hans: "用于调整自动生成文本中重复内容的频率。正数减少重复,负数增加重复。设置此参数后,如果一个词在文本中已经出现过,模型在后续生成中选择该词的概率会降低。"
- name: user
use_template: text
label:
en_US: "User"
zh_Hans: "用户"
type: string
required: false
help:
en_US: "Used to track and differentiate conversation requests from different users."
zh_Hans: "用于追踪和区分不同用户的对话请求。"

View File

@ -0,0 +1,105 @@
model: glm-4-9b-chat
label:
zh_Hans: glm-4-9b-chat
en_US: glm-4-9b-chat
model_type: llm
features:
- agent-thought
model_properties:
mode: chat
context_size: 32768
parameter_rules:
- name: stream
use_template: boolean
label:
en_US: "Stream"
zh_Hans: "流式"
type: boolean
default: true
required: true
help:
en_US: "Whether to return the results in batches through streaming. If set to true, the generated text will be pushed to the user in real time during the generation process."
zh_Hans: "是否通过流式分批返回结果。如果设置为 true生成过程中实时地向用户推送每一部分生成的文本。"
- name: max_tokens
use_template: max_tokens
label:
en_US: "Max Tokens"
zh_Hans: "最大Token数"
type: int
default: 512
min: 1
required: true
help:
en_US: "The maximum number of tokens that can be generated by the model varies depending on the model."
zh_Hans: "模型可生成的最大 token 个数,不同模型上限不同。"
- name: temperature
use_template: temperature
label:
en_US: "Temperature"
zh_Hans: "采样温度"
type: float
default: 0.7
min: 0.0
max: 1.0
precision: 1
required: true
help:
en_US: "The randomness of the sampling temperature control output. The temperature value is within the range of [0.0, 1.0]. The higher the value, the more random and creative the output; the lower the value, the more stable it is. It is recommended to adjust either top_p or temperature parameters according to your needs to avoid adjusting both at the same time."
zh_Hans: "采样温度控制输出的随机性。温度值在 [0.0, 1.0] 范围内,值越高,输出越随机和创造性;值越低,输出越稳定。建议根据需求调整 top_p 或 temperature 参数,避免同时调整两者。"
- name: top_p
use_template: top_p
label:
en_US: "Top P"
zh_Hans: "Top P"
type: float
default: 0.7
min: 0.0
max: 1.0
precision: 1
required: true
help:
en_US: "The value range of the sampling method is [0.0, 1.0]. The top_p value determines that the model selects tokens from the top p% of candidate words with the highest probability; when top_p is 0, this parameter is invalid. It is recommended to adjust either top_p or temperature parameters according to your needs to avoid adjusting both at the same time."
zh_Hans: "采样方法的取值范围为 [0.0,1.0]。top_p 值确定模型从概率最高的前p%的候选词中选取 tokens当 top_p 为 0 时,此参数无效。建议根据需求调整 top_p 或 temperature 参数,避免同时调整两者。"
- name: top_k
use_template: top_k
label:
en_US: "Top K"
zh_Hans: "Top K"
type: int
default: 50
min: 0
max: 100
required: true
help:
en_US: "The value range is [0,100], which limits the model to only select from the top k words with the highest probability when choosing the next word at each step. The larger the value, the more diverse text generation will be."
zh_Hans: "取值范围为 [0,100],限制模型在每一步选择下一个词时,只从概率最高的前 k 个词中选取。数值越大,文本生成越多样。"
- name: frequency_penalty
use_template: frequency_penalty
label:
en_US: "Frequency Penalty"
zh_Hans: "频率惩罚"
type: float
default: 0
min: -1.0
max: 1.0
precision: 1
required: false
help:
en_US: "Used to adjust the frequency of repeated content in automatically generated text. Positive numbers reduce repetition, while negative numbers increase repetition. After setting this parameter, if a word has already appeared in the text, the model will decrease the probability of choosing that word for subsequent generation."
zh_Hans: "用于调整自动生成文本中重复内容的频率。正数减少重复,负数增加重复。设置此参数后,如果一个词在文本中已经出现过,模型在后续生成中选择该词的概率会降低。"
- name: user
use_template: text
label:
en_US: "User"
zh_Hans: "用户"
type: string
required: false
help:
en_US: "Used to track and differentiate conversation requests from different users."
zh_Hans: "用于追踪和区分不同用户的对话请求。"

View File

@ -0,0 +1,47 @@
from collections.abc import Generator
from typing import Optional, Union
from core.model_runtime.entities.llm_entities import LLMMode, LLMResult
from core.model_runtime.entities.message_entities import (
PromptMessage,
PromptMessageTool,
)
from core.model_runtime.model_providers.openai_api_compatible.llm.llm import OAIAPICompatLargeLanguageModel
class GiteeAILargeLanguageModel(OAIAPICompatLargeLanguageModel):
MODEL_TO_IDENTITY: dict[str, str] = {
"Yi-1.5-34B-Chat": "Yi-34B-Chat",
"deepseek-coder-33B-instruct-completions": "deepseek-coder-33B-instruct",
"deepseek-coder-33B-instruct-chat": "deepseek-coder-33B-instruct",
}
def _invoke(
self,
model: str,
credentials: dict,
prompt_messages: list[PromptMessage],
model_parameters: dict,
tools: Optional[list[PromptMessageTool]] = None,
stop: Optional[list[str]] = None,
stream: bool = True,
user: Optional[str] = None,
) -> Union[LLMResult, Generator]:
self._add_custom_parameters(credentials, model, model_parameters)
return super()._invoke(model, credentials, prompt_messages, model_parameters, tools, stop, stream)
def validate_credentials(self, model: str, credentials: dict) -> None:
self._add_custom_parameters(credentials, model, None)
super().validate_credentials(model, credentials)
@staticmethod
def _add_custom_parameters(credentials: dict, model: str, model_parameters: dict) -> None:
if model is None:
model = "bge-large-zh-v1.5"
model_identity = GiteeAILargeLanguageModel.MODEL_TO_IDENTITY.get(model, model)
credentials["endpoint_url"] = f"https://ai.gitee.com/api/serverless/{model_identity}/"
if model.endswith("completions"):
credentials["mode"] = LLMMode.COMPLETION.value
else:
credentials["mode"] = LLMMode.CHAT.value

View File

@ -0,0 +1 @@
- bge-reranker-v2-m3

View File

@ -0,0 +1,4 @@
model: bge-reranker-v2-m3
model_type: rerank
model_properties:
context_size: 1024

View File

@ -0,0 +1,128 @@
from typing import Optional
import httpx
from core.model_runtime.entities.common_entities import I18nObject
from core.model_runtime.entities.model_entities import AIModelEntity, FetchFrom, ModelPropertyKey, ModelType
from core.model_runtime.entities.rerank_entities import RerankDocument, RerankResult
from core.model_runtime.errors.invoke import (
InvokeAuthorizationError,
InvokeBadRequestError,
InvokeConnectionError,
InvokeError,
InvokeRateLimitError,
InvokeServerUnavailableError,
)
from core.model_runtime.errors.validate import CredentialsValidateFailedError
from core.model_runtime.model_providers.__base.rerank_model import RerankModel
class GiteeAIRerankModel(RerankModel):
"""
Model class for rerank model.
"""
def _invoke(
self,
model: str,
credentials: dict,
query: str,
docs: list[str],
score_threshold: Optional[float] = None,
top_n: Optional[int] = None,
user: Optional[str] = None,
) -> RerankResult:
"""
Invoke rerank model
:param model: model name
:param credentials: model credentials
:param query: search query
:param docs: docs for reranking
:param score_threshold: score threshold
:param top_n: top n documents to return
:param user: unique user id
:return: rerank result
"""
if len(docs) == 0:
return RerankResult(model=model, docs=[])
base_url = credentials.get("base_url", "https://ai.gitee.com/api/serverless")
base_url = base_url.removesuffix("/")
try:
body = {"model": model, "query": query, "documents": docs}
if top_n is not None:
body["top_n"] = top_n
response = httpx.post(
f"{base_url}/{model}/rerank",
json=body,
headers={"Authorization": f"Bearer {credentials.get('api_key')}"},
)
response.raise_for_status()
results = response.json()
rerank_documents = []
for result in results["results"]:
rerank_document = RerankDocument(
index=result["index"],
text=result["document"]["text"],
score=result["relevance_score"],
)
if score_threshold is None or result["relevance_score"] >= score_threshold:
rerank_documents.append(rerank_document)
return RerankResult(model=model, docs=rerank_documents)
except httpx.HTTPStatusError as e:
raise InvokeServerUnavailableError(str(e))
def validate_credentials(self, model: str, credentials: dict) -> None:
"""
Validate model credentials
:param model: model name
:param credentials: model credentials
:return:
"""
try:
self._invoke(
model=model,
credentials=credentials,
query="What is the capital of the United States?",
docs=[
"Carson City is the capital city of the American state of Nevada. At the 2010 United States "
"Census, Carson City had a population of 55,274.",
"The Commonwealth of the Northern Mariana Islands is a group of islands in the Pacific Ocean that "
"are a political division controlled by the United States. Its capital is Saipan.",
],
score_threshold=0.01,
)
except Exception as ex:
raise CredentialsValidateFailedError(str(ex))
@property
def _invoke_error_mapping(self) -> dict[type[InvokeError], list[type[Exception]]]:
"""
Map model invoke error to unified error
"""
return {
InvokeConnectionError: [httpx.ConnectError],
InvokeServerUnavailableError: [httpx.RemoteProtocolError],
InvokeRateLimitError: [],
InvokeAuthorizationError: [httpx.HTTPStatusError],
InvokeBadRequestError: [httpx.RequestError],
}
def get_customizable_model_schema(self, model: str, credentials: dict) -> AIModelEntity:
"""
generate custom model entities from credentials
"""
entity = AIModelEntity(
model=model,
label=I18nObject(en_US=model),
model_type=ModelType.RERANK,
fetch_from=FetchFrom.CUSTOMIZABLE_MODEL,
model_properties={ModelPropertyKey.CONTEXT_SIZE: int(credentials.get("context_size"))},
)
return entity
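
For reference, the request/response shape implied by _invoke above can be exercised directly; the endpoint and field names are taken from this file, while the API key and documents are placeholders:

import httpx

resp = httpx.post(
    "https://ai.gitee.com/api/serverless/bge-reranker-v2-m3/rerank",
    json={
        "model": "bge-reranker-v2-m3",
        "query": "What is the capital of Nevada?",
        "documents": ["Carson City is the capital of Nevada.", "Saipan is the capital of the CNMI."],
        "top_n": 2,
    },
    headers={"Authorization": "Bearer <api_key>"},  # placeholder
)
for item in resp.json()["results"]:
    print(item["index"], item["relevance_score"], item["document"]["text"])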

View File

@@ -0,0 +1,2 @@
- whisper-base
- whisper-large

View File

@@ -0,0 +1,53 @@
import os
from typing import IO, Optional
import requests
from core.model_runtime.errors.invoke import InvokeBadRequestError
from core.model_runtime.errors.validate import CredentialsValidateFailedError
from core.model_runtime.model_providers.__base.speech2text_model import Speech2TextModel
from core.model_runtime.model_providers.gitee_ai._common import _CommonGiteeAI
class GiteeAISpeech2TextModel(_CommonGiteeAI, Speech2TextModel):
"""
    Model class for the GiteeAI (OpenAI-compatible) speech-to-text model.
"""
def _invoke(self, model: str, credentials: dict, file: IO[bytes], user: Optional[str] = None) -> str:
"""
Invoke speech2text model
:param model: model name
:param credentials: model credentials
:param file: audio file
:param user: unique user id
:return: text for given audio file
"""
# doc: https://ai.gitee.com/docs/openapi/serverless#tag/serverless/POST/{service}/speech-to-text
endpoint_url = f"https://ai.gitee.com/api/serverless/{model}/speech-to-text"
files = [("file", file)]
        _, file_ext = os.path.splitext(file.name)
        # splitext keeps the leading dot; strip it so the header is e.g. "audio/mp3", not "audio/.mp3"
        file_ext = file_ext.lstrip(".")
        headers = {"Content-Type": f"audio/{file_ext}", "Authorization": f"Bearer {credentials.get('api_key')}"}
response = requests.post(endpoint_url, headers=headers, files=files)
if response.status_code != 200:
raise InvokeBadRequestError(response.text)
response_data = response.json()
return response_data["text"]
def validate_credentials(self, model: str, credentials: dict) -> None:
"""
Validate model credentials
:param model: model name
:param credentials: model credentials
:return:
"""
try:
audio_file_path = self._get_demo_file_path()
with open(audio_file_path, "rb") as audio_file:
self._invoke(model, credentials, audio_file)
except Exception as ex:
raise CredentialsValidateFailedError(str(ex))
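
A minimal usage sketch, ignoring the provider plumbing Dify normally supplies; the file name and API key are placeholders:

model = GiteeAISpeech2TextModel()
with open("sample.mp3", "rb") as audio_file:  # placeholder audio file
    print(model._invoke("whisper-base", {"api_key": "<api_key>"}, audio_file))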

View File

@@ -0,0 +1,5 @@
model: whisper-base
model_type: speech2text
model_properties:
file_upload_limit: 1
supported_file_extensions: flac,mp3,mp4,mpeg,mpga,m4a,ogg,wav,webm

View File

@@ -0,0 +1,5 @@
model: whisper-large
model_type: speech2text
model_properties:
file_upload_limit: 1
supported_file_extensions: flac,mp3,mp4,mpeg,mpga,m4a,ogg,wav,webm

View File

@@ -0,0 +1,3 @@
- bge-large-zh-v1.5
- bge-small-zh-v1.5
- bge-m3

View File

@@ -0,0 +1,8 @@
model: bge-large-zh-v1.5
label:
zh_Hans: bge-large-zh-v1.5
en_US: bge-large-zh-v1.5
model_type: text-embedding
model_properties:
context_size: 200000
max_chunks: 20

View File

@@ -0,0 +1,8 @@
model: bge-m3
label:
zh_Hans: bge-m3
en_US: bge-m3
model_type: text-embedding
model_properties:
context_size: 200000
max_chunks: 20

View File

@@ -0,0 +1,8 @@
model: bge-small-zh-v1.5
label:
zh_Hans: bge-small-zh-v1.5
en_US: bge-small-zh-v1.5
model_type: text-embedding
model_properties:
context_size: 200000
max_chunks: 20

View File

@@ -0,0 +1,31 @@
from typing import Optional
from core.entities.embedding_type import EmbeddingInputType
from core.model_runtime.entities.text_embedding_entities import TextEmbeddingResult
from core.model_runtime.model_providers.openai_api_compatible.text_embedding.text_embedding import (
OAICompatEmbeddingModel,
)
class GiteeAIEmbeddingModel(OAICompatEmbeddingModel):
def _invoke(
self,
model: str,
credentials: dict,
texts: list[str],
user: Optional[str] = None,
input_type: EmbeddingInputType = EmbeddingInputType.DOCUMENT,
) -> TextEmbeddingResult:
self._add_custom_parameters(credentials, model)
return super()._invoke(model, credentials, texts, user, input_type)
def validate_credentials(self, model: str, credentials: dict) -> None:
self._add_custom_parameters(credentials, None)
super().validate_credentials(model, credentials)
@staticmethod
def _add_custom_parameters(credentials: dict, model: str) -> None:
if model is None:
model = "bge-m3"
credentials["endpoint_url"] = f"https://ai.gitee.com/api/serverless/{model}/v1/"

View File

@@ -0,0 +1,11 @@
model: ChatTTS
model_type: tts
model_properties:
default_voice: 'default'
voices:
- mode: 'default'
name: 'Default'
language: [ 'zh-Hans', 'en-US', 'de-DE', 'fr-FR', 'es-ES', 'it-IT', 'th-TH', 'id-ID' ]
word_limit: 3500
audio_type: 'mp3'
max_workers: 5

View File

@@ -0,0 +1,11 @@
model: FunAudioLLM-CosyVoice-300M
model_type: tts
model_properties:
default_voice: 'default'
voices:
- mode: 'default'
name: 'Default'
language: [ 'zh-Hans', 'en-US', 'de-DE', 'fr-FR', 'es-ES', 'it-IT', 'th-TH', 'id-ID' ]
word_limit: 3500
audio_type: 'mp3'
max_workers: 5

View File

@@ -0,0 +1,4 @@
- speecht5_tts
- ChatTTS
- fish-speech-1.2-sft
- FunAudioLLM-CosyVoice-300M

View File

@@ -0,0 +1,11 @@
model: fish-speech-1.2-sft
model_type: tts
model_properties:
default_voice: 'default'
voices:
- mode: 'default'
name: 'Default'
language: [ 'zh-Hans', 'en-US', 'de-DE', 'fr-FR', 'es-ES', 'it-IT', 'th-TH', 'id-ID' ]
word_limit: 3500
audio_type: 'mp3'
max_workers: 5

View File

@@ -0,0 +1,11 @@
model: speecht5_tts
model_type: tts
model_properties:
default_voice: 'default'
voices:
- mode: 'default'
name: 'Default'
language: [ 'zh-Hans', 'en-US', 'de-DE', 'fr-FR', 'es-ES', 'it-IT', 'th-TH', 'id-ID' ]
word_limit: 3500
audio_type: 'mp3'
max_workers: 5

View File

@@ -0,0 +1,79 @@
from typing import Any, Optional
import requests
from core.model_runtime.errors.invoke import InvokeBadRequestError
from core.model_runtime.errors.validate import CredentialsValidateFailedError
from core.model_runtime.model_providers.__base.tts_model import TTSModel
from core.model_runtime.model_providers.gitee_ai._common import _CommonGiteeAI
class GiteeAIText2SpeechModel(_CommonGiteeAI, TTSModel):
"""
    Model class for the GiteeAI text-to-speech model.
"""
    def _invoke(
        self, model: str, tenant_id: str, credentials: dict, content_text: str, voice: str, user: Optional[str] = None
    ) -> Any:
"""
_invoke text2speech model
:param model: model name
:param tenant_id: user tenant id
:param credentials: model credentials
:param content_text: text content to be translated
:param voice: model timbre
:param user: unique user id
        :return: a generator yielding the synthesized audio in byte chunks
"""
return self._tts_invoke_streaming(model=model, credentials=credentials, content_text=content_text, voice=voice)
def validate_credentials(self, model: str, credentials: dict) -> None:
"""
        validate credentials for the text2speech model

        :param model: model name
        :param credentials: model credentials
        :return: None
"""
try:
self._tts_invoke_streaming(
model=model,
credentials=credentials,
content_text="Hello Dify!",
voice=self._get_model_default_voice(model, credentials),
)
except Exception as ex:
raise CredentialsValidateFailedError(str(ex))
    def _tts_invoke_streaming(self, model: str, credentials: dict, content_text: str, voice: str) -> Any:
"""
_tts_invoke_streaming text2speech model
:param model: model name
:param credentials: model credentials
:param content_text: text content to be translated
:param voice: model timbre
        :return: a generator yielding the synthesized audio in byte chunks
"""
try:
# doc: https://ai.gitee.com/docs/openapi/serverless#tag/serverless/POST/{service}/text-to-speech
endpoint_url = "https://ai.gitee.com/api/serverless/" + model + "/text-to-speech"
headers = {"Content-Type": "application/json"}
api_key = credentials.get("api_key")
if api_key:
headers["Authorization"] = f"Bearer {api_key}"
payload = {"inputs": content_text}
response = requests.post(endpoint_url, headers=headers, json=payload)
if response.status_code != 200:
raise InvokeBadRequestError(response.text)
data = response.content
for i in range(0, len(data), 1024):
yield data[i : i + 1024]
except Exception as ex:
raise InvokeBadRequestError(str(ex))
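
Because _tts_invoke_streaming is a generator yielding 1 KiB chunks, callers consume the audio incrementally; a minimal sketch with placeholder credentials:

tts = GiteeAIText2SpeechModel()
with open("hello.mp3", "wb") as f:
    for chunk in tts._tts_invoke_streaming(
        model="speecht5_tts",
        credentials={"api_key": "<api_key>"},  # placeholder
        content_text="Hello Dify!",
        voice="default",
    ):
        f.write(chunk)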

View File

@@ -116,26 +116,33 @@ class GoogleLargeLanguageModel(LargeLanguageModel):
:param tools: tool messages
:return: glm tools
"""
        return glm.Tool(
            function_declarations=[
                glm.FunctionDeclaration(
                    name=tool.name,
                    parameters=glm.Schema(
                        type=glm.Type.OBJECT,
                        properties={
                            key: {
                                "type_": value.get("type", "string").upper(),
                                "description": value.get("description", ""),
                                "enum": value.get("enum", []),
                            }
                            for key, value in tool.parameters.get("properties", {}).items()
                        },
                        required=tool.parameters.get("required", []),
                    ),
                )
                for tool in tools
            ]
        )
        function_declarations = []
        for tool in tools:
            properties = {}
            for key, value in tool.parameters.get("properties", {}).items():
                properties[key] = {
                    "type_": glm.Type.STRING,
                    "description": value.get("description", ""),
                    "enum": value.get("enum", []),
                }

            if properties:
                parameters = glm.Schema(
                    type=glm.Type.OBJECT,
                    properties=properties,
                    required=tool.parameters.get("required", []),
                )
            else:
                parameters = None

            function_declaration = glm.FunctionDeclaration(
                name=tool.name,
                parameters=parameters,
                description=tool.description,
            )
            function_declarations.append(function_declaration)

        return glm.Tool(function_declarations=function_declarations)
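
The imperative rewrite appears motivated by tools that declare no parameters: the Gemini API rejects an OBJECT schema with an empty properties map, so such tools now pass parameters=None. A sketch of what the new loop produces for a parameter-less tool (the PromptMessageTool shape is assumed from its usage here):

tool = PromptMessageTool(
    name="get_current_time",
    description="Return the current UTC time",
    parameters={"type": "object", "properties": {}, "required": []},
)
# properties is empty, so parameters stays None and no empty schema is sent:
declaration = glm.FunctionDeclaration(name=tool.name, parameters=None, description=tool.description)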
def validate_credentials(self, model: str, credentials: dict) -> None:
"""

View File

@@ -0,0 +1,378 @@
import json
import logging
import time
import uuid
from datetime import timedelta
from typing import Any
from couchbase import search
from couchbase.auth import PasswordAuthenticator
from couchbase.cluster import Cluster
from couchbase.management.search import SearchIndex
# needed for options -- cluster, timeout, SQL++ (N1QL) query, etc.
from couchbase.options import ClusterOptions, SearchOptions
from couchbase.vector_search import VectorQuery, VectorSearch
from flask import current_app
from pydantic import BaseModel, model_validator
from core.rag.datasource.vdb.vector_base import BaseVector
from core.rag.datasource.vdb.vector_factory import AbstractVectorFactory
from core.rag.datasource.vdb.vector_type import VectorType
from core.rag.embedding.embedding_base import Embeddings
from core.rag.models.document import Document
from extensions.ext_redis import redis_client
from models.dataset import Dataset
logger = logging.getLogger(__name__)
class CouchbaseConfig(BaseModel):
connection_string: str
user: str
password: str
bucket_name: str
scope_name: str
@model_validator(mode="before")
@classmethod
def validate_config(cls, values: dict) -> dict:
if not values.get("connection_string"):
raise ValueError("config COUCHBASE_CONNECTION_STRING is required")
if not values.get("user"):
raise ValueError("config COUCHBASE_USER is required")
if not values.get("password"):
raise ValueError("config COUCHBASE_PASSWORD is required")
if not values.get("bucket_name"):
raise ValueError("config COUCHBASE_PASSWORD is required")
if not values.get("scope_name"):
raise ValueError("config COUCHBASE_SCOPE_NAME is required")
return values
class CouchbaseVector(BaseVector):
def __init__(self, collection_name: str, config: CouchbaseConfig):
super().__init__(collection_name)
self._client_config = config
"""Connect to couchbase"""
auth = PasswordAuthenticator(config.user, config.password)
options = ClusterOptions(auth)
self._cluster = Cluster(config.connection_string, options)
self._bucket = self._cluster.bucket(config.bucket_name)
self._scope = self._bucket.scope(config.scope_name)
self._bucket_name = config.bucket_name
self._scope_name = config.scope_name
# Wait until the cluster is ready for use.
self._cluster.wait_until_ready(timedelta(seconds=5))
def create(self, texts: list[Document], embeddings: list[list[float]], **kwargs):
index_id = str(uuid.uuid4()).replace("-", "")
self._create_collection(uuid=index_id, vector_length=len(embeddings[0]))
self.add_texts(texts, embeddings)
def _create_collection(self, vector_length: int, uuid: str):
lock_name = "vector_indexing_lock_{}".format(self._collection_name)
with redis_client.lock(lock_name, timeout=20):
collection_exist_cache_key = "vector_indexing_{}".format(self._collection_name)
if redis_client.get(collection_exist_cache_key):
return
if self._collection_exists(self._collection_name):
return
manager = self._bucket.collections()
manager.create_collection(self._client_config.scope_name, self._collection_name)
index_manager = self._scope.search_indexes()
index_definition = json.loads("""
{
"type": "fulltext-index",
"name": "Embeddings._default.Vector_Search",
"uuid": "26d4db528e78b716",
"sourceType": "gocbcore",
"sourceName": "Embeddings",
"sourceUUID": "2242e4a25b4decd6650c9c7b3afa1dbf",
"planParams": {
"maxPartitionsPerPIndex": 1024,
"indexPartitions": 1
},
"params": {
"doc_config": {
"docid_prefix_delim": "",
"docid_regexp": "",
"mode": "scope.collection.type_field",
"type_field": "type"
},
"mapping": {
"analysis": { },
"default_analyzer": "standard",
"default_datetime_parser": "dateTimeOptional",
"default_field": "_all",
"default_mapping": {
"dynamic": true,
"enabled": true
},
"default_type": "_default",
"docvalues_dynamic": false,
"index_dynamic": true,
"store_dynamic": true,
"type_field": "_type",
"types": {
"collection_name": {
"dynamic": true,
"enabled": true,
"properties": {
"embedding": {
"dynamic": false,
"enabled": true,
"fields": [
{
"dims": 1536,
"index": true,
"name": "embedding",
"similarity": "dot_product",
"type": "vector",
"vector_index_optimized_for": "recall"
}
]
},
"metadata": {
"dynamic": true,
"enabled": true
},
"text": {
"dynamic": false,
"enabled": true,
"fields": [
{
"index": true,
"name": "text",
"store": true,
"type": "text"
}
]
}
}
}
}
},
"store": {
"indexType": "scorch",
"segmentVersion": 16
}
},
"sourceParams": { }
}
""")
index_definition["name"] = self._collection_name + "_search"
index_definition["uuid"] = uuid
index_definition["params"]["mapping"]["types"]["collection_name"]["properties"]["embedding"]["fields"][0][
"dims"
] = vector_length
index_definition["params"]["mapping"]["types"][self._scope_name + "." + self._collection_name] = (
index_definition["params"]["mapping"]["types"].pop("collection_name")
)
time.sleep(2)
index_manager.upsert_index(
SearchIndex(
index_definition["name"],
params=index_definition["params"],
source_name=self._bucket_name,
),
)
time.sleep(1)
redis_client.set(collection_exist_cache_key, 1, ex=3600)
def _collection_exists(self, name: str):
scope_collection_map: dict[str, Any] = {}
# Get a list of all scopes in the bucket
for scope in self._bucket.collections().get_all_scopes():
scope_collection_map[scope.name] = []
# Get a list of all the collections in the scope
for collection in scope.collections:
scope_collection_map[scope.name].append(collection.name)
# Check if the collection exists in the scope
return self._collection_name in scope_collection_map[self._scope_name]
def get_type(self) -> str:
return VectorType.COUCHBASE
    def add_texts(self, documents: list[Document], embeddings: list[list[float]], **kwargs):
        uuids = self._get_uuids(documents)
        texts = [d.page_content for d in documents]
        metadatas = [d.metadata for d in documents]

        documents_to_insert = [
            {"text": text, "embedding": vector, "metadata": metadata}
            for text, vector, metadata in zip(texts, embeddings, metadatas)
        ]
        # upsert one document per generated uuid; the uuids double as the returned doc ids
        for doc_id, doc in zip(uuids, documents_to_insert):
            self._scope.collection(self._collection_name).upsert(doc_id, doc)

        return uuids
def text_exists(self, id: str) -> bool:
# Use a parameterized query for safety and correctness
query = f"""
SELECT COUNT(1) AS count FROM
`{self._client_config.bucket_name}`.{self._client_config.scope_name}.{self._collection_name}
WHERE META().id = $doc_id
"""
# Pass the id as a parameter to the query
result = self._cluster.query(query, named_parameters={"doc_id": id}).execute()
for row in result:
return row["count"] > 0
return False # Return False if no rows are returned
def delete_by_ids(self, ids: list[str]) -> None:
query = f"""
DELETE FROM `{self._bucket_name}`.{self._client_config.scope_name}.{self._collection_name}
WHERE META().id IN $doc_ids;
"""
try:
self._cluster.query(query, named_parameters={"doc_ids": ids}).execute()
except Exception as e:
logger.error(e)
def delete_by_document_id(self, document_id: str):
query = f"""
DELETE FROM
`{self._client_config.bucket_name}`.{self._client_config.scope_name}.{self._collection_name}
WHERE META().id = $doc_id;
"""
self._cluster.query(query, named_parameters={"doc_id": document_id}).execute()
# def get_ids_by_metadata_field(self, key: str, value: str):
# query = f"""
# SELECT id FROM
# `{self._client_config.bucket_name}`.{self._client_config.scope_name}.{self._collection_name}
# WHERE `metadata.{key}` = $value;
# """
# result = self._cluster.query(query, named_parameters={'value':value})
# return [row['id'] for row in result.rows()]
def delete_by_metadata_field(self, key: str, value: str) -> None:
query = f"""
DELETE FROM `{self._client_config.bucket_name}`.{self._client_config.scope_name}.{self._collection_name}
WHERE metadata.{key} = $value;
"""
self._cluster.query(query, named_parameters={"value": value}).execute()
def search_by_vector(self, query_vector: list[float], **kwargs: Any) -> list[Document]:
top_k = kwargs.get("top_k", 5)
score_threshold = kwargs.get("score_threshold") or 0.0
search_req = search.SearchRequest.create(
VectorSearch.from_vector_query(
VectorQuery(
"embedding",
query_vector,
top_k,
)
)
)
try:
search_iter = self._scope.search(
self._collection_name + "_search",
search_req,
SearchOptions(limit=top_k, collections=[self._collection_name], fields=["*"]),
)
docs = []
# Parse the results
for row in search_iter.rows():
text = row.fields.pop("text")
metadata = self._format_metadata(row.fields)
score = row.score
metadata["score"] = score
doc = Document(page_content=text, metadata=metadata)
if score >= score_threshold:
docs.append(doc)
except Exception as e:
raise ValueError(f"Search failed with error: {e}")
return docs
def search_by_full_text(self, query: str, **kwargs: Any) -> list[Document]:
top_k = kwargs.get("top_k", 2)
try:
            cb_request = search.SearchRequest.create(search.QueryStringQuery("text:" + query))
            search_iter = self._scope.search(
                self._collection_name + "_search", cb_request, SearchOptions(limit=top_k, fields=["*"])
            )
docs = []
for row in search_iter.rows():
text = row.fields.pop("text")
metadata = self._format_metadata(row.fields)
score = row.score
metadata["score"] = score
doc = Document(page_content=text, metadata=metadata)
docs.append(doc)
except Exception as e:
raise ValueError(f"Search failed with error: {e}")
return docs
    def delete(self):
        manager = self._bucket.collections()
        scopes = manager.get_all_scopes()
        for scope in scopes:
            for collection in scope.collections:
                if collection.name == self._collection_name:
                    # drop the collection from the scope it was actually found in,
                    # not from a hard-coded "_default" scope
                    manager.drop_collection(scope.name, self._collection_name)
def _format_metadata(self, row_fields: dict[str, Any]) -> dict[str, Any]:
"""Helper method to format the metadata from the Couchbase Search API.
Args:
row_fields (Dict[str, Any]): The fields to format.
Returns:
Dict[str, Any]: The formatted metadata.
"""
metadata = {}
for key, value in row_fields.items():
# Couchbase Search returns the metadata key with a prefix
# `metadata.` We remove it to get the original metadata key
if key.startswith("metadata"):
new_key = key.split("metadata" + ".")[-1]
metadata[new_key] = value
else:
metadata[key] = value
return metadata
class CouchbaseVectorFactory(AbstractVectorFactory):
def init_vector(self, dataset: Dataset, attributes: list, embeddings: Embeddings) -> CouchbaseVector:
if dataset.index_struct_dict:
class_prefix: str = dataset.index_struct_dict["vector_store"]["class_prefix"]
collection_name = class_prefix
else:
dataset_id = dataset.id
collection_name = Dataset.gen_collection_name_by_id(dataset_id)
dataset.index_struct = json.dumps(self.gen_index_struct_dict(VectorType.COUCHBASE, collection_name))
config = current_app.config
return CouchbaseVector(
collection_name=collection_name,
config=CouchbaseConfig(
connection_string=config.get("COUCHBASE_CONNECTION_STRING"),
user=config.get("COUCHBASE_USER"),
password=config.get("COUCHBASE_PASSWORD"),
bucket_name=config.get("COUCHBASE_BUCKET_NAME"),
scope_name=config.get("COUCHBASE_SCOPE_NAME"),
),
)
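
For local testing, a minimal sketch of constructing the store directly; the connection values are placeholders matching the config keys read above, not project defaults:

config = CouchbaseConfig(
    connection_string="couchbase://127.0.0.1",  # placeholder
    user="Administrator",                       # placeholder
    password="password",                        # placeholder
    bucket_name="Embeddings",
    scope_name="_default",
)
store = CouchbaseVector(collection_name="vector_index_demo", config=config)
docs = store.search_by_vector([0.1] * 1536, top_k=5, score_threshold=0.2)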

View File

@@ -142,7 +142,7 @@ class ElasticSearchVector(BaseVector):
def search_by_full_text(self, query: str, **kwargs: Any) -> list[Document]:
query_str = {"match": {Field.CONTENT_KEY.value: query}}
results = self._client.search(index=self._collection_name, query=query_str)
results = self._client.search(index=self._collection_name, query=query_str, size=kwargs.get("top_k", 4))
docs = []
for hit in results["hits"]["hits"]:
docs.append(

View File

@@ -0,0 +1,209 @@
import json
import logging
import math
from typing import Any
from pydantic import BaseModel, model_validator
from pyobvector import VECTOR, ObVecClient
from sqlalchemy import JSON, Column, String, func
from sqlalchemy.dialects.mysql import LONGTEXT
from configs import dify_config
from core.rag.datasource.vdb.vector_base import BaseVector
from core.rag.datasource.vdb.vector_factory import AbstractVectorFactory
from core.rag.datasource.vdb.vector_type import VectorType
from core.rag.embedding.embedding_base import Embeddings
from core.rag.models.document import Document
from extensions.ext_redis import redis_client
from models.dataset import Dataset
logger = logging.getLogger(__name__)
DEFAULT_OCEANBASE_HNSW_BUILD_PARAM = {"M": 16, "efConstruction": 256}
DEFAULT_OCEANBASE_HNSW_SEARCH_PARAM = {"efSearch": 64}
OCEANBASE_SUPPORTED_VECTOR_INDEX_TYPE = "HNSW"
DEFAULT_OCEANBASE_VECTOR_METRIC_TYPE = "l2"
class OceanBaseVectorConfig(BaseModel):
host: str
port: int
user: str
password: str
database: str
@model_validator(mode="before")
@classmethod
def validate_config(cls, values: dict) -> dict:
if not values["host"]:
raise ValueError("config OCEANBASE_VECTOR_HOST is required")
if not values["port"]:
raise ValueError("config OCEANBASE_VECTOR_PORT is required")
if not values["user"]:
raise ValueError("config OCEANBASE_VECTOR_USER is required")
if not values["database"]:
raise ValueError("config OCEANBASE_VECTOR_DATABASE is required")
return values
class OceanBaseVector(BaseVector):
def __init__(self, collection_name: str, config: OceanBaseVectorConfig):
super().__init__(collection_name)
self._config = config
self._hnsw_ef_search = -1
self._client = ObVecClient(
uri=f"{self._config.host}:{self._config.port}",
user=self._config.user,
password=self._config.password,
db_name=self._config.database,
)
def get_type(self) -> str:
return VectorType.OCEANBASE
def create(self, texts: list[Document], embeddings: list[list[float]], **kwargs):
self._vec_dim = len(embeddings[0])
self._create_collection()
self.add_texts(texts, embeddings)
def _create_collection(self) -> None:
lock_name = "vector_indexing_lock_" + self._collection_name
with redis_client.lock(lock_name, timeout=20):
collection_exist_cache_key = "vector_indexing_" + self._collection_name
if redis_client.get(collection_exist_cache_key):
return
if self._client.check_table_exists(self._collection_name):
return
self.delete()
cols = [
Column("id", String(36), primary_key=True, autoincrement=False),
Column("vector", VECTOR(self._vec_dim)),
Column("text", LONGTEXT),
Column("metadata", JSON),
]
vidx_params = self._client.prepare_index_params()
vidx_params.add_index(
field_name="vector",
index_type=OCEANBASE_SUPPORTED_VECTOR_INDEX_TYPE,
index_name="vector_index",
metric_type=DEFAULT_OCEANBASE_VECTOR_METRIC_TYPE,
params=DEFAULT_OCEANBASE_HNSW_BUILD_PARAM,
)
self._client.create_table_with_index_params(
table_name=self._collection_name,
columns=cols,
vidxs=vidx_params,
)
vals = []
params = self._client.perform_raw_text_sql("SHOW PARAMETERS LIKE '%ob_vector_memory_limit_percentage%'")
for row in params:
val = int(row[6])
vals.append(val)
            if len(vals) == 0:
                # raise instead of print()/exit(1): a library module must not kill the whole process
                raise ValueError("ob_vector_memory_limit_percentage not found in parameters.")
if any(val == 0 for val in vals):
try:
self._client.perform_raw_text_sql("ALTER SYSTEM SET ob_vector_memory_limit_percentage = 30")
except Exception as e:
raise Exception(
"Failed to set ob_vector_memory_limit_percentage. "
+ "Maybe the database user has insufficient privilege.",
e,
)
redis_client.set(collection_exist_cache_key, 1, ex=3600)
def add_texts(self, documents: list[Document], embeddings: list[list[float]], **kwargs):
ids = self._get_uuids(documents)
for id, doc, emb in zip(ids, documents, embeddings):
self._client.insert(
table_name=self._collection_name,
data={
"id": id,
"vector": emb,
"text": doc.page_content,
"metadata": doc.metadata,
},
)
def text_exists(self, id: str) -> bool:
cur = self._client.get(table_name=self._collection_name, id=id)
return cur.rowcount != 0
def delete_by_ids(self, ids: list[str]) -> None:
self._client.delete(table_name=self._collection_name, ids=ids)
def get_ids_by_metadata_field(self, key: str, value: str) -> list[str]:
cur = self._client.get(
table_name=self._collection_name,
where_clause=f"metadata->>'$.{key}' = '{value}'",
output_column_name=["id"],
)
return [row[0] for row in cur]
def delete_by_metadata_field(self, key: str, value: str) -> None:
ids = self.get_ids_by_metadata_field(key, value)
self.delete_by_ids(ids)
def search_by_full_text(self, query: str, **kwargs: Any) -> list[Document]:
return []
def search_by_vector(self, query_vector: list[float], **kwargs: Any) -> list[Document]:
ef_search = kwargs.get("ef_search", self._hnsw_ef_search)
if ef_search != self._hnsw_ef_search:
self._client.set_ob_hnsw_ef_search(ef_search)
self._hnsw_ef_search = ef_search
topk = kwargs.get("top_k", 10)
cur = self._client.ann_search(
table_name=self._collection_name,
vec_column_name="vector",
vec_data=query_vector,
topk=topk,
distance_func=func.l2_distance,
output_column_names=["text", "metadata"],
with_dist=True,
)
docs = []
for text, metadata, distance in cur:
metadata = json.loads(metadata)
metadata["score"] = 1 - distance / math.sqrt(2)
docs.append(
Document(
page_content=text,
metadata=metadata,
)
)
return docs
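
About the 1 - distance / sqrt(2) line above: assuming unit-norm embeddings, the squared L2 distance equals 2 * (1 - cosine similarity), so identical vectors sit at d = 0 and orthogonal ones at d = sqrt(2). A sketch of that normalization (an interpretation, not documented OceanBase behavior):

import math

def l2_to_score(distance: float) -> float:
    # d = 0 (identical) -> 1.0; d = sqrt(2) (orthogonal) -> 0.0;
    # opposed vectors (d = 2) go negative, so treat 0 as a practical floor.
    return 1 - distance / math.sqrt(2)

assert l2_to_score(0.0) == 1.0
assert abs(l2_to_score(math.sqrt(2))) < 1e-9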
def delete(self) -> None:
self._client.drop_table_if_exist(self._collection_name)
class OceanBaseVectorFactory(AbstractVectorFactory):
def init_vector(
self,
dataset: Dataset,
attributes: list,
embeddings: Embeddings,
) -> BaseVector:
if dataset.index_struct_dict:
class_prefix: str = dataset.index_struct_dict["vector_store"]["class_prefix"]
collection_name = class_prefix.lower()
else:
dataset_id = dataset.id
collection_name = Dataset.gen_collection_name_by_id(dataset_id).lower()
dataset.index_struct = json.dumps(self.gen_index_struct_dict(VectorType.OCEANBASE, collection_name))
return OceanBaseVector(
collection_name,
OceanBaseVectorConfig(
host=dify_config.OCEANBASE_VECTOR_HOST,
port=dify_config.OCEANBASE_VECTOR_PORT,
user=dify_config.OCEANBASE_VECTOR_USER,
password=(dify_config.OCEANBASE_VECTOR_PASSWORD or ""),
database=dify_config.OCEANBASE_VECTOR_DATABASE,
),
)

View File

@@ -4,6 +4,7 @@ import uuid
import requests
from requests.auth import HTTPDigestAuth
from configs import dify_config
from extensions.ext_database import db
from extensions.ext_redis import redis_client
from models.dataset import TidbAuthBinding
@@ -208,7 +209,7 @@ class TidbService:
}
spending_limit = {
"monthly": 10,
"monthly": dify_config.TIDB_SPEND_LIMIT,
}
password = str(uuid.uuid4()).replace("-", "")[:16]
display_name = str(uuid.uuid4()).replace("-", "")

View File

@@ -114,6 +114,10 @@ class Vector:
from core.rag.datasource.vdb.analyticdb.analyticdb_vector import AnalyticdbVectorFactory
return AnalyticdbVectorFactory
case VectorType.COUCHBASE:
from core.rag.datasource.vdb.couchbase.couchbase_vector import CouchbaseVectorFactory
return CouchbaseVectorFactory
case VectorType.BAIDU:
from core.rag.datasource.vdb.baidu.baidu_vector import BaiduVectorFactory
@@ -130,6 +134,10 @@
from core.rag.datasource.vdb.tidb_on_qdrant.tidb_on_qdrant_vector import TidbOnQdrantVectorFactory
return TidbOnQdrantVectorFactory
case VectorType.OCEANBASE:
from core.rag.datasource.vdb.oceanbase.oceanbase_vector import OceanBaseVectorFactory
return OceanBaseVectorFactory
case _:
raise ValueError(f"Vector store {vector_type} is not supported.")

View File

@@ -16,7 +16,9 @@ class VectorType(str, Enum):
TENCENT = "tencent"
ORACLE = "oracle"
ELASTICSEARCH = "elasticsearch"
COUCHBASE = "couchbase"
BAIDU = "baidu"
VIKINGDB = "vikingdb"
UPSTASH = "upstash"
TIDB_ON_QDRANT = "tidb_on_qdrant"
OCEANBASE = "oceanbase"

View File

@@ -1,6 +1,6 @@
from enum import Enum
class RerankMode(Enum):
class RerankMode(str, Enum):
RERANKING_MODEL = "reranking_model"
WEIGHTED_SCORE = "weighted_score"

View File

@@ -22,6 +22,7 @@ from core.rag.datasource.keyword.jieba.jieba_keyword_table_handler import JiebaK
from core.rag.datasource.retrieval_service import RetrievalService
from core.rag.entities.context_entities import DocumentContext
from core.rag.models.document import Document
from core.rag.rerank.rerank_type import RerankMode
from core.rag.retrieval.retrieval_methods import RetrievalMethod
from core.rag.retrieval.router.multi_dataset_function_call_router import FunctionCallMultiDatasetRouter
from core.rag.retrieval.router.multi_dataset_react_route import ReactMultiDatasetRouter
@@ -361,10 +362,39 @@ class DatasetRetrieval:
reranking_enable: bool = True,
message_id: Optional[str] = None,
):
if not available_datasets:
return []
threads = []
all_documents = []
dataset_ids = [dataset.id for dataset in available_datasets]
index_type = None
index_type_check = all(
item.indexing_technique == available_datasets[0].indexing_technique for item in available_datasets
)
if not index_type_check and (not reranking_enable or reranking_mode != RerankMode.RERANKING_MODEL):
raise ValueError(
"The configured knowledge base list have different indexing technique, please set reranking model."
)
index_type = available_datasets[0].indexing_technique
if index_type == "high_quality":
embedding_model_check = all(
item.embedding_model == available_datasets[0].embedding_model for item in available_datasets
)
embedding_model_provider_check = all(
item.embedding_model_provider == available_datasets[0].embedding_model_provider
for item in available_datasets
)
if (
reranking_enable
and reranking_mode == "weighted_score"
and (not embedding_model_check or not embedding_model_provider_check)
):
raise ValueError(
"The configured knowledge base list have different embedding model, please set reranking model."
)
if reranking_enable and reranking_mode == RerankMode.WEIGHTED_SCORE:
weights["vector_setting"]["embedding_provider_name"] = available_datasets[0].embedding_model_provider
weights["vector_setting"]["embedding_model_name"] = available_datasets[0].embedding_model
for dataset in available_datasets:
index_type = dataset.indexing_technique
retrieval_thread = threading.Thread(

Binary file not shown (new image added, 16 KiB).

View File

@@ -0,0 +1,11 @@
from hashlib import md5
class BaiduTranslateToolBase:
def _get_sign(self, appid, secret, salt, query):
"""
get baidu translate sign
"""
# concatenate the string in the order of appid+q+salt+secret
        sign_str = appid + query + salt + secret
        return md5(sign_str.encode("utf-8")).hexdigest()
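
A worked sketch of the signing scheme; all values are placeholders, not real credentials:

from hashlib import md5

appid, secret = "20240000000000001", "my_secret"  # placeholders
salt, q = "1435660288", "apple"
sign = md5((appid + q + salt + secret).encode("utf-8")).hexdigest()
# sign is sent alongside q, appid and salt; the server recomputes it and compares.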

View File

@@ -0,0 +1,17 @@
from typing import Any
from core.tools.errors import ToolProviderCredentialValidationError
from core.tools.provider.builtin.baidu_translate.tools.translate import BaiduTranslateTool
from core.tools.provider.builtin_tool_provider import BuiltinToolProviderController
class BaiduTranslateProvider(BuiltinToolProviderController):
def _validate_credentials(self, credentials: dict[str, Any]) -> None:
try:
BaiduTranslateTool().fork_tool_runtime(
runtime={
"credentials": credentials,
}
).invoke(user_id="", tool_parameters={"q": "这是一段测试文本", "from": "auto", "to": "en"})
except Exception as e:
raise ToolProviderCredentialValidationError(str(e))

View File

@@ -0,0 +1,39 @@
identity:
author: Xiao Ley
name: baidu_translate
label:
en_US: Baidu Translate
zh_Hans: 百度翻译
description:
en_US: Translate text using Baidu
zh_Hans: 使用百度进行翻译
icon: icon.png
tags:
- utilities
credentials_for_provider:
appid:
type: secret-input
required: true
label:
en_US: Baidu translate appid
zh_Hans: Baidu translate appid
placeholder:
en_US: Please input your Baidu translate appid
zh_Hans: 请输入你的百度翻译 appid
help:
en_US: Get your Baidu translate appid from Baidu translate
zh_Hans: 从百度翻译开放平台获取你的 appid
url: https://api.fanyi.baidu.com
secret:
type: secret-input
required: true
label:
en_US: Baidu translate secret
zh_Hans: Baidu translate secret
placeholder:
en_US: Please input your Baidu translate secret
zh_Hans: 请输入你的百度翻译 secret
help:
en_US: Get your Baidu translate secret from Baidu translate
zh_Hans: 从百度翻译开放平台获取你的 secret
url: https://api.fanyi.baidu.com

View File

@@ -0,0 +1,78 @@
import random
from hashlib import md5
from typing import Any, Union
import requests
from core.tools.entities.tool_entities import ToolInvokeMessage
from core.tools.provider.builtin.baidu_translate._baidu_translate_tool_base import BaiduTranslateToolBase
from core.tools.tool.builtin_tool import BuiltinTool
class BaiduFieldTranslateTool(BuiltinTool, BaiduTranslateToolBase):
def _invoke(
self,
user_id: str,
tool_parameters: dict[str, Any],
) -> Union[ToolInvokeMessage, list[ToolInvokeMessage]]:
"""
invoke tools
"""
BAIDU_FIELD_TRANSLATE_URL = "https://fanyi-api.baidu.com/api/trans/vip/fieldtranslate"
appid = self.runtime.credentials.get("appid", "")
if not appid:
raise ValueError("invalid baidu translate appid")
secret = self.runtime.credentials.get("secret", "")
if not secret:
raise ValueError("invalid baidu translate secret")
q = tool_parameters.get("q", "")
if not q:
raise ValueError("Please input text to translate")
from_ = tool_parameters.get("from", "")
if not from_:
raise ValueError("Please select source language")
to = tool_parameters.get("to", "")
if not to:
raise ValueError("Please select destination language")
domain = tool_parameters.get("domain", "")
if not domain:
raise ValueError("Please select domain")
salt = str(random.randint(32768, 16777215))
sign = self._get_sign(appid, secret, salt, q, domain)
headers = {"Content-Type": "application/x-www-form-urlencoded"}
params = {
"q": q,
"from": from_,
"to": to,
"appid": appid,
"salt": salt,
"domain": domain,
"sign": sign,
"needIntervene": 1,
}
try:
response = requests.post(BAIDU_FIELD_TRANSLATE_URL, headers=headers, data=params)
result = response.json()
if "trans_result" in result:
result_text = result["trans_result"][0]["dst"]
else:
result_text = f'{result["error_code"]}: {result["error_msg"]}'
return self.create_text_message(str(result_text))
except requests.RequestException as e:
raise ValueError(f"Translation service error: {e}")
except Exception:
raise ValueError("Translation service error, please check the network")
def _get_sign(self, appid, secret, salt, query, domain):
        sign_str = appid + query + salt + domain + secret
        return md5(sign_str.encode("utf-8")).hexdigest()
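
Note the only difference from the general signature above: the domain is concatenated between salt and secret. A sketch with placeholder values:

from hashlib import md5

sign = md5(("<appid>" + "苹果" + "1435660288" + "it" + "<secret>").encode("utf-8")).hexdigest()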

View File

@@ -0,0 +1,123 @@
identity:
name: field_translate
author: Xiao Ley
label:
en_US: Field translate
zh_Hans: 百度领域翻译
description:
human:
    en_US: A tool for Baidu domain-specific translation (currently the "novel" and "wiki" domains only support Chinese-to-English translation; if the direction is set to English-to-Chinese, the general translation result is returned by default).
zh_Hans: 百度领域翻译,提供多种领域的文本翻译(目前“网络文学领域”和“人文社科领域”仅支持中到英,如设置语言方向为英到中,则默认输出通用翻译结果)
llm: A tool for Baidu Field translate
parameters:
- name: q
type: string
required: true
label:
en_US: Text content
zh_Hans: 文本内容
human_description:
en_US: Text content to be translated
zh_Hans: 需要翻译的文本内容
llm_description: Text content to be translated
form: llm
- name: from
type: select
required: true
label:
en_US: source language
zh_Hans: 源语言
human_description:
en_US: The source language of the input text
zh_Hans: 输入的文本的源语言
default: auto
form: form
options:
- value: auto
label:
en_US: auto
zh_Hans: 自动检测
- value: zh
label:
en_US: Chinese
zh_Hans: 中文
- value: en
label:
en_US: English
zh_Hans: 英语
- name: to
type: select
required: true
label:
en_US: destination language
zh_Hans: 目标语言
human_description:
en_US: The destination language of the input text
zh_Hans: 输入文本的目标语言
default: en
form: form
options:
- value: zh
label:
en_US: Chinese
zh_Hans: 中文
- value: en
label:
en_US: English
zh_Hans: 英语
- name: domain
type: select
required: true
label:
en_US: domain
zh_Hans: 领域
human_description:
en_US: The domain of the input text
zh_Hans: 输入文本的领域
default: novel
form: form
options:
- value: it
label:
en_US: it
zh_Hans: 信息技术领域
- value: finance
label:
en_US: finance
zh_Hans: 金融财经领域
- value: machinery
label:
en_US: machinery
zh_Hans: 机械制造领域
- value: senimed
label:
en_US: senimed
zh_Hans: 生物医药领域
- value: novel
label:
        en_US: novel (only supports Chinese-to-English translation)
zh_Hans: 网络文学领域(仅支持中到英)
- value: academic
label:
en_US: academic
zh_Hans: 学术论文领域
- value: aerospace
label:
en_US: aerospace
zh_Hans: 航空航天领域
- value: wiki
label:
        en_US: wiki (only supports Chinese-to-English translation)
zh_Hans: 人文社科领域(仅支持中到英)
- value: news
label:
en_US: news
zh_Hans: 新闻咨询领域
- value: law
label:
en_US: law
zh_Hans: 法律法规领域
- value: contract
label:
en_US: contract
zh_Hans: 合同领域

View File

@@ -0,0 +1,95 @@
import random
from typing import Any, Union
import requests
from core.tools.entities.tool_entities import ToolInvokeMessage
from core.tools.provider.builtin.baidu_translate._baidu_translate_tool_base import BaiduTranslateToolBase
from core.tools.tool.builtin_tool import BuiltinTool
class BaiduLanguageTool(BuiltinTool, BaiduTranslateToolBase):
def _invoke(
self,
user_id: str,
tool_parameters: dict[str, Any],
) -> Union[ToolInvokeMessage, list[ToolInvokeMessage]]:
"""
invoke tools
"""
BAIDU_LANGUAGE_URL = "https://fanyi-api.baidu.com/api/trans/vip/language"
appid = self.runtime.credentials.get("appid", "")
if not appid:
raise ValueError("invalid baidu translate appid")
secret = self.runtime.credentials.get("secret", "")
if not secret:
raise ValueError("invalid baidu translate secret")
q = tool_parameters.get("q", "")
if not q:
raise ValueError("Please input text to translate")
description_language = tool_parameters.get("description_language", "English")
salt = str(random.randint(32768, 16777215))
sign = self._get_sign(appid, secret, salt, q)
headers = {"Content-Type": "application/x-www-form-urlencoded"}
params = {
"q": q,
"appid": appid,
"salt": salt,
"sign": sign,
}
try:
response = requests.post(BAIDU_LANGUAGE_URL, params=params, headers=headers)
result = response.json()
if "error_code" not in result:
raise ValueError("Translation service error, please check the network")
result_text = ""
if result["error_code"] != 0:
result_text = f'{result["error_code"]}: {result["error_msg"]}'
else:
result_text = result["data"]["src"]
result_text = self.mapping_result(description_language, result_text)
return self.create_text_message(result_text)
except requests.RequestException as e:
raise ValueError(f"Translation service error: {e}")
except Exception:
raise ValueError("Translation service error, please check the network")
def mapping_result(self, description_language: str, result: str) -> str:
"""
mapping result
"""
mapping = {
"English": {
"zh": "Chinese",
"en": "English",
"jp": "Japanese",
"kor": "Korean",
"th": "Thai",
"vie": "Vietnamese",
"ru": "Russian",
},
"Chinese": {
"zh": "中文",
"en": "英文",
"jp": "日文",
"kor": "韩文",
"th": "泰语",
"vie": "越南语",
"ru": "俄语",
},
}
language_mapping = mapping.get(description_language)
if not language_mapping:
return result
return language_mapping.get(result, result)

View File

@@ -0,0 +1,43 @@
identity:
name: language
author: Xiao Ley
label:
en_US: Baidu Language
zh_Hans: 百度语种识别
description:
human:
    en_US: A tool for Baidu language detection; supports Chinese, English, Japanese, Korean, Thai, Vietnamese and Russian
zh_Hans: 使用百度进行语种识别,支持的语种:中文、英语、日语、韩语、泰语、越南语和俄语
llm: A tool for Baidu Language
parameters:
- name: q
type: string
required: true
label:
en_US: Text content
zh_Hans: 文本内容
human_description:
en_US: Text content to be recognized
zh_Hans: 需要识别语言的文本内容
llm_description: Text content to be recognized
form: llm
- name: description_language
type: select
required: true
label:
en_US: Description language
zh_Hans: 描述语言
human_description:
      en_US: The language in which the detection result is described
zh_Hans: 描述识别结果所用的语言
default: Chinese
form: form
options:
- value: Chinese
label:
en_US: Chinese
zh_Hans: 中文
- value: English
label:
en_US: English
zh_Hans: 英语

View File

@@ -0,0 +1,67 @@
import random
from typing import Any, Union
import requests
from core.tools.entities.tool_entities import ToolInvokeMessage
from core.tools.provider.builtin.baidu_translate._baidu_translate_tool_base import BaiduTranslateToolBase
from core.tools.tool.builtin_tool import BuiltinTool
class BaiduTranslateTool(BuiltinTool, BaiduTranslateToolBase):
def _invoke(
self,
user_id: str,
tool_parameters: dict[str, Any],
) -> Union[ToolInvokeMessage, list[ToolInvokeMessage]]:
"""
invoke tools
"""
BAIDU_TRANSLATE_URL = "https://fanyi-api.baidu.com/api/trans/vip/translate"
appid = self.runtime.credentials.get("appid", "")
if not appid:
raise ValueError("invalid baidu translate appid")
secret = self.runtime.credentials.get("secret", "")
if not secret:
raise ValueError("invalid baidu translate secret")
q = tool_parameters.get("q", "")
if not q:
raise ValueError("Please input text to translate")
from_ = tool_parameters.get("from", "")
if not from_:
raise ValueError("Please select source language")
to = tool_parameters.get("to", "")
if not to:
raise ValueError("Please select destination language")
salt = str(random.randint(32768, 16777215))
sign = self._get_sign(appid, secret, salt, q)
headers = {"Content-Type": "application/x-www-form-urlencoded"}
params = {
"q": q,
"from": from_,
"to": to,
"appid": appid,
"salt": salt,
"sign": sign,
}
try:
response = requests.post(BAIDU_TRANSLATE_URL, params=params, headers=headers)
result = response.json()
if "trans_result" in result:
result_text = result["trans_result"][0]["dst"]
else:
result_text = f'{result["error_code"]}: {result["error_msg"]}'
return self.create_text_message(str(result_text))
except requests.RequestException as e:
raise ValueError(f"Translation service error: {e}")
except Exception:
raise ValueError("Translation service error, please check the network")

View File

@@ -0,0 +1,275 @@
identity:
name: translate
author: Xiao Ley
label:
en_US: Translate
zh_Hans: 百度翻译
description:
human:
en_US: A tool for Baidu Translate
zh_Hans: 百度翻译
llm: A tool for Baidu Translate
parameters:
- name: q
type: string
required: true
label:
en_US: Text content
zh_Hans: 文本内容
human_description:
en_US: Text content to be translated
zh_Hans: 需要翻译的文本内容
llm_description: Text content to be translated
form: llm
- name: from
type: select
required: true
label:
en_US: source language
zh_Hans: 源语言
human_description:
en_US: The source language of the input text
zh_Hans: 输入的文本的源语言
default: auto
form: form
options:
- value: auto
label:
en_US: auto
zh_Hans: 自动检测
- value: zh
label:
en_US: Chinese
zh_Hans: 中文
- value: en
label:
en_US: English
zh_Hans: 英语
- value: cht
label:
en_US: Traditional Chinese
zh_Hans: 繁体中文
- value: yue
label:
        en_US: Cantonese
zh_Hans: 粤语
- value: wyw
label:
        en_US: Classical Chinese
zh_Hans: 文言文
- value: jp
label:
en_US: Japanese
zh_Hans: 日语
- value: kor
label:
en_US: Korean
zh_Hans: 韩语
- value: fra
label:
en_US: French
zh_Hans: 法语
- value: spa
label:
en_US: Spanish
zh_Hans: 西班牙语
- value: th
label:
en_US: Thai
zh_Hans: 泰语
- value: ara
label:
en_US: Arabic
zh_Hans: 阿拉伯语
- value: ru
label:
en_US: Russian
zh_Hans: 俄语
- value: pt
label:
en_US: Portuguese
zh_Hans: 葡萄牙语
- value: de
label:
en_US: German
zh_Hans: 德语
- value: it
label:
en_US: Italian
zh_Hans: 意大利语
- value: el
label:
en_US: Greek
zh_Hans: 希腊语
- value: nl
label:
en_US: Dutch
zh_Hans: 荷兰语
- value: pl
label:
en_US: Polish
zh_Hans: 波兰语
- value: bul
label:
en_US: Bulgarian
zh_Hans: 保加利亚语
- value: est
label:
en_US: Estonian
zh_Hans: 爱沙尼亚语
- value: dan
label:
en_US: Danish
zh_Hans: 丹麦语
- value: fin
label:
en_US: Finnish
zh_Hans: 芬兰语
- value: cs
label:
en_US: Czech
zh_Hans: 捷克语
- value: rom
label:
en_US: Romanian
zh_Hans: 罗马尼亚语
- value: slo
label:
        en_US: Slovenian
zh_Hans: 斯洛文尼亚语
- value: swe
label:
en_US: Swedish
zh_Hans: 瑞典语
- value: hu
label:
en_US: Hungarian
zh_Hans: 匈牙利语
- value: vie
label:
en_US: Vietnamese
zh_Hans: 越南语
- name: to
type: select
required: true
label:
en_US: destination language
zh_Hans: 目标语言
human_description:
en_US: The destination language of the input text
zh_Hans: 输入文本的目标语言
default: en
form: form
options:
- value: zh
label:
en_US: Chinese
zh_Hans: 中文
- value: en
label:
en_US: English
zh_Hans: 英语
- value: cht
label:
en_US: Traditional Chinese
zh_Hans: 繁体中文
- value: yue
label:
        en_US: Cantonese
zh_Hans: 粤语
- value: wyw
label:
        en_US: Classical Chinese
zh_Hans: 文言文
- value: jp
label:
en_US: Japanese
zh_Hans: 日语
- value: kor
label:
en_US: Korean
zh_Hans: 韩语
- value: fra
label:
en_US: French
zh_Hans: 法语
- value: spa
label:
en_US: Spanish
zh_Hans: 西班牙语
- value: th
label:
en_US: Thai
zh_Hans: 泰语
- value: ara
label:
en_US: Arabic
zh_Hans: 阿拉伯语
- value: ru
label:
en_US: Russian
zh_Hans: 俄语
- value: pt
label:
en_US: Portuguese
zh_Hans: 葡萄牙语
- value: de
label:
en_US: German
zh_Hans: 德语
- value: it
label:
en_US: Italian
zh_Hans: 意大利语
- value: el
label:
en_US: Greek
zh_Hans: 希腊语
- value: nl
label:
en_US: Dutch
zh_Hans: 荷兰语
- value: pl
label:
en_US: Polish
zh_Hans: 波兰语
- value: bul
label:
en_US: Bulgarian
zh_Hans: 保加利亚语
- value: est
label:
en_US: Estonian
zh_Hans: 爱沙尼亚语
- value: dan
label:
en_US: Danish
zh_Hans: 丹麦语
- value: fin
label:
en_US: Finnish
zh_Hans: 芬兰语
- value: cs
label:
en_US: Czech
zh_Hans: 捷克语
- value: rom
label:
en_US: Romanian
zh_Hans: 罗马尼亚语
- value: slo
label:
        en_US: Slovenian
zh_Hans: 斯洛文尼亚语
- value: swe
label:
en_US: Swedish
zh_Hans: 瑞典语
- value: hu
label:
en_US: Hungarian
zh_Hans: 匈牙利语
- value: vie
label:
en_US: Vietnamese
zh_Hans: 越南语

View File

@@ -1 +0,0 @@
VECTORIZER_ICON_PNG = "iVBORw0KGgoAAAANSUhEUgAAAGAAAABgCAYAAADimHc4AAAACXBIWXMAACxLAAAsSwGlPZapAAAAAXNSR0IArs4c6QAAAARnQU1BAACxjwv8YQUAAAboSURBVHgB7Z09bBxFFMffRoAvcQqbguBUxu4wCUikMCZ0TmQK4NLQJCJOlQIkokgEGhQ7NCFIKEhQuIqNnIaGMxRY2GVwmlggDHS+pIHELmIXMTEULPP3eeXz7e7szO7MvE1ufpKV03nuNn7/mfcxH7tEHo/H42lXgqwG1bGw65+/aTQM6K0gpJdCoi7ypCIMui5s9Qv9R1OVTqrVxoL1jPbpvH4hrIp/rnmj5+YOhTQ++1kwmdZgT9ovRi6EF4Xhv/XGL0Sv6OLXYMu0BokjYOSDcBQfJI8xhKFP/HAlqCW8v5vqubBr8yn6maCexxiIDR376LnWmBBzQZtPEvx+L3mMAleOZKb1/XgM2EOnyWMFZJKt78UEQKpJHisk2TYmgM967JFk2z3kYcULwIwXgBkvADNeAGa8AMw8Qcwc6N55/eAh0cYmGaOzQtR/kOhQX+M6+/c23r+3RlT/i2ipTrSyRqw4F+CwMMbgANHQwG7jRywLw/wqDDNzI79xYPjqa2L262jjtYzaT0QT3xEbsck4MXUakgWOvUx08liy0ZPYEKNhel4Y6AZpgR7/8Tvq1wEQ+sMJN6Nh9kqwy+bWYwAM8elZovNv6xmlU7iLs280RNO9ls51os/h/8eBVQEig8Dt5OXUsNrno2tluZw0cI3qUXKONQHy9sYkVHqnjntLA2LnFTAv1gSA+zBhfIDvkfVO/B4xRgWZn4fbe2WAnGJFAAxn03+I7PtUXdzE90Sjl4ne+6L4d5nCigAyYyHPn7tFdPN30uJwX/qI6jtISkQZFVLdhd9SrtNPTrFSB6QZBAaYntsptpAyfvk+KYOCamVR/XrNtLqepduiFnkh3g4iIw6YLAhlOJmKwB9zaarhApr/MPREjAZVisSU1s/KYsGzhmKXClYEWLm/8xpV7btXhcv5I7lt2vtJFA3q/T07r1HopdG5l5xhxQVdn28YFn8kBJCBOZmiPHio1m5QuJzlu9ntXApgZwSsNYJslvGjtjrfm8Sq4neceFUtz3dZCzwW09Gqo2hreuPN7HZRnNqa1BP1x8lhczVNK+zT0TqkjYAF4e7Okxoo2PZX5K4IrhNpb/P8FTK2S1+TcUq1HpBFmquJYo1qEYU6RVarJE0c2ooL7C5IRwBZ5nJ9joyRtk5hA3YBdHqWzG1gBKgE/bzMaK5LqMIugKrbUDHu59/YWVRBsWhrsYZdANV5HBUXYGNlC9dFBW8LdgH6FQVYUnQvkQgm3NH8YuO7bM4LsWZBfT3qRY9OxRyJgJRz+Ij+FDPEQ1C3GVMiWAVQ7f31u/ncytxi4wdZTbRGgdcHnpYLD/FcwSrAoOKizfKfVAiIF4kBMPK+Opfe1iWsMUB1BJh2BRgBabSNAOiFqkXYbcNFUF9P+u82FGdWTcEmgGrvh0FUppB1kC073muXEaDq/21kIjLxV9tFAC7/n5X6tkUM0PH/dcP+P0v41fvkFBYBVHs/MD0CDmVsOzEdb7JgEYDT/8uq4rpj44NSjwDTc/CyzV1gxbH7Ac4F0PH/S4ZHAOaFZLiY+2nFuQA6/t9kQMTCz1CG66tbWvWS4VwAVf9vugAbel6efqrsYbKBcwFeVNz8ajobyTppw2F84FQAnfl/kwER6wJZcWdBc7e2KZwKoOP/TVakWb0f7md+kVhwOwI0BDCFyq42rt4PSiuAiRGAEXdK4ZQlV+8HTgVwefwHvR7nhbOA0FwBGDgTIM/Z3SLXUj2hOW1wR10eSrs7Ou9eTB3jo/dzuh/gTABdn35c8dhpM3BxOmeTuXs/cDoCdDY4qe7l32pbaZxL1jF+GXo/cLotBcWVTiZU3T7RMn8rHiijW9FgauP4Ef1TLdhHWgacCgAj6tYCqGKjU/DNbqxIkMYZNs7MpxmnLuhmwYJna1dbdzHjY42hDL4/wqkA6HWuDkAngRH0iYVjRkVwnoZO/0gsuLwpkw7OBcAtwlwvfESHxctmfMBSiOG0oStj4HCF7T3+RWARwIU7QK/HbWlqls52mYJtezqMj3v34C5VOveFy8Ll4QoTsJ8Txp0RsW8/Os2im2LCtSC1RIqLw3RldTVplOKkPEYDhMAPqttnune2rzTv5Y+WKdEem2ixkWqZYSeDSUp3qwIYNOrR7cBjcbOORxkvADNeAGa8AMx4AZjxAjATf5Ab0Tp5rJBk2/iD3PAwYo8Vkmyb9CjDGfLYIaCp1rdiAnT8S5PeDVkgoDuVCsWeJxwToHZ163m3Z8hjloDGk54vn5gFbT/5eZw8phifvZz8XPlA9qmRj8JRCumi+OkljzbbrvxM0qPMm9rIqY6FXZubVBUinMbzcP3jbuXA6Mh2kMx07KPJJLfj8Xg8Hg/4H+KfFYb2WM4MAAAAAElFTkSuQmCC" # noqa: E501

View File

@@ -1,11 +1,12 @@
from base64 import b64decode
from typing import Any, Union
from httpx import post
from core.file.enums import FileType
from core.file.file_manager import download
from core.tools.entities.common_entities import I18nObject
from core.tools.entities.tool_entities import ToolInvokeMessage, ToolParameter
from core.tools.errors import ToolProviderCredentialValidationError
from core.tools.provider.builtin.vectorizer.tools.test_data import VECTORIZER_ICON_PNG
from core.tools.errors import ToolParameterValidationError
from core.tools.tool.builtin_tool import BuiltinTool
@@ -16,30 +17,30 @@ class VectorizerTool(BuiltinTool):
"""
invoke tools
"""
api_key_name = self.runtime.credentials.get("api_key_name", None)
api_key_value = self.runtime.credentials.get("api_key_value", None)
api_key_name = self.runtime.credentials.get("api_key_name")
api_key_value = self.runtime.credentials.get("api_key_value")
mode = tool_parameters.get("mode", "test")
if mode == "production":
mode = "preview"
if not api_key_name or not api_key_value:
raise ToolProviderCredentialValidationError("Please input api key name and value")
# image file for workflow mode
image = tool_parameters.get("image")
if image and image.type != FileType.IMAGE:
raise ToolParameterValidationError("Not a valid image")
        image_id = tool_parameters.get("image_id", "")
        if not image_id:
            return self.create_text_message("Please input image id")
        if image_id.startswith("__test_"):
            image_binary = b64decode(VECTORIZER_ICON_PNG)
        else:
            image_binary = self.get_variable_file(self.VariableKey.IMAGE)
            if not image_binary:
                return self.create_text_message("Image not found, please request user to generate image firstly.")
        # image_id for agent mode
        image_id = tool_parameters.get("image_id", "")
        if image_id:
            image_binary = self.get_variable_file(self.VariableKey.IMAGE)
            if not image_binary:
                return self.create_text_message("Image not found, please request user to generate image firstly.")
        elif image:
            image_binary = download(image)
        else:
            raise ToolParameterValidationError("Please provide either image or image_id")
response = post(
"https://vectorizer.ai/api/v1/vectorize",
data={"mode": mode},
files={"image": image_binary},
data={"mode": mode} if mode == "test" else {},
auth=(api_key_name, api_key_value),
timeout=30,
)
@@ -59,11 +60,23 @@ class VectorizerTool(BuiltinTool):
return [
ToolParameter.get_simple_instance(
name="image_id",
llm_description=f"the image id that you want to vectorize, \
and the image id should be specified in \
llm_description=f"the image_id that you want to vectorize, \
and the image_id should be specified in \
{[i.name for i in self.list_default_image_variables()]}",
type=ToolParameter.ToolParameterType.SELECT,
required=True,
required=False,
options=[i.name for i in self.list_default_image_variables()],
)
),
ToolParameter(
name="image",
label=I18nObject(en_US="image", zh_Hans="image"),
human_description=I18nObject(
en_US="The image to be converted.",
zh_Hans="要转换的图片。",
),
type=ToolParameter.ToolParameterType.FILE,
form=ToolParameter.ToolParameterForm.LLM,
llm_description="you should not input this parameter. just input the image_id.",
required=False,
),
]

View File

@@ -4,14 +4,21 @@ identity:
label:
en_US: Vectorizer.AI
zh_Hans: Vectorizer.AI
pt_BR: Vectorizer.AI
description:
human:
en_US: Convert your PNG and JPG images to SVG vectors quickly and easily. Fully automatically. Using AI.
zh_Hans: 一个将 PNG 和 JPG 图像快速轻松地转换为 SVG 矢量图的工具。
pt_BR: Convert your PNG and JPG images to SVG vectors quickly and easily. Fully automatically. Using AI.
llm: A tool for converting images to SVG vectors. you should input the image id as the input of this tool. the image id can be got from parameters.
parameters:
- name: image
type: file
label:
en_US: image
human_description:
en_US: The image to be converted.
zh_Hans: 要转换的图片。
llm_description: you should not input this parameter. just input the image_id.
form: llm
- name: mode
type: select
required: true
@@ -20,19 +27,15 @@ parameters:
label:
en_US: production
zh_Hans: 生产模式
pt_BR: production
- value: test
label:
en_US: test
zh_Hans: 测试模式
pt_BR: test
default: test
label:
en_US: Mode
zh_Hans: 模式
pt_BR: Mode
human_description:
en_US: It is free to integrate with and test out the API in test mode, no subscription required.
zh_Hans: 在测试模式下可以免费测试API。
pt_BR: It is free to integrate with and test out the API in test mode, no subscription required.
form: form

View File

@@ -1,5 +1,7 @@
from typing import Any
from core.file import File
from core.file.enums import FileTransferMethod, FileType
from core.tools.errors import ToolProviderCredentialValidationError
from core.tools.provider.builtin.vectorizer.tools.vectorizer import VectorizerTool
from core.tools.provider.builtin_tool_provider import BuiltinToolProviderController
@@ -7,6 +9,12 @@ from core.tools.provider.builtin_tool_provider import BuiltinToolProviderControl
class VectorizerProvider(BuiltinToolProviderController):
def _validate_credentials(self, credentials: dict[str, Any]) -> None:
test_img = File(
tenant_id="__test_123",
remote_url="https://cloud.dify.ai/logo/logo-site.png",
type=FileType.IMAGE,
transfer_method=FileTransferMethod.REMOTE_URL,
)
try:
VectorizerTool().fork_tool_runtime(
runtime={
@@ -14,7 +22,7 @@ class VectorizerProvider(BuiltinToolProviderController):
}
).invoke(
user_id="",
tool_parameters={"mode": "test", "image_id": "__test_123"},
tool_parameters={"mode": "test", "image": test_img},
)
except Exception as e:
raise ToolProviderCredentialValidationError(str(e))

View File

@@ -4,11 +4,9 @@ identity:
label:
en_US: Vectorizer.AI
zh_Hans: Vectorizer.AI
pt_BR: Vectorizer.AI
description:
en_US: Convert your PNG and JPG images to SVG vectors quickly and easily. Fully automatically. Using AI.
zh_Hans: 一个将 PNG 和 JPG 图像快速轻松地转换为 SVG 矢量图的工具。
pt_BR: Convert your PNG and JPG images to SVG vectors quickly and easily. Fully automatically. Using AI.
icon: icon.png
tags:
- productivity
@@ -20,15 +18,12 @@ credentials_for_provider:
label:
en_US: Vectorizer.AI API Key name
zh_Hans: Vectorizer.AI API Key name
pt_BR: Vectorizer.AI API Key name
placeholder:
en_US: Please input your Vectorizer.AI ApiKey name
zh_Hans: 请输入你的 Vectorizer.AI ApiKey name
pt_BR: Please input your Vectorizer.AI ApiKey name
help:
en_US: Get your Vectorizer.AI API Key from Vectorizer.AI.
zh_Hans: 从 Vectorizer.AI 获取您的 Vectorizer.AI API Key。
pt_BR: Get your Vectorizer.AI API Key from Vectorizer.AI.
url: https://vectorizer.ai/api
api_key_value:
type: secret-input
@@ -36,12 +31,9 @@ credentials_for_provider:
label:
en_US: Vectorizer.AI API Key
zh_Hans: Vectorizer.AI API Key
pt_BR: Vectorizer.AI API Key
placeholder:
en_US: Please input your Vectorizer.AI ApiKey
zh_Hans: 请输入你的 Vectorizer.AI ApiKey
pt_BR: Please input your Vectorizer.AI ApiKey
help:
en_US: Get your Vectorizer.AI API Key from Vectorizer.AI.
zh_Hans: 从 Vectorizer.AI 获取您的 Vectorizer.AI API Key。
pt_BR: Get your Vectorizer.AI API Key from Vectorizer.AI.

View File

@@ -242,11 +242,15 @@ class ToolManager:
parameters = tool_entity.get_all_runtime_parameters()
for parameter in parameters:
# check file types
if parameter.type in {
ToolParameter.ToolParameterType.SYSTEM_FILES,
ToolParameter.ToolParameterType.FILE,
ToolParameter.ToolParameterType.FILES,
}:
if (
parameter.type
in {
ToolParameter.ToolParameterType.SYSTEM_FILES,
ToolParameter.ToolParameterType.FILE,
ToolParameter.ToolParameterType.FILES,
}
and parameter.required
):
raise ValueError(f"file type parameter {parameter.name} not supported in agent")
if parameter.form == ToolParameter.ToolParameterForm.FORM:

View File

@@ -153,6 +153,7 @@ class AnswerStreamGeneratorRouter:
NodeType.IF_ELSE,
NodeType.QUESTION_CLASSIFIER,
NodeType.ITERATION,
NodeType.CONVERSATION_VARIABLE_ASSIGNER,
}:
answer_dependencies[answer_node_id].append(source_node_id)
else:

View File

@@ -94,7 +94,7 @@ class Response:
@property
def is_file(self):
content_type = self.content_type
content_disposition = self.response.headers.get("Content-Disposition", "")
content_disposition = self.response.headers.get("content-disposition", "")
return "attachment" in content_disposition or (
not any(non_file in content_type for non_file in NON_FILE_CONTENT_TYPES)
@@ -103,7 +103,7 @@
@property
def content_type(self) -> str:
return self.headers.get("Content-Type", "")
return self.headers.get("content-type", "")
@property
def text(self) -> str:

View File

@@ -33,7 +33,7 @@ class Executor:
params: Mapping[str, str] | None
content: str | bytes | None
data: Mapping[str, Any] | None
files: Mapping[str, bytes] | None
files: Mapping[str, tuple[str | None, bytes, str]] | None
json: Any
headers: dict[str, str]
auth: HttpRequestNodeAuthorization
@ -141,7 +141,11 @@ class Executor:
files = {k: self.variable_pool.get_file(selector) for k, selector in file_selectors.items()}
files = {k: v for k, v in files.items() if v is not None}
files = {k: variable.value for k, variable in files.items()}
files = {k: file_manager.download(v) for k, v in files.items() if v.related_id is not None}
files = {
k: (v.filename, file_manager.download(v), v.mime_type or "application/octet-stream")
for k, v in files.items()
if v.related_id is not None
}
self.data = form_data
self.files = files
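Note: the widened files type matches httpx's multipart convention, where each value may be a (filename, content, content_type) tuple rather than bare bytes, so uploads keep their original name and MIME type. A minimal sketch under that assumption (URL and payload are hypothetical):

    import httpx

    files = {"image": ("cat.png", b"\x89PNG\r\n", "image/png")}
    # Building the Request assembles the multipart body without sending it.
    request = httpx.Request("POST", "https://example.com/upload", files=files)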

View File

@ -142,10 +142,11 @@ class HttpRequestNode(BaseNode[HttpRequestNodeData]):
Extract files from response
"""
files = []
is_file = response.is_file
content_type = response.content_type
content = response.content
if content_type:
if is_file and content_type:
# extract filename from url
filename = path.basename(url)
# extract extension if possible
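Note: a quick sketch of the filename/extension recovery used above, with a hypothetical URL; mimetypes serves as the fallback when the URL path carries no usable extension:

    import mimetypes
    from os import path

    url = "https://example.com/files/report.pdf"  # hypothetical
    filename = path.basename(url)  # -> "report.pdf"
    extension = path.splitext(filename)[1] or (mimetypes.guess_extension("application/pdf") or ".bin")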

View File

@ -327,7 +327,7 @@ class LLMNode(BaseNode[LLMNodeData]):
if variable is None:
raise ValueError(f"Variable {variable_selector.variable} not found")
if isinstance(variable, NoneSegment):
continue
inputs[variable_selector.variable] = ""
inputs[variable_selector.variable] = variable.to_object()
memory = node_data.memory

View File

@ -1,8 +1,10 @@
import logging
import os
import sys
from datetime import datetime
from logging.handlers import RotatingFileHandler
import pytz
from flask import Flask
from configs import dify_config
@ -30,16 +32,10 @@ def init_app(app: Flask):
handlers=log_handlers,
force=True,
)
log_tz = dify_config.LOG_TZ
if log_tz:
from datetime import datetime
import pytz
timezone = pytz.timezone(log_tz)
def time_converter(seconds):
return datetime.utcfromtimestamp(seconds).astimezone(timezone).timetuple()
for handler in logging.root.handlers:
handler.formatter.converter = time_converter
handler.formatter.converter = lambda seconds: (
datetime.fromtimestamp(seconds, tz=pytz.UTC).astimezone(log_tz).timetuple()
)
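Note: logging.Formatter builds timestamps by calling its converter attribute with the record's epoch seconds, so swapping the converter localizes every handler's output. A minimal sketch assuming a pytz timezone name (the zone string is illustrative):

    import logging
    from datetime import datetime

    import pytz

    tz = pytz.timezone("Asia/Shanghai")  # hypothetical LOG_TZ value
    handler = logging.StreamHandler()
    handler.setFormatter(logging.Formatter("%(asctime)s %(message)s"))
    # formatTime() calls converter(record.created) to produce the time tuple.
    handler.formatter.converter = lambda seconds: (
        datetime.fromtimestamp(seconds, tz=pytz.UTC).astimezone(tz).timetuple()
    )
    logging.getLogger().addHandler(handler)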

api/poetry.lock (generated, 120 changed lines)
View File

@ -1,4 +1,4 @@
# This file is automatically @generated by Poetry 1.8.3 and should not be changed by hand.
# This file is automatically @generated by Poetry 1.8.2 and should not be changed by hand.
[[package]]
name = "aiohappyeyeballs"
@ -932,6 +932,10 @@ files = [
{file = "Brotli-1.1.0-cp310-cp310-musllinux_1_1_i686.whl", hash = "sha256:a37b8f0391212d29b3a91a799c8e4a2855e0576911cdfb2515487e30e322253d"},
{file = "Brotli-1.1.0-cp310-cp310-musllinux_1_1_ppc64le.whl", hash = "sha256:e84799f09591700a4154154cab9787452925578841a94321d5ee8fb9a9a328f0"},
{file = "Brotli-1.1.0-cp310-cp310-musllinux_1_1_x86_64.whl", hash = "sha256:f66b5337fa213f1da0d9000bc8dc0cb5b896b726eefd9c6046f699b169c41b9e"},
{file = "Brotli-1.1.0-cp310-cp310-musllinux_1_2_aarch64.whl", hash = "sha256:5dab0844f2cf82be357a0eb11a9087f70c5430b2c241493fc122bb6f2bb0917c"},
{file = "Brotli-1.1.0-cp310-cp310-musllinux_1_2_i686.whl", hash = "sha256:e4fe605b917c70283db7dfe5ada75e04561479075761a0b3866c081d035b01c1"},
{file = "Brotli-1.1.0-cp310-cp310-musllinux_1_2_ppc64le.whl", hash = "sha256:1e9a65b5736232e7a7f91ff3d02277f11d339bf34099a56cdab6a8b3410a02b2"},
{file = "Brotli-1.1.0-cp310-cp310-musllinux_1_2_x86_64.whl", hash = "sha256:58d4b711689366d4a03ac7957ab8c28890415e267f9b6589969e74b6e42225ec"},
{file = "Brotli-1.1.0-cp310-cp310-win32.whl", hash = "sha256:be36e3d172dc816333f33520154d708a2657ea63762ec16b62ece02ab5e4daf2"},
{file = "Brotli-1.1.0-cp310-cp310-win_amd64.whl", hash = "sha256:0c6244521dda65ea562d5a69b9a26120769b7a9fb3db2fe9545935ed6735b128"},
{file = "Brotli-1.1.0-cp311-cp311-macosx_10_9_universal2.whl", hash = "sha256:a3daabb76a78f829cafc365531c972016e4aa8d5b4bf60660ad8ecee19df7ccc"},
@ -944,8 +948,14 @@ files = [
{file = "Brotli-1.1.0-cp311-cp311-musllinux_1_1_i686.whl", hash = "sha256:19c116e796420b0cee3da1ccec3b764ed2952ccfcc298b55a10e5610ad7885f9"},
{file = "Brotli-1.1.0-cp311-cp311-musllinux_1_1_ppc64le.whl", hash = "sha256:510b5b1bfbe20e1a7b3baf5fed9e9451873559a976c1a78eebaa3b86c57b4265"},
{file = "Brotli-1.1.0-cp311-cp311-musllinux_1_1_x86_64.whl", hash = "sha256:a1fd8a29719ccce974d523580987b7f8229aeace506952fa9ce1d53a033873c8"},
{file = "Brotli-1.1.0-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:c247dd99d39e0338a604f8c2b3bc7061d5c2e9e2ac7ba9cc1be5a69cb6cd832f"},
{file = "Brotli-1.1.0-cp311-cp311-musllinux_1_2_i686.whl", hash = "sha256:1b2c248cd517c222d89e74669a4adfa5577e06ab68771a529060cf5a156e9757"},
{file = "Brotli-1.1.0-cp311-cp311-musllinux_1_2_ppc64le.whl", hash = "sha256:2a24c50840d89ded6c9a8fdc7b6ed3692ed4e86f1c4a4a938e1e92def92933e0"},
{file = "Brotli-1.1.0-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:f31859074d57b4639318523d6ffdca586ace54271a73ad23ad021acd807eb14b"},
{file = "Brotli-1.1.0-cp311-cp311-win32.whl", hash = "sha256:39da8adedf6942d76dc3e46653e52df937a3c4d6d18fdc94a7c29d263b1f5b50"},
{file = "Brotli-1.1.0-cp311-cp311-win_amd64.whl", hash = "sha256:aac0411d20e345dc0920bdec5548e438e999ff68d77564d5e9463a7ca9d3e7b1"},
{file = "Brotli-1.1.0-cp312-cp312-macosx_10_13_universal2.whl", hash = "sha256:32d95b80260d79926f5fab3c41701dbb818fde1c9da590e77e571eefd14abe28"},
{file = "Brotli-1.1.0-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:b760c65308ff1e462f65d69c12e4ae085cff3b332d894637f6273a12a482d09f"},
{file = "Brotli-1.1.0-cp312-cp312-macosx_10_9_universal2.whl", hash = "sha256:316cc9b17edf613ac76b1f1f305d2a748f1b976b033b049a6ecdfd5612c70409"},
{file = "Brotli-1.1.0-cp312-cp312-macosx_10_9_x86_64.whl", hash = "sha256:caf9ee9a5775f3111642d33b86237b05808dafcd6268faa492250e9b78046eb2"},
{file = "Brotli-1.1.0-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:70051525001750221daa10907c77830bc889cb6d865cc0b813d9db7fefc21451"},
@ -956,8 +966,24 @@ files = [
{file = "Brotli-1.1.0-cp312-cp312-musllinux_1_1_i686.whl", hash = "sha256:4093c631e96fdd49e0377a9c167bfd75b6d0bad2ace734c6eb20b348bc3ea180"},
{file = "Brotli-1.1.0-cp312-cp312-musllinux_1_1_ppc64le.whl", hash = "sha256:7e4c4629ddad63006efa0ef968c8e4751c5868ff0b1c5c40f76524e894c50248"},
{file = "Brotli-1.1.0-cp312-cp312-musllinux_1_1_x86_64.whl", hash = "sha256:861bf317735688269936f755fa136a99d1ed526883859f86e41a5d43c61d8966"},
{file = "Brotli-1.1.0-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:87a3044c3a35055527ac75e419dfa9f4f3667a1e887ee80360589eb8c90aabb9"},
{file = "Brotli-1.1.0-cp312-cp312-musllinux_1_2_i686.whl", hash = "sha256:c5529b34c1c9d937168297f2c1fde7ebe9ebdd5e121297ff9c043bdb2ae3d6fb"},
{file = "Brotli-1.1.0-cp312-cp312-musllinux_1_2_ppc64le.whl", hash = "sha256:ca63e1890ede90b2e4454f9a65135a4d387a4585ff8282bb72964fab893f2111"},
{file = "Brotli-1.1.0-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:e79e6520141d792237c70bcd7a3b122d00f2613769ae0cb61c52e89fd3443839"},
{file = "Brotli-1.1.0-cp312-cp312-win32.whl", hash = "sha256:5f4d5ea15c9382135076d2fb28dde923352fe02951e66935a9efaac8f10e81b0"},
{file = "Brotli-1.1.0-cp312-cp312-win_amd64.whl", hash = "sha256:906bc3a79de8c4ae5b86d3d75a8b77e44404b0f4261714306e3ad248d8ab0951"},
{file = "Brotli-1.1.0-cp313-cp313-macosx_10_13_universal2.whl", hash = "sha256:8bf32b98b75c13ec7cf774164172683d6e7891088f6316e54425fde1efc276d5"},
{file = "Brotli-1.1.0-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:7bc37c4d6b87fb1017ea28c9508b36bbcb0c3d18b4260fcdf08b200c74a6aee8"},
{file = "Brotli-1.1.0-cp313-cp313-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:3c0ef38c7a7014ffac184db9e04debe495d317cc9c6fb10071f7fefd93100a4f"},
{file = "Brotli-1.1.0-cp313-cp313-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:91d7cc2a76b5567591d12c01f019dd7afce6ba8cba6571187e21e2fc418ae648"},
{file = "Brotli-1.1.0-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:a93dde851926f4f2678e704fadeb39e16c35d8baebd5252c9fd94ce8ce68c4a0"},
{file = "Brotli-1.1.0-cp313-cp313-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:f0db75f47be8b8abc8d9e31bc7aad0547ca26f24a54e6fd10231d623f183d089"},
{file = "Brotli-1.1.0-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:6967ced6730aed543b8673008b5a391c3b1076d834ca438bbd70635c73775368"},
{file = "Brotli-1.1.0-cp313-cp313-musllinux_1_2_i686.whl", hash = "sha256:7eedaa5d036d9336c95915035fb57422054014ebdeb6f3b42eac809928e40d0c"},
{file = "Brotli-1.1.0-cp313-cp313-musllinux_1_2_ppc64le.whl", hash = "sha256:d487f5432bf35b60ed625d7e1b448e2dc855422e87469e3f450aa5552b0eb284"},
{file = "Brotli-1.1.0-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:832436e59afb93e1836081a20f324cb185836c617659b07b129141a8426973c7"},
{file = "Brotli-1.1.0-cp313-cp313-win32.whl", hash = "sha256:43395e90523f9c23a3d5bdf004733246fba087f2948f87ab28015f12359ca6a0"},
{file = "Brotli-1.1.0-cp313-cp313-win_amd64.whl", hash = "sha256:9011560a466d2eb3f5a6e4929cf4a09be405c64154e12df0dd72713f6500e32b"},
{file = "Brotli-1.1.0-cp36-cp36m-macosx_10_9_x86_64.whl", hash = "sha256:a090ca607cbb6a34b0391776f0cb48062081f5f60ddcce5d11838e67a01928d1"},
{file = "Brotli-1.1.0-cp36-cp36m-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:2de9d02f5bda03d27ede52e8cfe7b865b066fa49258cbab568720aa5be80a47d"},
{file = "Brotli-1.1.0-cp36-cp36m-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:2333e30a5e00fe0fe55903c8832e08ee9c3b1382aacf4db26664a16528d51b4b"},
@ -967,6 +993,10 @@ files = [
{file = "Brotli-1.1.0-cp36-cp36m-musllinux_1_1_i686.whl", hash = "sha256:fd5f17ff8f14003595ab414e45fce13d073e0762394f957182e69035c9f3d7c2"},
{file = "Brotli-1.1.0-cp36-cp36m-musllinux_1_1_ppc64le.whl", hash = "sha256:069a121ac97412d1fe506da790b3e69f52254b9df4eb665cd42460c837193354"},
{file = "Brotli-1.1.0-cp36-cp36m-musllinux_1_1_x86_64.whl", hash = "sha256:e93dfc1a1165e385cc8239fab7c036fb2cd8093728cbd85097b284d7b99249a2"},
{file = "Brotli-1.1.0-cp36-cp36m-musllinux_1_2_aarch64.whl", hash = "sha256:aea440a510e14e818e67bfc4027880e2fb500c2ccb20ab21c7a7c8b5b4703d75"},
{file = "Brotli-1.1.0-cp36-cp36m-musllinux_1_2_i686.whl", hash = "sha256:6974f52a02321b36847cd19d1b8e381bf39939c21efd6ee2fc13a28b0d99348c"},
{file = "Brotli-1.1.0-cp36-cp36m-musllinux_1_2_ppc64le.whl", hash = "sha256:a7e53012d2853a07a4a79c00643832161a910674a893d296c9f1259859a289d2"},
{file = "Brotli-1.1.0-cp36-cp36m-musllinux_1_2_x86_64.whl", hash = "sha256:d7702622a8b40c49bffb46e1e3ba2e81268d5c04a34f460978c6b5517a34dd52"},
{file = "Brotli-1.1.0-cp36-cp36m-win32.whl", hash = "sha256:a599669fd7c47233438a56936988a2478685e74854088ef5293802123b5b2460"},
{file = "Brotli-1.1.0-cp36-cp36m-win_amd64.whl", hash = "sha256:d143fd47fad1db3d7c27a1b1d66162e855b5d50a89666af46e1679c496e8e579"},
{file = "Brotli-1.1.0-cp37-cp37m-macosx_10_9_x86_64.whl", hash = "sha256:11d00ed0a83fa22d29bc6b64ef636c4552ebafcef57154b4ddd132f5638fbd1c"},
@ -978,6 +1008,10 @@ files = [
{file = "Brotli-1.1.0-cp37-cp37m-musllinux_1_1_i686.whl", hash = "sha256:919e32f147ae93a09fe064d77d5ebf4e35502a8df75c29fb05788528e330fe74"},
{file = "Brotli-1.1.0-cp37-cp37m-musllinux_1_1_ppc64le.whl", hash = "sha256:23032ae55523cc7bccb4f6a0bf368cd25ad9bcdcc1990b64a647e7bbcce9cb5b"},
{file = "Brotli-1.1.0-cp37-cp37m-musllinux_1_1_x86_64.whl", hash = "sha256:224e57f6eac61cc449f498cc5f0e1725ba2071a3d4f48d5d9dffba42db196438"},
{file = "Brotli-1.1.0-cp37-cp37m-musllinux_1_2_aarch64.whl", hash = "sha256:cb1dac1770878ade83f2ccdf7d25e494f05c9165f5246b46a621cc849341dc01"},
{file = "Brotli-1.1.0-cp37-cp37m-musllinux_1_2_i686.whl", hash = "sha256:3ee8a80d67a4334482d9712b8e83ca6b1d9bc7e351931252ebef5d8f7335a547"},
{file = "Brotli-1.1.0-cp37-cp37m-musllinux_1_2_ppc64le.whl", hash = "sha256:5e55da2c8724191e5b557f8e18943b1b4839b8efc3ef60d65985bcf6f587dd38"},
{file = "Brotli-1.1.0-cp37-cp37m-musllinux_1_2_x86_64.whl", hash = "sha256:d342778ef319e1026af243ed0a07c97acf3bad33b9f29e7ae6a1f68fd083e90c"},
{file = "Brotli-1.1.0-cp37-cp37m-win32.whl", hash = "sha256:587ca6d3cef6e4e868102672d3bd9dc9698c309ba56d41c2b9c85bbb903cdb95"},
{file = "Brotli-1.1.0-cp37-cp37m-win_amd64.whl", hash = "sha256:2954c1c23f81c2eaf0b0717d9380bd348578a94161a65b3a2afc62c86467dd68"},
{file = "Brotli-1.1.0-cp38-cp38-macosx_10_9_universal2.whl", hash = "sha256:efa8b278894b14d6da122a72fefcebc28445f2d3f880ac59d46c90f4c13be9a3"},
@ -990,6 +1024,10 @@ files = [
{file = "Brotli-1.1.0-cp38-cp38-musllinux_1_1_i686.whl", hash = "sha256:1ab4fbee0b2d9098c74f3057b2bc055a8bd92ccf02f65944a241b4349229185a"},
{file = "Brotli-1.1.0-cp38-cp38-musllinux_1_1_ppc64le.whl", hash = "sha256:141bd4d93984070e097521ed07e2575b46f817d08f9fa42b16b9b5f27b5ac088"},
{file = "Brotli-1.1.0-cp38-cp38-musllinux_1_1_x86_64.whl", hash = "sha256:fce1473f3ccc4187f75b4690cfc922628aed4d3dd013d047f95a9b3919a86596"},
{file = "Brotli-1.1.0-cp38-cp38-musllinux_1_2_aarch64.whl", hash = "sha256:d2b35ca2c7f81d173d2fadc2f4f31e88cc5f7a39ae5b6db5513cf3383b0e0ec7"},
{file = "Brotli-1.1.0-cp38-cp38-musllinux_1_2_i686.whl", hash = "sha256:af6fa6817889314555aede9a919612b23739395ce767fe7fcbea9a80bf140fe5"},
{file = "Brotli-1.1.0-cp38-cp38-musllinux_1_2_ppc64le.whl", hash = "sha256:2feb1d960f760a575dbc5ab3b1c00504b24caaf6986e2dc2b01c09c87866a943"},
{file = "Brotli-1.1.0-cp38-cp38-musllinux_1_2_x86_64.whl", hash = "sha256:4410f84b33374409552ac9b6903507cdb31cd30d2501fc5ca13d18f73548444a"},
{file = "Brotli-1.1.0-cp38-cp38-win32.whl", hash = "sha256:db85ecf4e609a48f4b29055f1e144231b90edc90af7481aa731ba2d059226b1b"},
{file = "Brotli-1.1.0-cp38-cp38-win_amd64.whl", hash = "sha256:3d7954194c36e304e1523f55d7042c59dc53ec20dd4e9ea9d151f1b62b4415c0"},
{file = "Brotli-1.1.0-cp39-cp39-macosx_10_9_universal2.whl", hash = "sha256:5fb2ce4b8045c78ebbc7b8f3c15062e435d47e7393cc57c25115cfd49883747a"},
@ -1002,6 +1040,10 @@ files = [
{file = "Brotli-1.1.0-cp39-cp39-musllinux_1_1_i686.whl", hash = "sha256:949f3b7c29912693cee0afcf09acd6ebc04c57af949d9bf77d6101ebb61e388c"},
{file = "Brotli-1.1.0-cp39-cp39-musllinux_1_1_ppc64le.whl", hash = "sha256:89f4988c7203739d48c6f806f1e87a1d96e0806d44f0fba61dba81392c9e474d"},
{file = "Brotli-1.1.0-cp39-cp39-musllinux_1_1_x86_64.whl", hash = "sha256:de6551e370ef19f8de1807d0a9aa2cdfdce2e85ce88b122fe9f6b2b076837e59"},
{file = "Brotli-1.1.0-cp39-cp39-musllinux_1_2_aarch64.whl", hash = "sha256:0737ddb3068957cf1b054899b0883830bb1fec522ec76b1098f9b6e0f02d9419"},
{file = "Brotli-1.1.0-cp39-cp39-musllinux_1_2_i686.whl", hash = "sha256:4f3607b129417e111e30637af1b56f24f7a49e64763253bbc275c75fa887d4b2"},
{file = "Brotli-1.1.0-cp39-cp39-musllinux_1_2_ppc64le.whl", hash = "sha256:6c6e0c425f22c1c719c42670d561ad682f7bfeeef918edea971a79ac5252437f"},
{file = "Brotli-1.1.0-cp39-cp39-musllinux_1_2_x86_64.whl", hash = "sha256:494994f807ba0b92092a163a0a283961369a65f6cbe01e8891132b7a320e61eb"},
{file = "Brotli-1.1.0-cp39-cp39-win32.whl", hash = "sha256:f0d8a7a6b5983c2496e364b969f0e526647a06b075d034f3297dc66f3b360c64"},
{file = "Brotli-1.1.0-cp39-cp39-win_amd64.whl", hash = "sha256:cdad5b9014d83ca68c25d2e9444e28e967ef16e80f6b436918c700c117a85467"},
{file = "Brotli-1.1.0.tar.gz", hash = "sha256:81de08ac11bcb85841e440c13611c00b67d3bf82698314928d0b676362546724"},
@ -1801,6 +1843,46 @@ requests = ">=2.8"
six = "*"
xmltodict = "*"
[[package]]
name = "couchbase"
version = "4.3.3"
description = "Python Client for Couchbase"
optional = false
python-versions = ">=3.7"
files = [
{file = "couchbase-4.3.3-cp310-cp310-macosx_10_15_x86_64.whl", hash = "sha256:d8069e4f01332859d56cca597874645c914699162b3979d1b432f0dfc186b124"},
{file = "couchbase-4.3.3-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:1caa6cfef49c785b35b1702102f718227f351df87bba2694b9334520c41e9eb5"},
{file = "couchbase-4.3.3-cp310-cp310-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:f4a9a65c44935249fa078fb90a3c28ea71da9d2d5889fcd514b12d0538010ae0"},
{file = "couchbase-4.3.3-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:4f144b8c482c18283d8e419b844630d41f3249b07d43d40b5e3535444e57d0fb"},
{file = "couchbase-4.3.3-cp310-cp310-musllinux_1_1_x86_64.whl", hash = "sha256:1c534fba6fdc7cf47eed9dee8a57d1e9eb867bf008574e321fa380a77cebf32f"},
{file = "couchbase-4.3.3-cp310-cp310-win_amd64.whl", hash = "sha256:b841be06e0e4370b69ebef6bca3409c378186f7d6e964cd645ba18e97216c022"},
{file = "couchbase-4.3.3-cp311-cp311-macosx_10_15_x86_64.whl", hash = "sha256:eee7a73b3acbdc78ae314fddf7f975b3c9e05df07df255f4dcc878939a2abae0"},
{file = "couchbase-4.3.3-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:53417cafcf90ff4e2fd81ebba2a08b7ad56f17160d1c5019ad3b09c758aeb363"},
{file = "couchbase-4.3.3-cp311-cp311-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:0cefd13bea8b0f150f1b9d27fd7614f971f77419b31817781d26ba315ed658bb"},
{file = "couchbase-4.3.3-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:78fa1054d7740e2fe38fce0a2aab4e9a2d30263d894e0615ee5df297f02f59a3"},
{file = "couchbase-4.3.3-cp311-cp311-musllinux_1_1_x86_64.whl", hash = "sha256:eb093899cfad5a7472258a9b6a57775dbf23a6e0180241507ba89ce3ab241e41"},
{file = "couchbase-4.3.3-cp311-cp311-win_amd64.whl", hash = "sha256:f7cfbdc699af5715f49365ffbb05a6a7366a534c0d7161edf270ad3e735a6c5d"},
{file = "couchbase-4.3.3-cp312-cp312-macosx_10_15_x86_64.whl", hash = "sha256:58352cae9b8affdaa2ac012e0a03c8c2632ee6297a878232888b4e0360d0d5df"},
{file = "couchbase-4.3.3-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:728e7e3b5e1682706cb9d63993d289226d02a25089527b8ecb4e3889dabc38cf"},
{file = "couchbase-4.3.3-cp312-cp312-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:73014bf098cf14187a39cc13453e0d859c1d54568df28f69cc308a9a5f24feb2"},
{file = "couchbase-4.3.3-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:a743375804068ae01b73c916bfca738764c8c12f381bb399ef04e784935856a1"},
{file = "couchbase-4.3.3-cp312-cp312-musllinux_1_1_x86_64.whl", hash = "sha256:394c122cfe02a76a99e7d5178e64129f6da49843225e78d8629abcab556c24af"},
{file = "couchbase-4.3.3-cp312-cp312-win_amd64.whl", hash = "sha256:bf85d7a5cda548d9801614651206068b4445fa37972e62b14d7521a958198693"},
{file = "couchbase-4.3.3-cp38-cp38-macosx_10_15_x86_64.whl", hash = "sha256:92d23c9cedd571631070791f2afee0e3d7d8c9ce1bf2ea6e9a4f2fdbc37a0f1e"},
{file = "couchbase-4.3.3-cp38-cp38-macosx_11_0_arm64.whl", hash = "sha256:38c42eb29a73cce2998ae5df45bd61b16dce9765d3bff968ec5cf6a622faa291"},
{file = "couchbase-4.3.3-cp38-cp38-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:afed137bf0edc642d7b201b6ab7b1e7117bb4c8eac6b2f253cc6e106f334a2a1"},
{file = "couchbase-4.3.3-cp38-cp38-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:954d991377d47883aaf903934c5d0f19577680a2abf80d3ce5bb9b3c80991fc7"},
{file = "couchbase-4.3.3-cp38-cp38-musllinux_1_1_x86_64.whl", hash = "sha256:d5552b9fa684630698dc98d6f3b1082540634c1b7ad5bf53b843b5da57b0169c"},
{file = "couchbase-4.3.3-cp38-cp38-win_amd64.whl", hash = "sha256:f88f2b7e0c894f7237d9f3fb5c46abc44b8151a97b3ca8e75f57d23ebf59f9da"},
{file = "couchbase-4.3.3-cp39-cp39-macosx_10_15_x86_64.whl", hash = "sha256:769e1e2367ea1d4de181fcd4b4e353e9abef97d15b581a6c5aea49ece3dc7d59"},
{file = "couchbase-4.3.3-cp39-cp39-macosx_11_0_arm64.whl", hash = "sha256:47f59a0b35ffce060583fd11f98f049f3b70701cf14aab9ac092594aca486aeb"},
{file = "couchbase-4.3.3-cp39-cp39-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:440bb93d611827ba0ea2403c6f204fe931467a6cb5811f0e03bf1779204ef843"},
{file = "couchbase-4.3.3-cp39-cp39-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:cdb4dde62e1d41c0b8707121ab68fa78b7a1508541bd48fc850be396f91bc8d9"},
{file = "couchbase-4.3.3-cp39-cp39-musllinux_1_1_x86_64.whl", hash = "sha256:7f8cf45f317b39cc19db5c67b565662f08d6c90305b3aa14e04bc22707258213"},
{file = "couchbase-4.3.3-cp39-cp39-win_amd64.whl", hash = "sha256:c97d48ad486c8f201b4482d5594258f949369cb44792ed148d5159a3d12ae21b"},
{file = "couchbase-4.3.3.tar.gz", hash = "sha256:27808500551564b39b46943cf3daab572694889c1eb638425d363edb48b20da7"},
]
[[package]]
name = "coverage"
version = "7.2.7"
@ -6850,6 +6932,19 @@ files = [
{file = "pyarrow-17.0.0-cp312-cp312-win_amd64.whl", hash = "sha256:392bc9feabc647338e6c89267635e111d71edad5fcffba204425a7c8d13610d7"},
{file = "pyarrow-17.0.0-cp38-cp38-macosx_10_15_x86_64.whl", hash = "sha256:af5ff82a04b2171415f1410cff7ebb79861afc5dae50be73ce06d6e870615204"},
{file = "pyarrow-17.0.0-cp38-cp38-macosx_11_0_arm64.whl", hash = "sha256:edca18eaca89cd6382dfbcff3dd2d87633433043650c07375d095cd3517561d8"},
{file = "pyarrow-17.0.0-cp38-cp38-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:7c7916bff914ac5d4a8fe25b7a25e432ff921e72f6f2b7547d1e325c1ad9d155"},
{file = "pyarrow-17.0.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:f553ca691b9e94b202ff741bdd40f6ccb70cdd5fbf65c187af132f1317de6145"},
{file = "pyarrow-17.0.0-cp38-cp38-manylinux_2_28_aarch64.whl", hash = "sha256:0cdb0e627c86c373205a2f94a510ac4376fdc523f8bb36beab2e7f204416163c"},
{file = "pyarrow-17.0.0-cp38-cp38-manylinux_2_28_x86_64.whl", hash = "sha256:d7d192305d9d8bc9082d10f361fc70a73590a4c65cf31c3e6926cd72b76bc35c"},
{file = "pyarrow-17.0.0-cp38-cp38-win_amd64.whl", hash = "sha256:02dae06ce212d8b3244dd3e7d12d9c4d3046945a5933d28026598e9dbbda1fca"},
{file = "pyarrow-17.0.0-cp39-cp39-macosx_10_15_x86_64.whl", hash = "sha256:13d7a460b412f31e4c0efa1148e1d29bdf18ad1411eb6757d38f8fbdcc8645fb"},
{file = "pyarrow-17.0.0-cp39-cp39-macosx_11_0_arm64.whl", hash = "sha256:9b564a51fbccfab5a04a80453e5ac6c9954a9c5ef2890d1bcf63741909c3f8df"},
{file = "pyarrow-17.0.0-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:32503827abbc5aadedfa235f5ece8c4f8f8b0a3cf01066bc8d29de7539532687"},
{file = "pyarrow-17.0.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:a155acc7f154b9ffcc85497509bcd0d43efb80d6f733b0dc3bb14e281f131c8b"},
{file = "pyarrow-17.0.0-cp39-cp39-manylinux_2_28_aarch64.whl", hash = "sha256:dec8d129254d0188a49f8a1fc99e0560dc1b85f60af729f47de4046015f9b0a5"},
{file = "pyarrow-17.0.0-cp39-cp39-manylinux_2_28_x86_64.whl", hash = "sha256:a48ddf5c3c6a6c505904545c25a4ae13646ae1f8ba703c4df4a1bfe4f4006bda"},
{file = "pyarrow-17.0.0-cp39-cp39-win_amd64.whl", hash = "sha256:42bf93249a083aca230ba7e2786c5f673507fa97bbd9725a1e2754715151a204"},
{file = "pyarrow-17.0.0.tar.gz", hash = "sha256:4beca9521ed2c0921c1023e68d097d0299b62c362639ea315572a58f3f50fd28"},
]
[package.dependencies]
@ -7216,6 +7311,22 @@ files = [
ed25519 = ["PyNaCl (>=1.4.0)"]
rsa = ["cryptography"]
[[package]]
name = "pyobvector"
version = "0.1.6"
description = "A python SDK for OceanBase Vector Store, based on SQLAlchemy, compatible with Milvus API."
optional = false
python-versions = "<4.0,>=3.9"
files = [
{file = "pyobvector-0.1.6-py3-none-any.whl", hash = "sha256:0d700e865a85b4716b9a03384189e49288cd9d5f3cef88aed4740bc82d5fd136"},
{file = "pyobvector-0.1.6.tar.gz", hash = "sha256:05551addcac8c596992d5e38b480c83ca3481c6cfc6f56a1a1bddfb2e6ae037e"},
]
[package.dependencies]
numpy = ">=1.26.0,<2.0.0"
pymysql = ">=1.1.1,<2.0.0"
sqlalchemy = ">=2.0.32,<3.0.0"
[[package]]
name = "pyopenssl"
version = "24.2.1"
@ -8624,6 +8735,11 @@ files = [
{file = "scikit_learn-1.5.2-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:f60021ec1574e56632be2a36b946f8143bf4e5e6af4a06d85281adc22938e0dd"},
{file = "scikit_learn-1.5.2-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:394397841449853c2290a32050382edaec3da89e35b3e03d6cc966aebc6a8ae6"},
{file = "scikit_learn-1.5.2-cp312-cp312-win_amd64.whl", hash = "sha256:57cc1786cfd6bd118220a92ede80270132aa353647684efa385a74244a41e3b1"},
{file = "scikit_learn-1.5.2-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:e9a702e2de732bbb20d3bad29ebd77fc05a6b427dc49964300340e4c9328b3f5"},
{file = "scikit_learn-1.5.2-cp313-cp313-macosx_12_0_arm64.whl", hash = "sha256:b0768ad641981f5d3a198430a1d31c3e044ed2e8a6f22166b4d546a5116d7908"},
{file = "scikit_learn-1.5.2-cp313-cp313-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:178ddd0a5cb0044464fc1bfc4cca5b1833bfc7bb022d70b05db8530da4bb3dd3"},
{file = "scikit_learn-1.5.2-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:f7284ade780084d94505632241bf78c44ab3b6f1e8ccab3d2af58e0e950f9c12"},
{file = "scikit_learn-1.5.2-cp313-cp313-win_amd64.whl", hash = "sha256:b7b0f9a0b1040830d38c39b91b3a44e1b643f4b36e36567b80b7c6bd2202a27f"},
{file = "scikit_learn-1.5.2-cp39-cp39-macosx_10_9_x86_64.whl", hash = "sha256:757c7d514ddb00ae249832fe87100d9c73c6ea91423802872d9e74970a0e40b9"},
{file = "scikit_learn-1.5.2-cp39-cp39-macosx_12_0_arm64.whl", hash = "sha256:52788f48b5d8bca5c0736c175fa6bdaab2ef00a8f536cda698db61bd89c551c1"},
{file = "scikit_learn-1.5.2-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:643964678f4b5fbdc95cbf8aec638acc7aa70f5f79ee2cdad1eec3df4ba6ead8"},
@ -10866,4 +10982,4 @@ cffi = ["cffi (>=1.11)"]
[metadata]
lock-version = "2.0"
python-versions = ">=3.10,<3.13"
content-hash = "1b268122d3d4771ba219f0e983322e0454b7b8644dba35da38d7d950d489e1ba"
content-hash = "ef927b98c33d704d680e08db0e5c7d9a4e05454c66fcd6a5f656a65eb08e886b"

View File

@ -239,6 +239,7 @@ alibabacloud_gpdb20160503 = "~3.8.0"
alibabacloud_tea_openapi = "~0.3.9"
chromadb = "0.5.1"
clickhouse-connect = "~0.7.16"
couchbase = "~4.3.0"
elasticsearch = "8.14.0"
opensearch-py = "2.4.0"
oracledb = "~2.2.1"
@ -246,6 +247,7 @@ pgvecto-rs = { version = "~0.2.1", extras = ['sqlalchemy'] }
pgvector = "0.2.5"
pymilvus = "~2.4.4"
pymochow = "1.3.1"
pyobvector = "~0.1.6"
qdrant-client = "1.7.3"
tcvectordb = "1.3.2"
tidb-vector = "0.0.9"

View File

@ -27,3 +27,4 @@ env =
XINFERENCE_GENERATION_MODEL_UID = generate
XINFERENCE_RERANK_MODEL_UID = rerank
XINFERENCE_SERVER_URL = http://a.abc.com:11451
GITEE_AI_API_KEY = aaaaaaaaaaaaaaaaaaaa

View File

@ -6,6 +6,8 @@ from typing import Any, Optional, Union
import httpx
import validators
from constants import HIDDEN_VALUE
# from tasks.external_document_indexing_task import external_document_indexing_task
from core.helper import ssrf_proxy
from extensions.ext_database import db
@ -68,7 +70,7 @@ class ExternalDatasetService:
endpoint = f"{settings['endpoint']}/retrieval"
api_key = settings["api_key"]
if not validators.url(endpoint):
if not validators.url(endpoint, simple_host=True):
raise ValueError(f"invalid endpoint: {endpoint}")
try:
response = httpx.post(endpoint, headers={"Authorization": f"Bearer {api_key}"})
@ -92,6 +94,8 @@ class ExternalDatasetService:
).first()
if external_knowledge_api is None:
raise ValueError("api template not found")
if args.get("settings") and args.get("settings").get("api_key") == HIDDEN_VALUE:
args.get("settings")["api_key"] = external_knowledge_api.settings_dict.get("api_key")
external_knowledge_api.name = args.get("name")
external_knowledge_api.description = args.get("description", "")
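Note: the HIDDEN_VALUE check implements the usual masked-secret round trip: the API returns a placeholder instead of the stored key, and if the client echoes the placeholder back on update, the stored secret is restored rather than overwritten by the mask. A minimal sketch (the sentinel string and helper are assumptions, not the repo's exact code):

    HIDDEN_VALUE = "[__HIDDEN__]"  # assumed sentinel

    def merge_api_key(incoming: dict, stored: dict) -> dict:
        # Client sent the mask back unchanged: keep the real stored secret.
        if incoming.get("api_key") == HIDDEN_VALUE:
            incoming["api_key"] = stored.get("api_key")
        return incoming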

View File

@ -83,3 +83,6 @@ VOLC_EMBEDDING_ENDPOINT_ID=
# 360 AI Credentials
ZHINAO_API_KEY=
# Gitee AI Credentials
GITEE_AI_API_KEY=

View File

@ -0,0 +1,132 @@
import os
from collections.abc import Generator
import pytest
from core.model_runtime.entities.llm_entities import LLMResult, LLMResultChunk, LLMResultChunkDelta
from core.model_runtime.entities.message_entities import (
AssistantPromptMessage,
PromptMessageTool,
SystemPromptMessage,
UserPromptMessage,
)
from core.model_runtime.entities.model_entities import AIModelEntity
from core.model_runtime.errors.validate import CredentialsValidateFailedError
from core.model_runtime.model_providers.gitee_ai.llm.llm import GiteeAILargeLanguageModel
def test_predefined_models():
model = GiteeAILargeLanguageModel()
model_schemas = model.predefined_models()
assert len(model_schemas) >= 1
assert isinstance(model_schemas[0], AIModelEntity)
def test_validate_credentials_for_chat_model():
model = GiteeAILargeLanguageModel()
with pytest.raises(CredentialsValidateFailedError):
# model name to gpt-3.5-turbo because of mocking
model.validate_credentials(model="gpt-3.5-turbo", credentials={"api_key": "invalid_key"})
model.validate_credentials(
model="Qwen2-7B-Instruct",
credentials={"api_key": os.environ.get("GITEE_AI_API_KEY")},
)
def test_invoke_chat_model():
model = GiteeAILargeLanguageModel()
result = model.invoke(
model="Qwen2-7B-Instruct",
credentials={"api_key": os.environ.get("GITEE_AI_API_KEY")},
prompt_messages=[
SystemPromptMessage(
content="You are a helpful AI assistant.",
),
UserPromptMessage(content="Hello World!"),
],
model_parameters={
"temperature": 0.0,
"top_p": 1.0,
"presence_penalty": 0.0,
"frequency_penalty": 0.0,
"max_tokens": 10,
"stream": False,
},
stop=["How"],
stream=False,
user="foo",
)
assert isinstance(result, LLMResult)
assert len(result.message.content) > 0
def test_invoke_stream_chat_model():
model = GiteeAILargeLanguageModel()
result = model.invoke(
model="Qwen2-7B-Instruct",
credentials={"api_key": os.environ.get("GITEE_AI_API_KEY")},
prompt_messages=[
SystemPromptMessage(
content="You are a helpful AI assistant.",
),
UserPromptMessage(content="Hello World!"),
],
model_parameters={"temperature": 0.0, "max_tokens": 100, "stream": False},
stream=True,
user="foo",
)
assert isinstance(result, Generator)
for chunk in result:
assert isinstance(chunk, LLMResultChunk)
assert isinstance(chunk.delta, LLMResultChunkDelta)
assert isinstance(chunk.delta.message, AssistantPromptMessage)
assert len(chunk.delta.message.content) > 0 if chunk.delta.finish_reason is None else True
if chunk.delta.finish_reason is not None:
assert chunk.delta.usage is not None
def test_get_num_tokens():
model = GiteeAILargeLanguageModel()
num_tokens = model.get_num_tokens(
model="Qwen2-7B-Instruct",
credentials={"api_key": os.environ.get("GITEE_AI_API_KEY")},
prompt_messages=[UserPromptMessage(content="Hello World!")],
)
assert num_tokens == 10
num_tokens = model.get_num_tokens(
model="Qwen2-7B-Instruct",
credentials={"api_key": os.environ.get("GITEE_AI_API_KEY")},
prompt_messages=[
SystemPromptMessage(
content="You are a helpful AI assistant.",
),
UserPromptMessage(content="Hello World!"),
],
tools=[
PromptMessageTool(
name="get_weather",
description="Determine weather in my location",
parameters={
"type": "object",
"properties": {
"location": {"type": "string", "description": "The city and state e.g. San Francisco, CA"},
"unit": {"type": "string", "enum": ["c", "f"]},
},
"required": ["location"],
},
),
],
)
assert num_tokens == 77

View File

@ -0,0 +1,15 @@
import os
import pytest
from core.model_runtime.errors.validate import CredentialsValidateFailedError
from core.model_runtime.model_providers.gitee_ai.gitee_ai import GiteeAIProvider
def test_validate_provider_credentials():
provider = GiteeAIProvider()
with pytest.raises(CredentialsValidateFailedError):
provider.validate_provider_credentials(credentials={"api_key": "invalid_key"})
provider.validate_provider_credentials(credentials={"api_key": os.environ.get("GITEE_AI_API_KEY")})

View File

@ -0,0 +1,47 @@
import os
import pytest
from core.model_runtime.entities.rerank_entities import RerankResult
from core.model_runtime.errors.validate import CredentialsValidateFailedError
from core.model_runtime.model_providers.gitee_ai.rerank.rerank import GiteeAIRerankModel
def test_validate_credentials():
model = GiteeAIRerankModel()
with pytest.raises(CredentialsValidateFailedError):
model.validate_credentials(
model="bge-reranker-v2-m3",
credentials={"api_key": "invalid_key"},
)
model.validate_credentials(
model="bge-reranker-v2-m3",
credentials={
"api_key": os.environ.get("GITEE_AI_API_KEY"),
},
)
def test_invoke_model():
model = GiteeAIRerankModel()
result = model.invoke(
model="bge-reranker-v2-m3",
credentials={
"api_key": os.environ.get("GITEE_AI_API_KEY"),
},
query="What is the capital of the United States?",
docs=[
"Carson City is the capital city of the American state of Nevada. At the 2010 United States "
"Census, Carson City had a population of 55,274.",
"The Commonwealth of the Northern Mariana Islands is a group of islands in the Pacific Ocean that "
"are a political division controlled by the United States. Its capital is Saipan.",
],
top_n=1,
score_threshold=0.01,
)
assert isinstance(result, RerankResult)
assert len(result.docs) == 1
assert result.docs[0].score >= 0.01

View File

@ -0,0 +1,45 @@
import os
import pytest
from core.model_runtime.errors.validate import CredentialsValidateFailedError
from core.model_runtime.model_providers.gitee_ai.speech2text.speech2text import GiteeAISpeech2TextModel
def test_validate_credentials():
model = GiteeAISpeech2TextModel()
with pytest.raises(CredentialsValidateFailedError):
model.validate_credentials(
model="whisper-base",
credentials={"api_key": "invalid_key"},
)
model.validate_credentials(
model="whisper-base",
credentials={"api_key": os.environ.get("GITEE_AI_API_KEY")},
)
def test_invoke_model():
model = GiteeAISpeech2TextModel()
# Get the directory of the current file
current_dir = os.path.dirname(os.path.abspath(__file__))
# Get assets directory
assets_dir = os.path.join(os.path.dirname(current_dir), "assets")
# Construct the path to the audio file
audio_file_path = os.path.join(assets_dir, "audio.mp3")
# Open the file and get the file object
with open(audio_file_path, "rb") as audio_file:
file = audio_file
result = model.invoke(
model="whisper-base", credentials={"api_key": os.environ.get("GITEE_AI_API_KEY")}, file=file
)
assert isinstance(result, str)
assert result == "1 2 3 4 5 6 7 8 9 10"

View File

@ -0,0 +1,46 @@
import os
import pytest
from core.model_runtime.entities.text_embedding_entities import TextEmbeddingResult
from core.model_runtime.errors.validate import CredentialsValidateFailedError
from core.model_runtime.model_providers.gitee_ai.text_embedding.text_embedding import GiteeAIEmbeddingModel
def test_validate_credentials():
model = GiteeAIEmbeddingModel()
with pytest.raises(CredentialsValidateFailedError):
model.validate_credentials(model="bge-large-zh-v1.5", credentials={"api_key": "invalid_key"})
model.validate_credentials(model="bge-large-zh-v1.5", credentials={"api_key": os.environ.get("GITEE_AI_API_KEY")})
def test_invoke_model():
model = GiteeAIEmbeddingModel()
result = model.invoke(
model="bge-large-zh-v1.5",
credentials={
"api_key": os.environ.get("GITEE_AI_API_KEY"),
},
texts=["hello", "world"],
user="user",
)
assert isinstance(result, TextEmbeddingResult)
assert len(result.embeddings) == 2
def test_get_num_tokens():
model = GiteeAIEmbeddingModel()
num_tokens = model.get_num_tokens(
model="bge-large-zh-v1.5",
credentials={
"api_key": os.environ.get("GITEE_AI_API_KEY"),
},
texts=["hello", "world"],
)
assert num_tokens == 2

View File

@ -0,0 +1,23 @@
import os
from core.model_runtime.model_providers.gitee_ai.tts.tts import GiteeAIText2SpeechModel
def test_invoke_model():
model = GiteeAIText2SpeechModel()
result = model.invoke(
model="speecht5_tts",
tenant_id="test",
credentials={
"api_key": os.environ.get("GITEE_AI_API_KEY"),
},
content_text="Hello, world!",
voice="",
)
content = b""
for chunk in result:
content += chunk
assert content != b""

View File

@ -0,0 +1,50 @@
import subprocess
import time
from core.rag.datasource.vdb.couchbase.couchbase_vector import CouchbaseConfig, CouchbaseVector
from tests.integration_tests.vdb.test_vector_store import (
AbstractVectorTest,
get_example_text,
setup_mock_redis,
)
def wait_for_healthy_container(service_name="couchbase-server", timeout=300):
start_time = time.time()
while time.time() - start_time < timeout:
result = subprocess.run(
["docker", "inspect", "--format", "{{.State.Health.Status}}", service_name], capture_output=True, text=True
)
if result.stdout.strip() == "healthy":
print(f"{service_name} is healthy!")
return True
else:
print(f"Waiting for {service_name} to be healthy...")
time.sleep(10)
raise TimeoutError(f"{service_name} did not become healthy in time")
class CouchbaseTest(AbstractVectorTest):
def __init__(self):
super().__init__()
self.vector = CouchbaseVector(
collection_name=self.collection_name,
config=CouchbaseConfig(
connection_string="couchbase://127.0.0.1",
user="Administrator",
password="password",
bucket_name="Embeddings",
scope_name="_default",
),
)
def search_by_vector(self):
# brief sleep to ensure document is indexed
time.sleep(5)
hits_by_vector = self.vector.search_by_vector(query_vector=self.example_embedding)
assert len(hits_by_vector) == 1
def test_couchbase(setup_mock_redis):
wait_for_healthy_container("couchbase-server", timeout=60)
CouchbaseTest().run_all_tests()

View File

@ -0,0 +1,71 @@
from unittest.mock import MagicMock, patch
import pytest
from core.rag.datasource.vdb.oceanbase.oceanbase_vector import (
OceanBaseVector,
OceanBaseVectorConfig,
)
from tests.integration_tests.vdb.__mock.tcvectordb import setup_tcvectordb_mock
from tests.integration_tests.vdb.test_vector_store import (
AbstractVectorTest,
get_example_text,
setup_mock_redis,
)
@pytest.fixture
def oceanbase_vector():
return OceanBaseVector(
"dify_test_collection",
config=OceanBaseVectorConfig(
host="127.0.0.1",
port="2881",
user="root@test",
database="test",
password="test",
),
)
class OceanBaseVectorTest(AbstractVectorTest):
def __init__(self, vector: OceanBaseVector):
super().__init__()
self.vector = vector
def search_by_vector(self):
hits_by_vector = self.vector.search_by_vector(query_vector=self.example_embedding)
assert len(hits_by_vector) == 0
def search_by_full_text(self):
hits_by_full_text = self.vector.search_by_full_text(query=get_example_text())
assert len(hits_by_full_text) == 0
def text_exists(self):
exist = self.vector.text_exists(self.example_doc_id)
assert exist == True
def get_ids_by_metadata_field(self):
ids = self.vector.get_ids_by_metadata_field(key="document_id", value=self.example_doc_id)
assert len(ids) == 0
@pytest.fixture
def setup_mock_oceanbase_client():
with patch("core.rag.datasource.vdb.oceanbase.oceanbase_vector.ObVecClient", new_callable=MagicMock) as mock_client:
yield mock_client
@pytest.fixture
def setup_mock_oceanbase_vector(oceanbase_vector):
with patch.object(oceanbase_vector, "_client"):
yield oceanbase_vector
def test_oceanbase_vector(
setup_mock_redis,
setup_mock_oceanbase_client,
setup_mock_oceanbase_vector,
oceanbase_vector,
):
OceanBaseVectorTest(oceanbase_vector).run_all_tests()

Some files were not shown because too many files have changed in this diff.