From 1c23cbc643517596945b2bbe013191394603c640 Mon Sep 17 00:00:00 2001 From: Hanqing Zhao Date: Wed, 17 Sep 2025 16:19:53 +0800 Subject: [PATCH] Update the translation for pipeline --- .../transform/file-general-economy.yml | 55 +++++++++---------- .../transform/file-general-high-quality.yml | 43 +++++++-------- .../transform/file-parentchild.yml | 50 ++++++++--------- .../transform/notion-general-economy.yml | 36 ++++++------ .../transform/notion-general-high-quality.yml | 36 ++++++------ .../transform/notion-parentchild.yml | 43 +++++++-------- .../website-crawl-general-economy.yml | 36 ++++++------ .../website-crawl-general-high-quality.yml | 36 ++++++------ .../transform/website-crawl-parentchild.yml | 43 +++++++-------- 9 files changed, 186 insertions(+), 192 deletions(-) diff --git a/api/services/rag_pipeline/transform/file-general-economy.yml b/api/services/rag_pipeline/transform/file-general-economy.yml index daad0ea166..cf73f2d84d 100644 --- a/api/services/rag_pipeline/transform/file-general-economy.yml +++ b/api/services/rag_pipeline/transform/file-general-economy.yml @@ -202,15 +202,14 @@ workflow: human_description: en_US: the file to be parsed(support pdf, ppt, pptx, doc, docx, png, jpg, jpeg) - ja_JP: the file to be parsed(support pdf, ppt, pptx, doc, docx, png, jpg, - jpeg) + ja_JP: 解析するファイル(pdf, ppt, pptx, doc, docx, png, jpg, jpegをサポート) pt_BR: o arquivo a ser analisado (suporta pdf, ppt, pptx, doc, docx, png, jpg, jpeg) zh_Hans: 用于解析的文件(支持 pdf, ppt, pptx, doc, docx, png, jpg, jpeg) label: en_US: file - ja_JP: file - pt_BR: file + ja_JP: ファイル + pt_BR: arquivo zh_Hans: file llm_description: the file to be parsed (support pdf, ppt, pptx, doc, docx, png, jpg, jpeg) @@ -432,13 +431,13 @@ workflow: form: llm human_description: en_US: The text you want to chunk. - ja_JP: The text you want to chunk. - pt_BR: The text you want to chunk. + ja_JP: チャンク化したいテキスト。 + pt_BR: O texto que você deseja dividir. 
zh_Hans: 你想要分块的文本。 label: en_US: Input Variable - ja_JP: Input Variable - pt_BR: Input Variable + ja_JP: 入力変数 + pt_BR: Variável de entrada zh_Hans: 输入变量 llm_description: The text you want to chunk. max: null @@ -456,13 +455,13 @@ workflow: form: llm human_description: en_US: The delimiter of the chunks. - ja_JP: The delimiter of the chunks. - pt_BR: The delimiter of the chunks. + ja_JP: チャンクの区切り記号。 + pt_BR: O delimitador dos blocos. zh_Hans: 块的分隔符。 label: en_US: Delimiter - ja_JP: Delimiter - pt_BR: Delimiter + ja_JP: 区切り記号 + pt_BR: Delimitador zh_Hans: 分隔符 llm_description: The delimiter of the chunks, the format of the delimiter must be a string. @@ -481,13 +480,13 @@ workflow: form: llm human_description: en_US: The maximum chunk length. - ja_JP: The maximum chunk length. - pt_BR: The maximum chunk length. + ja_JP: 最大長のチャンク。 + pt_BR: O comprimento máximo do bloco zh_Hans: 最大块的长度。 label: en_US: Maximum Chunk Length - ja_JP: Maximum Chunk Length - pt_BR: Maximum Chunk Length + ja_JP: チャンク最大長 + pt_BR: O comprimento máximo do bloco zh_Hans: 最大块的长度 llm_description: The maximum chunk length, the format of the chunk size must be an integer. @@ -506,13 +505,13 @@ workflow: form: llm human_description: en_US: The chunk overlap length. - ja_JP: The chunk overlap length. - pt_BR: The chunk overlap length. + ja_JP: チャンクの重複長 + pt_BR: O comprimento de sobreposição dos fragmentos zh_Hans: 块的重叠长度。 label: en_US: Chunk Overlap Length - ja_JP: Chunk Overlap Length - pt_BR: Chunk Overlap Length + ja_JP: チャンク重複長 + pt_BR: Comprimento de sobreposição do bloco zh_Hans: 块的重叠长度 llm_description: The chunk overlap length, the format of the chunk overlap length must be an integer.
@@ -531,13 +530,13 @@ workflow: form: llm human_description: en_US: Replace consecutive spaces, newlines and tabs - ja_JP: Replace consecutive spaces, newlines and tabs - pt_BR: Replace consecutive spaces, newlines and tabs + ja_JP: 連続のスペース、改行、またはタブを置換する + pt_BR: Substituir espaços consecutivos, novas linhas e tabulações zh_Hans: 替换连续的空格、换行符和制表符 label: en_US: Replace Consecutive Spaces, Newlines and Tabs - ja_JP: Replace Consecutive Spaces, Newlines and Tabs - pt_BR: Replace Consecutive Spaces, Newlines and Tabs + ja_JP: 連続のスペース、改行、またはタブを置換する + pt_BR: Substituir espaços consecutivos, novas linhas e tabulações zh_Hans: 替换连续的空格、换行符和制表符 llm_description: Replace consecutive spaces, newlines and tabs, the format of the replace must be a boolean. @@ -556,13 +555,13 @@ workflow: form: llm human_description: en_US: Delete all URLs and email addresses - ja_JP: Delete all URLs and email addresses - pt_BR: Delete all URLs and email addresses + ja_JP: すべてのURLとメールアドレスを削除する + pt_BR: Excluir todos os URLs e endereços de e-mail zh_Hans: 删除所有URL和电子邮件地址 label: en_US: Delete All URLs and Email Addresses - ja_JP: Delete All URLs and Email Addresses - pt_BR: Delete All URLs and Email Addresses + ja_JP: すべてのURLとメールアドレスを削除する + pt_BR: Excluir todos os URLs e endereços de e-mail zh_Hans: 删除所有URL和电子邮件地址 llm_description: Delete all URLs and email addresses, the format of the delete must be a boolean.
diff --git a/api/services/rag_pipeline/transform/file-general-high-quality.yml b/api/services/rag_pipeline/transform/file-general-high-quality.yml index fd414741d2..2e09a7634f 100644 --- a/api/services/rag_pipeline/transform/file-general-high-quality.yml +++ b/api/services/rag_pipeline/transform/file-general-high-quality.yml @@ -202,15 +202,14 @@ workflow: human_description: en_US: the file to be parsed(support pdf, ppt, pptx, doc, docx, png, jpg, jpeg) - ja_JP: the file to be parsed(support pdf, ppt, pptx, doc, docx, png, jpg, - jpeg) + ja_JP: 解析するファイル(pdf, ppt, pptx, doc, docx, png, jpg, jpegをサポート) pt_BR: o arquivo a ser analisado (suporta pdf, ppt, pptx, doc, docx, png, jpg, jpeg) zh_Hans: 用于解析的文件(支持 pdf, ppt, pptx, doc, docx, png, jpg, jpeg) label: en_US: file - ja_JP: file - pt_BR: file + ja_JP: ファイル + pt_BR: arquivo zh_Hans: file llm_description: the file to be parsed (support pdf, ppt, pptx, doc, docx, png, jpg, jpeg) @@ -432,13 +431,13 @@ workflow: form: llm human_description: en_US: The text you want to chunk. - ja_JP: The text you want to chunk. - pt_BR: The text you want to chunk. + ja_JP: チャンク化したいテキスト。 + pt_BR: O texto que você deseja dividir. zh_Hans: 你想要分块的文本。 label: en_US: Input Variable - ja_JP: Input Variable - pt_BR: Input Variable + ja_JP: 入力変数 + pt_BR: Variável de entrada zh_Hans: 输入变量 llm_description: The text you want to chunk. max: null @@ -456,13 +455,13 @@ workflow: form: llm human_description: en_US: The delimiter of the chunks. - ja_JP: The delimiter of the chunks. - pt_BR: The delimiter of the chunks. + ja_JP: チャンクの区切り記号。 + pt_BR: O delimitador dos pedaços. zh_Hans: 块的分隔符。 label: en_US: Delimiter - ja_JP: Delimiter - pt_BR: Delimiter + ja_JP: 区切り記号 + pt_BR: Delimitador zh_Hans: 分隔符 llm_description: The delimiter of the chunks, the format of the delimiter must be a string. @@ -481,13 +480,13 @@ workflow: form: llm human_description: en_US: The maximum chunk length. - ja_JP: The maximum chunk length. - pt_BR: The maximum chunk length. 
+ ja_JP: 最大長のチャンク。 + pt_BR: O comprimento máximo do bloco zh_Hans: 最大块的长度。 label: en_US: Maximum Chunk Length - ja_JP: Maximum Chunk Length - pt_BR: Maximum Chunk Length + ja_JP: チャンク最大長 + pt_BR: O comprimento máximo do bloco zh_Hans: 最大块的长度 llm_description: The maximum chunk length, the format of the chunk size must be an integer. @@ -506,12 +505,12 @@ workflow: form: llm human_description: en_US: The chunk overlap length. - ja_JP: The chunk overlap length. + ja_JP: チャンクの重複長 pt_BR: The chunk overlap length. zh_Hans: 块的重叠长度。 label: en_US: Chunk Overlap Length - ja_JP: Chunk Overlap Length + ja_JP: チャンク重複長 pt_BR: Chunk Overlap Length zh_Hans: 块的重叠长度 llm_description: The chunk overlap length, the format of the chunk overlap @@ -531,12 +530,12 @@ workflow: form: llm human_description: en_US: Replace consecutive spaces, newlines and tabs - ja_JP: Replace consecutive spaces, newlines and tabs + ja_JP: 連続のスペース、改行、まだはタブを置換する pt_BR: Replace consecutive spaces, newlines and tabs zh_Hans: 替换连续的空格、换行符和制表符 label: en_US: Replace Consecutive Spaces, Newlines and Tabs - ja_JP: Replace Consecutive Spaces, Newlines and Tabs + ja_JP: 連続のスペース、改行、まだはタブを置換する pt_BR: Replace Consecutive Spaces, Newlines and Tabs zh_Hans: 替换连续的空格、换行符和制表符 llm_description: Replace consecutive spaces, newlines and tabs, the format @@ -556,12 +555,12 @@ workflow: form: llm human_description: en_US: Delete all URLs and email addresses - ja_JP: Delete all URLs and email addresses + ja_JP: すべてのURLとメールアドレスを削除する pt_BR: Delete all URLs and email addresses zh_Hans: 删除所有URL和电子邮件地址 label: en_US: Delete All URLs and Email Addresses - ja_JP: Delete All URLs and Email Addresses + ja_JP: すべてのURLとメールアドレスを削除する pt_BR: Delete All URLs and Email Addresses zh_Hans: 删除所有URL和电子邮件地址 llm_description: Delete all URLs and email addresses, the format of the diff --git a/api/services/rag_pipeline/transform/file-parentchild.yml b/api/services/rag_pipeline/transform/file-parentchild.yml index af945e7d36..bbb90fe45d 100644 --- 
a/api/services/rag_pipeline/transform/file-parentchild.yml +++ b/api/services/rag_pipeline/transform/file-parentchild.yml @@ -201,15 +201,14 @@ workflow: human_description: en_US: the file to be parsed(support pdf, ppt, pptx, doc, docx, png, jpg, jpeg) - ja_JP: the file to be parsed(support pdf, ppt, pptx, doc, docx, png, jpg, - jpeg) + ja_JP: 解析するファイル(pdf, ppt, pptx, doc, docx, png, jpg, jpegをサポート) pt_BR: o arquivo a ser analisado (suporta pdf, ppt, pptx, doc, docx, png, jpg, jpeg) zh_Hans: 用于解析的文件(支持 pdf, ppt, pptx, doc, docx, png, jpg, jpeg) label: en_US: file - ja_JP: file - pt_BR: file + ja_JP: ファイル + pt_BR: arquivo zh_Hans: file llm_description: the file to be parsed (support pdf, ppt, pptx, doc, docx, png, jpg, jpeg) @@ -427,13 +426,13 @@ workflow: form: llm human_description: en_US: The text you want to chunk. - ja_JP: The text you want to chunk. - pt_BR: The text you want to chunk. + ja_JP: チャンク化したいテキスト。 + pt_BR: O texto que você deseja dividir. zh_Hans: 你想要分块的文本。 label: en_US: Input text - ja_JP: Input text - pt_BR: Input text + ja_JP: 入力テキスト + pt_BR: Texto de entrada zh_Hans: 输入文本 llm_description: The text you want to chunk. 
max: null @@ -451,12 +450,12 @@ workflow: form: llm human_description: en_US: Maximum length for chunking - ja_JP: Maximum length for chunking + ja_JP: チャンク分割の最大長 pt_BR: Comprimento máximo para divisão zh_Hans: 用于分块的最大长度 label: en_US: Maximum Length - ja_JP: Maximum Length + ja_JP: 最大長 pt_BR: Comprimento Máximo zh_Hans: 最大长度 llm_description: Maximum length allowed per chunk @@ -478,12 +477,12 @@ workflow: form: llm human_description: en_US: Separator used for chunking - ja_JP: Separator used for chunking + ja_JP: チャンク分割に使用する区切り文字 pt_BR: Separador usado para divisão zh_Hans: 用于分块的分隔符 label: en_US: Chunk Separator - ja_JP: Chunk Separator + ja_JP: チャンク区切り文字 pt_BR: Separador de Divisão zh_Hans: 分块分隔符 llm_description: The separator used to split chunks @@ -502,12 +501,12 @@ workflow: form: llm human_description: en_US: Maximum length for subchunking - ja_JP: Maximum length for subchunking + ja_JP: サブチャンク分割の最大長 pt_BR: Comprimento máximo para subdivisão zh_Hans: 用于子分块的最大长度 label: en_US: Subchunk Maximum Length - ja_JP: Subchunk Maximum Length + ja_JP: サブチャンク最大長 pt_BR: Comprimento Máximo de Subdivisão zh_Hans: 子分块最大长度 llm_description: Maximum length allowed per subchunk @@ -526,12 +525,12 @@ workflow: form: llm human_description: en_US: Separator used for subchunking - ja_JP: Separator used for subchunking + ja_JP: サブチャンク分割に使用する区切り文字 pt_BR: Separador usado para subdivisão zh_Hans: 用于子分块的分隔符 label: en_US: Subchunk Separator - ja_JP: Subchunk Separator + ja_JP: サブチャンキング用セパレーター pt_BR: Separador de Subdivisão zh_Hans: 子分块分隔符 llm_description: The separator used to split subchunks @@ -552,16 +551,15 @@ workflow: en_US: Split text into paragraphs based on separator and maximum chunk length, using split text as parent block or entire document as parent block and directly retrieve. - ja_JP: Split text into paragraphs based on separator and maximum chunk - length, using split text as parent block or entire document as parent - block and directly retrieve. 
+ ja_JP: セパレーターと最大チャンク長に基づいてテキストを段落に分割し、分割されたテキスト + を親ブロックとして使用するか、文書全体を親ブロックとして使用して直接取得します。 pt_BR: Dividir texto em parágrafos com base no separador e no comprimento máximo do bloco, usando o texto dividido como bloco pai ou documento completo como bloco pai e diretamente recuperá-lo. zh_Hans: 根据分隔符和最大块长度将文本拆分为段落,使用拆分文本作为检索的父块或整个文档用作父块并直接检索。 label: en_US: Parent Mode - ja_JP: Parent Mode + ja_JP: 親子モード pt_BR: Modo Pai zh_Hans: 父块模式 llm_description: Split text into paragraphs based on separator and maximum @@ -574,14 +572,14 @@ workflow: - icon: '' label: en_US: Paragraph - ja_JP: Paragraph + ja_JP: 段落 pt_BR: Parágrafo zh_Hans: 段落 value: paragraph - icon: '' label: en_US: Full Document - ja_JP: Full Document + ja_JP: 全文 pt_BR: Documento Completo zh_Hans: 全文 value: full_doc @@ -596,12 +594,12 @@ workflow: form: llm human_description: en_US: Whether to remove extra spaces in the text - ja_JP: Whether to remove extra spaces in the text + ja_JP: テキスト内の余分なスペースを削除するかどうか pt_BR: Se deve remover espaços extras no texto zh_Hans: 是否移除文本中的多余空格 label: en_US: Remove Extra Spaces - ja_JP: Remove Extra Spaces + ja_JP: 余分なスペースを削除 pt_BR: Remover Espaços Extras zh_Hans: 移除多余空格 llm_description: Whether to remove extra spaces in the text @@ -620,12 +618,12 @@ workflow: form: llm human_description: en_US: Whether to remove URLs and emails in the text - ja_JP: Whether to remove URLs and emails in the text + ja_JP: テキスト内のURLやメールアドレスを削除するかどうか pt_BR: Se deve remover URLs e e-mails no texto zh_Hans: 是否移除文本中的URL和电子邮件地址 label: en_US: Remove URLs and Emails - ja_JP: Remove URLs and Emails + ja_JP: URLとメールアドレスを削除 pt_BR: Remover URLs e E-mails zh_Hans: 移除URL和电子邮件地址 llm_description: Whether to remove URLs and emails in the text diff --git a/api/services/rag_pipeline/transform/notion-general-economy.yml b/api/services/rag_pipeline/transform/notion-general-economy.yml index d1a30f0ae6..83c1d8d2dd 100644 --- a/api/services/rag_pipeline/transform/notion-general-economy.yml +++ 
b/api/services/rag_pipeline/transform/notion-general-economy.yml @@ -99,13 +99,13 @@ workflow: form: llm human_description: en_US: The text you want to chunk. - ja_JP: The text you want to chunk. - pt_BR: The text you want to chunk. + ja_JP: チャンク化したいテキスト。 + pt_BR: O texto que você deseja dividir. zh_Hans: 你想要分块的文本。 label: en_US: Input Variable - ja_JP: Input Variable - pt_BR: Input Variable + ja_JP: 入力変数 + pt_BR: Variável de entrada zh_Hans: 输入变量 llm_description: The text you want to chunk. max: null @@ -123,13 +123,13 @@ workflow: form: llm human_description: en_US: The delimiter of the chunks. - ja_JP: The delimiter of the chunks. - pt_BR: The delimiter of the chunks. + ja_JP: チャンクの区切り記号。 + pt_BR: O delimitador dos pedaços. zh_Hans: 块的分隔符。 label: en_US: Delimiter - ja_JP: Delimiter - pt_BR: Delimiter + ja_JP: 区切り記号 + pt_BR: Delimitador zh_Hans: 分隔符 llm_description: The delimiter of the chunks, the format of the delimiter must be a string. @@ -148,13 +148,13 @@ workflow: form: llm human_description: en_US: The maximum chunk length. - ja_JP: The maximum chunk length. - pt_BR: The maximum chunk length. + ja_JP: 最大長のチャンク。 + pt_BR: O comprimento máximo do bloco zh_Hans: 最大块的长度。 label: en_US: Maximum Chunk Length - ja_JP: Maximum Chunk Length - pt_BR: Maximum Chunk Length + ja_JP: チャンク最大長 + pt_BR: O comprimento máximo do bloco zh_Hans: 最大块的长度 llm_description: The maximum chunk length, the format of the chunk size must be an integer. @@ -173,12 +173,12 @@ workflow: form: llm human_description: en_US: The chunk overlap length. - ja_JP: The chunk overlap length. + ja_JP: チャンクの重複長 pt_BR: The chunk overlap length. 
zh_Hans: 块的重叠长度。 label: en_US: Chunk Overlap Length - ja_JP: Chunk Overlap Length + ja_JP: チャンク重複長 pt_BR: Chunk Overlap Length zh_Hans: 块的重叠长度 llm_description: The chunk overlap length, the format of the chunk overlap @@ -198,12 +198,12 @@ workflow: form: llm human_description: en_US: Replace consecutive spaces, newlines and tabs - ja_JP: Replace consecutive spaces, newlines and tabs + ja_JP: 連続のスペース、改行、まだはタブを置換する pt_BR: Replace consecutive spaces, newlines and tabs zh_Hans: 替换连续的空格、换行符和制表符 label: en_US: Replace Consecutive Spaces, Newlines and Tabs - ja_JP: Replace Consecutive Spaces, Newlines and Tabs + ja_JP: 連続のスペース、改行、まだはタブを置換する pt_BR: Replace Consecutive Spaces, Newlines and Tabs zh_Hans: 替换连续的空格、换行符和制表符 llm_description: Replace consecutive spaces, newlines and tabs, the format @@ -223,12 +223,12 @@ workflow: form: llm human_description: en_US: Delete all URLs and email addresses - ja_JP: Delete all URLs and email addresses + ja_JP: すべてのURLとメールアドレスを削除する pt_BR: Delete all URLs and email addresses zh_Hans: 删除所有URL和电子邮件地址 label: en_US: Delete All URLs and Email Addresses - ja_JP: Delete All URLs and Email Addresses + ja_JP: すべてのURLとメールアドレスを削除する pt_BR: Delete All URLs and Email Addresses zh_Hans: 删除所有URL和电子邮件地址 llm_description: Delete all URLs and email addresses, the format of the diff --git a/api/services/rag_pipeline/transform/notion-general-high-quality.yml b/api/services/rag_pipeline/transform/notion-general-high-quality.yml index cd81bc4f1d..3e94edb67e 100644 --- a/api/services/rag_pipeline/transform/notion-general-high-quality.yml +++ b/api/services/rag_pipeline/transform/notion-general-high-quality.yml @@ -99,13 +99,13 @@ workflow: form: llm human_description: en_US: The text you want to chunk. - ja_JP: The text you want to chunk. - pt_BR: The text you want to chunk. + ja_JP: チャンク化したいテキスト。 + pt_BR: O texto que você deseja dividir. 
zh_Hans: 你想要分块的文本。 label: en_US: Input Variable - ja_JP: Input Variable - pt_BR: Input Variable + ja_JP: 入力変数 + pt_BR: Variável de entrada zh_Hans: 输入变量 llm_description: The text you want to chunk. max: null @@ -123,13 +123,13 @@ workflow: form: llm human_description: en_US: The delimiter of the chunks. - ja_JP: The delimiter of the chunks. - pt_BR: The delimiter of the chunks. + ja_JP: チャンクの区切り記号。 + pt_BR: O delimitador dos pedaços. zh_Hans: 块的分隔符。 label: en_US: Delimiter - ja_JP: Delimiter - pt_BR: Delimiter + ja_JP: 区切り記号 + pt_BR: Delimitador zh_Hans: 分隔符 llm_description: The delimiter of the chunks, the format of the delimiter must be a string. @@ -148,13 +148,13 @@ workflow: form: llm human_description: en_US: The maximum chunk length. - ja_JP: The maximum chunk length. - pt_BR: The maximum chunk length. + ja_JP: 最大長のチャンク。 + pt_BR: O comprimento máximo do bloco zh_Hans: 最大块的长度。 label: en_US: Maximum Chunk Length - ja_JP: Maximum Chunk Length - pt_BR: Maximum Chunk Length + ja_JP: チャンク最大長 + pt_BR: O comprimento máximo do bloco zh_Hans: 最大块的长度 llm_description: The maximum chunk length, the format of the chunk size must be an integer. @@ -173,12 +173,12 @@ workflow: form: llm human_description: en_US: The chunk overlap length. - ja_JP: The chunk overlap length. + ja_JP: チャンクの重複長 pt_BR: The chunk overlap length. 
zh_Hans: 块的重叠长度。 label: en_US: Chunk Overlap Length - ja_JP: Chunk Overlap Length + ja_JP: チャンク重複長 pt_BR: Chunk Overlap Length zh_Hans: 块的重叠长度 llm_description: The chunk overlap length, the format of the chunk overlap @@ -198,12 +198,12 @@ workflow: form: llm human_description: en_US: Replace consecutive spaces, newlines and tabs - ja_JP: Replace consecutive spaces, newlines and tabs + ja_JP: 連続のスペース、改行、まだはタブを置換する pt_BR: Replace consecutive spaces, newlines and tabs zh_Hans: 替换连续的空格、换行符和制表符 label: en_US: Replace Consecutive Spaces, Newlines and Tabs - ja_JP: Replace Consecutive Spaces, Newlines and Tabs + ja_JP: 連続のスペース、改行、まだはタブを置換する pt_BR: Replace Consecutive Spaces, Newlines and Tabs zh_Hans: 替换连续的空格、换行符和制表符 llm_description: Replace consecutive spaces, newlines and tabs, the format @@ -223,12 +223,12 @@ workflow: form: llm human_description: en_US: Delete all URLs and email addresses - ja_JP: Delete all URLs and email addresses + ja_JP: すべてのURLとメールアドレスを削除する pt_BR: Delete all URLs and email addresses zh_Hans: 删除所有URL和电子邮件地址 label: en_US: Delete All URLs and Email Addresses - ja_JP: Delete All URLs and Email Addresses + ja_JP: すべてのURLとメールアドレスを削除する pt_BR: Delete All URLs and Email Addresses zh_Hans: 删除所有URL和电子邮件地址 llm_description: Delete all URLs and email addresses, the format of the diff --git a/api/services/rag_pipeline/transform/notion-parentchild.yml b/api/services/rag_pipeline/transform/notion-parentchild.yml index 99f9714566..90ce75c418 100644 --- a/api/services/rag_pipeline/transform/notion-parentchild.yml +++ b/api/services/rag_pipeline/transform/notion-parentchild.yml @@ -118,13 +118,13 @@ workflow: form: llm human_description: en_US: The text you want to chunk. - ja_JP: The text you want to chunk. - pt_BR: The text you want to chunk. + ja_JP: チャンク化したいテキスト。 + pt_BR: O texto que você deseja dividir. 
zh_Hans: 你想要分块的文本。 label: en_US: Input text - ja_JP: Input text - pt_BR: Input text + ja_JP: 入力テキスト + pt_BR: Texto de entrada zh_Hans: 输入文本 llm_description: The text you want to chunk. max: null @@ -142,12 +142,12 @@ workflow: form: llm human_description: en_US: Maximum length for chunking - ja_JP: Maximum length for chunking + ja_JP: チャンク分割の最大長 pt_BR: Comprimento máximo para divisão zh_Hans: 用于分块的最大长度 label: en_US: Maximum Length - ja_JP: Maximum Length + ja_JP: 最大長 pt_BR: Comprimento Máximo zh_Hans: 最大长度 llm_description: Maximum length allowed per chunk @@ -169,12 +169,12 @@ workflow: form: llm human_description: en_US: Separator used for chunking - ja_JP: Separator used for chunking + ja_JP: チャンク分割に使用する区切り文字 pt_BR: Separador usado para divisão zh_Hans: 用于分块的分隔符 label: en_US: Chunk Separator - ja_JP: Chunk Separator + ja_JP: チャンク区切り文字 pt_BR: Separador de Divisão zh_Hans: 分块分隔符 llm_description: The separator used to split chunks @@ -193,12 +193,12 @@ workflow: form: llm human_description: en_US: Maximum length for subchunking - ja_JP: Maximum length for subchunking + ja_JP: サブチャンク分割の最大長 pt_BR: Comprimento máximo para subdivisão zh_Hans: 用于子分块的最大长度 label: en_US: Subchunk Maximum Length - ja_JP: Subchunk Maximum Length + ja_JP: サブチャンク最大長 pt_BR: Comprimento Máximo de Subdivisão zh_Hans: 子分块最大长度 llm_description: Maximum length allowed per subchunk @@ -217,12 +217,12 @@ workflow: form: llm human_description: en_US: Separator used for subchunking - ja_JP: Separator used for subchunking + ja_JP: サブチャンク分割に使用する区切り文字 pt_BR: Separador usado para subdivisão zh_Hans: 用于子分块的分隔符 label: en_US: Subchunk Separator - ja_JP: Subchunk Separator + ja_JP: サブチャンキング用セパレーター pt_BR: Separador de Subdivisão zh_Hans: 子分块分隔符 llm_description: The separator used to split subchunks @@ -243,16 +243,15 @@ workflow: en_US: Split text into paragraphs based on separator and maximum chunk length, using split text as parent block or entire document as parent block and directly retrieve. 
- ja_JP: Split text into paragraphs based on separator and maximum chunk - length, using split text as parent block or entire document as parent - block and directly retrieve. + ja_JP: セパレーターと最大チャンク長に基づいてテキストを段落に分割し、分割されたテキスト + を親ブロックとして使用するか、文書全体を親ブロックとして使用して直接取得します。 pt_BR: Dividir texto em parágrafos com base no separador e no comprimento máximo do bloco, usando o texto dividido como bloco pai ou documento completo como bloco pai e diretamente recuperá-lo. zh_Hans: 根据分隔符和最大块长度将文本拆分为段落,使用拆分文本作为检索的父块或整个文档用作父块并直接检索。 label: en_US: Parent Mode - ja_JP: Parent Mode + ja_JP: 親子モード pt_BR: Modo Pai zh_Hans: 父块模式 llm_description: Split text into paragraphs based on separator and maximum @@ -265,14 +264,14 @@ workflow: - icon: '' label: en_US: Paragraph - ja_JP: Paragraph + ja_JP: 段落 pt_BR: Parágrafo zh_Hans: 段落 value: paragraph - icon: '' label: en_US: Full Document - ja_JP: Full Document + ja_JP: 全文 pt_BR: Documento Completo zh_Hans: 全文 value: full_doc @@ -287,12 +286,12 @@ workflow: form: llm human_description: en_US: Whether to remove extra spaces in the text - ja_JP: Whether to remove extra spaces in the text + ja_JP: テキスト内の余分なスペースを削除するかどうか pt_BR: Se deve remover espaços extras no texto zh_Hans: 是否移除文本中的多余空格 label: en_US: Remove Extra Spaces - ja_JP: Remove Extra Spaces + ja_JP: 余分なスペースを削除 pt_BR: Remover Espaços Extras zh_Hans: 移除多余空格 llm_description: Whether to remove extra spaces in the text @@ -311,12 +310,12 @@ workflow: form: llm human_description: en_US: Whether to remove URLs and emails in the text - ja_JP: Whether to remove URLs and emails in the text + ja_JP: テキスト内のURLやメールアドレスを削除するかどうか pt_BR: Se deve remover URLs e e-mails no texto zh_Hans: 是否移除文本中的URL和电子邮件地址 label: en_US: Remove URLs and Emails - ja_JP: Remove URLs and Emails + ja_JP: URLとメールアドレスを削除 pt_BR: Remover URLs e E-mails zh_Hans: 移除URL和电子邮件地址 llm_description: Whether to remove URLs and emails in the text diff --git a/api/services/rag_pipeline/transform/website-crawl-general-economy.yml 
b/api/services/rag_pipeline/transform/website-crawl-general-economy.yml index 12fd61a389..241d94c95d 100644 --- a/api/services/rag_pipeline/transform/website-crawl-general-economy.yml +++ b/api/services/rag_pipeline/transform/website-crawl-general-economy.yml @@ -238,13 +238,13 @@ workflow: form: llm human_description: en_US: The text you want to chunk. - ja_JP: The text you want to chunk. - pt_BR: The text you want to chunk. + ja_JP: チャンク化したいテキスト。 + pt_BR: O texto que você deseja dividir. zh_Hans: 你想要分块的文本。 label: en_US: Input Variable - ja_JP: Input Variable - pt_BR: Input Variable + ja_JP: 入力変数 + pt_BR: Variável de entrada zh_Hans: 输入变量 llm_description: The text you want to chunk. max: null @@ -262,13 +262,13 @@ workflow: form: llm human_description: en_US: The delimiter of the chunks. - ja_JP: The delimiter of the chunks. - pt_BR: The delimiter of the chunks. + ja_JP: チャンクの区切り記号。 + pt_BR: O delimitador dos pedaços. zh_Hans: 块的分隔符。 label: en_US: Delimiter - ja_JP: Delimiter - pt_BR: Delimiter + ja_JP: 区切り記号 + pt_BR: Delimitador zh_Hans: 分隔符 llm_description: The delimiter of the chunks, the format of the delimiter must be a string. @@ -287,13 +287,13 @@ workflow: form: llm human_description: en_US: The maximum chunk length. - ja_JP: The maximum chunk length. - pt_BR: The maximum chunk length. + ja_JP: 最大長のチャンク。 + pt_BR: O comprimento máximo do bloco zh_Hans: 最大块的长度。 label: en_US: Maximum Chunk Length - ja_JP: Maximum Chunk Length - pt_BR: Maximum Chunk Length + ja_JP: チャンク最大長 + pt_BR: O comprimento máximo do bloco zh_Hans: 最大块的长度 llm_description: The maximum chunk length, the format of the chunk size must be an integer. @@ -312,12 +312,12 @@ workflow: form: llm human_description: en_US: The chunk overlap length. - ja_JP: The chunk overlap length. + ja_JP: チャンクの重複長 pt_BR: The chunk overlap length. 
zh_Hans: 块的重叠长度。 label: en_US: Chunk Overlap Length - ja_JP: Chunk Overlap Length + ja_JP: チャンク重複長 pt_BR: Chunk Overlap Length zh_Hans: 块的重叠长度 llm_description: The chunk overlap length, the format of the chunk overlap @@ -337,12 +337,12 @@ workflow: form: llm human_description: en_US: Replace consecutive spaces, newlines and tabs - ja_JP: Replace consecutive spaces, newlines and tabs + ja_JP: 連続のスペース、改行、まだはタブを置換する pt_BR: Replace consecutive spaces, newlines and tabs zh_Hans: 替换连续的空格、换行符和制表符 label: en_US: Replace Consecutive Spaces, Newlines and Tabs - ja_JP: Replace Consecutive Spaces, Newlines and Tabs + ja_JP: 連続のスペース、改行、まだはタブを置換する pt_BR: Replace Consecutive Spaces, Newlines and Tabs zh_Hans: 替换连续的空格、换行符和制表符 llm_description: Replace consecutive spaces, newlines and tabs, the format @@ -362,12 +362,12 @@ workflow: form: llm human_description: en_US: Delete all URLs and email addresses - ja_JP: Delete all URLs and email addresses + ja_JP: すべてのURLとメールアドレスを削除する pt_BR: Delete all URLs and email addresses zh_Hans: 删除所有URL和电子邮件地址 label: en_US: Delete All URLs and Email Addresses - ja_JP: Delete All URLs and Email Addresses + ja_JP: すべてのURLとメールアドレスを削除する pt_BR: Delete All URLs and Email Addresses zh_Hans: 删除所有URL和电子邮件地址 llm_description: Delete all URLs and email addresses, the format of the diff --git a/api/services/rag_pipeline/transform/website-crawl-general-high-quality.yml b/api/services/rag_pipeline/transform/website-crawl-general-high-quality.yml index 5e9c05f232..52b8f822c0 100644 --- a/api/services/rag_pipeline/transform/website-crawl-general-high-quality.yml +++ b/api/services/rag_pipeline/transform/website-crawl-general-high-quality.yml @@ -238,13 +238,13 @@ workflow: form: llm human_description: en_US: The text you want to chunk. - ja_JP: The text you want to chunk. - pt_BR: The text you want to chunk. + ja_JP: チャンク化したいテキスト。 + pt_BR: O texto que você deseja dividir. 
zh_Hans: 你想要分块的文本。 label: en_US: Input Variable - ja_JP: Input Variable - pt_BR: Input Variable + ja_JP: 入力変数 + pt_BR: Variável de entrada zh_Hans: 输入变量 llm_description: The text you want to chunk. max: null @@ -262,13 +262,13 @@ workflow: form: llm human_description: en_US: The delimiter of the chunks. - ja_JP: The delimiter of the chunks. - pt_BR: The delimiter of the chunks. + ja_JP: チャンクの区切り記号。 + pt_BR: O delimitador dos pedaços. zh_Hans: 块的分隔符。 label: en_US: Delimiter - ja_JP: Delimiter - pt_BR: Delimiter + ja_JP: 区切り記号 + pt_BR: Delimitador zh_Hans: 分隔符 llm_description: The delimiter of the chunks, the format of the delimiter must be a string. @@ -287,13 +287,13 @@ workflow: form: llm human_description: en_US: The maximum chunk length. - ja_JP: The maximum chunk length. - pt_BR: The maximum chunk length. + ja_JP: 最大長のチャンク。 + pt_BR: O comprimento máximo do bloco zh_Hans: 最大块的长度。 label: en_US: Maximum Chunk Length - ja_JP: Maximum Chunk Length - pt_BR: Maximum Chunk Length + ja_JP: チャンク最大長 + pt_BR: O comprimento máximo do bloco zh_Hans: 最大块的长度 llm_description: The maximum chunk length, the format of the chunk size must be an integer. @@ -312,12 +312,12 @@ workflow: form: llm human_description: en_US: The chunk overlap length. - ja_JP: The chunk overlap length. + ja_JP: チャンクの重複長。 pt_BR: The chunk overlap length. 
zh_Hans: 块的重叠长度。 label: en_US: Chunk Overlap Length - ja_JP: Chunk Overlap Length + ja_JP: チャンク重複長 pt_BR: Chunk Overlap Length zh_Hans: 块的重叠长度 llm_description: The chunk overlap length, the format of the chunk overlap @@ -337,12 +337,12 @@ workflow: form: llm human_description: en_US: Replace consecutive spaces, newlines and tabs - ja_JP: Replace consecutive spaces, newlines and tabs + ja_JP: 連続のスペース、改行、まだはタブを置換する pt_BR: Replace consecutive spaces, newlines and tabs zh_Hans: 替换连续的空格、换行符和制表符 label: en_US: Replace Consecutive Spaces, Newlines and Tabs - ja_JP: Replace Consecutive Spaces, Newlines and Tabs + ja_JP: 連続のスペース、改行、まだはタブを置換する pt_BR: Replace Consecutive Spaces, Newlines and Tabs zh_Hans: 替换连续的空格、换行符和制表符 llm_description: Replace consecutive spaces, newlines and tabs, the format @@ -362,12 +362,12 @@ workflow: form: llm human_description: en_US: Delete all URLs and email addresses - ja_JP: Delete all URLs and email addresses + ja_JP: すべてのURLとメールアドレスを削除する pt_BR: Delete all URLs and email addresses zh_Hans: 删除所有URL和电子邮件地址 label: en_US: Delete All URLs and Email Addresses - ja_JP: Delete All URLs and Email Addresses + ja_JP: すべてのURLとメールアドレスを削除する pt_BR: Delete All URLs and Email Addresses zh_Hans: 删除所有URL和电子邮件地址 llm_description: Delete all URLs and email addresses, the format of the diff --git a/api/services/rag_pipeline/transform/website-crawl-parentchild.yml b/api/services/rag_pipeline/transform/website-crawl-parentchild.yml index a9414d7bdb..5d609bd12b 100644 --- a/api/services/rag_pipeline/transform/website-crawl-parentchild.yml +++ b/api/services/rag_pipeline/transform/website-crawl-parentchild.yml @@ -121,13 +121,13 @@ workflow: form: llm human_description: en_US: The text you want to chunk. - ja_JP: The text you want to chunk. - pt_BR: The text you want to chunk. + ja_JP: チャンク化したいテキスト。 + pt_BR: O texto que você deseja dividir. 
zh_Hans: 你想要分块的文本。 label: en_US: Input text - ja_JP: Input text - pt_BR: Input text + ja_JP: 入力テキスト + pt_BR: Texto de entrada zh_Hans: 输入文本 llm_description: The text you want to chunk. max: null @@ -145,12 +145,12 @@ workflow: form: llm human_description: en_US: Maximum length for chunking - ja_JP: Maximum length for chunking + ja_JP: チャンク分割の最大長 pt_BR: Comprimento máximo para divisão zh_Hans: 用于分块的最大长度 label: en_US: Maximum Length - ja_JP: Maximum Length + ja_JP: 最大長 pt_BR: Comprimento Máximo zh_Hans: 最大长度 llm_description: Maximum length allowed per chunk @@ -172,12 +172,12 @@ workflow: form: llm human_description: en_US: Separator used for chunking - ja_JP: Separator used for chunking + ja_JP: チャンク分割に使用する区切り文字 pt_BR: Separador usado para divisão zh_Hans: 用于分块的分隔符 label: en_US: Chunk Separator - ja_JP: Chunk Separator + ja_JP: チャンク区切り文字 pt_BR: Separador de Divisão zh_Hans: 分块分隔符 llm_description: The separator used to split chunks @@ -196,12 +196,12 @@ workflow: form: llm human_description: en_US: Maximum length for subchunking - ja_JP: Maximum length for subchunking + ja_JP: サブチャンク分割の最大長 pt_BR: Comprimento máximo para subdivisão zh_Hans: 用于子分块的最大长度 label: en_US: Subchunk Maximum Length - ja_JP: Subchunk Maximum Length + ja_JP: サブチャンク最大長 pt_BR: Comprimento Máximo de Subdivisão zh_Hans: 子分块最大长度 llm_description: Maximum length allowed per subchunk @@ -220,12 +220,12 @@ workflow: form: llm human_description: en_US: Separator used for subchunking - ja_JP: Separator used for subchunking + ja_JP: サブチャンク分割に使用する区切り文字 pt_BR: Separador usado para subdivisão zh_Hans: 用于子分块的分隔符 label: en_US: Subchunk Separator - ja_JP: Subchunk Separator + ja_JP: サブチャンキング用セパレーター pt_BR: Separador de Subdivisão zh_Hans: 子分块分隔符 llm_description: The separator used to split subchunks @@ -246,16 +246,15 @@ workflow: en_US: Split text into paragraphs based on separator and maximum chunk length, using split text as parent block or entire document as parent block and directly retrieve. 
- ja_JP: Split text into paragraphs based on separator and maximum chunk - length, using split text as parent block or entire document as parent - block and directly retrieve. + ja_JP: セパレーターと最大チャンク長に基づいてテキストを段落に分割し、分割されたテキスト + を親ブロックとして使用するか、文書全体を親ブロックとして使用して直接取得します。 pt_BR: Dividir texto em parágrafos com base no separador e no comprimento máximo do bloco, usando o texto dividido como bloco pai ou documento completo como bloco pai e diretamente recuperá-lo. zh_Hans: 根据分隔符和最大块长度将文本拆分为段落,使用拆分文本作为检索的父块或整个文档用作父块并直接检索。 label: en_US: Parent Mode - ja_JP: Parent Mode + ja_JP: 親子モード pt_BR: Modo Pai zh_Hans: 父块模式 llm_description: Split text into paragraphs based on separator and maximum @@ -268,14 +267,14 @@ workflow: - icon: '' label: en_US: Paragraph - ja_JP: Paragraph + ja_JP: 段落 pt_BR: Parágrafo zh_Hans: 段落 value: paragraph - icon: '' label: en_US: Full Document - ja_JP: Full Document + ja_JP: 全文 pt_BR: Documento Completo zh_Hans: 全文 value: full_doc @@ -290,12 +289,12 @@ workflow: form: llm human_description: en_US: Whether to remove extra spaces in the text - ja_JP: Whether to remove extra spaces in the text + ja_JP: テキスト内の余分なスペースを削除するかどうか pt_BR: Se deve remover espaços extras no texto zh_Hans: 是否移除文本中的多余空格 label: en_US: Remove Extra Spaces - ja_JP: Remove Extra Spaces + ja_JP: 余分なスペースを削除 pt_BR: Remover Espaços Extras zh_Hans: 移除多余空格 llm_description: Whether to remove extra spaces in the text @@ -314,12 +313,12 @@ workflow: form: llm human_description: en_US: Whether to remove URLs and emails in the text - ja_JP: Whether to remove URLs and emails in the text + ja_JP: テキスト内のURLやメールアドレスを削除するかどうか pt_BR: Se deve remover URLs e e-mails no texto zh_Hans: 是否移除文本中的URL和电子邮件地址 label: en_US: Remove URLs and Emails - ja_JP: Remove URLs and Emails + ja_JP: URLとメールアドレスを削除 pt_BR: Remover URLs e E-mails zh_Hans: 移除URL和电子邮件地址 llm_description: Whether to remove URLs and emails in the text