Commit Graph

569 Commits

Author SHA1 Message Date
wangxiaolei
9007109a6b
fix: [xxx](xxx) render as xxx](xxx) (#30392)
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-12-31 10:30:15 +08:00
wangxiaolei
30dd50ff83
feat: allow fail fast (#30262) 2025-12-30 09:27:40 +08:00
Jyong
f610f6895f
fix: retrieval test and knowledge retrieval node failed in multimodal mode (#30210)
Co-authored-by: Stephen Zhou <38493346+hyoban@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-12-26 21:42:06 +08:00
wangxiaolei
d20a8d5b77
fix: fix missing not in (#30207) 2025-12-26 16:52:34 +08:00
wangxiaolei
8611301722
fix: fix DatasetRetrieval._process_metadata_filter_func miss in operator (#30199) 2025-12-26 16:34:50 +08:00
wangxiaolei
61d255a6e6
chore: bypass InsufficientPrivilege on Azure PostgreSQL (#30191) 2025-12-26 14:35:05 +08:00
Rhys
a5309bee25
fix: handle missing credential_id (#30051)
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-12-24 11:21:51 +08:00
wangxiaolei
111a39b549
fix: fix firecrawl url concat (#30008) 2025-12-24 09:40:32 +08:00
非法操作
9701a2994b
chore: Translate stray Chinese comment to English (#30024)
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-12-23 14:05:21 +08:00
wangxiaolei
eaf4146e2f
perf: optimize DatasetRetrieval.retrieve、RetrievalService._deduplicat… (#29981) 2025-12-22 20:08:21 +08:00
wangxiaolei
32605181bd
feat: first use INTERNAL_FILES_URL first, then FILES_URL (#29962) 2025-12-21 16:53:37 +08:00
呆萌闷油瓶
5067e4f255
fix 29184 (#29188)
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-12-18 17:11:52 +08:00
wangxiaolei
78ca5ad142
fix: fix fixed_separator (#29861) 2025-12-18 16:50:44 +08:00
zhaobingshuang
8d1e36540a
fix: detect_file_encodings TypeError: tuple indices must be integers or slices, not str (#29595)
Co-authored-by: crazywoola <100913391+crazywoola@users.noreply.github.com>
2025-12-17 13:58:05 +08:00
Jyong
ae4a9040df
Feat/update notion preview (#29345)
Co-authored-by: twwu <twwu@dify.ai>
2025-12-16 16:43:45 +08:00
longbingljw
4cc6652424
feat: VECTOR_STORE supports seekdb (#29658) 2025-12-16 12:35:04 +09:00
quicksand
4bf6c4dafa
chore: add online drive metadata source enum (#29674) 2025-12-15 21:13:23 +08:00
wangxiaolei
8f3fd9a728
perf: commit once (#29590) 2025-12-15 11:40:26 +08:00
TomoOkuyama
569c593240
feat: Add InterSystems IRIS vector database support (#29480)
Co-authored-by: Tomo Okuyama <tomo.okuyama@intersystems.com>
2025-12-15 10:20:43 +08:00
Jyong
db42f467c8
fix: docx extractor external image failed (#29558) 2025-12-12 13:41:51 +08:00
Nie Ronghua
12e39365fa
perf(core/rag): optimize Excel extractor performance and memory usage (#29551)
Co-authored-by: 01393547 <nieronghua@sf-express.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-12-12 12:15:03 +08:00
Jyong
69a22af1c9
fix: optimize database query when retrieval knowledge in App (#29467) 2025-12-11 13:50:46 +08:00
Jyong
18082752a0
fix knowledge pipeline run multimodal document failed (#29431) 2025-12-10 20:42:51 +08:00
Jyong
784008997b
fix parent-child check when child chunk is not exist (#29426) 2025-12-10 18:45:43 +08:00
Jyong
b49e2646ff
fix: session unbound during parent-child retrieval (#29396) 2025-12-10 14:08:55 +08:00
wangxiaolei
e205182e1f
fix: Parent instance <DocumentSegment at 0x7955b5572c90> is not bound… (#29377) 2025-12-10 10:01:45 +08:00
Jyong
9affc546c6
Feat/support multimodal embedding (#29115)
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-12-09 14:41:46 +08:00
wangxiaolei
ca61bb5de0
fix: Weaviate was not closed properly (#29301) 2025-12-09 10:23:29 +08:00
wangxiaolei
45911ab0af
feat: using charset_normalizer instead of chardet (#29022) 2025-12-05 11:19:19 +08:00
Conner Mo
0af8a7b958
feat: enhance OceanBase vector database with SQL injection fixes, unified processing, and improved error handling (#28951) 2025-12-01 09:51:47 +08:00
Conner Mo
acbc886ecd
fix: implement score_threshold filtering for OceanBase vector search (#28536)
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-11-29 18:50:21 +08:00
Eric Guo
d7010f582f
Fix 500 error in knowledge base, select weightedScore and click retrieve. (#28586)
Signed-off-by: -LAN- <laipz8200@outlook.com>
Co-authored-by: -LAN- <laipz8200@outlook.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-11-26 16:44:00 +08:00
墨绿色
f76a3f545c
Feat/add weaviate tokenization configurable (#28159)
Co-authored-by: lijiezhao <lijiezhao@perfect99.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-11-25 20:07:45 +08:00
Byron.wang
83702762c8
use no-root user in docker image by default (#26419) 2025-11-25 19:59:45 +08:00
Asuka Minato
751ce4ec41
more typed orm (#28577)
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-11-24 21:01:46 +08:00
GuanMu
5f61ca5e6f
feat: Implement partial update for document metadata, allowing merging of new values with existing ones. (#28390)
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-11-21 12:58:20 +08:00
longbingljw
c0b7ffd5d0
feat:mysql adaptation for metadb (#28188) 2025-11-20 09:44:39 +08:00
NeatGuyCoding
c74eb4fcf3
minor fix(rag): return early when pushing empty tasks to avoid Redis DataError (#28027)
Signed-off-by: NeatGuyCoding <15627489+NeatGuyCoding@users.noreply.github.com>
2025-11-13 20:18:11 +08:00
李龙飞
81832c14ee
Fix: Correctly handle merged cells in DOCX tables to prevent content duplication and loss (#27871)
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: crazywoola <100913391+crazywoola@users.noreply.github.com>
2025-11-13 15:56:24 +08:00
huangzhuo1949
9843fec393
fix: elasticsearch_vector version (#28028)
Co-authored-by: huangzhuo <huangzhuo1@xiaomi.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: crazywoola <100913391+crazywoola@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-11-10 13:17:13 +09:00
NeatGuyCoding
eea713b668
Fix typo in weaviate comment, improve time test precision, and add security tests for get-icon utility (#27919)
Signed-off-by: NeatGuyCoding <15627489+NeatGuyCoding@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-11-10 10:11:54 +08:00
hj24
37903722fe
refactor: implement tenant self queue for rag tasks (#27559)
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: -LAN- <laipz8200@outlook.com>
2025-11-06 21:25:50 +08:00
Boris Polonsky
68d357d7f6
Add WEAVIATE_GRPC_ENDPOINT as designed in weaviate migration guide (#27861)
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-11-05 17:19:08 +08:00
quicksand
41e549af14
fix(weaviate): skip init checks to prevent PyPI requests on each search (#27624)
Co-authored-by: Claude <noreply@anthropic.com>
2025-10-30 09:59:08 +08:00
Tanaka Kisuke
666586b59c
improve opensearch index deletion #27231 (#27336)
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-10-26 23:57:21 +08:00
-LAN-
4a6398fc1f
Fix: surface workflow container LLM usage (#27021) 2025-10-21 16:05:26 +08:00
Dhruv Gorasiya
d19c100166
fix: logical error in Weaviate distance calculation (#27019)
Co-authored-by: crazywoola <427733928@qq.com>
Co-authored-by: crazywoola <100913391+crazywoola@users.noreply.github.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-10-17 09:06:50 +08:00
Dhruv Gorasiya
a8ad80c405
Fixed Weaviate no module found issue (issue #26938) (#26964)
Co-authored-by: crazywoola <427733928@qq.com>
Co-authored-by: crazywoola <100913391+crazywoola@users.noreply.github.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-10-16 22:41:48 +08:00
Ademílson Tonato
a16ef7e73c
refactor: Update Firecrawl to use v2 API (#24734)
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-10-15 10:48:54 +08:00
Dhruv Gorasiya
48c42a9fba
Weaviate update version (#25447)
Co-authored-by: crazywoola <427733928@qq.com>
Co-authored-by: crazywoola <100913391+crazywoola@users.noreply.github.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-10-14 10:39:53 +08:00
minglu7
150a8276b9
fix: avoid closing shared session during embeddings (#26830) 2025-10-13 17:36:00 +08:00
Asuka Minato
24cd7bbc62
fix RetrievalMethod StrEnum (#26768)
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: crazywoola <100913391+crazywoola@users.noreply.github.com>
2025-10-13 10:29:37 +08:00
Guangdong Liu
d299e75e1b
refactor: use dynamic max characters for chunking in extractors (#26782) 2025-10-13 10:22:59 +08:00
carribean
cbf2ba6cec
Feature integrate alibabacloud mysql vector (#25994)
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: crazywoola <100913391+crazywoola@users.noreply.github.com>
2025-10-11 10:47:28 +08:00
Asuka Minato
1bd621f819
remove .value (#26633)
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-10-11 09:08:29 +08:00
Asuka Minato
bb6a331490
change all to httpx (#26119)
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: crazywoola <100913391+crazywoola@users.noreply.github.com>
2025-10-10 23:41:16 +08:00
Asuka Minato
ab2eacb6c1
use model_validate (#26182)
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: crazywoola <100913391+crazywoola@users.noreply.github.com>
2025-10-10 17:30:13 +09:00
znn
2b6882bd97
fix chunks 2 (#26623)
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-10-10 16:01:33 +08:00
kenwoodjw
8d803a26eb
fix: duplicate chunks (#26360)
Signed-off-by: kenwoodjw <blackxin55+@gmail.com>
2025-09-30 10:53:55 +08:00
AkisAya
66196459d5
fix db connection error in embed_documents() (#26196) 2025-09-28 13:44:51 +08:00
longbingljw
24b4289d6c
fix:add some explanation for oceanbase parser selection (#26071)
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-09-23 17:06:06 +08:00
Shili Cao
345ac8333c
Add Full-Text & Hybrid Search Support to Baidu Vector DB and Update SDK, Closes #25982 (#25983)
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: crazywoola <100913391+crazywoola@users.noreply.github.com>
2025-09-22 10:17:35 +08:00
longbingljw
208fe3d7de
feat:support selecting different ftparser for OceanBase. (#25970)
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-09-22 09:56:33 +08:00
-LAN-
4ba1292455
refactor: replace print statements with proper logging (#25773) 2025-09-18 20:35:47 +08:00
-LAN-
85cda47c70
feat: knowledge pipeline (#25360)
Signed-off-by: -LAN- <laipz8200@outlook.com>
Co-authored-by: twwu <twwu@dify.ai>
Co-authored-by: crazywoola <100913391+crazywoola@users.noreply.github.com>
Co-authored-by: jyong <718720800@qq.com>
Co-authored-by: Wu Tianwei <30284043+WTW0313@users.noreply.github.com>
Co-authored-by: QuantumGhost <obelisk.reg+git@gmail.com>
Co-authored-by: lyzno1 <yuanyouhuilyz@gmail.com>
Co-authored-by: quicksand <quicksandzn@gmail.com>
Co-authored-by: Jyong <76649700+JohnJyong@users.noreply.github.com>
Co-authored-by: lyzno1 <92089059+lyzno1@users.noreply.github.com>
Co-authored-by: zxhlyh <jasonapring2015@outlook.com>
Co-authored-by: Yongtao Huang <yongtaoh2022@gmail.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: Joel <iamjoel007@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: nite-knite <nkCoding@gmail.com>
Co-authored-by: Hanqing Zhao <sherry9277@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Harry <xh001x@hotmail.com>
2025-09-18 12:49:10 +08:00
Jiang
b283b10d3e
Fix/lindorm vdb optimize (#25748)
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-09-16 16:54:18 +08:00
Asuka Minato
bdd85b36a4
ruff check preview (#25653)
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-09-16 12:58:12 +08:00
湛露先生
0bbf4fb66a
correct typos . (#25717)
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
2025-09-15 21:22:40 +08:00
-LAN-
bab4975809
chore: add ast-grep rule to convert Optional[T] to T | None (#25560)
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-09-15 13:06:33 +08:00
Ritoban Dutta
67a686cf98
[Chore/Refactor] use __all__ to specify export member. (#25681) 2025-09-15 09:45:35 +08:00
ChasePassion
a3f2c05632
optimize _merge_splits function by using enumerate instead of manual index tracking (#25680)
Co-authored-by: crazywoola <100913391+crazywoola@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-09-15 09:41:16 +08:00
Krito.
a13d7987e0
chore: adopt StrEnum and auto() for some string-typed enums (#25129)
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: crazywoola <100913391+crazywoola@users.noreply.github.com>
2025-09-12 21:14:26 +08:00
kenwoodjw
c91253d05d
fix segment deletion race condition (#24408)
Signed-off-by: kenwoodjw <blackxin55+@gmail.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: crazywoola <100913391+crazywoola@users.noreply.github.com>
2025-09-12 15:29:57 +08:00
Asuka Minato
cbc0e639e4
update sql in batch (#24801)
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: -LAN- <laipz8200@outlook.com>
2025-09-10 13:00:17 +08:00
-LAN-
08dd3f7b50
Fix basedpyright type errors (#25435)
Signed-off-by: -LAN- <laipz8200@outlook.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-09-10 01:54:26 +08:00
Yongtao Huang
2ac7a9c8fc
Chore: thanks to bump-pydantic (#25437) 2025-09-09 20:07:17 +08:00
Asuka Minato
38057b1b0e
add typing to all wraps (#25405)
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-09-09 16:48:33 +08:00
Asuka Minato
f6059ef389
add more typing (#24949)
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-09-08 10:40:00 +08:00
-LAN-
9b8a03b53b
[Chore/Refactor] Improve type annotations in models module (#25281)
Signed-off-by: -LAN- <laipz8200@outlook.com>
Co-authored-by: crazywoola <100913391+crazywoola@users.noreply.github.com>
2025-09-08 09:42:27 +08:00
NeatGuyCoding
afa7228076
fix: a failed index to be marked as created (#25290)
Co-authored-by: crazywoola <100913391+crazywoola@users.noreply.github.com>
2025-09-06 22:53:26 +08:00
Asuka Minato
a78339a040
remove bare list, dict, Sequence, None, Any (#25058)
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: -LAN- <laipz8200@outlook.com>
2025-09-06 03:32:23 +08:00
Yongtao Huang
432f89cf33
Chore: clean some # type: ignore (#25157) 2025-09-05 11:30:04 +08:00
Yoshio Sugiyama
4966e4e1fb
fix: Remove invalid key from firecrawl request payload. (#25190)
Signed-off-by: SUGIYAMA Yoshio <nenegi.01mo@gmail.com>
2025-09-05 10:10:56 +08:00
Davide Delbianco
cdf9b674dc
chore: Bump weaviate-client to latest v3 version (#25096) 2025-09-04 11:15:36 +08:00
-LAN-
53c4a8787f
[Chore/Refactor] Improve type safety and resolve type checking issues (#25104) 2025-09-04 09:35:32 +08:00
GuanMu
ff7a0e3170
fix: improve error logging for vector search operation in MyScale (#25087) 2025-09-03 22:24:45 +08:00
-LAN-
9d5956cef8
[Chore/Refactor] Switch from MyPy to Basedpyright for type checking (#25047)
Signed-off-by: -LAN- <laipz8200@outlook.com>
2025-09-03 11:52:26 +08:00
湛露先生
1fff4620e6
clean console apis and rag cleans. (#25042)
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-09-03 11:25:18 +08:00
Yongtao Huang
bc9efa7ea8
Refactor: use DatasourceType.XX.value instead of hardcoded (#25015)
Signed-off-by: Yongtao Huang <yongtaoh2022@gmail.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-09-03 08:56:48 +08:00
Yongtao Huang
8fcc864fb7
Post fix of #23224 (#25007) 2025-09-02 20:59:08 +08:00
Bowen Liang
7b379e2a61
chore: apply ty checks on api code with script and ci action (#24653) 2025-09-02 16:05:13 +08:00
湛露先生
deea07e905
make clean() function in index_processor_base abstractmethod (#24959)
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-09-02 14:48:45 +08:00
Novice
ca96350707
chore: optimize SQL queries that perform partial full table scans (#24786) 2025-09-02 11:46:11 +08:00
Yongtao Huang
be3af1e234
Migrate SQLAlchemy from 1.x to 2.0 with automated and manual adjustments (#23224)
Co-authored-by: Yongtao Huang <99629139+hyongtao-db@users.noreply.github.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-09-02 10:30:19 +08:00
Frederick2313072
2042353526
fix:score threshold (#24897) 2025-09-02 08:58:14 +08:00
wlleiiwang
9486715929
FEAT: Tencent Vector optimize BM25 initialization to reduce loading time (#24915)
Co-authored-by: wlleiiwang <wlleiiwang@tencent.com>
2025-09-01 21:08:41 +08:00
Frederick2313072
5b3cc560d5
fix:hard-coded top-k fallback issue. (#24879) 2025-09-01 15:46:37 +08:00
willzhao
ffba341258
[CHORE]: remove redundant-cast (#24807) 2025-09-01 14:05:32 +08:00
Bowen Liang
39064197da
chore: cleanup unnecessary mypy suppressions on imports (#24712) 2025-08-28 23:17:25 +08:00
Asuka Minato
4adf85d7d4
example for rm extra cast (#24646) 2025-08-28 09:37:39 +08:00