Commit Graph

500 Commits

Author SHA1 Message Date
Jyong db42f467c8
fix: docx extractor external image failed (#29558) 2025-12-12 13:41:51 +08:00
Nie Ronghua 12e39365fa
perf(core/rag): optimize Excel extractor performance and memory usage (#29551)
Co-authored-by: 01393547 <nieronghua@sf-express.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-12-12 12:15:03 +08:00
Jyong 69a22af1c9
fix: optimize database query when retrieval knowledge in App (#29467) 2025-12-11 13:50:46 +08:00
Jyong 18082752a0
fix knowledge pipeline run multimodal document failed (#29431) 2025-12-10 20:42:51 +08:00
Jyong 784008997b
fix parent-child check when child chunk is not exist (#29426) 2025-12-10 18:45:43 +08:00
Jyong b49e2646ff
fix: session unbound during parent-child retrieval (#29396) 2025-12-10 14:08:55 +08:00
wangxiaolei e205182e1f
fix: Parent instance <DocumentSegment at 0x7955b5572c90> is not bound… (#29377) 2025-12-10 10:01:45 +08:00
Jyong 9affc546c6
Feat/support multimodal embedding (#29115)
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-12-09 14:41:46 +08:00
wangxiaolei ca61bb5de0
fix: Weaviate was not closed properly (#29301) 2025-12-09 10:23:29 +08:00
wangxiaolei 45911ab0af
feat: using charset_normalizer instead of chardet (#29022) 2025-12-05 11:19:19 +08:00
Conner Mo 0af8a7b958
feat: enhance OceanBase vector database with SQL injection fixes, unified processing, and improved error handling (#28951) 2025-12-01 09:51:47 +08:00
Conner Mo acbc886ecd
fix: implement score_threshold filtering for OceanBase vector search (#28536)
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-11-29 18:50:21 +08:00
Eric Guo d7010f582f
Fix 500 error in knowledge base, select weightedScore and click retrieve. (#28586)
Signed-off-by: -LAN- <laipz8200@outlook.com>
Co-authored-by: -LAN- <laipz8200@outlook.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-11-26 16:44:00 +08:00
墨绿色 f76a3f545c
Feat/add weaviate tokenization configurable (#28159)
Co-authored-by: lijiezhao <lijiezhao@perfect99.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-11-25 20:07:45 +08:00
Byron.wang 83702762c8
use no-root user in docker image by default (#26419) 2025-11-25 19:59:45 +08:00
Asuka Minato 751ce4ec41
more typed orm (#28577)
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-11-24 21:01:46 +08:00
GuanMu 5f61ca5e6f
feat: Implement partial update for document metadata, allowing merging of new values with existing ones. (#28390)
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-11-21 12:58:20 +08:00
longbingljw c0b7ffd5d0
feat:mysql adaptation for metadb (#28188) 2025-11-20 09:44:39 +08:00
NeatGuyCoding c74eb4fcf3
minor fix(rag): return early when pushing empty tasks to avoid Redis DataError (#28027)
Signed-off-by: NeatGuyCoding <15627489+NeatGuyCoding@users.noreply.github.com>
2025-11-13 20:18:11 +08:00
李龙飞 81832c14ee
Fix: Correctly handle merged cells in DOCX tables to prevent content duplication and loss (#27871)
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: crazywoola <100913391+crazywoola@users.noreply.github.com>
2025-11-13 15:56:24 +08:00
huangzhuo1949 9843fec393
fix: elasticsearch_vector version (#28028)
Co-authored-by: huangzhuo <huangzhuo1@xiaomi.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: crazywoola <100913391+crazywoola@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-11-10 13:17:13 +09:00
NeatGuyCoding eea713b668
Fix typo in weaviate comment, improve time test precision, and add security tests for get-icon utility (#27919)
Signed-off-by: NeatGuyCoding <15627489+NeatGuyCoding@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-11-10 10:11:54 +08:00
hj24 37903722fe
refactor: implement tenant self queue for rag tasks (#27559)
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: -LAN- <laipz8200@outlook.com>
2025-11-06 21:25:50 +08:00
Boris Polonsky 68d357d7f6
Add WEAVIATE_GRPC_ENDPOINT as designed in weaviate migration guide (#27861)
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-11-05 17:19:08 +08:00
quicksand 41e549af14
fix(weaviate): skip init checks to prevent PyPI requests on each search (#27624)
Co-authored-by: Claude <noreply@anthropic.com>
2025-10-30 09:59:08 +08:00
Tanaka Kisuke 666586b59c
improve opensearch index deletion #27231 (#27336)
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-10-26 23:57:21 +08:00
-LAN- 4a6398fc1f
Fix: surface workflow container LLM usage (#27021) 2025-10-21 16:05:26 +08:00
Dhruv Gorasiya d19c100166
fix: logical error in Weaviate distance calculation (#27019)
Co-authored-by: crazywoola <427733928@qq.com>
Co-authored-by: crazywoola <100913391+crazywoola@users.noreply.github.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-10-17 09:06:50 +08:00
Dhruv Gorasiya a8ad80c405
Fixed Weaviate no module found issue (issue #26938) (#26964)
Co-authored-by: crazywoola <427733928@qq.com>
Co-authored-by: crazywoola <100913391+crazywoola@users.noreply.github.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-10-16 22:41:48 +08:00
Ademílson Tonato a16ef7e73c
refactor: Update Firecrawl to use v2 API (#24734)
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-10-15 10:48:54 +08:00
Dhruv Gorasiya 48c42a9fba
Weaviate update version (#25447)
Co-authored-by: crazywoola <427733928@qq.com>
Co-authored-by: crazywoola <100913391+crazywoola@users.noreply.github.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-10-14 10:39:53 +08:00
minglu7 150a8276b9
fix: avoid closing shared session during embeddings (#26830) 2025-10-13 17:36:00 +08:00
Asuka Minato 24cd7bbc62
fix RetrievalMethod StrEnum (#26768)
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: crazywoola <100913391+crazywoola@users.noreply.github.com>
2025-10-13 10:29:37 +08:00
Guangdong Liu d299e75e1b
refactor: use dynamic max characters for chunking in extractors (#26782) 2025-10-13 10:22:59 +08:00
carribean cbf2ba6cec
Feature integrate alibabacloud mysql vector (#25994)
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: crazywoola <100913391+crazywoola@users.noreply.github.com>
2025-10-11 10:47:28 +08:00
Asuka Minato 1bd621f819
remove .value (#26633)
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-10-11 09:08:29 +08:00
Asuka Minato bb6a331490
change all to httpx (#26119)
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: crazywoola <100913391+crazywoola@users.noreply.github.com>
2025-10-10 23:41:16 +08:00
Asuka Minato ab2eacb6c1
use model_validate (#26182)
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: crazywoola <100913391+crazywoola@users.noreply.github.com>
2025-10-10 17:30:13 +09:00
znn 2b6882bd97
fix chunks 2 (#26623)
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-10-10 16:01:33 +08:00
kenwoodjw 8d803a26eb
fix: duplicate chunks (#26360)
Signed-off-by: kenwoodjw <blackxin55+@gmail.com>
2025-09-30 10:53:55 +08:00
AkisAya 66196459d5
fix db connection error in embed_documents() (#26196) 2025-09-28 13:44:51 +08:00
longbingljw 24b4289d6c
fix:add some explanation for oceanbase parser selection (#26071)
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-09-23 17:06:06 +08:00
Shili Cao 345ac8333c
Add Full-Text & Hybrid Search Support to Baidu Vector DB and Update SDK, Closes #25982 (#25983)
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: crazywoola <100913391+crazywoola@users.noreply.github.com>
2025-09-22 10:17:35 +08:00
longbingljw 208fe3d7de
feat:support selecting different ftparser for OceanBase. (#25970)
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-09-22 09:56:33 +08:00
-LAN- 4ba1292455
refactor: replace print statements with proper logging (#25773) 2025-09-18 20:35:47 +08:00
-LAN- 85cda47c70
feat: knowledge pipeline (#25360)
Signed-off-by: -LAN- <laipz8200@outlook.com>
Co-authored-by: twwu <twwu@dify.ai>
Co-authored-by: crazywoola <100913391+crazywoola@users.noreply.github.com>
Co-authored-by: jyong <718720800@qq.com>
Co-authored-by: Wu Tianwei <30284043+WTW0313@users.noreply.github.com>
Co-authored-by: QuantumGhost <obelisk.reg+git@gmail.com>
Co-authored-by: lyzno1 <yuanyouhuilyz@gmail.com>
Co-authored-by: quicksand <quicksandzn@gmail.com>
Co-authored-by: Jyong <76649700+JohnJyong@users.noreply.github.com>
Co-authored-by: lyzno1 <92089059+lyzno1@users.noreply.github.com>
Co-authored-by: zxhlyh <jasonapring2015@outlook.com>
Co-authored-by: Yongtao Huang <yongtaoh2022@gmail.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: Joel <iamjoel007@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: nite-knite <nkCoding@gmail.com>
Co-authored-by: Hanqing Zhao <sherry9277@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Harry <xh001x@hotmail.com>
2025-09-18 12:49:10 +08:00
Jiang b283b10d3e
Fix/lindorm vdb optimize (#25748)
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-09-16 16:54:18 +08:00
Asuka Minato bdd85b36a4
ruff check preview (#25653)
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-09-16 12:58:12 +08:00
湛露先生 0bbf4fb66a
correct typos . (#25717)
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
2025-09-15 21:22:40 +08:00
-LAN- bab4975809
chore: add ast-grep rule to convert Optional[T] to T | None (#25560)
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-09-15 13:06:33 +08:00