Yongtao Huang
be3af1e234
Migrate SQLAlchemy from 1.x to 2.0 with automated and manual adjustments ( #23224 )
Co-authored-by: Yongtao Huang <99629139+hyongtao-db@users.noreply.github.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-09-02 10:30:19 +08:00
Frederick2313072
2042353526
fix:score threshold ( #24897 )
2025-09-02 08:58:14 +08:00
wlleiiwang
9486715929
FEAT: Tencent Vector optimize BM25 initialization to reduce loading time ( #24915 )
Co-authored-by: wlleiiwang <wlleiiwang@tencent.com>
2025-09-01 21:08:41 +08:00
Frederick2313072
5b3cc560d5
fix:hard-coded top-k fallback issue. ( #24879 )
2025-09-01 15:46:37 +08:00
willzhao
ffba341258
[CHORE]: remove redundant-cast ( #24807 )
2025-09-01 14:05:32 +08:00
-LAN-
dce4d0ff80
Merge remote-tracking branch 'origin/main' into feat/queue-based-graph-engine
2025-08-29 13:22:13 +08:00
Bowen Liang
39064197da
chore: cleanup unnecessary mypy suppressions on imports ( #24712 )
2025-08-28 23:17:25 +08:00
Asuka Minato
4adf85d7d4
example for rm extra cast ( #24646 )
2025-08-28 09:37:39 +08:00
Asuka Minato
d2f234757b
example try rm ignore ( #24649 )
2025-08-28 09:36:16 +08:00
-LAN-
8c35663220
feat: queue-based graph engine
Signed-off-by: -LAN- <laipz8200@outlook.com>
2025-08-27 15:33:28 +08:00
Yongtao Huang
826f19e968
Chore : rm dead code detected by pylance ( #24588 )
2025-08-27 13:19:40 +08:00
Petrus Han
d9e26eba65
fix: rag/milvus clarify full-text search warning with actionable guidance ( #24570 )
2025-08-26 23:32:26 +08:00
Yongtao Huang
fa753239ad
Refactor: use logger = logging.getLogger(__name__) in logging ( #24515 )
Co-authored-by: Yongtao Huang <99629139+hyongtao-db@users.noreply.github.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: crazywoola <100913391+crazywoola@users.noreply.github.com>
2025-08-26 18:10:31 +08:00
GuanMu
47f480c0dc
fix: unify log format, use placeholders instead of string concatenation ( #24544 )
2025-08-26 15:45:16 +08:00
huangzhuo1949
98473e9d4f
fix:external dataset weight rerank bug ( #24533 )
Co-authored-by: huangzhuo <huangzhuo1@xiaomi.com>
2025-08-26 14:54:40 +08:00
-LAN-
04954918a5
Merge commit from fork
* fix(oraclevector): SQL Injection
Signed-off-by: -LAN- <laipz8200@outlook.com>
* fix(oraclevector): Remove bind variables from FETCH FIRST clause
Oracle doesn't support bind variables in the FETCH FIRST clause.
Fixed by using validated integers directly in the SQL string while
maintaining proper input validation to prevent SQL injection.
- Updated search_by_vector method to use validated top_k directly
- Updated search_by_full_text method to use validated top_k directly
- Adjusted parameter numbering for document_ids_filter placeholders
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
---------
Signed-off-by: -LAN- <laipz8200@outlook.com>
Co-authored-by: Claude <noreply@anthropic.com>
2025-08-26 13:51:23 +08:00
huayaoyue6
23dcb2dc1b
fix(vector): use semantic version comparison for version check ( #24409 )
2025-08-24 21:04:33 +08:00
-LAN-
da9af7b547
[Chore/Refactor] Use centralized naive_utc_now for UTC datetime operations ( #24352 )
Signed-off-by: -LAN- <laipz8200@outlook.com>
2025-08-22 23:53:05 +08:00
Asuka Minato
51cc2bf429
example of next(, None) ( #24345 )
2025-08-22 18:32:22 +08:00
willzhao
5ab6bc283c
[CHORE]: x: T = None to x: Optional[T] = None ( #24217 )
2025-08-21 21:58:39 +08:00
Yongtao Huang
106ab7f2a8
Fix: safe defaults for BaseModel dict fields ( #24098 )
Co-authored-by: Yongtao Huang <99629139+hyongtao-db@users.noreply.github.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-08-21 21:38:55 +08:00
Guangdong Liu
1abf1240b2
refactor: replace try-except blocks with contextlib.suppress for cleaner exception handling ( #24284 )
2025-08-21 18:18:49 +08:00
8bitpd
a183b2affb
fix: rollback when AnalyticDB create zhparser failed ( #24260 )
Co-authored-by: xiaozeyu <xiaozeyu.xzy@alibaba-inc.com>
2025-08-21 15:00:26 +08:00
Amy
738aaee101
fix(api):Fix the issue of empty and not empty operations failing in k… ( #24276 )
Co-authored-by: liumin <min.liu@tongdun.net>
2025-08-21 14:43:08 +08:00
8bitpd
6b1606f4f4
fix: keep idempotent when init AnalyticdbVectorBySql ( #24239 )
Co-authored-by: xiaozeyu <xiaozeyu.xzy@alibaba-inc.com>
2025-08-20 23:22:27 +08:00
yihong
4c1ad40f8e
docs: format all md files ( #24195 )
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2025-08-20 13:49:11 +08:00
He Wang
670d479e32
Bump pyobvector to 0.2.15 ( #24120 )
2025-08-18 17:36:27 +08:00
crazywoola
8288b1dcab
Revert "fix pg_vector extension requires SUPERUSER, but not availabl… ( #24108 )
2025-08-18 16:46:15 +08:00
Elvis_LEE
16d1289a0a
fix pg_vector extension requires SUPERUSER, but not available on Huawei Cloud RDS ( #24093 )
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-08-18 16:29:36 +08:00
Bo Wu
790a6ec203
fix: return empty list instead of raising exception for qdrant search when score_threshold is 1 ( #24032 )
2025-08-18 12:44:05 +08:00
-LAN-
e340fccafb
feat: integrate flask-orjson for improved JSON serialization performance ( #23935 )
2025-08-14 19:50:59 +08:00
engchina
7566d90dfe
fix issue #23758 ( #23764 )
Co-authored-by: root <root@thinkpad-pc.localdomain>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-08-12 10:26:13 +08:00
yunqiqiliang
14e1c16cf2
Fix ClickZetta stability and reduce logging noise ( #23632 )
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-08-08 22:57:47 +08:00
湛露先生
fd536a943a
word extractor cleans. ( #20926 )
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
2025-08-08 09:37:51 +08:00
yunqiqiliang
62772e8871
fix: ensure vector database cleanup on dataset deletion regardless of document presence (affects all 33 vector databases) ( #23574 )
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-08-08 09:18:43 +08:00
Qiang Lee
e9045a8838
Fix: Apply Metadata Filters Correctly in Full-Text Search Mode for Tencent Cloud Vector Database ( #23564 )
2025-08-07 05:36:06 -07:00
yunqiqiliang
e01510e2a6
feat: Add Clickzetta Lakehouse vector database integration ( #22551 )
Co-authored-by: Claude <noreply@anthropic.com>
2025-08-07 14:21:46 +08:00
Yongtao Huang
6b8b31ff64
Remove unnecessary issubclass check ( #23455 )
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-08-06 13:43:55 +08:00
Yongtao Huang
406c1952b8
Fix version comparison with imported_version ( #23326 )
Signed-off-by: Yongtao Huang <yongtaoh2022@gmail.com>
2025-08-04 10:40:49 +08:00
wanttobeamaster
da5c003f97
chore: tablestore full text search support score normalization ( #23255 )
Co-authored-by: xiaozhiqing.xzq <xiaozhiqing.xzq@alibaba-inc.com>
2025-08-01 14:14:11 +08:00
Aurelius Huang
ffddabde43
feat(notion): Notion Database extracts Rows content `in row order` and appends `Row Page URL` ( #22646 )
Co-authored-by: Aurelius Huang <cm.huang@aftership.com>
2025-07-30 21:35:20 +08:00
kenwoodjw
28478cdc41
feat: support metadata condition filter string array ( #23111 )
Signed-off-by: kenwoodjw <blackxin55+@gmail.com>
2025-07-30 16:13:45 +08:00
rhochman
eee576355b
Fix: Support for Elasticsearch Cloud Connector ( #23017 )
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: crazywoola <100913391+crazywoola@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-07-30 11:12:16 +08:00
Yongtao Huang
1c05491f1c
Chore: remove duplicate TYPE_CHECKING import ( #23013 )
Signed-off-by: Yongtao Huang <yongtaoh2022@gmail.com>
2025-07-28 10:04:45 +08:00
Asuka Minato
a189d293f8
make logging not use f-str, change others to f-str ( #22882 )
2025-07-25 10:32:48 +08:00
Asuka Minato
ef51678c73
orm filter -> where ( #22801 )
Signed-off-by: -LAN- <laipz8200@outlook.com>
Co-authored-by: -LAN- <laipz8200@outlook.com>
Co-authored-by: Claude <noreply@anthropic.com>
2025-07-24 00:57:45 +08:00
wanttobeamaster
8278b39f85
fix tablestore full text search bug ( #22853 )
2025-07-23 19:31:47 +08:00
wanttobeamaster
1c3c40db69
fix: tablestore TypeError when vector is missing ( #22843 )
Co-authored-by: xiaozhiqing.xzq <xiaozhiqing.xzq@alibaba-inc.com>
2025-07-23 18:59:16 +08:00
wlleiiwang
b4e152f775
FEAT: Tencent Vector search supports backward compatibility with the previous score calculation approach. ( #22820 )
Co-authored-by: wlleiiwang <wlleiiwang@tencent.com>
2025-07-23 15:38:31 +08:00
Asuka Minato
6d3e198c3c
Mapped column ( #22644 )
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-07-23 00:39:59 +08:00
wanttobeamaster
a2048fd0f4
fix: tablestore vdb support metadata filter ( #22774 )
Co-authored-by: xiaozhiqing.xzq <xiaozhiqing.xzq@alibaba-inc.com>
2025-07-22 16:48:59 +08:00
issac2e
58d92970a9
Optimize tencent_vector knowledge base deletion error handling with batch processing support ( #22726 )
Co-authored-by: liuchen15 <liuchen15@gaotu.cn>
Co-authored-by: crazywoola <427733928@qq.com>
2025-07-22 08:21:41 +08:00
uply23333
ab012fe1a2
fix: improve document filtering in full text search(elasticsearch) ( #22683 )
2025-07-21 15:59:37 +08:00
8bitpd
9251a66a10
fix: update analyticdb vector to do filter by metadata ( #22698 )
Co-authored-by: xiaozeyu <xiaozeyu.xzy@alibaba-inc.com>
2025-07-21 15:03:37 +08:00
znn
ed263aed9f
fix text splitter ( #22596 )
2025-07-18 13:51:58 +08:00
-LAN-
460a825ef1
refactor: decouple Node and NodeData ( #22581 )
Signed-off-by: -LAN- <laipz8200@outlook.com>
Co-authored-by: QuantumGhost <obelisk.reg+git@gmail.com>
2025-07-18 10:08:51 +08:00
helojo
e7d80bf7bf
Fix: the pict type picture was not processed in the docx ( #19305 )
Co-authored-by: zqgame <zqgame@zqgame.local>
2025-07-17 22:53:35 +08:00
yihong
d2933c2bfe
fix: drop dead code phase2 unused class ( #22042 )
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2025-07-17 09:33:07 +08:00
wanttobeamaster
bf7b2c339b
tablestore vector support more method ( #22225 )
Co-authored-by: xiaozhiqing.xzq <xiaozhiqing.xzq@alibaba-inc.com>
2025-07-15 09:58:48 +08:00
Jacky Wu
3e96c0c468
fix: close session before doing long latency operation ( #22306 )
2025-07-14 15:16:10 +08:00
luckylhb90
a371390d6c
optimize: batch embedding and qdrant write_consistency_factor parameter ( #21776 )
Co-authored-by: hobo.l <hobo.l@binance.com>
2025-07-10 10:16:59 +08:00
wlleiiwang
89b52471fb
Optimize the memory usage of Tencent Vector Database ( #22079 )
Co-authored-by: wlleiiwang <wlleiiwang@tencent.com>
2025-07-09 15:53:06 +08:00
baonudesifeizhai
1c7404099d
fix: prevent timeout in file encoding detection for large files ( #21453 )
Co-authored-by: crazywoola <427733928@qq.com>
2025-07-03 17:06:49 +08:00
efrey kong
826bf25abf
Fix: prevent SQL errors when metadata filter Constant value is None or blank ( #21803 )
2025-07-02 14:43:01 +08:00
Dongyu Li
00f0b569cc
Feat/kb index ( #20868 )
Co-authored-by: twwu <twwu@dify.ai>
2025-06-25 17:52:59 +08:00
Jin
3e7f8bad56
fix: markdown_extractor lost chunks if it starts without a header( #21308 ) ( #21309 )
2025-06-21 23:10:00 +08:00
LiuBo
17fe62cf91
feat: add support for Matrixone database ( #20714 )
2025-06-19 10:20:12 +08:00
NeatGuyCoding
9835730278
Translation fix ( #21194 )
2025-06-19 09:36:56 +08:00
NeatGuyCoding
2eae7503e1
Minor Improvements for File Validation and Configuration Handling #21179 ( #21171 )
Co-authored-by: tech <cto@sb>
2025-06-18 18:33:28 +08:00
Ademílson Tonato
9e73e8b9e8
feat: add search endpoint for Firecrawl Integration ( #20521 )
Co-authored-by: crazywoola <427733928@qq.com>
2025-06-18 14:37:03 +08:00
Rain Wang
47e0f92c0f
Fixes #20748 KnowledgeRetrievalNode return all external documents when reranker disabled even top-k configed ( #20762 )
2025-06-18 14:35:12 +08:00
kazuya-awano
45c89bd6de
feat: add pagination to notion extractor ( #20919 )
2025-06-18 11:30:55 +08:00
kurokobo
4689e8953e
fix: shorten connection timeout to pypi.org for deprecation check for weaviate client ( #21131 )
2025-06-18 09:25:52 +08:00
Bowen Liang
366ddb05ae
test: run vdb test of oceanbase with docker compose in CI tests ( #20945 )
2025-06-16 11:05:19 +08:00
Bowen Liang
0f3d4d0b6e
chore: bump mypy to 1.16 ( #20608 )
2025-06-11 01:01:33 +08:00
QuantumGhost
c439e82038
refactor(api): Decouple `ParameterExtractorNode` from `LLMNode` ( #20843 )
- Extract methods used by `ParameterExtractorNode` from `LLMNode` into a separate file.
- Convert `ParameterExtractorNode` into a subclass of `BaseNode`.
- Refactor code referencing the extracted methods to ensure functionality and clarity.
- Fixes the issue that `ParameterExtractorNode` returns error when executed.
- Fix relevant test cases.
Closes #20840 .
2025-06-10 11:47:50 +08:00
yihong
65c7c01d90
fix: clean up two unreachable code ( #20773 )
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2025-06-07 23:06:46 +08:00
jefferyvvv
37c3283450
fix: opensearch vector search falls back to keyword search ( #20723 )
Co-authored-by: wenjun.gu <wenjun.gu@envision-energy.com>
2025-06-06 16:29:15 +08:00
jefferyvvv
4271602cfc
fix: opensearch metadata filtering returns empty ( #20701 )
Co-authored-by: wenjun.gu <wenjun.gu@envision-energy.com>
Co-authored-by: crazywoola <427733928@qq.com>
2025-06-06 09:10:01 +08:00
jefferyvvv
138ad6e8b3
fix: opensearch fulltext search with metadata filtering dsl error ( #20702 )
Co-authored-by: wenjun.gu <wenjun.gu@envision-energy.com>
2025-06-05 23:09:00 +08:00
kenwoodjw
01d500db14
fix: autocorrect everything in web ( #20605 )
Signed-off-by: kenwoodjw <blackxin55+@gmail.com>
2025-06-04 14:12:24 +08:00
zhaobingshuang
3f7aa38d77
fix : #20560 When elasticsearch is used as the vector database, the Retrieval Test fails to filter the data after setting the Score Threshold, and the score of the recalled results is empty ( #20561 )
2025-06-03 13:24:26 +08:00
Cheney Zhang
b4b59148dc
check zilliz cloud of full-text search ( #20519 )
2025-06-02 18:04:13 +08:00
Dongyu Li
1ea4459d9f
update knowledge base api ( #20426 )
2025-05-30 14:45:30 +08:00
-LAN-
a6ea15e63c
Refactor/message cycle manage and knowledge retrieval ( #20460 )
Signed-off-by: -LAN- <laipz8200@outlook.com>
2025-05-30 14:36:44 +08:00
yihong
5a991295e0
fix: drop some type fixme ( #20344 )
2025-05-30 14:10:09 +08:00
-LAN-
482e50aae9
Refactor/remove db from cycle manager ( #20455 )
Signed-off-by: -LAN- <laipz8200@outlook.com>
2025-05-30 04:34:13 +08:00
rouxiaomin
4c4887c5fc
feat(qdrant):add replication_factor when create collection in qdrant ( #20133 )
Co-authored-by: 刘敏 <min.liu@tongdun.net>
2025-05-27 14:46:04 +08:00
He Huang
6f48af2610
Refactor OpenSearch config to separate use_ssl and verify_certs flags ( #20075 )
Co-authored-by: he.huang <he.huang1@outlook.com>
Co-authored-by: crazywoola <427733928@qq.com>
2025-05-22 10:14:38 +08:00
wlleiiwang
7d230acf40
tencent vectordb compatible with version 1.1.3 and below ( #20056 )
Co-authored-by: wlleiiwang <wlleiiwang@tencent.com>
2025-05-21 20:24:05 +08:00
-LAN-
3196dc2d61
refactor: Use typed SQLAlchemy base model and fix type errors ( #19980 )
Signed-off-by: -LAN- <laipz8200@outlook.com>
2025-05-21 15:38:03 +08:00
Darlewo
8cb3b4aef2
fix: multiple retrieve reranking_enabled switch ( #19958 )
2025-05-20 15:22:03 +08:00
Amir Mohsen Asaran
c9ee60e197
Feat(WaterCrawl error handling): add custom exceptions and error handling ( #19948 )
2025-05-20 10:25:16 +08:00
-LAN-
4977bb21ec
feat(workflow): domain model for workflow node execution ( #19430 )
Signed-off-by: -LAN- <laipz8200@outlook.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-05-17 00:56:16 +08:00
k-kayashima
b292990075
Fix: Ensure unique index names for pgvector knowledge tables ( #19672 )
Co-authored-by: crazywoola <427733928@qq.com>
2025-05-15 11:43:44 +08:00
非法操作
085bd1aa93
chore: model.query change to db.session.query ( #19551 )
Co-authored-by: QuantumGhost <obelisk.reg+git@gmail.com>
2025-05-13 09:13:12 +08:00
非法操作
14cd71ed0a
chore: all model.query replace to db.session.query ( #19521 )
2025-05-12 15:19:41 +08:00
非法操作
b00f94df64
fix: replace all dataset.Model.query to db.session.query(Model) ( #19509 )
2025-05-12 13:52:33 +08:00
湛露先生
1119790b02
clean rag word_extractor. ( #19397 )
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
2025-05-09 16:39:16 +08:00
Will
bfa652f2d0
fix: metadata filtering condition variable unassigned; fix External K… ( #19208 )
2025-05-07 14:52:09 +08:00