Ademílson Tonato
9e73e8b9e8
feat: add search endpoint for Firecrawl Integration ( #20521 )
...
Co-authored-by: crazywoola <427733928@qq.com>
2025-06-18 14:37:03 +08:00
Rain Wang
47e0f92c0f
Fixes #20748: KnowledgeRetrievalNode returns all external documents when reranker is disabled even if top-k is configured ( #20762 )
2025-06-18 14:35:12 +08:00
kazuya-awano
45c89bd6de
feat: add pagination to notion extractor ( #20919 )
2025-06-18 11:30:55 +08:00
kurokobo
4689e8953e
fix: shorten connection timeout to pypi.org for deprecation check for weaviate client ( #21131 )
2025-06-18 09:25:52 +08:00
Bowen Liang
366ddb05ae
test: run vdb test of oceanbase with docker compose in CI tests ( #20945 )
2025-06-16 11:05:19 +08:00
Bowen Liang
0f3d4d0b6e
chore: bump mypy to 1.16 ( #20608 )
2025-06-11 01:01:33 +08:00
QuantumGhost
c439e82038
refactor(api): Decouple `ParameterExtractorNode` from `LLMNode` ( #20843 )
...
- Extract methods used by `ParameterExtractorNode` from `LLMNode` into a separate file.
- Convert `ParameterExtractorNode` into a subclass of `BaseNode`.
- Refactor code referencing the extracted methods to ensure functionality and clarity.
- Fixes the issue that `ParameterExtractorNode` returns error when executed.
- Fix relevant test cases.
Closes #20840 .
2025-06-10 11:47:50 +08:00
yihong
65c7c01d90
fix: clean up two pieces of unreachable code ( #20773 )
...
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2025-06-07 23:06:46 +08:00
jefferyvvv
37c3283450
fix: opensearch vector search falls back to keyword search ( #20723 )
...
Co-authored-by: wenjun.gu <wenjun.gu@envision-energy.com>
2025-06-06 16:29:15 +08:00
jefferyvvv
4271602cfc
fix: opensearch metadata filtering returns empty ( #20701 )
...
Co-authored-by: wenjun.gu <wenjun.gu@envision-energy.com>
Co-authored-by: crazywoola <427733928@qq.com>
2025-06-06 09:10:01 +08:00
jefferyvvv
138ad6e8b3
fix: opensearch fulltext search with metadata filtering dsl error ( #20702 )
...
Co-authored-by: wenjun.gu <wenjun.gu@envision-energy.com>
2025-06-05 23:09:00 +08:00
kenwoodjw
01d500db14
fix: autocorrect everything in web ( #20605 )
...
Signed-off-by: kenwoodjw <blackxin55+@gmail.com>
2025-06-04 14:12:24 +08:00
zhaobingshuang
3f7aa38d77
fix: #20560 When Elasticsearch is used as the vector database, the Retrieval Test fails to filter the data after setting the Score Threshold, and the score of the recalled results is empty ( #20561 )
2025-06-03 13:24:26 +08:00
Cheney Zhang
b4b59148dc
check zilliz cloud of full-text search ( #20519 )
2025-06-02 18:04:13 +08:00
Dongyu Li
1ea4459d9f
update knowledge base api ( #20426 )
2025-05-30 14:45:30 +08:00
-LAN-
a6ea15e63c
Refactor/message cycle manage and knowledge retrieval ( #20460 )
...
Signed-off-by: -LAN- <laipz8200@outlook.com>
2025-05-30 14:36:44 +08:00
yihong
5a991295e0
fix: drop some type fixme ( #20344 )
2025-05-30 14:10:09 +08:00
-LAN-
482e50aae9
Refactor/remove db from cycle manager ( #20455 )
...
Signed-off-by: -LAN- <laipz8200@outlook.com>
2025-05-30 04:34:13 +08:00
rouxiaomin
4c4887c5fc
feat(qdrant):add replication_factor when create collection in qdrant ( #20133 )
...
Co-authored-by: 刘敏 <min.liu@tongdun.net>
2025-05-27 14:46:04 +08:00
He Huang
6f48af2610
Refactor OpenSearch config to separate use_ssl and verify_certs flags ( #20075 )
...
Co-authored-by: he.huang <he.huang1@outlook.com>
Co-authored-by: crazywoola <427733928@qq.com>
2025-05-22 10:14:38 +08:00
wlleiiwang
7d230acf40
tencent vectordb compatible with version 1.1.3 and below ( #20056 )
...
Co-authored-by: wlleiiwang <wlleiiwang@tencent.com>
2025-05-21 20:24:05 +08:00
-LAN-
3196dc2d61
refactor: Use typed SQLAlchemy base model and fix type errors ( #19980 )
...
Signed-off-by: -LAN- <laipz8200@outlook.com>
2025-05-21 15:38:03 +08:00
Darlewo
8cb3b4aef2
fix: multiple retrieve reranking_enabled switch ( #19958 )
2025-05-20 15:22:03 +08:00
Amir Mohsen Asaran
c9ee60e197
Feat(WaterCrawl error handling): add custom exceptions and error handling ( #19948 )
2025-05-20 10:25:16 +08:00
-LAN-
4977bb21ec
feat(workflow): domain model for workflow node execution ( #19430 )
...
Signed-off-by: -LAN- <laipz8200@outlook.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-05-17 00:56:16 +08:00
k-kayashima
b292990075
Fix: Ensure unique index names for pgvector knowledge tables ( #19672 )
...
Co-authored-by: crazywoola <427733928@qq.com>
2025-05-15 11:43:44 +08:00
非法操作
085bd1aa93
chore: model.query change to db.session.query ( #19551 )
...
Co-authored-by: QuantumGhost <obelisk.reg+git@gmail.com>
2025-05-13 09:13:12 +08:00
非法操作
14cd71ed0a
chore: all model.query replace to db.session.query ( #19521 )
2025-05-12 15:19:41 +08:00
非法操作
b00f94df64
fix: replace all dataset.Model.query to db.session.query(Model) ( #19509 )
2025-05-12 13:52:33 +08:00
湛露先生
1119790b02
clean rag word_extractor. ( #19397 )
...
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
2025-05-09 16:39:16 +08:00
Will
bfa652f2d0
fix: metadata filtering condition variable unassigned; fix External K… ( #19208 )
2025-05-07 14:52:09 +08:00
Hao Cheng
9bcf837f17
fix: use only supported operators in metadata filter system prompts ( #19195 )
2025-05-03 20:08:08 +08:00
Will
a212a63e6a
fix: time type metadata filtering error ( #19192 )
2025-05-03 20:07:37 +08:00
Bowen Liang
12c96b93d9
immediately return initialized tiktokenizer instance and remove dead code in usage of tiktokenizer ( #17957 )
2025-04-30 16:07:20 +08:00
QuantumGhost
bd1bbfee4b
Enhance Code Consistency Across Repository with `.editorconfig` ( #19023 )
2025-04-29 18:04:33 +08:00
Ahmad Zidan
8266815cda
feat: add AWS Managed IAM auth for OpenSearch vector DB ( #18963 )
2025-04-29 15:10:08 +08:00
Ethan
8b4ea01810
feat: support access milvus with token ( #19034 )
2025-04-29 14:52:13 +08:00
Panpan
83187b30c0
fix: fix rerank model runner usage ( #19008 )
2025-04-29 14:51:21 +08:00
Wesley
b62eb61400
fix depth param issue for WaterCrawl ( #18839 )
2025-04-27 11:04:56 +08:00
Jiang
37e2f73909
[Lindorm VDB] Add the QUERY_TIMEOUT parameter to force the search query to fail. ( #18613 )
...
Co-authored-by: jiangzhijie <jiangzhijie.jzj@alibaba-inc.com>
2025-04-25 09:42:58 +08:00
王晓阳
0babdffe3e
feat: support vastbase vector database ( #16308 )
2025-04-24 18:04:57 +08:00
Jyong
e2cb7006c4
check metadata_filtering_conditions could be None in auto mode ( #18548 )
2025-04-22 17:09:33 +08:00
lauding
eb1ce3dd6b
feat: support huawei cloud vector database ( #16141 )
2025-04-22 13:03:35 +08:00
tmuife
7b6523e54d
Update Oracle db connection library and change connection pool to single connection ( #18466 )
2025-04-21 17:56:57 +08:00
Rain Wang
d2e3744ca3
Switching from CONSOLE_API_URL to FILES_URL in word_extractor.py ( #18249 )
2025-04-18 16:05:48 +08:00
Rain Wang
83f1aeec1d
Fix ORDER BY (score, id) error in api/core/rag/datasource/vdb/analyticdb/analyticdb_vector_sql.py line 249 ( #18252 )
2025-04-17 14:15:05 +08:00
Rain Wang
e8d98e3d89
Add analyzer_params config for milvus vectordb ( #18180 )
2025-04-17 10:38:56 +08:00
Jyong
95283b4dd3
Feat/change split length method ( #18097 )
...
Co-authored-by: JzoNg <jzongcode@gmail.com>
2025-04-16 12:28:22 +08:00
YANG
d119c7d629
ignore errors when creating duplicate indexes ( #18069 )
...
Co-authored-by: 璟义 <yangshangpo.ysp@alibaba-inc.com>
2025-04-15 15:48:16 +08:00
Jasonfish
1f722cde22
fix(api): Some params were ignored when creating empty Datasets through API ( #17932 )
2025-04-14 10:24:01 +08:00
Yongtao Huang
5d72003ebb
Remove dead code ( #17899 )
2025-04-11 20:33:52 +08:00
briqt
91cfa90503
Fix external knowledge Issues: ( #17685 ) ( #17843 )
2025-04-11 15:37:27 +08:00
yihong
f04d52c044
fix: autocorrect everything in api ( #17859 )
...
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2025-04-11 15:24:39 +08:00
wlleiiwang
9d20561af4
create db if not exists ( #17796 )
...
Co-authored-by: wlleiiwang <wlleiiwang@tencent.com>
2025-04-10 18:03:22 +08:00
Jyong
8b3be4224d
revert batch query ( #17707 )
2025-04-09 20:25:36 +08:00
wlleiiwang
f148f1efa2
fix: Check collection exists before drop it. ( #17692 )
...
Co-authored-by: wlleiiwang <wlleiiwang@tencent.com>
2025-04-09 19:14:32 +08:00
Jyong
abfcd9d3b6
fix segment query index not taking effect ( #17704 )
2025-04-09 19:09:08 +08:00
Han
f1e4d5ed6c
Fix Performance Issues: ( #17083 )
...
Co-authored-by: Wang Han <wanghan@zhejianglab.org>
2025-04-09 11:22:53 +08:00
Steven Li
abead647e2
fix: Extract docx file fails when the file contains an invalid link ( #17576 )
2025-04-08 13:59:33 +08:00
Amir Mohsen Asaran
f54905e685
feat: Integrate WaterCrawl.dev as a new knowledge base provider ( #16396 )
...
Co-authored-by: crazywoola <427733928@qq.com>
2025-04-07 12:43:23 +08:00
wlleiiwang
42a42a7962
FEAT: support Tencent vectordb to full text search ( #16865 )
...
Co-authored-by: wlleiiwang <wlleiiwang@tencent.com>
2025-04-07 09:50:03 +08:00
crazywoola
3e698074e7
Fix/17466 cannot create a knowledge base by adding files ( #17470 )
2025-04-06 00:03:05 +08:00
Panpan
fc3f14c0ee
fix: keep image url ( #17430 )
2025-04-04 15:55:48 +08:00
Perfecto
16c722d1d8
fix: move hardcoded text to language settings ( #16990 ) ( #17133 )
2025-04-02 22:35:51 +08:00
Jyong
6104b91d3f
add doc support in knowledge base for unstructured ( #17352 )
2025-04-02 21:35:01 +08:00
Jiang
fd1e40d22e
Lindorm VDB bugfix ( #17357 )
...
Co-authored-by: jiangzhijie <jiangzhijie.jzj@alibaba-inc.com>
2025-04-02 21:31:59 +08:00
Jiang
ff388fe3e6
optimize lindorm vdb add_texts ( #17212 )
...
Co-authored-by: jiangzhijie <jiangzhijie.jzj@alibaba-inc.com>
2025-04-01 11:06:35 +08:00
非法操作
44f911a0a8
chore: docstring not match the function parameter ( #17162 )
2025-03-31 13:19:15 +08:00
jiangbo721
a1aa325ce3
Chore/code format and Repair commit_id 3254018d more deleted codes and Fix naming error ambiguity between workflow_run_id and workflow_id ( #17075 )
...
Co-authored-by: 刘江波 <jiangbo721@163.com>
2025-03-29 14:15:53 +08:00
wanttobeamaster
7f70cadacb
feat: support Tablestore vector database ( #16601 )
...
Co-authored-by: xiaozhiqing.xzq <xiaozhiqing.xzq@alibaba-inc.com>
2025-03-27 15:53:33 +08:00
wlleiiwang
a743d5dc71
feat: tencent vectordb: use grpc client and set upsert batch size ( #16016 )
...
Co-authored-by: wlleiiwang <wlleiiwang@tencent.com>
2025-03-27 12:20:16 +08:00
Jyong
30792a1e1a
install pandoc ( #16825 )
2025-03-26 22:34:10 +08:00
yourchanges
59a86dabee
fix: fix missing oceanbase config enable_hybrid_search init ( #16852 )
...
Co-authored-by: 李远军 <4842@9ji.com>
2025-03-26 21:15:54 +08:00
Jyong
6a857e01f6
fix multiple metadata filter's confusing setting ( #16771 )
2025-03-26 14:16:21 +08:00
taokuizu
0c2a459c30
fix typo in _process_metadata_filter_func ( #16780 )
2025-03-26 09:01:41 +08:00
Jyong
2174225259
fix milvus filter search ( #16725 )
2025-03-25 16:22:43 +08:00
hsiong
6157f57872
feat: Add OceanBase hybrid search features ( #16652 )
...
Co-authored-by: 李远军 <4842@9ji.com>
Co-authored-by: yourchanges <yourchanges@gmail.com>
2025-03-25 14:32:00 +08:00
kenwoodjw
a113356695
fix: pgvector metadata filter ( #16688 )
...
Signed-off-by: kenwoodjw <blackxin55+@gmail.com>
2025-03-25 11:34:33 +08:00
Jiang
fc8c765215
Fix/vdb lindorm ( #16660 )
...
Co-authored-by: jiangzhijie <jiangzhijie.jzj@alibaba-inc.com>
2025-03-25 09:19:06 +08:00
Jyong
86a1859d02
Metadata variable value fix ( #16665 )
2025-03-25 09:07:11 +08:00
Jyong
1be0d26c1f
fix metadata filter not taking effect in keyword-search and fulltext-search ( #16644 )
2025-03-24 18:35:16 +08:00
chenhuan0728
770c461a8f
feat: add openGauss PQ acceleration feature ( #16432 )
...
Co-authored-by: chenhuan <huan.chen0728@foxmail>
2025-03-24 15:16:40 +08:00
Jyong
d135677c25
add vdb document id index ( #16244 )
...
Co-authored-by: crazywoola <427733928@qq.com>
2025-03-20 01:38:15 +08:00
Jyong
a8879057c0
fix tidb metadata filter ( #16237 )
2025-03-19 19:44:56 +08:00
Jyong
81325df368
fix weaviate metadata filter ( #16230 )
2025-03-19 18:26:53 +08:00
Jyong
b8ef3149ef
metadata expect value check error ( #16210 )
2025-03-19 17:48:01 +08:00
Jyong
c3c957bb80
change recreate_collection function to create_collection ( #16212 )
2025-03-19 17:13:08 +08:00
Jyong
abeaea4f79
Support knowledge metadata filter ( #15982 )
2025-03-18 16:42:19 +08:00
Jyong
33ba7e659b
fix vector db sql injection ( #16096 )
2025-03-18 15:07:29 +08:00
LittleFish-15
223ab5a38f
feat: support openGauss vector database ( #15865 )
2025-03-17 19:42:54 +08:00
huangzhuo1949
695a7400a9
fix:delete empty table bug ( #15517 )
...
Co-authored-by: huangzhuo <huangzhuo1@xiaomi.com>
2025-03-17 10:53:26 +08:00
Jyong
84a866028a
fix document could be None ( #15818 )
2025-03-14 16:40:01 +08:00
Yuichiro Utsumi
5f9d236d22
Feat: Add pg_bigm for keyword search in pgvector ( #13876 )
...
Signed-off-by: Yuichiro Utsumi <utsumi.yuichiro@fujitsu.com>
2025-03-13 16:32:34 +08:00
Jyong
a8e8c37fdd
improve text split ( #15719 )
2025-03-13 15:29:33 +08:00
kenwoodjw
087bb60b31
fix: preserve Unicode characters in keyword search queries ( #15522 )
...
Signed-off-by: kenwoodjw <blackxin55+@gmail.com>
2025-03-12 18:34:42 +08:00
Jyong
f77f7e1437
fix text split ( #15426 )
2025-03-11 00:24:27 +08:00
Jyong
435564f0f2
fix parent-child retrieval count ( #15119 )
2025-03-06 22:32:38 +08:00
engchina
9c1db7dca7
modify oracle lexer name Fixes #15106 ( #15108 )
...
Co-authored-by: engchina <atjapan2015@gmail.com>
2025-03-06 18:58:51 +08:00
llinvokerl
d04f40c274
Fix empty results issue in full-text search with Milvus vector database ( #14885 )
...
Co-authored-by: liusurong.lsr <liusurong.lsr@alibaba-inc.com>
2025-03-05 12:27:01 +08:00
engchina
c8de30f3d9
feat: support oracle oci autonomous database. Fixes #14792 and Fixes #14628 . ( #14804 )
...
Co-authored-by: engchina <atjapan2015@gmail.com>
2025-03-04 09:22:04 +08:00
yuhaowin
1e3197a1ea
Fixes 14217: database retrieve api and chat-messages api response doc_metadata ( #14219 )
2025-02-27 14:56:46 +08:00
Rhys
548f6ef2b6
fix: incorrect score in the chroma vector ( #14273 )
2025-02-25 09:40:22 +08:00
Bowen Liang
dfdd6dfa20
fix: change the config name and fix typo in description of the number of retrieval executors ( #13856 )
2025-02-19 09:13:36 +08:00
Jyong
aa19bb3f30
fix session close issue ( #13946 )
2025-02-18 19:29:57 +08:00
Charlie.Wei
abe5aca3e2
Retrieval service optimization ( #13849 )
2025-02-17 18:22:36 +08:00
Yeuoly
403e2d58b9
Introduce Plugins ( #13836 )
...
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
Signed-off-by: -LAN- <laipz8200@outlook.com>
Signed-off-by: xhe <xw897002528@gmail.com>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: takatost <takatost@gmail.com>
Co-authored-by: kurokobo <kuro664@gmail.com>
Co-authored-by: Novice Lee <novicelee@NoviPro.local>
Co-authored-by: zxhlyh <jasonapring2015@outlook.com>
Co-authored-by: AkaraChen <akarachen@outlook.com>
Co-authored-by: Yi <yxiaoisme@gmail.com>
Co-authored-by: Joel <iamjoel007@gmail.com>
Co-authored-by: JzoNg <jzongcode@gmail.com>
Co-authored-by: twwu <twwu@dify.ai>
Co-authored-by: Hiroshi Fujita <fujita-h@users.noreply.github.com>
Co-authored-by: AkaraChen <85140972+AkaraChen@users.noreply.github.com>
Co-authored-by: NFish <douxc512@gmail.com>
Co-authored-by: Wu Tianwei <30284043+WTW0313@users.noreply.github.com>
Co-authored-by: 非法操作 <hjlarry@163.com>
Co-authored-by: Novice <857526207@qq.com>
Co-authored-by: Hiroki Nagai <82458324+nagaihiroki-git@users.noreply.github.com>
Co-authored-by: Gen Sato <52241300+halogen22@users.noreply.github.com>
Co-authored-by: eux <euxuuu@gmail.com>
Co-authored-by: huangzhuo1949 <167434202+huangzhuo1949@users.noreply.github.com>
Co-authored-by: huangzhuo <huangzhuo1@xiaomi.com>
Co-authored-by: lotsik <lotsik@mail.ru>
Co-authored-by: crazywoola <100913391+crazywoola@users.noreply.github.com>
Co-authored-by: nite-knite <nkCoding@gmail.com>
Co-authored-by: Jyong <76649700+JohnJyong@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: gakkiyomi <gakkiyomi@aliyun.com>
Co-authored-by: CN-P5 <heibai2006@gmail.com>
Co-authored-by: CN-P5 <heibai2006@qq.com>
Co-authored-by: Chuehnone <1897025+chuehnone@users.noreply.github.com>
Co-authored-by: yihong <zouzou0208@gmail.com>
Co-authored-by: Kevin9703 <51311316+Kevin9703@users.noreply.github.com>
Co-authored-by: -LAN- <laipz8200@outlook.com>
Co-authored-by: Boris Feld <lothiraldan@gmail.com>
Co-authored-by: mbo <himabo@gmail.com>
Co-authored-by: mabo <mabo@aeyes.ai>
Co-authored-by: Warren Chen <warren.chen830@gmail.com>
Co-authored-by: JzoNgKVO <27049666+JzoNgKVO@users.noreply.github.com>
Co-authored-by: jiandanfeng <chenjh3@wangsu.com>
Co-authored-by: zhu-an <70234959+xhdd123321@users.noreply.github.com>
Co-authored-by: zhaoqingyu.1075 <zhaoqingyu.1075@bytedance.com>
Co-authored-by: 海狸大師 <86974027+yenslife@users.noreply.github.com>
Co-authored-by: Xu Song <xusong.vip@gmail.com>
Co-authored-by: rayshaw001 <396301947@163.com>
Co-authored-by: Ding Jiatong <dingjiatong@gmail.com>
Co-authored-by: Bowen Liang <liangbowen@gf.com.cn>
Co-authored-by: JasonVV <jasonwangiii@outlook.com>
Co-authored-by: le0zh <newlight@qq.com>
Co-authored-by: zhuxinliang <zhuxinliang@didiglobal.com>
Co-authored-by: k-zaku <zaku99@outlook.jp>
Co-authored-by: luckylhb90 <luckylhb90@gmail.com>
Co-authored-by: hobo.l <hobo.l@binance.com>
Co-authored-by: jiangbo721 <365065261@qq.com>
Co-authored-by: 刘江波 <jiangbo721@163.com>
Co-authored-by: Shun Miyazawa <34241526+miya@users.noreply.github.com>
Co-authored-by: EricPan <30651140+Egfly@users.noreply.github.com>
Co-authored-by: crazywoola <427733928@qq.com>
Co-authored-by: sino <sino2322@gmail.com>
Co-authored-by: Jhvcc <37662342+Jhvcc@users.noreply.github.com>
Co-authored-by: lowell <lowell.hu@zkteco.in>
Co-authored-by: Boris Polonsky <BorisPolonsky@users.noreply.github.com>
Co-authored-by: Ademílson Tonato <ademilsonft@outlook.com>
Co-authored-by: Ademílson Tonato <ademilson.tonato@refurbed.com>
Co-authored-by: IWAI, Masaharu <iwaim.sub@gmail.com>
Co-authored-by: Yueh-Po Peng (Yabi) <94939112+y10ab1@users.noreply.github.com>
Co-authored-by: Jason <ggbbddjm@gmail.com>
Co-authored-by: Xin Zhang <sjhpzx@gmail.com>
Co-authored-by: yjc980121 <3898524+yjc980121@users.noreply.github.com>
Co-authored-by: heyszt <36215648+hieheihei@users.noreply.github.com>
Co-authored-by: Abdullah AlOsaimi <osaimiacc@gmail.com>
Co-authored-by: Abdullah AlOsaimi <189027247+osaimi@users.noreply.github.com>
Co-authored-by: Yingchun Lai <laiyingchun@apache.org>
Co-authored-by: Hash Brown <hi@xzd.me>
Co-authored-by: zuodongxu <192560071+zuodongxu@users.noreply.github.com>
Co-authored-by: Masashi Tomooka <tmokmss@users.noreply.github.com>
Co-authored-by: aplio <ryo.091219@gmail.com>
Co-authored-by: Obada Khalili <54270856+obadakhalili@users.noreply.github.com>
Co-authored-by: Nam Vu <zuzoovn@gmail.com>
Co-authored-by: Kei YAMAZAKI <1715090+kei-yamazaki@users.noreply.github.com>
Co-authored-by: TechnoHouse <13776377+deephbz@users.noreply.github.com>
Co-authored-by: Riddhimaan-Senapati <114703025+Riddhimaan-Senapati@users.noreply.github.com>
Co-authored-by: MaFee921 <31881301+2284730142@users.noreply.github.com>
Co-authored-by: te-chan <t-nakanome@sakura-is.co.jp>
Co-authored-by: HQidea <HQidea@users.noreply.github.com>
Co-authored-by: Joshbly <36315710+Joshbly@users.noreply.github.com>
Co-authored-by: xhe <xw897002528@gmail.com>
Co-authored-by: weiwenyan-dev <154779315+weiwenyan-dev@users.noreply.github.com>
Co-authored-by: ex_wenyan.wei <ex_wenyan.wei@tcl.com>
Co-authored-by: engchina <12236799+engchina@users.noreply.github.com>
Co-authored-by: engchina <atjapan2015@gmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: 呆萌闷油瓶 <253605712@qq.com>
Co-authored-by: Kemal <kemalmeler@outlook.com>
Co-authored-by: Lazy_Frog <4590648+lazyFrogLOL@users.noreply.github.com>
Co-authored-by: Yi Xiao <54782454+YIXIAO0@users.noreply.github.com>
Co-authored-by: Steven sun <98230804+Tuyohai@users.noreply.github.com>
Co-authored-by: steven <sunzwj@digitalchina.com>
Co-authored-by: Kalo Chin <91766386+fdb02983rhy@users.noreply.github.com>
Co-authored-by: Katy Tao <34019945+KatyTao@users.noreply.github.com>
Co-authored-by: depy <42985524+h4ckdepy@users.noreply.github.com>
Co-authored-by: 胡春东 <gycm520@gmail.com>
Co-authored-by: Junjie.M <118170653@qq.com>
Co-authored-by: MuYu <mr.muzea@gmail.com>
Co-authored-by: Naoki Takashima <39912547+takatea@users.noreply.github.com>
Co-authored-by: Summer-Gu <37869445+gubinjie@users.noreply.github.com>
Co-authored-by: Fei He <droxer.he@gmail.com>
Co-authored-by: ybalbert001 <120714773+ybalbert001@users.noreply.github.com>
Co-authored-by: Yuanbo Li <ybalbert@amazon.com>
Co-authored-by: douxc <7553076+douxc@users.noreply.github.com>
Co-authored-by: liuzhenghua <1090179900@qq.com>
Co-authored-by: Wu Jiayang <62842862+Wu-Jiayang@users.noreply.github.com>
Co-authored-by: Your Name <you@example.com>
Co-authored-by: kimjion <45935338+kimjion@users.noreply.github.com>
Co-authored-by: AugNSo <song.tiankai@icloud.com>
Co-authored-by: llinvokerl <38915183+llinvokerl@users.noreply.github.com>
Co-authored-by: liusurong.lsr <liusurong.lsr@alibaba-inc.com>
Co-authored-by: Vasu Negi <vasu-negi@users.noreply.github.com>
Co-authored-by: Hundredwz <1808096180@qq.com>
Co-authored-by: Xiyuan Chen <52963600+GareArc@users.noreply.github.com>
2025-02-17 17:05:13 +08:00
Charlie.Wei
222df44d21
Retrieval Service efficiency optimization ( #13543 )
2025-02-17 14:09:57 +08:00
Bowen Liang
0751ad1eeb
feat(vdb): add HNSW vector index for TiDB vector store with TiFlash ( #12043 )
2025-02-12 13:53:51 +08:00
liuzhenghua
47a64610ca
Fix the issue of repeated escaping of quotes in hit test ( #13477 )
2025-02-11 09:58:31 +08:00
Ademílson Tonato
d0a21086bd
refactor: Update Firecrawl API parameters and default settings ( #13082 )
2025-01-29 11:21:05 +08:00
Ademílson Tonato
6024d8a42d
refactor: Update Firecrawl to use v1 API ( #12574 )
...
Co-authored-by: Ademílson Tonato <ademilson.tonato@refurbed.com>
2025-01-23 11:14:48 +08:00
huangzhuo1949
4c3076f2a4
feat: add pg vector index ( #12338 )
...
Co-authored-by: huangzhuo <huangzhuo1@xiaomi.com>
2025-01-22 17:07:18 +08:00
Bowen Liang
166221d784
chore(lint): fix quotes for f-string formatting by bumping ruff to 0.9.x ( #12702 )
2025-01-21 10:12:29 +08:00
yihong
4e101604c3
fix: ruff check for True if ... else ( #12576 )
...
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2025-01-13 09:38:48 +08:00
CN-P5
cd257b91c5
Fix pandas indexing method for knowledge base imports ( #12637 ) ( #12638 )
...
Co-authored-by: CN-P5 <heibai2006@qq.com>
2025-01-13 09:06:59 +08:00
YoungLH
040a3b782c
FEAT: support milvus to full text search ( #11430 )
...
Signed-off-by: YoungLH <974840768@qq.com>
2025-01-08 17:39:53 +08:00
Yingchun Lai
53bb37b749
fix: fix the incorrect plaintext file key when saving ( #10429 )
2025-01-08 12:52:45 +08:00
Hiroshi Fujita
d2586278d6
Feat elasticsearch japanese ( #12194 )
2025-01-08 12:35:41 +08:00
Jyong
05bda6f38d
add tidb on qdrant redis lock ( #12462 )
2025-01-08 08:55:44 +08:00
huangzhuo1949
70698024f5
fix: empty delete bug ( #12339 )
...
Co-authored-by: huangzhuo <huangzhuo1@xiaomi.com>
2025-01-03 20:46:39 +08:00
Jyong
b873e6349c
add child chunk preview number limit ( #12309 )
2025-01-03 16:14:27 +08:00
-LAN-
8d15c8cfbf
fix: improve error handling in NotionExtractor data fetching ( #12182 )
...
Signed-off-by: -LAN- <laipz8200@outlook.com>
2024-12-29 11:53:09 +08:00
-LAN-
dae1b5a619
fix: import jieba.analyse ( #12133 )
...
Signed-off-by: -LAN- <laipz8200@outlook.com>
2024-12-27 11:37:55 +08:00
Jyong
811e4bd0cf
fix unstructured setting ( #12116 )
2024-12-26 12:08:36 +08:00
Jyong
84ac004772
py lint ( #12102 )
...
Signed-off-by: -LAN- <laipz8200@outlook.com>
Co-authored-by: -LAN- <laipz8200@outlook.com>
2024-12-26 00:16:35 +08:00
Jyong
9231fdbf4c
Feat/support parent child chunk ( #12092 )
2024-12-25 19:49:07 +08:00
yihong
56e15d09a9
feat: mypy for all type check ( #10921 )
2024-12-24 18:38:51 +08:00
-LAN-
599d410d99
fix: validate reranking model attributes before processing ( #11930 )
...
Signed-off-by: -LAN- <laipz8200@outlook.com>
2024-12-21 21:23:12 +08:00
-LAN-
8c559d6231
fix(retrieval_service): avoid to use exception ( #11925 )
...
Signed-off-by: -LAN- <laipz8200@outlook.com>
2024-12-21 21:19:46 +08:00
yihong
7b03a0316d
fix: better memory usage from 800+ to 500+ ( #11796 )
...
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2024-12-20 14:51:43 +08:00
yihong
463fbe2680
fix: better guard against nan values from numpy for issue #11827 ( #11864 )
...
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2024-12-20 09:28:32 +08:00
yihong
5a8a901560
fix: float values are not json for nan value close #11827 ( #11840 )
...
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2024-12-19 20:50:20 +08:00
Jiang
ad17ff9a92
Lindorm vdb bug-fix ( #11790 )
...
Co-authored-by: jiangzhijie <jiangzhijie.jzj@alibaba-inc.com>
2024-12-18 15:19:20 +08:00
Bowen Liang
924b4fe742
test: run vdb tests on TiDB Vector with docker in CI tests ( #11645 )
2024-12-15 17:16:40 +08:00
yihong
22258fb0bf
fix: filter bug for keyword that caused code to be unreachable ( #11666 )
...
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2024-12-15 17:12:06 +08:00
yihong
36cb25b341
fix: support mdx files close #11557 ( #11565 )
...
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2024-12-12 13:37:56 +08:00
Jiang
0d04cdc323
Lindorm vdb ( #11574 )
...
Co-authored-by: jiangzhijie <jiangzhijie.jzj@alibaba-inc.com>
2024-12-12 09:43:27 +08:00
Jyong
9b7adcd4d9
update tidb batch get endpoint to basic mode ( #11426 )
2024-12-06 17:06:46 +08:00
Jyong
d7c1f43b49
fix tidb full-text-search vector missed ( #11337 )
2024-12-04 16:13:23 +08:00
Jyong
c58d2fce89
roll back rerank topn setting ( #11297 )
2024-12-03 17:34:56 +08:00
yihong
e686f12317
fix: better handle error ( #11265 )
...
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2024-12-03 09:15:38 +08:00
-LAN-
9601102885
fix(word_extractor): Fix type error and remove stream in ssrf_proxy ( #11241 )
...
Signed-off-by: -LAN- <laipz8200@outlook.com>
2024-12-02 10:24:03 +08:00
Cling_o3
f9c2aa7689
feat: add retireval_top_n to config in env ( #11132 )
2024-11-30 11:14:45 +08:00
kazuya-awano
2d6865d421
Ensure consistent float type for cached embedding return values ( #10185 )
2024-11-29 09:18:41 +08:00
yihong
d7160ee563
fix: typo in upstashVector if id is always true, also fix some type hint ( #11183 )
...
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2024-11-28 14:05:25 +08:00
-LAN-
9789905a1f
chore(*): Removes debugging print statements ( #11145 )
...
Signed-off-by: -LAN- <laipz8200@outlook.com>
2024-11-26 22:03:19 +08:00
Bowen Liang
6c8e208ef3
chore: bump minimum supported Python version to 3.11 ( #10386 )
2024-11-24 13:28:46 +08:00
yihong
ed55de888a
fix: rules should not be None for in ( #10977 )
...
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2024-11-22 23:04:20 +08:00
AkisAya
cb0c55daa7
fix weight rerank of knowledge retrieval ( #10931 )
2024-11-21 17:53:20 +08:00
yihong
58a9d9eb9a
fix: better WeightRerankRunner run logic use O(1) and delete unused code ( #10849 )
...
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2024-11-19 20:12:13 +08:00