dify/api/core/rag/extractor
Wesley b62eb61400 fix depth param issue for WaterCrawl (#18839) 2025-04-27 11:04:56 +08:00
Name | Last commit | Date
blob | chore: refurbish Python code by applying refurb linter rules (#8296) | 2024-09-12 15:50:49 +08:00
entity | feat: mypy for all type check (#10921) | 2024-12-24 18:38:51 +08:00
firecrawl | chore: docstring not match the function parameter (#17162) | 2025-03-31 13:19:15 +08:00
unstructured | add doc support in knowledge base for unstructured (#17352) | 2025-04-02 21:35:01 +08:00
watercrawl | fix depth param issue for WaterCrawl (#18839) | 2025-04-27 11:04:56 +08:00
csv_extractor.py | chore(api/core): apply ruff reformatting (#7624) | 2024-09-10 17:00:20 +08:00
excel_extractor.py | py lint (#12102) | 2024-12-26 00:16:35 +08:00
extract_processor.py | feat: Integrate WaterCrawl.dev as a new knowledge base provider (#16396) | 2025-04-07 12:43:23 +08:00
extractor_base.py | chore(api/core): apply ruff reformatting (#7624) | 2024-09-10 17:00:20 +08:00
helpers.py | chore: refurbish Python code by applying refurb linter rules (#8296) | 2024-09-12 15:50:49 +08:00
html_extractor.py | feat: mypy for all type check (#10921) | 2024-12-24 18:38:51 +08:00
jina_reader_extractor.py | feat(website-crawl): add jina reader as additional alternative for website crawling (#8761) | 2024-09-30 09:57:19 +08:00
markdown_extractor.py | chore: refurbish Python code by applying refurb linter rules (#8296) | 2024-09-12 15:50:49 +08:00
notion_extractor.py | chore(lint): fix quotes for f-string formatting by bumping ruff to 0.9.x (#12702) | 2025-01-21 10:12:29 +08:00
pdf_extractor.py | fix: fix the incorrect plaintext file key when saving (#10429) | 2025-01-08 12:52:45 +08:00
text_extractor.py | chore: refurbish Python code by applying refurb linter rules (#8296) | 2024-09-12 15:50:49 +08:00
word_extractor.py | Switching from CONSOLE_API_URL to FILES_URL in word_extractor.py (#18249) | 2025-04-18 16:05:48 +08:00