dify/extractor at 086ee4c19d7612aeef15ae4da1d0d8bff57df724 - dify

Jyong db42f467c8 fix: docx extractor external image failed (#29558)	2025-12-12 13:41:51 +08:00
blob	chore: add ast-grep rule to convert Optional[T] to T | None (#25560)	2025-09-15 13:06:33 +08:00
entity	use model_validate (#26182)	2025-10-10 17:30:13 +09:00
firecrawl	refactor: Update Firecrawl to use v2 API (#24734)	2025-10-15 10:48:54 +08:00
unstructured	refactor: use dynamic max characters for chunking in extractors (#26782)	2025-10-13 10:22:59 +08:00
watercrawl	change all to httpx (#26119)	2025-10-10 23:41:16 +08:00
csv_extractor.py	chore: add ast-grep rule to convert Optional[T] to T | None (#25560)	2025-09-15 13:06:33 +08:00
excel_extractor.py	perf(core/rag): optimize Excel extractor performance and memory usage (#29551)	2025-12-12 12:15:03 +08:00
extract_processor.py	remove .value (#26633)	2025-10-11 09:08:29 +08:00
extractor_base.py	chore(api/core): apply ruff reformatting (#7624)	2024-09-10 17:00:20 +08:00
helpers.py	feat: using charset_normalizer instead of chardet (#29022)	2025-12-05 11:19:19 +08:00
html_extractor.py	chore: cleanup unnecessary mypy suppressions on imports (#24712)	2025-08-28 23:17:25 +08:00
jina_reader_extractor.py	feat: knowledge pipeline (#25360)	2025-09-18 12:49:10 +08:00
markdown_extractor.py	chore: add ast-grep rule to convert Optional[T] to T | None (#25560)	2025-09-15 13:06:33 +08:00
notion_extractor.py	change all to httpx (#26119)	2025-10-10 23:41:16 +08:00
pdf_extractor.py	chore: add ast-grep rule to convert Optional[T] to T | None (#25560)	2025-09-15 13:06:33 +08:00
text_extractor.py	chore: add ast-grep rule to convert Optional[T] to T | None (#25560)	2025-09-15 13:06:33 +08:00
word_extractor.py	fix: docx extractor external image failed (#29558)	2025-12-12 13:41:51 +08:00