Contributors to Crawl4AI

We would like to thank the following people for their contributions to Crawl4AI:

Core Team

Unclecode - Project Creator and Main Developer
Nasrin - Project Manager and Developer
Aravind Karnam - Head of Community and Product

Community Contributors

aadityakanjolia4 - Fix for the "CustomHTML2Text is not defined" error
FractalMind - Created the first official Docker Hub image and fixed Dockerfile errors
ketonkss4 - Identified Selenium's new capabilities, helping reduce dependencies
jonymusky - JavaScript execution documentation and wait_for
datehoer - Added browser proxy support

Pull Requests

dvschuyl - AsyncPlaywrightCrawlerStrategy page-evaluate context destroyed by navigation #304
nelzomal - Enhance development installation instructions #286
HamzaFarhan - Handled the cases where markdown_with_citations, references_markdown, and filtered_html might not be defined #293
NanmiCoder - fix: crawler strategy exception handling and fixes #271
paulokuong - fix: CRAWL4_AI_BASE_DIRECTORY should be Path object instead of string #298
TheRedRad - feat: add force viewport screenshot option #1694
ChiragBellara - fix: avoid Common Crawl calls for sitemap-only URL seeding #1746
YuriNachos - fix: replace tf-playwright-stealth with playwright-stealth #1714, fix: respect <base> tag for relative link resolution #1721, fix: include GoogleSearchCrawler script.js in package #1719, fix: allow local embeddings by removing OpenAI fallback #1717, docs: add missing CacheMode import #1715, docs: fix return types to RunManyReturn #1716
christian-oudard - fix: deep-crawl CLI outputting only the first page #1667
vladmandic - fix: VersionManager ignoring CRAWL4_AI_BASE_DIRECTORY env var #1296
nnxiong - fix: script tag removal losing adjacent text in cleaned_html #1364
RoyLeviLangware - fix: bs4 deprecation warning (text -> string) #1077
garyluky - fix: proxy auth ERR_INVALID_AUTH_CREDENTIALS #1281
Martichou - investigation: browser context memory leak under continuous load #1640, #943
danyQe - identified: temperature typo in async_configs.py #973
saipavanmeruga7797 - identified: local HTML file crawling bug with capture_console_messages #1073
stevenaldinger - identified: duplicate PROMPT_EXTRACT_BLOCKS dead code in prompts.py #931
chrizzly2309 - identified: JWT auth bypass when no credentials provided #1133
complete-dope - identified: console logging error attribute issue #729
TristanDonze - feat: add configurable device_scale_factor for screenshot quality #1463
charlaie - feat: add redirected_status_code to CrawlResult #1435
mzyfree - investigation: Docker concurrency performance and pool resource management #1689
nightcityblade - fix: prevent AdaptiveCrawler from crawling external domains #1805
Otman404 - fix: return in finally block silently suppressing exceptions in dispatcher #1763
SohamKukreti - fix: from_serializable_dict ignoring plain data dicts with "type" key #1803, fix: deep-crawl streaming mirrors Python library behavior #1798
Br1an67 - fix: handle nested brackets and parentheses in LINK_PATTERN regex #1790, identified: strip markdown fences in LLM JSON responses #1787, fix: preserve class/id in cleaned_html #1782, fix: guard against None LLM content #1788, fix: strip port from domain in is_external_url #1783, fix: UTF-8 encoding for CLI output #1789, fix: configurable link_preview_timeout #1793, fix: wait_for_images on screenshot endpoint #1792, fix: cross-platform terminal input in CrawlerMonitor #1794, fix: UnicodeEncodeError in URL seeder #1784, fix: wire mean_delay/max_range into dispatcher #1786, fix: DOMParser in process_iframes #1796, fix: require api_token for /token endpoint #1795
nightcityblade - feat: add score_threshold to BestFirstCrawlingStrategy #1804
phamngocquy - identified: raw HTML URL token leak #1179
AkosLukacs - docs: fix docstring param name crawler_config -> config #1494
dominicx - docs: fix css_selector type from list to string #1308
hoi - fix: add TTL expiry for Redis task data #1730
maksimzayats - docs: modernize deprecated API usage across 25 files #1770
jtanningbed - fix: add newline before opening code fence in html2text #462
Ahmed-Tawfik94 - identified: redirect target verification in URL seeder #1622
hafezparast - identified: PDFContentScrapingStrategy deserialization fix #1815; fix: screenshot distortion, deep crawl timeout/arun_many, CLI encoding #1829
pgoslatara - chore: update GitHub Actions to latest versions #1734
130347665 - feat: type-list pipeline in JsonCssExtractionStrategy #1290
microHoffman - feat: add --json-ensure-ascii CLI flag for Unicode handling #1668

Feb-Alpha-1

Other Contributors

Typo fixes

Acknowledgements

We also want to thank all the users who have reported bugs, suggested features, or helped in any other way to make Crawl4AI better.

If you've contributed to Crawl4AI and your name isn't on this list, please open a pull request with your name, link, and contribution, and we'll review it promptly.

Thank you all for your contributions!
