QZ qz thoughts
a blog from Eli the Bearded

Heritrix Crawler


Heritrix is an open-source archival web crawler.

"Heritrix (sometimes spelled heretrix, or misspelled or missaid as heratrix / heritix / heretix / heratix) is an archaic word for inheritess. Since our crawler seeks to collect the digital artifacts of our culture for the benefit of future researchers and generations, this name seemed apt."

Dead link: http://crawler.archive.org/

Fixed link