gnuCrawl&Map - more than a sitemap generator
Key features
- create XML sitemaps for search engines
- create a .CSV file that contains the meta information of all fetched URLs
- The application runs on all operating systems for which a Java Runtime Environment is available.
(You can download Java here if it is not installed on your machine)
Options and parameters
You can ...
- set up different sitemap parameters (lastmod, changefreq, priority)
- use a proxy server (HTTP or SOCKS proxy, with or without authentication)
- choose whether you want "www." URLs or non-"www." URLs in your sitemap (duplicates will be removed for SEO reasons)
- choose whether the software takes robots information into account ("nofollow" and "noindex" tags, robots.txt files)
- set a maximum number of links that will be fetched and added to the sitemap
- choose which file types should be downloaded and crawled and whether you want to add downloads (files) to the sitemap
- set up filters to fetch only URLs and/or content that contain, or do not contain, user-defined terms
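To illustrate how some of these options could fit together, here is a minimal Java sketch. It is not taken from gnuCrawl&Map's source code. The class SitemapSketch and the methods normalizeHost, escapeXml and buildSitemap are hypothetical names chosen for this example. The sketch assumes a list of already fetched URLs, converts them to the "www." or non-"www." form, removes duplicates, applies a maximum number of links, and writes the lastmod, changefreq and priority values into a sitemap in the sitemaps.org 0.9 format.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class SitemapSketch {

    // Hypothetical helper: force the "www." or the non-"www." form of a URL.
    static String normalizeHost(String url, boolean useWww) {
        int schemeEnd = url.indexOf("://");
        if (schemeEnd < 0) {
            return url; // not an absolute URL, leave it untouched
        }
        String scheme = url.substring(0, schemeEnd + 3);
        String rest = url.substring(schemeEnd + 3);
        boolean hasWww = rest.startsWith("www.");
        if (useWww && !hasWww) {
            rest = "www." + rest;
        } else if (!useWww && hasWww) {
            rest = rest.substring(4);
        }
        return scheme + rest;
    }

    // Escape the characters that must not appear unescaped in XML text.
    static String escapeXml(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;")
                .replace("'", "&apos;");
    }

    // Build a sitemap string; the same lastmod, changefreq and priority
    // values are applied to every URL (an assumption made for brevity).
    static String buildSitemap(List<String> urls, boolean useWww, int maxUrls,
                               String lastmod, String changefreq, String priority) {
        // A LinkedHashSet drops duplicates and keeps the crawl order.
        Set<String> unique = new LinkedHashSet<>();
        for (String url : urls) {
            if (unique.size() >= maxUrls) {
                break; // maximum number of links reached
            }
            unique.add(normalizeHost(url, useWww));
        }

        StringBuilder sb = new StringBuilder();
        sb.append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        sb.append("<urlset xmlns=\"http://www.sitemaps.org/schemas/sitemap/0.9\">\n");
        for (String url : unique) {
            sb.append("  <url>\n");
            sb.append("    <loc>").append(escapeXml(url)).append("</loc>\n");
            sb.append("    <lastmod>").append(lastmod).append("</lastmod>\n");
            sb.append("    <changefreq>").append(changefreq).append("</changefreq>\n");
            sb.append("    <priority>").append(priority).append("</priority>\n");
            sb.append("  </url>\n");
        }
        sb.append("</urlset>\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        // Example input; example.com is a placeholder domain.
        List<String> fetched = Arrays.asList(
                "http://www.example.com/",
                "http://example.com/",
                "http://example.com/products?id=1&lang=en");
        System.out.print(buildSitemap(fetched, false, 100,
                "2024-01-01", "weekly", "0.5"));
    }
}
```

In this sketch the first two example URLs collapse into a single entry once the non-"www." form is chosen, which corresponds to the duplicate removal mentioned in the list above.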
Download
Here you can download gnuCrawl&Map as an executable Java file (.jar).
Download gnuCrawl&Map 0.9 beta
Some general information
Please note:
- As already stated, you need Java to run the application. Java can be downloaded here.
- The application is quite stable. Nevertheless, it is a beta version and may contain some errors. If you find a bug or just want to give me some feedback, please contact me via the contact form or the #gnuyork IRC channel.