A very basic Erlang crawler

1. Save the code below as spyder.erl

2. Compile it with erlc spyder.erl, then run: erl -s spyder start http://site_to_crawl.com -s init stop

http://pastebin.com/Qw9f11bi

This is a very crude crawler: it crawls all the links in a page, and then follows those links further. It doesn't protect you from black holes (crawler traps such as endless link loops), and it will crawl away without any regard for robots.txt. It was just something to brush up my rusty Erlang (probably visible in the code?). But what the hell, it works! Feel free to improve on it :)
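If you want a starting point for the black hole problem, here is a minimal sketch of one way to do it: a breadth-first crawl that keeps a set of visited URLs and stops at a maximum depth. This is not the pastebin code; the module name crawl_sketch, the functions crawl/3, links/1 and fetch/1, and the depth limit are all my own assumptions for illustration. Link extraction is a rough regex that only picks up absolute http(s) links, and robots.txt is still ignored.

```erlang
%% Hypothetical sketch, NOT the pastebin code: a breadth-first crawl with a
%% visited set and a depth limit, so that link loops cannot trap the crawler.
-module(crawl_sketch).
-export([crawl/3, links/1, fetch/1]).

%% crawl(Fetch, StartUrl, MaxDepth) -> list of visited URLs, in visit order.
%% Fetch is any fun(Url) -> {ok, Body} | error (fetch/1 below is one option).
crawl(Fetch, StartUrl, MaxDepth) ->
    loop(Fetch, [{StartUrl, 0}], sets:new(), [], MaxDepth).

loop(_Fetch, [], _Seen, Acc, _Max) ->
    lists:reverse(Acc);
loop(Fetch, [{Url, Depth} | Rest], Seen, Acc, Max) ->
    case Depth > Max orelse sets:is_element(Url, Seen) of
        true ->
            %% Too deep or already visited: skip it.
            loop(Fetch, Rest, Seen, Acc, Max);
        false ->
            Found = case Fetch(Url) of
                        {ok, Body} -> links(Body);
                        _ -> []
                    end,
            %% Appending to the end of the queue gives breadth-first order.
            Queue = Rest ++ [{L, Depth + 1} || L <- Found],
            loop(Fetch, Queue, sets:add_element(Url, Seen), [Url | Acc], Max)
    end.

%% Extract absolute http(s) links from href="..." attributes.
%% Regex-based, so it is approximate and ignores relative links.
links(Body) ->
    case re:run(Body, "href=\"(https?://[^\"]+)\"",
                [global, {capture, [1], list}]) of
        {match, Matches} -> [L || [L] <- Matches];
        nomatch -> []
    end.

%% One possible Fetch implementation using OTP's httpc.
%% Assumption: plain GET, only 200 responses count; https URLs may need
%% extra ssl options depending on your OTP version.
fetch(Url) ->
    {ok, _} = application:ensure_all_started(inets),
    {ok, _} = application:ensure_all_started(ssl),
    case httpc:request(get, {Url, []}, [], []) of
        {ok, {{_Version, 200, _Reason}, _Headers, Body}} -> {ok, Body};
        _ -> error
    end.
```

To crawl a real site you would call crawl_sketch:crawl(fun crawl_sketch:fetch/1, "http://site_to_crawl.com", 2). Passing the fetch function in as an argument also means you can try the loop without touching the network, for example with a map of fake pages and fun(U) -> maps:find(U, Pages) end as the fetcher.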


