Scrape web pages and measure their length, download time, etc. Avoid common pitfalls by setting a timeout (2 seconds) and a maximum size (64 KB). The URLs to visit are stored in an input text file.

See attachment.
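
The attached script is not reproduced in this thread. As a rough illustration of the approach described above, here is a minimal Python sketch (not the author's Perl code). Only the 2-second timeout and the 64 KB cap come from the post. The function names, the one-URL-per-line input layout, the tab-separated output, and the choice to prepend http:// to entries that have no scheme are all assumptions made for this example.

```python
import http.client
import time
import urllib.request

TIMEOUT_SECONDS = 2      # timeout mentioned in the post
MAX_BYTES = 64 * 1024    # 64 KB size cap mentioned in the post


def read_urls(path):
    """Return the non-empty lines of the input file (assumed: one URL per line)."""
    with open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


def normalize(url):
    """Prepend http:// when an entry has no scheme (an assumed convenience)."""
    return url if "://" in url else "http://" + url


def fetch(url, timeout=TIMEOUT_SECONDS, max_bytes=MAX_BYTES):
    """Download at most max_bytes and report size, elapsed time, and status."""
    start = time.monotonic()
    try:
        # Note: urllib applies the timeout to each blocking socket operation,
        # not to the download as a whole.
        with urllib.request.urlopen(normalize(url), timeout=timeout) as response:
            body = response.read(max_bytes)  # stop reading once the cap is reached
        status = "ok"
    except (OSError, ValueError, http.client.HTTPException) as error:
        # Timeouts, DNS failures, HTTP errors, and malformed URLs all land here.
        body = b""
        status = "error: " + str(error)
    elapsed = time.monotonic() - start
    return {"url": url, "bytes": len(body), "seconds": elapsed, "status": status}


def crawl(input_path, output_path):
    """Fetch every listed URL and write one tab-separated line per URL."""
    with open(output_path, "w", encoding="utf-8") as out:
        for url in read_urls(input_path):
            r = fetch(url)
            out.write(f"{r['url']}\t{r['bytes']}\t{r['seconds']:.3f}\t{r['status']}\n")
```

Calling crawl("URLlist.txt", "robotout.txt") would read the list and write one result line per URL. In this sketch a failed download is recorded as an error line rather than skipped, so every URL in the list produces a line in the output file.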

Replies to This Discussion

Hi - I'm hoping someone can help me get this bot to run. I'm new to Perl, so please excuse me if I'm missing something obvious. This is what I've done so far:
1) Installed Strawberry Perl
2) Saved robotCF2.pl to c:\strawberry
3) Created a text file called URLlist.txt in c:\strawberry containing one URL: www.smh.com.au
4) At the command prompt, changed directory to c:\strawberry, then ran the command: perl robotCF2.pl
5) The program runs and creates a file called robotout.txt in c:\strawberry, but the file is empty.

Any assistance to fix this would be greatly appreciated.

Cheers

Don't worry - I now have this working.

© 2013   AnalyticBridge.com is a subsidiary and dedicated channel of Data Science Central LLC