A Simple Spider for the Researcher Ranking Project
March 28, 2011
An important issue in ranking researchers is constructing the citation network between them. To do this, I need to crawl the citation data on the HistCite website, whose URL is http://www.garfield.library.upenn.edu/histcomp/index.html
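To make the idea of a citation network between researchers concrete, here is a minimal sketch in Python. It assumes the crawled data has already been reduced to two hypothetical inputs: a mapping from paper ID to author names, and a list of (citing paper, cited paper) pairs. The function name and data layout are my own illustration, not the format HistCite uses.

```python
from collections import defaultdict


def build_author_citation_network(papers, citations):
    """Turn paper-level citations into author-level citation counts.

    papers:    dict mapping a paper ID to its list of author names
               (hypothetical layout, for illustration only)
    citations: iterable of (citing_paper_id, cited_paper_id) pairs
    returns:   dict mapping citing author -> {cited author: count}
    """
    network = defaultdict(lambda: defaultdict(int))
    for citing_id, cited_id in citations:
        # Papers missing from the mapping contribute no edges.
        for citing_author in papers.get(citing_id, []):
            for cited_author in papers.get(cited_id, []):
                if citing_author != cited_author:  # skip self-citations
                    network[citing_author][cited_author] += 1
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {author: dict(targets) for author, targets in network.items()}
```

Each author of a citing paper gets an edge to each author of the cited paper, and self-citations are dropped. Whether to keep or weight self-citations is a design choice the ranking method would have to settle.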
I downloaded a Python spider from the web and revised it to make it usable. It is not perfect, but it is good enough for this project. It is really cool to watch the command window scroll by as it downloads thousands of pages. Wow, is this so-called geek behaviour?
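For readers curious what such a spider looks like, here is a minimal sketch using only the Python standard library. It is not the script I actually used; the names `LinkParser`, `fetch`, and `crawl` and the `max_pages` limit are my own assumptions for illustration. It does a breadth-first crawl and only follows links that start with a given URL prefix.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch(url):
    """Download one page and return its text (the encoding is a guess)."""
    with urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(start_url, prefix, fetch_page=fetch, max_pages=100):
    """Breadth-first crawl of pages whose URL starts with `prefix`.

    Returns a dict mapping each visited URL to its HTML text.
    """
    seen = {start_url}
    queue = deque([start_url])
    pages = {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        try:
            html = fetch_page(url)
        except OSError:
            # Network errors (URLError is an OSError): skip this page.
            continue
        pages[url] = html
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop any #fragment part.
            link = urldefrag(urljoin(url, href))[0]
            if link.startswith(prefix) and link not in seen:
                seen.add(link)
                queue.append(link)
    return pages
```

For the HistCite pages, one would call something like `crawl("http://www.garfield.library.upenn.edu/histcomp/index.html", "http://www.garfield.library.upenn.edu/histcomp/")`, probably with a larger `max_pages` and a polite delay between requests. The `fetch_page` parameter is there so the crawl logic can be tried on canned pages without touching the network.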