			Comments
		- 
				
				Froot (7457, 8y): What are you using to scrape?
 
 I've done it with Node.js + Request + Cheerio. Easy as pie
- 
				
				I'm using Cheerio. But it's not the framework that's bothering me. It's the web scraping itself. I hate doing it.
- 
				
				github (9507, 8y): I used Python requests or urllib2 and BeautifulSoup.
 And you can earn a good amount if you are in college, enough for monthly pocket money on the side. Lots of startups pay for scraping. Once you are good at it, each new job is a minute-long change to the DOM selector parsing, so you earn the same amount every time with less effort. Eventually you can end up building your own, more generic scraping framework that handles different types of websites.
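 The "change the DOM selectors, reuse everything else" idea could look roughly like the sketch below. It is only an illustration: the site names, tags, and class names in `SITE_RULES` are made up, and the standard library's `html.parser` stands in for BeautifulSoup so the snippet runs without extra installs. It also stops at the first matching closing tag, so nested tags of the same name are not handled.

```python
from html.parser import HTMLParser

# Hypothetical per-site rules: field name -> (tag, CSS class) to capture.
# Supporting a new site means adding an entry here, not writing new code.
SITE_RULES = {
    "shop-a": {"title": ("h1", "product-name"), "price": ("span", "price")},
    "shop-b": {"title": ("h2", "title"), "price": ("div", "cost")},
}


class FieldExtractor(HTMLParser):
    """Collects the text of the first element matching each (tag, class) rule."""

    def __init__(self, rules):
        super().__init__()
        self.rules = rules
        self.fields = {}
        self._current = None  # name of the field being captured, if any

    def handle_starttag(self, tag, attrs):
        if self._current is not None:
            return  # already inside a captured element
        classes = (dict(attrs).get("class") or "").split()
        for field, (want_tag, want_class) in self.rules.items():
            if field not in self.fields and tag == want_tag and want_class in classes:
                self._current = field
                self.fields[field] = ""
                break

    def handle_data(self, data):
        if self._current is not None:
            self.fields[self._current] += data

    def handle_endtag(self, tag):
        # Stop capturing at the first closing tag with the rule's tag name.
        if self._current is not None and tag == self.rules[self._current][0]:
            self.fields[self._current] = self.fields[self._current].strip()
            self._current = None


def scrape(site, html):
    """Apply the rules registered for `site` to an HTML string."""
    parser = FieldExtractor(SITE_RULES[site])
    parser.feed(html)
    parser.close()
    return parser.fields


html_a = '<html><h1 class="product-name big">Widget</h1><span class="price"> $9.99 </span></html>'
html_b = '<div><h2 class="title">Gadget</h2><div class="cost">12 EUR</div></div>'
result_a = scrape("shop-a", html_a)
result_b = scrape("shop-b", html_b)
```

 When a site changes its markup, only the matching entry in `SITE_RULES` needs updating, which is the "less effort every time" part.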
- 
				
				github (9507, 8y): And once, during an internship at a startup, I crawled the entire LinkedIn member and company directory and stored it on my local drive...
- 
				
				@beriba
 Yay, first time I see anyone using Perl here :)
 Having done scraping in both Python and Perl, I'll take Perl if I can choose.
 It takes a third of the code and it runs 3 times faster.
- 
				
				zshh (3824, 8y): I'm using Python with requests as far as it gets me. Sometimes I need to use Selenium with the PhantomJS headless browser, if the website uses JS to dynamically update the HTML.
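 A rough sketch of that "plain request first, headless browser only when needed" approach is below. The names `fetch_html`, `marker`, and the two stand-in fetchers are hypothetical. In real use, `static_fetch` might wrap `requests.get(url).text` and `browser_fetch` might load the page with Selenium and return `driver.page_source`; both are left out here so the snippet runs on its own.

```python
from typing import Callable, Tuple


def fetch_html(
    url: str,
    static_fetch: Callable[[str], str],
    browser_fetch: Callable[[str], str],
    marker: str,
) -> Tuple[str, str]:
    """Return (html, method).

    Try the cheap static fetch first. Fall back to the browser only when
    `marker` (a string the fully rendered page should contain) is missing,
    which suggests the content is filled in by JavaScript.
    """
    html = static_fetch(url)
    if marker in html:
        return html, "static"
    return browser_fetch(url), "browser"


# Stand-in fetchers for illustration only; they do no network access.
def fake_static(url: str) -> str:
    # What a plain HTTP request might return for a JS-driven page.
    return '<div id="app"></div>'


def fake_browser(url: str) -> str:
    # What the page might look like after scripts have run.
    return '<div id="app"><ul class="results"><li>row</li></ul></div>'


html, method = fetch_html(
    "https://example.com/list", fake_static, fake_browser, 'class="results"'
)
```

 Here the static response lacks the marker, so `method` ends up as `"browser"`. Pages that render server-side never pay the cost of starting a browser.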







I hate web scraping.
web scraping fucking sucks