Scraping with CasperJS [JavaScript]


Here is a tip you can use when you want to scrape a web page with JavaScript.
Since CasperJS can be installed easily with npm, it is a good choice for server-side use.


😼 Installation

The installation command for CasperJS is as follows:

npm install casperjs -g

🍣 Source Code

The following script scrapes a web page using CasperJS.
It takes two command-line arguments: the URL to scrape and the destination path for the scraped HTML.

// scrape.js
var casper = require('casper').create();
// Send a mobile Chrome user agent string
casper.userAgent('Mozilla/5.0 (Linux; Android 4.0.4; Galaxy Nexus Build/IMM76B) AppleWebKit/535.19 (KHTML, like Gecko) Chrome/18.0.1025.133 Mobile Safari/535.19');

// Positional arguments: the URL to scrape and the file to save the HTML to
var url = casper.cli.args[0];
var outputPath = casper.cli.args[1];

casper.start(url);

casper.then(function() {
  // Wait 5 seconds (5000 ms) before reading the page
  casper.wait(5000, function() {
    // Get HTML
    var html = this.evaluate(function() {
      return document.querySelector("html").outerHTML;
    });

    // Save HTML
    var fs = require('fs');
    fs.write(outputPath, html, 'w');
  });
});

casper.run();

After that, run the command below and the HTML of your target URL will be saved to the given path.

casperjs scrape.js https://google.com /tmp/html.txt --ssl-protocol=TLSv1
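
The script reads its two positional arguments without checking them, so a missing URL or output path only shows up later as a confusing failure. As a minimal sketch, the arguments could be checked first with a small helper. The name `parseScrapeArgs` and the rule that the URL must start with http:// or https:// are assumptions made for illustration; they are not part of CasperJS.

```javascript
// Hypothetical helper (not part of CasperJS): validates the two positional
// arguments that scrape.js expects, before any page is loaded.
function parseScrapeArgs(args) {
  if (!args || args.length < 2) {
    throw new Error('Usage: casperjs scrape.js <url> <outputPath>');
  }
  var url = String(args[0]);
  var outputPath = String(args[1]);
  // Assumption for this sketch: only http and https URLs are accepted.
  if (!/^https?:\/\//i.test(url)) {
    throw new Error('URL must start with http:// or https://, got: ' + url);
  }
  return { url: url, outputPath: outputPath };
}
```

In scrape.js, you could call `parseScrapeArgs(casper.cli.args)` inside a try/catch, then print the error message with `casper.echo` and stop with `casper.exit(1)` if it throws.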

Happy Hacking!
