No Programming, No Life
Jan 8, 2017 • 2 min read

Scraping with CasperJS [JavaScript]

Here is a tip you can use when you want to scrape a page with JavaScript. Since CasperJS can be installed easily with npm, it is a good choice for server-side use.


Installation

The installation command for CasperJS is as follows:

npm install casperjs -g

Source Code

The following script scrapes a web page with CasperJS. I wrote it so that you can specify both the URL to scrape and the destination file for the scraped HTML.

// scrape.js
var fs = require('fs');
var casper = require('casper').create();

// Identify as a mobile Chrome browser
casper.userAgent('Mozilla/5.0 (Linux; Android 4.0.4; Galaxy Nexus Build/IMM76B) AppleWebKit/535.19 (KHTML, like Gecko) Chrome/18.0.1025.133 Mobile Safari/535.19');

// Positional arguments: the URL to scrape and the file to save the HTML to
var url = casper.cli.args[0];
var outputPath = casper.cli.args[1];

casper.start(url);

casper.then(function() {
  // Wait 5 seconds so that scripts on the page have time to run
  casper.wait(5000, function() {
    // Get HTML
    var html = this.evaluate(function() {
      return document.querySelector("html").outerHTML;
    });

    // Save HTML
    fs.write(outputPath, html, 'w');
  });
});

casper.run();
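
The script above takes its two arguments as given, so a missing URL or output path only shows up later as a confusing failure. Below is a minimal sketch of a check you could run first. The name `validateArgs` and the error messages are my own assumptions for illustration and are not part of CasperJS.

```javascript
// validateArgs is a hypothetical helper, not a CasperJS function.
// It checks the two positional arguments that scrape.js expects.
function validateArgs(args) {
  if (!args || args.length < 2) {
    return { ok: false, error: 'Usage: casperjs scrape.js <url> <outputPath>' };
  }
  // String() is defensive in case an argument arrives as a non-string value.
  var url = String(args[0]);
  var outputPath = String(args[1]);
  // Accept only http(s) URLs; anything else is probably a typo.
  if (!/^https?:\/\/\S+$/i.test(url)) {
    return { ok: false, error: 'Not an http(s) URL: ' + url };
  }
  if (outputPath.length === 0) {
    return { ok: false, error: 'Output path must not be empty' };
  }
  return { ok: true, url: url, outputPath: outputPath };
}
```

In scrape.js you could call validateArgs(casper.cli.args) before casper.start(), and when ok is false, print the error with casper.echo() and stop with casper.exit(1).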

After that, run the command below and the HTML of your target URL will be saved to the path you specified.

casperjs scrape.js https://google.com /tmp/html.txt --ssl-protocol=TLSv1
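
If you want to start the same command from a Node.js program, for example to scrape several URLs in turn, something like the following sketch might work. It assumes the casperjs binary is on your PATH and that scrape.js is in the current directory. The names `buildCasperArgs` and `scrape` are hypothetical and only for illustration.

```javascript
// run-scrape.js (hypothetical wrapper, run with Node.js rather than CasperJS)
var execFile = require('child_process').execFile;

// Builds the same argument list as the command shown above.
function buildCasperArgs(url, outputPath) {
  return ['scrape.js', url, outputPath, '--ssl-protocol=TLSv1'];
}

// Runs casperjs once and reports the result through a Node-style callback.
// Assumption: the casperjs executable can be found on PATH.
function scrape(url, outputPath, callback) {
  execFile('casperjs', buildCasperArgs(url, outputPath), function (err, stdout) {
    callback(err, stdout);
  });
}
```

Calling scrape('https://google.com', '/tmp/html.txt', callback) would then be equivalent to typing the command by hand, with any launch error passed to the callback.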

Happy Hacking!

Special Thanks