...
That's only slightly trickier. You have to tell Crawljax not only to proxy HTTPS traffic but also to accept untrusted certificates, such as the one warcprox presents. The code is in https://github.com/csrster/crawljax-archiver/blob/master/src/main/java/CrawlJAXProxyHarvester.java .
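To make those two requirements concrete without reproducing the Crawljax setup (which lives in the linked file), here is a minimal sketch using only the JDK's own HTTP client. It is not the code from CrawlJAXProxyHarvester.java. The class name `ProxyTrustSketch`, the method `buildClient`, and the proxy address `localhost:8000` are assumptions made for illustration; substitute whatever host and port your warcprox instance actually listens on.

```java
import java.net.InetSocketAddress;
import java.net.ProxySelector;
import java.net.http.HttpClient;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.security.cert.X509Certificate;
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManager;
import javax.net.ssl.X509TrustManager;

// Hypothetical illustration class, not part of crawljax-archiver.
final class ProxyTrustSketch {

    // Builds an HTTP client that (1) sends all traffic through the given proxy
    // and (2) accepts certificate chains it cannot validate, such as the ones
    // a recording proxy signs with its own CA.
    static HttpClient buildClient(String proxyHost, int proxyPort)
            throws GeneralSecurityException {
        // Trust manager that skips certificate chain validation entirely.
        // Only sensible for an archiving crawl behind a proxy you control.
        TrustManager trustAll = new X509TrustManager() {
            @Override
            public void checkClientTrusted(X509Certificate[] chain, String authType) {
                // intentionally empty: accept any client chain
            }

            @Override
            public void checkServerTrusted(X509Certificate[] chain, String authType) {
                // intentionally empty: accept any server chain
            }

            @Override
            public X509Certificate[] getAcceptedIssuers() {
                return new X509Certificate[0];
            }
        };

        SSLContext sslContext = SSLContext.getInstance("TLS");
        sslContext.init(null, new TrustManager[] {trustAll}, new SecureRandom());

        return HttpClient.newBuilder()
                .proxy(ProxySelector.of(new InetSocketAddress(proxyHost, proxyPort)))
                .sslContext(sslContext)
                .build();
    }

    public static void main(String[] args) throws GeneralSecurityException {
        // Assumed proxy address; adjust to match your warcprox instance.
        HttpClient client = buildClient("localhost", 8000);
        System.out.println("Proxy configured: " + client.proxy().isPresent());
    }
}
```

The sketch only constructs the client and sends no request, so it runs without a proxy being up. In the real harvester the same two settings have to be handed to the browser that Crawljax drives rather than to a plain HTTP client.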
Show me some pictures
So that's solved all our webcrawling problems
...