Perceptual Webhashing with PhantomJS and ImageMagick Compare

We can use PhantomJS to harvest snapshots of a website over time, like this:

#!/bin/bash
# Capture a snapshot of the page every five minutes,
# naming each file by its Unix timestamp.
while true
do
     phantomjs /home/csr/projects/webrender-phantomjs/webrender/examples/rasterize.js http://www.dr.dk/nyheder "$(date +%s).dr.png" 1280px
     sleep 300s
done

This uses the rasterize.js script from the British Library. Recent versions (>= 6.8) of ImageMagick's compare command include a perceptual hash metric (-metric phash), which allows both a qualitative and a quantitative comparison between images, even when their sizes differ. For example:

#!/bin/bash
# Compare each snapshot with the one before it, oldest first.
first=0
for file in $(ls -rt *.dr.png)
do
   if [ $first -eq 0 ]
   then
         file1=$file
   else
         file2=$file1
         file1=$file
         # The filename prefix is the Unix timestamp of the capture.
         date1=$(date -d "@${file1%%.*}" -Iminutes)
         date2=$(date -d "@${file2%%.*}" -Iminutes)
         # compare prints the metric on stderr; 2>&1 >/dev/null captures it and discards stdout.
         echo "$file2" "$date2" "$file1" "$date1" "$(/opt/imagemagick-6.9/bin/compare -metric phash "$file2" "$file1" "$file2.$file1.diff.png" 2>&1 >/dev/null)"
   fi
   first=1
done

This gives output that looks like:

1449216835.dr.png 2015-12-04T09:13+0100 1449217139.dr.png 2015-12-04T09:18+0100 1.10119
1449217139.dr.png 2015-12-04T09:18+0100 1449217443.dr.png 2015-12-04T09:24+0100 11.3124
1449217443.dr.png 2015-12-04T09:24+0100 1449217747.dr.png 2015-12-04T09:29+0100 0.105403
1449217747.dr.png 2015-12-04T09:29+0100 1449218052.dr.png 2015-12-04T09:34+0100 0.248067
1449218052.dr.png 2015-12-04T09:34+0100 1449218356.dr.png 2015-12-04T09:39+0100 0.0913965
1449218356.dr.png 2015-12-04T09:39+0100 1449218661.dr.png 2015-12-04T09:44+0100 0.0234691
1449218661.dr.png 2015-12-04T09:44+0100 1449218967.dr.png 2015-12-04T09:49+0100 0.915116
1449218967.dr.png 2015-12-04T09:49+0100 1449219272.dr.png 2015-12-04T09:54+0100 0.00781808
1449219272.dr.png 2015-12-04T09:54+0100 1449219576.dr.png 2015-12-04T09:59+0100 0.0145241
1449219576.dr.png 2015-12-04T09:59+0100 1449219883.dr.png 2015-12-04T10:04+0100 21.7091
1449219883.dr.png 2015-12-04T10:04+0100 1449220194.dr.png 2015-12-04T10:09+0100 6.08325
1449220194.dr.png 2015-12-04T10:09+0100 1449220500.dr.png 2015-12-04T10:15+0100 1.73225
1449220500.dr.png 2015-12-04T10:15+0100 1449220807.dr.png 2015-12-04T10:20+0100 6.5083
1449220807.dr.png 2015-12-04T10:20+0100 1449221111.dr.png 2015-12-04T10:25+0100 13.2742

Small numbers indicate similarity. So here, for example, there is a period of stasis between about 09:24 and 09:59, after which the page becomes more dynamic. We can see this visually using ImageMagick's montage function:

 montage -geometry +0+0 -tile x1 *.diff.png -resize 10% -quality 15 montage.dr.jpg

In the montage, the red areas indicate change. Specifically, the large change between 09:59 and 10:04 is due to a change in the main photo (from a distant shot of Lars Løkke Rasmussen to a close-up). Note that changes to the banner advert at the bottom of the page (sometimes present, sometimes not) confuse the result somewhat.
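
Reading the numbers by eye can be approximated with a simple threshold. The following is a minimal sketch and not part of the scripts above: flag_changes is a hypothetical helper name, and the default threshold of 5 is an arbitrary value chosen for illustration that would likely need tuning for each site. It assumes the five-column output format shown earlier, with the phash value in the last column.

```shell
#!/bin/bash
# Hypothetical helper: read comparison output on stdin and print only the
# transitions whose phash value (5th column) exceeds a threshold.
# The default threshold of 5 is an assumption for illustration only.
flag_changes() {
    local threshold="${1:-5}"
    # Columns: older-file older-date newer-file newer-date phash-value
    awk -v t="$threshold" '$5 + 0 > t { print $2, "->", $4, $5 }'
}
```

If the comparison output has been saved to a file (say, a hypothetical comparison.txt), then `flag_changes 5 < comparison.txt` would list only the transitions whose value is above 5, one per line, as the two timestamps followed by the value.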