unclear or unsuitable licensing.
In general, we replace the files we can with liberally licensed data,
and remove all the others (in particular all the parts of the Canterbury
corpus that are not clearly in the public domain). The replacements
do not always have exactly the same characteristics as the originals,
but they are more than good enough to be useful for benchmarking.

git-svn-id: https://snappy.googlecode.com/svn/trunk@83 03e5f5b5-db94-4691-08a0-1a8bf15f6143