Add a benchmark that runs the sequential Rust word count twice in Python threads to demonstrate parallelism. Also slightly simplify the benchmark functions.
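
For illustration, a benchmark of this shape could look roughly like the sketch below. It is not the code from this change: `count_word`, `run_twice_in_threads`, and `timed` are hypothetical names, and `count_word` is a pure-Python stand-in for the sequential Rust function. With the stand-in, the two calls will not actually overlap on a standard CPython build, because the GIL serializes them; a speedup would only be expected when the native function releases the GIL while it works.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def count_word(text: str, needle: str) -> int:
    """Hypothetical pure-Python stand-in for the sequential Rust word count."""
    return sum(1 for word in text.split() if word == needle)


def run_twice_in_threads(func, *args):
    """Call func(*args) twice, each on its own Python thread; return both results."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [pool.submit(func, *args) for _ in range(2)]
        return [future.result() for future in futures]


def timed(func, *args):
    """Return (result, elapsed_seconds) for a single call to func(*args)."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    text = "rust python rust word count " * 100_000

    # Baseline: one sequential call.
    _, once = timed(count_word, text, "rust")

    # Same work done twice, submitted to two threads.
    _, twice = timed(run_twice_in_threads, count_word, text, "rust")

    print(f"one call: {once:.3f}s, two calls in threads: {twice:.3f}s")
```

If the two threaded calls take about as long as one call, the work ran in parallel; if they take about twice as long, it was serialized.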
* Test and benchmark word-count example
* Optimize Rust word_count