28 September 2022
A simple demo of how pure functions simplify parallelisation
One thing that bugs me about teaching computer science is the amount of ‘rote learning’ – teaching pupils facts to memorise without understanding.
As an example, one of the rote-learned facts about ‘functional programming’ (FP) – in the AQA A-level syllabus – is that writing systems from pure, side-effect free, functions makes parallelisation easier. But can a pupil at A-level standard really experience this for themselves? Here’s a nice example in C# with a VB translation at the end. (I’m afraid I don’t know Python in enough depth to know how to do it in Python, or even if it is possible – if anyone can help please do.)
Here's a functional approach to identifying all the primes up to a given value. It’s not an efficient efficient algorithm but it serves the challenge well, because it is computationally expensive:
static List<int> Primes(int upTo) => Enumerable.Range(2, upTo - 1).Where(n => IsPrime(n)).ToList();
static bool IsPrime(int n) => !Enumerable.Range(2, n / 2 - 1).Any(f => IsFactor(f, n));
static bool IsFactor(int f, int ofN) => ofN % f == 0;
We can time this (using the system stopwatch) generating all the primes up to 1 million:
using System.Diagnostics;
var sw = new Stopwatch();
sw.Start();
var result = Primes(1000000);
sw.Stop();
Console.WriteLine(sw.ElapsedMilliseconds/1000);
On my laptop (which has an Intel i7 processor) this task takes approx. 140 seconds. Because my Primes function is written using LINQ, which uses, and requires working with, side-effect free functions, I can easily parallelise this just by adding the call AsParallel() at the appropriate point, as shown here:
static List<int> Primes(int upTo) => Enumerable.Range(2, upTo + 1).AsParallel().Where(n => IsPrime(n)).ToList();
Running this again, took approx. 30 seconds on my laptop, which is a speed up of 3.8x. This is because my processor has 4 (physical) cores.
Observing the core usage
You can observe what is happening if you open the Resource Monitor (go to the Task Manager > Performance tab; there’s a link to the Resource Monitor at the bottom of the window).
Running the Primes function again, without the AsParallel(), I found that my processor was running at just under 30% capacity. Surprisingly, perhaps, all the cores are being used, but – at least for the function execution itself (bear in mind there are other processes being run too) no more that one core is actually running the function at any given moment. (Why it is switching between cores is another discussion).
But running with the AsParallel() not only is the processor running at near 100% capacity overall but each of the cores is running at near 100%.
More detail
My processor actually has 8 ‘logical’ cores, but that is no advantage over the number of physical cores on a task like this, which is mathematically intensive, because each pair of logical cores has to share the arithmetic circuits.
You are never going to get 4x improvement for two reasons:
- The processor is still having to run other background processors for the operating system
- There is some overhead from breaking up the data for parallelisation and then re-assembling it to generate the resulting list.
Translation into VB
Public Function Primes(upTo As Integer) As List(Of Integer)
Return Enumerable.Range(2, upTo - 1).Where(Function(n) IsPrime(n)).ToList()
End Function
Public Function IsPrime(n As Integer) As Boolean
Return Not Enumerable.Range(2, n / 2 - 1).Any(Function(f) IsFactor(f, n))
End Function
Public Function IsFactor(f As Integer, ofN As Integer)
Return ofN Mod f = 0
End Function
And with the parallelism:
Public Function Primes(upTo As Integer) As List(Of Integer)
Return Enumerable.Range(2, upTo + 1).AsParallel().Where(Function(n) IsPrime(n)).ToList()
End Function
Discussion
Please login to post a comment
Great demo. Thanks Richard. Really appreciated. I will try with year 13 too.
Thanks,
Just tried this on a MacBook Pro - half the execution time with parallelisation, interestingly the reported CPU utilisation was 97% without parallelisation and hit 373% with, which leads to question about how MacOS measures that…
Thanks Richard - my Y13s will love this. Great explanation of the difference between logical and physical cores.