# Python multithreading solution

Here, we will create a simple stochastic calculation of pi, and then parallelize it using multiprocessing (and multithreading to compare).

```
import random
```

```
def sample(n):
    """Make n trials of points in the square.  Return (n, number_in_circle)

    This is our basic function.  By design, it returns everything it
    needs to compute the final answer: both n (even though it is an input
    argument) and n_inside_circle.  To compute our final answer, all we
    have to do is sum up the n:s and the n_inside_circle:s and do our
    computation"""
    n_inside_circle = 0
    for i in range(n):
        x = random.random()
        y = random.random()
        if x**2 + y**2 < 1.0:
            n_inside_circle += 1
    return n, n_inside_circle
```

```
%%timeit
# Do it just for timing
n, n_inside_circle = sample(10**6)
```

```
314 ms ± 1.37 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
```

```
# Do the actual calculation (the previous result doesn't get saved)
n, n_inside_circle = sample(10**6)
```

This is the “calculate answer” phase.

```
pi = 4.0 * (n_inside_circle / n)
pi
```

```
3.140036
```

## Do it in parallel with multiprocessing

This divides the calculation into 10 tasks and runs `sample` on each of them. Then it recombines the results.

```
import multiprocessing.pool
pool = multiprocessing.pool.Pool()
# The default pool makes one process per CPU
```

```
%%timeit
# Do it once to time it
results = pool.map(sample, [10**5] * 10)
```

```
152 ms ± 437 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
```

```
# Do it again to get the results, since the results of the above
# cell aren't accessible because of the %%timeit magic.
results = pool.map(sample, [10**5] * 10)
```

```
pool.close()
```

```
n_sum = sum(x[0] for x in results)
n_inside_circle_sum = sum(x[1] for x in results)
pi = 4.0 * (n_inside_circle_sum / n_sum)
pi
```

```
3.139504
```

## Do it in “parallel” with threads

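For comparison, run the same tasks on a thread pool. This should not be much faster than the serial version, because in standard CPython the global interpreter lock (GIL) allows only one thread to execute Python code at a time within a process.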

```
threadpool = multiprocessing.pool.ThreadPool()
```

```
%%timeit -o
# Do it once to time it
threadpool.map(sample, [10**5] * 10)
```

```
288 ms ± 1.63 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
```

```
<TimeitResult : 288 ms ± 1.63 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)>
```

```
# Do it again to get the results, since the results of the above
# cell aren't accessible because of the %%timeit magic.
results = threadpool.map(sample, [10**5] * 10)
```

```
threadpool.close()
```

```
n_sum = sum(x[0] for x in results)
n_inside_circle_sum = sum(x[1] for x in results)
pi = 4.0 * (n_inside_circle_sum / n_sum)
pi
```

```
3.142448
```

## Future ideas

You could make a separate `calculate` function that takes a list of results and returns pi. It can be used whether or not the results were produced with multiprocessing.
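One possible shape for such a function is sketched below. The name `calculate` comes from the suggestion above; its exact signature is an assumption, and `sample` is repeated only so that the block runs on its own. The built-in `map` stands in for `pool.map` here, since both return a list of `(n, n_inside_circle)` pairs.

```
import random


def sample(n):
    """Make n trials of points in the square.  Return (n, number_in_circle)."""
    n_inside_circle = 0
    for i in range(n):
        x = random.random()
        y = random.random()
        if x**2 + y**2 < 1.0:
            n_inside_circle += 1
    return n, n_inside_circle


def calculate(results):
    """Combine a list of (n, n_inside_circle) pairs into an estimate of pi.

    Hypothetical helper: it only sums the pairs, so it does not care
    whether they came from one call, a process pool, or a thread pool.
    """
    n_sum = sum(n for n, _ in results)
    n_inside_circle_sum = sum(inside for _, inside in results)
    return 4.0 * (n_inside_circle_sum / n_sum)


# Serial: a single result, wrapped in a list.
serial_results = [sample(10**5)]
print(calculate(serial_results))

# Chunked: the same list shape that pool.map(sample, [10**4] * 10) returns.
chunked_results = list(map(sample, [10**4] * 10))
print(calculate(chunked_results))
```

Because `calculate` takes a list in every case, the serial run only needs its single result wrapped in a list.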

Notice the similarity to split-apply-combine, and to map-reduce, which is a specialization of split-apply-combine.
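To make the map-reduce reading concrete, here is a minimal sketch of just the reduce step using `functools.reduce`. The `results` values are made-up illustrative numbers, not output from a real run, and `combine` is a hypothetical helper name.

```
from functools import reduce

# Illustrative (n, n_inside_circle) pairs, as the map step would produce.
# These numbers are invented for the example.
results = [(100, 79), (100, 77), (100, 80)]


def combine(a, b):
    """Reduce step: add two (n, n_inside_circle) pairs element-wise."""
    return (a[0] + b[0], a[1] + b[1])


n_sum, n_inside_circle_sum = reduce(combine, results)
pi_estimate = 4.0 * (n_inside_circle_sum / n_sum)
print(n_sum, n_inside_circle_sum, pi_estimate)
```

The map step is `sample` applied to each chunk; the reduce step only needs an associative way to merge two partial results, which is why the partial sums can be combined in any order.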