QuantEcon 1: SciPy Submodules

Python
SciPy
NumPy
scipy.optimize
Newton-Raphson method
Brent's method
timeit
bisection method
optimization
Author

Galen Seilis

Published

March 28, 2019

As someone with previous background in Python, I’ve been blasting my way through the basics of the QuantEcon curriculum. One of the joys of self-directed learning is that, with discipline, you can speed through familiar material and really camp out with the new material. With that in mind, I’ve decided to play further with finding roots (x-intercepts) of some single-variable functions.

First of all, let’s find ourselves an interesting function. I’ve chosen \(f(x) = \sin(x) \exp(-x)\) because I’ve always enjoyed its decaying oscillations, but also because I expect this equation to have solutions. Since \(\sin(x) = 0\) whenever \(x = k \pi\) for any integer \(k\), we know that \(f(k \pi) = 0\) as well. While I’m quite late (or too early, depending on how you see it) for calculating \(\pi\) on \(\pi\) Day, let’s take \(k = 1\) to find \(\pi\) anyway!
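As a quick sanity check (this snippet is mine, not from the QuantEcon material), we can define the function and evaluate it at a few multiples of \(\pi\); the results should be zero up to floating-point error:

import numpy as np

# f(k*pi) should vanish for integer k; tiny nonzero values are floating-point error
f = lambda x: np.sin(x) * np.exp(-x)
print([f(k * np.pi) for k in range(4)])  # values on the order of 1e-16 or smaller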

Bisection Method

The first method mentioned on QuantEcon is the bisection algorithm, which essentially treats root finding as a binary search problem (a minimal hand-rolled version appears at the end of this section). Two parameters are needed to get started: an initial lower bound and an initial upper bound on the search interval, chosen so that the function changes sign between them. Not only do we need these two numbers, but our choice of them can change which solution is found. Let’s consider the following example, where we look on the interval \([-10, 10]\).

import numpy as np
from scipy.optimize import bisect
# Define a single-variable function whose roots we want to find
f = lambda x: np.sin(x) * np.exp(-x)
# try out the bisection algorithm
print(bisect(f, -10, 10))
0.0

We were looking for \(x = \pi\), but we got \(x = 0\) instead. If there are multiple solutions within your search interval, the algorithm won’t necessarily converge to the one that you wanted, nor will it report that there were multiple solutions. Knowing ahead of time that we’d like to calculate \(\pi\), and that \(3 < \pi < 4\), let’s rerun the bisection algorithm on \([3, 4]\).

print(bisect(f, 3, 4))
3.1415926535901235

That gives us a value pretty close to \(\pi\), correct to the \(11\)th digit anyway.
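To make the binary-search analogy concrete, here is a minimal hand-rolled sketch of the bisection loop; the helper name bisection is mine, and this is not SciPy’s actual implementation:

def bisection(func, lo, hi, tol=1e-12, max_iter=200):
    # assumes func(lo) and func(hi) have opposite signs
    f_lo = func(lo)
    for _ in range(max_iter):
        mid = (lo + hi) / 2
        f_mid = func(mid)
        if f_mid == 0 or (hi - lo) / 2 < tol:
            return mid
        if f_lo * f_mid < 0:
            hi = mid  # the sign change, and hence a root, is in the left half
        else:
            lo, f_lo = mid, f_mid  # the sign change is in the right half
    return mid

print(bisection(f, 3, 4))  # lands very close to pi, like scipy's bisect

Each pass keeps the half-interval containing the sign change, so the error bound halves on every iteration.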

Newton-Raphson method

The Newton-Raphson method is a calculus-based method that iteratively steps toward a solution. Like the bisection method, it requires a number chosen ahead of time, but here that number is an initial guess (a starting point) rather than a pair of bounds. Unlike the bisection method, Newton-Raphson places no bounds on the search, so a continuous function over the real numbers could in principle be searched indefinitely. To keep the algorithm from searching too long when it fails to converge on a stable solution, SciPy caps the number of iterations (steps) with a maxiter parameter, which defaults to \(50\).
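The update rule is \(x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}\). Here is a minimal sketch of that iteration (the helper newton_sketch is mine, not SciPy’s implementation), using the hand-computed derivative \(f'(x) = e^{-x}(\cos x - \sin x)\):

def newton_sketch(func, dfunc, x0, tol=1.48e-8, max_iter=50):
    # repeat the Newton update until the step size falls below tol
    x = x0
    for _ in range(max_iter):
        step = func(x) / dfunc(x)
        x -= step
        if abs(step) < tol:
            return x
    raise RuntimeError('did not converge within max_iter steps')

df = lambda x: np.exp(-x) * (np.cos(x) - np.sin(x))  # analytic derivative of f
print(newton_sketch(f, df, 0.2))  # heads toward the root at x = 0

SciPy’s scipy.optimize.newton performs essentially this iteration, falling back to a secant update when no derivative is supplied, as in the call below.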

from scipy.optimize import newton
# try out the Newton-Raphson algorithm on f from above
print(newton(f, 0.2))
3.6499361606787994e-14

While \(x = 0.2\) is not that far off from \(\pi\), the local derivative there points the iterates downhill toward the root at zero. Notice that the solution we got was not exactly zero, but rather the first value found within the default tolerance of \(1.48 \times 10^{-8}\). What you don’t see from the code is the shape of the curve: if you plot our function, you’ll find a local maximum between \(x = 0\) and \(x = \pi\) at \(x = \frac{\pi}{4}\). Relative to this hill, our estimate behaved like a ball rolling in the direction of steepest descent. The analogy breaks down for roots separated by a local minimum, because the method is not equivalent to steepest descent even though it is based on the local derivative. Another pitfall is picking an initial value close to an extremum, where the derivative is nearly zero: the results can be unstable, allowing incredibly large jumps across the domain. We should therefore be cautious about our choice of initial guess and do some exploration of the function’s properties before attempting to estimate the root.
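Since the argument above appeals to the shape of the curve, here is a quick plotting sketch (assuming matplotlib is installed; this snippet is mine) that makes the hill at \(x = \frac{\pi}{4}\) visible:

import matplotlib.pyplot as plt

xs = np.linspace(0, 2 * np.pi, 400)
plt.plot(xs, f(xs), label='f(x) = sin(x) exp(-x)')
plt.axvline(np.pi / 4, linestyle='--', color='gray', label='local max at pi/4')
plt.axhline(0, color='black', linewidth=0.5)
plt.legend()
plt.show()

Starting to the right of that hill should send the Newton iterates toward \(\pi\), so let’s retry with a more suitable initial value.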

# try out the Newton-Raphson algorithm
print(newton(f, np.pi / 4 + 1))
3.1415926535897936

That is clearly closer to \(\pi\) than \(3.6499361606787994 \times 10^{-14}\), and being accurate for the first \(16\) digits suggests that it was more precise than the bisection algorithm under these parameters.

Brent’s method

The QuantEcon course points out that bisection is more robust (stable) than the Newton-Raphson method, but it is also slower. An alternative approach that balances this tradeoff is Brent’s method, which keeps a sign-changing bracket like bisection but accelerates it with secant and inverse quadratic interpolation steps, guaranteeing convergence for a continuous function on such a bracket. Let’s give this approach a try on our function on \([3, 4]\).

from scipy.optimize import brentq
# try out Brent's algorithm
print(brentq(f, 3, 4))
3.141592653589788

Looks like this estimation of \(\pi\) was correct for the first \(14\) digits, which was better than bisection but worse than Newton-Raphson.
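As an aside beyond the original accuracy comparison, SciPy’s bracketing root finders accept a full_output=True flag that returns a RootResults object, so we can also compare how much work each method did on the same bracket:

# compare iteration counts on the same bracket
root, info = bisect(f, 3, 4, full_output=True)
print('bisect:', info.iterations, 'iterations,', info.function_calls, 'function calls')
root, info = brentq(f, 3, 4, full_output=True)
print('brentq:', info.iterations, 'iterations,', info.function_calls, 'function calls')
# expect bisection to need far more iterations, since it only halves the bracket each step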

Performance comparison with timeit

Last of all, it would be interesting to compare the time performance of each of these root-finding approaches. Let’s do that with timeit.

from timeit import timeit
# average seconds per call over 100,000 runs of each method
print(timeit(stmt='bisect(f, 3, 4)',
             globals={'bisect': bisect, 'f': f},
             number=100000) / 100000)
print(timeit(stmt='newton(f, np.pi / 4 + 1)',
             globals={'newton': newton, 'f': f, 'np': np},
             number=100000) / 100000)
print(timeit(stmt='brentq(f, 3, 4)',
             globals={'brentq': brentq, 'f': f},
             number=100000) / 100000)
0.00010017173054002341
0.0001319278220400156
2.3580086980000487e-05

We find under this setup that the slowest algorithm was the Newton-Raphson method, that the bisection method was roughly a quarter faster than it, and that Brent’s method was faster still, by a factor of four to five! So Brent’s method gave us more accurate digits than bisection, at least for \(x = \pi\), and also ran faster than the other two methods. Does this mean that Brent’s method is always the best method? Not necessarily. We should be open to the possibility of tradeoffs not discussed on QuantEcon, as well as to the panoply of other root-finding algorithms available in code repositories.