Unconstrained Optimization#
Teng-Jui Lin
Content adapted from UW AMATH 301, Beginning Scientific Computing, in Spring 2020.
Unconstrained optimization methods
- Derivative-free method
  - Golden section search
- Derivative method
  - Gradient descent
- scipy implementations
  - Minimize scalar functions by scipy.optimize.fminbound()
  - Minimize multivariable functions by scipy.optimize.fmin()
Golden section search#
Goal: find the minimum of a single-variable unimodal function \(f(x)\).
- unconstrained
- derivative-free
In a given interval \([a, b]\), we pick two interior points \(x_1\) and \(x_2\) with \(a<x_1<x_2<b\). Comparing the function values at the two points determines the new interval:
- If \(f(x_1) \le f(x_2)\), shrink the interval to \([a, x_2]\)
- If \(f(x_1) > f(x_2)\), shrink the interval to \([x_1, b]\)
The golden section search minimizes the number of function evaluations rather than the number of iterations, so we want to
- have a constant reduction factor \(c\):
  - If \(f(x_1) \le f(x_2)\), \(c = \dfrac{x_2 - a}{b-a} \implies x_2 = (1-c)a + cb\)
  - If \(f(x_1) > f(x_2)\), \(c = \dfrac{b-x_1}{b-a} \implies x_1 = ca + (1-c)b\)
- reuse calculated points. For example, if \(f(x_1) \le f(x_2)\), the new interval is \([a, x_2^{\text{old}}]\), and reusing \(x_1^{\text{old}}\) as the new \(x_2\) requires
  - \(x_{2}^{\text{new}} = x_{1}^{\text{old}} = ca+(1-c)b\)
  - \(x_{2}^{\text{new}} = (1-c)a + c\,x_{2}^{\text{old}} = (1-c)a + c[(1-c)a + cb]\)
Equating the two expressions for \(x_2^{\text{new}}\) and matching the coefficient of \(b\), we get \(c^2 = 1-c\), i.e. \(c^2 + c - 1 = 0\), whose positive root is the inverse golden ratio \(c = \dfrac{\sqrt{5}-1}{2} \approx 0.618\).
Therefore, we can reuse points during iteration by updating:
- If \(f(x_1) \le f(x_2)\), set \(b = x_2^{\text{old}}\) and
  - \(x_2^{\text{new}} = x_1^{\text{old}}\)
  - \(x_1^{\text{new}} = ca + (1-c)b\)
- If \(f(x_1) > f(x_2)\), set \(a = x_1^{\text{old}}\) and
  - \(x_1^{\text{new}} = x_2^{\text{old}}\)
  - \(x_2^{\text{new}} = (1-c)a + cb\)
so that only one new function evaluation is needed per iteration. A quick numeric check of this reuse property is sketched below.
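As a quick sanity check (added here, not in the original notes), the snippet below verifies numerically that with \(c = \dfrac{\sqrt{5}-1}{2}\) the interior point of the reduced interval coincides with the point being reused; the interval \([0, 10]\) is an arbitrary choice:
import numpy as np

c = (np.sqrt(5) - 1)/2   # inverse golden ratio
a, b = 0.0, 10.0         # arbitrary starting interval
x1 = c*a + (1 - c)*b     # old interior points
x2 = (1 - c)*a + c*b

# Case f(x1) <= f(x2): the interval shrinks to [a, x2]
b_new = x2
x2_new = (1 - c)*a + c*b_new
print(np.isclose(x2_new, x1))   # True: the old x1 is reused as the new x2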
Implementation#
Problem Statement. Find the minimum of the function
\(f(x) = -\left(\dfrac{1}{3}e^{-x/2} + 3xe^{-x/2}\right)\)
on the interval \([0, 10]\) using golden section search.
import numpy as np
def golden_section_search(f, a, b, tolerance=1e-6, max_iter=5000):
    '''
    Find the minimum of a single-variable unimodal function using the golden section search method.
    :param f: objective function
    :param a: lower bound of interval
    :param b: upper bound of interval
    :param tolerance: tolerance of stopping criteria
    :param max_iter: maximum iterations allowed for calculation
    :returns: minimum of objective function
    '''
    c = (np.sqrt(5) - 1)/2  # constant reduction factor (inverse golden ratio)
    x1 = c*a + (1 - c)*b
    x2 = (1 - c)*a + c*b
    f1 = f(x1)
    f2 = f(x2)
    i = 0  # iteration counter
    # golden section search logic
    while (b - a) > tolerance and i < max_iter:
        if f1 <= f2:
            b = x2
            x2 = x1
            f2 = f1
            x1 = c*a + (1 - c)*b
            f1 = f(x1)
        else:
            a = x1
            x1 = x2
            f1 = f2
            x2 = (1 - c)*a + c*b
            f2 = f(x2)
        i += 1
    # maximum iteration warning
    if i == max_iter:
        import warnings
        warnings.warn(f'Maximum iterations reached. Final interval width is {b-a}', UserWarning)
    result = (a + b)/2  # take average of upper and lower bound
    return result
def test_func(x):
    return -(1/3*np.exp(-x/2) + 3*x*np.exp(-x/2))
a = 0
b = 10
golden_section_search(test_func, a, b)
1.888888807141935
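For reference (a check added here, not in the original notes): differentiating the test function gives \(f'(x) = e^{-x/2}\left(\tfrac{3}{2}x - \tfrac{17}{6}\right)\), so the exact minimizer is \(x = \tfrac{17}{9} \approx 1.888889\), in agreement with the search result:
# compare the numerical result with the exact minimizer x = 17/9
print(abs(golden_section_search(test_func, a, b) - 17/9) < 1e-6)   # expected: True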
Finding minimum using scipy.optimize.fminbound()#
scipy.optimize.fminbound() finds the minimum of a scalar function in a given interval.
import scipy
from scipy import optimize
scipy.optimize.fminbound(test_func, a, b)
1.8888887480521486
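If the minimum function value and the number of function evaluations are also of interest, fminbound() can return them through its full_output option (a usage sketch added here, not in the original notes):
# full_output=True also returns the minimum value, an error flag, and the evaluation count
xmin, fval, ierr, numfunc = scipy.optimize.fminbound(test_func, a, b, full_output=True)
print(xmin, fval, numfunc)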
Gradient descent#
Goal: find the minimum of a multi-variable function \(f(\mathbf{x})\).
- unconstrained
- derivative method
Gradient descent steps in the steepest downhill direction (the negative gradient) and chooses the point along that direction where the function value is smallest. For each guess \(\mathbf{p_n}\), we calculate \(\mathbf{p_{n+1}}\) by
1. Calculate \(\nabla f(\mathbf{p_n})\)
2. Define \(\phi(t) = \mathbf{p_n} - t\nabla f(\mathbf{p_n})\)
3. Define \(f(\phi(t))\)
4. Find the \(t^*\) that minimizes \(f(\phi(t))\) using fminbound()
5. Define \(\mathbf{p_{n+1}} = \phi(t^*)\)
The one-dimensional minimization in step 4 (the fminbound() line search) is the rate-limiting step. A single iteration is sketched below before the full implementation.
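To make the steps concrete, here is a sketch (added here, not in the original notes) of a single iteration for the same objective used in the implementation below; the definitions of f and fgrad duplicate the ones there:
import numpy as np
from scipy import optimize

# same objective and gradient as in the implementation below
f = lambda p: (p[0] - 2)**2 + (p[1] + 1)**2 + 5*np.sin(p[0])*np.sin(p[1]) + 100
fgrad = lambda p: np.array([2*(p[0] - 2) + 5*np.cos(p[0])*np.sin(p[1]),
                            2*(p[1] + 1) + 5*np.sin(p[0])*np.cos(p[1])])

p = np.array([6.0, 4.0])                     # current guess p_n
grad = fgrad(p)                              # step 1: gradient at p_n
phi = lambda t: p - t*grad                   # step 2: line through p_n in the downhill direction
f_of_phi = lambda t: f(phi(t))               # step 3: objective restricted to that line
tstar = optimize.fminbound(f_of_phi, 0, 1)   # step 4: best step size in [0, 1]
p_next = phi(tstar)                          # step 5: next guess p_{n+1}
print(p_next)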
Implementation#
Problem Statement. Find the minimum of the function
\(f(x, y) = (x-2)^2 + (y+1)^2 + 5\sin x \sin y + 100\)
using gradient descent and compare with the result from scipy.optimize.fmin().
def gradient_descent(f, fgrad, x0, tolerance=1e-6, max_iter=10000):
    '''
    Find a minimum of a multivariable function by gradient descent.
    :param f: objective function
    :param fgrad: gradient of objective function
    :param x0: initial guess (starting point)
    :param tolerance: tolerance of stopping criteria
    :param max_iter: maximum iterations allowed for calculation
    :returns: minimum of objective function
    '''
    import scipy
    from scipy import optimize
    i = 0  # iteration counter
    x = x0
    grad = fgrad(x)
    # gradient descent logic
    while np.linalg.norm(grad, np.inf) > tolerance and i < max_iter:
        grad = fgrad(x)
        phi = lambda t : x - t*grad                      # line through x in the downhill direction
        f_of_phi = lambda t : f(phi(t))                  # objective restricted to that line
        tmin = scipy.optimize.fminbound(f_of_phi, 0, 1)  # optimal step size
        x = phi(tmin)  # use next point with minimum function value
        i += 1
    # maximum iteration warning
    if i == max_iter:
        import warnings
        warnings.warn(f'Maximum iterations reached. Final gradient norm is {np.linalg.norm(grad, np.inf)}', UserWarning)
    return x
fxy = lambda x, y : (x - 2)**2 + (y + 1)**2 + 5*np.sin(x)*np.sin(y) + 100
f = lambda p : fxy(*p)                                      # objective as a function of a point p = (x, y)
fgradx = lambda x, y : 2*(x - 2) + 5*np.cos(x)*np.sin(y)    # partial derivative with respect to x
fgrady = lambda x, y : 2*(y + 1) + 5*np.sin(x)*np.cos(y)    # partial derivative with respect to y
fgrad = lambda p : np.array([fgradx(*p), fgrady(*p)])       # gradient as a function of a point
x0 = [6, 4]
gradient_descent(f, fgrad, x0)
array([ 1.69484631, -1.40628341])
# compare with scipy
scipy.optimize.fmin(f, x0)
Optimization terminated successfully.
Current function value: 95.363597
Iterations: 42
Function evaluations: 82
array([ 1.69487388, -1.40630558])
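The two minimizers agree to about four decimal places. As an extra check (a sketch added here, not in the original notebook), the objective values at the two results can be compared directly; both should be approximately 95.3636:
x_gd = gradient_descent(f, fgrad, x0)
x_fmin = scipy.optimize.fmin(f, x0, disp=False)   # disp=False suppresses the convergence printout
print(f(x_gd), f(x_fmin))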
Gradient descent with fixed steps#
Goal: find the minimum of a multi-variable function \(f(\mathbf{x})\).
- unconstrained
- derivative method
Gradient descent with fixed steps chooses the next point in the steepest downhill direction, with a constant learning rate (fixed step) \(t\). For each guess \(\mathbf{p_n}\), we calculate \(\mathbf{p_{n+1}}\) by
1. Calculate \(\nabla f(\mathbf{p_n})\)
2. Define \(\mathbf{p_{n+1}} = \mathbf{p_{n}} - t\nabla f(\mathbf{p_n})\)
Calculating the gradient is the rate-limiting step. The learning rate needs to be tuned: too small a rate converges slowly, while too large a rate overshoots and may fail to converge, as illustrated in the sketch below.
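A brief illustration of this sensitivity, reusing f and fgrad from the gradient descent section above (the rates 0.05, 0.2, and 1.1 and the 100-step budget are arbitrary choices for demonstration):
# compare the behavior of a few learning rates
for rate in [0.05, 0.2, 1.1]:
    x = np.array([6.0, 4.0])
    for _ in range(100):          # same number of fixed steps for each rate
        x = x - rate*fgrad(x)
    print(rate, x, f(x))          # small rates converge slowly; a too-large rate does not settle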
Implementation#
Problem Statement. Find the minimum of the function
\(f(x, y) = (x-2)^2 + (y+1)^2 + 5\sin x \sin y + 100\)
using gradient descent with fixed steps and compare with the result from scipy.optimize.fmin().
def gradient_descent_fixed_step(f, fgrad, x0, learning_rate, tolerance=1e-6, max_iter=10000):
    '''
    Find a minimum of a multivariable function by gradient descent with a fixed step (learning rate).
    :param f: objective function
    :param fgrad: gradient of objective function
    :param x0: initial guess (starting point)
    :param learning_rate: learning rate (fixed step size)
    :param tolerance: tolerance of stopping criteria
    :param max_iter: maximum iterations allowed for calculation
    :returns: minimum of objective function
    '''
    i = 0  # iteration counter
    x = x0
    grad = fgrad(x)
    # gradient descent with fixed step logic
    while np.linalg.norm(grad, np.inf) > tolerance and i < max_iter:
        grad = fgrad(x)
        x = x - learning_rate*grad  # take next step downhill with fixed step size
        i += 1
    # maximum iteration warning
    if i == max_iter:
        import warnings
        warnings.warn(f'Maximum iterations reached. Final gradient norm is {np.linalg.norm(grad, np.inf)}', UserWarning)
    return x
x0 = [6, 4]
gradient_descent_fixed_step(f, fgrad, x0, 0.2)
array([ 1.69484629, -1.40628343])
# compare with scipy
scipy.optimize.fmin(f, x0)
Optimization terminated successfully.
Current function value: 95.363597
Iterations: 42
Function evaluations: 82
array([ 1.69487388, -1.40630558])
Finding minimum using scipy.optimize.fmin()#
scipy.optimize.fmin() finds a minimum of a multivariable function using the Nelder-Mead downhill simplex method.
scipy.optimize.fmin(f, x0)
Optimization terminated successfully.
Current function value: 95.363597
Iterations: 42
Function evaluations: 82
array([ 1.69487388, -1.40630558])
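fmin() also accepts convergence controls and output options. A usage sketch (the tolerance values here are arbitrary choices for illustration):
# tighter tolerances on the parameters (xtol) and function values (ftol),
# with the printout suppressed and full diagnostics returned
xopt, fopt, n_iter, n_evals, warnflag = scipy.optimize.fmin(
    f, x0, xtol=1e-8, ftol=1e-8, disp=False, full_output=True)
print(xopt, fopt)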