Unconstrained Optimization#

Teng-Jui Lin

Content adapted from UW AMATH 301, Beginning Scientific Computing, in Spring 2020.

  • Unconstrained optimization methods

    • Derivative-free method

      • Golden section search

    • Derivative method

      • Gradient descent

  • scipy implementations

Finding minimum using scipy.optimize.fminbound()#

scipy.optimize.fminbound() finds the minimum of a scalar function on a given interval.

import scipy
from scipy import optimize
scipy.optimize.fminbound(test_func, a, b)
1.8888887480521486
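
In the call above, test_func, a, and b come from earlier in the notebook. As a minimal self-contained sketch (the quadratic g below is a made-up example, not the original test_func):

import scipy
from scipy import optimize

# hypothetical scalar function with a single minimum inside [0, 4]
g = lambda x : (x - 1.5)**2 + 2

# minimize g on the closed interval [0, 4]; the result should be close to 1.5
scipy.optimize.fminbound(g, 0, 4)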

Gradient descent#

Goal: find the minimum of a multi-variable function \(f(\mathbf{x})\)

  • unconstrained

  • derivative method

Gradient descent moves from the current point in the steepest downhill direction (the negative gradient) and chooses the next point where the function value along that direction is smallest. For each guess \(\mathbf{p_n}\), we calculate \(\mathbf{p_{n+1}}\) as follows:

  1. Calculate \(\nabla f(\mathbf{p_n})\)

  2. Define \(\phi(t) = \mathbf{p_n} - t\nabla f(\mathbf{p_n})\)

  3. Define \(f(\phi(t))\)

  4. Find \(t^*\) using fminbound() to minimize \(f(\phi(t))\)

  5. Define \(\mathbf{p_{n+1}} = \phi(t^*)\)

The one-dimensional optimization over \(t\) (the line search in step 4) is the rate-limiting step.
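
As a minimal self-contained sketch (using a made-up quadratic, not the function from the implementation below), one pass through steps 1-5 looks like this:

import numpy as np
import scipy
from scipy import optimize

# hypothetical 2D quadratic used only to illustrate one iteration
g = lambda p : (p[0] - 1)**2 + 2*(p[1] + 3)**2
ggrad = lambda p : np.array([2*(p[0] - 1), 4*(p[1] + 3)])

p = np.array([5.0, 5.0])                          # current guess p_n
grad = ggrad(p)                                   # step 1: gradient at p_n
phi = lambda t : p - t*grad                       # step 2: ray in the steepest downhill direction
g_of_phi = lambda t : g(phi(t))                   # step 3: objective restricted to the ray
tstar = scipy.optimize.fminbound(g_of_phi, 0, 1)  # step 4: line search for t*
p_next = phi(tstar)                               # step 5: next guess p_{n+1}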

Implementation#

Problem Statement. Find the minimum of the function

\[ f(x, y) = (x-2)^2 + (y+1)^2 + 5\sin x \sin y + 100 \]

using gradient descent and compare with the result from scipy.optimize.fmin().

def gradient_descent(f, fgrad, x0, tolerance=1e-6, max_iter=10000):
    '''
    Find a minimum of a multivariable function by gradient descent with a line search.
    
    :param f: objective function
    :param fgrad: gradient of objective function
    :param x0: initial guess (starting point)
    :param tolerance: tolerance of stopping criteria
    :param max_iter: maximum iterations allowed for calculation
    :returns: approximate minimizer of the objective function
    '''
    import numpy as np
    import scipy
    from scipy import optimize
    
    i = 0  # iteration counter
    x = x0
    grad = fgrad(x)
    
    # gradient descent logic
    while np.linalg.norm(grad, np.inf) > tolerance and i < max_iter:
        grad = fgrad(x)
        phi = lambda t : x - t*grad
        f_of_phi = lambda t : f(phi(t))
        tmin = scipy.optimize.fminbound(f_of_phi, 0, 1)
        x = phi(tmin)  # use next point with minimum function value
        i += 1
    
    # maximum iteration warning
    if i == max_iter:
        import warnings
        warnings.warn(f'Maximum number of iterations reached. Current gradient infinity norm is {np.linalg.norm(grad, np.inf)}', UserWarning)
    
    return x
import numpy as np

# objective function f(x, y) and its version taking a point p = (x, y)
fxy = lambda x, y : (x - 2)**2 + (y + 1)**2 + 5*np.sin(x)*np.sin(y) + 100
f = lambda p : fxy(*p)

# partial derivatives and gradient of the objective function
fgradx = lambda x, y : 2*(x - 2) + 5*np.cos(x)*np.sin(y)
fgrady = lambda x, y : 2*(y + 1) + 5*np.sin(x)*np.cos(y)
fgrad = lambda p : np.array([fgradx(*p), fgrady(*p)])

x0 = [6, 4]  # initial guess
gradient_descent(f, fgrad, x0)
array([ 1.69484631, -1.40628341])
# compare with scipy
scipy.optimize.fmin(f, x0)
Optimization terminated successfully.
         Current function value: 95.363597
         Iterations: 42
         Function evaluations: 82
array([ 1.69487388, -1.40630558])
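
As a quick sanity check (not part of the original notebook), the gradient at the returned point should be near zero, on the order of the 1e-6 stopping tolerance:

# infinity norm of the gradient at the computed minimizer
p_min = gradient_descent(f, fgrad, x0)
np.linalg.norm(fgrad(p_min), np.inf)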

Gradient descent with fixed steps#

Goal: find the minimum of a multi-variable function \(f(\mathbf{x})\)

  • unconstrained

  • derivative method

Gradient descent with fixed steps moves in the steepest downhill direction using a constant learning rate (fixed step size) \(t\) instead of a line search. For each guess \(\mathbf{p_n}\), we calculate \(\mathbf{p_{n+1}}\) as follows:

  1. Calculate \(\nabla f(\mathbf{p_n})\)

  2. Define \(\mathbf{p_{n+1}} = \mathbf{p_{n}} - t\nabla f(\mathbf{p_n})\)

Calculating the gradient is the rate-limiting step. The learning rate needs to be tuned; a small sweep over learning rates is sketched at the end of the implementation below.

Implementation#

Problem Statement. Find the minimum of the function

\[ f(x, y) = (x-2)^2 + (y+1)^2 + 5\sin x \sin y + 100 \]

using gradient descent with fixed steps and compare with the result from scipy.optimize.fmin().

def gradient_descent_fixed_step(f, fgrad, x0, learning_rate, tolerance=1e-6, max_iter=10000):
    '''
    Find a minimum of a multivariable function by gradient descent with a fixed step (learning rate).
    
    :param f: objective function
    :param fgrad: gradient of objective function
    :param x0: initial guess (starting point)
    :param learning_rate: learning rate
    :param tolerance: tolerance of stopping criteria
    :param max_iter: maximum iterations allowed for calculation
    :returns: approximate minimizer of the objective function
    '''
    import numpy as np
    
    i = 0  # iteration counter
    x = x0
    grad = fgrad(x)
    
    # gradient descent with fixed step logic
    while np.linalg.norm(grad, np.inf) > tolerance and i < max_iter:
        grad = fgrad(x)
        x = x - learning_rate*grad  # use next point with fixed step
        i += 1
    
    # maximum iteration warning
    if i == max_iter:
        import warnings
        warnings.warn(f'Maximum number of iterations reached. Current gradient infinity norm is {np.linalg.norm(grad, np.inf)}', UserWarning)
    
    return x
x0 = [6, 4]
gradient_descent_fixed_step(f, fgrad, x0, 0.2)
array([ 1.69484629, -1.40628343])
# compare with scipy
scipy.optimize.fmin(f, x0)
Optimization terminated successfully.
         Current function value: 95.363597
         Iterations: 42
         Function evaluations: 82
array([ 1.69487388, -1.40630558])
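
As mentioned above, the learning rate needs to be tuned. The sweep below is a minimal sketch (not from the original notebook): small rates converge slowly, while rates that are too large can overshoot and hit the iteration limit.

# hypothetical sweep over a few learning rates for the same problem
for rate in [0.01, 0.05, 0.1, 0.2]:
    print(rate, gradient_descent_fixed_step(f, fgrad, x0, rate))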

Finding minimum using scipy.optimize.fmin()#

scipy.optimize.fmin() finds a minimum of a multivariable function using the Nelder-Mead simplex method, so no gradient is required.

scipy.optimize.fmin(f, x0)
Optimization terminated successfully.
         Current function value: 95.363597
         Iterations: 42
         Function evaluations: 82
array([ 1.69487388, -1.40630558])
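
The sketch below (an illustration of additional keyword arguments, not part of the original notebook) suppresses the convergence printout and also returns the minimum function value via full_output=True:

# disp=False suppresses the printed convergence message;
# full_output=True returns (xopt, fopt, iterations, function evaluations, warning flag)
xopt, fopt, n_iter, n_funcalls, warnflag = scipy.optimize.fmin(f, x0, full_output=True, disp=False)
xopt, fopt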