
Dynamic Programming Notes

This is a sample of our (approximately) 4-page-long Dynamic Programming notes, which we sell as part of the Operational Research Techniques Notes collection, a 1st Class package written at LSE in 2011 that contains (approximately) 104 pages of notes across 17 different documents.


Dynamic Programming Revision

The following is a plain text extract of the PDF sample above, taken from our Operational Research Techniques Notes. This text version has had its formatting removed, so pay attention to its contents alone rather than its presentation. The version you download will have its original formatting intact and so will be much prettier to look at.

Lecture 13: Introduction to Dynamic Programming

Topics

• Introduction

• General dynamic programming problems

• Mathematical formulation

• Comparing MCE and DP

• Example of a DP problem

Reading

Introduction

• Dynamic programming = The solving of optimisation problems that can be formulated as a sequence of decisions

• Advantages of dynamic programming over linear programming:
○ Tends to be much faster to solve
○ Can easily write a program in a high level programming language to carry out a DP

• Uses of DP:
○ Investment planning
○ Stock control
○ Production scheduling
○ Animal behaviour

Stagecoach problem (shortest path problem)

○ Stagecoach problem = when we wish to travel from one point to another by the shortest possible path (route), passing through particular levels
○ DP dramatically reduces the number of calculations required to solve this kind of problem
▪ The alternative method is the Method of Complete Enumeration (MCE)
○ DP essentially works backwards from the end, optimising as we go
○ The number of possible routes is found by multiplying together the number of states in each stage (i.e. 3 × 3 × 3 × 3 in the example below)
▪ It increases geometrically with the number of stages

Backward (pass) DP algorithm
i. Divide the problem into stages, from finish to start
ii. Start at the penultimate stage of the problem
iii. Repeat {
1) Find the optimal policy from each state of the current stage to the ultimate destination, making use of our knowledge of the shortest routes for later stages, which we have already worked out
2) Record (store in memory) the optimal policy and the shortest distance at each node
3) Go back one stage
iv. } Until you get to the start (the problem is then solved)

Example of using the backward (pass) DP algorithm
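The lecture's own worked example is not reproduced in this plain-text extract. As a rough stand-in, the Python sketch below runs the backward pass on an invented four-stage network (the state names and random link lengths are assumptions for illustration only, not the lecture's data) and then checks the answer against complete enumeration of all 3 × 3 × 3 × 3 = 81 routes.

import itertools
import random

# Invented stagecoach-style network: a start point, four intermediate stages of
# three states each, and an end point (names and distances are illustrative only).
states = [
    ["start"],
    ["A1", "A2", "A3"],
    ["B1", "B2", "B3"],
    ["C1", "C2", "C3"],
    ["D1", "D2", "D3"],
    ["end"],
]

random.seed(0)
# dist[n][(s, t)] = length of the link from state s at stage n to state t at stage n + 1
dist = [
    {(s, t): random.randint(1, 9) for s in states[n] for t in states[n + 1]}
    for n in range(len(states) - 1)
]

# Backward pass: start at the penultimate stage and work back towards the start,
# recording at each state the optimal decision (policy) and the shortest distance.
value = {"end": 0}   # value of a state = shortest distance from it to the end
policy = {}          # policy[s] = best next state to move to from s

for n in reversed(range(len(states) - 1)):
    for s in states[n]:
        # pair-wise addition: new link length + best distance already stored
        best_t = min(states[n + 1], key=lambda t: dist[n][(s, t)] + value[t])
        policy[s] = best_t
        value[s] = dist[n][(s, best_t)] + value[best_t]

# Recover the optimal route by following the stored policy forwards from the start.
route, s = ["start"], "start"
while s != "end":
    s = policy[s]
    route.append(s)
print("DP shortest route:", " -> ".join(route), "with length", value["start"])

# Method of Complete Enumeration for comparison: try every one of the
# 3 x 3 x 3 x 3 = 81 possible routes and keep the shortest.
def route_length(r):
    return sum(dist[n][(r[n], r[n + 1])] for n in range(len(r) - 1))

all_routes = [("start",) + mids + ("end",) for mids in itertools.product(*states[1:-1])]
best = min(all_routes, key=route_length)
print("MCE checked", len(all_routes), "routes; shortest length", route_length(best))

On this small network the backward pass evaluates each of the 33 links exactly once, whereas the number of complete routes that MCE must enumerate grows geometrically as stages are added.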

General dynamic programming problems

• Characteristics of a general DP problem:
○ An objective function must be maximised or minimised


Key Points

Definitions

• Decision rule = A function that takes as input the state we are in and returns a decision (an action)

• Decision variable = which state we decide to go to next

• Dynamic programming = The solving of optimisation problems that can be formulated as a sequence of decisions

• Pair-wise addition = An addition that adds the length of a new link to the length accumulated so far

• State space = the set of all states

• Stagecoach problem = when we wish to travel from one point to another by the shortest possible path (route) passing through particular levels

• Transition function = a function that specifies the state that we reach, at the next stage, as a consequence of our decision

• Value of state s at stage n = total reward (cost) from that state onwards (including that state) to the end state if the optimal policy is followed

Formulae
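The formulae themselves fall outside this extract. For orientation only, the definitions above fit the standard backward recursion, sketched here in generic notation (c_n for the cost of taking decision d in state s at stage n, t_n for the transition function — notation assumed for this sketch, not necessarily that of the full notes):

v_n(s) = \min_{d} \{ c_n(s, d) + v_{n+1}(t_n(s, d)) \}, \qquad v_N(\text{end state}) = 0

with \min replaced by \max when the objective is a reward to be maximised; v_1(\text{start state}) then gives the value of the whole problem, and the optimal policy is recovered by recording the minimising decision d at each state.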

****************************End Of Sample*****************************
