Dynamic Programming Notes

Lecture 13: Introduction to Dynamic Programming

Topics

* Introduction

* General dynamic programming problems

* Mathematical formulation

* Comparing MCE and DP

* Example of a DP problem

Reading

Introduction

* Dynamic programming = a method for solving optimisation problems that can be formulated as a sequence of decisions

* Advantages of dynamic programming over linear programming:
* Tends to be much faster to solve
* A DP can easily be implemented as a program in a high-level programming language

* Uses of DP:
* Investment planning
* Stock control
* Production scheduling
* Animal behaviour

Stagecoach problem (shortest path problem)
* Stagecoach problem = when we wish to travel from one point to another by the shortest possible path (route) passing through particular levels
* DP dramatically reduces the number of calculations required to solve this kind of problem
* The alternative method is the Method of Complete Enumeration (MCE)
* DP essentially works backwards from the end, optimising as we go
* The number of possible routes is found by multiplying together the number of states at each stage (i.e. 3 x 3 x 3 x 3 in the example below)
* It increases geometrically with the number of stages

Backward (Pass) DP algorithm

i. Divide the problem into stages from finish to start
ii. Start at the penultimate stage of the problem
iii. Repeat {
1) Find the optimal policy from each state of the current stage to the ultimate destination, making use of our knowledge of the shortest routes for later stages, which we have already worked out
2) Record (store in memory) the optimal policy and the shortest distance at each node
3) Go back one stage
iv. } Until you get to the start (the problem is then solved)

Example of using the backward (pass) DP algorithm
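The worked example from the full notes is not reproduced in this extract. As a stand-in, the following is a minimal Python sketch of the backward pass on a hypothetical network with one start node, four intermediate stages of three states each, and one end node; the layout and all link lengths are invented for illustration, not taken from the notes.

# A minimal, illustrative sketch of the backward (pass) DP algorithm on a
# hypothetical stagecoach network (invented data, not the example from the notes).
import random

random.seed(0)

# stages[n] lists the states available at stage n.
stages = [
    ["start"],
    ["A1", "A2", "A3"],
    ["B1", "B2", "B3"],
    ["C1", "C2", "C3"],
    ["D1", "D2", "D3"],
    ["end"],
]

# dist[(u, v)] = length of the link from state u (stage n) to state v (stage n + 1).
dist = {
    (u, v): random.randint(1, 9)
    for n in range(len(stages) - 1)
    for u in stages[n]
    for v in stages[n + 1]
}

# Backward pass: value[s] = shortest distance from state s to the end,
# policy[s] = the state to move to next under the optimal policy.
value = {"end": 0}
policy = {}
for n in range(len(stages) - 2, -1, -1):  # penultimate stage back to the start
    for u in stages[n]:
        # Pair-wise addition: each new link length is added to the best
        # distance already recorded for the later stage.
        best = min(stages[n + 1], key=lambda v: dist[(u, v)] + value[v])
        policy[u] = best
        value[u] = dist[(u, best)] + value[best]

# Follow the stored policy forwards to read off the optimal route.
route, s = ["start"], "start"
while s != "end":
    s = policy[s]
    route.append(s)

print("Shortest distance:", value["start"])
print("Optimal route:", " -> ".join(route))
# Complete enumeration would compare 3 x 3 x 3 x 3 = 81 routes;
# the backward pass evaluates each link exactly once.

Because the value of each state is computed once and then reused by every route passing through it, the backward pass does far less work than enumerating all 81 routes, and the saving grows geometrically as stages are added.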

General dynamic programming problems

* Characteristics of a general DP problem:
* An objective function must be maximised or minimised

Key Points

Definitions

* Decision rule = A function that takes as input the state we are in and returns a decision (an action)

* Decision variable = which state we decide to go to next

* Dynamic programming = a method for solving optimisation problems that can be formulated as a sequence of decisions

* Pair-wise addition = An addition that adds a new link length to the length so far

* State space = the set of all states

* Stagecoach problem = when we wish to travel from one point to another by the shortest possible path (route) passing through particular levels

* Transition function = a function that specifies the state that we reach, at the next stage, as a consequence of our decision

* Value of state s at stage n = total reward (cost) from that state onwards (including that state) to the end state, if the optimal policy is followed

Formulae

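The formulae themselves are not included in this extract. For orientation only, the standard backward recursion consistent with the definitions above can be written as follows; the notation here is assumed, not necessarily that used in the full notes, and min becomes max for a reward-maximisation problem.

% Standard backward recursion (assumed notation, not taken from the full notes):
% f_n(s)    = value of state s at stage n
% d         = decision taken in state s (the decision variable)
% c_n(s, d) = immediate cost (link length) of taking decision d from state s
% t_n(s, d) = transition function: the state reached at stage n + 1
\[
  f_n(s) = \min_{d} \bigl\{ c_n(s, d) + f_{n+1}\bigl( t_n(s, d) \bigr) \bigr\},
  \qquad f_N(s_{\text{end}}) = 0,
\]
\[
  \text{with optimal decision rule } \delta_n(s) = \arg\min_{d} \bigl\{ c_n(s, d) + f_{n+1}\bigl( t_n(s, d) \bigr) \bigr\}.
\]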
