# 2.1 Markov decision -- type 3 gains

## The calculations

```
orderdata
reorder
Enter row vector of states  states
Enter row vector A of actions (padded)  A
Enter row vector C of order costs (padded)  C
Enter row vector D of demand values  D
Enter row vector PD of demand probabilities  PD
Enter unit selling price SP  SP
Enter backorder penalty cost BP  BP

PA =
    1.0000         0         0         0         0
    0.6000    0.2000    0.2000         0         0
    0.2000    0.2000    0.2000    0.2000    0.2000
    0.8000    0.2000         0         0         0
    0.4000    0.2000    0.2000    0.2000         0
    0.4000    0.2000    0.2000    0.2000         0
    0.6000    0.2000    0.2000         0         0
    0.2000    0.2000    0.2000    0.2000    0.2000
    0.2000    0.2000    0.2000    0.2000    0.2000
    0.4000    0.2000    0.2000    0.2000         0
    0.4000    0.2000    0.2000    0.2000         0
    0.4000    0.2000    0.2000    0.2000         0
    0.2000    0.2000    0.2000    0.2000    0.2000
    0.2000    0.2000    0.2000    0.2000    0.2000
    0.2000    0.2000    0.2000    0.2000    0.2000

GA =
       0   -40   -80  -120  -160
    -300  -100   100    60    20
    -480  -280   -80   120   320
       0   200   160   120    80
    -300  -100   100   300   260
    -300  -100   100   300   260
       0   200   400   360   320
    -300  -100   100   300   500
    -300  -100   100   300   500
       0   200   400   600   560
       0   200   400   600   560
       0   200   400   600   560
       0   200   400   600   800
       0   200   400   600   800
       0   200   400   600   800
```
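With type 3 gains, each row of `GA` lists the gain for every possible demand value, so the expected one-step gain for a state-action pair is the row averaged against the demand probabilities. The following Python sketch illustrates that calculation. It is not the author's MATLAB procedure. The transcript does not display `PD`, so the sketch assumes five equally likely demand values (probability 0.2 each), which is consistent with the rows of `PA`. The function name `expected_gains` is hypothetical.

```python
# Gain matrix GA copied from the session above: one row per (state, action)
# pair, one column per demand value.
GA = [
    [0, -40, -80, -120, -160],
    [-300, -100, 100, 60, 20],
    [-480, -280, -80, 120, 320],
    [0, 200, 160, 120, 80],
    [-300, -100, 100, 300, 260],
    [-300, -100, 100, 300, 260],
    [0, 200, 400, 360, 320],
    [-300, -100, 100, 300, 500],
    [-300, -100, 100, 300, 500],
    [0, 200, 400, 600, 560],
    [0, 200, 400, 600, 560],
    [0, 200, 400, 600, 560],
    [0, 200, 400, 600, 800],
    [0, 200, 400, 600, 800],
    [0, 200, 400, 600, 800],
]

# ASSUMPTION: PD is not shown in the transcript; five equally likely
# demand values are assumed here.
PD = [0.2, 0.2, 0.2, 0.2, 0.2]


def expected_gains(ga, pd):
    """Average each row of the gain matrix against the demand probabilities."""
    return [sum(p * g for p, g in zip(pd, row)) for row in ga]


q = expected_gains(GA, PD)
for index, value in enumerate(q, start=1):
    print(index, round(value, 4))
```

Under the stated assumption, the values printed by this sketch can be compared line by line with the `Value` column that `polit` lists for indices 1 through 15.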

## Infinite-horizon strategy (no discounting)

```
polit
Data needed:
  - - - - - - - - - - - - - - -
  Enter type number to show gain type  type
  Enter row vector of states  states
  Enter row vector A of possible actions  A
  Enter value of alpha (= 1 for no discounting)  1
  Enter matrix PA of transition probabilities  PA
  Enter matrix GA of gains  GA
  Enter row vector PD of demand probabilities  PD

    Index   Action    Value
      1       0        -80
      2       2        -44
      3       4        -80
      4       0        112
      5       2         52
      6       2         52
      7       0        256
      8       2        100
      9       2        100
     10       0        352
     11       0        352
     12       0        352
     13       0        400
     14       0        400
     15       0        400

Initial policy: action numbers
     2     1     1     1     1
Policy: actions
     2     0     0     0     0
New policy: action numbers
     3     2     2     1     1
Policy: actions
     4     2     2     0     0

Long-run distribution
    0.2800    0.2000    0.2000    0.2000    0.1200

Test values for selecting new policy
    Index     Action    Test Value
    1.0000         0    -248.0000
    2.0000    2.0000    -168.8000
    3.0000    4.0000     -41.6000
    4.0000         0     -48.8000
    5.0000    2.0000      -5.6000
    6.0000    2.0000      -5.6000
    7.0000         0     131.2000
    8.0000    2.0000     138.4000
    9.0000    2.0000     138.4000
   10.0000         0     294.4000
   11.0000         0     294.4000
   12.0000         0     294.4000
   13.0000         0     438.4000
   14.0000         0     438.4000
   15.0000         0     438.4000

Optimum policy
    State     Action    Value
         0    4.0000    -168.0000
    1.0000    2.0000    -132.0000
    2.0000    2.0000      12.0000
    3.0000         0     168.0000
    4.0000         0     312.0000

Long-run expected gain per period G
  126.4000
```
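The long-run distribution and the gain per period reported above can be checked independently. The Python sketch below takes the rows of `PA` and the expected gains selected by the policy with actions 4, 2, 2, 0, 0 (indices 3, 5, 8, 10, 13), finds the stationary distribution by repeatedly multiplying a starting distribution by the transition matrix, and averages the gains against it. Power iteration is used here for simplicity; the source does not say how `polit` computes the distribution. The function name `stationary_distribution` is hypothetical.

```python
# Transition rows of PA selected by the policy, one row per state 0..4.
P = [
    [0.2, 0.2, 0.2, 0.2, 0.2],  # state 0, action 4 (index 3)
    [0.4, 0.2, 0.2, 0.2, 0.0],  # state 1, action 2 (index 5)
    [0.2, 0.2, 0.2, 0.2, 0.2],  # state 2, action 2 (index 8)
    [0.4, 0.2, 0.2, 0.2, 0.0],  # state 3, action 0 (index 10)
    [0.2, 0.2, 0.2, 0.2, 0.2],  # state 4, action 0 (index 13)
]

# Expected one-step gains for the same indices, from the Value column.
q = [-80.0, 52.0, 100.0, 352.0, 400.0]


def stationary_distribution(p, iterations=200):
    """Approximate the long-run distribution by power iteration (pi <- pi P)."""
    n = len(p)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        pi = [sum(pi[i] * p[i][j] for i in range(n)) for j in range(n)]
    return pi


pi = stationary_distribution(P)
g = sum(weight * gain for weight, gain in zip(pi, q))

print([round(x, 4) for x in pi])
print(round(g, 4))
```

Power iteration is adequate here because the chain under this policy is irreducible and aperiodic; for other policies a direct linear solve may be the safer choice.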

## Infinite-horizon strategy (with discounting)

```
polit
Data needed:
  - - - - - - - - - - - - - - -
  Enter case number to show gain type  type
  Enter row vector of states  states
  Enter row vector A of possible actions  A
  Enter value of alpha (= 1 for no discounting)  1/1.02
  Enter matrix PA of transition probabilities  PA
  Enter matrix GA of gains  GA
  Enter row vector PD of demand probabilities  PD

    Index   Action    Value
      1       0        -80
      2       2        -44
      3       4        -80
      4       0        112
      5       2         52
      6       2         52
      7       0        256
      8       2        100
      9       2        100
     10       0        352
     11       0        352
     12       0        352
     13       0        400
     14       0        400
     15       0        400

Initial policy: action numbers
     2     1     1     1     1
Policy: actions
     2     0     0     0     0
New policy: action numbers
     3     2     2     1     1
Policy: actions
     4     2     2     0     0

Test values for selecting policy
    Index     Action    Test Value
  1.0e+03 *
    0.0010         0    6.0746
    0.0020    0.0020    6.1533
    0.0030    0.0040    6.2776
    0.0040         0    6.2740
    0.0050    0.0020    6.3155
    0.0060    0.0020    6.3155
    0.0070         0    6.4533
    0.0080    0.0020    6.4576
    0.0090    0.0020    6.4576
    0.0100         0    6.6155
    0.0110         0    6.6155
    0.0120         0    6.6155
    0.0130         0    6.7576
    0.0140         0    6.7576
    0.0150         0    6.7576

Optimum policy
    State     Action    Value
  1.0e+03 *
         0    0.0040    6.2776
    0.0010    0.0020    6.3155
    0.0020    0.0020    6.4576
    0.0030         0    6.6155
    0.0040         0    6.7576
```
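For a fixed policy with discount factor alpha, the discounted values satisfy v = q + alpha P v, where P and q hold the transition rows and expected gains that the policy selects. The Python sketch below evaluates the policy with actions 4, 2, 2, 0, 0 by iterating that equation to a fixed point with alpha = 1/1.02. It illustrates the relationship only; the source does not show how `polit` solves for the values, and the function name `discounted_values` is hypothetical.

```python
ALPHA = 1 / 1.02  # discount factor entered in the session above

# Transition rows and expected gains selected by the policy (states 0..4).
P = [
    [0.2, 0.2, 0.2, 0.2, 0.2],  # state 0, action 4
    [0.4, 0.2, 0.2, 0.2, 0.0],  # state 1, action 2
    [0.2, 0.2, 0.2, 0.2, 0.2],  # state 2, action 2
    [0.4, 0.2, 0.2, 0.2, 0.0],  # state 3, action 0
    [0.2, 0.2, 0.2, 0.2, 0.2],  # state 4, action 0
]
q = [-80.0, 52.0, 100.0, 352.0, 400.0]


def discounted_values(p, gains, alpha, tol=1e-9, max_iter=100_000):
    """Iterate v <- gains + alpha * P v until successive iterates agree."""
    n = len(gains)
    v = [0.0] * n
    for _ in range(max_iter):
        new_v = [
            gains[i] + alpha * sum(p[i][j] * v[j] for j in range(n))
            for i in range(n)
        ]
        if max(abs(a - b) for a, b in zip(new_v, v)) < tol:
            return new_v
        v = new_v
    return v


v = discounted_values(P, q, ALPHA)
print([round(x, 1) for x in v])
```

Because alpha is close to 1, the iteration contracts slowly (by a factor of about 0.98 per pass), so well over a thousand passes are needed; for larger problems, solving the linear system (I - alpha P) v = q directly would be the more economical route.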

## Finite-horizon calculations

```
dpinit
Initialize for finite horizon calculations
Matrices A, PA, and GA, padded if necessary
Enter type number to show gain type  type
Enter vector of states  states
Enter row vector A of possible actions  A
Enter matrix PA of transition probabilities  PA
Enter matrix GA of gains  GA
Enter row vector PD of demand probabilities  PD
Call for dprog

dprog
States and expected total gains
     0     1     2     3     4
   -44   112   256   352   400
States   Actions
     0     2
     1     0
     2     0
     3     0
     4     0

dprog
States and expected total gains
         0    1.0000    2.0000    3.0000    4.0000
  135.2000  178.4000  315.2000  478.4000  615.2000
States   Actions
     0     4
     1     2
     2     2
     3     0
     4     0

dprog
States and expected total gains
         0    1.0000    2.0000    3.0000    4.0000
  264.4800  300.4800  444.4800  600.4800  744.4800
States   Actions
     0     4
     1     2
     2     2
     3     0
     4     0

dprog
States and expected total gains
         0    1.0000    2.0000    3.0000    4.0000
  390.8800  426.8800  570.8800  726.8800  870.8800
States   Actions
     0     4
     1     2
     2     2
     3     0
     4     0

dprog
States and expected total gains
         0    1.0000    2.0000    3.0000    4.0000
  517.2800  553.2800  697.2800  853.2800  997.2800
States   Actions
     0     4
     1     2
     2     2
     3     0
     4     0

dprog
States and expected total gains
  1.0e+03 *
         0    0.0010    0.0020    0.0030    0.0040
    0.6437    0.6797    0.8237    0.9797    1.1237
States   Actions
     0     4
     1     2
     2     2
     3     0
     4     0
```
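Each call to `dprog` above adds one more period to the horizon. The Python sketch below reproduces that pattern with ordinary backward induction: for every state it forms a test value (expected one-step gain plus expected total gain from the next state) for each padded action and keeps the largest. It assumes zero terminal value and no discounting, which is inferred from the printed totals rather than stated in the source. The function name `backward_step` is hypothetical and is not the author's `dprog`.

```python
U = [0.2, 0.2, 0.2, 0.2, 0.2]
W = [0.4, 0.2, 0.2, 0.2, 0.0]

# PA from the session: three (padded) action rows for each of states 0..4.
PA = [
    [1.0, 0.0, 0.0, 0.0, 0.0], [0.6, 0.2, 0.2, 0.0, 0.0], U,
    [0.8, 0.2, 0.0, 0.0, 0.0], W, W,
    [0.6, 0.2, 0.2, 0.0, 0.0], U, U,
    W, W, W,
    U, U, U,
]
# Actions and expected one-step gains, from the Index/Action/Value table.
ACTIONS = [0, 2, 4, 0, 2, 2, 0, 2, 2, 0, 0, 0, 0, 0, 0]
Q = [-80, -44, -80, 112, 52, 52, 256, 100, 100, 352, 352, 352, 400, 400, 400]


def backward_step(v, n_states=5, n_actions=3):
    """Extend the horizon by one period; return new values and best actions."""
    new_v, policy = [], []
    for s in range(n_states):
        rows = range(s * n_actions, (s + 1) * n_actions)
        tests = [Q[r] + sum(PA[r][j] * v[j] for j in range(n_states)) for r in rows]
        best = max(range(n_actions), key=lambda k: tests[k])  # first maximum wins ties
        new_v.append(tests[best])
        policy.append(ACTIONS[s * n_actions + best])
    return new_v, policy


v = [0.0] * 5  # ASSUMPTION: zero value at the end of the horizon
history = []
for stage in range(6):
    v, policy = backward_step(v)
    history.append((v, policy))
    print(stage + 1, [round(x, 2) for x in v], policy)
```

Ties among padded rows are broken in favor of the first row, which is harmless here because padded rows repeat an action already listed for that state.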

