John is a criterium (a type of bicycle race that takes place over multiple laps
Question
John is a criterium racer who wants to develop a strategy for maximizing his race performance. (A criterium is a type of bicycle race run over multiple laps, with intermediate finishes on some laps that award points to the top few finishers.) He can be in one of four states during a race: fresh, a bit tired, seriously tired, or totally dead. At each intermediate finish he can apply one of three levels of effort: all out, hard, or easy.
In the fresh state, if he goes all out he wins 10 points on average and, by the time of the next finish, is a bit tired with probability 0.6 and seriously tired with probability 0.4. If he goes hard, he wins an average of 6 points and becomes a bit tired with probability 0.8 and seriously tired with probability 0.2. If he goes easy, he wins an average of 2 points, stays fresh with probability 0.6, and becomes a bit tired with probability 0.4.
If he is a bit tired and goes all out, he wins an average of 8 points, stays in the same state with probability 0.3, becomes seriously tired with probability 0.5, and becomes totally dead with probability 0.2. If he goes hard, he wins an average of 5 points, stays in the same state with probability 0.5, and becomes seriously tired with probability 0.5. If he goes easy, he wins 1 point on average, stays in the same state with probability 0.4, and goes back to being fresh with probability 0.6.
If John is seriously tired and goes all out, he wins 5 points on average and either stays in the same state or becomes totally dead, each with probability 0.5. If he goes hard, he wins 3 points on average, stays in the same state with probability 0.8, and becomes totally dead with probability 0.2. If he goes easy, he wins 1 point on average, goes back to being a bit tired with probability 0.7, and goes back to being fresh with probability 0.3.
If he is totally dead, he can’t go all out. If he goes hard, he wins 1 point on average and stays in the same state. If he goes easy, he wins no points at all but goes back to the seriously tired state with probability 0.6 and to the bit tired state with probability 0.4.
Formulate his problem as an MDP and solve it using the policy improvement method.
Explanation / Answer
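One way to write the problem down as an MDP is sketched below. The notation is not from the original post: r(s,a) stands for the average points won with effort a in state s (for example r(fresh, all out) = 10), p(s'|s,a) for the transition probabilities given in the question (for example p(a bit tired | fresh, all out) = 0.6), and gamma for a discount factor, which is an assumption here because the question does not say whether rewards are discounted or averaged.

```latex
\begin{aligned}
S &= \{\text{fresh},\ \text{a bit tired},\ \text{seriously tired},\ \text{totally dead}\},\\
A(s) &= \{\text{all out},\ \text{hard},\ \text{easy}\}\ \text{for } s \neq \text{totally dead},
\qquad A(\text{totally dead}) = \{\text{hard},\ \text{easy}\},\\
v_\pi(s) &= r(s,\pi(s)) + \gamma \sum_{s' \in S} p(s' \mid s,\pi(s))\, v_\pi(s')
\quad \text{(policy evaluation)},\\
\pi'(s) &\in \arg\max_{a \in A(s)} \Big[\, r(s,a) + \gamma \sum_{s' \in S} p(s' \mid s,a)\, v_\pi(s') \Big]
\quad \text{(policy improvement)}.
\end{aligned}
```

Policy improvement alternates the last two lines, starting from any policy, until the improvement step no longer changes the chosen effort in any state.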
T (rows: current state and effort; columns: next state)

Current state     Effort    Fresh    Bit tired   Seriously tired   Totally dead
Fresh             All out   0        0.6         0.4               0
Fresh             Hard      0        0.8         0.2               0
Fresh             Easy      0.6      0.4         0                 0
Bit tired         All out   0        0.3         0.5               0.2
Bit tired         Hard      0        0.5         0.5               0
Bit tired         Easy      0.6      0.4         0                 0
Seriously tired   All out   0        0           0.5               0.5
Seriously tired   Hard      0        0           0.8               0.2
Seriously tired   Easy      0.3      0.7         0                 0
Totally dead      All out   0        0           0                 1
Totally dead      Hard      0        0           1                 0
Totally dead      Easy      0        0.4         0.6               0

S1

Current state     Effort    Fresh    Bit tired   Seriously tired   Totally dead
Fresh             All out   0.24     0.64        0.12              0
Fresh             Hard      0.12     0.72        0.16              0
Fresh             Easy      0        0.68        0.32              0
Bit tired         All out   0.3      0.35        0.15              0
Bit tired         Hard      0.3      0.45        0.25              0
Bit tired         Easy      0        0.38        0.5               0.12
Seriously tired   All out   0.15     0.35        0                 0
Seriously tired   Hard      0.24     0.56        0                 0
Seriously tired   Easy      0        0           0.71              0.29
Totally dead      All out   0        0           0                 0
Totally dead      Hard      0        0.4         0.6               0
Totally dead      Easy      0        0.24        0.76              0

S2

Current state     Effort    Fresh    Bit tired   Seriously tired   Totally dead
Fresh             All out   0.072    0.704       0.224             0
Fresh             Hard      0.096    0.712       0.192             0
Fresh             Easy      0.192    0.672       0.136             0
Bit tired         All out   0.09     0.325       0.325             0.06
Bit tired         Hard      0.15     0.415       0.375             0.06
Bit tired         Easy      0.3      0.39        0.19              0
Seriously tired   All out   0        0           0.355             0.145
Seriously tired   Hard      0        0           0.568             0.232
Seriously tired   Easy      0.213    0.497       0                 0
Totally dead      All out   0        0           0                 0
Totally dead      Hard      0        0.24        0.76              0
Totally dead      Easy      0        0.304       0.696             0

S3

Current state     Effort    Fresh    Bit tired   Seriously tired   Totally dead
Fresh             All out   0.1344   0.696       0.1696            0
Fresh             Hard      0.1152   0.704       0.1808            0
Fresh             Easy      0.0816   0.7072      0.2112            0
Bit tired         All out   0.195    0.3195      0.2075            0.018
Bit tired         Hard      0.225    0.4025      0.2825            0.03
Bit tired         Easy      0.114    0.361       0.345             0.06
Seriously tired   All out   0.1065   0.2485      0                 0
Seriously tired   Hard      0.1704   0.3976      0                 0
Seriously tired   Easy      0        0           0.5041            0.2059
Totally dead      All out   0        0           0                 0
Totally dead      Hard      0        0.304       0.696             0
Totally dead      Easy      0        0.2784      0.7216            0
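A minimal Python sketch of the policy improvement (policy iteration) method for this problem follows. The rewards and transition probabilities are read directly from the question; in particular, going hard while totally dead keeps John totally dead, and going all out is not offered in that state. Everything else is an assumption made for illustration: the discount factor GAMMA = 0.9 (the question gives none), the starting policy of always going easy, and the names MDP, q_value, evaluate, improve and policy_iteration. Policy evaluation is done here by iterating the Bellman equation to convergence rather than by solving the linear system exactly.

```python
STATES = ["fresh", "bit tired", "seriously tired", "totally dead"]

# MDP[state][effort] = (average points, {next_state: probability}).
# Numbers are taken from the problem statement. "all out" is absent for
# "totally dead" because that effort is not available there.
MDP = {
    "fresh": {
        "all out": (10, {"bit tired": 0.6, "seriously tired": 0.4}),
        "hard": (6, {"bit tired": 0.8, "seriously tired": 0.2}),
        "easy": (2, {"fresh": 0.6, "bit tired": 0.4}),
    },
    "bit tired": {
        "all out": (8, {"bit tired": 0.3, "seriously tired": 0.5,
                        "totally dead": 0.2}),
        "hard": (5, {"bit tired": 0.5, "seriously tired": 0.5}),
        "easy": (1, {"bit tired": 0.4, "fresh": 0.6}),
    },
    "seriously tired": {
        "all out": (5, {"seriously tired": 0.5, "totally dead": 0.5}),
        "hard": (3, {"seriously tired": 0.8, "totally dead": 0.2}),
        "easy": (1, {"bit tired": 0.7, "fresh": 0.3}),
    },
    "totally dead": {
        "hard": (1, {"totally dead": 1.0}),
        "easy": (0, {"seriously tired": 0.6, "bit tired": 0.4}),
    },
}

GAMMA = 0.9  # ASSUMPTION: discount factor, not given in the problem.


def q_value(state, effort, values, gamma=GAMMA):
    """Expected points now plus discounted value of the next state."""
    reward, transitions = MDP[state][effort]
    return reward + gamma * sum(p * values[s2] for s2, p in transitions.items())


def evaluate(policy, gamma=GAMMA, tol=1e-12):
    """Policy evaluation by repeated substitution into the Bellman equation."""
    values = {s: 0.0 for s in STATES}
    while True:
        new_values = {s: q_value(s, policy[s], values, gamma) for s in STATES}
        if max(abs(new_values[s] - values[s]) for s in STATES) < tol:
            return new_values
        values = new_values


def improve(values, policy, gamma=GAMMA):
    """Switch to a different effort only if it is strictly better."""
    new_policy = {}
    for s in STATES:
        best, best_q = policy[s], q_value(s, policy[s], values, gamma)
        for effort in MDP[s]:
            q = q_value(s, effort, values, gamma)
            if q > best_q + 1e-9:
                best, best_q = effort, q
        new_policy[s] = best
    return new_policy


def policy_iteration(gamma=GAMMA):
    """Alternate evaluation and improvement until the policy is stable."""
    policy = {s: "easy" for s in STATES}  # ASSUMPTION: arbitrary start.
    while True:
        values = evaluate(policy, gamma)
        new_policy = improve(values, policy, gamma)
        if new_policy == policy:
            return policy, values
        policy = new_policy


if __name__ == "__main__":
    policy, values = policy_iteration()
    for s in STATES:
        print(f"{s:16s} -> {policy[s]:8s} (value {values[s]:.2f})")
```

Running the script prints the effort selected in each state together with its discounted value. The selected policy can change with the assumed GAMMA, so the value used should match whatever criterion the course or textbook intends.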