CS 7642 Reinforcement Learning: Calculate Temporal Difference (TD)

Please refer to the attached PDF for the complete question and calculate TD(λ).
Find a value of λ, strictly less than 1, such that the TD(λ) estimate equals the
TD(1) estimate. Round your answer for λ to three decimal places.
This HW is designed to help solidify your understanding of the Temporal Difference
algorithms and k-step estimators. You will be given the probability of transitioning to State 1 and a vector
of rewards {r0, r1, r2, r3, r4, r5, r6}.
You will be given 10 test cases and must return the best lambda value for each.
Your answer must be correct to 3 decimal places. You may use any programming
language and libraries you wish.
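
One possible way to approach the root-finding part is sketched below in Python. It assumes the k-step estimates E_1, ..., E_K for the start state have already been computed from the MDP in the PDF (the MDP is not reproduced here, so that step is left out), and that the episode terminates so the last estimate E_K is the TD(1) estimate. Under that assumption the TD(λ) estimate is the weighted sum (1-λ)·Σ λ^(k-1)·E_k over k = 1..K-1, plus λ^(K-1)·E_K. Setting it equal to E_K and factoring out (1-λ) leaves the polynomial Σ (E_k - E_K)·λ^(k-1) = 0. The function names and the example numbers are hypothetical, chosen only for illustration.

```python
import numpy as np


def td_lambda_estimate(k_step_estimates, lam):
    """TD(lambda) estimate as a weighted average of k-step estimates.

    Assumption: the last entry is the full-return (TD(1)) estimate and
    receives all of the remaining weight lam**(K-1).
    """
    E = np.asarray(k_step_estimates, dtype=float)
    K = len(E)
    weights = (1.0 - lam) * lam ** np.arange(K - 1)
    return float(weights @ E[:-1] + lam ** (K - 1) * E[-1])


def lambdas_matching_td1(k_step_estimates, tol=1e-9):
    """Return every real lambda in [0, 1) where TD(lambda) equals TD(1)."""
    E = np.asarray(k_step_estimates, dtype=float)
    # Coefficient of lam**(k-1) is E_k - E_K, listed lowest degree first.
    diffs = E[:-1] - E[-1]
    # np.roots expects the highest-degree coefficient first.
    roots = np.roots(diffs[::-1])
    real = roots[np.abs(roots.imag) < tol].real
    return sorted(float(r) for r in real if 0.0 <= r < 1.0 - tol)


# Made-up k-step estimates, not taken from the assignment.
example = [1.0, 4.0, 2.0]
print(lambdas_matching_td1(example))  # → [0.5]
```

If more than one root falls in [0, 1), the assignment text above does not say which one counts as "best", so the sketch returns all of them; round the chosen value to three decimal places with `round(lam, 3)`.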
