000 00622nam a22001577a 4500
008 231016b |||||||| |||| 00| 0 eng d
020 _a978-0-262-03924-6
082 _223
_a006.31
_bSUT
100 _aSutton, Richard
245 _aReinforcement Learning :
_bAn Introduction /
_cRichard S. Sutton & Andrew G. Barto
_hEnglish
250 _a2nd ed.
260 _aLondon :
_bThe MIT Press,
_c2020.
300 _aviii, 526 p. :
_bhard bound
_c18 x 24 cm.
505 _a1. Introduction 2. Multi-Armed Bandits 3. Finite Markov Decision Processes 4. Dynamic Programming 5. Monte Carlo Methods ...
942 _2ddc
_cREF
999 _c8393
_d8393