An MRP formulation for supervised learning: generalized temporal difference learning models
In traditional statistical learning, data points are usually assumed to be independent and identically distributed (i.i.d.), drawn from an unknown probability distribution. This paper presents a contrasting viewpoint, perceiving data points as interconnected and employing a Markov reward process (MR...
Main Authors: | , , , |
---|---|
Format: | Conference item |
Language: | English |
Published: | OpenReview 2024 |