“…From this, we can see that one can learn the optimal
by optimizing
, that is, maximum likelihood. It is standard to choose a Gaussian likelihood (e.g., Cleary et al., 2021; Dunbar et al., 2021; Howland et al., 2022):
where superscript
denotes the transpose. This equation is the probability that the data
originates from
, allowing for the Gaussian noise with variance
as described by Equation .…”