Maybe you have seen something like this in derivations of the log-likelihood for a multivariate Gaussian: \( \ln p(X|\mu, \Sigma) = -\frac{1}{2}\ln|\Sigma| - \frac{1}{2}X^{T}\Sigma^{-1}X + const = -\frac{1}{2}\ln|\Sigma| - \frac{1}{2}Tr(\Sigma^{-1}XX^{T}) + const \) (with \(X\) already centered, i.e. the mean \(\mu\) subtracted), and wondered where that \(Tr\) came from. Here you can find a great explanation, but I thought I would write it down […]
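Before any derivation, it can be reassuring to check numerically that the two forms of the quadratic term agree. The sketch below is only an illustration, not part of the original explanation: the helper names `quadratic_form` and `trace_form` are made up for this example, and a random matrix `a` stands in for \(\Sigma^{-1}\) (the identity \(X^{T}AX = Tr(AXX^{T})\) does not need \(A\) to be symmetric or positive definite).

```python
import random


def quadratic_form(x, a):
    """Return x^T A x for a vector x (list) and a square matrix a (list of rows)."""
    n = len(x)
    return sum(x[i] * a[i][j] * x[j] for i in range(n) for j in range(n))


def trace_form(x, a):
    """Return Tr(A x x^T), building the outer product x x^T explicitly."""
    n = len(x)
    outer = [[x[i] * x[j] for j in range(n)] for i in range(n)]
    # Diagonal entry i of the product A (x x^T) is sum_k a[i][k] * outer[k][i].
    return sum(a[i][k] * outer[k][i] for i in range(n) for k in range(n))


# Hypothetical test data: a random vector and a random matrix (assumption:
# the matrix plays the role of Sigma^{-1} in the formula above).
rng = random.Random(0)
n = 4
x = [rng.gauss(0, 1) for _ in range(n)]
a = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(n)]

q = quadratic_form(x, a)
t = trace_form(x, a)
print(q, t)  # the two values should agree up to floating-point rounding
```

Both functions end up summing the same products \(a_{ik} x_k x_i\), just grouped differently, which is the whole content of the trick.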