I implemented a C++ pipeline for learning Fisher feature vectors using VLFeat, since Matlab should be avoided whenever possible. I also found the tutorials and API documentation on the VLFeat web page a bit too rudimentary when it comes to implementing things in C++. I later found Python bindings; they don't seem to have been updated in a while, but they might be a bit easier to work with.
The code does the following. From a set of labeled images, it extracts dense SIFT features. It then computes a PCA representation of the SIFT descriptors, reducing the dimension from 128 to 80 (or whatever dimension you want). Next, it fits a GMM to the reduced descriptors using the EM algorithm, and from the learned GMM it computes the Fisher vector representation of each image. Additional features can also be read from a text file and added, such as the activations of an intermediate layer of a CNN (for example, one run with Caffe). After computing the feature vectors, the code trains linear SVMs, one for each class label. The code saves all learned representations so that inference can be run on new images.
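To make the Fisher vector step more concrete, here is a minimal sketch of the encoding for a diagonal-covariance GMM, written in plain C++ without VLFeat. It is not the code from this project, and the names `DiagonalGmm`, `posteriors`, `fisherEncode` and `improveNormalize` are made up for illustration. It assumes the usual formulation in which the gradients with respect to the means and the variances are stacked, so K components and D-dimensional descriptors give a vector of length 2*K*D, followed by the commonly used signed square root and L2 normalization.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;

// Hypothetical container for a diagonal-covariance GMM (not a VLFeat type).
struct DiagonalGmm {
    Vec priors;                  // K mixture weights, summing to 1
    std::vector<Vec> means;      // K rows of D values
    std::vector<Vec> variances;  // K rows of D values (diagonal covariances)
};

// Soft assignment of one descriptor to each component, computed in the
// log domain so that small likelihoods do not underflow.
Vec posteriors(const DiagonalGmm& gmm, const Vec& x) {
    const double twoPi = 6.283185307179586;
    const std::size_t K = gmm.priors.size();
    Vec q(K);
    double maxLog = -INFINITY;
    for (std::size_t k = 0; k < K; ++k) {
        double lp = std::log(gmm.priors[k]);
        for (std::size_t d = 0; d < x.size(); ++d) {
            const double var = gmm.variances[k][d];
            const double diff = x[d] - gmm.means[k][d];
            lp -= 0.5 * (std::log(twoPi * var) + diff * diff / var);
        }
        q[k] = lp;
        if (lp > maxLog) maxLog = lp;
    }
    double sum = 0.0;
    for (double& v : q) { v = std::exp(v - maxLog); sum += v; }
    for (double& v : q) v /= sum;
    return q;
}

// Fisher vector: K*D mean gradients followed by K*D variance gradients.
Vec fisherEncode(const DiagonalGmm& gmm, const std::vector<Vec>& descriptors) {
    assert(!descriptors.empty());
    const std::size_t K = gmm.priors.size();
    const std::size_t D = descriptors.front().size();
    const double N = static_cast<double>(descriptors.size());
    Vec fv(2 * K * D, 0.0);
    for (const Vec& x : descriptors) {
        assert(x.size() == D);
        const Vec q = posteriors(gmm, x);
        for (std::size_t k = 0; k < K; ++k) {
            for (std::size_t d = 0; d < D; ++d) {
                // Descriptor coordinate whitened by the component's mean and std.
                const double z =
                    (x[d] - gmm.means[k][d]) / std::sqrt(gmm.variances[k][d]);
                fv[k * D + d] += q[k] * z;
                fv[K * D + k * D + d] += q[k] * (z * z - 1.0);
            }
        }
    }
    for (std::size_t k = 0; k < K; ++k) {
        for (std::size_t d = 0; d < D; ++d) {
            fv[k * D + d] /= N * std::sqrt(gmm.priors[k]);
            fv[K * D + k * D + d] /= N * std::sqrt(2.0 * gmm.priors[k]);
        }
    }
    return fv;
}

// Signed square root followed by L2 normalization, applied in place.
void improveNormalize(Vec& fv) {
    double norm2 = 0.0;
    for (double& v : fv) {
        v = (v < 0.0 ? -1.0 : 1.0) * std::sqrt(std::fabs(v));
        norm2 += v * v;
    }
    if (norm2 > 0.0) {
        const double norm = std::sqrt(norm2);
        for (double& v : fv) v /= norm;
    }
}
```

In the pipeline described above, the input to `fisherEncode` would be the PCA-reduced SIFT descriptors of one image, and the normalized output would be the feature vector handed to the per-class linear SVMs. A real implementation would more likely use flat arrays or Eigen matrices than nested vectors; the nested layout is only there to keep the sketch short.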
The code uses CMake for compilation and requires the Boost, OpenCV and Eigen libraries in addition to VLFeat. To train on some subset of images, run it from the terminal:
./fisher train
and to test on new images:
./fisher test
Just make sure you set the data paths correctly. You can find the code here.