This paper proposes an improvement over our previous work to handle short- to medium-range surveillance videos. Histogram of oriented social force (HOSF) features are the primitive building blocks used to capture interactions among people. To reduce correlation among the data, a whitening procedure is applied to the features. Because our goal is to classify whether a given frame is normal, we use Bag-of-Features (BoF) to pool the HOSF features in each frame: a BoF, which is a histogram of the visual words in a frame, better represents patterns at the frame level. To build the dictionary, the training BoFs are clustered, and the cluster centers serve as codewords corresponding to the "normal" patterns observed during training. A Gaussian model is constructed for the distances between the data and the codeword in each cluster. To decide whether a given frame is normal, its BoF is computed and a Z-score measuring its deviation from the closest codeword is calculated. If the BoF is an outlier (i.e., has a high Z-score) relative to the closest codeword, the frame is classified as "abnormal". The method is evaluated on the subway dataset with promising results.
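The frame-level decision described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the codewords have already been obtained by clustering the training BoFs, uses Euclidean distance, estimates each cluster's Gaussian with the population standard deviation, and flags a frame when the Z-score exceeds a threshold of 3. All function names and the threshold value are hypothetical choices made for illustration.

```python
import math
from statistics import mean, pstdev


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_codeword(bof, codewords):
    """Return (index, distance) of the codeword closest to a BoF histogram."""
    distances = [euclidean(bof, c) for c in codewords]
    idx = min(range(len(codewords)), key=distances.__getitem__)
    return idx, distances[idx]


def fit_distance_models(training_bofs, codewords, eps=1e-6):
    """Fit one Gaussian (mean, std) per cluster to the distances between
    training BoFs and their closest codeword."""
    per_cluster = [[] for _ in codewords]
    for bof in training_bofs:
        idx, dist = nearest_codeword(bof, codewords)
        per_cluster[idx].append(dist)
    models = []
    for dists in per_cluster:
        if not dists:
            # Empty cluster: no data, so fall back to a degenerate model.
            models.append((0.0, eps))
        else:
            # Clamp the std to eps to avoid division by zero later.
            models.append((mean(dists), max(pstdev(dists), eps)))
    return models


def z_score(bof, codewords, models):
    """Deviation of a BoF's distance to its closest codeword, measured in
    standard deviations of that cluster's distance model."""
    idx, dist = nearest_codeword(bof, codewords)
    mu, sigma = models[idx]
    return (dist - mu) / sigma


def is_abnormal(bof, codewords, models, threshold=3.0):
    """Classify a frame as abnormal when its Z-score exceeds the threshold
    (the value 3.0 is an illustrative assumption)."""
    return z_score(bof, codewords, models) > threshold
```

In this sketch only large positive deviations are treated as outliers, since a BoF that lies unusually close to a codeword still matches a "normal" pattern; a two-sided test would be an equally plausible reading of the description.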
Proceedings of the 29th IEEE International Conference on Advanced Information Networking and Applications (AINA-2015)