This paper presents an algorithm that incorporates spatial and temporal gradients for full-reference video quality assessment. In the proposed method, the frame-based gradient magnitude similarity deviation (GMSD) is calculated to form the spatial quality vector. To capture temporal distortion, the similarity of frame differences between the reference and distorted videos is measured. The worst scores in both the spatial and temporal vectors are then extracted by introducing a variable-length temporal window for the max-pooling operation, and the resulting vectors are combined to form the final score. The performance of the proposed method is evaluated on the LIVE SD and EPFL-PoliMI datasets. The results show that, despite its computational efficiency, the method's predictions correlate highly with human judgments of visual quality.
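As a rough illustration of the frame-level spatial measure described above, the following is a minimal sketch of a GMSD computation on a pair of grayscale frames. The constant `c` and the use of `np.gradient` (a simple finite-difference operator, standing in for the Prewitt filters commonly used for GMSD) are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

def gmsd(ref, dist, c=0.0026):
    """Sketch of gradient magnitude similarity deviation for one frame pair.

    ref, dist : 2-D grayscale frames as NumPy arrays.
    c         : small stabilizing constant (illustrative value).
    """
    # Gradient magnitudes via finite differences (stand-in for Prewitt filtering).
    gy_r, gx_r = np.gradient(ref.astype(float))
    gy_d, gx_d = np.gradient(dist.astype(float))
    g_r = np.hypot(gx_r, gy_r)
    g_d = np.hypot(gx_d, gy_d)

    # Per-pixel gradient magnitude similarity map.
    gms = (2.0 * g_r * g_d + c) / (g_r**2 + g_d**2 + c)

    # GMSD is the standard deviation of the similarity map:
    # larger deviation means less uniform (worse) quality.
    return gms.std()
```

For identical frames the similarity map is 1 everywhere, so the GMSD is 0; any distortion perturbs the gradient field and raises the score. A per-frame vector of such scores is what the worst-case max-pooling stage would then operate on.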
Keywords: Video Quality Assessment, Full-Reference, Gradient, Human Visual System, Mean Squared Error