Xu, X., Chen, X., Liu, C., Rohrbach, A., Darrell, T., & Song, D. (2018). Fooling Vision and Language Models Despite Localization and Attention Mechanism. In IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 4951-4961). Piscataway, NJ: IEEE. doi:10.1109/CVPR.2018.00520.