Deng J, Zhou Y, Kotsia I, et al., 2019, Dense 3D Face Decoding over 2500FPS: Joint Texture and Shape Convolutional Mesh Decoders, CVPR
Gecer B, Ploumpis S, Kotsia I, et al., 2019, GANFIT: Generative Adversarial Network Fitting for High Fidelity 3D Face Reconstruction, CVPR
Gligorijevic V, Panagakis Y, Zafeiriou S, 2019, Non-Negative Matrix Factorizations for Multiplex Network Analysis, Publisher: IEEE Computer Society
Ploumpis S, Wang H, Pears N, et al., 2019, Combining 3D Morphable Models: A Large-scale Face-and-Head Model, CVPR
Deng J, Guo J, Xue N, et al., 2019, ArcFace: Additive Angular Margin Loss for Deep Face Recognition, CVPR
Deng J, Xue N, Cheng S, et al., Side information for face completion: a robust PCA approach, IEEE Transactions on Pattern Analysis and Machine Intelligence, ISSN: 0162-8828
Robust principal component analysis (RPCA) is a powerful method for learning low-rank feature representations of various visual data. However, for certain types and significant amounts of error corruption, it fails to yield satisfactory results; a drawback that can be alleviated by exploiting domain-dependent prior knowledge or information. In this paper, we propose two models for the RPCA that take into account such side information, even in the presence of missing values. We apply this framework to the task of UV completion, which is widely used in pose-invariant face recognition. Moreover, we construct a generative adversarial network (GAN) to extract side information as well as subspaces. These subspaces not only assist in the recovery but also speed up the process in the case of large-scale data. We quantitatively and qualitatively evaluate the proposed approaches on both synthetic data and five real-world datasets to verify their effectiveness.
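The side-information models above build on plain RPCA, which splits an observation matrix M into a low-rank part L and a sparse error S. As a rough point of reference (this is the standard inexact-ALM baseline, not the paper's side-information variant; parameter choices here are the usual defaults, assumed for illustration):

```python
import numpy as np

def svt(X, tau):
    # singular-value thresholding: shrink singular values by tau
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def soft(X, tau):
    # elementwise soft thresholding
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def rpca(M, lam=None, n_iter=100, rho=1.5):
    """Decompose M ~ L + S with L low-rank and S sparse (inexact ALM)."""
    m, n = M.shape
    lam = lam if lam is not None else 1.0 / np.sqrt(max(m, n))
    mu = 1.25 / np.linalg.norm(M, 2)   # standard initial penalty
    mu_bar = mu * 1e7
    S = np.zeros_like(M)
    Y = np.zeros_like(M)               # Lagrange multipliers
    for _ in range(n_iter):
        L = svt(M - S + Y / mu, 1.0 / mu)
        S = soft(M - L + Y / mu, lam / mu)
        Y = Y + mu * (M - L - S)
        mu = min(mu * rho, mu_bar)     # grow penalty for faster convergence
    return L, S
```

On a synthetic low-rank matrix with a small fraction of gross corruptions, this recovers L and S nearly exactly; the paper's contribution is to inject feature/side-information subspaces into exactly this kind of decomposition.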
Kollias D, Tzirakis P, Nicolaou MA, et al., Deep Affect Prediction in-the-Wild: Aff-Wild Database and Challenge, Deep Architectures, and Beyond, International Journal of Computer Vision, ISSN: 0920-5691
Wang M, Shu Z, Cheng S, et al., 2019, An Adversarial Neuro-Tensorial Approach for Learning Disentangled Representations, International Journal of Computer Vision, ISSN: 0920-5691
© 2019, The Author(s). Several factors contribute to the appearance of an object in a visual scene, including pose, illumination, and deformation, among others. Each factor accounts for a source of variability in the data, while the multiplicative interactions of these factors emulate the entangled variability, giving rise to the rich structure of visual object appearance. Disentangling such unobserved factors from visual data is a challenging task, especially when the data have been captured in uncontrolled recording conditions (also referred to as “in-the-wild”) and label information is not available. In this paper, we propose a pseudo-supervised deep learning method for disentangling multiple latent factors of variation in face images captured in-the-wild. To this end, we propose a deep latent variable model, where the multiplicative interactions of multiple latent factors of variation are explicitly modelled by means of multilinear (tensor) structure. We demonstrate that the proposed approach indeed learns disentangled representations of facial expressions and pose, which can be used in various applications, including face editing, as well as 3D face reconstruction and classification of facial expression, identity and pose.
Kollias D, Cheng S, Pantic M, et al., 2019, Photorealistic facial synthesis in the dimensional affect space, Pages: 475-491, ISSN: 0302-9743
© 2019, Springer Nature Switzerland AG. This paper presents a novel approach for synthesizing facial affect, which is based on our annotating 600,000 frames of the 4DFAB database in terms of valence and arousal. The input to this approach is a pair of these emotional state descriptors and a neutral 2D image of a person to whom the corresponding affect will be synthesized. Given this target pair, a set of 3D facial meshes is selected, which is used to build a blendshape model and generate the new facial affect. To synthesize the affect on the 2D neutral image, 3DMM fitting is performed and the reconstructed face is deformed to generate the target facial expressions. Lastly, the new face is rendered into the original image. Both qualitative and quantitative experimental studies illustrate the generation of realistic images when the neutral image is sampled from a variety of well-known databases, such as Aff-Wild, AFEW, Multi-PIE, AFEW-VA, BU-3DFE and Bosphorus.
Hovhannisyan V, Panagakis Y, Parpas P, et al., 2019, Fast Multilevel Algorithms for Compressive Principal Component Pursuit, SIAM JOURNAL ON IMAGING SCIENCES, Vol: 12, Pages: 624-649, ISSN: 1936-4954
Nicolaou MA, Zafeiriou S, Kotsia I, et al., 2019, Editorial of Special Issue on Human Behaviour Analysis "In-the-Wild", IEEE TRANSACTIONS ON AFFECTIVE COMPUTING, Vol: 10, Pages: 4-6, ISSN: 1949-3045
Moschoglou S, Ververas E, Panagakis Y, et al., 2018, Multi-Attribute Robust Component Analysis for Facial UV Maps, IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING, Vol: 12, Pages: 1324-1337, ISSN: 1932-4553
Chrysos GG, Zafeiriou S, 2018, PD²T: Person-Specific Detection, Deformable Tracking, IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, Vol: 40, Pages: 2555-2568, ISSN: 0162-8828
Sagonas C, Ververas E, Panagakis Y, et al., 2018, Recovering Joint and Individual Components in Facial Data, IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, Vol: 40, Pages: 2668-2681, ISSN: 0162-8828
Wang M, Panagakis Y, Snape P, et al., 2018, Disentangling the Modes of Variation in Unlabelled Data., IEEE Trans Pattern Anal Mach Intell, Vol: 40, Pages: 2682-2695
Statistical methods are of paramount importance in discovering the modes of variation in visual data. Principal Component Analysis (PCA) is probably the most prominent method for extracting a single mode of variation from data. However, in practice, several factors contribute to the appearance of visual objects, including pose, illumination, and deformation, to mention a few. To extract these modes of variation from visual data, several supervised methods, such as TensorFaces, relying on multilinear (tensor) decomposition have been developed. The main drawback of such methods is that they require both labels for the modes of variation and the same number of samples under all modes of variation (e.g., the same face under different expressions, poses, etc.). Therefore, their applicability is limited to well-organised data, usually captured in well-controlled conditions. In this paper, we propose a novel general multilinear matrix decomposition method that discovers the multilinear structure of possibly incomplete sets of visual data in an unsupervised setting (i.e., without the presence of labels). We also propose extensions of the method with sparsity and low-rank constraints in order to handle noisy data captured in unconstrained conditions. In addition, a graph-regularised variant of the method is developed in order to exploit available geometric or label information for some modes of variation. We demonstrate the applicability of the proposed method in several computer vision tasks, including Shape from Shading (SfS) (in the wild and with occlusion removal), expression transfer, and estimation of surface normals from images captured in the wild.
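The single-mode baseline the abstract contrasts against can be stated very compactly: PCA centres the data and takes the top right singular vectors as the modes of variation. A minimal sketch for reference (illustrative only; the paper's multilinear method goes well beyond this):

```python
import numpy as np

def principal_modes(X, k):
    """Return the mean, top-k principal modes of variation, and their
    singular values for data X with one sample per row."""
    mean = X.mean(axis=0)
    Xc = X - mean                      # centre the data
    # SVD of the centred data: rows of Vt are the principal directions
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    return mean, Vt[:k], s[:k]
```

Each sample is then approximated as the mean plus a weighted sum of the top-k modes; the supervised multilinear methods discussed above (e.g., TensorFaces) instead factor the variation into several such modes at once, which is what the proposed unsupervised decomposition recovers without labels.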
Booth J, Roussos A, Ververas E, et al., 2018, 3D Reconstruction of "In-the-Wild" Faces in Images and Videos., IEEE Trans Pattern Anal Mach Intell, Vol: 40, Pages: 2638-2652
3D Morphable Models (3DMMs) are powerful statistical models of 3D facial shape and texture, and are among the state-of-the-art methods for reconstructing facial shape from single images. With the advent of new 3D sensors, many 3D facial datasets have been collected containing both neutral as well as expressive faces. However, all datasets are captured under controlled conditions. Thus, even though powerful 3D facial shape models can be learnt from such data, it is difficult to build statistical texture models that are sufficient to reconstruct faces captured in unconstrained conditions ("in-the-wild"). In this paper, we propose the first "in-the-wild" 3DMM by combining a statistical model of facial identity and expression shape with an "in-the-wild" texture model. We show that such an approach allows for the development of a greatly simplified fitting procedure for images and videos, as there is no need to optimise with regards to the illumination parameters. We have collected three new benchmarks that combine "in-the-wild" images and video with ground truth 3D facial geometry, the first of their kind, and report extensive quantitative evaluations using them that demonstrate our method is state-of-the-art.
Kollias D, Zafeiriou S, 2018, Training Deep Neural Networks with Different Datasets In-the-wild: The Emotion Recognition Paradigm
© 2018 IEEE. A novel procedure is presented in this paper for training a deep convolutional and recurrent neural network, taking into account both the available training dataset and information extracted from similar networks trained on other relevant datasets. This information is included in an extended loss function used for network training, so that the network achieves improved performance when applied to the other datasets, without forgetting the knowledge learned from the original dataset. Facial expression and emotion recognition in-the-wild is the test-bed application used to demonstrate the improved performance achieved with the proposed approach. In this framework, we provide an experimental study on categorical emotion recognition using datasets from a very recent related emotion recognition challenge.
Chrysos GG, Antonakos E, Zafeiriou S, 2018, IPST: Incremental Pictorial Structures for Model-Free Tracking of Deformable Objects, IEEE TRANSACTIONS ON IMAGE PROCESSING, Vol: 27, Pages: 3529-3540, ISSN: 1057-7149
Kampouris C, Zafeiriou S, Ghosh A, 2018, Diffuse-specular separation using binary spherical gradient illumination, Eurographics Symposium on Rendering (EGSR) 2018, Publisher: The Eurographics Association, ISSN: 1727-3463
We introduce a novel method for view-independent diffuse-specular separation of albedo and photometric normals without requiring polarization, using binary spherical gradient illumination. The key idea is that with binary gradient illumination, a dielectric surface oriented towards the dark hemisphere exhibits pure diffuse reflectance, while a surface oriented towards the bright hemisphere exhibits both diffuse and specular reflectance. We exploit this observation to formulate diffuse-specular separation based on colour-space analysis of a surface's response to binary spherical gradients and their complements. The method does not impose restrictions on viewpoints and requires fewer photographs for multiview acquisition than polarized spherical gradient illumination. We further demonstrate an efficient two-shot capture using spectral multiplexing of the illumination that enables diffuse-specular separation of albedo and heuristic separation of photometric normals.
Deng J, Cheng S, Xue N, et al., 2018, UV-GAN: Adversarial Facial UV Map Completion for Pose-invariant Face Recognition, 31st IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Publisher: IEEE, Pages: 7093-7102, ISSN: 1063-6919
Trigeorgis G, Nicolaou MA, Schuller BW, et al., 2018, Deep Canonical Time Warping for Simultaneous Alignment and Representation Learning of Sequences, IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, Vol: 40, Pages: 1128-1138, ISSN: 0162-8828
Chrysos GG, Antonakos E, Snape P, et al., 2018, A Comprehensive Performance Evaluation of Deformable Face Tracking "In-the-Wild", INTERNATIONAL JOURNAL OF COMPUTER VISION, Vol: 126, Pages: 198-232, ISSN: 0920-5691
Zafeiriou S, Kotsia I, Pantic M, 2018, Unconstrained face recognition, Computer Vision: Concepts, Methodologies, Tools, and Applications, Pages: 1640-1661, ISBN: 9781522552048
© 2018 by IGI Global. All rights reserved. The human face is the most well-researched object in computer vision, mainly because (1) it is a highly deformable object whose appearance changes dramatically under different poses, expressions, illuminations, etc., (2) the applications of face recognition are numerous and span several fields, and (3) it is widely known that humans can perform facial analysis, especially identity recognition, extremely efficiently and accurately. Although a lot of research has been conducted in past years, the problem of face recognition using images captured in uncontrolled environments, including several illumination and/or pose variations, still remains open. This is also attributed to the existence of outliers (such as partial occlusion, cosmetics, eyeglasses, etc.) or changes due to age. In this chapter, the authors provide an overview of the existing fully automatic face recognition technologies for uncontrolled scenarios. They present the existing databases, summarize the challenges that arise in such scenarios, and conclude by presenting the opportunities that exist in the field.
Bahri M, Panagakis Y, Zafeiriou SP, 2018, Robust Kronecker Component Analysis, IEEE Transactions on Pattern Analysis and Machine Intelligence, ISSN: 0162-8828
Dictionary learning and component analysis models are fundamental for learning compact representations relevant to a given task. The model complexity is encoded by means of structure, such as sparsity, low-rankness, or nonnegativity. Unfortunately, approaches like K-SVD that learn dictionaries for sparse coding via Singular Value Decomposition (SVD) are hard to scale, and fragile in the presence of outliers. Conversely, robust component analysis methods such as the Robust Principal Component Analysis (RPCA) are able to recover low-complexity representations from data corrupted with noise of unknown magnitude and support, but do not provide a dictionary that respects the structure of the data, and also involve expensive computations. In this paper, we propose a novel Kronecker-decomposable component analysis model, coined as Robust Kronecker Component Analysis (RKCA), that combines ideas from sparse dictionary learning and robust component analysis. RKCA has several appealing properties, including robustness to gross corruption; it can be used for low-rank modeling, and leverages separability to solve significantly smaller problems. We design an efficient learning algorithm by drawing links with tensor factorizations, and analyze its optimality and low-rankness properties. The effectiveness of the proposed approach is demonstrated on real-world applications, namely background subtraction and image denoising and completion, by performing a thorough comparison with the current state of the art.
Deng J, Roussos A, Chrysos G, et al., 2018, The Menpo Benchmark for Multi-pose 2D and 3D Facial Landmark Localisation and Tracking, International Journal of Computer Vision, ISSN: 0920-5691
© 2018, The Author(s). In this article, we present the Menpo 2D and Menpo 3D benchmarks, two new datasets for multi-pose 2D and 3D facial landmark localisation and tracking. In contrast to previous benchmarks such as 300W and 300VW, the proposed benchmarks contain facial images in both semi-frontal and profile pose. We introduce an elaborate semi-automatic methodology for providing high-quality annotations for both the Menpo 2D and Menpo 3D benchmarks. In the Menpo 2D benchmark, different visible landmark configurations are designed for semi-frontal and profile faces, thus making 2D face alignment full-pose. In the Menpo 3D benchmark, a unified landmark configuration is designed for both semi-frontal and profile faces based on the correspondence with a 3D face model, thus making face alignment not only full-pose but also corresponding to the real-world 3D space. Based on the considerable number of annotated images, we organised the Menpo 2D and Menpo 3D Challenges for face alignment under large pose variations in conjunction with CVPR 2017 and ICCV 2017, respectively. The results of these challenges demonstrate that recent deep learning architectures, when trained with abundant data, lead to excellent results. We also provide a very simple, yet effective solution, named Cascade Multi-view Hourglass Model, for 2D and 3D face alignment. In our method, we take advantage of all 2D and 3D facial landmark annotations in a joint way. We not only capitalise on the correspondences between the semi-frontal and profile 2D facial landmarks but also employ joint supervision from both 2D and 3D facial landmarks. Finally, we discuss future directions on the topic of face alignment.
Xue N, Deng J, Panagakis Y, et al., 2018, Informed non-convex robust principal component analysis with features, Pages: 4343-4349
Copyright © 2018, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved. We revisit the problem of robust principal component analysis with features acting as prior side information. To this aim, a novel, elegant, non-convex optimization approach is proposed to decompose a given observation matrix into a low-rank core and the corresponding sparse residual. Rigorous theoretical analysis of the proposed algorithm results in exact recovery guarantees with low computational complexity. Aptly designed synthetic experiments demonstrate that our method is the first to wholly harness the power of non-convexity over convexity in terms of both recoverability and speed. That is, the proposed non-convex approach is more accurate and faster compared to the best available algorithms for the problem under study. Two real-world applications, namely image classification and face denoising further exemplify the practical superiority of the proposed method.
Cheng S, Kotsia I, Pantic M, et al., 2018, 4DFAB: A Large Scale 4D Database for Facial Expression Analysis and Biometric Applications, 31st IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Publisher: IEEE, Pages: 5117-5126, ISSN: 1063-6919
Zhou Y, Deng J, Zafeiriou S, 2018, Improve Accurate Pose Alignment and Action Localization by Dense Pose Estimation, 13th IEEE International Conference on Automatic Face & Gesture Recognition (FG), Publisher: IEEE, Pages: 480-484, ISSN: 2326-5396
This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.