Mohammed VI Polytechnic University is an institution dedicated to research and innovation in Africa and aims to position itself among world-renowned universities in its fields.
The University is engaged in economic and human development and puts research and innovation at the forefront of African development. This mechanism enables it to consolidate Morocco’s frontline position in these fields through a unique partnership-based approach, while boosting skills training relevant to the future of Africa.
Located in the municipality of Benguerir, in the very heart of the Green City, Mohammed VI Polytechnic University aspires to leave its mark nationally, continentally, and globally.
Keywords: Hand-Object Interaction, 3D Grasp Generation, 3D Motion Generation, Deep Generative Models
Hands are dexterous and versatile manipulators essential to human interaction with objects and the environment. Accurately modeling realistic hand-object interactions, including the subtle movements of individual fingers, therefore has the potential to help robots learn human-robot interactions through simulation and to improve the realism of virtual manipulation experiences. In virtual reality, for example, hand interactions still often depend on heuristics and controllers that attach objects to the hand based on predefined grasps. Faithfully reproducing object manipulation from input signals such as natural language or a few previous frames of hand and object poses could significantly boost the immersiveness of these interactions.
In this context, we propose a thesis that will contribute to the task of generating human hand-object interactions. Our aim is to leverage deep generative models, reinforcement learning, and recent advances in large language models to investigate new solutions to the problem of synthesizing 3D hand-object interactions controlled by text prompts and the geometry of the object.
Synthesizing realistic hand-object interactions in 3D comes with various challenges, given that the resulting motions should satisfy several constraints: (1) The motions must be geometrically plausible, minimizing hand and object intersections and ensuring that the grasp appears stable. (2) The motions must be semantically plausible, with hands respecting natural object affordances (e.g., grasping a cup by its handle rather than flipping it upside down). (3) The motions must be temporally consistent, with hand and object movements synchronized and the dynamics appearing natural. Addressing these challenges requires finding a suitable representation that better models interaction, contact, and collision and avoids artifacts such as hand-object interpenetration or implausible contact points.
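As an illustration of the geometric plausibility constraint (1), the following minimal sketch measures hand-object interpenetration and contact using a signed distance function. It is not a method from the cited works: the sphere object, the function names, the three hand sample points, and the 5 mm contact tolerance are all hypothetical choices made for the example.

```python
import math

def sphere_sdf(point, center, radius):
    """Signed distance from a 3D point to a sphere surface.
    Negative inside the object, zero on the surface, positive outside."""
    return math.dist(point, center) - radius

def grasp_geometry_metrics(hand_points, sdf, contact_tol=0.005):
    """Return (max penetration depth, fraction of hand points in contact).

    hand_points: iterable of 3D points sampled on the hand surface (metres).
    sdf: callable mapping a point to its signed distance to the object.
    contact_tol: hypothetical tolerance (metres) under which a point
                 counts as touching the object surface.
    """
    distances = [sdf(p) for p in hand_points]
    # Penetration is how far the deepest hand point lies inside the object.
    max_penetration = max(0.0, -min(distances))
    # A point is "in contact" if it lies within the tolerance of the surface.
    in_contact = sum(abs(d) <= contact_tol for d in distances)
    return max_penetration, in_contact / len(distances)

# Hypothetical example: a 5 cm sphere and three hand sample points.
center, radius = (0.0, 0.0, 0.0), 0.05
hand_points = [
    (0.05, 0.0, 0.0),  # on the surface (contact)
    (0.04, 0.0, 0.0),  # 1 cm inside the object (penetration)
    (0.10, 0.0, 0.0),  # 5 cm away (no contact)
]
max_pen, contact_ratio = grasp_geometry_metrics(
    hand_points, lambda p: sphere_sdf(p, center, radius)
)
print(f"max penetration: {max_pen:.3f} m, contact ratio: {contact_ratio:.2f}")
```

A plausible grasp would typically show near-zero penetration together with a non-trivial contact ratio. In practice the analytic sphere would be replaced by a signed distance field computed from the object mesh or point cloud.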
The limited scale of existing hand-object datasets is another challenge in generating hand-object interactions. This limited data also affects the ability of trained models to generalize to unseen objects. Generalization is both critical and challenging, since different object shapes require different types of interaction and hand grasps, such as a power grasp of an apple, a delicate three-finger pinch of a cup handle, or a bi-manual grasp of binoculars.
To tackle the above challenges, recent studies exploit diffusion models [1, 2], while others turn to reinforcement learning to learn from physical simulation [3]. Rather than focusing on hands alone, other studies tackle the problem of hand interaction while considering whole-body human motion [4, 5, 6]. However, all these approaches still encounter various issues, such as limited generalization ability, high computation time, the need for an initial hand pose, or modeling only single-hand interactions.
In this thesis, we aim to propose new models and approaches to understand and synthesize human hand interactions. More specifically, we aim to address the following questions: (1) Given a 3D point cloud of an object, how can we generate a plausible hand pose that grasps and handles the object correctly? This requires understanding the object’s shape and environment and generalizing to new, unseen objects. (2) How can we generate physically plausible 3D human hand motion that moves the object to a target location and pose? This involves generating continuous interaction with objects to move them while maintaining a stable grasp throughout the interaction. We also aim to leverage recent advances in large language models to guide the hand-object interaction with natural language and allow fine-grained control over the motion.
The aim of this thesis is to explore new approaches to model and generate realistic human hand-object interactions. First, a state-of-the-art review should be performed in order to understand the advances achieved so far, the existing challenges, and the promising directions that can be investigated. Next, we aim to propose new generative model architectures to synthesize human hand poses and motions that correctly grasp, manipulate, and interact with a given 3D object. Our goal is to publish these contributions in high-impact computer vision conferences (e.g., ICCV, CVPR, ECCV) and journals.
The PhD position is proposed by the International Center of Artificial Intelligence of Morocco at Mohammed VI Polytechnic University. Applicants must have an excellent academic record and hold a Master’s, engineering, or equivalent recognized degree in Computer Science or Applied Mathematics. In addition, they should have programming skills (Python and C++) and good communication skills in English. Particular attention will be given to the fit between this research project and the applicant’s background.
References
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 22479-22489, 2023.
UM6P.