Jaw tracking without markers for facial performance capture

Bibliographic Details
Title: Jaw tracking without markers for facial performance capture
Patent Number: 12,118,734
Publication Date: October 15, 2024
Appl. No: 17/851,185
Application Filed: June 28, 2022
Abstract: Some implementations of the disclosure are directed to capturing facial training data for one or more subjects, the captured facial training data including each of the one or more subject's facial skin geometry tracked over a plurality of times and the subject's corresponding jaw poses for each of those plurality of times; and using the captured facial training data to create a model that provides a mapping from skin motion to jaw motion. Additional implementations of the disclosure are directed to determining a facial skin geometry of a subject; using a model that provides a mapping from skin motion to jaw motion to predict a motion of the subject's jaw from a rest pose given the facial skin geometry; and determining a jaw pose of the subject using the predicted motion of the subject's jaw.
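The abstract describes a two-stage pipeline: first capture, for one or more training subjects, skin geometry tracked over time together with the corresponding jaw poses, then learn a mapping from skin motion to jaw motion that can later predict a subject's jaw pose from skin alone. The Python sketch below is a minimal, hypothetical illustration of that idea; the array shapes, the choice of a closed-form ridge regression, and all function names are assumptions made for illustration, not the method disclosed in the patent.

import numpy as np

def skin_motion_features(skin_vertices, rest_skin):
    # skin_vertices: (T, V, 3) tracked positions of V skin feature vertices over T times
    # rest_skin:     (V, 3) the same vertices in the subject's rest pose
    # Skin motion is represented as per-vertex displacement from rest, flattened per frame.
    return (skin_vertices - rest_skin).reshape(len(skin_vertices), -1)

def fit_skin_to_jaw_mapping(skin_vertices, rest_skin, jaw_points, rest_jaw, reg=1e-3):
    # jaw_points: (T, J, 3) tracked jaw landmark positions for the same T times
    # rest_jaw:   (J, 3) jaw landmarks in the rest pose
    # Returns W such that jaw_motion ≈ skin_motion @ W (closed-form ridge regression).
    X = skin_motion_features(skin_vertices, rest_skin)            # (T, 3V) skin motion
    Y = (jaw_points - rest_jaw).reshape(len(jaw_points), -1)      # (T, 3J) jaw motion
    XtX = X.T @ X + reg * np.eye(X.shape[1])
    return np.linalg.solve(XtX, X.T @ Y)

def predict_jaw_motion(W, skin_frame, rest_skin):
    # Predict jaw landmark displacements from the rest pose for one captured skin frame.
    x = (skin_frame - rest_skin).reshape(1, -1)
    return (x @ W).reshape(-1, 3)                                  # (J, 3) displacements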
Inventors: Disney Enterprises, Inc. (Burbank, CA, US); ETH Zürich (Eidgenössische Technische Hochschule Zürich) (Zürich, CH)
Assignees: Disney Enterprises, Inc. (Burbank, CA, US), ETH Zürich (Eidgenössische Technische Hochschule Zürich) (Zürich, CH)
Claim: 1. A non-transitory computer-readable medium having executable instructions stored thereon that, when executed by a processor, cause a system to perform operations comprising: obtaining a trained model that provides a mapping from facial skin motion to jaw motion, the trained model created using facial training data that includes, for each of one or more subjects, a facial skin geometry and corresponding jaw pose tracked over a plurality of times; determining a first facial skin motion of a first subject corresponding to a facial performance capture, the first subject being different from the one or more subjects; predicting, using the trained model, based at least on the first facial skin motion, a first jaw motion of a jaw of the first subject, the first jaw motion corresponding to the facial performance capture; and generating, using the first jaw motion predicted using the trained model, a facial animation of a digital character.
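Claim 1 chains four inference-time operations: obtain the trained model, determine the new subject's skin motion from a performance capture, predict the corresponding jaw motion, and use that predicted motion to animate a digital character. Reusing the hypothetical predict_jaw_motion helper sketched under the abstract, the flow could be wired up as follows; the returned per-frame jaw points merely stand in for whatever representation a character rig would actually consume.

import numpy as np

def jaw_track_for_performance(W, performance_frames, rest_skin, rest_jaw):
    # performance_frames: (T, V, 3) tracked skin feature vertices of the new subject
    # Returns (T, J, 3) jaw landmark positions over time, i.e. a jaw animation curve.
    tracks = []
    for frame in performance_frames:
        disp = predict_jaw_motion(W, frame, rest_skin)   # predicted jaw motion from rest
        tracks.append(rest_jaw + disp)                   # jaw points at this time
    return np.stack(tracks)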
Claim: 2. The non-transitory computer-readable medium of claim 1, wherein: the one or more subjects include a plurality of subjects; and the facial training data is captured for the plurality of subjects over a plurality of facial expressions.
Claim: 3. The non-transitory computer-readable medium of claim 1, wherein the operations further comprise: determining, based on the first jaw motion that is predicted, a jaw pose of the first subject.
Claim: 4. The non-transitory computer-readable medium of claim 1, wherein determining the first facial skin motion comprises determining multiple skin features corresponding to the first facial skin motion.
Claim: 5. The non-transitory computer-readable medium of claim 4, wherein before predicting the first jaw motion, the operations further comprise: transforming the multiple skin features to align with a feature space of the trained model.
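Claim 5 transforms the new subject's skin features so that they align with the feature space the model was trained in. One plausible, purely illustrative way to do this is a similarity (Procrustes) alignment of the subject's rest-pose feature vertices to a canonical template, with the estimated scale, rotation, and translation then applied to every captured frame; the template and function names below are assumptions, not terms from the patent.

import numpy as np

def similarity_align(source, target):
    # Estimate scale s, rotation R, translation t so that s * R @ p + t ≈ q
    # for corresponding rows p of source and q of target, both (N, 3).
    mu_s, mu_t = source.mean(0), target.mean(0)
    S, T = source - mu_s, target - mu_t
    U, sigma, Vt = np.linalg.svd(S.T @ T)
    d = np.sign(np.linalg.det(U @ Vt))                 # guard against reflections
    D = np.diag([1.0, 1.0, d])
    R = (U @ D @ Vt).T
    s = (sigma * np.array([1.0, 1.0, d])).sum() / (S ** 2).sum()
    t = mu_t - s * R @ mu_s
    return s, R, t

def to_model_feature_space(frames, subject_rest, template_rest):
    # Map per-frame skin feature vertices (T, V, 3) into the model's canonical
    # feature space by aligning the subject's rest pose to the template rest pose.
    s, R, t = similarity_align(subject_rest, template_rest)
    return np.stack([s * f @ R.T + t for f in frames])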
Claim: 6. The non-transitory computer-readable medium of claim 4, wherein predicting the first jaw motion of the first subject comprises: predicting, using the trained model, based on the multiple skin features, multiple jaw features that define the first jaw motion.
Claim: 7. The non-transitory computer-readable medium of claim 6, wherein predicting the first jaw motion comprises: predicting, using the trained model, based on the multiple skin features, the multiple jaw features as displacements between current positions of points on the first subject's jaw and positions of the points of the first subject's jaw in a rest pose.
Claim: 8. The non-transitory computer-readable medium of claim 7, wherein the operations further comprise: determining a jaw pose of the first subject by fitting the jaw of the first subject using the displacements.
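Claims 6 through 8 have the model output jaw features as per-point displacements from the rest pose and then fit the jaw to those displaced points to recover its pose. A minimal sketch of the fitting step, treated here as a rigid Kabsch fit (an assumption made for illustration):

import numpy as np

def fit_rigid(source, target):
    # Rigid (rotation + translation) fit: R @ source_i + t ≈ target_i, both (N, 3).
    mu_s, mu_t = source.mean(0), target.mean(0)
    H = (source - mu_s).T @ (target - mu_t)
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))             # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = mu_t - R @ mu_s
    return R, t

def jaw_pose_from_displacements(rest_jaw, predicted_displacements):
    # rest_jaw:                (J, 3) jaw points in the rest pose
    # predicted_displacements: (J, 3) model output, displacement of each point from rest
    # Returns (R, t) describing the jaw's rigid pose relative to the rest pose.
    targets = rest_jaw + predicted_displacements       # current jaw point positions
    return fit_rigid(rest_jaw, targets)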
Claim: 9. The non-transitory computer-readable medium of claim 1, wherein: determining the first facial skin motion comprises determining the first facial skin motion for a first time; predicting the first jaw motion comprises predicting the first jaw motion at the first time; and the operations further comprise: determining a second facial skin motion of the first subject for a second time; and predicting, using the trained model, based at least on the second facial skin motion, a second jaw motion of the jaw for the second time.
Claim: 10. A method, comprising: obtaining, at a computing device, a trained model that provides a mapping from facial skin motion to jaw motion, the trained model created using facial training data that includes, for each of one or more subjects, a facial skin geometry and corresponding jaw pose tracked over a plurality of times; determining, at the computing device, a first facial skin motion of a first subject corresponding to a facial performance capture, the first subject being different from the one or more subjects; predicting, at the computing device, using the trained model, based at least on the first facial skin motion, a first jaw motion of a jaw of the first subject, the first jaw motion corresponding to the facial performance capture; and generating, at the computing device, using the first jaw motion predicted using the trained model, a facial animation of a digital character.
Claim: 11. The method of claim 10, wherein: the one or more subjects include a plurality of subjects; and the facial training data is captured for the plurality of subjects over a plurality of facial expressions.
Claim: 12. The method of claim 11, further comprising: capturing the facial training data for the plurality of subjects; and creating, using the captured facial training data, the trained model.
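Claim 12 adds the training side: capture the facial training data across a plurality of subjects and create the trained model from it. Continuing the earlier sketches (and still only as an assumed illustration), multi-subject training can be read as aligning each subject's captures into the shared feature space and stacking them before a single regression fit.

import numpy as np

def train_from_subjects(captures, template_rest, reg=1e-3):
    # captures: one dict per subject with keys
    #   'skin' (T, V, 3), 'rest_skin' (V, 3), 'jaw' (T, J, 3), 'rest_jaw' (J, 3)
    # template_rest: (V, 3) canonical rest-pose feature vertices defining the feature space
    X_rows, Y_rows = [], []
    for c in captures:
        skin = to_model_feature_space(c['skin'], c['rest_skin'], template_rest)
        rest = to_model_feature_space(c['rest_skin'][None], c['rest_skin'], template_rest)[0]
        X_rows.append((skin - rest).reshape(len(skin), -1))                    # skin motion
        Y_rows.append((c['jaw'] - c['rest_jaw']).reshape(len(c['jaw']), -1))   # jaw motion
    X, Y = np.vstack(X_rows), np.vstack(Y_rows)
    XtX = X.T @ X + reg * np.eye(X.shape[1])
    return np.linalg.solve(XtX, X.T @ Y)    # mapping W, as in the earlier sketch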
Claim: 13. The method of claim 10, further comprising: determining, at the computing device, based on the first jaw motion that is predicted, a jaw pose of the first subject.
Claim: 14. The method of claim 10, wherein determining the first facial skin motion comprises determining multiple skin features corresponding to the first facial skin motion.
Claim: 15. The method of claim 14, wherein before predicting the first jaw motion, the method further comprises: transforming, at the computing device, the multiple skin features to align with a feature space of the trained model.
Claim: 16. The method of claim 14, wherein predicting the first jaw motion of the first subject comprises: predicting, using the trained model, based on the multiple skin features, multiple jaw features that define the first jaw motion.
Claim: 17. The method of claim 16, wherein predicting the first jaw motion comprises: predicting, using the trained model, based on the multiple skin features, the multiple jaw features as displacements between current positions of points on the first subject's jaw and positions of the points of the first subject's jaw in a rest pose.
Claim: 18. A non-transitory computer-readable medium having executable instructions stored thereon that, when executed by a processor, cause a system to perform operations comprising: obtaining a trained model that provides a mapping from facial skin motion to jaw motion, the trained model created using facial training data captured for one or more subjects; determining a first facial skin motion of a first subject corresponding to a facial performance capture, the first subject being different from the one or more subjects; predicting, using the trained model, based at least on the first facial skin motion, a first jaw motion of a jaw of the first subject, the first jaw motion corresponding to the facial performance capture; and generating, using the first jaw motion predicted using the trained model, a facial animation of a digital character, wherein determining the first facial skin motion comprises determining multiple skin features corresponding to the first facial skin motion, and determining the multiple skin features comprises determining a position of multiple skin feature vertices relative to a skull of the first subject.
Claim: 19. A method, comprising: obtaining, at a computing device, a trained model that provides a mapping from facial skin motion to jaw motion, the trained model created using facial training data captured for one or more subjects; determining, at the computing device, a first facial skin motion of a first subject corresponding to a facial performance capture, the first subject being different from the one or more subjects; predicting, at the computing device, using the trained model, based at least on the first facial skin motion, a first jaw motion of a jaw of the first subject, the first jaw motion corresponding to the facial performance capture; and generating, at the computing device, using the first jaw motion predicted using the trained model, a facial animation of a digital character, wherein determining the first facial skin motion comprises determining multiple skin features corresponding to the first facial skin motion, and determining the multiple skin features comprises determining a position of multiple skin feature vertices relative to a skull of the first subject.
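Claims 18 and 19 specify the skin features as positions of skin feature vertices relative to the subject's skull, which factors rigid head motion out of the signal. A minimal sketch, assuming the skull's rigid pose for the frame is known as a rotation R_skull and translation t_skull:

import numpy as np

def skin_features_relative_to_skull(skin_vertices, R_skull, t_skull):
    # skin_vertices: (V, 3) world-space skin feature vertex positions for one frame
    # R_skull: (3, 3) skull rotation (world-from-skull); t_skull: (3,) skull translation
    # Returns (V, 3) skull-relative positions, i.e. R_skull^T (p - t_skull) for each vertex p.
    return (skin_vertices - t_skull) @ R_skull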
Patent References Cited: US 2019/0090784, March 2019, Chang et al.

Other References: Li et al., “Buccal: Low-Cost Cheek Sensing for Inferring Continuous Jaw Motion in Mobile Virtual Reality”, Proceedings of the 2018 ACM International Symposium on Wearable Computers, 2018, pp. 180-183. cited by examiner
Anonymous, “Accurate Markerless Jaw Tracking for Facial Performance Capture”, Association for Computing Machinery, Jan. 2019, vol. 1, No. 1, pp. 1-8. cited by applicant
Brandini et al., “Video-Based Tracking of Jaw Movements During Speech: Preliminary Results and Future Directions”, Interspeech, ISCA Aug. 20-24, 2017, Stockholm, Sweden, pp. 689-693. cited by applicant
Furtado et al., “A specialized motion capture system for real-time analysis of mandibular movements using infrared cameras”, Biomedical Engineering Online, 2013, vol. 12, No. 1, pp. 1-16. cited by applicant
Gerstner et al., “Predicting masticatory jaw movements from chin movements using multivariate linear methods”, Journal of Biomechanics, 2005, vol. 38, No. 10, pp. 1991-1999. cited by applicant
Green et al., “Estimating Mandibular Motion Based on Chin Surface Targets During Speech”, Journal of Speech, Language and Hearing Research, Aug. 2007, vol. 50, pp. 928-939. cited by applicant
Li et al., “Buccal: Low-Cost Cheek Sensing for Inferring Continuous Jaw Motion in Mobile Virtual Reality”, Proceedings of the 2018 ACM International Symposium on Wearable Computers, Oct. 8-12, 2018, Singapore, pp. 180-183. cited by applicant
Soboleva et al., “Jaw tracking devices - historical review of methods development. Part I”, Stomatologija, vol. 7, No. 3, 2005, pp. 67-71. cited by applicant
Soboleva et al., “Jaw tracking devices - historical review of methods development. Part II”, Stomatologija, vol. 7, No. 3, 2005, pp. 72-76. cited by applicant
Tanaka et al., “Markerless three-dimensional tracking of masticatory movement”, Journal of Biomechanics, 2016, vol. 49, No. 3, pp. 442-449. cited by applicant
Wilson et al., “Comparison of jaw tracking by single video camera with 3D electromagnetic system”, Journal of Food Engineering, 2016, vol. 190, pp. 22-33. cited by applicant
Primary Examiner: Summers, Geoffrey E
Attorney, Agent or Firm: Sheppard Mullin Richter & Hampton LLP
Accession Number: edspgr.12118734
Database: USPTO Patent Grants