Internal format: Hybrid methods for the analysis and synthesis of human faces

Hybrid methods for the analysis and synthesis of human faces:


Bibliographic details
Main author:	Paier, Wolfgang (author)
Format:	Thesis Book
Language:	English
Published:	Berlin [2024?]
Subjects:	Rendering; Computeranimation (computer animation); Dreidimensionale Rekonstruktion (three-dimensional reconstruction); Gesicht (face); Hochschulschrift (thesis)
Online access:	Full text
Summary:	The trend toward virtual reality (VR) has sparked new interest in topics such as human body modeling, as it opens up new possibilities for entertainment, conferencing systems, and immersive applications. This dissertation therefore presents new approaches to creating animatable, realistic 3D head models, to computer-assisted facial animation from text/speech, and to photorealistic real-time rendering. To simplify 3D capture, a hybrid approach is used that combines statistical head models with dynamic textures. The model captures head pose and large-scale deformations, while the textures encode fine details and complex motions. A generative model is trained on the captured data to reconstruct realistic facial expressions from a latent feature vector. In addition, a new neural rendering technique is presented that learns to separate the foreground (head) from the background. This increases flexibility during inference (e.g., a new background) and simplifies training, since the segmentation does not have to be computed in advance. A new animation approach enables the automatic synthesis of face videos from only a few training sequences. In contrast to existing work, the method learns a latent feature space that captures both emotions and visual variations of speech, while learned priors minimize animation artifacts and unrealistic head movements. After training, realistic speech sequences can be generated, and the latent style space offers additional creative control. The presented methods form a complete system for the realistic 3D modeling, animation, and rendering of human heads that surpasses the state of the art. This is demonstrated in various experiments and ablation/user studies and discussed in detail.
English version: The recent trend of virtual reality (VR) has sparked new interest in human body modeling by offering new possibilities for entertainment, conferencing, and immersive applications (e.g., intelligent virtual assistants). Therefore, this dissertation presents new approaches to creating animatable and realistic 3D head models, animating human faces from text/speech, and the photo-realistic rendering of head models in real time. To simplify complex 3D face reconstruction, a hybrid approach is introduced that combines a lightweight statistical head model for 3D geometry with dynamic textures. The model captures head orientation and large-scale deformations, while textures encode fine details and complex motions. A deep variational autoencoder trained on these textured meshes learns to synthesize realistic facial expressions from a compact vector. Additionally, a new neural-rendering technique is proposed that separates the head (foreground) from the background, providing more flexibility during inference (e.g., rendering on novel backgrounds) and simplifying the training process as no segmentation masks have to be pre-computed. This dissertation also presents a new neural-network-based approach to synthesizing novel face animations based on emotional speech videos of an actor. Unlike existing works, the proposed model learns a latent animation style space that captures emotions as well as natural variations in visual speech. Additionally, learned animation priors minimize animation artifacts and unrealistic head movements. After training, the animation model offers temporally consistent editing of the animation style according to the users’ needs. Together, the presented methods provide an end-to-end system for realistic 3D modeling, animation, and rendering of human heads. Various experimental results, ablation studies, and user evaluations demonstrate that the proposed approaches outperform the state of the art.
Description:	Date of the oral examination: 02.09.2024. The text contains a summary in German and English.
Physical description:	xvi, 157 pages, illustrations, diagrams (color)

Internal format

MARC


LEADER	00000nam a2200000 c 4500
001	BV050042916
003	DE-604
005	20241129
007	t\|
008	241126s2024 xx a\|\|\| m\|\|\| 00\|\|\| eng d
035			\|a (DE-599)BVBBV050042916
040			\|a DE-604 \|b ger \|e rda
041	0		\|a eng
049			\|a DE-11
084			\|a ST 330 \|0 (DE-625)143663: \|2 rvk
084			\|a ST 177 \|0 (DE-625)143604: \|2 rvk
084			\|8 1\p \|a 543 \|2 23ksdnb
084			\|8 2\p \|a 540 \|2 23sdnb
100	1		\|a Paier, Wolfgang \|e Verfasser \|0 (DE-588)1348623535 \|4 aut
245	1	0	\|a Hybrid methods for the analysis and synthesis of human faces \|c von M.Sc. Wolfgang Paier
264		1	\|a Berlin \|c [2024?]
300			\|a xvi, 157 Seiten \|b Illustrationen, Diagramme (farbig)
336			\|b txt \|2 rdacontent
337			\|b n \|2 rdamedia
338			\|b nc \|2 rdacarrier
500			\|a Tag der mündlichen Prüfung: 02.09.2024
500			\|a Der Text enthält eine Zusammenfassung in deutscher und englischer Sprache.
502			\|b Dissertation \|c Humboldt-Universität zu Berlin \|d 2024
520	8		\|a Der Trend hin zu virtueller Realität (VR) hat neues Interesse an Themen wie der Modellierung menschlicher Körper geweckt, da sich neue Möglichkeiten für Unterhaltung, Konferenzsysteme und immersive Anwendungen bieten. Diese Dissertation stellt deshalb neue Ansätze für die Erstellung animierbarer/realistischer 3D-Kopfmodelle, zur computergestützten Gesichtsanimation aus Text/Sprache sowie zum fotorealistischen Echtzeit-Rendering vor. Um die 3D-Erfassung zu vereinfachen, wird ein hybrider Ansatz genutzt, der statistische Kopfmodelle mit dynamischen Texturen kombiniert. Das Modell erfasst Kopfhaltung und großflächige Deformationen, während die Texturen feine Details und komplexe Bewegungen kodieren. Anhand der erfassten Daten wird ein generatives Modell trainiert, das realistische Gesichtsausdrücke aus einem latenten Merkmalsvektor rekonstruiert. Zudem wird eine neue neuronale Rendering-Technik presentiert, die lernt den Vordergrund (Kopf) vom Hintergrund zu trennen. Das erhöht die Flexibilität während der Inferenz (z. B. neuer Hintergrund) und vereinfacht den Trainingsprozess, da die Segmentierung nicht vorab berechnet werden muss. Ein neuer Animationsansatz ermöglicht die automatische Synthese von Gesichtsvideos auf der Grundlage weniger Trainingssequenzen. Im Gegensatz zu bestehenden Arbeiten lernt das Verfahren einen latenten Merkmalsraum, der sowohl Emotionen als auch visuelle Variationen der Sprache erfasst, während gelernte Priors Animations-Artefakte und unrealistische Kopfbewegungen minimieren. Nach dem Training ist es möglich, realistische Sprachsequenzen zu erzeugen, während der latente Stil-Raum zusätzliche Gestaltungsmöglichkeiten bietet. Die vorgestellten Methoden bilden ein Komplettsystem für die realistische 3D-Modellierung, Animation und Darstellung von menschlichen Köpfen, das den Stand der Technik übertrifft. Dies wird in verschiedenen Experimenten, Ablations-/Nutzerstudien gezeigt und ausführlich diskutiert.
520	8		\|a Englische Version: The recent trend of virtual reality (VR) has sparked new interest in human body modeling by offering new possibilities for entertainment, conferencing, and immersive applications (e.g., intelligent virtual assistants). Therefore, this dissertation presents new approaches to creating animatable and realistic 3D head models, animating human faces from text/speech, and the photo-realistic rendering of head models in real-time. To simplify complex 3D face reconstruction, a hybrid approach is introduced that combines a lightweight statistical head model for 3D geometry with dynamic textures. The model captures head orientation and large-scale deformations, while textures encode fine details and complex motions. A deep variational autoencoder trained on these textured meshes learns to synthesize realistic facial expressions from a compact vector. Additionally, a new neural-rendering technique is proposed that separates the head (foreground) from the background, providing more flexibility during inference (e.g., rendering on novel backgrounds) and simplifying the training process as no segmentation masks have to be pre-computed. This dissertation also presents a new neural-network-based approach to synthesizing novel face animations based on emotional speech videos of an actor. Unlike existing works, the proposed model learns a latent animation style space that captures emotions as well as natural variations in visual speech. Additionally, learned animation priors minimize animation artifacts and unrealistic head movements. After training, the animation model offers temporally consistent editing of the animation style according to the users’ needs. Together, the presented methods provide an end-to-end system for realistic 3D modeling, animation, and rendering of human heads. Various experimental results, ablation studies, and user evaluations demonstrate that the proposed approaches outperform the state-of-the-art.
650	0	7	\|a Rendering \|0 (DE-588)4219666-8 \|2 gnd \|9 rswk-swf
650	0	7	\|a Computeranimation \|0 (DE-588)4199710-4 \|2 gnd \|9 rswk-swf
650	0	7	\|a Dreidimensionale Rekonstruktion \|0 (DE-588)4150634-0 \|2 gnd \|9 rswk-swf
650	0	7	\|a Gesicht \|0 (DE-588)4020687-7 \|2 gnd \|9 rswk-swf
655		7	\|0 (DE-588)4113937-9 \|a Hochschulschrift \|2 gnd-content
689	0	0	\|a Computeranimation \|0 (DE-588)4199710-4 \|D s
689	0		\|5 DE-604
689	1	0	\|a Dreidimensionale Rekonstruktion \|0 (DE-588)4150634-0 \|D s
689	1		\|5 DE-604
689	2	0	\|a Gesicht \|0 (DE-588)4020687-7 \|D s
689	2		\|5 DE-604
689	3	0	\|a Rendering \|0 (DE-588)4219666-8 \|D s
689	3		\|5 DE-604
776	0	8	\|i Erscheint auch als \|n Online-Ausgabe \|a Paier, Wolfgang \|t Hybrid methods for the analysis and synthesis of human faces \|o 10.18452/29350 \|o urn:nbn:de:kobv:11-110-18452/31090-5 \|w (DE-604)BV049955232
856	4	1	\|u http://edoc.hu-berlin.de/18452/31090 \|x Verlag \|z kostenfrei \|3 Volltext
883	0		\|8 1\p \|a emakn \|c 0,15997 \|d 20241119 \|q DE-101 \|u https://d-nb.info/provenance/plan#emakn
883	0		\|8 2\p \|a emasg \|c 0,42053 \|d 20241119 \|q DE-101 \|u https://d-nb.info/provenance/plan#emasg
912			\|a ebook
943	1		\|a oai:aleph.bib-bvb.de:BVB01-035380619

Record in the search index

_version_	1817054366566187008
adam_text
any_adam_object
author	Paier, Wolfgang
author_GND	(DE-588)1348623535
author_facet	Paier, Wolfgang
author_role	aut
author_sort	Paier, Wolfgang
author_variant	w p wp
building	Verbundindex
bvnumber	BV050042916
classification_rvk	ST 330 ST 177
collection	ebook
ctrlnum	(DE-599)BVBBV050042916
discipline	Informatik
format	Thesis Book
fullrecord	<?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>00000nam a2200000 c 4500</leader><controlfield tag="001">BV050042916</controlfield><controlfield tag="003">DE-604</controlfield><controlfield tag="005">20241129</controlfield><controlfield tag="007">t\|</controlfield><controlfield tag="008">241126s2024 xx a\|\|\| m\|\|\| 00\|\|\| eng d</controlfield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)BVBBV050042916</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-604</subfield><subfield code="b">ger</subfield><subfield code="e">rda</subfield></datafield><datafield tag="041" ind1="0" ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-11</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">ST 330</subfield><subfield code="0">(DE-625)143663:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">ST 177</subfield><subfield code="0">(DE-625)143604:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="8">1\p</subfield><subfield code="a">543</subfield><subfield code="2">23ksdnb</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="8">2\p</subfield><subfield code="a">540</subfield><subfield code="2">23sdnb</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Paier, Wolfgang</subfield><subfield code="e">Verfasser</subfield><subfield code="0">(DE-588)1348623535</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Hybrid methods for the analysis and synthesis of human faces</subfield><subfield code="c">von M.Sc. 
Wolfgang Paier</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="a">Berlin</subfield><subfield code="c">[2024?]</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">xvi, 157 Seiten</subfield><subfield code="b">Illustrationen, Diagramme (farbig)</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="500" ind1=" " ind2=" "><subfield code="a">Tag der mündlichen Prüfung: 02.09.2024</subfield></datafield><datafield tag="500" ind1=" " ind2=" "><subfield code="a">Der Text enthält eine Zusammenfassung in deutscher und englischer Sprache.</subfield></datafield><datafield tag="502" ind1=" " ind2=" "><subfield code="b">Dissertation</subfield><subfield code="c">Humboldt-Universität zu Berlin</subfield><subfield code="d">2024</subfield></datafield><datafield tag="520" ind1="8" ind2=" "><subfield code="a">Der Trend hin zu virtueller Realität (VR) hat neues Interesse an Themen wie der Modellierung menschlicher Körper geweckt, da sich neue Möglichkeiten für Unterhaltung, Konferenzsysteme und immersive Anwendungen bieten. Diese Dissertation stellt deshalb neue Ansätze für die Erstellung animierbarer/realistischer 3D-Kopfmodelle, zur computergestützten Gesichtsanimation aus Text/Sprache sowie zum fotorealistischen Echtzeit-Rendering vor. Um die 3D-Erfassung zu vereinfachen, wird ein hybrider Ansatz genutzt, der statistische Kopfmodelle mit dynamischen Texturen kombiniert. Das Modell erfasst Kopfhaltung und großflächige Deformationen, während die Texturen feine Details und komplexe Bewegungen kodieren. 
Anhand der erfassten Daten wird ein generatives Modell trainiert, das realistische Gesichtsausdrücke aus einem latenten Merkmalsvektor rekonstruiert. Zudem wird eine neue neuronale Rendering-Technik presentiert, die lernt den Vordergrund (Kopf) vom Hintergrund zu trennen. Das erhöht die Flexibilität während der Inferenz (z. B. neuer Hintergrund) und vereinfacht den Trainingsprozess, da die Segmentierung nicht vorab berechnet werden muss. Ein neuer Animationsansatz ermöglicht die automatische Synthese von Gesichtsvideos auf der Grundlage weniger Trainingssequenzen. Im Gegensatz zu bestehenden Arbeiten lernt das Verfahren einen latenten Merkmalsraum, der sowohl Emotionen als auch visuelle Variationen der Sprache erfasst, während gelernte Priors Animations-Artefakte und unrealistische Kopfbewegungen minimieren. Nach dem Training ist es möglich, realistische Sprachsequenzen zu erzeugen, während der latente Stil-Raum zusätzliche Gestaltungsmöglichkeiten bietet. Die vorgestellten Methoden bilden ein Komplettsystem für die realistische 3D-Modellierung, Animation und Darstellung von menschlichen Köpfen, das den Stand der Technik übertrifft. Dies wird in verschiedenen Experimenten, Ablations-/Nutzerstudien gezeigt und ausführlich diskutiert.</subfield></datafield><datafield tag="520" ind1="8" ind2=" "><subfield code="a">Englische Version: The recent trend of virtual reality (VR) has sparked new interest in human body modeling by offering new possibilities for entertainment, conferencing, and immersive applications (e.g., intelligent virtual assistants). Therefore, this dissertation presents new approaches to creating animatable and realistic 3D head models, animating human faces from text/speech, and the photo-realistic rendering of head models in real-time. To simplify complex 3D face reconstruction, a hybrid approach is introduced that combines a lightweight statistical head model for 3D geometry with dynamic textures. 
The model captures head orientation and large-scale deformations, while textures encode fine details and complex motions. A deep variational autoencoder trained on these textured meshes learns to synthesize realistic facial expressions from a compact vector. Additionally, a new neural-rendering technique is proposed that separates the head (foreground) from the background, providing more flexibility during inference (e.g., rendering on novel backgrounds) and simplifying the training process as no segmentation masks have to be pre-computed. This dissertation also presents a new neural-network-based approach to synthesizing novel face animations based on emotional speech videos of an actor. Unlike existing works, the proposed model learns a latent animation style space that captures emotions as well as natural variations in visual speech. Additionally, learned animation priors minimize animation artifacts and unrealistic head movements. After training, the animation model offers temporally consistent editing of the animation style according to the users’ needs. Together, the presented methods provide an end-to-end system for realistic 3D modeling, animation, and rendering of human heads. 
Various experimental results, ablation studies, and user evaluations demonstrate that the proposed approaches outperform the state-of-the-art.</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Rendering</subfield><subfield code="0">(DE-588)4219666-8</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Computeranimation</subfield><subfield code="0">(DE-588)4199710-4</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Dreidimensionale Rekonstruktion</subfield><subfield code="0">(DE-588)4150634-0</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Gesicht</subfield><subfield code="0">(DE-588)4020687-7</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="655" ind1=" " ind2="7"><subfield code="0">(DE-588)4113937-9</subfield><subfield code="a">Hochschulschrift</subfield><subfield code="2">gnd-content</subfield></datafield><datafield tag="689" ind1="0" ind2="0"><subfield code="a">Computeranimation</subfield><subfield code="0">(DE-588)4199710-4</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2=" "><subfield code="5">DE-604</subfield></datafield><datafield tag="689" ind1="1" ind2="0"><subfield code="a">Dreidimensionale Rekonstruktion</subfield><subfield code="0">(DE-588)4150634-0</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="1" ind2=" "><subfield code="5">DE-604</subfield></datafield><datafield tag="689" ind1="2" ind2="0"><subfield code="a">Gesicht</subfield><subfield code="0">(DE-588)4020687-7</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="2" ind2=" "><subfield 
code="5">DE-604</subfield></datafield><datafield tag="689" ind1="3" ind2="0"><subfield code="a">Rendering</subfield><subfield code="0">(DE-588)4219666-8</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="3" ind2=" "><subfield code="5">DE-604</subfield></datafield><datafield tag="776" ind1="0" ind2="8"><subfield code="i">Erscheint auch als</subfield><subfield code="n">Online-Ausgabe</subfield><subfield code="a">Paier, Wolfgang</subfield><subfield code="t">Hybrid methods for the analysis and synthesis of human faces</subfield><subfield code="o">10.18452/29350</subfield><subfield code="o">urn:nbn:de:kobv:11-110-18452/31090-5</subfield><subfield code="w">(DE-604)BV049955232</subfield></datafield><datafield tag="856" ind1="4" ind2="1"><subfield code="u">http://edoc.hu-berlin.de/18452/31090</subfield><subfield code="x">Verlag</subfield><subfield code="z">kostenfrei</subfield><subfield code="3">Volltext</subfield></datafield><datafield tag="883" ind1="0" ind2=" "><subfield code="8">1\p</subfield><subfield code="a">emakn</subfield><subfield code="c">0,15997</subfield><subfield code="d">20241119</subfield><subfield code="q">DE-101</subfield><subfield code="u">https://d-nb.info/provenance/plan#emakn</subfield></datafield><datafield tag="883" ind1="0" ind2=" "><subfield code="8">2\p</subfield><subfield code="a">emasg</subfield><subfield code="c">0,42053</subfield><subfield code="d">20241119</subfield><subfield code="q">DE-101</subfield><subfield code="u">https://d-nb.info/provenance/plan#emasg</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">ebook</subfield></datafield><datafield tag="943" ind1="1" ind2=" "><subfield code="a">oai:aleph.bib-bvb.de:BVB01-035380619</subfield></datafield></record></collection>
genre	(DE-588)4113937-9 Hochschulschrift gnd-content
genre_facet	Hochschulschrift
id	DE-604.BV050042916
illustrated	Illustrated
indexdate	2024-11-29T11:01:20Z
institution	BVB
language	English
oai_aleph_id	oai:aleph.bib-bvb.de:BVB01-035380619
open_access_boolean	1
owner	DE-11
owner_facet	DE-11
physical	xvi, 157 Seiten Illustrationen, Diagramme (farbig)
psigel	ebook
publishDate	2024
publishDateSearch	2024
publishDateSort	2024
record_format	marc
spelling	Paier, Wolfgang Verfasser (DE-588)1348623535 aut Hybrid methods for the analysis and synthesis of human faces von M.Sc. Wolfgang Paier Berlin [2024?] xvi, 157 Seiten Illustrationen, Diagramme (farbig) txt rdacontent n rdamedia nc rdacarrier Tag der mündlichen Prüfung: 02.09.2024 Der Text enthält eine Zusammenfassung in deutscher und englischer Sprache. Dissertation Humboldt-Universität zu Berlin 2024 Der Trend hin zu virtueller Realität (VR) hat neues Interesse an Themen wie der Modellierung menschlicher Körper geweckt, da sich neue Möglichkeiten für Unterhaltung, Konferenzsysteme und immersive Anwendungen bieten. Diese Dissertation stellt deshalb neue Ansätze für die Erstellung animierbarer/realistischer 3D-Kopfmodelle, zur computergestützten Gesichtsanimation aus Text/Sprache sowie zum fotorealistischen Echtzeit-Rendering vor. Um die 3D-Erfassung zu vereinfachen, wird ein hybrider Ansatz genutzt, der statistische Kopfmodelle mit dynamischen Texturen kombiniert. Das Modell erfasst Kopfhaltung und großflächige Deformationen, während die Texturen feine Details und komplexe Bewegungen kodieren. Anhand der erfassten Daten wird ein generatives Modell trainiert, das realistische Gesichtsausdrücke aus einem latenten Merkmalsvektor rekonstruiert. Zudem wird eine neue neuronale Rendering-Technik presentiert, die lernt den Vordergrund (Kopf) vom Hintergrund zu trennen. Das erhöht die Flexibilität während der Inferenz (z. B. neuer Hintergrund) und vereinfacht den Trainingsprozess, da die Segmentierung nicht vorab berechnet werden muss. Ein neuer Animationsansatz ermöglicht die automatische Synthese von Gesichtsvideos auf der Grundlage weniger Trainingssequenzen. Im Gegensatz zu bestehenden Arbeiten lernt das Verfahren einen latenten Merkmalsraum, der sowohl Emotionen als auch visuelle Variationen der Sprache erfasst, während gelernte Priors Animations-Artefakte und unrealistische Kopfbewegungen minimieren. 
Nach dem Training ist es möglich, realistische Sprachsequenzen zu erzeugen, während der latente Stil-Raum zusätzliche Gestaltungsmöglichkeiten bietet. Die vorgestellten Methoden bilden ein Komplettsystem für die realistische 3D-Modellierung, Animation und Darstellung von menschlichen Köpfen, das den Stand der Technik übertrifft. Dies wird in verschiedenen Experimenten, Ablations-/Nutzerstudien gezeigt und ausführlich diskutiert. Englische Version: The recent trend of virtual reality (VR) has sparked new interest in human body modeling by offering new possibilities for entertainment, conferencing, and immersive applications (e.g., intelligent virtual assistants). Therefore, this dissertation presents new approaches to creating animatable and realistic 3D head models, animating human faces from text/speech, and the photo-realistic rendering of head models in real-time. To simplify complex 3D face reconstruction, a hybrid approach is introduced that combines a lightweight statistical head model for 3D geometry with dynamic textures. The model captures head orientation and large-scale deformations, while textures encode fine details and complex motions. A deep variational autoencoder trained on these textured meshes learns to synthesize realistic facial expressions from a compact vector. Additionally, a new neural-rendering technique is proposed that separates the head (foreground) from the background, providing more flexibility during inference (e.g., rendering on novel backgrounds) and simplifying the training process as no segmentation masks have to be pre-computed. This dissertation also presents a new neural-network-based approach to synthesizing novel face animations based on emotional speech videos of an actor. Unlike existing works, the proposed model learns a latent animation style space that captures emotions as well as natural variations in visual speech. Additionally, learned animation priors minimize animation artifacts and unrealistic head movements. 
After training, the animation model offers temporally consistent editing of the animation style according to the users’ needs. Together, the presented methods provide an end-to-end system for realistic 3D modeling, animation, and rendering of human heads. Various experimental results, ablation studies, and user evaluations demonstrate that the proposed approaches outperform the state-of-the-art. Rendering (DE-588)4219666-8 gnd rswk-swf Computeranimation (DE-588)4199710-4 gnd rswk-swf Dreidimensionale Rekonstruktion (DE-588)4150634-0 gnd rswk-swf Gesicht (DE-588)4020687-7 gnd rswk-swf (DE-588)4113937-9 Hochschulschrift gnd-content Computeranimation (DE-588)4199710-4 s DE-604 Dreidimensionale Rekonstruktion (DE-588)4150634-0 s Gesicht (DE-588)4020687-7 s Rendering (DE-588)4219666-8 s Erscheint auch als Online-Ausgabe Paier, Wolfgang Hybrid methods for the analysis and synthesis of human faces 10.18452/29350 urn:nbn:de:kobv:11-110-18452/31090-5 (DE-604)BV049955232 http://edoc.hu-berlin.de/18452/31090 Verlag kostenfrei Volltext 1\p emakn 0,15997 20241119 DE-101 https://d-nb.info/provenance/plan#emakn 2\p emasg 0,42053 20241119 DE-101 https://d-nb.info/provenance/plan#emasg
spellingShingle	Paier, Wolfgang Hybrid methods for the analysis and synthesis of human faces Rendering (DE-588)4219666-8 gnd Computeranimation (DE-588)4199710-4 gnd Dreidimensionale Rekonstruktion (DE-588)4150634-0 gnd Gesicht (DE-588)4020687-7 gnd
subject_GND	(DE-588)4219666-8 (DE-588)4199710-4 (DE-588)4150634-0 (DE-588)4020687-7 (DE-588)4113937-9
title	Hybrid methods for the analysis and synthesis of human faces
title_auth	Hybrid methods for the analysis and synthesis of human faces
title_exact_search	Hybrid methods for the analysis and synthesis of human faces
title_full	Hybrid methods for the analysis and synthesis of human faces von M.Sc. Wolfgang Paier
title_fullStr	Hybrid methods for the analysis and synthesis of human faces von M.Sc. Wolfgang Paier
title_full_unstemmed	Hybrid methods for the analysis and synthesis of human faces von M.Sc. Wolfgang Paier
title_short	Hybrid methods for the analysis and synthesis of human faces
title_sort	hybrid methods for the analysis and synthesis of human faces
topic	Rendering (DE-588)4219666-8 gnd Computeranimation (DE-588)4199710-4 gnd Dreidimensionale Rekonstruktion (DE-588)4150634-0 gnd Gesicht (DE-588)4020687-7 gnd
topic_facet	Rendering Computeranimation Dreidimensionale Rekonstruktion Gesicht Hochschulschrift
url	http://edoc.hu-berlin.de/18452/31090
work_keys_str_mv	AT paierwolfgang hybridmethodsfortheanalysisandsynthesisofhumanfaces
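
The `fullrecord` field above stores the record as MARCXML in the `http://www.loc.gov/MARC21/slim` namespace. As a minimal sketch of how such a string could be read, the following Python uses only the standard library. The `SAMPLE` string is a hand-trimmed excerpt of the record (fields 001, 100, 245 and two of the 650 fields), and the helper names `subfields` and `summarize` are hypothetical, chosen for illustration; they are not part of the catalog software.

```python
import xml.etree.ElementTree as ET

# Namespace used by MARCXML ("MARC21 slim"), as declared in `fullrecord`.
NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Hand-trimmed excerpt of the record above (assumption: only a few fields kept).
SAMPLE = """<collection xmlns="http://www.loc.gov/MARC21/slim"><record>
<controlfield tag="001">BV050042916</controlfield>
<datafield tag="100" ind1="1" ind2=" "><subfield code="a">Paier, Wolfgang</subfield><subfield code="4">aut</subfield></datafield>
<datafield tag="245" ind1="1" ind2="0"><subfield code="a">Hybrid methods for the analysis and synthesis of human faces</subfield></datafield>
<datafield tag="650" ind1="0" ind2="7"><subfield code="a">Rendering</subfield></datafield>
<datafield tag="650" ind1="0" ind2="7"><subfield code="a">Gesicht</subfield></datafield>
</record></collection>"""


def subfields(record, tag, code):
    """Return the text of every subfield `code` in every datafield `tag`."""
    path = f"marc:datafield[@tag='{tag}']/marc:subfield[@code='{code}']"
    return [el.text for el in record.findall(path, NS)]


def summarize(xml_text):
    """Extract the control number, author, title and subjects of the first record."""
    root = ET.fromstring(xml_text)
    record = root.find("marc:record", NS)
    control = record.find("marc:controlfield[@tag='001']", NS)
    return {
        "id": control.text,
        "author": subfields(record, "100", "a"),
        "title": subfields(record, "245", "a"),
        "subjects": subfields(record, "650", "a"),
    }


if __name__ == "__main__":
    print(summarize(SAMPLE))
```

Repeatable fields such as 650 come back as lists, which is why `summarize` returns lists for author, title and subjects as well rather than assuming a single value.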
