Free keywords:
-
Abstract:
This paper presents a representation of visemes that defines a measure
of similarity between different visemes, and a system of viseme
categories. The representation is derived from a statistical data
analysis of feature points on 3D scans, using Locally Linear
Embedding (LLE). The similarity measure determines which available
viseme and triphones to use to synthesize 3D face animation for a
novel audio file. From a corpus of dynamic recorded 3D mouth
articulation data, our system is able to find the best suited sequence
of triphones over which to interpolate while reusing the
coarticulation information to obtain correct mouth movements over
time. Due to the similarity measure, the system can deal with
relatively small triphone databases and find the most appropriate
candidates. With the selected sequence of database triphones, we can
finally morph along the successive triphones to produce the final
articulation animation.
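The core of this pipeline, deriving a viseme similarity measure from an LLE embedding of feature points, can be sketched roughly as follows. This is a minimal illustration using scikit-learn's `LocallyLinearEmbedding` on synthetic data; the sample count, feature-point count, neighbor count, and embedding dimension are all illustrative assumptions, not the paper's actual parameters.

```python
import numpy as np
from sklearn.manifold import LocallyLinearEmbedding

# Hypothetical stand-in data: 60 viseme samples, each a flattened set
# of 40 3D feature points (40 * 3 = 120 coordinates per sample).
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 120))

# Embed the feature-point vectors with LLE; n_neighbors and
# n_components are illustrative choices.
lle = LocallyLinearEmbedding(n_neighbors=8, n_components=3,
                             random_state=0)
Y = lle.fit_transform(X)  # shape: (60, 3)

def viseme_distance(i: int, j: int) -> float:
    """Similarity measure: Euclidean distance between two visemes
    in the low-dimensional embedded space (smaller = more similar)."""
    return float(np.linalg.norm(Y[i] - Y[j]))

# Selecting the best database candidate for a target viseme then
# reduces to a nearest-neighbor query under this distance.
best = min(range(1, len(Y)), key=lambda j: viseme_distance(0, j))
```

In this setup, picking triphones for a novel utterance amounts to repeated nearest-neighbor queries in the embedded space, which is what lets a relatively small database still yield usable candidates.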
In an entirely data-driven approach, our automated procedure for
defining viseme categories reproduces the groups of related visemes
that are defined in the phonetics literature.
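A data-driven grouping of visemes like the one described above could be sketched by clustering in the embedded space. The data, neighbor count, embedding dimension, and number of clusters below are all hypothetical choices for illustration; the paper's actual procedure and parameters may differ.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.manifold import LocallyLinearEmbedding

# Hypothetical feature-point data: 60 viseme samples, 120 coordinates
# each (stand-in for measurements from 3D scans).
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 120))

# Embed with LLE, then cluster in the low-dimensional space to obtain
# viseme categories; 6 clusters is an illustrative choice.
Y = LocallyLinearEmbedding(n_neighbors=8, n_components=3,
                           random_state=0).fit_transform(X)
labels = AgglomerativeClustering(n_clusters=6).fit_predict(Y)

# Each label identifies one automatically derived viseme category.
```

On real articulation data, the resulting clusters would be compared against the viseme groups reported in the phonetics literature.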