iFACE: Interactive Face Animation - Comprehensive Environment

Research Collaborators
Steve DiPaola, Ali Arya

Rapid growth of visual communication systems, from video phones to virtual agents in games and web services, has brought a new generation of multimedia systems that we refer to as face-centric. Such systems are mainly concerned with multimedia representation of facial activities. Examples include video messaging on cell phones and online customer support agents. Each of these applications has its own domain-specific algorithms, data, and control mechanisms, but they all share a front end that has to support a well-defined set of face-related capabilities. The concept of a “face multimedia object” arises from the need to encapsulate the common requirements of such applications in a single component. We introduce Interactive Face Animation – Comprehensive Environment (iFACE) as a framework for the Face Multimedia Object (FMO). Like any other object, an FMO includes data objects of less complicated types (audio, video, etc.) and needs to provide proper services and interfaces to access and process those objects. iFACE integrates:

  • Hierarchical 3D head model for controlling facial actions from vertex to feature-group levels
  • 2D image transformations for direct image-based animation
  • Text-to-speech audio generation and lip-sync
  • Structured content description language
  • Streaming components
  • Wrapper objects to simplify development
  • Behavioural modeling
  • Proper interfaces to access the underlying objects from a variety of client applications

Face Multimedia Object

The Face Multimedia Object (FMO) encapsulates all the functionality and data related to facial actions. It forms a layer of abstraction on top of the underlying details to simplify and streamline the creation of facial animations, including video, audio, timing and streaming-related data, control signals and events, and possibly descriptions (textual, metadata, etc.). From a multimedia system point of view, in addition to obvious application-dependent features such as realism, we consider the following requirements for FMO:

  • Hierarchical Geometry: Face animation is the dynamic manipulation of geometry, regardless of how we define it (e.g. pixels of a 2D image or vertices of a 3D model). The important attribute of this geometry is the meaningful relations and grouping of its elements to form facial features and regions. Each of these groups provides a certain amount of detail and allows a certain type of manipulation. For instance, a vertex can be moved to any new location by itself, but moving a feature (e.g. opening the mouth) will move all related vertices. FMO has to expose proper layers of geometry for different types of access, mainly:
      o Vertex-level (or pixel-level)
      o Feature-level
      o Region-level
  • Timeliness: The face is a time-based object. Facial actions need to be synchronized (in parallel, in sequence, etc.) with each other and with external events.
  • Expressiveness: FMO should be able to express a variety of characters by personalizing the geometry and performing speech, movements, and emotions.
  • Interactivity: Like any other software object, FMO needs proper interfaces to expose its services to different types of clients, such as:
      o GUI applications
      o HTML-based web pages
      o Web service clients
  • Behavioural Model: The actions of an agent (like those of people) are mainly based on a stimulus-response model. In the simplest case, behavioural rules, stored as individual “knowledge”, determine the proper response to any stimulus. In a more general context, “personality” and “mood” also affect behaviour. An ideal behavioural model for FMO supports defining and dynamically changing all three of these factors.

Head Model

The iFACE head model uses an abstracted hierarchy consisting of the Head as the top object, followed by Regions, Feature Lines and Points, and Physical Points, as shown above.
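
To make the layered access concrete, here is a minimal Python sketch of such a hierarchy. It is an illustration only, not the iFACE API: the class names, the `move` and `find_feature` methods, and the sample coordinates are all assumptions made for this example. It shows that moving a feature moves every point it groups, while a single point can still be moved on its own.

```python
from dataclasses import dataclass, field


@dataclass
class Point:
    """A physical point of the model (hypothetical stand-in for a vertex)."""
    x: float
    y: float
    z: float

    def move(self, dx, dy, dz):
        self.x += dx
        self.y += dy
        self.z += dz


@dataclass
class Feature:
    """A named group of points, e.g. a lip line (names are illustrative)."""
    name: str
    points: list = field(default_factory=list)

    def move(self, dx, dy, dz):
        # Moving a feature moves all of its related points.
        for p in self.points:
            p.move(dx, dy, dz)


@dataclass
class Region:
    """A named group of features, e.g. the mouth area."""
    name: str
    features: list = field(default_factory=list)

    def move(self, dx, dy, dz):
        for f in self.features:
            f.move(dx, dy, dz)


@dataclass
class Head:
    """Top object of the hierarchy."""
    regions: list = field(default_factory=list)

    def find_feature(self, name):
        for r in self.regions:
            for f in r.features:
                if f.name == name:
                    return f
        raise KeyError(name)


lower_lip = Feature("lower_lip", [Point(-1.0, 0.0, 0.0), Point(1.0, 0.0, 0.0)])
upper_lip = Feature("upper_lip", [Point(-1.0, 0.5, 0.0), Point(1.0, 0.5, 0.0)])
head = Head([Region("mouth", [lower_lip, upper_lip])])

# Feature-level access: "open the mouth" by lowering the whole lower lip.
head.find_feature("lower_lip").move(0.0, -0.3, 0.0)

# Point-level access: nudge a single point on the upper lip.
upper_lip.points[0].move(0.0, 0.1, 0.0)
```

A region-level call such as `head.regions[0].move(...)` would propagate the same way through every feature in the region.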

Face Modeling Language

Face Modeling Language (FML) is a Structured Content Description mechanism based on eXtensible Markup Language (XML). The main ideas behind FML are:

  • Hierarchical representation of face animation (from frames to simple moves, to meaningful actions, and finally stories)
  • Timeline definition of the relation between facial actions and external events (parallel and sequential actions, and also choice of one action from a set based on an external event)
  • Defining capabilities, behavioural templates, and models (FML is independent of the type of model but provides means of defining it.)
  • Compatibility with MPEG-4 (MPEG-4 FAPs are supported explicitly, and FDPs implicitly by general-purpose model definition mechanisms.)
  • Compatibility with XML and related web technologies

FML time containers allow different temporal relations between facial actions. par and seq combine facial actions in parallel or sequential ways. Using the excl (exclusive) time container, one option among a group of facial actions can be selected. Time containers are illustrated in the following example where, based on an external user input, a head yaw and opening the mouth can be done in parallel or one after another:

sample FML, movie of the first choice, movie of the second choice
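
The timing semantics of the three containers can also be sketched in a few lines of Python. This is not FML and not the iFACE scheduler: the classes `Action`, `Par`, `Seq` and `Excl`, the `schedule` function, the integer event values, and the durations are hypothetical and only mirror the behaviour described above (par starts all children together, seq chains them, excl picks one child based on an external event).

```python
from dataclasses import dataclass


@dataclass
class Action:
    name: str
    duration: float  # seconds (illustrative values only)


@dataclass
class Par:
    children: list  # all children start at the same time


@dataclass
class Seq:
    children: list  # each child starts when the previous one ends


@dataclass
class Excl:
    options: dict  # external event value -> the one child to play


def schedule(node, start, event_value):
    """Return (timeline, end_time); timeline is a list of (name, start, end)."""
    if isinstance(node, Action):
        end = start + node.duration
        return [(node.name, start, end)], end
    if isinstance(node, Par):
        timeline, end = [], start
        for child in node.children:
            sub, child_end = schedule(child, start, event_value)
            timeline += sub
            end = max(end, child_end)
        return timeline, end
    if isinstance(node, Seq):
        timeline, t = [], start
        for child in node.children:
            sub, t = schedule(child, t, event_value)
            timeline += sub
        return timeline, t
    if isinstance(node, Excl):
        # Exactly one option is selected by the external event.
        return schedule(node.options[event_value], start, event_value)
    raise TypeError(f"unknown node type: {type(node).__name__}")


yaw = Action("head_yaw", 1.0)
mouth = Action("open_mouth", 0.5)

# User input 0 selects the parallel version, 1 the sequential version.
story = Excl({0: Par([yaw, mouth]), 1: Seq([yaw, mouth])})

parallel, parallel_end = schedule(story, 0.0, event_value=0)
sequential, sequential_end = schedule(story, 0.0, event_value=1)
```

With these durations, the parallel choice starts both actions at time 0, while the sequential choice starts the mouth opening only after the head yaw has finished.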


Behavioural Modeling

We consider a facial presentation to be a function of the following groups of parameters:

  • Geometry (a hierarchy of modules on top of 2D or 3D data)
  • Knowledge (including stimulus-response rules of interaction)
  • Personality (long term individual characteristics)
  • Mood (generally transient emotions and sensations)

Although they have combined, and sometimes closely intertwined, effects, each of these groups can be considered an independent “dimension” of face behaviour, as shown in the above figure.
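
As a rough illustration of keeping these dimensions separate, the Python sketch below stores knowledge, personality, and mood as independent fields and combines them only when a response is computed. Everything here is assumed for the example: the `Behaviour` class, the trait names `expressiveness` and `energy`, and the multiplicative blending rule are not taken from iFACE. Geometry is left out to keep the sketch short.

```python
from dataclasses import dataclass, field


@dataclass
class Behaviour:
    # Knowledge: stimulus -> response rules.
    knowledge: dict = field(default_factory=dict)
    # Personality: long-term traits in the range 0..1 (hypothetical names).
    personality: dict = field(default_factory=dict)
    # Mood: transient state in the range 0..1 (hypothetical names).
    mood: dict = field(default_factory=dict)

    def respond(self, stimulus):
        """Pick an action from knowledge; scale it by personality and mood."""
        action = self.knowledge.get(stimulus, "neutral")
        # Assumed blending rule: intensity is the product of one trait
        # from each dimension. A real model would likely be richer.
        intensity = (self.personality.get("expressiveness", 0.5)
                     * self.mood.get("energy", 1.0))
        return action, intensity


agent = Behaviour(
    knowledge={"greeting": "smile"},
    personality={"expressiveness": 0.8},
    mood={"energy": 0.5},
)

first = agent.respond("greeting")

# Mood can change at run time without touching knowledge or personality.
agent.mood["energy"] = 1.0
second = agent.respond("greeting")
```

The same stimulus yields the same action in both calls, but with a different intensity once the mood changes, which is the kind of independence between dimensions the text describes.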


rFace and ShowFace are two independent face animation projects (based on 3D and 2D methods, respectively). The iFACE project grew out of the experience with these two systems and the lessons learnt from them.

Downloads and Links
WSCG-06 paper: “Socially Communicative Characters for Interactive Applications,” WSCG (2006)
EVA-05 paper: “Socially Expressive Communication Agents: A Face-centric Approach,” European Conference on Electronic Imaging and the Visual Arts (2005)
WIAMIS-04 paper: “Face as a Multimedia Object,” 5th International Workshop on Image Analysis for Multimedia Interactive Services (2004)

iFACE Executables (2MB) v1.992. Unzip in c:\iFace, and read iFaceDescription.htm first. Requires DirectX v9.0c and the .NET Framework.
Extra Heads and Textures (31MB)  Unzip in c:\iFaceHiRes.
.NET Framework Installer (23MB)  After installing iFACE, run TestDotNet.exe and if there is an error, download and install .NET.
DirectX 9.0c Installer (34MB)  After installing iFACE, run TestD3D.exe and if there is an error, download and install DirectX.
rFace Demo Toolkit (8MB)  This is a demo version for our Face technology (all required DLLs included).

Additional info and demo movies in research/facetoolkit.
This project uses character animation software from Visage Technologies AB under the free Academic License.