Graphical Models

Volume 68, Issues 5–6, September–November 2006, Pages 402–414

Video motion analysis for the synthesis of dynamic cues and Futurist art

https://doi.org/10.1016/j.gmod.2006.05.003

Abstract

This paper presents new methods for stylising video to produce cartoon motion emphasis cues and modern art. Specifically, we introduce “dynamic cues” as a class of motion emphasis cue, encompassing traditional animation techniques such as anticipation and motion exaggeration. We describe methods for automatically synthesising such cues within video premised upon the recovery of articulated figures, and the subsequent manipulation of the recovered pose trajectories. Additionally, we show how our motion emphasis framework may be applied to emulate artwork in the Futurist style, popularised by Duchamp.

Introduction

The paper addresses the problem of stylising real-world video sequences to create animations. This problem comprises two principal technical challenges. First, how to generate stable artistic stylisations over the video (for example, an oil-painterly effect)? Second, how to emulate the motion emphasis cues used by traditional animators? Early attempts to solve the first problem suffered from distracting flicker [1], [2] that more recent approaches suppress [3], [4]. This paper focuses on the second problem of motion emphasis, which until recently has received little attention in the non-photorealistic rendering (NPR) literature. A limited range of motion emphasis effects has been produced from three-dimensional computer graphics models [5], [6], by motion capturing cartoons [7], or interactively from drawings [8] and video [9]; see [10] for a wider review. Of greatest relevance to this paper is previous work by the authors addressing the production of both augmentation cues and deformation cues in real video [11]. The contribution of this paper is to extend the analytic framework required for augmentation and deformation cues so that dynamic cues can be automatically produced. Furthermore, the Futurist school of painting, typified by Duchamp, can be emulated; this too is a unique contribution to NPR.

Traditional animators emphasise motion with a variety of cues that are familiar to anyone who has watched animations. Streak-lines depicting the paths of objects, and ghosting effects that echo trailing edges, are both examples of what we call augmentation cues: the animation is visually augmented with marks of some kind. Animated objects may stretch as they accelerate, squash as they slow down, or bend to show drag or inertia—we call these deformation cues. Furthermore, objects may “anticipate” movement by a slight prior movement backwards, or move in a characteristic way that exaggerates ordinary motion. These latter cues we call dynamic cues. Examples of these cues are illustrated in Fig. 1. A deeper understanding of the differences between them relies on a definition of pose trajectory, as we now explain.

At any given instant in time an object has a particular pose, typically specified by a vector of numbers (for example, inter-joint orientations and world position). As this pose vector changes in time we obtain a pose trajectory. Augmentation cues and deformation cues are rendered as a function of the pose trajectory. Dynamic cues differ because they alter the pose trajectory itself. This makes rendering dynamic cues very difficult because both the pose and the timing of the object may change: poor rendering could leave “gaps” in the video, for example. Furthermore, generating dynamic cues is non-trivial: a cartoon character can “wind up to run” in a way that is unique to that character. The essential simplicities that bind the set of dynamic cues are very difficult to find.
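To make the distinction concrete, consider the following minimal sketch (Python with NumPy; every identifier is an illustrative choice of ours, not code from the system described here). A pose trajectory is stored as a time-indexed array of pose vectors; an augmentation cue only reads the trajectory, whereas a dynamic cue returns a new trajectory, so both pose and timing may change:

```python
import numpy as np

# Pose trajectory: row t holds the pose vector at frame t
# (e.g. world position plus inter-joint orientations).
T, D = 120, 8
p = np.zeros((T, D))  # placeholder motion

def ghost_poses(p, trail=5):
    """Augmentation cue: rendered as a FUNCTION of the trajectory.
    For each frame, return the recent poses to echo as ghosts."""
    return [p[max(0, t - trail):t + 1] for t in range(len(p))]

def retime(p, gamma=1.2):
    """Dynamic cue: maps the trajectory to a NEW trajectory.
    A nonlinear time warp alters timing as well as the pose samples."""
    idx = np.arange(len(p), dtype=float)
    warped = (idx / idx[-1]) ** gamma * idx[-1]
    return np.column_stack(
        [np.interp(warped, idx, p[:, d]) for d in range(p.shape[1])]
    )
```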

Our purpose here is to provide an initial in-road into an understanding of dynamic cues. To this end we show how to generate and analyse a pose trajectory to produce the following (the first of which is sketched in code after the list):

  • anticipation effects;

  • motion “caricaturing”, e.g. exaggeration effects;

  • Futurist-like stills, in a style reflecting that of Duchamp.
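As a concrete reading of the first effect (the procedure below is a plausible sketch of ours, not the algorithm developed later in the paper), anticipation can be synthesised by prepending a brief, attenuated, reversed excerpt of the opening motion, so that the figure pulls back before it moves:

```python
import numpy as np

def anticipate(p, frames=10, strength=0.4):
    """Hypothetical anticipation cue: before the action starts, move
    briefly AGAINST the initial direction of travel, then release.

    p        : (T, D) pose trajectory
    frames   : length of the anticipation, in frames
    strength : fraction of the opening displacement to pull back by
    """
    direction = p[frames] - p[0]                          # opening displacement
    w = np.sin(np.linspace(0.0, np.pi, frames))[:, None]  # ease in and out
    pullback = p[0] - strength * w * direction            # poses behind the start
    return np.vstack([pullback, p])                       # timing changes too
```

Note that the result is longer than the input: a dynamic cue is free to alter timing as well as pose.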

Our broad approach is to track polygons fitted around rigid objects so as to estimate their pose trajectory. This is analysed to construct a hierarchical articulated figure of rigid parts, with its pose trajectory (Section 2). The dynamic cues we produce from this (Section 3) integrate fully with our earlier published framework for synthesising augmentation and deformation cues [11]. Further, all motion emphasis cues integrate with our stable video stylisation technique [3]. Therefore, the contribution of this paper completes our work in the automated production of animations from real-world video; see [10] for a full description of our Video Paintbox.
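The tracking mathematics are deferred to Section 2, but the core step of turning tracked polygon vertices into a rigid part's frame-to-frame pose can be sketched as a standard least-squares rigid (Procrustes/Kabsch) fit; the code below is our own illustration of that textbook fit, not the authors' tracker:

```python
import numpy as np

def rigid_fit(src, dst):
    """Least-squares 2D rigid transform (rotation R, translation t)
    mapping polygon vertices src (N, 2) onto dst (N, 2)."""
    cs, cd = src.mean(axis=0), dst.mean(axis=0)
    H = (src - cs).T @ (dst - cd)           # 2x2 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))  # guard against reflections
    R = Vt.T @ np.diag([1.0, d]) @ U.T
    t = cd - R @ cs
    return R, t

# Applied frame-by-frame, (R, t) samples the part's pose trajectory.
```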

Section snippets

Recovering articulated structure

Our problem is to recover the motion of an articulated figure—a doll—from monocular video. The doll is to be built from rigid parts and have a hierarchical structure. The hierarchy is a tree in which each part corresponds to a tree node. Two nodes are linked in the tree if they are physically connected by a pivot.
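A minimal sketch of such a doll, assuming only what is stated above (rigid parts as tree nodes, edges at pivots); the field names are ours:

```python
from dataclasses import dataclass, field

@dataclass
class Part:
    """One rigid part of the doll: a node in the hierarchy tree."""
    name: str
    pivot: tuple[float, float]    # attachment point on the parent
    angle: float = 0.0            # orientation relative to the parent
    children: list["Part"] = field(default_factory=list)

def pose_vector(part):
    """Depth-first flatten of the joint angles into one pose vector."""
    v = [part.angle]
    for child in part.children:
        v.extend(pose_vector(child))
    return v

# Example: a torso root with a two-segment arm hanging off a pivot.
torso = Part("torso", pivot=(0.0, 0.0))
upper_arm = Part("upper_arm", pivot=(0.1, 0.4))
upper_arm.children.append(Part("forearm", pivot=(0.0, 0.3)))
torso.children.append(upper_arm)
```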

Humans are an important class of articulated figures, and the recovery of human motion from video sequences is a well-researched problem, see Hicks for a review [12]. Briefly, most

Dynamic cues and modern art

Given a recovered doll, we can produce not only dynamic cues as seen in traditional animations, but also emulate the Futurist style of modern art. So far as we are aware, both represent unique contributions.
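The rendering details follow later; purely to illustrate the underlying idea of a Futurist still (superimposing the figure at successive instants, in the manner of Duchamp's Nude Descending a Staircase), one might composite time-sampled renderings with fading opacity. Everything below, including the render_pose callable, is hypothetical:

```python
import numpy as np

def futurist_still(render_pose, trajectory, samples=8):
    """Composite several instants of a pose trajectory into one image,
    older poses fading out: a multiple-exposure, Futurist-like still.

    render_pose : hypothetical callable mapping one pose vector to an
                  (H, W, 3) float image of the figure on black.
    """
    idx = np.linspace(0, len(trajectory) - 1, samples).astype(int)
    image = None
    for rank, i in enumerate(idx):
        alpha = (rank + 1) / samples            # newest pose strongest
        layer = alpha * render_pose(trajectory[i])
        image = layer if image is None else np.maximum(image, layer)
    return image
```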

As mentioned, the general form of a dynamic cue is a map from one pose trajectory into another:

p′(t) = F[p(t)]

The new pose trajectory p′(t) is used to govern all other cues, so that objects can be augmented and deformed. Again as mentioned, a full understanding of dynamic cues eludes us at the present
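Even without a full theory, one simple concrete instance of F can be written down (our own illustration, not necessarily the operator the authors use): amplify the trajectory's departure from a smoothed copy of itself, p′(t) = p̄(t) + k(p(t) − p̄(t)) with k > 1, which exaggerates the motion while preserving its gross path:

```python
import numpy as np

def exaggerate(p, k=1.5, window=9):
    """Illustrative dynamic cue F: amplify deviation from a moving-average
    smoothing of the trajectory, exaggerating the motion for k > 1."""
    kernel = np.ones(window) / window
    smooth = np.column_stack(
        [np.convolve(p[:, d], kernel, mode="same") for d in range(p.shape[1])]
    )
    return smooth + k * (p - smooth)  # p'(t) = p̄(t) + k (p(t) − p̄(t))
```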

Concluding remarks

This paper described our initial steps towards automatically synthesising dynamic cues from video, focusing on anticipation and motion exaggeration. Whether the principles we have introduced in addressing these cases generalise easily is unknown. It is likely that inverse kinematics of some kind will play a major role in automating anticipation, although whether pose analysis will ever be of sufficient power to produce the necessary key-frames is an open problem.

As presented, our framework for

References (22)

  • J.K. Aggarwal et al., Human motion analysis: a review, Comput. Vis. Image Understanding: CVIU (1999).
  • P. Litwinowicz, Processing images and video for an impressionist effect, in: Proc. 24th Intl. Conference on Computer...
  • A. Hertzmann, K. Perlin, Painterly rendering for video and interaction, in: Proc. 1st ACM Symposium on...
  • J.P. Collomosse et al., Stroke surfaces: temporally coherent artistic animations from video, IEEE Trans. Vis. Comput. Graph. (2005).
  • J. Wang, Y. Xu, H.-Y. Shum, M. Cohen, Video tooning, in: Proc. ACM SIGGRAPH, 2004, pp....
  • S. Chenney, M. Pingel, R. Iverson, M. Szymanski, Simulating cartoon style animation, in: Proc. 2nd ACM Symposium on...
  • M. Brand, A. Hertzmann, Style machines, in: ACM SIGGRAPH, 2000, pp....
  • C. Bregler, L. Loeb, E. Chuang, H. Deshpande, Turning to the masters: motion capturing cartoons, in: Proc. 29th Intl....
  • T. Strothotte, B. Preim, A. Raab, J. Schumann, D.R. Forsey, How to render frames and influence people, in: Proc....
  • A. Agarwala, A. Hertzmann, D. Salesin, S. Seitz, Keyframe-based tracking for rotoscoping and animation, in: Proc. ACM...
  • J.P. Collomosse, Higher level techniques for the artistic rendering of images and video, Ph.D. thesis, University of...