An Architecture of 3D Interaction for Models Deformed on the GPU

Doctoral work of Harlen Costa Batagelo, advised by Dr.-Ing. Wu, Shin-Ting
Department of Computer Engineering and Industrial Automation (DCA)
School of Electrical and Computer Engineering (FEEC)
State University of Campinas (Unicamp)

Introduction

This doctoral work develops a software architecture that supports efficient direct manipulation of models deformed on the GPU. It is motivated by the growing number of applications that use vertex and pixel shaders to deform geometry yet require accurate interaction with the models as they appear after deformation in the rendering pipeline. The goal of the architecture is to provide the basic functionalities of cursor picking and cursor snapping without requiring a copy of the 3D models in system memory, i.e., the same models used for rendering are also used for interaction. The architecture thus provides a foundation for interaction tasks that are both efficient, because most of the interaction processing is performed on the GPU, and accurate with respect to what is actually displayed.

The architecture is designed mainly for games and game asset editors in which the user needs accurate and efficient picking and snapping on meshes deformed in real time, such as skinned mesh models, surfaces with per-vertex and per-pixel displacements (e.g., relief mapping, parallax occlusion mapping) and meshes under arbitrary deformations. The architecture produces pixel-exact results and therefore ensures that the user interacts with what is actually seen on screen, even if the original model submitted to the rendering pipeline has a very different final appearance.

Project news (2008-04-12)

We are adapting the proposed interaction architecture to support volume data. The goal is to allow the direct manipulation of volume data processed on the GPU and rendered as isosurfaces or by direct volume rendering. As with surfaces, the only data needed for this interaction are those already stored in video memory for rendering. For volume data, these are the 3D textures containing the raw volume data and the gradient data.

Below are a screenshot and a video of a prototype application that demonstrates snapping a cursor to isosurfaces of volume data rendered by ray casting on the GPU.

Snapping to isosurfaces
This video illustrates snapping a triad cursor to isosurfaces extracted from volume data on the GPU. The red axis of the triad cursor shows the direction of the normal vector at the snapped isosurface point.
AVI (26.8 MB)

More results about interaction with volume data are available here.

Thesis download

Batagelo, H. C.; Uma Arquitetura de Suporte a Interações 3D Integrada a GPU, Ph.D. thesis, State University of Campinas, Campinas, SP, Brazil, July, 2007. In Portuguese.
PDF (43.7 MB).

Videos

	Picking: This video illustrates the most basic interaction task supported by the architecture: selection. A set of intersecting knot models is displayed and the knot model selected by the cursor is highlighted. AVI (1.22 MB)
	Interaction maps: This video illustrates the technique of interaction maps proposed by J. S. Pierce and R. Pausch (2003). An event is triggered whenever the mouse cursor hovers over a texture-mapped colored spot. AVI (2.03 MB)
	Face picking: In this video, the architecture returns the identifiers of all faces that would be intersected by the selection ray in a ray-picking approach. Here, no explicit intersection test is computed. AVI (2.38 MB)
	Surface snapping: A triad cursor snaps to a mesh surface and aligns its axes with the tangent basis calculated at each point. AVI (4.23 MB)
	Snapping to borders: The 3D cursor (red sphere) snaps to surface points corresponding to borders detected by a post-processing filter. A small quad around the cursor shows the non-null pixels of the “region of interest,” which are the pixels of the detected borders. AVI (1.19 MB)
	Snapping to principal directions: The movement of a triad cursor snapped to the surface is constrained to the direction of minimum curvature. The surface is painted to highlight the trajectory of the cursor. AVI (4.35 MB)
	Painting and sculpting on a relief mapped quad: A triad cursor snaps to a quad rendered with relief mapping while the user interactively paints the surface and sculpts the height field. The cursor correctly takes into account the depth of the height field and the direction of the normal vector. AVI (7.55 MB)
	Geometric snapping: When the user presses a key, the triad cursor snaps to a point on the model's surface, near the 2D cursor, where the median curvature is greater. AVI (3.9 MB)

Development history and publications

Development of the architecture started in 2002, inspired by previous work on interaction via direct manipulation. In particular, the architecture followed the philosophy of the MTK (Manipulation Toolkit) project, which explored the idea of making the processing of interaction an application-independent task. As we observed the fast advance of shader languages for geometry modeling tasks, however, the problem of interacting with models deformed on graphics hardware became our main focus. Performing all the interaction computation on the GPU seemed to be the best way both to interact efficiently with deformed geometry and to keep the interaction application-independent.

In 2005 we implemented the basic idea of the architecture, which consists of computing geometry attributes on the GPU after the deformations (only vertex normals at that time), storing these attributes in render targets and using them to perform picking and surface snapping on geometry arbitrarily deformed on the GPU.

The work was greatly extended in 2006 into a full interaction architecture implemented as a simple function library. Besides vertex normals, the architecture was now able to compute vertex tangents and bitangents for detail mapping algorithms. More recently, we proposed a novel algorithm for computing elements of discrete differential geometry of 2nd and 3rd orders. These include the tensor of curvature, the tensor of curvature derivative, the principal directions and the principal curvatures. This choice of attributes resulted from an analysis of several direct manipulation tasks, which showed that differential geometry attributes and user-defined attributes for each surface point suffice for most interaction tasks using a 2D cursor.
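To illustrate how principal curvatures and principal directions relate to a tensor of curvature, the following is a minimal C++ sketch and not the algorithm proposed in the publications. It assumes the tensor at a surface point has already been reduced to a symmetric 2x2 matrix [[a, b], [b, c]] expressed in an orthonormal basis of the tangent plane, and it uses the standard closed-form eigendecomposition of such a matrix. The names PrincipalCurvatures and principalCurvatures are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical result type: principal curvatures and their unit directions,
// the directions being given in the 2D coordinates of the tangent plane.
struct PrincipalCurvatures {
    double kMax;       // largest eigenvalue of the tensor
    double kMin;       // smallest eigenvalue of the tensor
    double dirMax[2];  // unit eigenvector associated with kMax
    double dirMin[2];  // unit eigenvector associated with kMin
};

// Eigendecomposition of the symmetric tensor [[a, b], [b, c]].
// Assumption: a, b, c are expressed in an orthonormal tangent basis.
PrincipalCurvatures principalCurvatures(double a, double b, double c) {
    assert(std::isfinite(a) && std::isfinite(b) && std::isfinite(c));

    const double mean     = 0.5 * (a + c);
    const double halfDiff = 0.5 * (a - c);
    const double radius   = std::sqrt(halfDiff * halfDiff + b * b);

    PrincipalCurvatures r;
    r.kMax = mean + radius;
    r.kMin = mean - radius;

    // The rotation angle theta satisfying tan(2 * theta) = 2b / (a - c)
    // diagonalizes the tensor; atan2 selects the branch that maps
    // (cos(theta), sin(theta)) to the direction of kMax.
    const double theta = 0.5 * std::atan2(2.0 * b, a - c);
    r.dirMax[0] =  std::cos(theta);
    r.dirMax[1] =  std::sin(theta);
    r.dirMin[0] = -std::sin(theta);
    r.dirMin[1] =  std::cos(theta);
    return r;
}
```

At umbilic points (a equal to c and b equal to zero) both curvatures coincide and the returned directions are an arbitrary orthonormal pair, so an interaction task that constrains movement to a principal direction would need to handle that case separately.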
The following works were published during the thesis writing:

Batagelo, H. C.; Wu, S. T.; Application-Independent 3D Interaction Using Geometry Attributes Computed on the GPU, In Proc. of the 20th Brazilian Symposium on Computer Graphics and Image Processing, pp. 19-26, IEEE CS Press, Belo Horizonte, MG, Brazil, October 2007.
PDF (1.28 MB).
Batagelo, H. C.; Wu, S. T.; Estimating Curvatures and Their Derivatives on Meshes of Arbitrary Topology from Sampling Directions, The Visual Computer, 23:9-11, pp. 803-812, Springer, June 2007.
PDF (3.17 MB). Project’s home page.
Batagelo, H. C.; Wu, S. T.; What You See Is What You Snap: Snapping to Geometry Deformed on the GPU, In Proc. of the 2005 ACM SIGGRAPH Symposium on Interactive 3D Graphics and Games, pp. 81-86, ACM Press, Washington, DC, April 2005.
PDF (915 KB). Project’s home page.
Wu, S. T.; Abrantes, M.; Tost, D.; Batagelo, H. C.; Picking and Snapping for 3D Input Devices, In Proceedings of the 16th Brazilian Symposium on Computer Graphics and Image Processing, pp. 140-147, IEEE CS Press, São Carlos, SP, Brazil, October 2003.
PDF (1.3 MB). Presentation slides (100.4 KB).

Features

The basic idea of the architecture is to use the actual rendering pipeline to compute all the attributes necessary to perform the fundamental direct manipulation actions (picking and snapping). To do so, the architecture renders the objects of interest onto off-screen buffers in additional render passes, using a set of shaders that compute the required attributes. The objects of interest are the objects that the user may interact with, and are the same objects used for the actual rendering. Instead of producing an image for visualization, these passes write color-encoded geometry attributes of the mesh (e.g., position, normal vector and curvatures) to the pixels of the render targets. The contents of the off-screen buffers are read back by the CPU and the geometric information is made available to the application, which is responsible for actually using it to perform the interaction task. The current version of the architecture has the following features:

Works with any primitive handled by the graphics pipeline (points, lines, triangles).
Computes the following per-fragment attributes of the models on the GPU: 3D position in object space; depth in camera space; normal vector; tangent vector; bitangent vector; tensor of curvature; principal curvatures; principal directions and tensor of curvature derivative. Normal vector and curvatures are computed only for triangle meshes.
Allows per-vertex application-defined attributes for both indexed and non-indexed versions of the geometry. The application can attach attributes such as texture coordinates, weights of barycentric coordinates, vertex IDs, face IDs and model IDs to each vertex of the geometry.
Computation of attributes is constrained to a resizable rectangular region of interest around the hotspot of the 2D cursor.
Allows deformation shaders in both the vertex processor and the fragment processor.
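The CPU side of a snapping task built on these features can be sketched as follows. This is a minimal illustration, not code from the library: it assumes the region of interest has already been read back and decoded into an array of per-pixel attributes, and it snaps to the covered pixel nearest to the cursor hotspot. The types AttributePixel and RegionOfInterest and the function snapToNearest are hypothetical names introduced for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU-side view of one pixel read back from the render targets.
struct AttributePixel {
    bool  covered;      // false where no object of interest was rendered
    float position[3];  // decoded 3D position in object space
    float normal[3];    // decoded normal vector
};

// Hypothetical container for the rectangular region of interest around the
// cursor hotspot, stored in row-major order.
struct RegionOfInterest {
    int width;
    int height;
    std::vector<AttributePixel> pixels;  // width * height entries
};

// Returns the index of the covered pixel closest to the hotspot (hx, hy),
// given in region-local pixel coordinates, or -1 if no pixel is covered.
int snapToNearest(const RegionOfInterest& roi, int hx, int hy) {
    assert(roi.pixels.size() ==
           static_cast<std::size_t>(roi.width) * static_cast<std::size_t>(roi.height));

    int  best      = -1;
    long bestDist2 = 0;
    for (int y = 0; y < roi.height; ++y) {
        for (int x = 0; x < roi.width; ++x) {
            const int index = y * roi.width + x;
            if (!roi.pixels[static_cast<std::size_t>(index)].covered) {
                continue;  // null pixel: nothing to snap to here
            }
            const long dx = x - hx;
            const long dy = y - hy;
            const long dist2 = dx * dx + dy * dy;
            if (best < 0 || dist2 < bestDist2) {
                best      = index;
                bestDist2 = dist2;
            }
        }
    }
    return best;
}
```

The application would then place the 3D cursor at the position of the returned pixel and orient it using the decoded normal. Other snapping criteria, such as preferring pixels flagged by a border filter or pixels with high curvature, would replace the distance test with a different score.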

Efficiency

Performing a picking task by rendering the geometry to an off-screen buffer and then reading back a single pixel is considerably faster than evaluating ray-triangle intersections on the CPU. This favorable result is mainly due to the higher stream processing power of today’s GPUs compared to CPUs. The additional render passes are usually very efficient because only a small screen area is relevant to the application (normally a single pixel). For the same reason, the CPU stalls caused by reading back the render targets are negligible in most cases.
The difference in efficiency is even larger when the time to deform the geometry is taken into account: the GPU deforms the rendered model in any case, whereas CPU-based picking must also deform a copy of the geometry on the CPU. Although deforming a copy on the CPU is the traditional approach to picking deformed meshes today, it performs poorly compared to the approach used by our architecture.
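To make the single-pixel read-back concrete, the sketch below shows one possible way of color-encoding an identifier so that picking reduces to decoding one pixel. It is an illustration only: it assumes an 8-bit RGBA render target and a 24-bit ID, which is not necessarily the encoding used by the library, and the names Rgba8, encodeId and decodeId are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical 8-bit RGBA pixel as it would be read back from a render target.
using Rgba8 = std::array<std::uint8_t, 4>;

// Packs a 24-bit identifier into the RGB channels. Alpha is set to 255 to
// mark the pixel as covered; the cleared background is assumed to have alpha 0.
Rgba8 encodeId(std::uint32_t id) {
    assert(id <= 0xFFFFFFu);  // only 24 bits fit into three 8-bit channels
    return Rgba8{{static_cast<std::uint8_t>(id & 0xFFu),
                  static_cast<std::uint8_t>((id >> 8) & 0xFFu),
                  static_cast<std::uint8_t>((id >> 16) & 0xFFu),
                  static_cast<std::uint8_t>(255)}};
}

// Decodes the pixel under the cursor. Returns false when the pixel belongs to
// the background, i.e., when the cursor is not over any object of interest.
bool decodeId(const Rgba8& pixel, std::uint32_t& id) {
    if (pixel[3] == 0) {
        return false;
    }
    id = static_cast<std::uint32_t>(pixel[0]) |
         (static_cast<std::uint32_t>(pixel[1]) << 8) |
         (static_cast<std::uint32_t>(pixel[2]) << 16);
    return true;
}
```

With such an encoding, the cost of picking on the CPU is a constant-time decode regardless of the number of triangles in the scene, which is where the contrast with per-triangle intersection tests comes from.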

Our architecture uses the model already in video memory to perform all the computations, while the traditional intersection test on the CPU requires a copy of the model in system memory. On the other hand, our architecture requires additional video memory if the model is deformed in real time, because the attributes of each model are stored as vertex textures. Floating-point textures are used to store vertex positions, vertex normals and other attributes computed on the GPU. The application may take advantage of these additional data and use them for the actual rendering.

The efficiency test results presented in the thesis were obtained using an NVIDIA GeForce 8800 GTX graphics card. Results of the same tests running on a GeForce 6200 and a GeForce 7900 GTX are available below:

Test results on a GeForce 6200 with 256 MB. DOC (123 KB).
Test results on a GeForce 7900 GTX with 512 MB. DOC (160 KB).

Download

The architecture is provided as a small C++ library and a set of SM 3.0 shaders. The library contains two classes: CIntManager and CIntObj. In the header file of CIntManager, one of the directives #define D3D9 or #define OGL2 should be uncommented in order to compile for Direct3D 9 or OpenGL 2.0 (using GLEW), respectively. Simply compile and link your project with the source files. The set of SM 3.0 shaders (in FX or GLSL format, depending on the API used) is used by the library at run time.
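As an illustration of the API selection described above, the fragment below sketches what the relevant part of the CIntManager header might look like when compiling for OpenGL 2.0. Only the macro names D3D9 and OGL2 come from the description; the guard and the constant kUsesOpenGL are hypothetical additions for this example and are not part of the library.

```cpp
// Hypothetical excerpt of the CIntManager header. The surrounding contents
// are assumptions; only the two macro names come from the text above.

// Uncomment exactly one of the following two lines.
//#define D3D9   // compile for Direct3D 9
#define OGL2     // compile for OpenGL 2.0 (using GLEW)

// Illustrative guard, not part of the library: stop the build early when
// neither or both of the macros are defined.
#if defined(D3D9) == defined(OGL2)
#error "Define exactly one of D3D9 or OGL2"
#endif

// Illustrative constant, not part of the library: lets application code
// query the selected API without repeating the preprocessor test.
#ifdef OGL2
constexpr bool kUsesOpenGL = true;
#else
constexpr bool kUsesOpenGL = false;
#endif
```

Selecting the API with a preprocessor directive means the choice is fixed at compile time, so the matching shader format (GLSL for OGL2, FX for D3D9) has to be shipped with the application.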

Library source code: ZIP (74 KB).
Minimal picking sample code using GLEW and GLUT: ZIP (411 KB).

License

The library and sample code are released under the LGPL license:

Copyright (C) 2007 Harlen Costa Batagelo

This library is free software; you can redistribute it and/or
modify it under the terms of the GNU Lesser General Public
License as published by the Free Software Foundation; either
version 2.1 of the License, or (at your option) any later version.

This library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
Lesser General Public License for more details.

Acknowledgments

This project was supported by the National Research Council (CNPq) under grant number 141685/2002-6 and by The State of São Paulo Research Support Foundation (FAPESP) under grant numbers 1996/0962-0 and 03/13090-6.