Mesh Skinning & Animation

C++

GPU-Focused

Animation is a core component of any game engine, and it is also expensive. Fortunately, the work required to animate models is a good fit for the massively parallel architecture of GPUs.

Between any two keyframes, each bone can be linearly interpolated (LERP) or spherically interpolated (SLERP) independently of every other bone. This must be calculated on a per-bone basis every frame. Using one model as an example, that would be 80 bones * (2 LERPs + 1 SLERP), totaling 240 different blends for just one graphic object. On the CPU, these calculations would eat into frame time. By offloading them to the GPU, we free up the CPU for other work, such as AI.
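The per-bone blend described above can be sketched on the CPU in C++. This is a minimal illustration, not the engine's actual shader code: the `BonePose` layout and the function names are hypothetical, and it assumes the two LERPs are translation and scale while the SLERP is rotation. Each bone reads only its own two keyframe poses, which is what makes the work independent per bone.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };
struct Quat { float x, y, z, w; };

// Hypothetical per-bone keyframe data (assumed layout, not from the engine).
struct BonePose {
    Vec3 translation;
    Quat rotation;   // unit quaternion
    Vec3 scale;
};

// Linear interpolation between two vectors, t in [0, 1].
Vec3 lerpVec(const Vec3& a, const Vec3& b, float t) {
    return { a.x + (b.x - a.x) * t,
             a.y + (b.y - a.y) * t,
             a.z + (b.z - a.z) * t };
}

// Spherical interpolation between two unit quaternions, t in [0, 1].
Quat slerpQuat(const Quat& a, Quat b, float t) {
    float d = a.x * b.x + a.y * b.y + a.z * b.z + a.w * b.w;
    if (d < 0.0f) {                       // flip one input to take the shorter arc
        b = { -b.x, -b.y, -b.z, -b.w };
        d = -d;
    }
    float wa = 1.0f - t;                  // linear weights for the near-parallel case
    float wb = t;
    if (d < 0.9995f) {                    // otherwise use the spherical weights
        const float theta = std::acos(d);
        const float s = std::sin(theta);
        wa = std::sin((1.0f - t) * theta) / s;
        wb = std::sin(t * theta) / s;
    }
    const Quat q{ wa * a.x + wb * b.x, wa * a.y + wb * b.y,
                  wa * a.z + wb * b.z, wa * a.w + wb * b.w };
    const float len = std::sqrt(q.x * q.x + q.y * q.y + q.z * q.z + q.w * q.w);
    return { q.x / len, q.y / len, q.z / len, q.w / len };
}

// Two LERPs and one SLERP for a single bone.
BonePose interpolateBone(const BonePose& a, const BonePose& b, float t) {
    return { lerpVec(a.translation, b.translation, t),
             slerpQuat(a.rotation, b.rotation, t),
             lerpVec(a.scale, b.scale, t) };
}

// Every iteration is independent, so on the GPU each bone can map to one
// compute shader invocation.
std::vector<BonePose> interpolatePose(const std::vector<BonePose>& k0,
                                      const std::vector<BonePose>& k1,
                                      float t) {
    assert(k0.size() == k1.size());
    std::vector<BonePose> out(k0.size());
    for (std::size_t i = 0; i < k0.size(); ++i) {
        out[i] = interpolateBone(k0[i], k1[i], t);
    }
    return out;
}
```

With 80 bones, `interpolatePose` performs the 240 blends mentioned above in a single pass over the skeleton.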

The offloaded work is performed by multiple compute shaders during the update cycle of the game engine. Between stages, a memory barrier is issued to ensure that one stage's results are complete before the next stage reads them. Each animation clip is shared between all models using that skeleton and owns a shader storage buffer object (SSBO) that serves as scratch space between stages, which keeps the memory footprint down. Once all animation calculations have completed, the final animation data for a graphic object is stored in an SSBO owned by that graphic object.

One thing to note is that each bone's matrix is stored relative to its parent. This is a problem for the parallel model: a bone cannot simply read its parent's result, because that result may not have been calculated yet. Just as with CPU multithreading, the order of dispatch and completion is not guaranteed, so we do not know the order in which bones will finish. To work around this, a hierarchy table for the model's skeleton is also stored on the GPU. Each bone uses this table to iteratively compute its own model-space matrix, multiplying down the hierarchy from the root until it reaches the bone being worked on.
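One way such a hierarchy table could work is sketched below in C++. The table layout is an assumption made for illustration: one fixed-width row per bone, listing the bone indices on the path from the root to that bone, padded with -1. `HierarchyTable`, `boneToModel`, and the row-major matrix convention are all hypothetical names and choices. The important property is that each bone reads only the local matrices produced by the previous stage, never another bone's in-flight result.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Row-major 4x4 matrix, column-vector convention (translation in 3, 7, 11).
using Mat4 = std::array<float, 16>;

Mat4 identity() {
    return { 1.0f, 0.0f, 0.0f, 0.0f,
             0.0f, 1.0f, 0.0f, 0.0f,
             0.0f, 0.0f, 1.0f, 0.0f,
             0.0f, 0.0f, 0.0f, 1.0f };
}

Mat4 multiply(const Mat4& a, const Mat4& b) {
    Mat4 out{};  // zero-initialized accumulator
    for (std::size_t r = 0; r < 4; ++r)
        for (std::size_t c = 0; c < 4; ++c)
            for (std::size_t k = 0; k < 4; ++k)
                out[r * 4 + c] += a[r * 4 + k] * b[k * 4 + c];
    return out;
}

// Assumed layout: maxDepth entries per bone, root first, the bone itself
// last, unused slots set to -1.
struct HierarchyTable {
    std::size_t maxDepth;
    std::vector<int> entries;
};

// Builds one bone's model-space matrix from parent-relative matrices by
// walking its row of the table from the root down to the bone.
Mat4 boneToModel(std::size_t bone,
                 const std::vector<Mat4>& local,
                 const HierarchyTable& table) {
    assert((bone + 1) * table.maxDepth <= table.entries.size());
    Mat4 result = identity();
    const int* row = &table.entries[bone * table.maxDepth];
    for (std::size_t d = 0; d < table.maxDepth && row[d] >= 0; ++d) {
        result = multiply(result, local[static_cast<std::size_t>(row[d])]);
    }
    return result;
}
```

The trade-off in this sketch is redundant work: bones deep in the skeleton repeat multiplications their ancestors also perform, in exchange for having no ordering dependency between bones.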

GPU Responsibilities

LERP/SLERP between keyframes per bone per clip
LERP/SLERP blending between two different clips
Local space bone matrix calculation
Bone weight blending per vertex
Local space to world space transformation
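The bone weight blending step in the list above is commonly done as linear blend skinning. The C++ sketch below shows that general technique, not necessarily the exact method used in this engine. It assumes four bone influences per vertex and assumes each entry of `skinMatrices` already combines the bone's animated matrix with its inverse bind pose; `SkinnedVertex`, `skinVertex`, and `skinMatrices` are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Row-major 4x4 matrix, column-vector convention (translation in 3, 7, 11).
using Mat4 = std::array<float, 16>;
struct Vec3 { float x, y, z; };

// Assumed vertex layout: up to four influencing bones per vertex.
struct SkinnedVertex {
    Vec3 position;                    // bind-pose position
    std::array<std::size_t, 4> bone;  // indices into the skin matrix array
    std::array<float, 4> weight;      // expected to sum to 1
};

// Transforms a point (w = 1) by the upper three rows of the matrix.
Vec3 transformPoint(const Mat4& m, const Vec3& p) {
    return { m[0] * p.x + m[1] * p.y + m[2]  * p.z + m[3],
             m[4] * p.x + m[5] * p.y + m[6]  * p.z + m[7],
             m[8] * p.x + m[9] * p.y + m[10] * p.z + m[11] };
}

// Weighted sum of the vertex transformed by each influencing bone.
Vec3 skinVertex(const SkinnedVertex& v, const std::vector<Mat4>& skinMatrices) {
    const float sum = v.weight[0] + v.weight[1] + v.weight[2] + v.weight[3];
    assert(std::fabs(sum - 1.0f) < 1e-3f);
    Vec3 out{ 0.0f, 0.0f, 0.0f };
    for (std::size_t i = 0; i < 4; ++i) {
        const float w = v.weight[i];
        if (w == 0.0f) continue;      // unused influence slot
        const Vec3 p = transformPoint(skinMatrices[v.bone[i]], v.position);
        out.x += w * p.x;
        out.y += w * p.y;
        out.z += w * p.z;
    }
    return out;
}
```

Like the per-bone stages, each vertex depends only on read-only inputs, so vertices can be processed independently of one another.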

Blending between Animations

The animation system also supports blending between two different animations. This happens within the compute pipeline mentioned earlier: the final interpolated keyframes of the two animations are blended into one. Given good animation data, this results in a seamless transition, as seen in the example below.

Animation Compression

During the FBX export process, animation clip data is reduced using a user-defined delta degree of error. This allows previously included keyframes to be removed and instead reconstructed by interpolating between the keyframes that my algorithm deems required. Using the defined delta, the algorithm compares each original keyframe with the corresponding interpolated frame. Each bone within the keyframe is compared and, if it is within the margin of error, is deemed close enough to the original bone. On average, the reduction in animation clip size was ~3:1; however, ratios greater than 11:1 were reached depending on the animation. Data storage optimizations could improve ratios even further.
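A keyframe reduction pass of this kind could look like the C++ sketch below. It is a greedy variant written under simplifying assumptions and is not the exporter's actual algorithm: each bone is reduced to a single rotation angle in degrees so the error check stays short, whereas a real clip would compare full translation, rotation, and scale. `Keyframe`, `reconstructs`, `reduceKeyframes`, and `maxErrorDeg` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Simplified keyframe: one rotation angle (degrees) per bone.
struct Keyframe {
    float time;
    std::vector<float> boneAngleDeg;
};

// True if interpolating between a and b at mid's time reproduces every bone
// of mid to within maxErrorDeg.
bool reconstructs(const Keyframe& a, const Keyframe& b,
                  const Keyframe& mid, float maxErrorDeg) {
    assert(b.time > a.time);
    assert(a.boneAngleDeg.size() == mid.boneAngleDeg.size());
    assert(b.boneAngleDeg.size() == mid.boneAngleDeg.size());
    const float t = (mid.time - a.time) / (b.time - a.time);
    for (std::size_t i = 0; i < mid.boneAngleDeg.size(); ++i) {
        const float interp =
            a.boneAngleDeg[i] + (b.boneAngleDeg[i] - a.boneAngleDeg[i]) * t;
        if (std::fabs(interp - mid.boneAngleDeg[i]) > maxErrorDeg) return false;
    }
    return true;
}

// Greedy reduction: stretch the span after the last kept keyframe for as long
// as every skipped frame can still be rebuilt within tolerance.
std::vector<Keyframe> reduceKeyframes(const std::vector<Keyframe>& frames,
                                      float maxErrorDeg) {
    if (frames.size() <= 2) return frames;
    std::vector<Keyframe> kept{ frames.front() };
    std::size_t anchor = 0;  // index of the last kept keyframe
    for (std::size_t candidate = 2; candidate < frames.size(); ++candidate) {
        bool ok = true;
        for (std::size_t mid = anchor + 1; mid < candidate && ok; ++mid) {
            ok = reconstructs(frames[anchor], frames[candidate],
                              frames[mid], maxErrorDeg);
        }
        if (!ok) {           // span too long: keep the frame before candidate
            anchor = candidate - 1;
            kept.push_back(frames[anchor]);
        }
    }
    kept.push_back(frames.back());  // the final keyframe is always kept
    return kept;
}
```

In this sketch, five evenly spaced keyframes of a bone rotating at constant speed collapse to the first and last frame, while a motion that reverses direction keeps the turning point as well.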