Sequenceable, Re-timable Animated "Value Clips"

Overview

USD's composition arcs allow timeSampled animation to be assembled from a variety of sources into a single composition. However, because stage composition must not (for scalability) take time into account when "indexing" layers, the value resolution behavior we are able to provide for layers reached through composition arcs stipulates that the first (strongest) layer that contains any timeSample for an attribute is the source of all timeSamples for the attribute. For many uses of USD this is sufficient, and additionally flexible because each Reference and SubLayer can specify a constant time offset and scale to be applied to the referenced or sublayered timeSamples. However, sometimes more flexibility is required!

The USD Value Clips feature allows users to decompose time-varying data across many layers that can then be sequenced and re-sequenced back together in flexible ways. This feature is purely a value-resolution-level feature, not a composition-level feature. Value clips allow users to retime and re-sequence the data in various ways, so a given set of value clips can be reused in different scenarios with only the sequencing metadata changing. At Pixar, we have found value clips useful for efficiently animating medium to large crowds, and for representing very large, simulated effects. For more detail on these use cases, see the glossary entry for Value Clips.

At a very high level, value clips consume special metadata on a prim, indicating:

  • the targeted "clip" layers (which are assets) to be sequenced
  • the intervals over which each clip is active, and how "stage time" maps into each clip
  • which samples to consume within the active clip

Terminology

Before going further, let's establish some terminology:

  • Value Clip: An individual layer containing time-varying data over some interval. All metadata, relationships, and default values present in a layer are ignored when the layer is consumed as a Value Clip.
  • Static Scene Topology: An individual layer containing the aggregate of all attributes in a sequence of value clips. Note that these attributes are merely declared; they have no values authored in this layer. If an attribute is not declared in the static scene topology, value resolution will behave as if the attribute does not exist, even if one or more value clips contain data for it.
  • Clip Metadata: A set of prim-level metadata which control USD's value resolution semantics.
  • Clip Set: A named set of value clips defined by a group of clip metadata. A prim may have multiple clip sets.
  • Anchor Point: The strongest layer in which either assetPaths or templateAssetPath is authored for a given clip set. This determines the strength of clips with respect to value resolution, see Value Resolution Semantics for details.

Metadata

Value Clip behavior is controlled by metadata authored on a particular prim. Each application of value clips takes one of two possible forms: explicit and template metadata. Explicit metadata encodes the exact assets and sequence timings. Template metadata, on the other hand, encodes a regex-style asset path template from which the explicit metadata is inferred when a UsdStage is opened. Template metadata is strictly less powerful than explicit metadata (it can't achieve behaviors such as looping, reversing, or holding clip data), but provides an extremely compact and easy-to-debug encoding for situations in which animation is broken up into a large number of regularly named files. Regardless of which form a value clip application takes, there is also a set of "universal" metadata common to both.

  • Universal Clip Metadata
    • primPath
      • A string path identifying the prim in the clip layers from which attribute values will be read. During value resolution, this path is substituted for the path of the prim hosting the clip metadata (and the corresponding prefix of its descendants' paths) when locating attributes in the clips.
    • manifestAssetPath
      • An asset path (SdfAssetPath) representing the path to a layer that contains an optimized index of the data we can expect to find authored in the set of clips. Depending on how it was created, this can be exactly the static scene topology layer. See Performance Characteristics and Maximizing Clip Performance for a deeper explanation of the use of this optional layer.
  • Explicit Clip Metadata
    • assetPaths
      • An ordered list of asset paths to the clips holding time-varying data.
    • active
      • A list of pairs of the form (stageTime, assetIndex) representing when, in terms of stage time, a particular clip in assetPaths is active. For example, given assetPaths of [@foo.usd@, @bar.usd@] and active of [(101, 0), (102, 0)], foo.usd would be the only clip ever examined for attribute values under this particular prim.
    • times
      • A list of pairs of the form (stageTime, clipTime) representing the mapping from stage time to clip time, for whichever clip is active at the given stage time. Note that every authored pair will be represented in a call to UsdAttribute::GetTimeSamples().
  • Template Clip Metadata
    • templateAssetPath
      • A regex-esque template string representing the form of our asset paths' names. This can be of two forms: 'path/basename.###.usd' and 'path/basename.###.###.usd'. These represent integer stage times and sub-integer stage times respectively. In both cases the number of hashes in each section is variable, and indicates to USD how much padding to apply when looking for asset paths. Note that USD is strict about the format of this string: there must be exactly one or two groups of hashes, and if there are two, they must be adjacent, separated by a dot.
    • templateStartTime
      • The (double precision float) first number to substitute into our template asset path. For example, given 'path/basename.###.usd' as a template string, and 12 as a template start time, USD will populate the internal asset path list with 'path/basename.012.usd' as its first element, if it resolves to a valid identifier through the active ArResolver. If the template asset path represents integer frames and the start time has a fractional component, USD will truncate this to an integer.
    • templateEndTime
      • The (double precision float) last number to substitute into our template string. If the template asset path represents integer frames and the end time has a fractional component, USD will truncate this to an integer.
    • templateStride
      • A (double precision float) number indicating the stride at which USD will increment when looking for files to resolve. For example, given a start time of 12, an end time of 25, a template string of 'path/basename.#.usd', and a stride of 6, USD will look to resolve the following paths: 'path/basename.12.usd', 'path/basename.18.usd' and 'path/basename.24.usd'.
    • templateActiveOffset
      • An optional (double precision float) number indicating the offset USD will use when calculating the clipActive value.
      • Given a start time of 101, an end time of 103, a stride of 1, and an offset of 0.5, USD will generate the following:
        • clipTimes = [(100.5,100.5), (101,101), (102,102), (103,103), (103.5,103.5)]
        • clipActive = [(101.5, 0), (102.5, 1), (103.5, 2)]
      • Note that USD generates two additional clip time 'knots' on the ends of the clipTime array. This allows users to query time samples outside the start/end range based on the absolute value of their offset.
      • Note that templateActiveOffset cannot exceed the absolute value of templateStride.
Warning
In the case where both explicit clip metadata and template clip metadata are authored, USD will prefer the explicit metadata when resolving values.

USD provides schema-level support for authoring this metadata via UsdClipsAPI. This gives a type-safe way to interact with the relevant metadata, as well as various helper functions.

Clip Sets

A "clip set" is a named group of value clips specified by the metadata described above. Prims can have multiple clip sets. In this example, the prim "Prim" has two clip sets, "clip_set_1" and "clip_set_2", each defined by a different set of clip metadata.

#usda 1.0

def "Prim" (
    clips = {
        dictionary clip_set_1 = {
            double2[] active = [(101, 0), (102, 1), (103, 2)]
            asset[] assetPaths = [@./clip1.usda@, @./clip2.usda@, @./clip3.usda@]
            asset manifestAssetPath = @./clipset1.topology.usda@
            string primPath = "/ClipSet1"
            double2[] times = [(101, 101), (102, 102), (103, 103)]
        }
        dictionary clip_set_2 = {
            string templateAssetPath = "clipset2.#.usd"
            double templateStartTime = 101
            double templateEndTime = 103
            double templateStride = 1
            asset manifestAssetPath = @./clipset2.topology.usda@
            string primPath = "/ClipSet2"
        }
    }
    reorder clipSets = ["clip_set_2", "clip_set_1"]
)
{
}

The clip set definitions are stored in a dictionary-valued metadata field named "clips", which is composed according to the rules in Dictionary-valued Metadata. This allows users to define clip sets in various layers and have them compose together, or sparsely override metadata in clip sets non-destructively.

The "clipSets" metadata field is a list op that allows users to control how clip sets are used during value resolution. See Value Resolution Semantics for more details.

Users can specify the clip set to author to when using the UsdClipsAPI schema to author clip metadata. If no clip set is specified, UsdClipsAPI will author to a clip set named "default".
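For example, the following Python sketch authors the explicit metadata for the "clip_set_1" definition shown above via UsdClipsAPI (the layer name clips_example.usda is arbitrary):

from pxr import Sdf, Usd

stage = Usd.Stage.CreateNew('clips_example.usda')
prim = stage.DefinePrim('/Prim')
clips = Usd.ClipsAPI(prim)

# Passing a clip set name targets that set; omitting it targets "default".
clips.SetClipAssetPaths(
    [Sdf.AssetPath('./clip1.usda'),
     Sdf.AssetPath('./clip2.usda'),
     Sdf.AssetPath('./clip3.usda')], 'clip_set_1')
clips.SetClipPrimPath('/ClipSet1', 'clip_set_1')
clips.SetClipActive([(101, 0), (102, 1), (103, 2)], 'clip_set_1')
clips.SetClipTimes([(101, 101), (102, 102), (103, 103)], 'clip_set_1')
clips.SetClipManifestAssetPath(
    Sdf.AssetPath('./clipset1.topology.usda'), 'clip_set_1')

stage.GetRootLayer().Save()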

Value Resolution Semantics

The presence of Value Clips introduces a new behavior to value resolution of any attribute affected by the clips. "Affected by the clips" means all attributes on the prim hosting the value clip metadata, and all of its descendant prims' attributes. The strength of data in a set of value clips is based on the anchor point. The clip data is just weaker than the "Local" (L in LIVRPS) data of the anchoring layer. Clip data can be overridden by adding overrides to a stronger layer or in a local opinion, just as for any other kind of data.

During attribute value resolution, each clip set defined on the attribute's owning prim will be queried to find the first clip that affects the attribute. If no such clip exists, clip sets defined on the owning prim's parents will be queried.

By default, the clip sets will be queried in lexicographical order by name. However, users may change the order in which clip sets are queried or even remove a clip set from consideration by authoring reorder or delete statements in the "clipSets" list op.
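Continuing the sketch above, the reorder shown in the earlier example can also be authored from Python through the "clipSets" metadata field, which holds a string list op:

from pxr import Sdf, Usd

stage = Usd.Stage.Open('clips_example.usda')
prim = stage.GetPrimAtPath('/Prim')

# Query clip_set_2 before clip_set_1, overriding lexicographical order.
# (deletedItems would similarly remove a clip set from consideration.)
listOp = Sdf.StringListOp()
listOp.orderedItems = ['clip_set_2', 'clip_set_1']
prim.SetMetadata('clipSets', listOp)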

When an attribute is affected by a clip, value resolution performs the following extra computation:

  • Determine the active clip, through the active metadata. Given the following active metadata: [(t1, c1), (t2, c2)], USD will look in c1 from the beginning of time until t2-epsilon. That is, the value of the leftmost active holds from the beginning of time until the next clip. Similarly, the rightmost active will hold until the end of time.
  • Once the active clip is known, USD will open the value clip in assetPaths at the index determined.
  • From the relevant value clip, USD will do any necessary mapping from the "external time" of the stage to the "internal time" of the clip.
  • Once the mapping has been determined, the value resolution proceeds as a normal Get() call for a timeSamples-based query, as if the active clip were the immediate super-layer of the anchoring layer.
  • Addendum 1: What happens if the times metadata points to a time outside the range of provided samples in a clip? In this case, the first and last samples in the clip are held in their respective directions, which is the same behavior as for timeSamples appearing in non-clip layers.
  • Addendum 2: What happens if there is no active clip at the current time? The value from the prior active clip (in the active timeline) is held. Within the active clip, the normal resolution semantics apply.
  • Addendum 3: What happens if there is no attribute in the active clip that we are querying? In this case, calls to Get() will return nothing.
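To make these semantics concrete, here is a small Python sketch that queries the example stage assembled in the usdcat section below; the expected results follow from the clip contents shown there:

from pxr import Usd

stage = Usd.Stage.Open('root.usda')
extent = stage.GetPrimAtPath('/points').GetAttribute('extent')

print(extent.Get(102))          # [(102, 102, 102), (102, 102, 102)], from clip2.usda
print(extent.Get(101))          # None: clip1.usda authors a value block at 101
print(extent.GetTimeSamples())  # [101.0, 102.0, 103.0], one entry per authored times pair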

All of the above behavior applies to the template metadata as well. The key difference is that the times, active and assetPaths metadata are derived from the template metadata. Although this derivation is not a direct component of value resolution, it sets the stage for it, so it may be worthwhile understanding how it works:

  • The set of explicit asset paths (assetPaths) is derived by taking the template pattern (templateAssetPath) and substituting times from templateStartTime to templateEndTime, incrementing by templateStride.
  • Once the set of relevant asset paths has been determined, the times and active metadata can be derived. For each time t embedded in a derived assetPath, the times entry (t, t) will be authored; similarly, the active entry (t, n) will be authored, where n is the index of the derived assetPath.
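The following Python sketch (illustrative only, not USD's actual implementation) mirrors this derivation for an integer-frame template; the real derivation additionally discards generated paths that do not resolve through the active ArResolver, and handles sub-integer templates and templateActiveOffset:

import re

def derive_clip_metadata(template, start, end, stride):
    """Derive explicit assetPaths/times/active from template clip metadata."""
    hashes = re.search(r'#+', template).group(0)  # the run of '#'s sets the zero-padding
    assetPaths, times, active = [], [], []
    t = start
    while t <= end:
        frame = str(int(t)).zfill(len(hashes))
        assetPaths.append(template.replace(hashes, frame))
        times.append((t, t))                     # stage time maps directly to clip time
        active.append((t, len(assetPaths) - 1))  # clip n becomes active at stage time t
        t += stride
    return assetPaths, times, active

paths, times, active = derive_clip_metadata('path/basename.###.usd', 101, 103, 1)
# paths  == ['path/basename.101.usd', 'path/basename.102.usd', 'path/basename.103.usd']
# times  == [(101, 101), (102, 102), (103, 103)]
# active == [(101, 0), (102, 1), (103, 2)]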

Layer Offsets

Layer offsets affect value clips in the following ways:

  • If using template metadata encoding:
    • Layer offsets are applied to generated times and active metadata relative to the anchor point. Note that layer offsets will not affect the generated set of asset paths, as they are not applied to templateStartTime, templateEndTime and templateStride.
  • If using explicit metadata encoding:
    • Layer offsets are applied to times and active metadata relative to the strongest layer in which they were authored. Note that this layer may be different from the anchor point.

Performance Characteristics and Maximizing Clip Performance

The flexibility and data reuse that clips provide do come with some performance characteristics with which pipeline builders may want to be familiar.

Clip Manifest Asset Path

The manifestAssetPath metadata can be crucial for achieving good performance with value clips. While this metadata is currently optional, specifying it allows USD to make much faster queries.

Clip Layers Opened On-Demand

In Pixar's use of clips, it is not uncommon for a single UsdStage to consume thousands to tens of thousands of clip layers. If the act of opening a stage were to discover and open all of the layers consumed by clips, it would, in these cases, add considerable time and memory to the cost. Further, many clients of the stage (such as a single-frame render) only require data from a small time range, which generally translates to a small fraction of the total number of clip layers. Therefore, clip layers are opened lazily, only when value resolution must interrogate a particular clip. Of course, since USD supports value resolution in multiple threads concurrently, resolving attributes affected by clips may require acquiring a lock that is unnecessary during "normal" value resolution, so there is some performance penalty.

Further, the broader the time interval over which an application extracts attribute values, the more layers that will be opened and cached (until the stage is closed). We deem this an acceptable cost since it is in keeping with our general principle of paying for what you use. The alternative would be adding a more sophisticated caching strategy to clip-layer retention that limits the number of cached layers; however, since the most memory-conscious clients (renderers) are generally unaffected, and the applications that do want to stream through time generally prioritize highest performance over memory consumption, we are satisfied with the caching strategy for now.

Warning
Users of UsdAttributeQuery may experience unanticipated overhead with respect to clips. When a UsdAttributeQuery is constructed for a UsdAttribute affected by clips, it will scan all available clips until it finds one with timeSamples authored. In the case that many consecutive clips don't have timeSamples authored, the user pays the cost of opening all of those layers.
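For example, with the root.usda stage from the usdcat section below, that scan happens once, at construction time:

from pxr import Usd

stage = Usd.Stage.Open('root.usda')
attr = stage.GetPrimAtPath('/points').GetAttribute('extent')

# Construction scans the clips for authored timeSamples up front...
query = Usd.AttributeQuery(attr)

# ...so repeated value queries amortize that cost.
for frame in range(101, 104):
    print(query.Get(frame))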

usdview

  • Usdview supports value clip debugging through the layer stack viewer (lower left). When a particular attribute (whose value is held in a clip layer) is highlighted, the layer stack viewer will show which clip the value is coming from.
  • The metadata tab will display the value of each piece of metadata authored on the prim introducing clips.

usdcat

Usdcat supports flattening value clips. Attributes with values from clips flatten to timeSamples. Any active clip at timecode t with an authored SdfValueBlock or an absent value will result in an SdfValueBlock at t in the output timeSamples.

Consider the following set of files:

root.usda, our root layer:

#usda 1.0
(
    endTimeCode = 103
    startTimeCode = 101
    subLayers = [
        @./root.topology.usda@
    ]
)

over "points" (
    clips = {
        dictionary default = {
            double2[] active = [(101, 0), (102, 1), (103, 2)]
            asset[] assetPaths = [@./clip1.usda@, @./clip2.usda@, @./clip3.usda@]
            asset manifestAssetPath = @./root.topology.usda@
            string primPath = "/points"
            double2[] times = [(101, 101), (102, 102), (103, 103)]
        }
    }
)
{
}

root.topology.usda, our static scene topology:

#usda 1.0

def Points "points"
{
    float3[] extent
}

clip1.usda, our first (of three) clips:

#usda 1.0

def Points "points"
{
    float3[] extent.timeSamples = {
        101: None,
    }
}

clip2.usda:

#usda 1.0

def Points "points"
{
    float3[] extent.timeSamples = {
        102: [(102, 102, 102), (102, 102, 102)],
    }
}

clip3.usda:

#usda 1.0

def Points "points"
{
}

Now, let's examine the flattened result (achieved with $ usdcat --flatten root.usda --out flattenedClips.usda):

#usda 1.0
(
    endTimeCode = 103
    startTimeCode = 101
)

def Points "points"
{
    float3[] extent.timeSamples = {
        101: None,
        102: [(102, 102, 102), (102, 102, 102)],
        103: None,
    }
}

Note that both clip1.usda, which has an authored SdfValueBlock, and clip3.usda, which does not have an authored value, produce a block in the resulting timeSamples.

usdstitchclips

The USD toolset provides a tool, usdstitchclips, for generating the necessary value clip metadata as well as the static scene topology. This tool can author both explicit and template clip metadata.

For example, given a directory containing three clip files clip.101.usd, clip.102.usd and clip.103.usd:

$ usdstitchclips --clipPath /World/model --out result.usda clip*

This will generate the following result.usda:

#usda 1.0
(
    endTimeCode = 103
    startTimeCode = 101
    subLayers = [
        @./result.topology.usda@
    ]
)

over "World"
{
    over "model" (
        clips = {
            dictionary default = {
                double2[] active = [(101, 0), (102, 1), (103, 2)]
                asset[] assetPaths = [@./clip.101.usd@, @./clip.102.usd@, @./clip.103.usd@]
                asset manifestAssetPath = @./result.topology.usda@
                string primPath = "/World/model"
                double2[] times = [(101, 101), (102, 102), (103, 103)]
            }
        }
    )
    {
    }
}

and the following result.topology.usda:

#usda 1.0
(
    endTimeCode = 103
    startTimeCode = 101
    upAxis = "Z"
)

def "World"
{
    def "model"
    {
        int x
    }
}

For generating template metadata:

$ usdstitchclips --clipPath /World/model \
                 --templateMetadata \
                 --startTimeCode 101 \
                 --endTimeCode 103 \
                 --stride 1 \
                 --templatePath clip.#.usd \
                 --out result.usda clip*

This will generate the following result.usda:

#usda 1.0
(
    endTimeCode = 103
    startTimeCode = 101
    subLayers = [
        @./result.topology.usda@
    ]
)

def "World"
{
    over "model" (
        clips = {
            dictionary default = {
                string templateAssetPath = "clip.#.usd"
                double templateStartTime = 101
                double templateEndTime = 103
                double templateStride = 1
                asset manifestAssetPath = @./result.topology.usda@
                string primPath = "/World/model"
            }
        }
    )
    {
    }
}

Encoding Behaviors

Looping

It is often desirable to create a looping effect over a set of clips. This is very simple to achieve with the usdstitchclips tool. Given a set of value clips, say clip1.usd, clip2.usd, and clip3.usd, we could loop them three times with the following encoding:

#usda 1.0
(
    endTimeCode = 9
    startTimeCode = 1
    subLayers = [
        @./result.topology.usda@
    ]
)

def "World"
{
    over "model" (
        clips = {
            dictionary default = {
                asset manifestAssetPath = @./result.topology.usda@
                string primPath = "/World/model"
                asset[] assetPaths = [@clip1.usd@, @clip2.usd@, @clip3.usd@]
                double2[] active = [(1, 0), (2, 1), (3, 2),
                                    (4, 0), (5, 1), (6, 2),
                                    (7, 0), (8, 1), (9, 2)]
                double2[] times = [(1, 1), (2, 2), (3, 3),
                                   (4, 1), (5, 2), (6, 3),
                                   (7, 1), (8, 2), (9, 3)]
            }
        }
    )
    {
    }
}

Note that we were able to achieve this solely through the metadata. No additional asset loads or restructuring needed to happen.
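As a sketch, the active and times arrays above need not be written by hand; they can be generated and authored with UsdClipsAPI (the layer loop.usda and its /World/model prim are assumed to exist, as in the stitching examples):

from pxr import Sdf, Usd

stage = Usd.Stage.Open('loop.usda')  # hypothetical layer containing /World/model
clips = Usd.ClipsAPI(stage.GetPrimAtPath('/World/model'))

numClips, numLoops = 3, 3
active, times = [], []
for loop in range(numLoops):
    for i in range(numClips):
        stageTime = loop * numClips + i + 1  # stage times 1 through 9
        active.append((stageTime, i))        # cycle through clip indices 0, 1, 2
        times.append((stageTime, i + 1))     # map back into each clip's 1-3 range

clips.SetClipAssetPaths(
    [Sdf.AssetPath('clip1.usd'), Sdf.AssetPath('clip2.usd'), Sdf.AssetPath('clip3.usd')])
clips.SetClipActive(active)  # omitting the clip set name targets "default"
clips.SetClipTimes(times)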

Warning
Note that this supposes that the final frame and the first frame of the set of clips transition smoothly.