UsdGeom : USD Geometry Schema

UsdGeom defines the 3D graphics-related prim and property schemas that together form a basis for interchanging geometry between DCC tools in a graphics pipeline.

Geometric Primitive Schemas


Currently, all classes in UsdGeom inherit from UsdGeomImageable, whose intent is to capture any prim type that might want to be rendered or visualized. This distinction is made for two reasons:

  • so that there can be types that never need to be rendered, which can therefore be optimized around during traversals, and to enable validation: for example, in a compatible shading schema, only UsdGeomImageable-derived prims should be able to express a look/material binding.
  • for the common properties described in UsdGeomImageable, including visibility, purpose, and the attribute schema for primvars.

Admittedly, not all of the classes inheriting from UsdGeomImageable really need to be imageable. They are grouped as they are to avoid the need for multiple inheritance, which would arise because some classes that may not necessarily be imageable are definitely transformable.


In UsdGeom, all geometry prims are directly transformable. This is primarily a scalability and complexity management decision, since prim-count has a strong correlation to total scene composition time and memory footprint, and eliminating the need for a "shape" node for every piece of geometry generally reduces overall prim count by anywhere from 30% to 50%, depending on depth and branching factor of a scene's namespace hierarchy.

UsdGeomXformable encapsulates the schema for a prim that is transformable. Readers familiar with AbcGeom's Xform schema will find Xformable familiar, but more easily introspectable. Xformable decomposes a transformation into an ordered sequence of ops; unlike AbcGeom::Xform, which packs the op data into static and varying arrays, UsdGeomXformable expresses each op as an independent UsdAttribute. This data layout, while somewhat more expensive to extract, is much more conducive to "composed scene description" because it allows individual ops to be overridden in stronger layers independently of all other ops. We provide facilities leveraging core Usd features that help mitigate the extra cost of reading more attributes per-prim for performance-sensitive clients.

Of course, UsdGeom still requires a prim schema that simply represents a transformable prim that scopes other child prims, which is fulfilled by UsdGeomXform.

You may find it useful to digest the basic assumptions of UsdGeom linear algebra.


UsdGeomGprim is the base class for all "geometric primitives", which encodes several per-primitive graphics-related properties. Defined Gprims currently include:

We expect there to be some debate around the last five "intrinsic" Gprims: Capsule, Cone, Cube, Cylinder, and Sphere, as not all DCCs support them as primitives. In Pixar's pipeline, we in fact rarely render these primitives, but find them highly useful for their fast inside/outside tests in defining volumes for lighting effects, procedural modifiers (such as "kill spheres" for instancers), and colliders. The last, in particular, is quite useful for interchanging data with rigid-body simulators. It is necessary to be able to transmit these volumes from dressing/animation tools to simulation/lighting/rendering tools, hence their presence in our schema. We expect to support these and other "non-native" schema types as some form of proxy or "pass through" prim in DCCs that do not understand them.


UsdGeomPointInstancer provides a powerful, scalable encoding for scattering many instances of multiple prototype objects (which can be arbitrary subtrees of the UsdStage that contains the PointInstancer), animating both the instances and prototypes, and pruning/masking instances based on integer ID.


UsdGeomCamera encodes a transformable camera.


UsdGeomModelAPI is an API schema that extends the basic UsdModelAPI API with concepts unique to models that contain 3D geometry. This includes:

Primvars (Primitive Variables)

"Primvars" are an important concept in UsdGeom. Primvars are attributes with a number of extra features that address the following problems in computer graphics:

  1. The need to "bind" user data on geometric primitives that becomes available to shaders during rendering.
  2. The need to specify a set of values associated with vertices or faces of a primitive that will interpolate across the primitive's surface under subdivision or shading.
  3. The need to inherit attributes down namespace to allow sparse authoring of sharable data that is compatible with native scenegraph instancing.

One example that involves the first two problems is texture coordinates (commonly referred to as "uv's"), which are cast as primvars in UsdGeom. UsdGeomPrimvar encapsulates a single primvar, and provides the features associated with interpolating data across a surface. UsdGeomPrimvarsAPI provides the interface for creating and querying primvars on a prim, as well as the computations related to primvar inheritance.

Imageable Purpose

Purpose is a concept we have found useful in our pipeline for classifying geometry into categories that can each be independently included or excluded from traversals of prims on a stage, such as rendering or bounding-box computation traversals. The fallback purpose, default, indicates that a prim has "no special purpose" and should generally be included in all traversals. Prims with purpose render should generally only be included when performing a "final quality" render. Prims with purpose proxy should generally only be included when performing a lightweight proxy render (such as OpenGL).

Finally, prims with purpose guide should generally only be included when an interactive application has been explicitly asked to "show guides".

A prim that is Imageable with an authored opinion about its purpose will always have the same effective purpose as its authored value. If the prim is not Imageable or does not have an authored opinion about its own purpose, then it will inherit the purpose of the closest Imageable ancestor with an authored purpose opinion. If there are no Imageable ancestors with an authored purpose opinion then this prim uses its fallback purpose.

For example, given a prim tree such as the following:

def "Root" {
    token purpose = "proxy"
    def Xform "RenderXform" {
        token purpose = "render"
        def "Prim" {
            token purpose = "default"
            def Xform "InheritXform" {
            }
            def Xform "GuideXform" {
                token purpose = "guide"
            }
        }
    }
    def Xform "Xform" {
    }
}

  • </Root> is not Imageable so its purpose attribute is ignored and its effective purpose is default.
  • </Root/RenderXform> is Imageable and has an authored purpose of render so its effective purpose is render.
  • </Root/RenderXform/Prim> is not Imageable so its purpose attribute is ignored. ComputePurpose will return the effective purpose of render, inherited from its parent Imageable's authored purpose.
  • </Root/RenderXform/Prim/InheritXform> is Imageable but with no authored purpose. Its effective purpose is render, inherited from the authored purpose of </Root/RenderXform>.
  • </Root/RenderXform/Prim/GuideXform> is Imageable and has an authored purpose of guide so its effective purpose is guide.
  • </Root/Xform> is Imageable but with no authored purpose. It also has no Imageable ancestor with an authored purpose, so its effective purpose is its fallback value of default.
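
The resolution rule above can be sketched in plain Python. This is only an illustration of the inheritance logic, not the pxr API: the PRIMS table and the effective_purpose function are hypothetical stand-ins for a stage and for UsdGeomImageable's purpose computation.

```python
# Hypothetical stand-in for the prim tree above (not the pxr API).
# Each entry maps a prim path to (is_imageable, authored_purpose or None).
PRIMS = {
    "/Root": (False, "proxy"),
    "/Root/RenderXform": (True, "render"),
    "/Root/RenderXform/Prim": (False, "default"),
    "/Root/RenderXform/Prim/InheritXform": (True, None),
    "/Root/RenderXform/Prim/GuideXform": (True, "guide"),
    "/Root/Xform": (True, None),
}


def effective_purpose(path, prims=PRIMS, fallback="default"):
    """Return the purpose authored on the closest Imageable prim at or
    above `path`, or the fallback if no such opinion exists."""
    while path:
        is_imageable, authored = prims[path]
        # Purpose opinions on non-Imageable prims are ignored.
        if is_imageable and authored is not None:
            return authored
        # Drop the last path element to move to the parent prim.
        path = path.rsplit("/", 1)[0]
    return fallback


for prim_path in PRIMS:
    print(prim_path, "->", effective_purpose(prim_path))
```

Walking upward from the prim keeps the rule a simple "closest authored Imageable opinion wins" search, which matches each bullet in the example.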

Purpose render can be useful in creating "light blocker" geometry for raytracing interior scenes. Purposes render and proxy can be used together to partition a complicated model into a lightweight proxy representation for interactive use, and a fully realized, potentially quite heavy, representation for rendering. One can use UsdVariantSets to create proxy representations, but doing so requires that we recompose parts of the UsdStage in order to change to a different runtime level of detail, and that does not interact well with the needs of multithreaded rendering. Purpose provides us with a better tool for dynamic, interactive complexity management.

As demonstrated in UsdGeomBBoxCache, a traverser should be ready to accept combinations of included purposes as an input.

Linear Algebra in UsdGeom

To ensure reliable interchange, we stipulate the following foundational mathematical assumptions, which are codified in the Graphics Foundations (Gf) math module:

  • Matrices are laid out and indexed in row-major order, such that, given a GfMatrix4d datum mat, mat[3][1] denotes the second column of the fourth row.
  • GfVec datatypes are row vectors that pre-multiply matrices to effect transformations, which implies, for example, that it is the fourth row of a GfMatrix4d that specifies the translation of the transformation.
  • All rotation angles are expressed in degrees, not radians.
  • Vector cross-products and rotations intrinsically follow the right hand rule.

So, for example, transforming a vector v by first a Scale matrix S, then a Rotation matrix R, and finally a Translation matrix T can be written as the following mathematical expression:

vt = v × S × R × T

Because Gf exposes transformation methods on Matrices, not Vectors, to effect this transformation in Python, one would write:

vt = (S * R * T).Transform(v)
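
To make the row-vector convention concrete without requiring Gf, here is a minimal sketch using nested Python lists in place of GfMatrix4d. The helper names (mat_mul, transform_point, scale_matrix, rotate_z_matrix, translate_matrix) are hypothetical and exist only for this illustration.

```python
import math


def mat_mul(a, b):
    """Multiply two 4x4 matrices stored as row-major nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transform_point(m, v):
    """Treat v as the row vector (x, y, z, 1) and compute v * m."""
    row = (v[0], v[1], v[2], 1.0)
    return tuple(sum(row[k] * m[k][j] for k in range(4)) for j in range(3))


def scale_matrix(s):
    return [[s, 0, 0, 0], [0, s, 0, 0], [0, 0, s, 0], [0, 0, 0, 1]]


def rotate_z_matrix(degrees):
    """Right-handed rotation about +Z; the angle is given in degrees."""
    c = math.cos(math.radians(degrees))
    s = math.sin(math.radians(degrees))
    return [[c, s, 0, 0], [-s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]


def translate_matrix(tx, ty, tz):
    # The translation occupies the fourth row, as stated above.
    return [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [tx, ty, tz, 1]]


S = scale_matrix(2.0)
R = rotate_z_matrix(90.0)
T = translate_matrix(10.0, 0.0, 0.0)

# Scale first, then rotate, then translate: v * S * R * T.
vt = transform_point(mat_mul(mat_mul(S, R), T), (1.0, 0.0, 0.0))
print(vt)  # approximately (10, 2, 0)
```

The point (1, 0, 0) is scaled to (2, 0, 0), rotated onto the +Y axis, and then translated along +X, which confirms that the leftmost matrix in the product is applied first.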

Coordinate System, Winding Order, Orientation, and Surface Normals

Deriving from the mathematical assumptions in the preceding section, UsdGeom positions objects in a right-handed coordinate system, and a UsdGeomCamera views the scene in a right-handed coordinate system where up is +Y, right is +X, and the forward viewing direction is -Z. This is explained and diagrammed in UsdRenderCamera. If you find yourself needing to import USD into a system that operates in a left-handed coordinate system, you may find this article useful.

UsdGeom also, by default, applies the right-hand rule to compute the "intrinsic" surface normal (also sometimes referred to as the geometric normal) for all non-implicit surface and solid types.

That is, the normal computed from (e.g.) a polygon's sequential vertices using the right-handed winding rule determines the "front" or "outward" facing direction, which typically, when rendered, will receive lighting calculations and shading.

Since not all modeling and animation packages agree on the right hand rule, UsdGeomGprim introduces the orientation attribute to enable individual gprims to select the left hand winding rule, instead. So, gprims whose orientation is "rightHanded" (which is the fallback) must use the right hand rule to compute their surface normal, while gprims whose orientation is "leftHanded" must use the left hand rule.

However, any given gprim's local-to-world transformation can flip its effective orientation, when it contains an odd number of negative scales. This condition can be reliably detected using the (Jacobian) determinant of the local-to-world transform: if the determinant is less than zero, then the gprim's orientation has been flipped, and therefore one must apply the opposite handedness rule when computing its surface normals (or just flip the computed normals) for the purposes of hidden surface detection and lighting calculations.
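
The determinant test can be sketched as follows. This is a plain-Python illustration rather than the Gf API: det3 and effective_orientation are hypothetical helpers, and the sketch assumes an affine local-to-world matrix whose last column is (0, 0, 0, 1), in which case the determinant of the upper-left 3x3 block equals that of the full 4x4 matrix.

```python
def det3(m):
    """Determinant of the upper-left 3x3 block of a row-major 4x4 matrix."""
    (a, b, c), (d, e, f), (g, h, i) = (row[:3] for row in m[:3])
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def effective_orientation(authored, local_to_world):
    """Flip the authored orientation when the transform mirrors the gprim."""
    if det3(local_to_world) >= 0:
        return authored
    return "leftHanded" if authored == "rightHanded" else "rightHanded"


# One negative scale mirrors the geometry; two negative scales do not.
mirror_x = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [5, 0, 0, 1]]
mirror_xy = [[-1, 0, 0, 0], [0, -1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

print(effective_orientation("rightHanded", mirror_x))   # leftHanded
print(effective_orientation("rightHanded", mirror_xy))  # rightHanded
```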

Applying Timesampled Velocities to Geometry

UsdGeomPointBased primitives and UsdGeomPointInstancer primitives all allow the specification of velocities and accelerations to describe point (or instance) motion at off-sample UsdTimeCodes, as an alternative to relying on native UsdStage linear sample interpolation.

Using velocities is the only reliable way of encoding the motion of primitives whose topology is varying over time, as adjacent samples' indices may be unrelated to each other, and the samples themselves may not even possess the same number of elements.

To help ensure that all consumers of UsdGeom data will compute identical posing from the same dataset, we describe how the position, velocity, and acceleration data should be sampled and combined to produce "interpolated" positions. There are several cases to consider, for which we stipulate the following logic:

  • If no velocities are authored, then we fall back to the "standard" position computation logic: if the timeSamples bracketing a requested sample have the same number of elements, apply linear interpolation between the two samples; otherwise, use the value of the sample with the lower/earlier ordinate.
  • If the bracketing timeSamples for velocities from the requested timeSample have the same ordinates as those for points then use the lower velocities timeSample and the lower points timeSample for the computations described below.
  • If velocities are authored, but the sampling does not line up with that of points, fall back to standard position computation logic, as if no velocities were authored. This is effectively a silent error case.
  • If no accelerations are authored, use the lower velocities timeSample and the lower points timeSample for the computations described below, treating accelerations as 0 in all dimensions.
  • If the bracketing timeSamples for accelerations from the requested timeSample have the same ordinates as those for velocities and points then use the lower accelerations timeSample, the lower velocities timeSample and the lower points timeSample for the computations described below.
  • If accelerations are authored but their sampling does not line up with that of velocities, there are two sub-cases. If the sampling of velocities lines up with that of points, use the lower velocities timeSample and the lower points timeSample for the computations described below, as if no accelerations were authored. If the sampling of velocities does not line up with that of points, fall back to the "standard" position computation logic, as if no velocities or accelerations were authored.

In summary, we stipulate that the sample-placement of the points, velocities, and accelerations attributes be identical in each range over which we want to compute motion samples. We do not allow velocities to be recorded at times at which there is not a corresponding points sample.

This is to simplify and expedite the calculations required to compute a position at any requested time. Since most simulators produce both a position and velocity at each timeStep, we do not believe this restriction should impose an undue burden.

Note that the sampling requirements are applied to each requested motion sampling interval independently. So, for example, if points and velocities have samples at times 0, 1, 2, 3, but then velocities has an extra sample at 2.5, and we are computing forward motion blur on each frame, then we should get velocity-interpolated positions for the motion-blocks for frames 0, 1, and 3, but no interpolation for frame 2.
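
The alignment check in this example can be sketched in plain Python. The helpers bracketing and velocities_usable are hypothetical, and the sketch makes two assumptions: bracketing samples clamp to the first or last authored sample outside the authored range, and each frame's forward motion-blur interval is probed at a single time shortly after the frame.

```python
import bisect


def bracketing(times, t):
    """Return the (lower, upper) authored sample times around t.
    `times` must be sorted and non-empty; results clamp at both ends."""
    i = bisect.bisect_right(times, t)
    if i == 0:
        return times[0], times[0]
    lower = times[i - 1]
    if lower == t or i == len(times):
        return lower, lower
    return lower, times[i]


def velocities_usable(point_times, velocity_times, t):
    """True when velocities bracket t with the same ordinates as points."""
    if not velocity_times:
        return False
    return bracketing(point_times, t) == bracketing(velocity_times, t)


point_times = [0.0, 1.0, 2.0, 3.0]
velocity_times = [0.0, 1.0, 2.0, 2.5, 3.0]  # extra velocities sample at 2.5

# Probe a time inside each frame's forward shutter interval.
usable = [velocities_usable(point_times, velocity_times, frame + 0.25)
          for frame in range(4)]
print(usable)  # [True, True, False, True]
```

Frame 2 fails the check because the extra velocities sample at 2.5 changes the velocities bracket to (2, 2.5) while points still bracket at (2, 3).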

Computing a Single Requested Position

If one requires a pose at only a single point in time, sampleTime, such as when stepping through "sub-frames" in an application like usdview, then we need only apply the above rules, and if we successfully sample points, velocities, and accelerations, let:

tpoints = the lower bracketing time sample for the evaluated points attribute
velocityScale = the inherited value from UsdGeomMotionAPI::ComputeVelocityScale()
timeScale = velocityScale / stage->GetTimeCodesPerSecond()

... then

pointsInterpolated = points + (sampleTime - tpoints) * timeScale * (velocities + (0.5 * (sampleTime - tpoints) * timeScale * accelerations))
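
As a minimal sketch, the formula can be written in plain Python over lists of 3-tuples. The function name interpolate_points and its parameter names are hypothetical; in practice velocity_scale and time_codes_per_second would come from UsdGeomMotionAPI and the stage, as described above.

```python
def interpolate_points(points, velocities, accelerations,
                       sample_time, t_points,
                       velocity_scale=1.0, time_codes_per_second=24.0):
    """Apply points + dt * (velocities + 0.5 * dt * accelerations),
    where dt = (sample_time - t_points) * time_scale."""
    if accelerations is None:
        # No authored accelerations: treat them as 0 in all dimensions.
        accelerations = [(0.0, 0.0, 0.0)] * len(points)
    time_scale = velocity_scale / time_codes_per_second
    dt = (sample_time - t_points) * time_scale
    return [
        tuple(p + dt * (v + 0.5 * dt * a) for p, v, a in zip(P, V, A))
        for P, V, A in zip(points, velocities, accelerations)
    ]


# One point moving at 24 units per second in X and accelerating in Y,
# evaluated half a time code after the points sample at time 1.0.
result = interpolate_points(
    points=[(0.0, 0.0, 0.0)],
    velocities=[(24.0, 0.0, 0.0)],
    accelerations=[(0.0, 1152.0, 0.0)],
    sample_time=1.5,
    t_points=1.0,
)
print(result)  # approximately [(0.5, 0.25, 0.0)]
```

At 24 time codes per second, half a time code is 1/48 of a second, so the point travels 0.5 units in X from velocity and 0.25 units in Y from acceleration.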

Computing a Range of Requested Positions

Computer graphics renderers typically simulate the effect of non-zero camera shutter intervals (which introduces motion blur into an image) by sampling moving geometry at multiple, nearby sample times, for each rendered image, linearly blending the results of each sample. Most, if not all renderers introduce the simplifying assumption that for any given image we wish to render, we will not allow the topology of geometry to change over the time-range we sample for motion blur.

Therefore, if we are sampling a topologically varying, velocities-possessing UsdGeomMesh at sample times t1, t2 ... tn in order to render the mesh with motion blur, we stipulate that all n samples be computed from the same sampled points, velocities, and accelerations values sampled at sampleTime. Therefore, we would compute all n samples using the above formula, but iterating over the n samples, substituting ti for sampleTime.
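
A minimal sketch of this iteration in plain Python follows. The function motion_samples is hypothetical, and accelerations are omitted for brevity, which is equivalent to treating them as 0. The key point is that every requested time is posed from the same base points and velocities sample, so the element count cannot change across the motion range.

```python
def motion_samples(points, velocities, t_points, sample_times, time_scale):
    """Pose one base (points, velocities) sample at each requested time."""
    return [
        [tuple(p + (t - t_points) * time_scale * v for p, v in zip(P, V))
         for P, V in zip(points, velocities)]
        for t in sample_times
    ]


# Three motion samples across a half-time-code shutter, all derived from
# the points and velocities authored at time 1.0.
samples = motion_samples(
    points=[(0.0, 0.0, 0.0)],
    velocities=[(24.0, 0.0, 0.0)],
    t_points=1.0,
    sample_times=[1.0, 1.25, 1.5],
    time_scale=1.0 / 24.0,
)
print([pose[0][0] for pose in samples])  # approximately [0.0, 0.25, 0.5]
```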

Two things to note:

  • Since we are applying strictly linear interpolation, why is it useful to compute more than two samples? For UsdGeomPointBased primitives, the object-space samples will not require more than two samples, although local-to-world transformations may introduce non-linear motion. For UsdGeomPointInstancer primitives, which also possess an angularVelocities attribute for the instances, it may often be desirable to sample the instance matrices (and therefore positions) at a higher frequency since angular motion is non-linear.
  • If the range of t1 to tn is greater than the recorded sampling frequency of points, then computing the "singular" value of points at some time tother that is within the range t1 to tn may produce a different value (with differing number of elements) than the computed value for the same singular time using the motion blur technique. This derives from our requirement that over the given motion range, the topology must not change, so we specifically ignore any other points, velocities, or accelerations samples that occur in the requested motion range.

Stage Metrics

The classes described above are concerned with individual primitives and properties. Some geometric quantities, however, describe aspects of an entire scene, which we encode as stage metadata. For example, UsdGeom is what allows encoding a stage's up axis (see Encoding Stage UpAxis) and its linear units (see Encoding Stage Linear Units).