
Add visualization schema proposal #42

Open · wants to merge 2 commits into base: main

Conversation

@nvmkuruc (Contributor) commented Mar 26, 2024

Description of Proposal

This proposal introduces a shared schema domain to collect and standardize behaviors of primitives designed for nondiegetic visualization.

https://github.com/NVIDIA-Omniverse/USD-proposals/blob/vprim/proposals/vprim/README.md

Supporting Materials

Contributing

@spiffmon (Member) left a comment


There's a lot I like here, and I would like to see some rendering-folk perspectives on it.

Also:

  • Would Viz prims be minimally UsdGeomXformable? Or do they just refer/constrain to Xformable or Boundable prims?
  • Formatting-wise, if you use text-fill or add linebreaks yourself within paragraphs, it allows more precise commenting in the GitHub UI.

Users are currently seeking a way to visualize data through schemas added to OpenUSD. A user may want to describe a "red line" to draw attention to a particular object, or a "green curve" to represent available paths towards exits during building design.

### Schema Slicing
One instance of these new schemas is `CurveStyleAPI`, proposed as an API schema for the existing `Curves` schemas. The API schema makes three transformative changes: it transforms the meaning of the `widths` attribute to be a screen space size (`constant` or `uniform` interpolation only), changes the appearance of joints and end caps, and restricts material bindings to a set of "styling" materials.
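
As an illustrative sketch only (the `CurveStyleAPI` schema and the `style:screenSpaceWidths` attribute below are hypothetical, not an implemented API), authoring might look like this with the USD Python API:

```python
from pxr import Usd, UsdGeom, Sdf

stage = Usd.Stage.CreateInMemory()

# An ordinary linear BasisCurves Gprim.
curves = UsdGeom.BasisCurves.Define(stage, "/Annotations/RedLine")
curves.CreateTypeAttr(UsdGeom.Tokens.linear)
curves.CreateCurveVertexCountsAttr([2])
curves.CreatePointsAttr([(0, 0, 0), (0, 10, 0)])

# Under the proposed CurveStyleAPI, `widths` would be reinterpreted as a
# screen space size, so only constant/uniform interpolation would apply.
curves.CreateWidthsAttr([2.0])  # would mean ~2 pixels, not 2 scene units
curves.SetWidthsInterpolation(UsdGeom.Tokens.constant)

# Hypothetical opt-in marker; the real proposal would apply an API schema.
curves.GetPrim().CreateAttribute(
    "style:screenSpaceWidths", Sdf.ValueTypeNames.Bool).Set(True)
```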
Member commented:

Turn this into a reference to the other proposal?

We similarly wonder whether all variations of the current set of `Curves` are needed to describe screen space visualizations, and, when adding support, how renderers and other consumers know which to prioritize to establish compatibility. A new axis of variation that transforms the existing geometric schemas complicates interchange and compatibility of the `UsdGeom` domain. We suggest a new schema domain with a new set of schemas may be the best place to drive development of nondiegetic primitives separate from their diegetic `UsdGeom` cousins.

## Proposal
Introduce a class of visualization primitives that are defined in a 3-D scene, but whose imaging definition is allowed to include nondiegetic screen effects, with a relaxed definition of boundability. Let's call them `Vprim`s (visualization primitives). Boundability as a schema partitioning mechanism is already employed in `UsdLux`, which classifies light schemas as either boundable (i.e. `RectLight` and `SphereLight`) or non-boundable (`DomeLight`).
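
The `UsdLux` partitioning can be observed directly with the USD Python API (a minimal sketch, assuming a USD release where lights derive from the boundable/non-boundable light base classes):

```python
from pxr import Usd, UsdGeom, UsdLux

stage = Usd.Stage.CreateInMemory()
rect = UsdLux.RectLight.Define(stage, "/Lights/Rect")
dome = UsdLux.DomeLight.Define(stage, "/Lights/Dome")

# RectLight derives from UsdGeomBoundable; DomeLight does not.
print(rect.GetPrim().IsA(UsdGeom.Boundable))  # True
print(dome.GetPrim().IsA(UsdGeom.Boundable))  # False
```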
Member commented:

One salient thing to note about this partitioning is the effect it may or may not have on usefully bounding a scene, in typical usage. Wrt lights, the non-boundable lights are typically aggregated somewhere off to the side of the "primary 3D scene hierarchy". If non-boundable non-diegetic prims will live alongside the models/primitives they are "related to", then we could commonly wind up with non-boundable scenes.

Contributor Author commented:

👍

* `VizPolylines`: Align with existing curve styling proposals
* `VizPaths`: Cubically interpolated curves that align with existing curve styling proposals
* `VizText`: Align with existing text schema proposals
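
As a purely hypothetical illustration of how a `VizPolylines` prim might be authored if such a typed schema existed (the type name comes from the list above, but the attributes shown are not registered anywhere):

```python
from pxr import Usd, Sdf

stage = Usd.Stage.CreateInMemory()

# "VizPolylines" is a proposed type name, not a registered schema;
# DefinePrim will happily author a typeName the registry doesn't know.
lines = stage.DefinePrim("/Viz/ExitPath", "VizPolylines")
lines.CreateAttribute("points", Sdf.ValueTypeNames.Point3fArray).Set(
    [(0, 0, 0), (5, 0, 0), (5, 0, 8)])
lines.CreateAttribute("curveVertexCounts", Sdf.ValueTypeNames.IntArray).Set([3])
# A screen space width in pixels, per the curve styling proposals.
lines.CreateAttribute("widths", Sdf.ValueTypeNames.FloatArray).Set([3.0])
```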

Member commented:

Is a Billboard nothing more than a textured Plane inside a VizScope that anchors to a partial scene coordinate system but is screen facing? I.e. is it a concern fully handled by VizScope (which I expected to see here in this list)?

Contributor Author commented:

It could be for simple uses. Some users want billboards to be vectorized, though, like points and curves. A billboard cloud might warrant its own distinct definition.

Look for a revision to this document that makes VizScope more prominent shortly.

### API Schema and Grouping Adapters
This proposal frames screen space points (and paths) visualization primitives as fundamentally different from their `Gprim` counterparts, warranting their own first-class `Vprim` typed schemas. Does this proposal set the precedent that other `Gprim` types like `Cube` or `Cylinder` require visualization equivalents if they want to be adapted into the "visualization" domain? We'd suggest no, and offer two solutions that have prior art in OpenUSD.

If a primitive retains its underlying definition, `UsdLux` already provides a solution to the problem of adapting `Gprim`s into another domain. Rather than a `MeshLight`, the `Mesh` `Gprim` is adapted via an API schema. `Gprim`s could be similarly "adapted" into the visualization domain. To avoid "schema slicing", an adaptation must not change the underlying definition or require its own distinct `purpose` or `visibility` tagging. Consideration must also be given to if or how these schemas affect interactive picking. A non-slicing visualization effect might include silhouette outlines, glows, or overlays applied to a `Gprim` to style or highlight its importance.
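
The `UsdLux` precedent is usable today; the visualization analogue is a sketch only, with `viz:outline:enabled` standing in for attributes a hypothetical `VizStyleAPI` schema might define:

```python
from pxr import Usd, UsdGeom, UsdLux, Sdf

stage = Usd.Stage.CreateInMemory()
mesh = UsdGeom.Mesh.Define(stage, "/World/Lamp")

# Prior art: adapt a Mesh Gprim into the lighting domain without
# changing its geometric definition.
UsdLux.MeshLightAPI.Apply(mesh.GetPrim())

# By analogy, a hypothetical VizStyleAPI could adapt the same Gprim
# into the visualization domain (attribute name is illustrative only).
mesh.GetPrim().CreateAttribute(
    "viz:outline:enabled", Sdf.ValueTypeNames.Bool).Set(True)
```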
Member commented:

You mentioned above that formalizing Viz geometry helps us clearly demarcate what should get rendered in a "post physically-based renderer" pass, and that's compelling and something to follow up on with Hydra folks. Thinking about the example non-slicing effects presented here, we can't simply say non-diegetic implies "overlay", as a silhouetted object that is partially obscured by other 3D geometry doesn't want a full silhouette rendered. But the MacBeth example does want to be an overlay (I think?). Is the determination of whether a Viz rendering "Style" leverages the depth buffer of the 3D scene tied to whether it is screen-space defined/anchored or not?

Contributor Author commented:

> rendered in a "post physically-based renderer" pass

Or a physically based renderer that handles these objects specially.

A reviewer commented:

I guess there's a philosophical question; practically, we'll implement overlay and silhouette as image-plane operations because we don't want them to participate in rendering, and after rendering is done the image plane is the best way to express things. Any kind of image operation can reasonably expect to get a certain amount of data, e.g. color, depth, id, etc., if needed. I think silhouette can be done as an image operation, as an edge-detect convolution on primId? And overlay can of course be done trivially as a primId match.

Internally we do some more complicated things where we render additional 3d geometry with custom compositing, e.g. drawing rotation widgets on top as transparent if they fail the depth test and as opaque if they pass the depth test. I don't know if we want to try to model that stuff here. In games that's something that can be modelled by materials, but since that kind of thing isn't physically based it's not captured by UsdShade. It's certainly useful for those kinds of prims to identify that they don't participate in light transport, at least, and I could imagine having hydra support for rendering 3d prims into the scene after the primary render (e.g. by running Viz prims through Storm and comping them in).
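
A minimal sketch of the edge-detect idea, assuming the renderer provides a primId AOV as a 2-D integer array (buffer names and shapes here are illustrative, not any particular renderer's API):

```python
import numpy as np

def silhouette_mask(prim_ids: np.ndarray) -> np.ndarray:
    """Mark pixels where the primId differs from a right/down neighbor."""
    mask = np.zeros(prim_ids.shape, dtype=bool)
    # Horizontal and vertical id discontinuities.
    mask[:, :-1] |= prim_ids[:, :-1] != prim_ids[:, 1:]
    mask[:-1, :] |= prim_ids[:-1, :] != prim_ids[1:, :]
    return mask

# Overlay selection is the trivial id match described above:
# overlay = prim_ids == selected_id
```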

@nvmkuruc (Contributor Author) commented Apr 17, 2024

Everything you said makes sense to me, Tom-- I'd expect that most styles would be implemented as image shaders that way.

I suspect there's probably going to eventually be a need for shaders to support more advanced effects, but I also suspect that there are a lot of concepts that benefit from a simplified initial specification. I'm hoping that we could start there.

Readers may agree that some of these nondiegetic primitives are distinct enough to warrant new schemas, but may reject the premise that screen independence is necessarily a fundamental property of `Gprim`s. The USD Glossary says `Gprim`s are just drawable, transformable, and boundable. The problems of how to bound, transform (and otherwise collide / intersect) screen space objects should be clarified and reconciled within the existing `Gprim` definition, rather than by introducing a new type. This is not the preferred approach of this proposal, which sees value in having a high-level classification for diegetic and nondiegetic primitives that clients can use to organize scenes and imaging pipelines around.

### Disaggregate Boundability out of `Gprim`
The proposal suggests that `Vprim`s require a relaxed definition of boundability. One could imagine that just as `visibility` was disaggregated out of `Gprim` into its own schema, boundability could be disaggregated as well, so that not every `Gprim` requires a traditional `extent`.
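
For context, `extent` is what makes a `Gprim` cheaply boundable today; a minimal sketch with the real USD API:

```python
from pxr import Usd, UsdGeom

stage = Usd.Stage.CreateInMemory()
cube = UsdGeom.Cube.Define(stage, "/World/Cube")

# Gprims carry an authored extent so consumers can bound them
# without evaluating the full geometry.
extent = UsdGeom.Boundable.ComputeExtentFromPlugins(
    cube, Usd.TimeCode.Default())
cube.GetExtentAttr().Set(extent)

bbox_cache = UsdGeom.BBoxCache(
    Usd.TimeCode.Default(), [UsdGeom.Tokens.default_])
print(bbox_cache.ComputeWorldBound(cube.GetPrim()).ComputeAlignedRange())
```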
Member commented:

Actually, visibility has not been pulled out of Imageable. There is a note in the class dox that we intended to do that at some point, but doing so while satisfying our goals for doing so would introduce a large, breaking change for any scenes with authored visibility, requiring fixbrokenpixarschemas to update...

Contributor Author commented:

Thanks for clarifying!

We think there might be other core properties (`doubleSided`) and behaviors that warrant removal from the "visualization" domain, so this isn't our preferred approach.

## Summary
To preserve compatibility, supportability, and interchange of existing `UsdGeom` schemas while extending USD's ability to describe nondiegetic visualizations, a new domain should be introduced. The `Vprim` domain would allow visualizations to be described in relation to both the screen and other primitives, and would avoid introducing schemas that might "slice" the current set of `Gprim`s.
Member commented:

Wouldn't it be the Viz domain rather than the Vprim domain?

Contributor Author commented:

Yes!

As currently described, visualization primitives should be strictly hidden via depth. However, one could imagine customizing hiding for these prims. Some visualizations might want to hide based on a particular reference point for an entire string of text. Some visualizations might want to express priority with respect to other objects in the scene. Hiding is another axis where the specification may eventually diverge from the `Gprim` space.

### Breaking the Fourth Wall
Some care needs to be taken to establish best practices and rules for visibility of screen-constrained primitives. While we would generally expect nondiegetic primitives not to participate in physically based lighting, consider screen-constrained "context spheres" as reference objects. They "break the fourth wall" and capture light and reflections. It should be possible to specify this behavior even when using nondiegetic transformations.
Member commented:

This one somewhat throws my thoughts about the proposal into disarray... it doesn't submit to the same clean "let a rasterizer handle it in a post-pass", and it also goes against the final section below, as a fully formed material specification is required to capture light and reflections...

Contributor Author commented:

Look for some clarifications in a revision shortly. My hope would be that we could establish some set of default behavior for objects that works for most visualizations and consider if additional schemas or configurations are needed to bridge the gaps. (It's possible the set of effects that can be described using schemas without a material network ends up being too narrow.)

A reviewer commented:

It does seem like there are a few entangled concepts here:
1. Objects whose position or size are dependent on the camera.
2. Objects that don't participate in light transport.
3. Possibly redundant with 1 & 2, but stylizations done in the image plane. Objects in categories 1 and 2 can be placed in the physical scene, if not in USD then at least in the renderer, and I suppose a silhouette could be constructed in the 3d scene, but that wouldn't be my choice for implementation.

If we handle concerns 1 & 2 separately, having screen-constrained primitives that participate in light transport is easy.

@nvmkuruc (Contributor Author) commented Apr 17, 2024

Agreed 100%-- I'm working on a revision as we speak that tries to formalize some of the ideas and came to the conclusion that it's better to keep some of these concerns separate.

I've worked through "anchors" right now, which I'm describing as a "nondiegetic break in a single diegetic space". That decouples it from any opinions about light transport.

Member commented:

Using my few functional brain cells to parse "nondiegetic break in a single diegetic space":

"nondiegetic break in a single diegetic space"

means

"a thing on the stage that is not part of the world represented by the stage". If that's correct, could your amendment parenthetically add the explanation for people who need more sleep?

@nvmkuruc (Contributor Author) commented Apr 17, 2024

Thanks Nick. I'll try to clarify that statement more in my revisions. The original idea of this document assumed that all nondiegetic effects were nonphysical and therefore should be excluded from lighting. Talking to Dhruv and others, narrower use cases emerged where it makes sense to have physically based materials and geometry constrained to view orientation.

So the revision for "anchoring" will introduce no constraints on lighting, which aligns with Tom's observation that a few assumptions and usages were entangled and could be decoupled.

@nvmkuruc (Contributor Author) commented Apr 9, 2024

> There's a lot I like here, and I would like to see some rendering-folk perspectives on it.
>
> Also:
>
> • Would Viz prims be minimally UsdGeomXformable? Or do they just refer/constrain to Xformable or Boundable prims?
> • Formatting-wise, if you use text-fill or add linebreaks yourself within paragraphs, it allows more precise commenting in the GitHub UI.

I think Vprims would be Xformable, yes.

@spiffmon mentioned this pull request Apr 11, 2024
### Schema Slicing
One important consideration when introducing schemas in this domain is
whether they are new drawables (`Vprim`s) or styles on top
of existing drawables (`Gprim` + `VizStyleAPI`).
@tcauchois commented Apr 17, 2024

I understand the concerns about schema slicing, but I also think that this approach will require (in the long term) a ton of duplicate schema definitions for everything in the Gprim hierarchy, which I don't love.

At the hydra level, I'd want to implement these with the same usdImaging and hydra code and set appropriate flags for xform computation and lighting participation based on Vprim/Gprim; I think this is relatively easy to express in hydra, and would save a lot of duplicated code that would be a pretty significant maintenance burden. If this necessitated duplicated usdImaging code, that would worry me a lot.

@nvmkuruc (Contributor Author) commented Apr 17, 2024

The intent is definitely not to duplicate the Gprim hierarchy. To me, the main grey area (with the current hierarchy) is how to handle screen dependent points and curves. With curves, there are already multiple schema definitions and they're already partitioned between those intended for production rendering (BasisCurves) and annotation and rigging purposes (HermiteCurves and NurbsCurves).

However, that doesn't necessarily mean that Vprim is the right answer to that specific problem. Without introducing a higher level Vprim class, you could disambiguate curves that are "wires" and "tubes/ribbons". Wires without widths would be "intrinsically boundable" by their points, and a "styling API" could be used to give wires nondiegetic effects like pixel widths. This would (perhaps narrowly) avoid a "slice" (like interpreting a single pixel width as a meter width) because consumers unaware of or unable to process styles would simply process the curve as an infinitely thin wire. I'd argue that the disambiguation between "wires" and "non-wires" would be best done at the schema level (i.e. Wires), but presence of widths and normals could be used as well. (I suspect that if "wire-ness" is not specified at the schema level, validators will flag curves authored without widths as errors, which is why I'd lean that way.)
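
A minimal sketch of the "intrinsically boundable by points" idea, using the real USD API (only the `style:pixelWidth` attribute is hypothetical):

```python
from pxr import Usd, UsdGeom, Gf, Sdf

stage = Usd.Stage.CreateInMemory()
wire = UsdGeom.BasisCurves.Define(stage, "/Viz/Wire")
wire.CreateTypeAttr(UsdGeom.Tokens.linear)
wire.CreateCurveVertexCountsAttr([3])
wire.CreatePointsAttr([(0, 0, 0), (1, 2, 0), (3, 2, 1)])
# No widths authored: an infinitely thin wire, bounded by its points.

extent = Gf.Range3f()
for p in wire.GetPointsAttr().Get():
    extent.UnionWith(Gf.Vec3f(p))
wire.CreateExtentAttr([extent.GetMin(), extent.GetMax()])

# Hypothetical styling attribute adding a nondiegetic pixel width;
# consumers that ignore it still see a valid thin wire.
wire.GetPrim().CreateAttribute(
    "style:pixelWidth", Sdf.ValueTypeNames.Float).Set(2.0)
```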

silhouette outlines or overlays applied to a `Gprim` to style or
highlight its importance.

### Example: `VizScreenAnchor`

A reviewer commented:

A note on implementation thoughts; I left a bunch of comments on the Autodesk billboards proposal along these lines:

My preferred way to do screen/camera anchoring is to augment the hydra "xform" definition to say that a transform is rooted in view space. This matches what GlfSimpleLight does for camera lights, and is consistent with the other flag we have in the hydra "xform" definition: resetXformStack, which can be thought of as saying that a transform is rooted at the origin of world space. The idea is that these xforms would flow to the renderer, and Storm (for example) could turn a view-relative transform into a world-relative transform in the shader, meaning that we don't need to recompute the transforms for each viewer in a multi-viewer app.

If we want to separate position anchoring, rotation anchoring, and scale anchoring, it's a bit more complicated but you can say (for example) that a matrix is translate-anchored to the camera and just compensate the translation as above.

I think it is cool and useful to let transforms below one of these camera-anchoring operations compose, since the composed transform would be (implied inverse view xform) * parent xform * child xform, and the child composed transform just inherits the flag indicating the implied inverse view xform.

This is all consistent with VizScreenAnchor being a scoping mechanism, and would potentially be with an xformOp as well.

Any kind of screen space positioning, scaling, styling is perfectly xformable and boundable in the renderer, of course, but absent a GfFrustum for context the UsdGeomXformCache/UsdGeomBboxCache become meaningless at some point in the tree, so for non-hydra consumers I suppose it's important to flag when that becomes the case. I think if XformCache/BboxCache had an option to take a GfFrustum, it could replicate the compensation that hydra is doing, and that would be a useful utility to provide.
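
A sketch of that compensation idea for non-hydra consumers, assuming a `GfFrustum` is supplied externally (this is not an existing XformCache option, just an illustration of the math in USD's row-vector convention):

```python
from pxr import Usd, UsdGeom, Gf

def view_anchored_local_to_world(prim, frustum, time=Usd.TimeCode.Default()):
    """Resolve a transform rooted in view space back into world space."""
    # Transform of the prim relative to its view-anchored subtree root;
    # approximated here by the plain local-to-world transform (a real
    # implementation would compute it relative to the anchoring scope).
    xf_cache = UsdGeom.XformCache(time)
    local_to_anchor = xf_cache.GetLocalToWorldTransform(prim)

    # The implied inverse view xform: view space -> world space.
    inverse_view = frustum.ComputeViewMatrix().GetInverse()
    return local_to_anchor * inverse_view
```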

Contributor Author commented:

I think consumers like hydra that can guarantee they have a single frustum have a lot of flexibility with how this gets implemented.

In my latest updates, I'm suggesting that transforms and bounds might want to be evaluated with respect to anchoring points-- for most prims that would be the pseudo-root and nothing would change, but you'd be able to identify the anchoring point and use whatever extrinsic context was needed (resolution, frustum, etc.).
