-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove all the uniforms #346
Conversation
@@ -152,13 +151,12 @@ impl Locals { | |||
impl Hash for Locals { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We shouldn't need to cache it any more: just upload the new data for every target.
@@ -152,13 +151,12 @@ impl Locals { | |||
impl Hash for Locals { | |||
fn hash<H: Hasher>(&self, state: &mut H) { | |||
self.transform_as_u32_slice().hash(state); | |||
self.uMode.hash(state); | |||
} | |||
} | |||
|
|||
impl PartialEq for Locals { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't be needed either
webrender/src/device/gfx/device.rs
Outdated
@@ -1373,9 +1339,7 @@ impl<B: hal::Backend> Device<B> { | |||
if self.bound_locals.uTransform != projection { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
all of this caching of the transform needs to go: instead we'll just need to update the data in a constant buffer that is bound per render target
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed the caching in 215e71b
webrender/src/device/gfx/device.rs
Outdated
self.bind_uniforms(new_locals); | ||
} | ||
self.bound_locals = new_locals; | ||
info!("Gfx backend has uMode uniform baked into the pipeline"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that would be too much spam, wouldn't it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll try to upstream a change where the uMode
is passed to program binding, which will help this code
webrender/src/device/gfx/program.rs
Outdated
..(SPECIALIZATION_CONSTANT_SIZE * (i + 1)) as _, | ||
.map(|(i, mode)| { | ||
specialization_data.push(vec![0; (SPECIALIZATION_CONSTANT_COUNT - 1) * SPECIALIZATION_CONSTANT_SIZE]); | ||
let mut constants: ArrayVec<[SpecializationConstant; SPECIALIZATION_CONSTANT_COUNT]> = SPECIALIZATION_FEATURES |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: you can omit the type of a element here, i.e . ArrayVec<[_; SPECIALIZATION_CONSTANT_COUNT]>
65c1a44
to
68cd26f
Compare
68cd26f
to
193a9a2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll need to do another full review pass after the current concerns are addressed
\t\t// an operation mode for this batch.\n\ | ||
\t\tint uMode;\n\ | ||
\t} pushConstants;\n", | ||
); | ||
new_data.push_str(&format!( | ||
"\tlayout(set = {}, binding = 0) uniform Locals {{\n\ | ||
\t\tuniform mat4 uTransform; // Orthographic projection\n\ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why are we still having the transform here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we have a misunderstanding here, looks like I didn't payed attention what you mentioned in #345
WebRender has it associated with the target, so we can safely move it into a constant buffer within the per-target descriptor set.
Unfortunately we don't have descriptor sets on a per target basis, we have the following groups: global (for each rendering loop), per WR pass and per draw. (The per-target variant would fit somewhere between second and third one.) That's the reason why I kept the uniform block around. If you think that getting rid of the uniform block and introducing a new descriptor binding with a new set index which points to a buffer could benefit us, I can make those changes as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can safely just adopt the "per pass" descriptor set and increase its frequency just a bit to bind whenever the projection happens. This is still relatively rare, i.e. dozens per frame, not thousands.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes that's sounds totally reasonable. Also I will add a more descriptive name to it.
webrender/src/device/gfx/buffer.rs
Outdated
buffer_usage: hal::buffer::Usage, | ||
data_stride: usize, | ||
buffers: ArrayVec<[Buffer<B>; MAX_FRAME_COUNT]>, | ||
offsets: ArrayVec<[usize; MAX_FRAME_COUNT]>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we combine these to into (Buffer<B>, BufferAddress)
pairs?
webrender/src/device/gfx/buffer.rs
Outdated
self.offset = 0; | ||
pub(super) fn buffer(&self, next_id: usize) -> (&B::Buffer, u64) { | ||
let ref buffer = self.buffers[next_id]; | ||
(&buffer.buffer, ((self.offsets[next_id] - 1) * buffer.stride) as u64) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why are we subtracting 1 here?
bound: bool, | ||
} | ||
|
||
struct DecsriptorBundle<B: hal::Backend> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure why we are having this descriptor re-usal infrastructure in this PR.
As we discussed on the call, this is supposed to be a single descriptor with dynamic offset (for the uniform buffer that carries out the projection matrix), which is just re-used constantly for many passes.
Yes, what I mentioned is still what we need to do.
So the question is - is that "uniform Locals" the constant buffer
associated with a render target (and changed at the render target
frequency)? If so, can we call it something more descriptive than "Locals"?
…-Dzmitry
On 1/16/20 2:01 PM, Gyula Zakor wrote:
***@***.**** commented on this pull request.
------------------------------------------------------------------------
In webrender/build_gfx.rs
<#346 (comment)>:
> @@ -341,20 +331,11 @@ fn extend_sampler_definition(
}
fn replace_non_sampler_uniforms(new_data: &mut String) {
- new_data.push_str(
- "\tlayout(push_constant) uniform PushConsts {\n\
- \t\tmat4 uTransform; // Orthographic projection\n\
- \t\t// A generic uniform that shaders can optionally use to configure\n\
- \t\t// an operation mode for this batch.\n\
- \t\tint uMode;\n\
- \t} pushConstants;\n",
- );
new_data.push_str(&format!(
"\tlayout(set = {}, binding = 0) uniform Locals {{\n\
\t\tuniform mat4 uTransform; // Orthographic projection\n\
I think we have a misunderstanding here, in #345
<#345> . you
mentionedWebRender has it associated with the target, so we can safely
move it into a constant buffer within the per-target descriptor set.
—
You are receiving this because your review was requested.
Reply to this email directly, view it on GitHub
<#346?email_source=notifications&email_token=AAA2GJLLXUB4365MLEIE5UDQ6CVIJA5CNFSM4KFZLRT2YY3PNVWWK3TUL52HS4DFWFIHK3DMKJSXC5LFON2FEZLWNFSXPKTDN5WW2ZLOORPWSZGOCSBTPHA#discussion_r367594605>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAA2GJOEU6YNU6Y4F34ZPK3Q6CVIJANCNFSM4KFZLRTQ>.
|
9d539c4
to
58e78c1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll need another pass at this :)
webrender/build_gfx.rs
Outdated
\t\tuniform mat4 uTransform; // Orthographic projection\n\ | ||
\t\t// A generic uniform that shaders can optionally use to configure\n\ | ||
\t\t// an operation mode for this batch.\n\ | ||
\t\tuniform int uMode;\n\ | ||
\t}};\n", | ||
DESCRIPTOR_SET_LOCALS |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is this parameter for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is for the format string: https://github.com/szeged/webrender/pull/346/files/58e78c1b0097e66e1d0656a0ff86e8eb0ddc3c25#diff-711aa321dbe62813ad158403a418e1d3R335
But changed the name here https://github.com/szeged/webrender/pull/346/files#diff-711aa321dbe62813ad158403a418e1d3R338
webrender/src/device/gfx/program.rs
Outdated
@@ -652,8 +665,8 @@ impl<B: hal::Backend> Program<B> { | |||
.into_iter() | |||
.chain(iter::once(desc_set_per_frame)) | |||
.chain(iter::once(desc_set_per_draw)) | |||
.chain(desc_set_locals), | |||
&[], | |||
.chain(iter::once(desc_set_locals)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this isn't right. The descriptor set carrying projection matrix as well as other per-target parameters should have lower index than the per-draw one, since it changes at less frequency
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed the ordering in d8716f7
@@ -652,8 +665,8 @@ impl<B: hal::Backend> Program<B> { | |||
.into_iter() | |||
.chain(iter::once(desc_set_per_frame)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are we calling this for every draw call? we definitely shouldn't :)
non_coherent_atom_size_mask: u64, | ||
) { | ||
let _ = self.buffer.update(device, data, self.buffer_offset as usize, non_coherent_atom_size_mask); | ||
self.buffer_offset += 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
where are we
// Maximum number of bound projections per frame. | ||
// Projections are bound on a per render target basis, so those should fit in the 96 limit. | ||
// We need this limit to define a fixed size for our uniform buffers. | ||
pub const PROJECTION_PER_FRAME: usize = 96; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
where are we checking if we hit this limit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added an assert for now https://github.com/szeged/webrender/pull/346/files#diff-5dbe2f57bb9273f9a363bcb71ec6834eR765
webrender/src/device/gfx/device.rs
Outdated
@@ -1309,6 +1328,7 @@ impl<B: hal::Backend> Device<B> { | |||
&[self.bound_projection], | |||
self.next_id | |||
); | |||
self.rebind_descriptors = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we even need set_uniforms
function? We know when we start working on the new target, so we can just unconditionally update the projection uniforms and rebind the descriptor, instead of doing it lazily here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No we don't , I will change according to that.
webrender/src/device/gfx/device.rs
Outdated
@@ -1331,6 +1351,7 @@ impl<B: hal::Backend> Device<B> { | |||
bound_sampler[1] = per_draw_bindings.1[1]; | |||
bound_sampler[2] = per_draw_bindings.1[2]; | |||
|
|||
self.rebind_descriptors = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this assignment here?
webrender/src/device/gfx/program.rs
Outdated
&[dynamic_offset], | ||
); | ||
if rebind_descriptors { | ||
cmd_buffer.bind_graphics_descriptor_sets( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems to me that we are still deferring to much work till this function. We need to aim for it to be just pass.draw()
basically, reducing as much cruft as possible.
For descriptor binding, can't we just do it higher level? I.e. the start of a frame binds its descriptor set and keeps it forever. The start of a target binds its descriptor sets and not touches it. When we break a batch, we can bind the per-draw descriptor set (even unconditionally).
P.S. also, aren't index buffers always the same? we should be just binding it once for the whole frame, ideally, instead of rebinding here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For descriptor binding, can't we just do it higher level? I.e. the start of a frame binds its descriptor set and keeps it forever. The start of a target binds its descriptor sets and not touches it. When we break a batch, we can bind the per-draw descriptor set (even unconditionally).
I think this would be only applicable if we only had one pipeline layout, but currently we have three. I tried to implement it, but things broke when I try to bind the per frame graphics descriptor sets (for descriptor slot 0), because those are different for each layout, and if we bind three different descriptor sets to the free different layouts first slot only the third one remains active and I get an error for the other two kind of pipeline layouts:
UNASSIGNED-CoreValidation-DrawState-DescriptorSetNotBound(ERROR / SPEC): msgNum: 0 - VkPipeline 0x27d uses set #0 but that set is not bound.
Objects: 1
[0] 0x22486f180c8, type: 6, name: NULL
On the other hand having only one pipeline layout would allow us to make this change also reduces the overhead on descriptor set lookup. The drawback of this would be that we would have to touch the shaders again to move those bindings in a shared shader part. Also I don't know the overhead having those in the shaders, even though they are unused.
I can make this change but I think that would be out of the scope of this PR. I will add this to the list in #335.
P.S. also, aren't index buffers always the same? we should be just binding it once for the whole frame, ideally, instead of rebinding here
Those are only related to debug shaders, so those don't have impact on the test benchmarking results. I will create an issue for this to not forget
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the detailed clarification!
When exactly do we have a switch of the pipeline layouts? I recall, for example, that you have a pipeline layout for the clip shader kind. I'd expect them to be together in the alpha pass. Same for the brush ones.
If we do switch the pipeline layout from time to time, we only need to rebind the descriptor sets starting from the first incompatible descriptor set layout. I.e. you can have the same descriptor set layout for slot [0] of all the used pipeline layouts and never need to re-bind it on a pipeline change.
So it still seems hazardly wrong to try rebinding all the sets on each batch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately the incompatible descriptor sets are int the first two slots. That's why I mentioned to change the shaders to have a common pipeline layout.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then we'd only change them on the pipeline layout change, which is still not expected to happen often.
@@ -1300,16 +1300,8 @@ impl<B: hal::Backend> Device<B> { | |||
pub fn set_uniforms( | |||
&mut self, | |||
_program_id: &ProgramId, | |||
projection: &default::Transform3D<f32>, | |||
_projection: &default::Transform3D<f32>, | |||
) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we could do some assert_eq!(mCurrentProjection, projection)
just to be sure
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright, thanks!
So we are leaving the binding refactor as a follow-up, right?
That's right. |
776540a
to
3f1c571
Compare
This adresses #345
WIP because these changes are not tested on macOS yet.This change is