forked from xenia-project/xenia
Update stfs-writer to the latest master #44
Open: epozzobon wants to merge 861 commits into emoose:stfs-writer from epozzobon:stfs-writer
Conversation
While the alpha of the texture data is not used at all (replaced with blue using the view swizzle), still make the shader code state the intention more explicitly if the format is decompressed for use as signed. Unsigned 1.0 is 0xFF, while signed 1.0 is 0x7F.
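The difference between the two encodings of 1.0 can be sketched as follows. This is a minimal illustration that assumes the usual normalized-integer conventions for 8-bit channels; `DecodeUnorm8` and `DecodeSnorm8` are hypothetical helper names, not functions from Xenia's shader code.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration; these are not Xenia functions.

// Unsigned normalized: 0x00 -> 0.0, 0xFF -> 1.0.
constexpr float DecodeUnorm8(uint8_t v) { return v / 255.0f; }

// Signed normalized (two's complement): 0x7F -> 1.0, 0x81 and 0x80 -> -1.0.
constexpr float DecodeSnorm8(uint8_t v) {
  return static_cast<int8_t>(v) < -127 ? -1.0f
                                       : static_cast<int8_t>(v) / 127.0f;
}

static_assert(DecodeUnorm8(0xFF) == 1.0f, "unsigned 1.0 is 0xFF");
static_assert(DecodeSnorm8(0x7F) == 1.0f, "signed 1.0 is 0x7F");
static_assert(DecodeUnorm8(0x7F) < 0.5f, "0x7F is just under 0.5 as unsigned");
```

The same byte therefore means different things depending on the sign mode, which is why a constant meant to represent 1.0 has to be chosen per mode.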
The resolution scale is now taken into account when copying from the mip tail.
Keep the current lane active as it may be needed for derivatives.
There's no limit on the number of memory exports in a shader on the real Xenos, and exports can be done anywhere, including in loops. Instead of deferring the exports to the end of the shader and assuming that export allocs are executed only once, Xenia now flushes exports when it reaches an alloc, when it reaches the end of the shader, or when a pixel with outstanding exports is killed. (On Xenos, memory exports are terminated by allocs as well as by individual ALU instructions with `serialize`. The `serialize` case is not handled, for simplicity, since it's only truly mandatory to flush memory exports before starting a new one.) To determine which eM# registers need to be flushed to memory, Xenia traverses the successors of each exec that may write any eM#, recording at each reached control flow instruction which eM# registers may have been written before it, until a flush point or the end of the shader is reached. Also, some games export to sub-32bpp formats. These are now supported via an atomic AND that clears the bits of the dword to be replaced, followed by an atomic OR that inserts the new byte or short.
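The sub-32bpp approach (atomic AND to clear, then atomic OR to insert) can be sketched on the CPU with `std::atomic`. This is an illustration only, not Xenia's shader code: `StoreByteInDword` is a hypothetical name, and numbering bytes from the least significant end is an assumption made for the example.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical helper: replace one byte of a shared 32-bit word using two
// atomic read-modify-write operations, leaving the other three bytes intact.
// byte_index 0 is the least significant byte (an assumption of this sketch).
inline void StoreByteInDword(std::atomic<uint32_t>& dword, unsigned byte_index,
                             uint8_t value) {
  assert(byte_index < 4);
  const unsigned shift = byte_index * 8;
  // Step 1: atomically clear the bits that are about to be replaced.
  dword.fetch_and(~(uint32_t{0xFF} << shift));
  // Step 2: atomically insert the new byte into the cleared bits.
  dword.fetch_or(uint32_t{value} << shift);
}
```

Neighbouring bytes written concurrently by other threads are never clobbered, because each step only touches the target bits. The pair is not a single atomic operation, though: another thread may briefly observe the target byte as zero between the two steps.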
There can be jumps across an exece, so the code beyond it may still be executed.
I don't know of any title that utilizes this instruction, but I went ahead and implemented it for completeness. Verified the implementation with `instr__gen_vaddcuw` from xenia-project#1348. Can be grabbed with:
```
git checkout origin/gen_tests -- src\xenia\cpu\ppc\testing\*vaddcuw.s
```
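For reference, the behaviour of `vaddcuw` (write the carry-out of each unsigned 32-bit addition) can be modelled in scalar code. This is a sketch of the instruction's semantics, not the implementation from this commit; `Vec4u` and `VaddcuwRef` are hypothetical names.

```cpp
#include <array>
#include <cstdint>

using Vec4u = std::array<uint32_t, 4>;

// Scalar reference model (not Xenia's JIT code): each result lane is 1 if
// a[i] + b[i] overflows 32 bits, and 0 otherwise.
inline Vec4u VaddcuwRef(const Vec4u& a, const Vec4u& b) {
  Vec4u r{};
  for (int i = 0; i < 4; ++i) {
    // Widen to 64 bits so the carry lands in bit 32.
    r[i] = static_cast<uint32_t>((uint64_t{a[i]} + b[i]) >> 32);
  }
  return r;
}
```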
Other half of xenia-project#2125. I don't know of any title that utilizes this instruction, but I went ahead and implemented it for completeness. Verified the implementation with `instr__gen_vsubcuw` from xenia-project#1348. Can be grabbed with:
```
git checkout origin/gen_tests -- src\xenia\cpu\ppc\testing\*vsubcuw.s
```
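`vsubcuw` writes the carry-out of `a + ~b + 1` per 32-bit lane, so a lane is 1 when the subtraction does not borrow (that is, when `a >= b`) and 0 otherwise. A scalar sketch of these semantics follows. It is not the implementation from this commit, and `Vec4u` and `VsubcuwRef` are hypothetical names.

```cpp
#include <array>
#include <cstdint>

using Vec4u = std::array<uint32_t, 4>;

// Scalar reference model (not Xenia's JIT code): each result lane is the
// carry-out of a[i] + ~b[i] + 1, which equals 1 exactly when a[i] >= b[i].
inline Vec4u VsubcuwRef(const Vec4u& a, const Vec4u& b) {
  Vec4u r{};
  for (int i = 0; i < 4; ++i) {
    const uint32_t not_b = ~b[i];
    // Widen to 64 bits so the carry lands in bit 32.
    r[i] = static_cast<uint32_t>((uint64_t{a[i]} + not_b + 1) >> 32);
  }
  return r;
}
```

Note that the result is a carry flag and not a borrow flag: equal inputs produce 1.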
AVX512 has native unsigned integer comparison instructions, removing the need to XOR the most significant bit with a constant in memory to use the signed comparison instructions. These instructions only write to a k-mask register though, and need an additional `vpmovm2*` instruction to turn the mask register into a vector mask.

As of Icelake:
- `vpcmpu*` is all L3/T1
- `vpmovm2d` is L1/T0.33
- `vpmovm2{b,w}` is L3/T0.33

As of Zen4:
- `vpcmpu*` is all L3/T0.50
- `vpmovm2*` is all L1/T0.25
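The XOR trick that the native instructions replace can be shown on a single scalar lane. This is a sketch of the idea only, not the emitter code from this commit, and `UnsignedGreaterViaSignBitFlip` is a hypothetical name: flipping the most significant bit of both operands maps unsigned order onto signed order, so a signed compare then gives the unsigned result.

```cpp
#include <cstdint>

// Hypothetical scalar model of the pre-AVX512 approach: emulate an unsigned
// "greater than" using only a signed comparison.
constexpr bool UnsignedGreaterViaSignBitFlip(uint32_t a, uint32_t b) {
  // XOR with 0x80000000 flips the sign bit, biasing both values by 2^31.
  return static_cast<int32_t>(a ^ 0x80000000u) >
         static_cast<int32_t>(b ^ 0x80000000u);
}

static_assert(UnsignedGreaterViaSignBitFlip(0x80000000u, 1u),
              "0x80000000 is larger than 1 when treated as unsigned");
static_assert(!UnsignedGreaterViaSignBitFlip(1u, 0x80000000u),
              "the reverse comparison is false");
```

In the vector version, the bias constant has to be loaded from memory and applied to both operands, which is the extra work the native unsigned compare avoids.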
Plus: limit it to 64 entries.
Thanks to Bo98 for pointing that out.
epozzobon force-pushed the stfs-writer branch 2 times, most recently from 00aba94 to 06ed9ab (July 8, 2023 10:39)
As mentioned in #43, I made an attempt at updating the stfs-writer branch to the latest commit on master.
This seems to work on the games I play, but I would appreciate some more testing.