Commit Graph

257 Commits

Author SHA1 Message Date
ReinUsesLisp a63a0daa5e gl_arb_decompiler: Implement an assembly shader decompiler
Emit code compatible with NV_gpu_program5.
This should emit code compatible with Fermi, but it wasn't tested on
that architecture. Pascal has some issues not present on Turing GPUs.
2020-06-11 22:12:07 -03:00
ReinUsesLisp abcea1bb18 rasterizer_cache: Remove files and includes
The rasterizer cache is no longer used. Each cache has its own generic
implementation optimized for the cached data.
2020-06-07 04:32:57 -03:00
ReinUsesLisp dc27252352 shader_cache: Implement a generic shader cache
Implement a generic shader cache for fast lookups and invalidations.
Invalidations are cheap but expensive when a shader is invalidated.

Use two mutexes instead of one to avoid locking invalidations for
lookups and vice versa. When a shader has to be removed, lookups are
locked as expected.
2020-06-07 04:32:32 -03:00
David Marcec b032ebdfee Implement macro JIT 2020-05-30 11:40:04 +10:00
David Marcec d0bdd26c26 Add xbyak external 2020-05-30 10:55:27 +10:00
ReinUsesLisp a2dcc642c1 map_interval: Add interval allocator and drop hack
Drop the std::list hack to allocate memory indefinitely.

Instead use a custom allocator that keeps references valid until
destruction. This allocates fixed chunks of memory and puts pointers in
a free list. When an allocation is no longer used put it back to the
free list, this doesn't heap allocate because std::vector doesn't change
the capacity. If the free list is empty, allocate a new chunk.
2020-05-21 16:44:00 -03:00
bunnei 41682e0888
Merge pull request #3815 from FernandoS27/command-list-2
GPU: More optimizations to GPU Command List Processing and DMA Copy Optimizations
2020-05-05 17:12:42 -04:00
Fernando Sahmkow 9df67b2095 Clang Format and Documentation. 2020-04-28 14:02:51 -04:00
ReinUsesLisp ddd82ef42b shader/memory_util: Deduplicate code
Deduplicate code shared between vk_pipeline_cache and gl_shader_cache as
well as shader decoder code.

While we are at it, fix a bug in gl_shader_cache where compute shaders
had an start offset of a stage shader.
2020-04-26 01:38:51 -03:00
bunnei bf2ddb8fd5
Merge pull request #3677 from FernandoS27/better-sync
Introduce Predictive Flushing and Improve ASYNC GPU
2020-04-22 22:09:38 -04:00
ReinUsesLisp b752faf2d3 vk_fence_manager: Initial implementation 2020-04-22 11:36:19 -04:00
Fernando Sahmkow 1f345ebe3a GPU: Implement a Fence Manager. 2020-04-22 11:36:10 -04:00
ReinUsesLisp 0e232cfdc1 renderer_vulkan: Integrate Nvidia Nsight Aftermath on Windows
Adds optional support for Nsight Aftermath. It is enabled through
ENABLE_NSIGHT_AFTERMATH in cmake. A path to the SDK has to be provided
by the environment variable NSIGHT_AFTERMATH_SDK.

Nsight Aftermath allows an application to generate "minidumps" of the
GPU state when a device loss happens. By analysing these on Nsight we
can know what a game was doing and why it triggered a device loss.

The dump is generated inside %APPDATA%\yuzu\log\gpucrash and this
directory is deleted every time a new instance is initialized with
Nsight enabled.

To enable it on yuzu there has a to be a driver and device capable of
running Nsight Aftermath on Vulkan. That means only Turing based GPUs
on the latest stable driver, beta drivers won't work for now.

It is manually enabled in Configuration>Debug>Enable Graphics Debugging
because when using all debugging capabilities there is a runtime cost.
2020-04-14 00:39:21 -03:00
ReinUsesLisp 2905142f47 renderer_vulkan: Drop Vulkan-Hpp 2020-04-10 22:49:02 -03:00
ReinUsesLisp d7db088180 video_core/texture: Use a LUT to convert sRGB texture borders
This is a reversed look up table extracted from
https://gist.github.com/rygorous/2203834#file-gistfile1-cpp-L41-L62

that is used in
04d4e9e587/source/maxwell/tsc_generate.cpp (L38)

Games usually bind 0xFD expecting a float texture border of 1.0f.
The conversion previous to this commit was multiplying the uint8 sRGB
texture border color by 255. This is close to 1.0f but when that
difference matters, some graphical glitches appear.

This look up table is manually changed in the edges, clamping towards
0.0f and 1.0f.

While we are at it, move this logic to its own translation unit.
2020-04-07 20:38:14 -03:00
ReinUsesLisp f5cee0e885 renderer_vulkan/wrapper: Add ToString function for VkResult 2020-03-27 03:21:03 -03:00
ReinUsesLisp 92c8d783b3 renderer_vulkan/wrapper: Add Vulakn wrapper and a span helper
The intention behind a Vulkan wrapper is to drop Vulkan-Hpp.

The issues with Vulkan-Hpp are:
- Regular breaks of the API.
- Copy constructors that do the same as the aggregates (fixed recently)
- External dynamic dispatch that is hard to remove
- Alias KHR handles with non-KHR handles making it impossible to use
smart handles on Vulkan 1.0 instances with extensions that were included
on Vulkan 1.1.
- Dynamic dispatchers silently change size depending on preprocessor
definitions. Different files will have different dispatch definitions,
generating all kinds of hard to debug memory issues.

In other words, Vulkan-Hpp is not "production ready" for our needs and
this wrapper aims to replace it without losing RAII and exception
safety.
2020-03-27 03:13:18 -03:00
ReinUsesLisp 3dcaa84ba4 shader/transform_feedback: Add host API friendly TFB builder 2020-03-13 18:33:04 -03:00
ReinUsesLisp e8efd5a901 video_core: Rename "const buffer locker" to "registry" 2020-03-09 18:40:06 -03:00
ReinUsesLisp bd8b9bbcee gl_shader_cache: Rework shader cache and remove post-specializations
Instead of pre-specializing shaders and then post-specializing them,
drop the later and only "specialize" the shader while decoding it.
2020-03-09 18:40:06 -03:00
ReinUsesLisp ac204754d4 dirty_flags: Deduplicate code between OpenGL and Vulkan 2020-02-28 17:56:43 -03:00
ReinUsesLisp 1bd95a314f vk_state_tracker: Initial implementation
Add support for render targets and viewports.
2020-02-28 17:56:43 -03:00
ReinUsesLisp eed789d0d1 video_core: Reintroduce dirty flags infrastructure 2020-02-28 17:56:41 -03:00
ReinUsesLisp b92dfcd7f2 gl_state: Remove completely 2020-02-28 17:56:35 -03:00
ReinUsesLisp 96ac3d518a gl_rasterizer: Remove dirty flags 2020-02-28 16:39:27 -03:00
ReinUsesLisp bcd348f238 vk_query_cache: Implement generic query cache on Vulkan 2020-02-14 17:38:27 -03:00
ReinUsesLisp c31382ced5 query_cache: Abstract OpenGL implementation
Abstract the current OpenGL implementation into the VideoCommon
namespace and reimplement it on top of that. Doing this avoids repeating
code and logic in the Vulkan implementation.
2020-02-14 17:38:27 -03:00
ReinUsesLisp 2b58652f08 maxwell_3d: Slow implementation of passed samples (query 21)
Implements GL_SAMPLES_PASSED by waiting immediately for queries.
2020-02-14 17:27:17 -03:00
bunnei c31ec00d67
Merge pull request #3337 from ReinUsesLisp/vulkan-staged
yuzu: Implement Vulkan frontend
2020-02-03 16:56:25 -05:00
ReinUsesLisp f92cbc5501 yuzu: Implement Vulkan frontend
Adds a Qt and SDL2 frontend for Vulkan. It also finishes the missing
bits on Vulkan initialization.
2020-01-29 17:53:11 -03:00
Fernando Sahmkow c921e496eb GPU: Implement guest driver profile and deduce texture handler sizes. 2020-01-24 16:43:29 -04:00
ReinUsesLisp f5dfe68a94 vk_blit_screen: Initial implementation
This abstraction takes care of presenting accelerated and
non-accelerated or "framebuffer" images to the Vulkan swapchain.
2020-01-19 21:12:43 -03:00
ReinUsesLisp fe5356d223 vk_rasterizer: Implement Vulkan's rasterizer
This abstraction is Vulkan's equivalent to OpenGL's rasterizer. It takes
care of joining all parts of the backend and rendering accordingly on
demand.
2020-01-16 23:05:15 -03:00
ReinUsesLisp 38e789c761 renderer_vulkan: Add header as placeholder 2020-01-16 22:54:15 -03:00
ReinUsesLisp 09e17fbb0f vk_texture_cache: Implement generic texture cache on Vulkan
It currently ignores PBO linearizations since these should be dropped as
soon as possible on OpenGL.
2020-01-13 20:37:50 -03:00
ReinUsesLisp 908e085d02 vk_compute_pass: Add compute passes to emulate missing Vulkan features
This currently only supports quad arrays and u8 indices.

In the future we can remove quad arrays with a table written from the
CPU, but this was used to bootstrap the other passes helpers and it
was left in the code.

The blob code is generated from the "shaders/" directory. Read the
instructions there to know how to generate the SPIR-V.
2020-01-08 19:24:26 -03:00
ReinUsesLisp 82a64da077 vk_shader_util: Add helper to build SPIR-V shaders 2020-01-08 19:22:20 -03:00
ReinUsesLisp 2effdeb924 vk_graphics_pipeline: Initial implementation
This abstractio represents the state of the 3D engine at a given draw.
Instead of changing individual bits of the pipeline how it's done in
APIs like D3D11, OpenGL and NVN; on Vulkan we are forced to put
everything together into a single, immutable object.

It takes advantage of the few dynamic states Vulkan offers.
2020-01-06 22:02:26 -03:00
ReinUsesLisp dc96a59fa0 vk_compute_pipeline: Initial implementation
This abstraction represents a Vulkan compute pipeline.
2020-01-06 22:02:26 -03:00
ReinUsesLisp b392a5986e vk_pipeline_cache: Add file and define descriptor update template filler
This function allows us to share code between compute and graphics
pipelines compilation.
2020-01-06 22:02:26 -03:00
ReinUsesLisp 9c548146ca vk_rasterizer: Add placeholder 2020-01-06 22:02:26 -03:00
ReinUsesLisp 5aeff9aff5 vk_renderpass_cache: Initial implementation
The renderpass cache is used to avoid creating renderpasses on each
draw. The hashed structure is not currently optimized.
2020-01-06 18:28:32 -03:00
ReinUsesLisp 322d6a0311 vk_update_descriptor: Initial implementation
The update descriptor is used to store in flat memory a large chunk of
staging data used to update descriptor sets through templates. It
provides a push interface to easily insert descriptors following the
current pipeline. The order used in the descriptor update template has
to be implicitly followed. We can catch bugs here using validation
layers.
2020-01-06 18:28:32 -03:00
Fernando Sahmkow 56e450a3f7
Merge pull request #3264 from ReinUsesLisp/vk-descriptor-pool
vk_descriptor_pool: Initial implementation
2020-01-05 15:54:41 -04:00
ReinUsesLisp 0d6d8129c4 yuzu: Remove Maxwell debugger
This was carried from Citra and wasn't really used on yuzu. It also adds
some runtime overhead. This commit removes it from yuzu's codebase.
2020-01-02 23:09:44 -03:00
ReinUsesLisp 1fe7df4517 vk_descriptor_pool: Initial implementation
Create a large descriptor pool where we allocate all our descriptors
from. It has to be wide enough to support any pipeline, hence its large
numbers.

If the descritor pool is filled, we allocate more memory at that moment.
This way we can take advantage of permissive drivers like Nvidia's that
allocate more descriptors than what the spec requires.
2020-01-01 16:44:06 -03:00
Fernando Sahmkow 7bd447355f
Merge pull request #3248 from ReinUsesLisp/vk-image
vk_image: Add an image object abstraction
2019-12-30 14:25:14 -04:00
ReinUsesLisp 3813af2f3c
vk_staging_buffer_pool: Add a staging pool for temporary operations
The job of this abstraction is to provide staging buffers for temporary
operations. Think of image uploads or buffer uploads to device memory.

It automatically deletes unused buffers.
2019-12-25 18:12:17 -03:00
ReinUsesLisp c83bf7cd1e
vk_image: Add an image object abstraction
This object's job is to contain an image and manage its transitions.
Since Nvidia hardware doesn't know what a transition is but Vulkan
requires them anyway, we have to state track image subresources
individually.

To avoid the overhead of tracking each subresource in images with many
subresources (think of cubemap arrays with several mipmaps), this commit
tracks when subresources have diverged. As long as this doesn't happen
we can check the state of the first subresource (that will be shared
with all subresources) and update accordingly.

Image transitions are deferred to the scheduler command buffer.
2019-12-25 18:00:16 -03:00
ReinUsesLisp 4a3026b16b
fixed_pipeline_state: Define structure and loaders
The intention behind this hasheable structure is to describe the state
of fixed function pipeline state that gets compiled to a single graphics
pipeline state object. This is all dynamic state in OpenGL but Vulkan
wants it in an immutable state, even if hardware can edit it freely.

In this commit the structure is defined in an optimized state (it uses
booleans, has paddings and many data entries that can be packed to
single integers). This is intentional as an initial implementation that
is easier to debug, implement and review. It will be optimized in later
stages, or it might change if Vulkan gets more dynamic states.
2019-12-22 22:59:11 -03:00
ReinUsesLisp c8a48aacc0
video_core: Unify ProgramType and ShaderStage into ShaderType 2019-11-22 21:28:48 -03:00
ReinUsesLisp 80eacdf89b
texture_cache: Use a table instead of switch for texture formats
Use a large flat array to look up texture formats. This allows us to
properly implement formats with different component types. It should
also be faster.
2019-11-14 20:57:10 -03:00
Rodrigo Locatti fb9418798d
video_core: Enable sign conversion warnings
Enable sign conversion warnings but don't treat them as errors.
2019-11-11 18:00:37 -03:00
ReinUsesLisp 18c1cb68fd video_core: Treat implicit conversions as errors 2019-11-08 22:49:39 +00:00
ReinUsesLisp bd2aff3e26
rasterizer_accelerated: Add intermediary for GPU rasterizers
Add an intermediary class that implements common functions across GPU
accelerated rasterizers. This avoids code repetition on different
backends.
2019-10-27 03:40:08 -03:00
Fernando Sahmkow 1a58f45d76 VideoCore: Unify const buffer accessing along engines and provide ConstBufferLocker class to shaders. 2019-10-25 09:01:29 -04:00
Fernando Sahmkow 47e4f6a52c Shader_Ir: Refactor Decompilation process and allow multiple decompilation modes. 2019-10-04 18:52:50 -04:00
Fernando Sahmkow 8be6e1c522 shader_ir: Corrections to outward movements and misc stuffs 2019-10-04 18:52:48 -04:00
Fernando Sahmkow c17953978b shader_ir: Initial Decompile Setup 2019-10-04 18:52:47 -04:00
bunnei e424615839
Merge pull request #2783 from FernandoS27/new-buffer-cache
Implement a New LLE Buffer Cache
2019-08-29 13:07:01 -04:00
ReinUsesLisp 4e35177e23 shader_ir: Implement VOTE
Implement VOTE using Nvidia's intrinsics. Documentation about these can
be found here
https://developer.nvidia.com/reading-between-threads-shader-intrinsics

Instead of using portable ARB instructions I opted to use Nvidia
intrinsics because these are the closest we have to how Tegra X1
hardware renders.

To stub VOTE on non-Nvidia drivers (including nouveau) this commit
simulates a GPU with a warp size of one, returning what is meaningful
for the instruction being emulated:

* anyThreadNV(value) -> value
* allThreadsNV(value) -> value
* allThreadsEqualNV(value) -> true

ballotARB, also known as "uint64_t(activeThreadsNV())", emits

VOTE.ANY Rd, PT, PT;

on nouveau's compiler. This doesn't match exactly to Nvidia's code

VOTE.ALL Rd, PT, PT;

Which is emulated with activeThreadsNV() by this commit. In theory this
shouldn't really matter since .ANY, .ALL and .EQ affect the predicates
(set to PT on those cases) and not the registers.
2019-08-21 14:50:38 -03:00
Fernando Sahmkow 862bec001b Video_Core: Implement a new Buffer Cache 2019-08-21 12:14:22 -04:00
bunnei 3477b92289
Merge pull request #2675 from ReinUsesLisp/opengl-buffer-cache
buffer_cache: Implement a generic buffer cache and its OpenGL backend
2019-07-14 19:03:43 -04:00
Fernando Sahmkow 8af6e6a052 shader_ir: Implement a new shader scanner 2019-07-09 08:14:36 -04:00
ReinUsesLisp 32c0212b24 buffer_cache: Implement a generic buffer cache
Implements a templated class with a similar approach to our current
generic texture cache. It is designed to be compatible with Vulkan and
OpenGL,
2019-07-06 00:37:55 -03:00
ReinUsesLisp 345f852bdb gl_rasterizer: Drop gl_global_cache in favor of gl_buffer_cache 2019-07-06 00:37:55 -03:00
ReinUsesLisp 06c4ce8645 shader: Decode SUST and implement backing image functionality 2019-06-20 21:38:33 -03:00
ReinUsesLisp 9098905dd1 gl_framebuffer_cache: Use a hashed struct to cache framebuffers 2019-06-20 21:36:12 -03:00
ReinUsesLisp 1b4503c571 texture_cache: Split texture cache into different files 2019-06-20 21:36:11 -03:00
ReinUsesLisp bab21e8cb3 gl_texture_cache: Initial implementation 2019-06-20 21:36:11 -03:00
ReinUsesLisp 2f2a61887a video_core/engines: Move ConstBufferInfo out of Maxwell3D 2019-06-07 19:47:15 -03:00
Zach Hilman de33ad25f5
Merge pull request #2514 from ReinUsesLisp/opengl-compat
video_core: Drop OpenGL core in favor of OpenGL compatibility
2019-06-07 17:23:25 -04:00
ReinUsesLisp e1b3be7ced shader: Move Node declarations out of the shader IR header
Analysis passes do not have a good reason to depend on shader_ir.h to
work on top of nodes. This splits node-related declarations to their own
file and leaves the IR in shader_ir.h
2019-06-06 20:02:37 -03:00
ReinUsesLisp bf4dfb3ad4 shader: Use shared_ptr to store nodes and move initialization to file
Instead of having a vector of unique_ptr stored in a vector and
returning star pointers to this, use shared_ptr. While changing
initialization code, move it to a separate file when possible.

This is a first step to allow code analysis and node generation beyond
the ShaderIR class.
2019-06-05 20:41:52 -03:00
ReinUsesLisp df509486c4 gl_rasterizer: Use GL_QUADS to emulate quads rendering 2019-05-30 13:21:01 -03:00
bunnei c27b81cb85
Merge pull request #2429 from FernandoS27/compute
Corrections and Implementation on GPU Engines
2019-05-09 13:19:22 -04:00
bunnei 4fad91ca45
Merge pull request #2383 from ReinUsesLisp/aoffi-test
gl_shader_decompiler: Disable variable AOFFI on unsupported devices
2019-04-22 22:14:02 -04:00
Fernando Sahmkow a91d3fc639 Revamp Kepler Memory to use a subegine to manage uploads 2019-04-22 18:50:56 -04:00
bunnei 4294062516
Merge pull request #2318 from ReinUsesLisp/sampler-cache
gl_sampler_cache: Port sampler cache to OpenGL
2019-04-17 21:45:56 -04:00
bunnei ea80e2bc57
Merge pull request #2235 from ReinUsesLisp/spirv-decompiler
vk_shader_decompiler: Implement a SPIR-V decompiler
2019-04-11 21:54:23 -04:00
bunnei 6951741a94
Merge pull request #2278 from ReinUsesLisp/vc-texture-cache
video_core: Implement API agnostic view based texture cache
2019-04-10 21:17:35 -04:00
ReinUsesLisp 0032821864 gl_device: Implement interface and add uniform offset alignment 2019-04-10 15:56:12 -03:00
ReinUsesLisp ad53b233c5 vk_shader_decompiler: Declare and stub interface for a SPIR-V decompiler 2019-04-10 14:20:25 -03:00
ReinUsesLisp 970d9e57c8 video_core: Add sirit as optional dependency with Vulkan
sirit is a runtime assembler for SPIR-V
2019-04-10 14:20:25 -03:00
bunnei d6374b2522
Merge pull request #2093 from FreddyFunk/disk-cache-better-compression
Better LZ4 compression utilization for the disk based shader cache and the yuzu build system
2019-04-03 21:50:29 -04:00
ReinUsesLisp 576ad9a012 gl_sampler_cache: Port sampler cache to OpenGL 2019-04-02 16:58:08 -03:00
ReinUsesLisp c5047540c9 video_core: Abstract vk_sampler_cache into a templated class 2019-04-02 15:54:11 -03:00
unknown 798d76f4c7 data_compression: Move LZ4 compression from video_core/gl_shader_disk_cache to common/data_compression 2019-03-29 16:42:19 +01:00
ReinUsesLisp 746dab407e vk_swapchain: Implement a swapchain manager 2019-03-29 00:00:51 -03:00
ReinUsesLisp d708d03d20 video_core: Implement API agnostic view based texture cache
Implements an API agnostic texture view based texture cache. Classes
defined here are intended to be inherited by the API implementation and
used in API-specific code.

This implementation exposes protected virtual functions to be called
from the implementer.

Before executing any surface copies methods (defined in API-specific code)
it tries to detect if the overlapping surface is a superset and if it
is, it creates a view. Views are references of a subset of a surface, it
can be a superset view (the same as referencing the whole texture).
Current code manages 1D, 1D array, 2D, 2D array, cube maps and cube map
arrays with layer and mipmap level views. Texture 3D slices views are
not implemented.

If the view attempt fails, the fast path is invoked with the overlapping
textures (defined in the implementer). If that one fails (returning
nullptr) it will flush and reload the texture.
2019-03-22 13:34:04 -03:00
ReinUsesLisp aa59d77c3b vk_sampler_cache: Implement a sampler cache 2019-03-12 20:20:57 -03:00
bunnei 633ce92908
Merge pull request #2147 from ReinUsesLisp/texture-clean
shader_ir: Remove "extras" from the MetaTexture
2019-03-10 17:28:36 -04:00
bunnei 1143923cdd
Merge pull request #2191 from ReinUsesLisp/maxwell-to-vk
maxwell_to_vk: Initial implementation
2019-03-08 11:51:08 -05:00
bunnei 4f352833a5
Merge pull request #2055 from bunnei/gpu-thread
Asynchronous GPU command processing
2019-03-07 10:41:53 -05:00
bunnei 076c76f4e4
Merge pull request #2149 from ReinUsesLisp/decoders-style
gl_rasterizer_cache: Move format conversion functions to their own file
2019-03-06 21:56:20 -05:00
bunnei aaa373585c gpu: Refactor a/synchronous implementations into their own classes. 2019-03-06 21:48:57 -05:00
bunnei 7b574f406b gpu: Move command processing to another thread. 2019-03-06 21:48:57 -05:00
ReinUsesLisp 1f6571b3de maxwell_to_vk: Initial implementation 2019-03-04 04:06:05 -03:00
ReinUsesLisp 35c105a108 vk_buffer_cache: Implement a buffer cache
This buffer cache is just like OpenGL's buffer cache with some minor
style changes. It uses VKStreamBuffer.
2019-03-01 17:33:36 -03:00
ReinUsesLisp 0ad3c031f4 gl_rasterizer_cache: Move format conversion to its own file 2019-02-26 20:08:27 -03:00