Commit Graph

11249 Commits

Author SHA1 Message Date
Lioncash
0d8ef2d3b9 common/swap: Improve codegen of the default swap fallbacks
Uses arithmetic that can be identified more trivially by compilers for
optimizations. e.g. Rather than shifting the halves of the value and
then swapping and combining them, we can swap them in place.

e.g. for the original swap32 code on x86-64, clang 8.0 would generate:

    mov     ecx, edi
    rol     cx, 8
    shl     ecx, 16
    shr     edi, 16
    rol     di, 8
    movzx   eax, di
    or      eax, ecx
    ret

while GCC 8.3 would generate the ideal:

    mov     eax, edi
    bswap   eax
    ret

now both generate the same optimal output.

MSVC used to generate the following with the old code:

    mov     eax, ecx
    rol     cx, 8
    shr     eax, 16
    rol     ax, 8
    movzx   ecx, cx
    movzx   eax, ax
    shl     ecx, 16
    or      eax, ecx
    ret     0

Now MSVC also generates a similar, but equally optimal result as clang/GCC:

    bswap   ecx
    mov     eax, ecx
    ret     0

====

In the swap64 case, for the original code, clang 8.0 would generate:

    mov     eax, edi
    bswap   eax
    shl     rax, 32
    shr     rdi, 32
    bswap   edi
    or      rax, rdi
    ret

(almost there, but still missing the mark)

while, again, GCC 8.3 would generate the more ideal:

    mov     rax, rdi
    bswap   rax
    ret

now clang also generates the optimal sequence for this fallback as well.

This is a case where MSVC unfortunately falls short, despite the new
code, this one still generates a doozy of an output.

    mov     r8, rcx
    mov     r9, rcx
    mov     rax, 71776119061217280
    mov     rdx, r8
    and     r9, rax
    and     edx, 65280
    mov     rax, rcx
    shr     rax, 16
    or      r9, rax
    mov     rax, rcx
    shr     r9, 16
    mov     rcx, 280375465082880
    and     rax, rcx
    mov     rcx, 1095216660480
    or      r9, rax
    mov     rax, r8
    and     rax, rcx
    shr     r9, 16
    or      r9, rax
    mov     rcx, r8
    mov     rax, r8
    shr     r9, 8
    shl     rax, 16
    and     ecx, 16711680
    or      rdx, rax
    mov     eax, -16777216
    and     rax, r8
    shl     rdx, 16
    or      rdx, rcx
    shl     rdx, 16
    or      rax, rdx
    shl     rax, 8
    or      rax, r9
    ret     0

which is pretty unfortunate.
2019-04-12 00:07:39 -04:00
Lioncash
66b73fd399 common/swap: Mark byte swapping free functions with [[nodiscard]] and noexcept
Allows the compiler to inform when the result of a swap function is
being ignored (which is 100% a bug in all usage scenarios). We also mark
them noexcept to allow other functions using them to be able to be
marked as noexcept and play nicely with things that potentially inspect
"nothrowability".
2019-04-11 20:42:44 -04:00
Lioncash
9cb4b7be40 common/swap: Simplify swap function ifdefs
Including every OS' own built-in byte swapping functions is kind of
undesirable, since it adds yet another build path to ensure compilation
succeeds on.

Given we only support clang, GCC, and MSVC for the time being, we can
utilize their built-in functions directly instead of going through the
OS's API functions.

This shrinks the overall code down to just

if (msvc)
  use msvc's functions
else if (clang or gcc)
  use clang/gcc's builtins
else
  use the slow path
2019-04-11 20:36:19 -04:00
Lioncash
598954436f common/swap: Remove 32-bit ARM path
We don't plan to support host 32-bit ARM execution environments, so this
is essentially dead code.
2019-04-11 20:15:47 -04:00
bunnei
2598433f9c
Merge pull request #2354 from lioncash/header
video_core/texures/texture: Remove unnecessary includes
2019-04-09 19:19:41 -04:00
bunnei
61f63bb994
Merge pull request #1957 from DarkLordZach/title-provider
file_sys: Provide generic interface for accessing game data
2019-04-09 19:16:37 -04:00
bunnei
353a099481
Merge pull request #2366 from FernandoS27/xmad-fix
Correct XMAD mode, psl and high_b on different encodings.
2019-04-09 19:15:01 -04:00
bunnei
1a3098f11a
Merge pull request #2132 from FearlessTobi/port-4437
Port citra-emu/citra#4437: "citra-qt: Make hotkeys configurable via the GUI (Attempt 2)"
2019-04-09 18:08:30 -04:00
bunnei
71182643f7
Merge pull request #2370 from lioncash/qt-warn
yuzu/loading_screen: Resolve runtime Qt string formatting warnings
2019-04-09 17:21:18 -04:00
bunnei
bc7e149835
Merge pull request #2369 from FernandoS27/mip-align
gl_backend: Align Pixel Storage
2019-04-09 17:20:43 -04:00
bunnei
088c7c1bb5
Merge pull request #2368 from FernandoS27/fix-lop
Correct LOP_IMM encoding
2019-04-09 17:19:56 -04:00
Hexagon12
b81260c65c
Merge pull request #2371 from lioncash/pagetable
kernel/process: Set page table when page table resizes occur.
2019-04-09 20:13:37 +03:00
Lioncash
2abf979c35 kernel/process: Set page table when page table resizes occur.
We need to ensure dynarmic gets a valid pointer if the page table is
resized (the relevant pointers would be invalidated in this scenario).

In this scenario, the page table can be resized depending on what kind
of address space is specified within the NPDM metadata (if it's
present).
2019-04-09 13:00:56 -04:00
Lioncash
b73e433dff yuzu/loading_screen: Resolve runtime Qt string formatting warnings
In our error console, when loading a game, the strings:

QString::arg: Argument missing: "Loading...", 0
QString::arg: Argument missing: "Launching...", 0

would occasionally pop up when the loading screen was running. This was
due to the strings being assumed to have formatting indicators in them,
however only two out of the four strings actually have them.

This only applies the arguments to the strings that have formatting
specifiers provided, which avoids these warnings from occurring.
2019-04-09 10:49:38 -04:00
Fernando Sahmkow
9f16833097 gl_backend: Align Pixel Storage
This commit makes sure GL reads on the correct pack size for the
respective texture buffer.
2019-04-08 17:16:02 -04:00
Fernando Sahmkow
5c55ae4e18 Correct LOP_IMN encoding 2019-04-08 13:39:12 -04:00
Fernando Sahmkow
16adc735a5 Correct XMAD mode, psl and high_b on different encodings. 2019-04-08 13:01:17 -04:00
bunnei
f14328bf0a
Merge pull request #2300 from FernandoS27/null-shader
shader_cache: Permit a Null Shader in case of a bad host_ptr.
2019-04-07 17:58:27 -04:00
bunnei
c2fee0e519
Merge pull request #2355 from ReinUsesLisp/sync-point
maxwell_3d: Reduce severity of ProcessSyncPoint
2019-04-07 17:56:11 -04:00
bunnei
06ece52cfe
Merge pull request #2359 from FearlessTobi/port-2-prs
Port citra-emu/citra#4718: "fix clang-format target when using a path with spaces on windows"
2019-04-07 17:54:57 -04:00
bunnei
8aaf418bd6
Merge pull request #2306 from ReinUsesLisp/aoffi
shader_ir: Implement AOFFI for TEX and TLD4
2019-04-07 17:52:30 -04:00
bunnei
3c1ce290d0
Merge pull request #2361 from lioncash/pagetable
core/memory: Minor simplifications to page table management
2019-04-07 17:50:31 -04:00
bunnei
6b18a1592f
Merge pull request #2321 from ReinUsesLisp/gl-state-rework
gl_state: Rework to enable individual applies
2019-04-07 17:50:07 -04:00
bunnei
21a4e7deea
Merge pull request #2098 from FreddyFunk/disk-cache-zstd
gl_shader_disk_cache: Use Zstandard for compression
2019-04-07 17:48:33 -04:00
bunnei
52ad5fa0e8
Merge pull request #2356 from lioncash/pair
kernel/{server_port, server_session}: Return pairs instead of tuples from pair creation functions
2019-04-07 17:48:00 -04:00
bunnei
d9b1c24f4f
Merge pull request #2362 from lioncash/enum
core/memory: Remove unused enum constants
2019-04-07 17:46:09 -04:00
bunnei
80162888e6
Merge pull request #2352 from bunnei/mem-manager-fixes
memory_manager: Improved implementation of read/write/copy block.
2019-04-07 17:44:59 -04:00
Fernando Sahmkow
021cd56bc9 Permit a Null Shader in case of a bad host_ptr. 2019-04-07 07:52:01 -04:00
Lioncash
36a1e6a982 core/memory: Remove unused enum constants
These are holdovers from Citra and can be removed.
2019-04-07 03:04:55 -04:00
Lioncash
abae7577d2 core/memory: Remove GetCurrentPageTable()
Now that nothing actually touches the internal page table aside from the
memory subsystem itself, we can remove the accessor to it.
2019-04-07 02:47:37 -04:00
Lioncash
a6a82bb004 arm/arm_dynarmic: Remove unnecessary current_page_table member
Given the page table will always be guaranteed to be that of whatever
the current process is, we no longer need to keep this around.
2019-04-07 02:43:51 -04:00
Lioncash
e779686a76 kernel: Handle page table switching within MakeCurrentProcess()
Centralizes the page table switching to one spot, rather than making
calling code deal with it everywhere.
2019-04-07 01:12:54 -04:00
khang06
945e39471d fix clang-format target when using a path with spaces on windows 2019-04-07 02:10:01 +02:00
Lioncash
7a7ffa602d kernel/server_session: Return a std::pair from CreateSessionPair()
Keeps the return type consistent with the function name. While we're at
it, we can also reduce the amount of boilerplate involved with handling
these by using structured bindings.
2019-04-06 01:42:03 -04:00
Lioncash
04d265562f kernel/server_port: Return a std::pair from CreatePortPair()
Returns the same type that the function name describes.
2019-04-06 01:36:53 -04:00
ReinUsesLisp
ddcb711ee8 maxwell_3d: Reduce severity of ProcessSyncPoint 2019-04-06 02:18:20 -03:00
Lioncash
89c106e31b video_core/textures/convert: Replace include with a forward declaration
Avoids dragging in a direct dependency in a header.
2019-04-06 00:14:36 -04:00
Lioncash
fbf452ab0e video_core/texures/texture: Remove unnecessary includes
Nothing in this header relies on common_funcs or the memory manager.

This gets rid of reliance on indirect inclusions in the OpenGL caches.
2019-04-06 00:03:35 -04:00
bunnei
864280fabc
Merge pull request #2317 from FernandoS27/sync
Implement SyncPoint Register in the GPU.
2019-04-05 23:50:54 -04:00
bunnei
7d1c0fd1ad
Merge pull request #2325 from lioncash/name
kernel/server_session: Provide a GetName() override
2019-04-05 23:48:13 -04:00
bunnei
fddafa14c8
Merge pull request #2342 from lioncash/warning
common/multi_level_queue: Silence truncation warnings
2019-04-05 23:47:27 -04:00
bunnei
54c7e8e40e
Merge pull request #2240 from FearlessTobi/port-4651
Port citra-emu/citra#4651: "gdbstub: Fix some bugs in IsMemoryBreak() and ServeBreak. Add workaround to let watchpoints break into GDB."
2019-04-05 23:46:37 -04:00
bunnei
e3402d976d
Merge pull request #2346 from lioncash/header
video_core/engines: Remove unnecessary inclusions where applicable
2019-04-05 23:44:27 -04:00
bunnei
20be92d5e6 memory_manager: Improved implementation of read/write/copy block.
- Fixes graphical issues with Chocobo's Mystery Dungeon EVERY BUDDY!
- Fixes a crash with Mario Tennis Aces
2019-04-05 23:43:34 -04:00
bunnei
89b8801a97
Merge pull request #2350 from lioncash/vmem
video_core/memory_manager: Mark a few member functions with the const qualifier
2019-04-05 23:40:54 -04:00
bunnei
00207cc965
Merge pull request #2340 from lioncash/view
file_sys/fsmitm_romfsbuild: Utilize a string_view in romfs_calc_path_hash
2019-04-05 23:40:16 -04:00
bunnei
e86b26cd2b
Merge pull request #2334 from lioncash/override
core: Add missing override specifiers where applicable
2019-04-05 23:39:52 -04:00
bunnei
41890a84be
Merge pull request #2347 from lioncash/trunc
video_core/gpu_thread: Silence truncation warning in ThreadManager's constructor
2019-04-05 23:39:31 -04:00
bunnei
23d3cd7604
Merge pull request #2341 from lioncash/compare
file_sys/nca_metadata: Remove unnecessary comparison operators for TitleType
2019-04-05 23:38:37 -04:00
bunnei
d6cddffb78
Merge pull request #2339 from lioncash/rank
service/fsp_srv: Update SaveDataInfo and SaveDataDescriptor structs
2019-04-05 23:36:46 -04:00