We don't need to iterate over the full shader code when creating
a new shader module. This optimization may slightly reduce the
initial pipeline creation time.
Drivers from both major vendors implement their own shader cache
already, and storing a cache per game causes more issues than it
solves. Should fix#261.
If an application compiles the same shader multiple times, we should reuse
an already existing DxvkShaderModule instead of creating a new one. This
helps keep the number of DxvkGraphicsPipeline objects low in games such
as Rise of the Tomb Raider.
* [dxgi] Fix compilation with WINE headers
```gcc
error: cannot convert 'MONITORINFOEX* {aka tagMONITORINFOEXA*}' to 'LPMONITORINFO {aka tagMONITORINFO*}' for argument '2' to 'BOOL GetMonitorInfoA(HMONITOR, LPMONITORINFO)'
```
```clang
cannot initialize a parameter of type 'LPMONITORINFO' (aka 'tagMONITORINFO *') with an rvalue of type '::MONITORINFOEX *' (aka 'tagMONITORINFOEXA *')
```
This can be WINE bug but I don't want to dig now, firs suggestion is wrong "tag":
wine variant
```c
typedef struct tagMONITORINFO
{
...
} MONITORINFO, *LPMONITORINFO;
typedef struct tagMONITORINFOEXA
{ /* the 4 first entries are the same as MONITORINFO */
...
} MONITORINFOEXA, *LPMONITORINFOEXA;
typedef struct tagMONITORINFOEXW
{ /* the 4 first entries are the same as MONITORINFO */
...
} MONITORINFOEXW, *LPMONITORINFOEXW;
DECL_WINELIB_TYPE_AW(MONITORINFOEX)
DECL_WINELIB_TYPE_AW(LPMONITORINFOEX)
```
VS
MinGW variant
```c
typedef struct tagMONITORINFO {
...
} MONITORINFO,*LPMONITORINFO;
typedef struct tagMONITORINFOEXA : public tagMONITORINFO {
CHAR szDevice[CCHDEVICENAME];
} MONITORINFOEXA,*LPMONITORINFOEXA;
typedef struct tagMONITORINFOEXW : public tagMONITORINFO {
WCHAR szDevice[CCHDEVICENAME];
} MONITORINFOEXW,*LPMONITORINFOEXW;
__MINGW_TYPEDEF_AW(MONITORINFOEX)
__MINGW_TYPEDEF_AW(LPMONITORINFOEX)
```
* [dxgi] Fix compilation with WINE headers
Use C++-style casts rather than C ones.
Since we are synchronizing once per frame anyway, there is no need to
artificially limit the number of chunks in flight. Applications which
use deferred contexts and submit a large number of CS chunks through
command lists may benefit from this optimization.
This optimization may help keep the GPU busy in case there's
a large number of draw calls pending at the time a command
list from a deferred context is submitted for execution.
HUD elements can be enabled individually using a comma-separated
list. Supported options include:
- fps: Displays the framerate
- devinfo: Displays device info
Passing "1" has the same effect as "fps,devinfo".
`limits.h` required for `UINT_MAX` and not always used, so better to use standard C++ variant from `<limits>`.
Some compilers may simply return `UINT_MAX` value, gcc version: `max() _GLIBCXX_USE_NOEXCEPT { return __INT_MAX__ * 2U + 1; }`.
* [dxgi] Replace MSVC _countof macro with std::size
pro: crosscompiler
con: may not work for older clang/gcc/...
http://en.cppreference.com/w/cpp/iterator/size
For example `Run this code` mode produces errors for GCC-5.2(C++17) and clang-3.8(C++17). Local GCC-7.3 and clang-7 versions are ok.
Not tested w/ MinGW64.
* [dxgi] Replace MSVC _countof macro with std::size
We cannot run these in parallel in case the hull shader's output vertex
count, and thus the invocation count, is less than the fork/join phase
invocation count.
May reduce execution time of hull shaders on the GPU by running
the fork/join phases in parallel, as originally intended. Tested
on RADV 18.0.99 with LLVM 6.0.0.
Since we create only one DxvkContext per D3D11Device, rather than
per D3D11DeviceContext as originally planned, there is no need to
keep the pipeline manager as a global thread-safe object. This may
slightly reduce CPU overhead.