# rpi-vk-driver
(not conformant yet, can't use official name or logo)
## Milestones
- [x] clear screen example working
- [x] triangle example working
- [x] shader from assembly, vertices from vertex buffer object, no uniforms, color hardcoded
- [x] uniforms for matrix multiplication and animation
- [x] texture coordinates and texture sampling
- [x] varyings
- [x] Multiple vertex attributes
- [x] Depth buffers
- [x] Stencil buffers
- [x] Indexed draw calls
- [x] blending
- [x] mipmapping
- [x] cube mapping
- [x] shadow mapping / depth texture sampling
- [x] Cubemaps with mipmaps
- [ ] Multi threaded cmdbuf generation test (secondary cmdbufs)
- [x] Shader compiler chain
- [x] QPU assembler / disassembler
- [x] Resources
- [x] Descriptor support
- [x] VkSampler support
- [x] Push constant support
- [x] Platform features
- [x] Layer support
- [x] Emulated features
- [x] Clear command support
- [x] Copy command support
- [x] Render to texture features
- [x] VkRenderPass support
- [x] MSAA support
- [x] Performance
- [x] Performance counters
- [x] WSI
- [x] Direct to display support
- [x] Vsync support
- [x] Present modes support
- [ ] Fixes
- [ ] Hardware bug workarounds
- [ ] Handle offsets wherever required
- [ ] Handle subresource ranges properly
- [ ] Handle allocation scopes properly
- [ ] Code cleanup
- [x] Clean up compile time warnings
- [ ] Profile and optimise the driver code
- [x] Run Clang static analysis
- [ ] Documentation
- [ ] Github pages
- [ ] Wiki
- [ ] Performance recommendations
- [ ] How to do blending, depth/stencil testing, attributes
- [ ] Try to pass as much of the VK CTS as possible with the existing feature set
## VK CTS progress
- Passed: 7894/67979 (11.6%)
- Failed: 878/67979 (1.3%)
- Not supported: 59206/67979 (87.1%)
- Warnings: 1/67979 (0.0%)
A conformance run is considered passing if all tests finish with an allowed result code. The following status codes are allowed:
- Pass
- NotSupported
- QualityWarning
- CompatibilityWarning
There are about 470,000 conformance tests.
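
The pass criterion above can be sketched in a few lines of C. This is only an illustration: the `CtsResult` enum and the function names are hypothetical and are not taken from the CTS or from this driver. `CTS_FAIL` stands in for any result code that is not on the allowed list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical result codes: the four allowed statuses listed above,
 * plus CTS_FAIL as a stand-in for any disallowed result. */
typedef enum {
    CTS_PASS,
    CTS_NOT_SUPPORTED,
    CTS_QUALITY_WARNING,
    CTS_COMPATIBILITY_WARNING,
    CTS_FAIL
} CtsResult;

/* True if a single test result is one of the allowed status codes. */
bool cts_result_allowed(CtsResult r)
{
    switch (r) {
    case CTS_PASS:
    case CTS_NOT_SUPPORTED:
    case CTS_QUALITY_WARNING:
    case CTS_COMPATIBILITY_WARNING:
        return true;
    default:
        return false;
    }
}

/* A run passes only if every test finished with an allowed result. */
bool cts_run_passes(const CtsResult *results, size_t count)
{
    assert(results != NULL || count == 0);
    for (size_t i = 0; i < count; ++i) {
        if (!cts_result_allowed(results[i]))
            return false;
    }
    return true;
}

/* Share of `part` in `total`, in percent (as used in the progress list). */
double cts_percent(unsigned long part, unsigned long total)
{
    assert(total != 0);
    return 100.0 * (double)part / (double)total;
}
```

By this rule a single failed test is enough to make the whole run fail, so the progress numbers above do not yet describe a passing run.
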
## FAQ
### Will this ever be a fully functional VK driver?
As far as I know, the Pi is NOT fully VK capable at the hardware level. Some things will be emulated and others will never be supported.
### What performance should you expect?
Performance-wise, the Pi is quite capable. The specs and architecture are close to the GPU in the iPhone 4s. The only problem I see is bandwidth, as you only have about 2.5GB/s compared to 25-50GB/s on typical mobile phones. So post-processing is a huge no, and you'd need to be very careful about the techniques that you use, e.g. you'd need to stay on chip at all times.
CPU performance (e.g. number of draw calls) should be sufficient on the quad-core Pis, as you can easily utilise all cores using VK.
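
To put the bandwidth figure in perspective, here is a back-of-the-envelope sketch. It rests on assumptions that are not from the driver: a 1920x1080 RGBA8 render target at 60 frames per second, a full-screen pass that reads every pixel once and writes it once, and 1GB taken as 10^9 bytes. The function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes per second moved by one full-screen pass that reads each pixel
 * once and writes it once (assumption: no caching or compression). */
uint64_t fullscreen_pass_bytes_per_second(uint32_t width, uint32_t height,
                                          uint32_t bytes_per_pixel,
                                          uint32_t fps)
{
    uint64_t pixels = (uint64_t)width * height;
    uint64_t bytes_per_frame = pixels * bytes_per_pixel * 2u; /* read + write */
    return bytes_per_frame * fps;
}

/* Fraction of the available bandwidth that a given traffic rate uses up. */
double bandwidth_fraction(uint64_t bytes_per_second,
                          double available_bytes_per_second)
{
    assert(available_bytes_per_second > 0.0);
    return (double)bytes_per_second / available_bytes_per_second;
}
```

Under these assumptions, `fullscreen_pass_bytes_per_second(1920, 1080, 4, 60)` gives 995,328,000 bytes per second, which is roughly 40% of a 2.5GB/s budget for a single pass, before any geometry or texture traffic is counted.
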
### What features will not be supported?
- 3D textures
- sparse textures
- occlusion queries (https://github.com/anholt/mesa/wiki/VC4-OpenGL-support)
- pipeline statistics
- indirect draws
- events
- proper semaphore support
- tessellation shaders
- geometry shaders
- 32 bit indices
- instancing
- multiple color attachments
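
Since 32 bit indices are on this list, an application that holds 32-bit index data would have to narrow it to 16 bits before drawing. A minimal sketch of such a conversion follows; `narrow_indices_u32_to_u16` is a hypothetical application-side helper, not part of the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Copy 32-bit indices into a 16-bit index array.
 * Returns false as soon as an index does not fit into 16 bits; in that
 * case the contents of dst are unspecified and the mesh would have to be
 * split into smaller draws instead. */
bool narrow_indices_u32_to_u16(const uint32_t *src, uint16_t *dst,
                               size_t count)
{
    assert((src != NULL && dst != NULL) || count == 0);
    for (size_t i = 0; i < count; ++i) {
        if (src[i] > UINT16_MAX)
            return false;
        dst[i] = (uint16_t)src[i];
    }
    return true;
}
```
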
### What features could be supported if kernel support was present?
- HDR render targets and textures (lack of kernel support for 64bpp render target)
- ETC textures (lack of kernel support for 64bpp render target)
- linear RGBA8 textures (lack of kernel support)
- linear YUYV textures https://www.linuxtv.org/downloads/v4l-dvb-apis-old/V4L2-PIX-FMT-YUYV.html (lack of kernel support)
- timing blocks for profiling (the kernel supports interrupts, but the data needs to be routed to userspace, i.e. tiler/renderer start/end timing would need to be added to seqnos)
- compute shaders (could be supported to some extent if the kernel side supported them)
### What features could be supported given enough time?
- SPIR-V shaders
- pipeline caches (currently doesn't make sense with assembly shaders)
### What additional features will this driver support?
- I have already added support for loading shader assembly. This will enable devs to optimise shaders to the last cycle.
- Videocore IV provides some performance counters; these will be exposed
- Videocore IV supports some texture formats that are not present in the spec
- bw1: 1 bit black and white
- a4: 4 bit alpha
- a1: 1 bit alpha
### Shader patching
The Broadcom Videocore IV needs a couple of operations to happen in shader code that might have fixed function hardware on other platforms.
These are:
- writing stencil state setup register
- writing depth value to depth buffer
- performing blending in software
- writing vertex parameter memory read and write setup registers
Since the project will not include a compiler, but rather works with an assembly-based shader setup, I decided not to patch shaders based on the state provided to the driver, but rather to let the developer have full control.
This means that regardless of what
- depth write state
- blending state
- stencil state
- vertex attribute state
is passed to the driver, this will not be reflected in the final behaviour unless the developer adds it to the assembly shaders.
Helper functionality will be provided to aid with encoding register values. Additionally, general documentation will be provided on how to perform these operations.
This will enable developers to take full control and optimise shaders to the last cycle.
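
To illustrate what "performing blending in software" involves, here is the arithmetic for one common blend configuration, written in C rather than QPU assembly. It is only a sketch of the math a fragment shader would have to carry out itself. It assumes RGBA8 colours, source-alpha / one-minus-source-alpha blending for colour, and one / one-minus-source-alpha for alpha. The `Rgba8` type and the function names are hypothetical and not part of the driver or its helper functionality.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 8-bit-per-channel colour, each channel in [0, 255]. */
typedef struct {
    uint8_t r, g, b, a;
} Rgba8;

/* Multiply two 8-bit normalised values: round(x * a / 255). */
static uint8_t mul_unorm8(uint8_t x, uint8_t a)
{
    return (uint8_t)(((uint32_t)x * a + 127u) / 255u);
}

/* One colour channel: src * srcAlpha + dst * (1 - srcAlpha). */
static uint8_t blend_channel(uint8_t src, uint8_t dst, uint8_t src_alpha)
{
    uint8_t inv_alpha = (uint8_t)(255u - src_alpha);
    uint32_t sum = (uint32_t)mul_unorm8(src, src_alpha)
                 + (uint32_t)mul_unorm8(dst, inv_alpha);
    assert(sum <= 255u); /* the two weights add up to 255, so no overflow */
    return (uint8_t)sum;
}

/* Blend a source fragment over the value already in the colour buffer. */
Rgba8 blend_src_alpha(Rgba8 src, Rgba8 dst)
{
    Rgba8 out;
    out.r = blend_channel(src.r, dst.r, src.a);
    out.g = blend_channel(src.g, dst.g, src.a);
    out.b = blend_channel(src.b, dst.b, src.a);
    /* Alpha: src.a * 1 + dst.a * (1 - src.a). */
    out.a = (uint8_t)(src.a + mul_unorm8(dst.a, (uint8_t)(255u - src.a)));
    return out;
}
```

With a fully opaque source the result equals the source colour, and with a fully transparent source the destination is left unchanged. In an assembly shader, the same steps would be done on the values read from and written to the tile buffer.
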