Qt Quick Direct3D 12 Adaptation

The Direct3D 12 adaptation for Windows 10, both in Win32 (windows platform plugin) and in UWP (winrt platform plugin), is shipped as a dynamically loaded plugin. This adaptation doesn't work on earlier Windows versions. Building this plugin is enabled automatically whenever the necessary D3D and DXGI development files are present. In practice, this currently means Visual Studio 2015 and newer.

The adaptation is available both in normal, OpenGL-enabled Qt builds, and also when Qt is configured with -no-opengl. However, it's never the default, meaning that the user or the application has to explicitly request it by setting the QT_QUICK_BACKEND environment variable to d3d12 or by calling QQuickWindow::setSceneGraphBackend().

Motivation

This experimental adaptation is the first Qt Quick backend that focuses on a modern, lower-level graphics API in combination with a windowing system interface that's different from the traditional approaches used in combination with OpenGL.

This adaptation also allows better integration with Windows, as Direct3D is the primary vendor-supported solution. Consequently, there are fewer problems anticipated with drivers, operations like window resizes, and special events like graphics device loss caused by device resets or graphics driver updates.

Performance-wise, the general expectation is somewhat lower CPU usage compared to OpenGL, due to lower driver overhead, and higher GPU utilization with less idle time. The backend doesn't heavily utilize threads yet, which means there are opportunities for further improvements in the future, for example to further optimize image loading.

The D3D12 backend also introduces support for pre-compiled shaders. All the backend's own shaders (used by the built-in materials on which the Rectangle, Image, Text, and other QML types are built) are compiled to D3D shader bytecode when you compile Qt. Applications using ShaderEffect items can choose to ship bytecode either in regular files or via the Qt resource system, or to use High Level Shading Language for DirectX (HLSL) source strings. Unlike with OpenGL, HLSL compilation is properly threaded, meaning shader compilation won't block the application and its user interface.

Graphics Adapters

The plugin does not necessarily require hardware acceleration. You can also use WARP, the Direct3D software rasterizer. By default, the first adapter providing hardware acceleration is chosen. To override this and use another graphics adapter or to force the use of the software rasterizer, set the QT_D3D_ADAPTER_INDEX environment variable to the index of the adapter. The adapters discovered are printed at startup when QSG_INFO or the qt.scenegraph.general logging category is enabled.
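The selection rule described above can be modeled in a few lines of standard C++. This is an illustrative sketch, not the plugin's code: the AdapterInfo struct and chooseAdapter function are hypothetical, the real adapter list comes from DXGI, and the handling of an out-of-range index (falling back to the default rule) is an assumption.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical description of one enumerated adapter.
struct AdapterInfo {
    std::string name;
    bool hardwareAccelerated; // false for a software rasterizer such as WARP
};

// Returns the index of the adapter to use, or -1 if none qualifies.
// requestedIndex models QT_D3D_ADAPTER_INDEX; a negative value means "not set".
int chooseAdapter(const std::vector<AdapterInfo> &adapters, int requestedIndex)
{
    // An explicit, valid index overrides the default choice.
    if (requestedIndex >= 0
        && static_cast<std::size_t>(requestedIndex) < adapters.size())
        return requestedIndex;

    // Default: the first adapter that provides hardware acceleration.
    for (std::size_t i = 0; i < adapters.size(); ++i) {
        if (adapters[i].hardwareAccelerated)
            return static_cast<int>(i);
    }
    return -1; // what the plugin does in this case is not modeled here
}
```

With a list of { WARP, GPU 0, GPU 1 }, leaving the index unset picks GPU 0 (index 1), while an index of 0 forces the software rasterizer.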

Troubleshooting

If you encounter issues, always set the QSG_INFO and QT_D3D_DEBUG environment variables to 1, to get debug and warning messages printed on the debug output. QT_D3D_DEBUG enables the Direct3D debug layer.

Note: The debug layer shouldn't be enabled in production use, since it can significantly impact performance (CPU load) due to increased API overhead.

Render Loops

By default, the D3D12 adaptation uses a single-threaded render loop similar to OpenGL's windows render loop. A threaded variant is also available; you can request it by setting the QSG_RENDER_LOOP environment variable to threaded. However, due to conceptual limitations in DXGI, the windowing system interface, the threaded loop is prone to deadlocks when multiple QQuickWindow or QQuickView instances are shown. Consequently, for the time being, the default is the single-threaded loop. This means that with the D3D12 backend, applications are expected to move their work from the main (GUI) thread out to worker threads, instead of expecting Qt to keep the GUI thread responsive and suitable for heavy, blocking operations.

For more information, see Qt Quick Scene Graph for details on render loops, and Multithreading and DXGI for the issues with multithreading.

Renderer

The scene graph renderer in the D3D12 adaptation currently doesn't perform any batching. Unlike with OpenGL, this is less of an issue, because state changes don't present any problems in the first place. The simpler renderer logic can also lead to lower CPU overhead in some cases. The trade-offs between the various approaches are currently under research.

Shader Effects

The ShaderEffect QML type is fully functional with the D3D12 adaptation as well. However, the fragmentShader and vertexShader properties are interpreted differently than with OpenGL.

With D3D12, these strings can be a URL for a local file, a URL for a file in the resource system, or an HLSL source string. A URL for a local file or for a file in the resource system indicates that the file in question contains either pre-compiled D3D shader bytecode generated by the fxc tool or HLSL source code. The type of file is detected automatically. This means that the D3D12 backend supports all options from GraphicsInfo.shaderCompilationType and GraphicsInfo.shaderSourceType.

Unlike with OpenGL, files are always opened through a QFileSelector that has the extra hlsl selector set. This makes it easy to create ShaderEffect items that work across both backends: for example, place the GLSL source code into shaders/effect.frag and the HLSL source code or, preferably, pre-compiled bytecode into shaders/+hlsl/effect.frag, then simply write fragmentShader: "qrc:shaders/effect.frag" in QML. For more details, see ShaderEffect.
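To illustrate how the +hlsl directory convention resolves a path, here is a small standard C++ model. It is not QFileSelector's implementation: selectShaderFile is a hypothetical function, the set of existing files stands in for the file system or resource system, and only a single selector is considered.

```cpp
#include <set>
#include <string>

// Returns the "+selector" variant of path if it exists in existingFiles,
// otherwise returns path unchanged. The variant lives in a "+selector"
// directory next to the original file.
std::string selectShaderFile(const std::string &path,
                             const std::string &selector,
                             const std::set<std::string> &existingFiles)
{
    const std::string::size_type slash = path.rfind('/');
    const bool hasDir = (slash != std::string::npos);
    const std::string dir = hasDir ? path.substr(0, slash + 1) : std::string();
    const std::string file = hasDir ? path.substr(slash + 1) : path;

    const std::string candidate = dir + "+" + selector + "/" + file;
    return existingFiles.count(candidate) > 0 ? candidate : path;
}
```

With both shaders/effect.frag and shaders/+hlsl/effect.frag present, a request for shaders/effect.frag under the hlsl selector resolves to the +hlsl variant. Without the variant, the original path is used, which is what lets the same QML work on the OpenGL backend.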

Multisample Render Targets

The Direct3D 12 adaptation ignores the QSurfaceFormat set on the QQuickWindow or QQuickView, or set via QSurfaceFormat::setDefaultFormat(), with two exceptions: QSurfaceFormat::samples() and QSurfaceFormat::alphaBufferSize() are still taken into account. When the sample value is greater than 1, multisample offscreen render targets will be created with the specified sample count at the maximum supported quality level. The backend automatically performs resolving into the non-multisample swapchain buffers after each frame.

Semi-transparent Windows

When the alpha channel is enabled either via QQuickWindow::setDefaultAlphaBuffer() or by setting alphaBufferSize to a non-zero value in the window's QSurfaceFormat or in the global format managed by QSurfaceFormat::setDefaultFormat(), the D3D12 backend will create a swapchain for composition and go through DirectComposition. This is necessary, because the mandatory flip model swapchain wouldn't support transparency otherwise.

Therefore, it's important not to unnecessarily request an alpha channel. When the alphaBufferSize is 0 or the default -1, all these extra steps can be avoided and the traditional window-based swapchain is sufficient.

On WinRT, this isn't relevant because the backend there always uses a composition swapchain which is associated with the ISwapChainPanel that backs QWindow on that platform.

Mipmaps

Mipmap generation is supported and handled transparently to the applications via a built-in compute shader. However, at the moment, this feature is experimental and only supports power-of-two images. Textures of other sizes will work too, but this involves a QImage-based scaling on the CPU first. Therefore, avoid enabling mipmapping for Non-Power-Of-Two (NPOT) images whenever possible.
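An application that wants to follow this advice can check image dimensions before enabling mipmapping. The sketch below uses plain integers for width and height. Both function names are hypothetical, and the code assumes that "power-of-two image" means both dimensions are powers of two.

```cpp
// Returns true if n is a positive power of two (1, 2, 4, 8, ...).
bool isPowerOfTwo(int n)
{
    // A power of two has exactly one bit set, so clearing the lowest
    // set bit with n & (n - 1) leaves zero.
    return n > 0 && (n & (n - 1)) == 0;
}

// Returns true if an image of this size can be mipmapped without the
// CPU-side scaling step described above.
bool mipmapsWithoutCpuScaling(int width, int height)
{
    return isPowerOfTwo(width) && isPowerOfTwo(height);
}
```

For example, a 256x256 image qualifies, while a 640x480 image does not and would go through the CPU scaling path if mipmapping were enabled for it.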

Image Formats

When creating textures via C++ scene graph APIs like QQuickWindow::createTextureFromImage(), 32-bit formats won't involve any conversion; they map directly to the corresponding R8G8B8A8_UNORM or B8G8R8A8_UNORM format. Everything else will trigger a QImage-based format conversion on the CPU first.

Unsupported Features

Particles and some other OpenGL-dependent utilities, like QQuickFramebufferObject, are currently not supported.

As with the Software adaptation, text is always rendered using the native method. Distance field-based text rendering is currently not implemented.

The shader sources in the Qt Graphical Effects module have not been ported to any format other than the OpenGL 2.0 compatible one, meaning that the QML types provided by that module are currently not functional with the D3D12 backend.

Texture atlases are currently not in use.

The renderer may lack support for certain minor features, such as drawing points and lines with a width other than 1.

Custom Qt Quick items using custom scene graph nodes can be problematic because materials are inherently tied to the graphics API. Therefore, only items that use the utility rectangle and image nodes are functional across all adaptations.

QQuickWidget and its underlying OpenGL-based compositing architecture are not supported. If you need to mix with QWidget-based user interfaces, use QWidget::createWindowContainer() to embed the native window of the QQuickWindow or QQuickView.

Finally, rendering via QSGEngine and QSGAbstractRenderer is not feasible with the D3D12 adaptation at the moment.

To integrate custom Direct3D 12 rendering, use QSGRenderNode in combination with QSGRendererInterface. This approach doesn't rely on OpenGL contexts or API specifics like framebuffers, and allows exposing the graphics device and command buffer from the adaptation. It's not necessarily suitable for easy integration of all types of content, in particular true 3D, so it'll likely get complemented by an alternative to QQuickFramebufferObject in future releases.

To perform runtime decisions based on the adaptation, use QSGRendererInterface from C++ and GraphicsInfo from QML. They can also be used to check the level of shader support: shading language, compilation approach, and so on.

When creating custom items, use the new QSGRectangleNode and QSGImageNode classes. These replace the now deprecated QSGSimpleRectNode and QSGSimpleTextureNode. Unlike their predecessors, these new classes are interfaces, and implementations are created via the QQuickWindow::createRectangleNode() and QQuickWindow::createImageNode() factory functions.

Advanced Configuration

The D3D12 adaptation can keep multiple frames in flight, similar to modern game engines. This is somewhat different from the traditional "render - swap - wait for vsync" model and allows for better GPU utilization at the expense of higher resource use. This means that the renderer will be a number of frames ahead of what is displayed on the screen.

For a discussion of flip model swap chains and the typical configuration parameters, refer to Sample Application for Direct3D 12 Flip Model Swap Chains.

Vertical synchronization is always enabled, meaning Present() is invoked with an interval of 1.

The configuration can be changed by setting the following environment variables:

Environment variable: Description

QT_D3D_BUFFER_COUNT: The number of swap chain buffers in range 2 - 4. The default value is 3.

QT_D3D_FRAME_COUNT: The number of frames prepared without blocking in range 1 - 4. The default value is 2. Present() starts blocking after queuing 3 frames (regardless of QT_D3D_BUFFER_COUNT), unless the waitable object is in use. Every additional frame increases GPU resource usage since geometry and constant buffer data need to be duplicated, and involves more bookkeeping on the CPU side.

QT_D3D_WAITABLE_SWAP_CHAIN_MAX_LATENCY: The frame latency in range 1 - 16. The default value is 0 (disabled). Changes the limit for Present() and triggers a wait for an available swap chain buffer when beginning each frame. For a detailed discussion, see the article linked above. Note: Currently, this behavior is experimental.

QT_D3D_BLOCKING_PRESENT: The time the CPU should wait, a non-zero value, for the GPU to finish its work after each call to Present(). The default value is 0 (disabled). This behavior effectively kills all parallelism but makes the behavior resemble the traditional swap-blocks-for-vsync model, which can be useful in some special cases. However, this behavior is not the same as setting the frame count to 1 because that still avoids blocking after Present(), and may only block when starting to prepare the next frame (or may not block at all depending on the time gap between the frames).
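
The ranges and defaults listed above can be summarized in a small standard C++ sketch that turns raw variable values into an effective configuration. This is not the adaptation's code: SwapChainConfig, parseIntOr, clampTo, resolveConfig, and configFromEnvironment are hypothetical names. Clamping out-of-range values into the documented range, and treating QT_D3D_BLOCKING_PRESENT as a simple on/off setting, are assumptions that the text above does not confirm.

```cpp
#include <cstdlib>

// Hypothetical container for the effective settings.
struct SwapChainConfig {
    int bufferCount;        // QT_D3D_BUFFER_COUNT: 2..4, default 3
    int frameCount;         // QT_D3D_FRAME_COUNT: 1..4, default 2
    int waitableMaxLatency; // QT_D3D_WAITABLE_SWAP_CHAIN_MAX_LATENCY: 0 = disabled, else 1..16
    bool blockingPresent;   // QT_D3D_BLOCKING_PRESENT: enabled by any non-zero value
};

// Parses a decimal integer; returns fallback for null, empty, or malformed text.
int parseIntOr(const char *text, int fallback)
{
    if (!text || !*text)
        return fallback;
    char *end = nullptr;
    const long value = std::strtol(text, &end, 10);
    return (*end == '\0') ? static_cast<int>(value) : fallback;
}

int clampTo(int value, int lo, int hi)
{
    return value < lo ? lo : (value > hi ? hi : value);
}

// Applies the documented defaults and ranges to raw variable values.
// Assumption: out-of-range values are clamped rather than rejected.
SwapChainConfig resolveConfig(const char *bufferCount, const char *frameCount,
                              const char *maxLatency, const char *blockingPresent)
{
    SwapChainConfig config;
    config.bufferCount = clampTo(parseIntOr(bufferCount, 3), 2, 4);
    config.frameCount = clampTo(parseIntOr(frameCount, 2), 1, 4);
    const int latency = parseIntOr(maxLatency, 0);
    config.waitableMaxLatency = (latency <= 0) ? 0 : clampTo(latency, 1, 16);
    config.blockingPresent = parseIntOr(blockingPresent, 0) != 0;
    return config;
}

// Reads the four variables from the process environment.
SwapChainConfig configFromEnvironment()
{
    return resolveConfig(std::getenv("QT_D3D_BUFFER_COUNT"),
                         std::getenv("QT_D3D_FRAME_COUNT"),
                         std::getenv("QT_D3D_WAITABLE_SWAP_CHAIN_MAX_LATENCY"),
                         std::getenv("QT_D3D_BLOCKING_PRESENT"));
}
```

Keeping resolveConfig separate from the environment lookup makes the defaults and ranges easy to check in isolation, for example when logging the effective settings alongside QSG_INFO output.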