Overview

DirectX Shader Compiler

The DirectX Shader Compiler project includes a compiler and related tools used to compile High-Level Shader Language (HLSL) programs into DirectX Intermediate Language (DXIL) representation. Applications that make use of DirectX for graphics, games, and computation can use it to generate shader programs.

For more information, see the Wiki.

Visit the DirectX Landing Page for more resources for DirectX developers.

Downloads

You can download the latest successful build's artifacts (built by Appveyor) for the master branch:

  • Windows
  • Ubuntu

Features and Goals

The starting point of the project is a fork of the LLVM and Clang projects, modified to accept HLSL and emit a validated container that can be consumed by GPU drivers.

At the moment, the DirectX HLSL Compiler provides the following components:

  • dxc.exe, a command-line tool that can compile HLSL programs for shader model 6.0 or higher

  • dxcompiler.dll, a DLL providing a componentized compiler, assembler, disassembler, and validator

  • dxilconv.dll, a DLL providing a converter from DXBC (older shader bytecode format)

  • various other tools based on the above components

The Microsoft Windows SDK releases include a supported version of the compiler and validator.

The goal of the project is to allow the broader community of shader developers to contribute to the language and representation of shader programs, maintaining the principles of compatibility and supportability for the platform. It's currently in active development across two axes: language evolution (with no impact to DXIL representation), and surfacing hardware capabilities (with impact to DXIL, and thus requiring coordination with GPU implementations).

Pre-built Releases

Binary packages containing the output of this project are available from appveyor. Development kits containing only the dxc.exe driver app, the dxcompiler.dll, and the dxil.dll signing binary are available here, or in the releases tab.

SPIR-V CodeGen

As an example of community contribution, this project can also target the SPIR-V intermediate representation. Please see the doc for how HLSL features are mapped to SPIR-V, and the wiki page for how to build, use, and contribute to the SPIR-V CodeGen.

Building Sources

Note: If you intend to build from sources on Linux/macOS, follow these instructions.

Before you build, you will need to have some additional software installed. This is the most straightforward path - see Building Sources on the Wiki for more options, including Visual Studio 2015 and Ninja support.

After cloning the project, you can set up a build environment shortcut by double-clicking the utils\hct\hctshortcut.js file. This will create a shortcut on your desktop with a default configuration. If your system doesn't have the requisite association for .js files, this may not work. If so, open a cmd window and invoke: wscript.exe utils\hct\hctshortcut.js.

Tests are built using the TAEF framework which is included in the Windows Driver Kit.

To build, run this command on the HLSL Console.

hctbuild

You can also clean, build and run tests with this command.

hctcheckin

To see a list of additional commands available, run hcthelp

Running Tests

To run tests, open the HLSL Console and run this command after a successful build.

hcttest

Some tests will run shaders and verify their behavior. These tests also require a driver that can execute these shaders. See the next section on how this should currently be set up.

Running Shaders

To run shaders compiled as DXIL, you will need support from the operating system as well as from the driver for your graphics adapter. Windows 10 Creators Update is the first version to support DXIL shaders. See the Wiki for information on using experimental support or the software adapter.

Hardware Support

Hardware GPU support for DXIL is provided by the following vendors:

NVIDIA

NVIDIA's r396 drivers (r397.64 and later) provide release mode support for DXIL 1.1 and Shader Model 6.1 on Win10 1709 and later, and experimental mode support for DXIL 1.2 and Shader Model 6.2 on Win10 1803 and later. These drivers also support DXR in experimental mode.

Drivers can be downloaded from geforce.com.

AMD

AMD’s driver (Radeon Software Adrenalin Edition 18.4.1 or later) provides release mode support for DXIL 1.1 and Shader Model 6.1. Drivers can be downloaded from AMD's download site.

Intel

Intel's 15.60 drivers (15.60.0.4849 and later) support release mode for DXIL 1.0 and Shader Model 6.0 as well as release mode for DXIL 1.1 and Shader Model 6.1 (View Instancing support only).

Drivers can be downloaded from the following link: Intel Graphics Drivers

Direct access to the 15.60 driver (the latest as of this update) is provided below:

Installer

Release Notes related to DXIL

Making Changes

To make contributions, see the CONTRIBUTING.md file in this project.

Documentation

You can find documentation for this project in the docs directory. These contain the original LLVM documentation files, as well as two new files worth noting:

License

DirectX Shader Compiler is distributed under the terms of the University of Illinois Open Source License.

See LICENSE.txt and ThirdPartyNotices.txt for details.

Code of Conduct

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.

Issues
  • [spirv] Add support for [[vk::set(X)]]

    A new attribute [[vk::set(X)]] can be used to specify the descriptor set of a resource variable, leaving the binding index to be determined automatically based on what's available.

    Fixes https://github.com/Microsoft/DirectXShaderCompiler/issues/1340

    spirv 
    opened by pow2clk 29
  • DXC extension for DXR Payload Access Qualifiers

    This extension adds qualifiers for payload structures accompanied with semantic checks and code generation. The information added by the developer is stored in the DXIL type system and a new metadata node is emitted during code generation. This feature is opt-in for SM 6.5 and 6.6 and requires DXIL version >= 1.5 and validation version >= 1.6.

    opened by pacxx 28
  • [SPIRV] Add support hlsl export function attribute

    DXIL can export functions marked with the export attribute, whereas the SPIR-V backend ignores such functions. SPIR-V for shaders (the GLSL flavor) currently has no linkage export decoration, so an exported function is treated as a normal function, and a dummy entry point is added to satisfy SPIR-V validation.

    spirv 
    opened by jiaolu 26
  • [linux-port] Support full IID comparison on GCC

    Fixes #2680, fixes #3097

    @ehsannas As requested, here are my patches to support IIDs on all compilers that lack __declspec(uuid("...")) support, such as GCC. For reference, most of the implementation details have been elaborated here.

    In the current form external applications and wrappers using DxcCreateInstance (on an __EMULATE_UUID-enabled dxcompiler) get an E_NOINTERFACE because they cannot pass in the same pointer (or hash since #2796), nor know whether the (dynamically loaded) library is compiled with a compiler that supports __uuidof() (msvc, clang). To solve this all objects "simply" need to expose the full IID when they are emulated. While relatively straightforward to implement (see initial commits) this requires duplication of the IID and an extra macro to be added to every structure (the latter already existing).

    To counter this all struct __declspec(uuid("...")) ISomething are replaced with a macro that emits a similar struct header on supported compilers, but a template specialization wrapping the IID on all other compilers. This ensures the IID is not duplicated within the code nor the compatibility macro forgotten about, though is less readable than a normal stringified uuid.

    One major downside of this method, further elaborated in https://github.com/microsoft/DirectXShaderCompiler/issues/2680#issuecomment-635499459, is __uuidof() not working on an interface implementation (subclass). That might be solved by consuming the inheritance chain and { bracket in the macro and emitting the static uuidof() method again.
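
    As a rough illustration of the macro idea described above, the following self-contained C++ sketch attaches an IID to an interface through a template specialization, as one might on compilers without __declspec(uuid("...")). Every name in it (EmulatedGuid, UuidOf, DECLARE_INTERFACE_WITH_IID, emulated_uuidof, IsEqualGuid, ISomething, IOther) and both GUID values are hypothetical and are not taken from the PR; only the emulated branch is shown.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a Windows GUID/IID (16 bytes).
struct EmulatedGuid {
  std::uint32_t data1;
  std::uint16_t data2;
  std::uint16_t data3;
  std::uint8_t data4[8];
};

// Only declared, never defined: requesting the IID of a type that was not
// declared through the macro below is a compile-time error.
template <typename T> struct UuidOf;

// Hypothetical macro for the emulated branch. It forward-declares the
// interface, stores its IID in a UuidOf specialization, and then reopens the
// struct so the interface body can follow the macro invocation.
#define DECLARE_INTERFACE_WITH_IID(Name, d1, d2, d3, ...)                      \
  struct Name;                                                                 \
  template <> struct UuidOf<Name> {                                            \
    static const EmulatedGuid &get() {                                         \
      static const EmulatedGuid iid = {d1, d2, d3, {__VA_ARGS__}};             \
      return iid;                                                              \
    }                                                                          \
  };                                                                           \
  struct Name

// Emulated replacement for __uuidof(T).
template <typename T> const EmulatedGuid &emulated_uuidof() {
  return UuidOf<T>::get();
}

// Compares all 16 bytes of two IIDs rather than a pointer or a hash.
inline bool IsEqualGuid(const EmulatedGuid &a, const EmulatedGuid &b) {
  return a.data1 == b.data1 && a.data2 == b.data2 && a.data3 == b.data3 &&
         std::memcmp(a.data4, b.data4, sizeof(a.data4)) == 0;
}

// Two example interfaces with made-up IIDs that differ only in the last byte.
DECLARE_INTERFACE_WITH_IID(ISomething, 0x11111111, 0x2222, 0x3333, 0x01, 0x02,
                           0x03, 0x04, 0x05, 0x06, 0x07, 0x08) {
  virtual ~ISomething() = default;
};

DECLARE_INTERFACE_WITH_IID(IOther, 0x11111111, 0x2222, 0x3333, 0x01, 0x02,
                           0x03, 0x04, 0x05, 0x06, 0x07, 0x09) {
  virtual ~IOther() = default;
};
```

    In this sketch the IID is written once, next to the interface name, and a class that merely derives from ISomething has no UuidOf specialization of its own, which mirrors the subclass limitation mentioned above.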

    This PR picks two small warning fixes from LLVM, but needs more to fully cover all warnings currently spewed out by the compilers. Or are there plans to update LLVM in larger swaths? In addition CMakeSettings.json is updated to support VS2019, though I am unable to test backwards compatibility with VS2017 and below.

    Things that have yet to be done:

    • Autoformat the changed class definitions (unfortunately autoformatting the entire file results in too many changes);
    • Squash unnecessary commits after individual idea/implementation review;
    • Deal with TODO/WIP/HACK markings in commits and comments, i.e.:
      • Make sure all classes have the new INTERFACE_STRUCT_HEADER, then remove the empty check in IsEqualIID (I think all are covered now?);
      • Move INTERFACE_STRUCT_HEADER to a single place accessible by everything (i.e. the defines in WinAdapter.h are not available to dxcapi.h when not compiling under WIN32);
    • Finally we have to discuss a binary incompatibility issue on vtable layout when WinAdapter is used, but that is irrelevant for this PR.
    linux 
    opened by MarijnS95 26
  • Compiling DirectXShaderCompiler on Linux/macOS

    Do you know if it is possible to compile DirectXShaderCompiler on Linux? The main goal is to have the compiler generate SPIR-V shaders for Vulkan from HLSL sources on Linux.

    enhancement linux up-for-grabs 
    opened by Cry-Mory 26
  • Enable generation of llvm.lifetime.start/.end intrinsics

    Enable generation of llvm.lifetime.start/.end intrinsics.

    • Remove HLSL change from CGDecl.cpp::EmitLifetimeStart() that disabled generation of lifetime markers in the front end.
    • Enable generation of lifetime intrinsics when inlining functions (PassManagerBuilder.cpp).
    • Both of these cover a different set of situations that can lead to inefficient code without lifetime intrinsics (see examples below):
      • Assume a struct is created inside a loop but some or all of its fields are only initialized conditionally before the struct is being used. If the alloca of that struct does not receive lifetime intrinsics before being lowered to SSA its definition will effectively be hoisted out of the loop, which changes the original semantics: Since the initialization is conditional, the correct SSA form for this code requires a phi node in the loop header that persists the value of the struct field throughout different iterations because the compiler doesn't know anymore that the field can't be initialized in a different iteration than when it is used.
      • If the lifetime of an alloca in a function is the entire function it doesn't need lifetime intrinsics. However, when inlining that function, the alloca's lifetime will then suddenly span the entire caller, causing similar issues as described above.
    • For backwards compatibility, replace lifetime.start/.end intrinsics with a store of undef in DxilPreparePasses.cpp, or, for validator version < 1.6, with a store of 0 (undef store is disallowed). This is slightly inconvenient but achieves the same goal as the lifetime intrinsics. The zero initialization is actually the current manual workaround for developers that hit one of the above issues.
    • Allow lifetime intrinsics to pass DXIL validation.
    • Allow undef stores to pass DXIL validation.
    • Allow bitcast to i8* to pass DXIL validation.
    • Make various places in the code aware of lifetime intrinsics and their related bitcasts to i8*.
    • Adjust ScalarReplAggregatesHLSL so it generates new intrinsics for each element once a structure is broken up. Also make sure that lifetime intrinsics are removed when replacing one pointer by another upon seeing a memcpy. This is required to prevent a pointer accidentally "inheriting" wrong lifetimes.
    • Adjust PromoteMemoryToRegister to treat an existing lifetime.start intrinsic as a definition.
    • Since lifetime intrinsics require a cleanup, the logic in CGStmt.cpp:EmitBreakStmt() had to be changed: EmitHLSLCondBreak() now returns the generated BranchInst. That branch is then passed into EmitBranchThroughCleanup(), which uses it instead of creating a new one. This way, the cleanup is generated correctly and the wave handling also still works as intended.
    • Adjust a number of tests that now behave slightly differently. memcpy_preuser.hlsl was actually exhibiting exactly the situation explained above and relied on the struct definition of "oStruct" to be hoisted out to produce the desired IR. And entry_memcpy() in cbuf_memcpy_replace.hlsl required an explicit initialization: With lifetime intrinsics, the original code correctly collapsed to returning undef. Without lifetime intrinsics, the compiler could not prove this. With proper initialization, the test now has the intended effect, even though the collapsing to undef could be a desirable test for lifetime intrinsics.

    Example 1:

    Original code:

    for( ;; ) {
      func();
      MyStruct s;
      if( c ) {
        s.x = ...;
        ... = s.x;
      }
      ... = s.x;
    }
    

    Without lifetime intrinsics, this is equivalent to:

    MyStruct s;
    for( ;; ) {
      func();
      if( c ) {
        s.x = ...;
        ... = s.x;
      }
      ... = s.x;
    }
    

    After SROA, we now have a value live across the function call, which will cause a spill:

    for( ;; ) {
      x_p = phi( undef, x_p2 );
      func();
      if( c ) {
        x1 = ...;
        ... = x1;
      }
      x_p2 = phi( x_p, x1 );
      ... = x_p2;
    }
    

    Example 2:

    void consume(in Data data);
    void expensiveComputation();
    
    bool produce(out Data data) {
        if (condition) {
            data = ...; // <-- conditional assignment of out-qualified parameter
            return true;
        }
        return false; // <-- out-qualified parameter left uninitialized
    }
    void foo(int N) {
        for (int i=0; i<N; ++i) {
            Data data;
            bool valid = produce(data); // <-- generates a phi to prior iteration's value when inlined. There should be none
            if (valid)
                consume(data);
            expensiveComputation(); // <-- said phi is alive here, inflating register pressure
        }
    }
    
    opened by rkarrenberg 25
  • Add support to convert DXR HLSL to SPV_NV_ray_tracing

    @ehsannas @antiagainst Ehsan, Lei, please review the changes.

    Pattern match tests have been added for the new features.

    Let us know if anything more is required from us. Thanks!

    spirv 
    opened by alelenv 24
  • [spirv] Tessellation Control shader code generation appears incorrect

    We have encountered what appears to be a bug in dxc that causes information communicated from HLSL's logical "hull shader" to its "patch constant shader" to not actually get communicated (except for from the first control point), when converted to Vulkan's logical "tessellation control shader".

    As far as I can tell, when data is transmitted from the hull shader to the patch constant shader, SPIRV-dxc attempts to do this by writing the per-thread data into a temporary array, that has one entry for each thread. Then a barrier is inserted and thread 0 consumes the data from all threads to do the patch-constant shader work. The problem appears to be that this temporary array is just regular registers, not any kind of LDS or "groupshared" memory. So, thread 0 doesn't actually access data written by the other threads, it just consumes uninitialized data. This causes the tessellation either to not show up, or to flicker like mad and do crazy stuff.
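
    The failure mode suspected above can be modelled on the CPU with a small C++ sketch. It is only a model of the reporter's hypothesis, not of any real driver: the names (kControlPoints, kUnwritten, Slots, invocation0_view_private, invocation0_view_shared) are hypothetical, and a NaN sentinel stands in for whatever garbage an unwritten register would actually hold.

```cpp
#include <array>
#include <cmath>
#include <limits>

constexpr int kControlPoints = 3;

// Sentinel for "never written" storage; real hardware would hold arbitrary
// garbage here instead of a well-defined value.
constexpr float kUnwritten = std::numeric_limits<float>::quiet_NaN();

using Slots = std::array<float, kControlPoints>;

// True if a slot still holds the sentinel, i.e. nobody wrote to it.
inline bool is_unwritten(float value) { return std::isnan(value); }

// Models the reported codegen: every invocation owns a private copy of the
// temporary array and writes only its own slot. Returns what invocation 0
// observes after the barrier.
Slots invocation0_view_private(float tess_factor) {
  std::array<Slots, kControlPoints> private_copies;
  for (auto &copy : private_copies)
    copy.fill(kUnwritten);
  for (int id = 0; id < kControlPoints; ++id)
    private_copies[id][id] = tess_factor; // invocation `id` writes slot `id`
  return private_copies[0]; // invocation 0 can only read its own copy
}

// Models storage that is visible to all invocations: the three writes land in
// one array, so invocation 0 observes every slot after the barrier.
Slots invocation0_view_shared(float tess_factor) {
  Slots shared;
  shared.fill(kUnwritten);
  for (int id = 0; id < kControlPoints; ++id)
    shared[id] = tess_factor;
  return shared;
}
```

    In the private model, invocation 0 sees its own slot written and the other two untouched, which matches the symptom that only the first control point's data reaches the patch constant function.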

    If I modify the shader so that the patch-constant shader does not use any information from the hull-shader, then tessellation works as expected. However, for our use case, we require the ability to use information computed in the hull shader.

    NOTE: We have not found a workaround for this, so getting it fixed "soon" would be pretty important to us.

    Here is a simple HLSL shader that demonstrates the problem.

    #define BARYCENTRIC_INTERPOLATE(INPUTARRAY, INPUTPARAM, BARYC) ((BARYC.x * INPUTARRAY[0].INPUTPARAM) + (BARYC.y * INPUTARRAY[1].INPUTPARAM) + (BARYC.z * INPUTARRAY[2].INPUTPARAM))
    
    static const float2 Positions[3] = {
    	float2(0.0, -0.5),
    	float2(0.5, 0.5),
    	float2(-0.5, 0.5)
    };
    
    static const float3 Colors[3] = {
    	float3(1.0, 0.0, 0.0),
    	float3(0.0, 1.0, 0.0),
    	float3(0.0, 0.0, 1.0)
    };
    
    float Tessellation_factor; //set by CPU
    
    struct VS_OUTPUT
    {
    	float4 pos : POSITION0;
    	float3 color : VERTCOLOR0;
    };
    
    struct HS_OUTPUT
    {
    	float4 pos : POSITION0;
    	float3 color : VERTCOLOR0;
    	float tess_factor : TESSFACTOR0;
    };
    
    struct DS_OUTPUT
    {
    	float4 pos : SV_Position;
    	float3 color : VERTCOLOR0;
    };
    
    
    VS_OUTPUT main_vs( uint vert_index : SV_VertexID)
    {
    	VS_OUTPUT OUT;
    
    	OUT.pos = float4(Positions[vert_index], 0.0, 1.0);
    	OUT.color = Colors[vert_index];
    
    	return OUT;
    }
    
    struct HS_PATCH_OUTPUT
    {
    	float tess_factor[3] : SV_TessFactor; //tessellation factor per edge
    	float inside_tess_factor : SV_InsideTessFactor; //tessellation factor of triangle interior
    };
    
    [domain("tri")]
    [partitioning("fractional_odd")]
    [outputtopology("triangle_cw")]
    [patchconstantfunc("main_hs_patch")]
    [outputcontrolpoints(3)]
    HS_OUTPUT main_hs(InputPatch<VS_OUTPUT, 3> inputpoints, uint i : SV_OutputControlPointID)
    {
    	HS_OUTPUT OUT;
    	OUT.pos = inputpoints[i].pos;
    	OUT.color = inputpoints[i].color;
    	OUT.tess_factor = Tessellation_factor;
    	
    	return OUT;
    }
    
    HS_PATCH_OUTPUT main_hs_patch(const OutputPatch<HS_OUTPUT, 3> patch)
    {
    	HS_PATCH_OUTPUT OUT;
    	OUT.tess_factor[0] = patch[0].tess_factor;
    	OUT.tess_factor[1] = patch[1].tess_factor;
    	OUT.tess_factor[2] = patch[2].tess_factor;
    	OUT.inside_tess_factor = (patch[0].tess_factor + patch[1].tess_factor + patch[2].tess_factor) / 3;
    
    	return OUT;
    }
    
    [domain("tri")]
    DS_OUTPUT main_ds(HS_PATCH_OUTPUT constants, float3 BarycentricCoords : SV_DomainLocation, const OutputPatch<HS_OUTPUT, 3> orig_triangle)
    {
    	DS_OUTPUT OUT;
    	OUT.pos = BARYCENTRIC_INTERPOLATE(orig_triangle, pos, BarycentricCoords);
    	OUT.color = BARYCENTRIC_INTERPOLATE(orig_triangle, color, BarycentricCoords);
    
    	return OUT;
    }
    
    float4 main_ps( DS_OUTPUT data) : SV_Target0
    {
    	return float4(data.color, 1.0);
    }
    

    This generates SPIR-V like this for the Tessellation Control stage:

    ; SPIR-V
    ; Version: 1.3
    ; Generator: Google spiregg; 0
    ; Bound: 103
    ; Schema: 0
                   OpCapability Tessellation
                   OpExtension "SPV_GOOGLE_hlsl_functionality1"
                   OpMemoryModel Logical GLSL450
                   OpEntryPoint TessellationControl %main_hs "main_hs" %in_var_POSITION0 %in_var_VERTCOLOR0 %gl_InvocationID %out_var_POSITION0 %out_var_VERTCOLOR0 %out_var_TESSFACTOR0 %gl_TessLevelOuter %gl_TessLevelInner
                   OpExecutionMode %main_hs Triangles
                   OpExecutionMode %main_hs SpacingFractionalOdd
                   OpExecutionMode %main_hs VertexOrderCw
                   OpExecutionMode %main_hs OutputVertices 3
                   OpSource HLSL 600
                   OpName %type__Globals "type.$Globals"
                   OpMemberName %type__Globals 0 "Tessellation_factor"
                   OpName %_Globals "$Globals"
                   OpName %in_var_POSITION0 "in.var.POSITION0"
                   OpName %in_var_VERTCOLOR0 "in.var.VERTCOLOR0"
                   OpName %out_var_POSITION0 "out.var.POSITION0"
                   OpName %out_var_VERTCOLOR0 "out.var.VERTCOLOR0"
                   OpName %out_var_TESSFACTOR0 "out.var.TESSFACTOR0"
                   OpName %main_hs "main_hs"
                   OpName %VS_OUTPUT "VS_OUTPUT"
                   OpMemberName %VS_OUTPUT 0 "pos"
                   OpMemberName %VS_OUTPUT 1 "color"
                   OpName %param_var_inputpoints "param.var.inputpoints"
                   OpName %HS_OUTPUT "HS_OUTPUT"
                   OpMemberName %HS_OUTPUT 0 "pos"
                   OpMemberName %HS_OUTPUT 1 "color"
                   OpMemberName %HS_OUTPUT 2 "tess_factor"
                   OpName %temp_var_hullMainRetVal "temp.var.hullMainRetVal"
                   OpName %if_merge "if.merge"
                   OpDecorateString %in_var_POSITION0 UserSemantic "POSITION0"
                   OpDecorateString %in_var_VERTCOLOR0 UserSemantic "VERTCOLOR0"
                   OpDecorate %gl_InvocationID BuiltIn InvocationId
                   OpDecorateString %gl_InvocationID UserSemantic "SV_OutputControlPointID"
                   OpDecorateString %out_var_POSITION0 UserSemantic "POSITION0"
                   OpDecorateString %out_var_VERTCOLOR0 UserSemantic "VERTCOLOR0"
                   OpDecorateString %out_var_TESSFACTOR0 UserSemantic "TESSFACTOR0"
                   OpDecorate %gl_TessLevelOuter BuiltIn TessLevelOuter
                   OpDecorateString %gl_TessLevelOuter UserSemantic "SV_TessFactor"
                   OpDecorate %gl_TessLevelOuter Patch
                   OpDecorate %gl_TessLevelInner BuiltIn TessLevelInner
                   OpDecorateString %gl_TessLevelInner UserSemantic "SV_InsideTessFactor"
                   OpDecorate %gl_TessLevelInner Patch
                   OpDecorate %in_var_POSITION0 Location 0
                   OpDecorate %in_var_VERTCOLOR0 Location 1
                   OpDecorate %out_var_POSITION0 Location 0
                   OpDecorate %out_var_TESSFACTOR0 Location 1
                   OpDecorate %out_var_VERTCOLOR0 Location 2
                   OpDecorate %_Globals DescriptorSet 0
                   OpDecorate %_Globals Binding 0
                   OpMemberDecorate %type__Globals 0 Offset 0
                   OpDecorate %type__Globals Block
          %float = OpTypeFloat 32
           %uint = OpTypeInt 32 0
         %uint_0 = OpConstant %uint 0
         %uint_1 = OpConstant %uint 1
         %uint_2 = OpConstant %uint 2
            %int = OpTypeInt 32 1
          %int_2 = OpConstant %int 2
          %int_0 = OpConstant %int 0
          %int_1 = OpConstant %int 1
         %uint_3 = OpConstant %uint 3
        %v3float = OpTypeVector %float 3
    %_arr_v3float_uint_3 = OpTypeArray %v3float %uint_3
    %type__Globals = OpTypeStruct %float
    %_ptr_Uniform_type__Globals = OpTypePointer Uniform %type__Globals
        %v4float = OpTypeVector %float 4
    %_arr_v4float_uint_3 = OpTypeArray %v4float %uint_3
    %_ptr_Input__arr_v4float_uint_3 = OpTypePointer Input %_arr_v4float_uint_3
    %_ptr_Input__arr_v3float_uint_3 = OpTypePointer Input %_arr_v3float_uint_3
    %_ptr_Input_uint = OpTypePointer Input %uint
    %_ptr_Output__arr_v4float_uint_3 = OpTypePointer Output %_arr_v4float_uint_3
    %_ptr_Output__arr_v3float_uint_3 = OpTypePointer Output %_arr_v3float_uint_3
    %_arr_float_uint_3 = OpTypeArray %float %uint_3
    %_ptr_Output__arr_float_uint_3 = OpTypePointer Output %_arr_float_uint_3
         %uint_4 = OpConstant %uint 4
    %_arr_float_uint_4 = OpTypeArray %float %uint_4
    %_ptr_Output__arr_float_uint_4 = OpTypePointer Output %_arr_float_uint_4
    %_arr_float_uint_2 = OpTypeArray %float %uint_2
    %_ptr_Output__arr_float_uint_2 = OpTypePointer Output %_arr_float_uint_2
           %void = OpTypeVoid
             %45 = OpTypeFunction %void
      %VS_OUTPUT = OpTypeStruct %v4float %v3float
    %_arr_VS_OUTPUT_uint_3 = OpTypeArray %VS_OUTPUT %uint_3
    %_ptr_Function__arr_VS_OUTPUT_uint_3 = OpTypePointer Function %_arr_VS_OUTPUT_uint_3
      %HS_OUTPUT = OpTypeStruct %v4float %v3float %float
    %_arr_HS_OUTPUT_uint_3 = OpTypeArray %HS_OUTPUT %uint_3
    %_ptr_Function__arr_HS_OUTPUT_uint_3 = OpTypePointer Function %_arr_HS_OUTPUT_uint_3
    %_ptr_Output_v4float = OpTypePointer Output %v4float
    %_ptr_Output_v3float = OpTypePointer Output %v3float
    %_ptr_Output_float = OpTypePointer Output %float
    %_ptr_Function_HS_OUTPUT = OpTypePointer Function %HS_OUTPUT
           %bool = OpTypeBool
    %_ptr_Function_float = OpTypePointer Function %float
    %_ptr_Function_v4float = OpTypePointer Function %v4float
    %_ptr_Function_v3float = OpTypePointer Function %v3float
    %_ptr_Uniform_float = OpTypePointer Uniform %float
       %_Globals = OpVariable %_ptr_Uniform_type__Globals Uniform
    %in_var_POSITION0 = OpVariable %_ptr_Input__arr_v4float_uint_3 Input
    %in_var_VERTCOLOR0 = OpVariable %_ptr_Input__arr_v3float_uint_3 Input
    %gl_InvocationID = OpVariable %_ptr_Input_uint Input
    %out_var_POSITION0 = OpVariable %_ptr_Output__arr_v4float_uint_3 Output
    %out_var_VERTCOLOR0 = OpVariable %_ptr_Output__arr_v3float_uint_3 Output
    %out_var_TESSFACTOR0 = OpVariable %_ptr_Output__arr_float_uint_3 Output
    %gl_TessLevelOuter = OpVariable %_ptr_Output__arr_float_uint_4 Output
    %gl_TessLevelInner = OpVariable %_ptr_Output__arr_float_uint_2 Output
    %float_0_333333343 = OpConstant %float 0.333333343
        %main_hs = OpFunction %void None %45
             %60 = OpLabel
    %param_var_inputpoints = OpVariable %_ptr_Function__arr_VS_OUTPUT_uint_3 Function
    %temp_var_hullMainRetVal = OpVariable %_ptr_Function__arr_HS_OUTPUT_uint_3 Function
             %61 = OpLoad %_arr_v4float_uint_3 %in_var_POSITION0
             %62 = OpLoad %_arr_v3float_uint_3 %in_var_VERTCOLOR0
             %63 = OpCompositeExtract %v4float %61 0
             %64 = OpCompositeExtract %v3float %62 0
             %65 = OpCompositeConstruct %VS_OUTPUT %63 %64
             %66 = OpCompositeExtract %v4float %61 1
             %67 = OpCompositeExtract %v3float %62 1
             %68 = OpCompositeConstruct %VS_OUTPUT %66 %67
             %69 = OpCompositeExtract %v4float %61 2
             %70 = OpCompositeExtract %v3float %62 2
             %71 = OpCompositeConstruct %VS_OUTPUT %69 %70
             %72 = OpCompositeConstruct %_arr_VS_OUTPUT_uint_3 %65 %68 %71
                   OpStore %param_var_inputpoints %72
             %73 = OpLoad %uint %gl_InvocationID
             %74 = OpAccessChain %_ptr_Function_v4float %param_var_inputpoints %73 %int_0
             %75 = OpLoad %v4float %74
             %76 = OpAccessChain %_ptr_Function_v3float %param_var_inputpoints %73 %int_1
             %77 = OpLoad %v3float %76
             %78 = OpAccessChain %_ptr_Uniform_float %_Globals %int_0
             %79 = OpLoad %float %78
             %80 = OpCompositeConstruct %HS_OUTPUT %75 %77 %79
             %81 = OpAccessChain %_ptr_Output_v4float %out_var_POSITION0 %73
                   OpStore %81 %75
             %82 = OpAccessChain %_ptr_Output_v3float %out_var_VERTCOLOR0 %73
                   OpStore %82 %77
             %83 = OpAccessChain %_ptr_Output_float %out_var_TESSFACTOR0 %73
                   OpStore %83 %79
             %84 = OpAccessChain %_ptr_Function_HS_OUTPUT %temp_var_hullMainRetVal %73
                   OpStore %84 %80
                   OpControlBarrier %uint_2 %uint_4 %uint_0
             %85 = OpIEqual %bool %73 %uint_0
                   OpSelectionMerge %if_merge None
                   OpBranchConditional %85 %86 %if_merge
             %86 = OpLabel
             %87 = OpAccessChain %_ptr_Function_float %temp_var_hullMainRetVal %uint_0 %int_2
             %88 = OpLoad %float %87
             %89 = OpAccessChain %_ptr_Function_float %temp_var_hullMainRetVal %uint_1 %int_2
             %90 = OpLoad %float %89
             %91 = OpAccessChain %_ptr_Function_float %temp_var_hullMainRetVal %uint_2 %int_2
             %92 = OpLoad %float %91
             %93 = OpLoad %float %87
             %94 = OpLoad %float %89
             %95 = OpFAdd %float %93 %94
             %96 = OpLoad %float %91
             %97 = OpFAdd %float %95 %96
             %98 = OpFMul %float %97 %float_0_333333343
             %99 = OpAccessChain %_ptr_Output_float %gl_TessLevelOuter %uint_0
                   OpStore %99 %88
            %100 = OpAccessChain %_ptr_Output_float %gl_TessLevelOuter %uint_1
                   OpStore %100 %90
            %101 = OpAccessChain %_ptr_Output_float %gl_TessLevelOuter %uint_2
                   OpStore %101 %92
            %102 = OpAccessChain %_ptr_Output_float %gl_TessLevelInner %uint_0
                   OpStore %102 %98
                   OpBranch %if_merge
       %if_merge = OpLabel
                   OpReturn
                   OpFunctionEnd
    

    To make the issue easier to understand, I've also decompiled the SPIR-V into GLSL using SPIRV-Cross:

    #version 450
    layout(vertices = 3) out;
    
    struct VS_OUTPUT
    {
        vec4 pos;
        vec3 color;
    };
    
    struct HS_OUTPUT
    {
        vec4 pos;
        vec3 color;
        float tess_factor;
    };
    
    layout(set = 0, binding = 0, std140) uniform type_Globals
    {
        float Tessellation_factor;
    } _Globals;
    
    layout(location = 0) in vec4 in_var_POSITION0[];
    layout(location = 1) in vec3 in_var_VERTCOLOR0[];
    layout(location = 0) out vec4 out_var_POSITION0[3];
    layout(location = 2) out vec3 out_var_VERTCOLOR0[3];
    layout(location = 1) out float out_var_TESSFACTOR0[3];
    
    void main()
    {
        vec4 _61_unrolled[3];
        for (int i = 0; i < int(3); i++)
        {
            _61_unrolled[i] = in_var_POSITION0[i];
        }
        vec3 _62_unrolled[3];
        for (int i = 0; i < int(3); i++)
        {
            _62_unrolled[i] = in_var_VERTCOLOR0[i];
        }
        VS_OUTPUT param_var_inputpoints[3] = VS_OUTPUT[](VS_OUTPUT(_61_unrolled[0], _62_unrolled[0]), VS_OUTPUT(_61_unrolled[1], _62_unrolled[1]), VS_OUTPUT(_61_unrolled[2], _62_unrolled[2]));
        out_var_POSITION0[gl_InvocationID] = param_var_inputpoints[gl_InvocationID].pos;
        out_var_VERTCOLOR0[gl_InvocationID] = param_var_inputpoints[gl_InvocationID].color;
        out_var_TESSFACTOR0[gl_InvocationID] = _Globals.Tessellation_factor;
        HS_OUTPUT temp_var_hullMainRetVal[3];
        temp_var_hullMainRetVal[gl_InvocationID] = HS_OUTPUT(param_var_inputpoints[gl_InvocationID].pos, param_var_inputpoints[gl_InvocationID].color, _Globals.Tessellation_factor);
        barrier();
        if (gl_InvocationID == 0u)
        {
            gl_TessLevelOuter[0u] = temp_var_hullMainRetVal[0u].tess_factor;
            gl_TessLevelOuter[1u] = temp_var_hullMainRetVal[1u].tess_factor;
            gl_TessLevelOuter[2u] = temp_var_hullMainRetVal[2u].tess_factor;
            gl_TessLevelInner[0u] = ((temp_var_hullMainRetVal[0u].tess_factor + temp_var_hullMainRetVal[1u].tess_factor) + temp_var_hullMainRetVal[2u].tess_factor) * 0.3333333432674407958984375;
        }
    }
    
    

    As you can see, the temp_var_hullMainRetVal variable appears to just be a regular local array, and not cross-thread shared memory of any kind.

    Thank you! Dxc with SPIR-V is pretty awesome, and we appreciate all the work that has gone into it. We'd love to get this tessellation issue fixed as soon as possible.

    spirv 
    opened by scottkircher 23
  • Enable printing dependencies of compilation target


    This commit adds -dump-dependencies option to print dependencies of compilation target.

    For example, when an HLSL file X includes files Y, Z, and W,

    $ ./bin/dxc -dump-dependencies -T ps_6_0 -E main X

    will print:

    X
    Y
    Z
    W

    The first line (X) is the name of the main compilation target. The following lines are the dependencies of X.
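
The idea of listing a target followed by its dependencies can be sketched as follows. This is not how dxc implements -dump-dependencies: `FileMap`, `DirectIncludes`, and `DumpDependencies` are hypothetical names, files live in an in-memory map, and only quoted `#include "..."` directives are recognized, with no include-path search, comment handling, or preprocessing.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical in-memory "file system": file name -> file contents.
using FileMap = std::map<std::string, std::string>;

// Returns the names found in `#include "name"` directives, in order.
std::vector<std::string> DirectIncludes(const std::string& text) {
  std::vector<std::string> names;
  const std::string key = "#include \"";
  std::size_t pos = 0;
  while ((pos = text.find(key, pos)) != std::string::npos) {
    const std::size_t begin = pos + key.size();
    const std::size_t end = text.find('"', begin);
    if (end == std::string::npos) break;  // unterminated directive
    names.push_back(text.substr(begin, end - begin));
    pos = end + 1;
  }
  return names;
}

// Lists the target first, then every file it includes (directly or
// transitively), each exactly once, in first-seen order.
std::vector<std::string> DumpDependencies(const FileMap& files,
                                          const std::string& target) {
  std::vector<std::string> order{target};
  std::set<std::string> seen{target};
  for (std::size_t i = 0; i < order.size(); ++i) {
    const auto it = files.find(order[i]);
    if (it == files.end()) continue;  // unknown file: nothing to scan
    for (const std::string& dep : DirectIncludes(it->second)) {
      if (seen.insert(dep).second) order.push_back(dep);
    }
  }
  return order;
}
```

Printing the returned names one per line gives the target on the first line and its dependencies on the lines after it.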

    opened by jaebaek 22
  • Split dxc.exe into a static lib and a driver executable


    We have a need to produce a version of dxc.exe that has a few default command line arguments but otherwise is just the standard dxc executable. This commit splits dxc.exe into a static lib with all the dxc logic and a thin wrapper main function that calls into the lib.

    The split allows us to link to the dxc lib and pass our own default arguments.
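
A thin wrapper of this kind might look like the sketch below. The entry point `DxcLibMain`, its signature, and the helper names are hypothetical and stand in for whatever the static library actually exports; the `-O3` default is only an example.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-in for the entry point the static library would export. The name
// and signature are hypothetical; here it only counts arguments so the
// sketch builds without the real compiler.
int DxcLibMain(const std::vector<std::string>& args) {
  return static_cast<int>(args.size());
}

// Builds the final command line: program name, then the baked-in defaults,
// then whatever the user passed, so user flags appear last.
std::vector<std::string> WithDefaultArgs(
    int argc, const char* const* argv,
    const std::vector<std::string>& defaults) {
  std::vector<std::string> args;
  if (argc > 0) args.emplace_back(argv[0]);
  args.insert(args.end(), defaults.begin(), defaults.end());
  for (int i = 1; i < argc; ++i) args.emplace_back(argv[i]);
  return args;
}

// Thin driver: all it does is add defaults and call into the library.
int DriverMain(int argc, const char* const* argv) {
  static const std::vector<std::string> kDefaults = {"-O3"};  // example only
  return DxcLibMain(WithDefaultArgs(argc, argv, kDefaults));
}
```

A custom executable would then consist of a `main` that forwards `argc` and `argv` to `DriverMain`, with its own list of defaults.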

    opened by dmpots 22
  • Use the default allocator for spirv-opt.


    While investigating compile-time performance, I noticed that the allocator seems very slow. When compiling large HLSL files, the total time is ~44s. The profiler tells me that 80-90% of that time is spent deallocating memory in the spirv-opt library. The interesting part is that if I use the standalone spirv-opt.exe, it takes ~5s to optimize, and almost no time is spent in the allocator.

    I don't know why NoSerializeHeapMalloc is used, but I was wondering whether it would be a problem to switch to the default allocator just before calling the optimizer, and then switch back after we are done optimizing.
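
The "switch, optimize, switch back" pattern is commonly written as a scope guard. The sketch below only illustrates that pattern: `Allocator`, `g_current`, `ScopedAllocator`, and `RunOptimizer` are hypothetical and do not correspond to the compiler's actual allocator types, and a single global pointer is assumed for simplicity.

```cpp
#include <cassert>

// Hypothetical allocator handle; the real compiler uses its own types.
struct Allocator {
  const char* name;
};

Allocator g_customAllocator{"custom"};
Allocator g_defaultAllocator{"default"};
Allocator* g_current = &g_customAllocator;  // allocator currently in effect

// Installs a replacement allocator for the lifetime of the guard and
// restores the previous one on destruction, including on early return.
class ScopedAllocator {
 public:
  explicit ScopedAllocator(Allocator* replacement) : saved_(g_current) {
    g_current = replacement;
  }
  ~ScopedAllocator() { g_current = saved_; }
  ScopedAllocator(const ScopedAllocator&) = delete;
  ScopedAllocator& operator=(const ScopedAllocator&) = delete;

 private:
  Allocator* saved_;
};

// Stand-in for the optimizer call; reports which allocator it ran under.
const Allocator* RunOptimizer() { return g_current; }

const Allocator* OptimizeWithDefaultAllocator() {
  ScopedAllocator guard(&g_defaultAllocator);
  return RunOptimizer();
}
```

One caveat with any such swap is that memory allocated under one allocator has to be released under the same one, so objects created inside the guarded scope should not outlive it unless they are copied out first.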

    opened by s-perron 21
  • [SPIR-V] `FileTest.ByteAddressBufferTemplatedStoreStruct2` test is broken


    The FileTest.ByteAddressBufferTemplatedStoreStruct2 test is currently broken. It compiles the wrong HLSL file:

    TEST_F(FileTest, ByteAddressBufferTemplatedStoreStruct) {
      runFileTest("method.byte-address-buffer.templated-store.struct.hlsl");
    }
    TEST_F(FileTest, ByteAddressBufferTemplatedStoreStruct2) {
      runFileTest("method.byte-address-buffer.templated-store.struct.hlsl");  // Should be .struct2.hlsl
    }
    

    When we switch it to use the correct input shader, the test fails:

    bin/clang-spirv-tests --spirv-test-root ~/dxc/DirectXShaderCompiler/tools/clang/test/CodeGenSPIRV --gtest_filter="FileTest.ByteAddressBufferTemplatedStoreStruct2"
    Note: Google Test filter = FileTest.ByteAddressBufferTemplatedStoreStruct2
    [==========] Running 1 test from 1 test case.
    [----------] Global test environment set-up.
    [----------] 1 test from FileTest
    method.byte-address-buffer.templated-store.struct2.hlsl:44:15: error: expected string not found in input
    // CHECK:     [[s0_a:%\d+]] = OpCompositeExtract %half [[sArr]] 0
                  ^
    <codegen>:163:36: note: scanning from here
    %120 = OpCompositeExtract %S %118 1
                                       ^
    <codegen>:163:36: note: with variable "sArr" equal to "%118"
    %120 = OpCompositeExtract %S %118 1
                                       ^
    
    ~/dxc/DirectXShaderCompiler/tools/clang/unittests/SPIRV/FileTestFixture.cpp:129: Failure
          Expected: result.status()
          Which is: 4-byte object <01-00 00-00>
    To be equal to: effcee::Result::Status::Ok
          Which is: 4-byte object <00-00 00-00>
    

    We should update the test so that it uses the correct file and passes, or remove it entirely if broken beyond repair.

    bug spirv test 
    opened by kuhar 2
  • Pass parameters to CMake in a correct way


    According to the Azure Pipelines documentation, $(...) is used in bash steps to reference a variable; the original ${...} does not do that. CC, CXX, and CXX_FLAGS worked anyway because they are also environment variables, but the configuration was never passed to CMAKE_BUILD_TYPE. As a result, the CI was actually building Debug twice rather than one Debug and one Release build.

    opened by gongminmin 1
  • Feature request (spirv): add annotations for images readonly, writeonly


    Currently, in the SPIR-V backend, Texture objects are always translated to sampled images. OpTypeImage will have Sampled set to "will be used with sampler" (https://www.khronos.org/registry/SPIR-V/specs/1.0/SPIRV.html#OpTypeImage).

    There are many cases where we want to use readonly Texture without sampling (e.g. the format does not support sampling - for example uint64). In this case we want it to be a readonly storage image and GLSL provides correct decorations to do that:

    // 1.comp
    // glslangValidator -H -V -o 1.spv 1.comp
    #version 450
    layout(set = 0, binding = 0, rgba8) uniform highp readonly image2D image0;
    layout(set = 0, binding = 1, rgba8) uniform highp writeonly image2D image1;
    void main() {
        vec4 pixel = imageLoad(image0, ivec2(gl_LocalInvocationID));
        imageStore(image1, ivec2(gl_LocalInvocationID), pixel);
    }
    

    The result contains "NonWritable" and "NonReadable":

    Decorate 12(image0) DescriptorSet 0
    Decorate 12(image0) Binding 0
    Decorate 12(image0) NonWritable
    Decorate 17(gl_LocalInvocationID) BuiltIn LocalInvocationId
    Decorate 27(image1) DescriptorSet 0
    Decorate 27(image1) Binding 1
    Decorate 27(image1) NonReadable
    

    Using HLSL with Vulkan makes it difficult to handle these cases, because read-only images will still be treated as UAVs but we want them to be SRVs.

    Somewhat related discussion: https://github.com/KhronosGroup/SPIRV-Cross/issues/1306

    To illustrate the problem, let me provide a more detailed use case.

    Consider the following HLSL shader, which is essentially identical to the GLSL sample above:

    // 1.hlsl
    Texture2D<uint64_t> image0 : register(t0, space0);
    RWTexture2D<uint64_t> image1 : register(u0, space0);
    
    [numthreads (16, 16, 1)]
    void main(uint3 dispatchThreadId : SV_DispatchThreadID) {
        uint64_t pixel = image0[dispatchThreadId.xy];
        image1[dispatchThreadId.xy] = pixel;
    }
    

    Here we are using the uint64_t format, which does not support sampling. On DirectX 12, we can use an SRV descriptor for image0.

    For Vulkan, DXC SPIR-V codegen will generate a "sampled image" for image0, and reflection tools (e.g. spirv_reflect) will treat it as an SRV with descriptor type DESCRIPTOR_TYPE_SAMPLED_IMAGE. What we really want is an SRV with DESCRIPTOR_TYPE_STORAGE_IMAGE. The only way to make it work is to change image0 to RWTexture2D, which turns it into a UAV (and the register specification must change as well).

    I believe providing "readonly" and "writeonly" annotations and emitting the corresponding decorations would make it easy for tools to map correctly to Vulkan descriptor types.

    opened by bsekura 0
  • [SPIR-V] Implement `dot4add_u8packed` and `dot4add_i8packed` intrinsic functions


    If possible, we should add support for the HLSL packed dot product intrinsic functions to the SPIR-V backend, as documented in Shader Model 6.4.

    This is a follow-up from #2953.

    spirv 
    opened by sudonatalie 0
  • Using a static const array in a member function declaration causes CREATEPIXELSHADER_INVALIDSHADERBYTECODE during CreatePipelineState


    Tested with dxcompiler.dll version 1.6.0.3597 (commit a19c32629, 2022/06/22)

    The following snippet compiles successfully with no errors, but pipeline creation returns CREATEPIXELSHADER_INVALIDSHADERBYTECODE or CREATEMESHSHADER_INVALIDSHADERBYTECODE, depending on where the code is used.

    struct MyClass {
      float3 GetTestValue(uint index) { 
        static const float3 kValues[3] = {float3(0, 0, 1), float3(0, 1, 0), float3(1, 0, 0)};
        return kValues[index];
      }
    };
    

    Workarounds:

    • Removing the static declaration fixes the error.
    • Using a global function instead of a class member function fixes the error.
    • Moving the static array declaration to global scope fixes the error.

    Test sample code: test_dxc_bug.hlsl.txt

    The following test-case define causes an error when creating a pipeline state:

    #define TEST_CASE CASE_MEMBER_FUNCTION_STATIC
    

    D3D12 validation layer error log. Note that the error message says "Pixel Shader is unsigned", but with a different define (and no change to the compile pipeline), that error message does not appear.

    D3D12 ERROR: ID3D12Device::CreatePixelShader: Pixel Shader is unsigned. [ STATE_CREATION ERROR #93: CREATEPIXELSHADER_INVALIDSHADERBYTECODE]
    D3D12: **BREAK** enabled for the previous message, which was: [ ERROR STATE_CREATION #93: CREATEPIXELSHADER_INVALIDSHADERBYTECODE ]
    
    D3D12 ERROR: ID3D12Device::CreateMeshShader: Mesh Shader is corrupt unsigned. [ STATE_CREATION ERROR #1265: CREATEMESHSHADER_INVALIDSHADERBYTECODE]
    D3D12: **BREAK** enabled for the previous message, which was: [ ERROR STATE_CREATION #1265: CREATEMESHSHADER_INVALIDSHADERBYTECODE ]
    
    opened by calhsu-nvidia 0
Releases(v1.6.2112)
Owner
Microsoft
Open source projects and samples from Microsoft