Tiny CUDA Neural Networks

Overview

This is a small, self-contained framework for training and querying neural networks. Most notably, it contains a lightning fast "fully fused" multi-layer perceptron as well as support for various advanced input encodings, losses, and optimizers.

This framework powers the following publication:

Real-time Neural Radiance Caching for Path Tracing
Thomas Müller, Fabrice Rousselle, Jan Novák, Alexander Keller
To appear: ACM Transactions on Graphics (SIGGRAPH) 2021

GTC talk

For business inquiries, please contact [email protected].
For press and other inquiries, please contact Hector Marinez at [email protected].

Performance

Figure: Fully fused networks vs. TensorFlow v2.5.0 with XLA. Measured on multi-layer perceptrons that are 64 (solid line) and 128 (dashed line) neurons wide, on an RTX 3090. Generated by benchmarks/bench_ours.cu and benchmarks/bench_tensorflow.py.

License and Citation

This framework is licensed under the BSD 3-clause license. Please see LICENSE.txt for details.

If you use it in your research, we would appreciate a citation via

@misc{tiny-cuda-nn,
    Author = {Thomas M\"uller},
    Year = {2021},
    Note = {https://github.com/nvlabs/tiny-cuda-nn},
    Title = {Tiny {CUDA} Neural Network Framework}
}

Special thanks go to the NRC authors for helpful discussions and to Nikolaus Binder for providing part of the infrastructure of this framework, as well as for help with utilizing TensorCores from within CUDA.

Usage

Tiny CUDA neural networks have a simple C++/CUDA API:

#include <iostream>
#include <tiny-cuda-nn/common.h>

// Configure the model
nlohmann::json config = {
	{"loss", {
		{"otype", "L2"}
	}},
	{"optimizer", {
		{"otype", "Adam"},
		{"learning_rate", 1e-3},
	}},
	{"encoding", {
		{"otype", "OneBlob"},
		{"n_bins", 32},
	}},
	{"network", {
		{"otype", "FullyFusedMLP"},
		{"n_neurons", 64},
		{"n_hidden_layers", 5},
		{"activation", "ReLU"},
		{"output_activation", "None"},
	}},
};

using namespace tcnn;

auto [loss, optimizer, network, trainer] =
	create_from_config(n_input_dims_to_encode, n_input_dims_to_pass_through, n_output_dims, config);

// Train the model
GPUMatrix<float, MatrixLayout::ColumnMajor> training_batch_inputs(n_input_dims, batch_size);
GPUMatrix<float, MatrixLayout::ColumnMajor> training_batch_targets(n_output_dims, batch_size);

for (int i = 0; i < n_training_steps; ++i) {
	generate_training_batch(&training_batch_inputs, &training_batch_targets); // <-- your code

	float loss;
	trainer->training_step(nullptr, training_batch_inputs, training_batch_targets, &loss);
	std::cout << "iteration=" << i << " loss=" << loss << std::endl;
}

// Use the model
GPUMatrix<float, MatrixLayout::ColumnMajor> inference_inputs(n_input_dims, batch_size);
generate_inputs(&inference_inputs); // <-- your code

GPUMatrix<float, MatrixLayout::ColumnMajor> inference_outputs(n_output_dims, batch_size);
network->inference(nullptr, inference_inputs, inference_outputs);
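
The `generate_training_batch` placeholder above is left to the user. As a rough host-side sketch, assuming the `(n_dims, batch_size)` constructor arguments mean rows and columns, a column-major matrix stores each sample's components contiguously. The names `Batch`, `target_function`, and `make_batch` below are hypothetical and not part of tiny-cuda-nn; uploading the buffers to the GPU is omitted.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical host-side batch. Both buffers are column-major: element
// (row, col) lives at index row + col * n_rows, so each sample is contiguous.
struct Batch {
	std::vector<float> inputs;  // n_input_dims  x batch_size
	std::vector<float> targets; // n_output_dims x batch_size
};

// Stand-in for the function to learn; replace with e.g. an image lookup.
inline void target_function(float x, float y, float* rgb) {
	rgb[0] = x;
	rgb[1] = y;
	rgb[2] = 0.5f * (x + y);
}

// Draws batch_size random 2D positions in [0, 1) and evaluates the targets.
inline Batch make_batch(std::size_t batch_size, std::mt19937& rng) {
	constexpr std::size_t n_input_dims = 2;
	constexpr std::size_t n_output_dims = 3;
	std::uniform_real_distribution<float> uniform(0.0f, 1.0f);

	Batch batch;
	batch.inputs.resize(n_input_dims * batch_size);
	batch.targets.resize(n_output_dims * batch_size);

	for (std::size_t i = 0; i < batch_size; ++i) {
		const float x = uniform(rng);
		const float y = uniform(rng);
		batch.inputs[0 + i * n_input_dims] = x;
		batch.inputs[1 + i * n_input_dims] = y;
		target_function(x, y, &batch.targets[i * n_output_dims]);
	}
	return batch;
}
```

The two host buffers would then be copied into the corresponding device matrices before each training step.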

Example: learning a 2D image

We provide a sample application where an image function (x,y) -> (R,G,B) is learned. It can be run via

tiny-cuda-nn/build> ./mlp_learning_an_image ../data/images/albert.exr ../data/config.json

producing an image every 1000 training steps. Each block of 1000 steps should take roughly 0.8 seconds with the default configuration on an RTX 3090.

Figure: learned image after 1,000 steps, learned image after 10,000 steps, and the reference image.

Requirements

  • CUDA v11.2 or higher.
  • CMake v3.17 or higher.
  • A C++14 capable compiler.
  • A high-end NVIDIA GPU that supports TensorCores and has a large amount of shared memory. The framework was tested primarily with an RTX 3090.
    • Ampere GeForce GPUs: compiles out of the box.
    • Ampere A100: requires changing CMAKE_CUDA_ARCHITECTURE to 80 in CMakeLists.txt.
    • Turing GPUs: requires changing CMAKE_CUDA_ARCHITECTURE to 75 in CMakeLists.txt as well as changing SmArch in include/tiny-cuda-nn/cutlass_matmul.h to cutlass::arch::Sm75.

Compilation

Begin by cloning this repository and all its submodules using the following command:

> git clone --recursive https://github.com/nvlabs/tiny-cuda-nn
> cd tiny-cuda-nn
tiny-cuda-nn>

Then, use CMake to generate build files:

tiny-cuda-nn> mkdir build
tiny-cuda-nn> cd build
tiny-cuda-nn/build> cmake ..

The final step depends on your operating system. On Windows, open tiny-cuda-nn/build/tiny-cuda-nn.sln in Visual Studio and click the "Build" button. On Linux, compile with

tiny-cuda-nn/build> make -j

Components

The following is a summary of all components of this framework that are currently released. Please consult the JSON documentation for how to configure them.

| Networks | Source file | Description |
|:--|:--|:--|
| Fully fused MLP | src/fully_fused_mlp.cu | Lightning fast implementation of small multi-layer perceptrons (MLPs). |
| CUTLASS MLP | src/cutlass_mlp.cu | MLP based on CUTLASS' GEMM routines. Slower than fully-fused, but handles larger networks and still is reasonably fast. |
| CUTLASS ResNet | src/cutlass_resnet.cu | Fully connected residual network based on CUTLASS' GEMM routines. |

| Input encodings | Source file | Description |
|:--|:--|:--|
| Identity | include/tiny-cuda-nn/encodings/identity.h | Leaves values untouched. |
| Oneblob | include/tiny-cuda-nn/encodings/oneblob.h | From Neural Importance Sampling [Müller et al. 2019] and Neural Control Variates [Müller et al. 2020]. |
| Frequency | include/tiny-cuda-nn/encodings/frequency.h | From NeRF [Mildenhall et al. 2020]. |
| NRC | include/tiny-cuda-nn/encodings/nrc.h | Combined oneblob and frequency encoding used in Neural Radiance Caching [Müller et al. 2021]. |
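
To make the frequency encoding listed above more concrete, here is a minimal CPU sketch of the formula from the NeRF paper for a single scalar input. The function name `frequency_encode` and the output ordering are assumptions for illustration; the layout used in frequency.h may differ.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Frequency encoding of a scalar x following the NeRF formula:
// (sin(2^0 pi x), cos(2^0 pi x), ..., sin(2^(L-1) pi x), cos(2^(L-1) pi x)).
// The interleaved sin/cos ordering is an assumption taken from the paper.
inline std::vector<float> frequency_encode(float x, std::size_t n_frequencies) {
	const float pi = 3.14159265358979323846f;
	std::vector<float> out;
	out.reserve(2 * n_frequencies);
	float scale = pi; // 2^0 * pi, doubled for every further frequency
	for (std::size_t l = 0; l < n_frequencies; ++l) {
		out.push_back(std::sin(scale * x));
		out.push_back(std::cos(scale * x));
		scale *= 2.0f;
	}
	return out;
}
```

Each input dimension would be expanded into `2 * n_frequencies` features this way before being fed to the network.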

| Losses | Source file | Description |
|:--|:--|:--|
| L2 | include/tiny-cuda-nn/losses/l2.h | Standard L2 loss. |
| Relative L2 | include/tiny-cuda-nn/losses/relative_l2.h | Relative L2 loss normalized by the network prediction [Lehtinen et al. 2018]. |
| Relative L2 Luminance | include/tiny-cuda-nn/losses/relative_l2_luminance.h | Same as above, but normalized by the luminance of the network prediction. Only applicable when network prediction is RGB. Used in Neural Radiance Caching [Müller et al. 2021]. |
| Cross Entropy | include/tiny-cuda-nn/losses/cross_entropy.h | Standard cross entropy loss. Only applicable when the network prediction is a PDF. |
| Variance | include/tiny-cuda-nn/losses/variance_is.h | Standard variance loss. Only applicable when the network prediction is a PDF. |
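
As an illustration of what "normalized by the network prediction" means for the relative L2 loss, the following CPU sketch divides each squared error by the squared prediction. The function name `relative_l2_loss` and the epsilon value are assumptions chosen for this example, not values confirmed by the framework.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Mean relative L2 loss: (prediction - target)^2 / (prediction^2 + epsilon).
// The epsilon default is an assumption; it only guards against division by zero.
inline float relative_l2_loss(const std::vector<float>& predictions,
                              const std::vector<float>& targets,
                              float epsilon = 1e-2f) {
	assert(predictions.size() == targets.size() && !predictions.empty());
	float sum = 0.0f;
	for (std::size_t i = 0; i < predictions.size(); ++i) {
		const float diff = predictions[i] - targets[i];
		sum += diff * diff / (predictions[i] * predictions[i] + epsilon);
	}
	return sum / static_cast<float>(predictions.size());
}
```

Compared with plain L2, an error of the same absolute size is penalized more when the prediction is small, which helps with targets that span a wide dynamic range.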

| Optimizers | Source file | Description |
|:--|:--|:--|
| Adam | include/tiny-cuda-nn/optimizers/adam.h | Implementation of Adam [Kingma and Ba 2014], generalized to AdaBound [Luo et al. 2019]. |
| Novograd | include/tiny-cuda-nn/optimizers/novograd.h | Implementation of Novograd [Ginsburg et al. 2019]. |
| SGD | include/tiny-cuda-nn/optimizers/sgd.h | Standard stochastic gradient descent (SGD). |
| Shampoo | include/tiny-cuda-nn/optimizers/shampoo.h | Implementation of the 2nd order Shampoo optimizer [Gupta et al. 2018] with home-grown optimizations as well as those by Anil et al. [2020]. |
| Average | include/tiny-cuda-nn/optimizers/average.h | Wraps another optimizer and computes a linear average of the weights over the last N iterations. The average is used for inference only (does not feed back into training). |
| Batched | include/tiny-cuda-nn/optimizers/batched.h | Wraps another optimizer, invoking the nested optimizer once every N steps on the averaged gradient. Has the same effect as increasing the batch size but requires only a constant amount of memory. |
| EMA | include/tiny-cuda-nn/optimizers/ema.h | Wraps another optimizer and computes an exponential moving average of the weights. The average is used for inference only (does not feed back into training). |
| Exponential Decay | include/tiny-cuda-nn/optimizers/exponential_decay.h | Wraps another optimizer and performs piecewise-constant exponential learning-rate decay. |
| Lookahead | include/tiny-cuda-nn/optimizers/lookahead.h | Wraps another optimizer, implementing the lookahead algorithm [Zhang et al. 2019]. |

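
The wrapping optimizers are easiest to picture with a small example. The sketch below mimics the described behavior of the Batched optimizer on the CPU: gradients are accumulated for N steps, and the nested optimizer is then invoked once on their average. `BatchedOptimizer` and `make_sgd` are hypothetical names for illustration and do not reflect the framework's actual classes.

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical CPU sketch of a "batched" wrapper around a nested optimizer.
class BatchedOptimizer {
public:
	using Step = std::function<void(std::vector<float>& weights, const std::vector<float>& gradient)>;

	BatchedOptimizer(std::size_t n_steps, std::size_t n_weights, Step nested)
	: m_n_steps(n_steps), m_accumulated(n_weights, 0.0f), m_nested(std::move(nested)) {}

	void step(std::vector<float>& weights, const std::vector<float>& gradient) {
		for (std::size_t i = 0; i < m_accumulated.size(); ++i) {
			m_accumulated[i] += gradient[i];
		}
		if (++m_count < m_n_steps) {
			return; // keep accumulating; the weights stay untouched
		}
		for (float& g : m_accumulated) {
			g /= static_cast<float>(m_n_steps);
		}
		m_nested(weights, m_accumulated); // one nested step on the averaged gradient
		m_accumulated.assign(m_accumulated.size(), 0.0f);
		m_count = 0;
	}

private:
	std::size_t m_n_steps;
	std::size_t m_count = 0;
	std::vector<float> m_accumulated; // one buffer, independent of N
	Step m_nested;
};

// Plain SGD as the nested optimizer: w <- w - learning_rate * g.
inline BatchedOptimizer::Step make_sgd(float learning_rate) {
	return [learning_rate](std::vector<float>& weights, const std::vector<float>& gradient) {
		for (std::size_t i = 0; i < weights.size(); ++i) {
			weights[i] -= learning_rate * gradient[i];
		}
	};
}
```

The single accumulation buffer is what keeps memory use constant regardless of N, in contrast to actually enlarging the batch.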
Comments
  • Very high memory usage at compilation time

    Hi,

    I'm in the process of setting up nerfstudio and need to install tiny-cuda-nn via

    pip install git+https://github.com/NVlabs/tiny-cuda-nn/#subdirectory=bindings/torch
    

    I currently have ~130GB of RAM allocated for this job, but that does not seem to be enough. [Screenshot from 2022-10-19 13-39-20]

    Terminal output so far:

    $ pip install git+https://github.com/NVlabs/tiny-cuda-nn/#subdirectory=bindings/torch
    Collecting git+https://github.com/NVlabs/tiny-cuda-nn/#subdirectory=bindings/torch
      Cloning https://github.com/NVlabs/tiny-cuda-nn/ to /scratch_local/ahochlehnert48-1936838/tmp/pip-req-build-zbi99zii
      Running command git clone --filter=blob:none --quiet https://github.com/NVlabs/tiny-cuda-nn/ /scratch_local/user-1936838/tmp/pip-req-build-zbi99zii
      Resolved https://github.com/NVlabs/tiny-cuda-nn/ to commit 93ed3e38b9bf25bc159d9dcff93d78472c8e1141
      Running command git submodule update --init --recursive -q
      Preparing metadata (setup.py) ... done
    Building wheels for collected packages: tinycudann
      Building wheel for tinycudann (setup.py) ... | 
    

    How much memory do I need in order to build tiny-cuda-nn using pip?

    Thx

    opened by libeanim 14
  • Underlying buffer has been detached

    I followed

    conda create -n dmodel python=3.9
    activate dmodel
    conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch
    pip install ninja imageio PyOpenGL glfw xatlas gdown
    pip install git+https://github.com/NVlabs/nvdiffrast/
    pip install --global-option="--no-networks" git+https://github.com/NVlabs/tiny-cuda-nn/#subdirectory=bindings/torch
    imageio_download_bin freeimage
    

    on Windows 10.

    pip install --global-option="--no-networks" git+https://github.com/NVlabs/tiny-cuda-nn/#subdirectory=bindings/torch

    threw

                    instantiation of "decltype(auto) std::_Get_unwrapped(_Iter &&) [with _Iter=nlohmann::basic_json<std::map, std::vector, std::string, __nv_bool, int64_t, uint64_t, double, std::allocator, nlohmann::adl_serializer, std::vector<uint8_t, std::allocator<uint8_t>>> *const &]"
        e:/VS/VC/Tools/MSVC/14.29.30133/include\xmemory(1703): here
                    instantiation of "std::_Alloc_ptr_t<_Alloc> std::_Uninitialized_move(_InIt, _InIt, std::_Alloc_ptr_t<_Alloc>, _Alloc &) [with _InIt=nlohmann::basic_json<std::map, std::vector, std::string, __nv_bool, int64_t, uint64_t, double, std::allocator, nlohmann::adl_serializer, std::vector<uint8_t, std::allocator<uint8_t>>> *, _Alloc=std::_Rebind_alloc_t<std::allocator<nlohmann::basic_json<std::map, std::vector, std::string, __nv_bool, int64_t, uint64_t, double, std::allocator, nlohmann::adl_serializer, std::vector<uint8_t, std::allocator<uint8_t>>>>, nlohmann::basic_json<std::map, std::vector, std::string, __nv_bool, int64_t, uint64_t, double, std::allocator, nlohmann::adl_serializer, std::vector<uint8_t, std::allocator<uint8_t>>>>]"
        e:/VS/VC/Tools/MSVC/14.29.30133/include\vector(1651): here
                    instantiation of "void std::vector<_Ty, _Alloc>::_Umove_if_noexcept1(std::vector<_Ty, _Alloc>::pointer, std::vector<_Ty, _Alloc>::pointer, std::vector<_Ty, _Alloc>::pointer, std::true_type) [with _Ty=nlohmann::basic_json<std::map, std::vector, std::string, __nv_bool, int64_t, uint64_t, double, std::allocator, nlohmann::adl_serializer, std::vector<uint8_t, std::allocator<uint8_t>>>, _Alloc=std::allocator<nlohmann::basic_json<std::map, std::vector, std::string, __nv_bool, int64_t, uint64_t, double, std::allocator, nlohmann::adl_serializer, std::vector<uint8_t, std::allocator<uint8_t>>>>]"
        e:/VS/VC/Tools/MSVC/14.29.30133/include\vector(1662): here
                    instantiation of "void std::vector<_Ty, _Alloc>::_Umove_if_noexcept(std::vector<_Ty, _Alloc>::pointer, std::vector<_Ty, _Alloc>::pointer, std::vector<_Ty, _Alloc>::pointer) [with _Ty=nlohmann::basic_json<std::map, std::vector, std::string, __nv_bool, int64_t, uint64_t, double, std::allocator, nlohmann::adl_serializer, std::vector<uint8_t, std::allocator<uint8_t>>>, _Alloc=std::allocator<nlohmann::basic_json<std::map, std::vector, std::string, __nv_bool, int64_t, uint64_t, double, std::allocator, nlohmann::adl_serializer, std::vector<uint8_t, std::allocator<uint8_t>>>>]"
        e:/VS/VC/Tools/MSVC/14.29.30133/include\vector(1297): here
                    instantiation of "void std::vector<_Ty, _Alloc>::_Reallocate_exactly(std::vector<_Ty, _Alloc>::size_type) [with _Ty=nlohmann::basic_json<std::map, std::vector, std::string, __nv_bool, int64_t, uint64_t, double, std::allocator, nlohmann::adl_serializer, std::vector<uint8_t, std::allocator<uint8_t>>>, _Alloc=std::allocator<nlohmann::basic_json<std::map, std::vector, std::string, __nv_bool, int64_t, uint64_t, double, std::allocator, nlohmann::adl_serializer, std::vector<uint8_t, std::allocator<uint8_t>>>>]"
        e:/VS/VC/Tools/MSVC/14.29.30133/include\vector(1363): here
                    instantiation of "void std::vector<_Ty, _Alloc>::reserve(std::vector<_Ty, _Alloc>::size_type) [with _Ty=nlohmann::basic_json<std::map, std::vector, std::string, __nv_bool, int64_t, uint64_t, double, std::allocator, nlohmann::adl_serializer, std::vector<uint8_t, std::allocator<uint8_t>>>, _Alloc=std::allocator<nlohmann::basic_json<std::map, std::vector, std::string, __nv_bool, int64_t, uint64_t, double, std::allocator, nlohmann::adl_serializer, std::vector<uint8_t, std::allocator<uint8_t>>>>]"
        E:/Temp/pip-req-build-dx4hpd_b/dependencies\json/json.hpp(18616): here
                    instantiation of "void nlohmann::basic_json<ObjectType, ArrayType, StringType, BooleanType, NumberIntegerType, NumberUnsignedType, NumberFloatType, AllocatorType, JSONSerializer, BinaryType>::json_value::destroy(nlohmann::basic_json<ObjectType, ArrayType, StringType, BooleanType, NumberIntegerType, NumberUnsignedType, NumberFloatType, AllocatorType, JSONSerializer, BinaryType>::value_t) [with ObjectType=std::map, ArrayType=std::vector, StringType=std::string, BooleanType=__nv_bool, NumberIntegerType=int64_t, NumberUnsignedType=uint64_t, NumberFloatType=double, AllocatorType=std::allocator, JSONSerializer=nlohmann::adl_serializer, BinaryType=std::vector<uint8_t, std::allocator<uint8_t>>]"
        E:/Temp/pip-req-build-dx4hpd_b/dependencies\json/json.hpp(19828): here
                    instantiation of "nlohmann::basic_json<ObjectType, ArrayType, StringType, BooleanType, NumberIntegerType, NumberUnsignedType, NumberFloatType, AllocatorType, JSONSerializer, BinaryType>::~basic_json() [with ObjectType=std::map, ArrayType=std::vector, StringType=std::string, BooleanType=__nv_bool, NumberIntegerType=int64_t, NumberUnsignedType=uint64_t, NumberFloatType=double, AllocatorType=std::allocator, JSONSerializer=nlohmann::adl_serializer, BinaryType=std::vector<uint8_t, std::allocator<uint8_t>>]"
        E:/Temp/pip-req-build-dx4hpd_b/dependencies\json/json.hpp(20679): here
    
    ...
    
        e:/VS/VC/Tools/MSVC/14.29.30133/include\xutility(124): error: expected a "("
                  detected during:
                    instantiation of "void *std::_Voidify_iter(_Iter) [with _Iter=std::vector<nlohmann::basic_json<std::map, std::vector, std::string, __nv_bool, int64_t, uint64_t, double, std::allocator, nlohmann::adl_serializer, std::vector<uint8_t, std::allocator<uint8_t>>>, std::allocator<nlohmann::basic_json<std::map, std::vector, std::string, __nv_bool, int64_t, uint64_t, double, std::allocator, nlohmann::adl_serializer, std::vector<uint8_t, std::allocator<uint8_t>>>>> *]"
        e:/VS/VC/Tools/MSVC/14.29.30133/include\xmemory(681): here
                    instantiation of "void std::_Default_allocator_traits<_Alloc>::construct(_Alloc &, _Objty *, _Types &&...) [with _Alloc=std::allocator<std::vector<nlohmann::basic_json<std::map, std::vector, std::string, __nv_bool, int64_t, uint64_t, double, std::allocator, nlohmann::adl_serializer, std::vector<uint8_t, std::allocator<uint8_t>>>, std::allocator<nlohmann::basic_json<std::map, std::vector, std::string, __nv_bool, int64_t, uint64_t, double, std::allocator, nlohmann::adl_serializer, std::vector<uint8_t, std::allocator<uint8_t>>>>>>, _Objty=std::vector<nlohmann::basic_json<std::map, std::vector, std::string, __nv_bool, int64_t, uint64_t, double, std::allocator, nlohmann::adl_serializer, std::vector<uint8_t, std::allocator<uint8_t>>>, std::allocator<nlohmann::basic_json<std::map, std::vector, std::string, __nv_bool, int64_t, uint64_t, double, std::allocator, nlohmann::adl_serializer, std::vector<uint8_t, std::allocator<uint8_t>>>>>, _Types=<>]"
        E:/Temp/pip-req-build-dx4hpd_b/dependencies\json/json.hpp(18440): here
                    instantiation of "T *nlohmann::basic_json<ObjectType, ArrayType, StringType, BooleanType, NumberIntegerType, NumberUnsignedType, NumberFloatType, AllocatorType, JSONSerializer, BinaryType>::create<T,Args...>(Args &&...) [with ObjectType=std::map, ArrayType=std::vector, StringType=std::string, BooleanType=__nv_bool, NumberIntegerType=int64_t, NumberUnsignedType=uint64_t, NumberFloatType=double, AllocatorType=std::allocator, JSONSerializer=nlohmann::adl_serializer, BinaryType=std::vector<uint8_t, std::allocator<uint8_t>>, T=std::vector<nlohmann::basic_json<std::map, std::vector, std::string, __nv_bool, int64_t, uint64_t, double, std::allocator, nlohmann::adl_serializer, std::vector<uint8_t, std::allocator<uint8_t>>>, std::allocator<nlohmann::basic_json<std::map, std::vector, std::string, __nv_bool, int64_t, uint64_t, double, std::allocator, nlohmann::adl_serializer, std::vector<uint8_t, std::allocator<uint8_t>>>>>, Args=<>]"
        E:/Temp/pip-req-build-dx4hpd_b/dependencies\json/json.hpp(18517): here
                    instantiation of "nlohmann::basic_json<ObjectType, ArrayType, StringType, BooleanType, NumberIntegerType, NumberUnsignedType, NumberFloatType, AllocatorType, JSONSerializer, BinaryType>::json_value::json_value(nlohmann::basic_json<ObjectType, ArrayType, StringType, BooleanType, NumberIntegerType, NumberUnsignedType, NumberFloatType, AllocatorType, JSONSerializer, BinaryType>::value_t) [with ObjectType=std::map, ArrayType=std::vector, StringType=std::string, BooleanType=__nv_bool, NumberIntegerType=int64_t, NumberUnsignedType=uint64_t, NumberFloatType=double, AllocatorType=std::allocator, JSONSerializer=nlohmann::adl_serializer, BinaryType=std::vector<uint8_t, std::allocator<uint8_t>>]"
        E:/Temp/pip-req-build-dx4hpd_b/dependencies\json/json.hpp(18947): here
                    instantiation of "nlohmann::basic_json<ObjectType, ArrayType, StringType, BooleanType, NumberIntegerType, NumberUnsignedType, NumberFloatType, AllocatorType, JSONSerializer, BinaryType>::basic_json(nlohmann::basic_json<ObjectType, ArrayType, StringType, BooleanType, NumberIntegerType, NumberUnsignedType, NumberFloatType, AllocatorType, JSONSerializer, BinaryType>::value_t) [with ObjectType=std::map, ArrayType=std::vector, StringType=std::string, BooleanType=__nv_bool, NumberIntegerType=int64_t, NumberUnsignedType=uint64_t, NumberFloatType=double, AllocatorType=std::allocator, JSONSerializer=nlohmann::adl_serializer, BinaryType=std::vector<uint8_t, std::allocator<uint8_t>>]"
        E:/Temp/pip-req-build-dx4hpd_b/dependencies\json/json.hpp(18971): here
                    instantiation of "nlohmann::basic_json<ObjectType, ArrayType, StringType, BooleanType, NumberIntegerType, NumberUnsignedType, NumberFloatType, AllocatorType, JSONSerializer, BinaryType>::basic_json(std::nullptr_t) [with ObjectType=std::map, ArrayType=std::vector, StringType=std::string, BooleanType=__nv_bool, NumberIntegerType=int64_t, NumberUnsignedType=uint64_t, NumberFloatType=double, AllocatorType=std::allocator, JSONSerializer=nlohmann::adl_serializer, BinaryType=std::vector<uint8_t, std::allocator<uint8_t>>]"
        E:/Temp/pip-req-build-dx4hpd_b/dependencies\json/json.hpp(24402): here
                    instantiation of "nlohmann::basic_json<ObjectType, ArrayType, StringType, BooleanType, NumberIntegerType, NumberUnsignedType, NumberFloatType, AllocatorType, JSONSerializer, BinaryType> nlohmann::basic_json<ObjectType, ArrayType, StringType, BooleanType, NumberIntegerType, NumberUnsignedType, NumberFloatType, AllocatorType, JSONSerializer, BinaryType>::parse(IteratorType, IteratorType, nlohmann::basic_json<ObjectType, ArrayType, StringType, BooleanType, NumberIntegerType, NumberUnsignedType, NumberFloatType, AllocatorType, JSONSerializer, BinaryType>::parser_callback_t, __nv_bool, __nv_bool) [with ObjectType=std::map, ArrayType=std::vector, StringType=std::string, BooleanType=__nv_bool, NumberIntegerType=int64_t, NumberUnsignedType=uint64_t, NumberFloatType=double, AllocatorType=std::allocator, JSONSerializer=nlohmann::adl_serializer, BinaryType=std::vector<uint8_t, std::allocator<uint8_t>>, IteratorType=const char *]"
        E:/Temp/pip-req-build-dx4hpd_b/dependencies\json/json.hpp(26513): here
    
        Error limit reached.
        100 errors detected in the compilation of "E:/Temp/pip-req-build-dx4hpd_b/src/cpp_api.cu".
        Compilation terminated.
        cpp_api.cu
        [5/5] cl /showIncludes /nologo /O2 /W3 /GL /DNDEBUG /MD /MD /wd4819 /wd4251 /wd4244 /wd4267 /wd4275 /wd4018 /wd4190 /EHsc -IE:\Temp\pip-req-build-dx4hpd_b/include -IE:\Temp\pip-req-build-dx4hpd_b/dependencies -IE:\Temp\pip-req-build-dx4hpd_b/dependencies/cutlass/include -IE:\Temp\pip-req-build-dx4hpd_b/dependencies/cutlass/tools/util/include -IE:\miniconda\envs\dmodel\lib\site-packages\torch\include -IE:\miniconda\envs\dmodel\lib\site-packages\torch\include\torch\csrc\api\include -IE:\miniconda\envs\dmodel\lib\site-packages\torch\include\TH -IE:\miniconda\envs\dmodel\lib\site-packages\torch\include\THC "-IE:\Eigene Programme\Cuda\include" -IE:\miniconda\envs\dmodel\include -IE:\miniconda\envs\dmodel\Include "-IE:\VS\VC\Tools\MSVC\14.29.30133\ATLMFC\include" "-IE:\VS\VC\Tools\MSVC\14.29.30133\include" "-IE:\WK\NETFXSDK\4.8\include\um" "-IE:\Windows Kits\10\include\10.0.19041.0\ucrt" "-IE:\Windows Kits\10\include\10.0.19041.0\shared" "-IE:\Windows Kits\10\include\10.0.19041.0\um" "-IE:\Windows Kits\10\include\10.0.19041.0\winrt" "-IE:\Windows Kits\10\include\10.0.19041.0\cppwinrt" -c E:\Temp\pip-req-build-dx4hpd_b\bindings\torch\tinycudann\bindings.cpp /FoE:\Temp\pip-req-build-dx4hpd_b\bindings\torch\build\temp.win-amd64-3.9\Release\tinycudann/bindings.obj /std:c++14 -DTCNN_MIN_GPU_ARCH=52 -DTCNN_NO_NETWORKS -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0
       
       ...
        Note: including file: E:\Temp\pip-req-build-dx4hpd_b/dependencies\json/json.hpp
        Note: including file:  E:\VS\VC\Tools\MSVC\14.29.30133\include\cassert
        Note: including file:   E:\Windows Kits\10\include\10.0.19041.0\ucrt\assert.h
        Note: including file: E:\Temp\pip-req-build-dx4hpd_b/dependencies\pybind11_json/pybind11_json.hpp
        Note: including file: E:\Temp\pip-req-build-dx4hpd_b/include\tiny-cuda-nn/cpp_api.h
        ninja: build stopped: subcommand failed.
        Traceback (most recent call last):
          File "E:\miniconda\envs\dmodel\lib\site-packages\torch\utils\cpp_extension.py", line 1740, in _run_ninja_build
            subprocess.run(
          File "E:\miniconda\envs\dmodel\lib\subprocess.py", line 528, in run
            raise CalledProcessError(retcode, process.args,
        subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.
    
        The above exception was the direct cause of the following exception:
    
        Traceback (most recent call last):
          File "<string>", line 1, in <module>
          File "E:\Temp\pip-req-build-dx4hpd_b\bindings/torch\setup.py", line 117, in <module>
            setup(
          File "E:\miniconda\envs\dmodel\lib\site-packages\setuptools\__init__.py", line 87, in setup
            return distutils.core.setup(**attrs)
          File "E:\miniconda\envs\dmodel\lib\site-packages\setuptools\_distutils\core.py", line 148, in setup
            return run_commands(dist)
          File "E:\miniconda\envs\dmodel\lib\site-packages\setuptools\_distutils\core.py", line 163, in run_commands
            dist.run_commands()
          File "E:\miniconda\envs\dmodel\lib\site-packages\setuptools\_distutils\dist.py", line 967, in run_commands
            self.run_command(cmd)
          File "E:\miniconda\envs\dmodel\lib\site-packages\setuptools\dist.py", line 1214, in run_command
            super().run_command(command)
          File "E:\miniconda\envs\dmodel\lib\site-packages\setuptools\_distutils\dist.py", line 986, in run_command
            cmd_obj.run()
          File "E:\miniconda\envs\dmodel\lib\site-packages\setuptools\command\install.py", line 68, in run
            return orig.install.run(self)
          File "E:\miniconda\envs\dmodel\lib\site-packages\setuptools\_distutils\command\install.py", line 664, in run
            self.run_command('build')
          File "E:\miniconda\envs\dmodel\lib\site-packages\setuptools\_distutils\cmd.py", line 313, in run_command
            self.distribution.run_command(command)
          File "E:\miniconda\envs\dmodel\lib\site-packages\setuptools\dist.py", line 1214, in run_command
            super().run_command(command)
          File "E:\miniconda\envs\dmodel\lib\site-packages\setuptools\_distutils\dist.py", line 986, in run_command
            cmd_obj.run()
          File "E:\miniconda\envs\dmodel\lib\site-packages\setuptools\_distutils\command\build.py", line 135, in run
            self.run_command(cmd_name)
          File "E:\miniconda\envs\dmodel\lib\site-packages\setuptools\_distutils\cmd.py", line 313, in run_command
            self.distribution.run_command(command)
          File "E:\miniconda\envs\dmodel\lib\site-packages\setuptools\dist.py", line 1214, in run_command
            super().run_command(command)
          File "E:\miniconda\envs\dmodel\lib\site-packages\setuptools\_distutils\dist.py", line 986, in run_command
            cmd_obj.run()
          File "E:\miniconda\envs\dmodel\lib\site-packages\setuptools\command\build_ext.py", line 79, in run
            _build_ext.run(self)
          File "E:\miniconda\envs\dmodel\lib\site-packages\setuptools\_distutils\command\build_ext.py", line 339, in run
            self.build_extensions()
          File "E:\miniconda\envs\dmodel\lib\site-packages\torch\utils\cpp_extension.py", line 741, in build_extensions
            build_ext.build_extensions(self)
          File "E:\miniconda\envs\dmodel\lib\site-packages\setuptools\_distutils\command\build_ext.py", line 448, in build_extensions
            self._build_extensions_serial()
          File "E:\miniconda\envs\dmodel\lib\site-packages\setuptools\_distutils\command\build_ext.py", line 473, in _build_extensions_serial
            self.build_extension(ext)
          File "E:\miniconda\envs\dmodel\lib\site-packages\setuptools\command\build_ext.py", line 202, in build_extension
            _build_ext.build_extension(self, ext)
          File "E:\miniconda\envs\dmodel\lib\site-packages\setuptools\_distutils\command\build_ext.py", line 528, in build_extension
            objects = self.compiler.compile(sources,
          File "E:\miniconda\envs\dmodel\lib\site-packages\torch\utils\cpp_extension.py", line 714, in win_wrap_ninja_compile
            _write_ninja_file_and_compile_objects(
          File "E:\miniconda\envs\dmodel\lib\site-packages\torch\utils\cpp_extension.py", line 1419, in _write_ninja_file_and_compile_objects
            _run_ninja_build(
          File "E:\miniconda\envs\dmodel\lib\site-packages\torch\utils\cpp_extension.py", line 1756, in _run_ninja_build
            raise RuntimeError(message) from e
        RuntimeError: Error compiling objects for extension
        Error in atexit._run_exitfuncs:
        Traceback (most recent call last):
          File "E:\miniconda\envs\dmodel\lib\site-packages\colorama\ansitowin32.py", line 59, in closed
            return stream.closed
        ValueError: underlying buffer has been detached
        ----------------------------------------
    ERROR: Command errored out with exit status 1: 'E:\miniconda\envs\dmodel\python.exe' -u -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'E:\\Temp\\pip-req-build-dx4hpd_b\\bindings/torch\\setup.py'"'"'; __file__='"'"'E:\\Temp\\pip-req-build-dx4hpd_b\\bindings/torch\\setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(__file__) if os.path.exists(__file__) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' --no-networks install --record 'E:\Temp\pip-record-xxbieno2\install-record.txt' --single-version-externally-managed --compile --install-headers 'E:\miniconda\envs\dmodel\Include\tinycudann' Check the logs for full command output.
    
    
    opened by ErfolgreichCharismatisch 12
  • Got cutlass error: Error Internal at: 363, when trying to run samples/mlp_learning_an_image_pytorch.py

    Hi, thank you for your PyTorch extension! When I tried to run samples/mlp_learning_an_image_pytorch.py, I got an error message:

    Warning: FullyFusedMLP is not supported for the selected architecture 61. Falling back to CutlassMLP. For maximum performance, raise the target GPU architecture to 75+.
    Warning: FullyFusedMLP is not supported for the selected architecture 61. Falling back to CutlassMLP. For maximum performance, raise the target GPU architecture to 75+.
    NetworkWithInputEncoding(n_input_dims=2, n_output_dims=3, seed=1337, dtype=torch.float32, hyperparams={'encoding': {'base_resolution': 16, 'interpolation': 'Linear', 'log2_hashmap_size': 15, 'n_features_per_level': 2, 'n_levels': 16, 'otype': 'Grid', 'per_level_scale': 1.5, 'type': 'Hash'}, 'network': {'activation': 'ReLU', 'n_hidden_layers': 2, 'n_neurons': 64, 'otype': 'CutlassMLP', 'output_activation': 'None'}, 'otype': 'NetworkWithInputEncoding'})
    Writing 'reference.jpg'... done.
    Beginning optimization with 10000000 training steps.
    samples/mlp_learning_an_image_pytorch.py:74: TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
      xs = xs * torch.tensor([shape[1], shape[0]], device=xs.device).float()
    samples/mlp_learning_an_image_pytorch.py:74: TracerWarning: Converting a tensor to a Python index might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
      xs = xs * torch.tensor([shape[1], shape[0]], device=xs.device).float()
    Got cutlass error: Error Internal at: 363

    Maybe there is something wrong with my environment?

    My environment:

    Ubuntu 20.04.4 LTS
    GeForce GTX 1080 Ti
    CUDA 11.0 / Driver Version: 470.86
    pytorch 1.7.1+cu110
    cmake 3.22.2

    I installed tinycudann by running python setup.py install.

    Thank you

    opened by OctoberKat 10
  • build failing

    build failing

    Build is failing

    [ 5%] Building CUDA object src/CMakeFiles/tiny-cuda-nn.dir/common.cu.o
    nvcc fatal : A single input file is required for a non-link phase when an outputfile is specified
    make[2]: *** [src/CMakeFiles/tiny-cuda-nn.dir/build.make:76: src/CMakeFiles/tiny-cuda-nn.dir/common.cu.o] Error 1
    make[1]: *** [CMakeFiles/Makefile2:134: src/CMakeFiles/tiny-cuda-nn.dir/all] Error 2
    make: *** [Makefile:91: all] Error 2

    opened by rathken 10
  • segmentation fault

    segmentation fault

    Thanks for your amazing work! When I use tcnn, I get this error:

    [1] 299697 segmentation fault (core dumped) python scripts/test_grid_bwdbwd.py

    My Python environment: Python 3.6.13, PyTorch 1.10.2, CUDA 11.3.

    opened by lambdald 9
  • Any plans for double backward / second-order gradients, i.e. backward for backward functions?

    Any plans for double backward / second-order gradients, i.e. backward for backward functions?

    Hi, first of all, thanks for the great repo! I've already built a project based on tcnn and found it extremely helpful.

    However, during usage I found that because the backward functions are implemented in C++, PyTorch cannot track them, so autograd.grad(..., create_graph=True) fails to generate a grad_fn for the gradients (i.e. second-order gradients are unavailable).

    This functionality is helpful when the training losses depend on first-order gradients. For example, when training an SDF MLP, an eikonal loss is typically used, which is a loss applied to dy_dx (nablas) of the network. Backpropagating this loss requires d(dy_dx)_dparam. Ref: https://arxiv.org/abs/2002.10099
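To make the requirement concrete, here is a minimal, dependency-free sketch. It is not tcnn or PyTorch code. It assumes a hypothetical one-parameter toy SDF, f(x; s) = s * |x| - 1, whose spatial gradient has norm s, so the eikonal loss reduces to (s - 1)^2. Both derivatives are taken by central finite differences purely for illustration; the function names and sample points are made up for this example. With autograd, the outer derivative is the step that needs d(dy_dx)/dparam, i.e. a backward pass through the backward pass.

```python
import math


def sdf(x, s):
    """Hypothetical toy SDF: scaled distance to the origin, minus 1."""
    return s * math.sqrt(sum(c * c for c in x)) - 1.0


def grad_x(x, s, h=1e-5):
    """dy_dx: gradient of the SDF w.r.t. the input, by central differences."""
    g = []
    for i in range(len(x)):
        xp = list(x)
        xm = list(x)
        xp[i] += h
        xm[i] -= h
        g.append((sdf(xp, s) - sdf(xm, s)) / (2.0 * h))
    return g


def eikonal_loss(points, s):
    """Mean of (|dy_dx| - 1)^2 over the sample points."""
    total = 0.0
    for x in points:
        norm = math.sqrt(sum(c * c for c in grad_x(x, s)))
        total += (norm - 1.0) ** 2
    return total / len(points)


def eikonal_loss_grad_s(points, s, h=1e-4):
    """d(loss)/d(param): differentiates through dy_dx, so it depends on d(dy_dx)/dparam."""
    return (eikonal_loss(points, s + h) - eikonal_loss(points, s - h)) / (2.0 * h)


# Arbitrary sample points away from the origin (where |x| is not differentiable).
points = [(0.5, 0.2, -0.1), (1.5, -0.3, 0.7), (-0.4, 0.9, 0.2)]
s = 1.3
loss = eikonal_loss(points, s)
dloss_ds = eikonal_loss_grad_s(points, s)
print(loss, dloss_ds)
```

For this toy model the analytic values are (1.3 - 1)^2 = 0.09 for the loss and 2 * (1.3 - 1) = 0.6 for its derivative with respect to s, so the printed numbers should be close to those. The derivative is non-zero only because the gradient norm itself depends on the parameter.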

    Currently I'm writing custom backward_backward functions on top of tcnn's grid.h and fully_fused_mlp.cu, but it would be really nice if this could be officially supported. :smile:

    BR, Ventus


    🎉🎉🎉 UPDATE: to all people who reach here

    For now, partial support for double backward is implemented in the tiny-cuda-nn repo, for grid encodings only.

    An example usage script can be found here.

    For implementation details, please check the original PR #69.

    opened by ventusff 8
  • Do you plan to have a python wrapper for the fully fused MLP?

    Do you plan to have a python wrapper for the fully fused MLP?

    Hi, I am not an expert in CUDA programming and have more experience with PyTorch/TensorFlow. Do you have any plans to provide a Python (more specifically PyTorch) wrapper for this code? Alternatively, could you point to the location of the forward/backward functions of this MLP implementation so that we can potentially incorporate them into other Python code?

    Thanks a lot

    opened by MultiPath 8
  • Error while installing python extension

    Error while installing python extension

    Environment: Windows 10, Visual Studio 2019 version 16.11.17, Anaconda 3, Python 3.9.12, CUDA 11.6, torch 1.12.0+cu116, cmake 3.22.0-rc2, GPU: RTX 6000.

    tiny-cuda-nn itself compiled normally, but the PyTorch extension failed to build.

    When running python setup.py install, an error was reported in format.h:

    Building PyTorch extension for tiny-cuda-nn version 1.6
    Targeting compute capability 75
    running install
    running bdist_egg
    running egg_info
    writing tinycudann.egg-info\PKG-INFO
    writing dependency_links to tinycudann.egg-info\dependency_links.txt
    writing top-level names to tinycudann.egg-info\top_level.txt
    reading manifest file 'tinycudann.egg-info\SOURCES.txt'
    writing manifest file 'tinycudann.egg-info\SOURCES.txt'
    installing library code to build\bdist.win-amd64\egg
    running install_lib
    running build_py
    running build_ext
    building 'tinycudann_bindings._C' extension
    Emitting ninja build file E:\Downloads\tiny-cuda-nn\bindings\torch\build\temp.win-amd64-3.9\Release\build.ninja...
    Compiling objects...
    Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
    [1/7] C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.6\bin\nvcc --generate-dependencies-with-compile --dependency-output E:\Downloads\tiny-cuda-nn\bindings\torch\build\src/common.obj.d --use-local-env -Xcompiler /MD -Xcompiler /wd4819 -Xcompiler /wd4251 -Xcompiler /wd4244 -Xcompiler /wd4267 -Xcompiler /wd4275 -Xcompiler /wd4018 -Xcompiler /wd4190 -Xcompiler /EHsc -Xcudafe --diag_suppress=base_class_has_different_dll_interface -Xcudafe --diag_suppress=field_without_dll_interface -Xcudafe --diag_suppress=dll_interface_conflict_none_assumed -Xcudafe --diag_suppress=dll_interface_conflict_dllexport_assumed -IE:\Downloads\tiny-cuda-nn/include -IE:\Downloads\tiny-cuda-nn/dependencies -IE:\Downloads\tiny-cuda-nn/dependencies/cutlass/include -IE:\Downloads\tiny-cuda-nn/dependencies/cutlass/tools/util/include -IE:\Downloads\tiny-cuda-nn/dependencies/fmt/include -IE:\Anaconda3\envs\dmodel\lib\site-packages\torch\include -IE:\Anaconda3\envs\dmodel\lib\site-packages\torch\include\torch\csrc\api\include -IE:\Anaconda3\envs\dmodel\lib\site-packages\torch\include\TH -IE:\Anaconda3\envs\dmodel\lib\site-packages\torch\include\THC "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.6\include" -IE:\Anaconda3\envs\dmodel\include -IE:\Anaconda3\envs\dmodel\Include "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\include" "-IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\include\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\cppwinrt" -c E:\Downloads\tiny-cuda-nn\src\common.cu -o E:\Downloads\tiny-cuda-nn\bindings\torch\build\src/common.obj -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ 
    -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -std=c++14 --extended-lambda --expt-relaxed-constexpr -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -gencode=arch=compute_75,code=compute_75 -gencode=arch=compute_75,code=sm_75 -DTCNN_MIN_GPU_ARCH=75 -DFMT_HEADER_ONLY=1 -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0
    FAILED: E:/Downloads/tiny-cuda-nn/bindings/torch/build/src/common.obj 
    C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.6\bin\nvcc --generate-dependencies-with-compile --dependency-output E:\Downloads\tiny-cuda-nn\bindings\torch\build\src/common.obj.d --use-local-env -Xcompiler /MD -Xcompiler /wd4819 -Xcompiler /wd4251 -Xcompiler /wd4244 -Xcompiler /wd4267 -Xcompiler /wd4275 -Xcompiler /wd4018 -Xcompiler /wd4190 -Xcompiler /EHsc -Xcudafe --diag_suppress=base_class_has_different_dll_interface -Xcudafe --diag_suppress=field_without_dll_interface -Xcudafe --diag_suppress=dll_interface_conflict_none_assumed -Xcudafe --diag_suppress=dll_interface_conflict_dllexport_assumed -IE:\Downloads\tiny-cuda-nn/include -IE:\Downloads\tiny-cuda-nn/dependencies -IE:\Downloads\tiny-cuda-nn/dependencies/cutlass/include -IE:\Downloads\tiny-cuda-nn/dependencies/cutlass/tools/util/include -IE:\Downloads\tiny-cuda-nn/dependencies/fmt/include -IE:\Anaconda3\envs\dmodel\lib\site-packages\torch\include -IE:\Anaconda3\envs\dmodel\lib\site-packages\torch\include\torch\csrc\api\include -IE:\Anaconda3\envs\dmodel\lib\site-packages\torch\include\TH -IE:\Anaconda3\envs\dmodel\lib\site-packages\torch\include\THC "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.6\include" -IE:\Anaconda3\envs\dmodel\include -IE:\Anaconda3\envs\dmodel\Include "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\include" "-IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\include\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\cppwinrt" -c E:\Downloads\tiny-cuda-nn\src\common.cu -o E:\Downloads\tiny-cuda-nn\bindings\torch\build\src/common.obj -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ 
    -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -std=c++14 --extended-lambda --expt-relaxed-constexpr -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -gencode=arch=compute_75,code=compute_75 -gencode=arch=compute_75,code=sm_75 -DTCNN_MIN_GPU_ARCH=75 -DFMT_HEADER_ONLY=1 -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0
    cl : Command line warning D9025 : overriding '/D__CUDA_NO_HALF_OPERATORS__' with '/U__CUDA_NO_HALF_OPERATORS__'
    cl : Command line warning D9025 : overriding '/D__CUDA_NO_HALF_CONVERSIONS__' with '/U__CUDA_NO_HALF_CONVERSIONS__'
    cl : Command line warning D9025 : overriding '/D__CUDA_NO_HALF2_OPERATORS__' with '/U__CUDA_NO_HALF2_OPERATORS__'
    common.cu
    cl : Command line warning D9025 : overriding '/D__CUDA_NO_HALF_OPERATORS__' with '/U__CUDA_NO_HALF_OPERATORS__'
    cl : Command line warning D9025 : overriding '/D__CUDA_NO_HALF_CONVERSIONS__' with '/U__CUDA_NO_HALF_CONVERSIONS__'
    cl : Command line warning D9025 : overriding '/D__CUDA_NO_HALF2_OPERATORS__' with '/U__CUDA_NO_HALF2_OPERATORS__'
    common.cu
    E:\Downloads\tiny-cuda-nn\dependencies\fmt\include\fmt\format.h(2478): error: too many recursive substitutions of function template signatures
              detected during:
                processing of template argument list for "fmt::v9::detail::has_isfinite" 
    (3177): here
                instantiation of "fmt::v9::detail::isfinite" 
    (3177): here
                processing of template argument list for "fmt::v9::detail::has_isfinite" 
    (3177): here
                instantiation of "fmt::v9::detail::isfinite" 
    (3177): here
                processing of template argument list for "fmt::v9::detail::has_isfinite" 
    (3177): here
                [ 397 instantiation contexts not shown ]
                instantiation of "auto fmt::v9::detail::write(OutputIt, T, fmt::v9::basic_format_specs<Char>, fmt::v9::detail::locale_ref)->OutputIt [with Char=char, OutputIt=fmt::v9::appender, T=float, <unnamed>=0]" 
    (3217): here
                instantiation of "auto fmt::v9::detail::write<Char,OutputIt,T,<unnamed>>(OutputIt, T)->OutputIt [with Char=char, OutputIt=fmt::v9::appender, T=float, <unnamed>=0]" 
    (3351): here
                instantiation of "auto fmt::v9::detail::default_arg_formatter<Char>::operator()(T)->fmt::v9::detail::default_arg_formatter<Char>::iterator [with Char=char, T=float]" 
    E:/Downloads/tiny-cuda-nn/dependencies/fmt/include\fmt/core.h(1644): here
                instantiation of "auto fmt::v9::visit_format_arg(Visitor &&, const fmt::v9::basic_format_arg<Context> &)->decltype((<expression>)) [with Visitor=fmt::v9::detail::default_arg_formatter<char>, Context=fmt::v9::format_context]" 
    (4055): here
                instantiation of "void fmt::v9::detail::vformat_to(fmt::v9::detail::buffer<Char> &, fmt::v9::basic_string_view<Char>, fmt::v9::basic_format_args<fmt::v9::basic_format_context<fmt::v9::detail::buffer_appender<fmt::v9::type_identity_t<Char>>, fmt::v9::type_identity_t<Char>>>, fmt::v9::detail::locale_ref) [with Char=char]" 
    E:\Downloads\tiny-cuda-nn\dependencies\fmt\include\fmt\format-inl.h(1472): here
    
    E:\Downloads\tiny-cuda-nn\dependencies\fmt\include\fmt\format.h(2475): error: duplicate base class name
              detected during:
                instantiation of class "fmt::v9::detail::has_isfinite<T, Enable> [with T=float, Enable=void]" 
    (3177): here
                instantiation of "fmt::v9::detail::isfinite" 
    (3177): here
                processing of template argument list for "fmt::v9::detail::has_isfinite" 
    (3177): here
                instantiation of "fmt::v9::detail::isfinite" 
    (3177): here
                processing of template argument list for "fmt::v9::detail::has_isfinite" 
    (3177): here
                [ 395 instantiation contexts not shown ]
                instantiation of "auto fmt::v9::detail::write(OutputIt, T, fmt::v9::basic_format_specs<Char>, fmt::v9::detail::locale_ref)->OutputIt [with Char=char, OutputIt=fmt::v9::appender, T=float, <unnamed>=0]" 
    (3217): here
                instantiation of "auto fmt::v9::detail::write<Char,OutputIt,T,<unnamed>>(OutputIt, T)->OutputIt [with Char=char, OutputIt=fmt::v9::appender, T=float, <unnamed>=0]" 
    (3351): here
                instantiation of "auto fmt::v9::detail::default_arg_formatter<Char>::operator()(T)->fmt::v9::detail::default_arg_formatter<Char>::iterator [with Char=char, T=float]" 
    E:/Downloads/tiny-cuda-nn/dependencies/fmt/include\fmt/core.h(1644): here
                instantiation of "auto fmt::v9::visit_format_arg(Visitor &&, const fmt::v9::basic_format_arg<Context> &)->decltype((<expression>)) [with Visitor=fmt::v9::detail::default_arg_formatter<char>, Context=fmt::v9::format_context]" 
    (4055): here
                instantiation of "void fmt::v9::detail::vformat_to(fmt::v9::detail::buffer<Char> &, fmt::v9::basic_string_view<Char>, fmt::v9::basic_format_args<fmt::v9::basic_format_context<fmt::v9::detail::buffer_appender<fmt::v9::type_identity_t<Char>>, fmt::v9::type_identity_t<Char>>>, fmt::v9::detail::locale_ref) [with Char=char]" 
    E:\Downloads\tiny-cuda-nn\dependencies\fmt\include\fmt\format-inl.h(1472): here
    
    E:\Downloads\tiny-cuda-nn\dependencies\fmt\include\fmt\format.h(2475): error: duplicate base class name
              detected during:
                instantiation of class "fmt::v9::detail::has_isfinite<T, Enable> [with T=float, Enable=void]" 
    (3177): here
                instantiation of "fmt::v9::detail::isfinite" 
    (3177): here
                processing of template argument list for "fmt::v9::detail::has_isfinite" 
    (3177): here
                instantiation of "fmt::v9::detail::isfinite" 
    (3177): here
                processing of template argument list for "fmt::v9::detail::has_isfinite" 
    (3177): here
                [ 393 instantiation contexts not shown ]
                instantiation of "auto fmt::v9::detail::write(OutputIt, T, fmt::v9::basic_format_specs<Char>, fmt::v9::detail::locale_ref)->OutputIt [with Char=char, OutputIt=fmt::v9::appender, T=float, <unnamed>=0]" 
    (3217): here
                instantiation of "auto fmt::v9::detail::write<Char,OutputIt,T,<unnamed>>(OutputIt, T)->OutputIt [with Char=char, OutputIt=fmt::v9::appender, T=float, <unnamed>=0]" 
    (3351): here
                instantiation of "auto fmt::v9::detail::default_arg_formatter<Char>::operator()(T)->fmt::v9::detail::default_arg_formatter<Char>::iterator [with Char=char, T=float]" 
    E:/Downloads/tiny-cuda-nn/dependencies/fmt/include\fmt/core.h(1644): here
                instantiation of "auto fmt::v9::visit_format_arg(Visitor &&, const fmt::v9::basic_format_arg<Context> &)->decltype((<expression>)) [with Visitor=fmt::v9::detail::default_arg_formatter<char>, Context=fmt::v9::format_context]" 
    (4055): here
                instantiation of "void fmt::v9::detail::vformat_to(fmt::v9::detail::buffer<Char> &, fmt::v9::basic_string_view<Char>, fmt::v9::basic_format_args<fmt::v9::basic_format_context<fmt::v9::detail::buffer_appender<fmt::v9::type_identity_t<Char>>, fmt::v9::type_identity_t<Char>>>, fmt::v9::detail::locale_ref) [with Char=char]" 
    E:\Downloads\tiny-cuda-nn\dependencies\fmt\include\fmt\format-inl.h(1472): here
    ...
    More errors
    ...
    
    Error limit reached.
    100 errors detected in the compilation of "E:/Downloads/tiny-cuda-nn/src/fully_fused_mlp.cu".
    Compilation terminated.
    fully_fused_mlp.cu
    ninja: build stopped: subcommand failed.
    
    opened by bacTlink 7
  • Loading weights into TinyCUDA

    Loading weights into TinyCUDA

    Hi! I'm very excited by TinyCUDA and I'd like to test it out for an inference task on a pre-trained model. I have the network weights as a .npy file and I'd ideally like to load them into the fully fused MLP. From a quick scan of the codebase it looks like there isn't any way to load pre-computed model weights (please correct me if I'm wrong). Do you have any advice on how I could go about accomplishing this?

    opened by ZakSingh 7
  • Python Extension Installation Error on Docker

    Python Extension Installation Error on Docker

    Hello there!

    I am trying to install the library on a host machine running Ubuntu 22.04 with an RTX 3090 Ti and CUDA 11.7.0, using the following Dockerfile:

    FROM nvidia/cuda:11.7.0-cudnn8-devel-ubuntu22.04
    
    ENV DEBIAN_FRONTEND noninteractive
    
    # Basic setup
    ENV CMAKE_VERSION=3.23.4
    ENV PYTHON_VERSION=3.9
    
    ENV PATH="/usr/bin/cmake/bin:${PATH}"
    RUN apt-get update && apt-get install -y --no-install-recommends \
            build-essential git curl ca-certificates \
            wget vim pkg-config unzip rsync \
            ninja-build x11-apps \
        && wget https://github.com/Kitware/CMake/releases/download/v${CMAKE_VERSION}/cmake-${CMAKE_VERSION}-Linux-x86_64.sh \
            -q -O /tmp/cmake-install.sh \
        && chmod u+x /tmp/cmake-install.sh \
        && mkdir /usr/bin/cmake \
        && /tmp/cmake-install.sh --skip-license --prefix=/usr/bin/cmake \
        && rm /tmp/cmake-install.sh \
        && apt-get install sudo \
        && rm -rf /var/lib/apt/lists/*
    
    # Set working directory
    WORKDIR /opt
    ENV LC_ALL=C.UTF-8
    ENV LANG=C.UTF-8
    
    # Install Miniconda with given python version
    RUN curl -o ~/miniconda.sh https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh \
         && chmod +x ~/miniconda.sh \
         && ~/miniconda.sh -b -p /opt/conda \
         && rm ~/miniconda.sh \
         && /opt/conda/bin/conda install -y python=${PYTHON_VERSION} \
         && /opt/conda/bin/conda clean -ya
    ENV PATH=/opt/conda/bin:$PATH
    ENV PATH=/root/.local/bin:$PATH 
    
    # Install required libraries
    RUN python -m pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu116 \
        && conda install -y tensorboard matplotlib scikit-image scipy jupyter  \
            ninja cython typing future pytest black isort flake8 scikit-learn \
        && /opt/conda/bin/python -m pip install -U wandb python-dotenv pre-commit nbstripout \
            hydra-core hydra-colorlog hydra-optuna-sweeper rich pytorch-lightning torchmetrics
    
    
    ENV CUDA_HOME=/usr/local/cuda
    ENV LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64
    ENV PATH=$PATH:$CUDA_HOME/bin
    
    RUN git clone --recurse-submodules -j8 https://github.com/NVlabs/tiny-cuda-nn.git \
            && cd /opt/tiny-cuda-nn/bindings/torch \
            && pip install -e .
    
    WORKDIR /
    

    I am getting the following error:

      Running setup.py develop for tinycudann
        ERROR: Command errored out with exit status 1:
         command: /opt/conda/bin/python -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/opt/tiny-cuda-nn/bindings/torch/setup.py'"'"'; __file__='"'"'/opt/tiny-cuda-nn/bindings/torch/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(__file__) if os.path.exists(__file__) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' develop --no-deps
             cwd: /opt/tiny-cuda-nn/bindings/torch/
        Complete output (88 lines):
        Building PyTorch extension for tiny-cuda-nn version 1.6
        Obtained compute capability 86 from PyTorch
        running develop
        /opt/conda/lib/python3.9/site-packages/setuptools/command/easy_install.py:144: EasyInstallDeprecationWarning: easy_install command is deprecated. Use build and pip and other standards-based tools.
          warnings.warn(
        /opt/conda/lib/python3.9/site-packages/setuptools/command/install.py:34: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.
          warnings.warn(
        running egg_info
        creating tinycudann.egg-info
        writing tinycudann.egg-info/PKG-INFO
        writing dependency_links to tinycudann.egg-info/dependency_links.txt
        writing top-level names to tinycudann.egg-info/top_level.txt
        writing manifest file 'tinycudann.egg-info/SOURCES.txt'
        reading manifest file 'tinycudann.egg-info/SOURCES.txt'
        writing manifest file 'tinycudann.egg-info/SOURCES.txt'
        running build_ext
        /opt/conda/lib/python3.9/site-packages/torch/utils/cpp_extension.py:387: UserWarning: The detected CUDA version (11.7) has a minor version mismatch with the version that was used to compile PyTorch (11.6). Most likely this shouldn't be a problem.
          warnings.warn(CUDA_MISMATCH_WARN.format(cuda_str_version, torch.version.cuda))
        building 'tinycudann_bindings_86._C' extension
        creating /opt/tiny-cuda-nn/bindings/torch/dependencies
        creating /opt/tiny-cuda-nn/bindings/torch/dependencies/fmt
        creating /opt/tiny-cuda-nn/bindings/torch/dependencies/fmt/src
        creating /opt/tiny-cuda-nn/bindings/torch/src
        creating /opt/tiny-cuda-nn/bindings/torch/build
        creating /opt/tiny-cuda-nn/bindings/torch/build/temp.linux-x86_64-3.9
        creating /opt/tiny-cuda-nn/bindings/torch/build/temp.linux-x86_64-3.9/tinycudann
        Emitting ninja build file /opt/tiny-cuda-nn/bindings/torch/build/temp.linux-x86_64-3.9/build.ninja...
        Compiling objects...
        Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
        [1/10] c++ -MMD -MF /opt/tiny-cuda-nn/bindings/torch/dependencies/fmt/src/os.o.d -pthread -B /opt/conda/compiler_compat -Wno-unused-result -Wsign-compare -DNDEBUG -O2 -Wall -fPIC -O2 -isystem /opt/conda/include -I/opt/conda/include -fPIC -O2 -isystem /opt/conda/include -fPIC -I/opt/tiny-cuda-nn/include -I/opt/tiny-cuda-nn/dependencies -I/opt/tiny-cuda-nn/dependencies/cutlass/include -I/opt/tiny-cuda-nn/dependencies/cutlass/tools/util/include -I/opt/tiny-cuda-nn/dependencies/fmt/include -I/opt/conda/lib/python3.9/site-packages/torch/include -I/opt/conda/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/lib/python3.9/site-packages/torch/include/TH -I/opt/conda/lib/python3.9/site-packages/torch/include/THC -I/usr/local/cuda/include -I/opt/conda/include/python3.9 -c -c /opt/tiny-cuda-nn/dependencies/fmt/src/os.cc -o /opt/tiny-cuda-nn/bindings/torch/dependencies/fmt/src/os.o -std=c++14 -DTCNN_MIN_GPU_ARCH=86 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0
        [2/10] c++ -MMD -MF /opt/tiny-cuda-nn/bindings/torch/dependencies/fmt/src/format.o.d -pthread -B /opt/conda/compiler_compat -Wno-unused-result -Wsign-compare -DNDEBUG -O2 -Wall -fPIC -O2 -isystem /opt/conda/include -I/opt/conda/include -fPIC -O2 -isystem /opt/conda/include -fPIC -I/opt/tiny-cuda-nn/include -I/opt/tiny-cuda-nn/dependencies -I/opt/tiny-cuda-nn/dependencies/cutlass/include -I/opt/tiny-cuda-nn/dependencies/cutlass/tools/util/include -I/opt/tiny-cuda-nn/dependencies/fmt/include -I/opt/conda/lib/python3.9/site-packages/torch/include -I/opt/conda/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/lib/python3.9/site-packages/torch/include/TH -I/opt/conda/lib/python3.9/site-packages/torch/include/THC -I/usr/local/cuda/include -I/opt/conda/include/python3.9 -c -c /opt/tiny-cuda-nn/dependencies/fmt/src/format.cc -o /opt/tiny-cuda-nn/bindings/torch/dependencies/fmt/src/format.o -std=c++14 -DTCNN_MIN_GPU_ARCH=86 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0
        [3/10] /usr/local/cuda/bin/nvcc  -I/opt/tiny-cuda-nn/include -I/opt/tiny-cuda-nn/dependencies -I/opt/tiny-cuda-nn/dependencies/cutlass/include -I/opt/tiny-cuda-nn/dependencies/cutlass/tools/util/include -I/opt/tiny-cuda-nn/dependencies/fmt/include -I/opt/conda/lib/python3.9/site-packages/torch/include -I/opt/conda/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/lib/python3.9/site-packages/torch/include/TH -I/opt/conda/lib/python3.9/site-packages/torch/include/THC -I/usr/local/cuda/include -I/opt/conda/include/python3.9 -c -c /opt/tiny-cuda-nn/src/common_device.cu -o /opt/tiny-cuda-nn/bindings/torch/src/common_device.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -std=c++14 --extended-lambda --expt-relaxed-constexpr -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 -Xcompiler=-mf16c -Xcompiler=-Wno-float-conversion -Xcompiler=-fno-strict-aliasing -DTCNN_MIN_GPU_ARCH=86 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0
        /opt/tiny-cuda-nn/dependencies/fmt/include/fmt/core.h(286): warning #1675-D: unrecognized GCC pragma
        
        /opt/tiny-cuda-nn/dependencies/fmt/include/fmt/core.h(286): warning #1675-D: unrecognized GCC pragma
        
        [4/10] /usr/local/cuda/bin/nvcc  -I/opt/tiny-cuda-nn/include -I/opt/tiny-cuda-nn/dependencies -I/opt/tiny-cuda-nn/dependencies/cutlass/include -I/opt/tiny-cuda-nn/dependencies/cutlass/tools/util/include -I/opt/tiny-cuda-nn/dependencies/fmt/include -I/opt/conda/lib/python3.9/site-packages/torch/include -I/opt/conda/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/lib/python3.9/site-packages/torch/include/TH -I/opt/conda/lib/python3.9/site-packages/torch/include/THC -I/usr/local/cuda/include -I/opt/conda/include/python3.9 -c -c /opt/tiny-cuda-nn/src/common.cu -o /opt/tiny-cuda-nn/bindings/torch/src/common.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -std=c++14 --extended-lambda --expt-relaxed-constexpr -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 -Xcompiler=-mf16c -Xcompiler=-Wno-float-conversion -Xcompiler=-fno-strict-aliasing -DTCNN_MIN_GPU_ARCH=86 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0
        /opt/tiny-cuda-nn/dependencies/fmt/include/fmt/core.h(286): warning #1675-D: unrecognized GCC pragma
        
        /opt/tiny-cuda-nn/dependencies/fmt/include/fmt/core.h(286): warning #1675-D: unrecognized GCC pragma
        
        [5/10] /usr/local/cuda/bin/nvcc  -I/opt/tiny-cuda-nn/include -I/opt/tiny-cuda-nn/dependencies -I/opt/tiny-cuda-nn/dependencies/cutlass/include -I/opt/tiny-cuda-nn/dependencies/cutlass/tools/util/include -I/opt/tiny-cuda-nn/dependencies/fmt/include -I/opt/conda/lib/python3.9/site-packages/torch/include -I/opt/conda/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/lib/python3.9/site-packages/torch/include/TH -I/opt/conda/lib/python3.9/site-packages/torch/include/THC -I/usr/local/cuda/include -I/opt/conda/include/python3.9 -c -c /opt/tiny-cuda-nn/src/network.cu -o /opt/tiny-cuda-nn/bindings/torch/src/network.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -std=c++14 --extended-lambda --expt-relaxed-constexpr -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 -Xcompiler=-mf16c -Xcompiler=-Wno-float-conversion -Xcompiler=-fno-strict-aliasing -DTCNN_MIN_GPU_ARCH=86 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0
        /opt/tiny-cuda-nn/dependencies/fmt/include/fmt/core.h(286): warning #1675-D: unrecognized GCC pragma
        
        /opt/tiny-cuda-nn/dependencies/fmt/include/fmt/core.h(286): warning #1675-D: unrecognized GCC pragma
        
        [6/10] /usr/local/cuda/bin/nvcc  -I/opt/tiny-cuda-nn/include -I/opt/tiny-cuda-nn/dependencies -I/opt/tiny-cuda-nn/dependencies/cutlass/include -I/opt/tiny-cuda-nn/dependencies/cutlass/tools/util/include -I/opt/tiny-cuda-nn/dependencies/fmt/include -I/opt/conda/lib/python3.9/site-packages/torch/include -I/opt/conda/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/lib/python3.9/site-packages/torch/include/TH -I/opt/conda/lib/python3.9/site-packages/torch/include/THC -I/usr/local/cuda/include -I/opt/conda/include/python3.9 -c -c /opt/tiny-cuda-nn/src/cpp_api.cu -o /opt/tiny-cuda-nn/bindings/torch/src/cpp_api.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -std=c++14 --extended-lambda --expt-relaxed-constexpr -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 -Xcompiler=-mf16c -Xcompiler=-Wno-float-conversion -Xcompiler=-fno-strict-aliasing -DTCNN_MIN_GPU_ARCH=86 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0
        /opt/tiny-cuda-nn/dependencies/fmt/include/fmt/core.h(286): warning #1675-D: unrecognized GCC pragma
        
        /opt/tiny-cuda-nn/dependencies/fmt/include/fmt/core.h(286): warning #1675-D: unrecognized GCC pragma
        
        [7/10] c++ -MMD -MF /opt/tiny-cuda-nn/bindings/torch/build/temp.linux-x86_64-3.9/tinycudann/bindings.o.d -pthread -B /opt/conda/compiler_compat -Wno-unused-result -Wsign-compare -DNDEBUG -O2 -Wall -fPIC -O2 -isystem /opt/conda/include -I/opt/conda/include -fPIC -O2 -isystem /opt/conda/include -fPIC -I/opt/tiny-cuda-nn/include -I/opt/tiny-cuda-nn/dependencies -I/opt/tiny-cuda-nn/dependencies/cutlass/include -I/opt/tiny-cuda-nn/dependencies/cutlass/tools/util/include -I/opt/tiny-cuda-nn/dependencies/fmt/include -I/opt/conda/lib/python3.9/site-packages/torch/include -I/opt/conda/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/lib/python3.9/site-packages/torch/include/TH -I/opt/conda/lib/python3.9/site-packages/torch/include/THC -I/usr/local/cuda/include -I/opt/conda/include/python3.9 -c -c /opt/tiny-cuda-nn/bindings/torch/tinycudann/bindings.cpp -o /opt/tiny-cuda-nn/bindings/torch/build/temp.linux-x86_64-3.9/tinycudann/bindings.o -std=c++14 -DTCNN_MIN_GPU_ARCH=86 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0
        In file included from /opt/conda/lib/python3.9/site-packages/torch/include/torch/csrc/Exceptions.h:13,
                         from /opt/conda/lib/python3.9/site-packages/torch/include/torch/csrc/api/include/torch/python.h:11,
                         from /opt/conda/lib/python3.9/site-packages/torch/include/torch/extension.h:6,
                         from /opt/tiny-cuda-nn/bindings/torch/tinycudann/bindings.cpp:34:
        /opt/conda/lib/python3.9/site-packages/torch/include/pybind11/pybind11.h: In instantiation of ‘class pybind11::class_<tcnn::cpp::EPrecision>’:
        /opt/conda/lib/python3.9/site-packages/torch/include/pybind11/pybind11.h:2134:7:   required from ‘class pybind11::enum_<tcnn::cpp::EPrecision>’
        /opt/tiny-cuda-nn/bindings/torch/tinycudann/bindings.cpp:304:49:   required from here
        /opt/conda/lib/python3.9/site-packages/torch/include/pybind11/pybind11.h:1479:7: warning: ‘pybind11::class_<tcnn::cpp::EPrecision>’ declared with greater visibility than its base ‘pybind11::detail::generic_type’ [-Wattributes]
         1479 | class class_ : public detail::generic_type {
              |       ^~~~~~
        /opt/conda/lib/python3.9/site-packages/torch/include/pybind11/pybind11.h: In instantiation of ‘class pybind11::class_<tcnn::cpp::Context>’:
        /opt/tiny-cuda-nn/bindings/torch/tinycudann/bindings.cpp:317:45:   required from here
        /opt/conda/lib/python3.9/site-packages/torch/include/pybind11/pybind11.h:1479:7: warning: ‘pybind11::class_<tcnn::cpp::Context>’ declared with greater visibility than its base ‘pybind11::detail::generic_type’ [-Wattributes]
        /opt/conda/lib/python3.9/site-packages/torch/include/pybind11/pybind11.h: In instantiation of ‘class pybind11::class_<Module>’:
        /opt/tiny-cuda-nn/bindings/torch/tinycudann/bindings.cpp:324:32:   required from here
        /opt/conda/lib/python3.9/site-packages/torch/include/pybind11/pybind11.h:1479:7: warning: ‘pybind11::class_<Module>’ declared with greater visibility than its base ‘pybind11::detail::generic_type’ [-Wattributes]
        [8/10] /usr/local/cuda/bin/nvcc  -I/opt/tiny-cuda-nn/include -I/opt/tiny-cuda-nn/dependencies -I/opt/tiny-cuda-nn/dependencies/cutlass/include -I/opt/tiny-cuda-nn/dependencies/cutlass/tools/util/include -I/opt/tiny-cuda-nn/dependencies/fmt/include -I/opt/conda/lib/python3.9/site-packages/torch/include -I/opt/conda/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/lib/python3.9/site-packages/torch/include/TH -I/opt/conda/lib/python3.9/site-packages/torch/include/THC -I/usr/local/cuda/include -I/opt/conda/include/python3.9 -c -c /opt/tiny-cuda-nn/src/cutlass_mlp.cu -o /opt/tiny-cuda-nn/bindings/torch/src/cutlass_mlp.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -std=c++14 --extended-lambda --expt-relaxed-constexpr -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 -Xcompiler=-mf16c -Xcompiler=-Wno-float-conversion -Xcompiler=-fno-strict-aliasing -DTCNN_MIN_GPU_ARCH=86 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0
        /opt/tiny-cuda-nn/dependencies/fmt/include/fmt/core.h(286): warning #1675-D: unrecognized GCC pragma
        
        /opt/tiny-cuda-nn/dependencies/fmt/include/fmt/core.h(286): warning #1675-D: unrecognized GCC pragma
        
        [9/10] /usr/local/cuda/bin/nvcc  -I/opt/tiny-cuda-nn/include -I/opt/tiny-cuda-nn/dependencies -I/opt/tiny-cuda-nn/dependencies/cutlass/include -I/opt/tiny-cuda-nn/dependencies/cutlass/tools/util/include -I/opt/tiny-cuda-nn/dependencies/fmt/include -I/opt/conda/lib/python3.9/site-packages/torch/include -I/opt/conda/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/lib/python3.9/site-packages/torch/include/TH -I/opt/conda/lib/python3.9/site-packages/torch/include/THC -I/usr/local/cuda/include -I/opt/conda/include/python3.9 -c -c /opt/tiny-cuda-nn/src/encoding.cu -o /opt/tiny-cuda-nn/bindings/torch/src/encoding.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -std=c++14 --extended-lambda --expt-relaxed-constexpr -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 -Xcompiler=-mf16c -Xcompiler=-Wno-float-conversion -Xcompiler=-fno-strict-aliasing -DTCNN_MIN_GPU_ARCH=86 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0
        /opt/tiny-cuda-nn/dependencies/fmt/include/fmt/core.h(286): warning #1675-D: unrecognized GCC pragma
        
        /opt/tiny-cuda-nn/dependencies/fmt/include/fmt/core.h(286): warning #1675-D: unrecognized GCC pragma
        
        [10/10] /usr/local/cuda/bin/nvcc  -I/opt/tiny-cuda-nn/include -I/opt/tiny-cuda-nn/dependencies -I/opt/tiny-cuda-nn/dependencies/cutlass/include -I/opt/tiny-cuda-nn/dependencies/cutlass/tools/util/include -I/opt/tiny-cuda-nn/dependencies/fmt/include -I/opt/conda/lib/python3.9/site-packages/torch/include -I/opt/conda/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/lib/python3.9/site-packages/torch/include/TH -I/opt/conda/lib/python3.9/site-packages/torch/include/THC -I/usr/local/cuda/include -I/opt/conda/include/python3.9 -c -c /opt/tiny-cuda-nn/src/fully_fused_mlp.cu -o /opt/tiny-cuda-nn/bindings/torch/src/fully_fused_mlp.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -std=c++14 --extended-lambda --expt-relaxed-constexpr -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 -Xcompiler=-mf16c -Xcompiler=-Wno-float-conversion -Xcompiler=-fno-strict-aliasing -DTCNN_MIN_GPU_ARCH=86 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0
        /opt/tiny-cuda-nn/dependencies/fmt/include/fmt/core.h(286): warning #1675-D: unrecognized GCC pragma
        
        /opt/tiny-cuda-nn/dependencies/fmt/include/fmt/core.h(286): warning #1675-D: unrecognized GCC pragma
        
        creating build/lib.linux-x86_64-3.9
        creating build/lib.linux-x86_64-3.9/tinycudann_bindings_86
        g++ -pthread -B /opt/conda/compiler_compat -shared -Wl,-rpath,/opt/conda/lib -Wl,-rpath-link,/opt/conda/lib -L/opt/conda/lib -L/opt/conda/lib -Wl,-rpath,/opt/conda/lib -Wl,-rpath-link,/opt/conda/lib -L/opt/conda/lib /opt/tiny-cuda-nn/bindings/torch/build/temp.linux-x86_64-3.9/../../dependencies/fmt/src/format.o /opt/tiny-cuda-nn/bindings/torch/build/temp.linux-x86_64-3.9/../../dependencies/fmt/src/os.o /opt/tiny-cuda-nn/bindings/torch/build/temp.linux-x86_64-3.9/../../src/common.o /opt/tiny-cuda-nn/bindings/torch/build/temp.linux-x86_64-3.9/../../src/common_device.o /opt/tiny-cuda-nn/bindings/torch/build/temp.linux-x86_64-3.9/../../src/cpp_api.o /opt/tiny-cuda-nn/bindings/torch/build/temp.linux-x86_64-3.9/../../src/cutlass_mlp.o /opt/tiny-cuda-nn/bindings/torch/build/temp.linux-x86_64-3.9/../../src/encoding.o /opt/tiny-cuda-nn/bindings/torch/build/temp.linux-x86_64-3.9/../../src/fully_fused_mlp.o /opt/tiny-cuda-nn/bindings/torch/build/temp.linux-x86_64-3.9/../../src/network.o /opt/tiny-cuda-nn/bindings/torch/build/temp.linux-x86_64-3.9/tinycudann/bindings.o -L/opt/conda/lib/python3.9/site-packages/torch/lib -L/usr/local/cuda/lib64 -lcuda -lcudadevrt -lcudart_static -lc10 -ltorch -ltorch_cpu -ltorch_python -lcudart -lc10_cuda -ltorch_cuda_cu -ltorch_cuda_cpp -o build/lib.linux-x86_64-3.9/tinycudann_bindings_86/_C.cpython-39-x86_64-linux-gnu.so
        copying build/lib.linux-x86_64-3.9/tinycudann_bindings_86/_C.cpython-39-x86_64-linux-gnu.so -> tinycudann_bindings_86
        error: could not create 'tinycudann_bindings_86/_C.cpython-39-x86_64-linux-gnu.so': No such file or directory
        ----------------------------------------
    ERROR: Command errored out with exit status 1: /opt/conda/bin/python -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/opt/tiny-cuda-nn/bindings/torch/setup.py'"'"'; __file__='"'"'/opt/tiny-cuda-nn/bindings/torch/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(__file__) if os.path.exists(__file__) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' develop --no-deps Check the logs for full command output.
    

    I also tried different base images, such as CUDA 11.3 (changing the installed torch version to match), and got the following error:

    Installing collected packages: tinycudann
      Running setup.py develop for tinycudann
        error: subprocess-exited-with-error
    
        × python setup.py develop did not run successfully.
        │ exit code: 1
        ╰─> [124 lines of output]
            Building PyTorch extension for tiny-cuda-nn version 1.6
            Obtained compute capability 86 from PyTorch 
            running develop
            running egg_info
            writing tinycudann.egg-info/PKG-INFO
            writing dependency_links to tinycudann.egg-info/dependency_links.txt
            writing top-level names to tinycudann.egg-info/top_level.txt
            reading manifest file 'tinycudann.egg-info/SOURCES.txt'
            writing manifest file 'tinycudann.egg-info/SOURCES.txt'
            running build_ext
            building 'tinycudann_bindings_86._C' extension
            creating /opt/tiny-cuda-nn/bindings/torch/build/temp.linux-x86_64-cpython-38
            creating /opt/tiny-cuda-nn/bindings/torch/build/temp.linux-x86_64-cpython-38/tinycudann
            Emitting ninja build file /opt/tiny-cuda-nn/bindings/torch/build/temp.linux-x86_64-cpython-38/build.ninja...
            Compiling objects...
            Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
            /opt/conda/lib/python3.8/site-packages/setuptools/command/easy_install.py:144: EasyInstallDeprecationWarning: easy_install command is deprecated. Use build and pip and other standards-based tools.
              warnings.warn(
            /opt/conda/lib/python3.8/site-packages/setuptools/command/install.py:34: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.
              warnings.warn(
            [1/10] c++ -MMD -MF /opt/tiny-cuda-nn/bindings/torch/dependencies/fmt/src/os.o.d -pthread -B /opt/conda/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/opt/tiny-cuda-nn/include -I/opt/tiny-cuda-nn/dependencies -I/opt/tiny-cuda-nn/dependencies/cutlass/include -I/opt/tiny-cuda-nn/dependencies/cutlass/tools/util/include -I/opt/tiny-cuda-nn/dependencies/fmt/include -I/opt/conda/lib/python3.8/site-packages/torch/include -I/opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/lib/python3.8/site-packages/torch/include/TH -I/opt/conda/lib/python3.8/site-packages/torch/include/THC -I/usr/local/cuda/include -I/opt/conda/include/python3.8 -c -c /opt/tiny-cuda-nn/dependencies/fmt/src/os.cc -o /opt/tiny-cuda-nn/bindings/torch/dependencies/fmt/src/os.o -std=c++14 -DTCNN_MIN_GPU_ARCH=86 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0
            cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
            [2/10] c++ -MMD -MF /opt/tiny-cuda-nn/bindings/torch/dependencies/fmt/src/format.o.d -pthread -B /opt/conda/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/opt/tiny-cuda-nn/include -I/opt/tiny-cuda-nn/dependencies -I/opt/tiny-cuda-nn/dependencies/cutlass/include -I/opt/tiny-cuda-nn/dependencies/cutlass/tools/util/include -I/opt/tiny-cuda-nn/dependencies/fmt/include -I/opt/conda/lib/python3.8/site-packages/torch/include -I/opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/lib/python3.8/site-packages/torch/include/TH -I/opt/conda/lib/python3.8/site-packages/torch/include/THC -I/usr/local/cuda/include -I/opt/conda/include/python3.8 -c -c /opt/tiny-cuda-nn/dependencies/fmt/src/format.cc -o /opt/tiny-cuda-nn/bindings/torch/dependencies/fmt/src/format.o -std=c++14 -DTCNN_MIN_GPU_ARCH=86 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0
            cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
            [3/10] /usr/local/cuda/bin/nvcc  -I/opt/tiny-cuda-nn/include -I/opt/tiny-cuda-nn/dependencies -I/opt/tiny-cuda-nn/dependencies/cutlass/include -I/opt/tiny-cuda-nn/dependencies/cutlass/tools/util/include -I/opt/tiny-cuda-nn/dependencies/fmt/include -I/opt/conda/lib/python3.8/site-packages/torch/include -I/opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/lib/python3.8/site-packages/torch/include/TH -I/opt/conda/lib/python3.8/site-packages/torch/include/THC -I/usr/local/cuda/include -I/opt/conda/include/python3.8 -c -c /opt/tiny-cuda-nn/src/common_device.cu -o /opt/tiny-cuda-nn/bindings/torch/src/common_device.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -std=c++14 --extended-lambda --expt-relaxed-constexpr -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 -Xcompiler=-mf16c -Xcompiler=-Wno-float-conversion -Xcompiler=-fno-strict-aliasing -DTCNN_MIN_GPU_ARCH=86 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0
            /opt/tiny-cuda-nn/dependencies/fmt/include/fmt/core.h(286): warning: unrecognized GCC pragma
    
            /opt/tiny-cuda-nn/dependencies/fmt/include/fmt/core.h(286): warning: unrecognized GCC pragma
    
            [4/10] /usr/local/cuda/bin/nvcc  -I/opt/tiny-cuda-nn/include -I/opt/tiny-cuda-nn/dependencies -I/opt/tiny-cuda-nn/dependencies/cutlass/include -I/opt/tiny-cuda-nn/dependencies/cutlass/tools/util/include -I/opt/tiny-cuda-nn/dependencies/fmt/include -I/opt/conda/lib/python3.8/site-packages/torch/include -I/opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/lib/python3.8/site-packages/torch/include/TH -I/opt/conda/lib/python3.8/site-packages/torch/include/THC -I/usr/local/cuda/include -I/opt/conda/include/python3.8 -c -c /opt/tiny-cuda-nn/src/common.cu -o /opt/tiny-cuda-nn/bindings/torch/src/common.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -std=c++14 --extended-lambda --expt-relaxed-constexpr -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 -Xcompiler=-mf16c -Xcompiler=-Wno-float-conversion -Xcompiler=-fno-strict-aliasing -DTCNN_MIN_GPU_ARCH=86 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0
            /opt/tiny-cuda-nn/dependencies/fmt/include/fmt/core.h(286): warning: unrecognized GCC pragma
    
            /opt/tiny-cuda-nn/dependencies/fmt/include/fmt/core.h(286): warning: unrecognized GCC pragma
    
            [5/10] /usr/local/cuda/bin/nvcc  -I/opt/tiny-cuda-nn/include -I/opt/tiny-cuda-nn/dependencies -I/opt/tiny-cuda-nn/dependencies/cutlass/include -I/opt/tiny-cuda-nn/dependencies/cutlass/tools/util/include -I/opt/tiny-cuda-nn/dependencies/fmt/include -I/opt/conda/lib/python3.8/site-packages/torch/include -I/opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/lib/python3.8/site-packages/torch/include/TH -I/opt/conda/lib/python3.8/site-packages/torch/include/THC -I/usr/local/cuda/include -I/opt/conda/include/python3.8 -c -c /opt/tiny-cuda-nn/src/network.cu -o /opt/tiny-cuda-nn/bindings/torch/src/network.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -std=c++14 --extended-lambda --expt-relaxed-constexpr -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 -Xcompiler=-mf16c -Xcompiler=-Wno-float-conversion -Xcompiler=-fno-strict-aliasing -DTCNN_MIN_GPU_ARCH=86 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0
            /opt/tiny-cuda-nn/dependencies/fmt/include/fmt/core.h(286): warning: unrecognized GCC pragma
    
            /opt/tiny-cuda-nn/dependencies/fmt/include/fmt/core.h(286): warning: unrecognized GCC pragma
    
            [6/10] /usr/local/cuda/bin/nvcc  -I/opt/tiny-cuda-nn/include -I/opt/tiny-cuda-nn/dependencies -I/opt/tiny-cuda-nn/dependencies/cutlass/include -I/opt/tiny-cuda-nn/dependencies/cutlass/tools/util/include -I/opt/tiny-cuda-nn/dependencies/fmt/include -I/opt/conda/lib/python3.8/site-packages/torch/include -I/opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/lib/python3.8/site-packages/torch/include/TH -I/opt/conda/lib/python3.8/site-packages/torch/include/THC -I/usr/local/cuda/include -I/opt/conda/include/python3.8 -c -c /opt/tiny-cuda-nn/src/cpp_api.cu -o /opt/tiny-cuda-nn/bindings/torch/src/cpp_api.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -std=c++14 --extended-lambda --expt-relaxed-constexpr -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 -Xcompiler=-mf16c -Xcompiler=-Wno-float-conversion -Xcompiler=-fno-strict-aliasing -DTCNN_MIN_GPU_ARCH=86 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0
            /opt/tiny-cuda-nn/dependencies/fmt/include/fmt/core.h(286): warning: unrecognized GCC pragma
    
            /opt/tiny-cuda-nn/dependencies/fmt/include/fmt/core.h(286): warning: unrecognized GCC pragma
            [7/10] c++ -MMD -MF /opt/tiny-cuda-nn/bindings/torch/build/temp.linux-x86_64-cpython-38/tinycudann/bindings.o.d -pthread -B /opt/conda/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/opt/tiny-cuda-nn/include -I/opt/tiny-cuda-nn/dependencies -I/opt/tiny-cuda-nn/dependencies/cutlass/include -I/opt/tiny-cuda-nn/dependencies/cutlass/tools/util/include -I/opt/tiny-cuda-nn/dependencies/fmt/include -I/opt/conda/lib/python3.8/site-packages/torch/include -I/opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/lib/python3.8/site-packages/torch/include/TH -I/opt/conda/lib/python3.8/site-packages/torch/include/THC -I/usr/local/cuda/include -I/opt/conda/include/python3.8 -c -c /opt/tiny-cuda-nn/bindings/torch/tinycudann/bindings.cpp -o /opt/tiny-cuda-nn/bindings/torch/build/temp.linux-x86_64-cpython-38/tinycudann/bindings.o -std=c++14 -DTCNN_MIN_GPU_ARCH=86 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0
            cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
            [8/10] /usr/local/cuda/bin/nvcc  -I/opt/tiny-cuda-nn/include -I/opt/tiny-cuda-nn/dependencies -I/opt/tiny-cuda-nn/dependencies/cutlass/include -I/opt/tiny-cuda-nn/dependencies/cutlass/tools/util/include -I/opt/tiny-cuda-nn/dependencies/fmt/include -I/opt/conda/lib/python3.8/site-packages/torch/include -I/opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/lib/python3.8/site-packages/torch/include/TH -I/opt/conda/lib/python3.8/site-packages/torch/include/THC -I/usr/local/cuda/include -I/opt/conda/include/python3.8 -c -c /opt/tiny-cuda-nn/src/cutlass_mlp.cu -o /opt/tiny-cuda-nn/bindings/torch/src/cutlass_mlp.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -std=c++14 --extended-lambda --expt-relaxed-constexpr -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 -Xcompiler=-mf16c -Xcompiler=-Wno-float-conversion -Xcompiler=-fno-strict-aliasing -DTCNN_MIN_GPU_ARCH=86 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0
            /opt/tiny-cuda-nn/dependencies/fmt/include/fmt/core.h(286): warning: unrecognized GCC pragma
    
            /opt/tiny-cuda-nn/dependencies/fmt/include/fmt/core.h(286): warning: unrecognized GCC pragma
    
            [9/10] /usr/local/cuda/bin/nvcc  -I/opt/tiny-cuda-nn/include -I/opt/tiny-cuda-nn/dependencies -I/opt/tiny-cuda-nn/dependencies/cutlass/include -I/opt/tiny-cuda-nn/dependencies/cutlass/tools/util/include -I/opt/tiny-cuda-nn/dependencies/fmt/include -I/opt/conda/lib/python3.8/site-packages/torch/include -I/opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/lib/python3.8/site-packages/torch/include/TH -I/opt/conda/lib/python3.8/site-packages/torch/include/THC -I/usr/local/cuda/include -I/opt/conda/include/python3.8 -c -c /opt/tiny-cuda-nn/src/fully_fused_mlp.cu -o /opt/tiny-cuda-nn/bindings/torch/src/fully_fused_mlp.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -std=c++14 --extended-lambda --expt-relaxed-constexpr -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 -Xcompiler=-mf16c -Xcompiler=-Wno-float-conversion -Xcompiler=-fno-strict-aliasing -DTCNN_MIN_GPU_ARCH=86 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0
            /opt/tiny-cuda-nn/dependencies/fmt/include/fmt/core.h(286): warning: unrecognized GCC pragma
    
            /opt/tiny-cuda-nn/dependencies/fmt/include/fmt/core.h(286): warning: unrecognized GCC pragma
    
            [10/10] /usr/local/cuda/bin/nvcc  -I/opt/tiny-cuda-nn/include -I/opt/tiny-cuda-nn/dependencies -I/opt/tiny-cuda-nn/dependencies/cutlass/include -I/opt/tiny-cuda-nn/dependencies/cutlass/tools/util/include -I/opt/tiny-cuda-nn/dependencies/fmt/include -I/opt/conda/lib/python3.8/site-packages/torch/include -I/opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/lib/python3.8/site-packages/torch/include/TH -I/opt/conda/lib/python3.8/site-packages/torch/include/THC -I/usr/local/cuda/include -I/opt/conda/include/python3.8 -c -c /opt/tiny-cuda-nn/src/encoding.cu -o /opt/tiny-cuda-nn/bindings/torch/src/encoding.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -std=c++14 --extended-lambda --expt-relaxed-constexpr -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 -Xcompiler=-mf16c -Xcompiler=-Wno-float-conversion -Xcompiler=-fno-strict-aliasing -DTCNN_MIN_GPU_ARCH=86 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0
            FAILED: /opt/tiny-cuda-nn/bindings/torch/src/encoding.o
            /usr/local/cuda/bin/nvcc  -I/opt/tiny-cuda-nn/include -I/opt/tiny-cuda-nn/dependencies -I/opt/tiny-cuda-nn/dependencies/cutlass/include -I/opt/tiny-cuda-nn/dependencies/cutlass/tools/util/include -I/opt/tiny-cuda-nn/dependencies/fmt/include -I/opt/conda/lib/python3.8/site-packages/torch/include -I/opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/lib/python3.8/site-packages/torch/include/TH -I/opt/conda/lib/python3.8/site-packages/torch/include/THC -I/usr/local/cuda/include -I/opt/conda/include/python3.8 -c -c /opt/tiny-cuda-nn/src/encoding.cu -o /opt/tiny-cuda-nn/bindings/torch/src/encoding.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -std=c++14 --extended-lambda --expt-relaxed-constexpr -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 -Xcompiler=-mf16c -Xcompiler=-Wno-float-conversion -Xcompiler=-fno-strict-aliasing -DTCNN_MIN_GPU_ARCH=86 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0
            /opt/tiny-cuda-nn/dependencies/fmt/include/fmt/core.h(286): warning: unrecognized GCC pragma
    
            Killed
            ninja: build stopped: subcommand failed.
            Traceback (most recent call last):
              File "/opt/conda/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1808, in _run_ninja_build
                subprocess.run(
              File "/opt/conda/lib/python3.8/subprocess.py", line 516, in run
                raise CalledProcessError(retcode, process.args,
            subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.
    
            The above exception was the direct cause of the following exception:
    
            Traceback (most recent call last):
              File "<string>", line 2, in <module>
              File "<pip-setuptools-caller>", line 34, in <module>
              File "/opt/tiny-cuda-nn/bindings/torch/setup.py", line 127, in <module>
                setup(
              File "/opt/conda/lib/python3.8/site-packages/setuptools/__init__.py", line 87, in setup
                return distutils.core.setup(**attrs)
              File "/opt/conda/lib/python3.8/site-packages/setuptools/_distutils/core.py", line 185, in setup
                return run_commands(dist)
              File "/opt/conda/lib/python3.8/site-packages/setuptools/_distutils/core.py", line 201, in run_commands
                dist.run_commands()
              File "/opt/conda/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 973, in run_commands
                self.run_command(cmd)
              File "/opt/conda/lib/python3.8/site-packages/setuptools/dist.py", line 1217, in run_command
                super().run_command(command)
              File "/opt/conda/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 992, in run_command
                cmd_obj.run()
              File "/opt/conda/lib/python3.8/site-packages/setuptools/command/develop.py", line 34, in run
                self.install_for_development()
              File "/opt/conda/lib/python3.8/site-packages/setuptools/command/develop.py", line 114, in install_for_development
                self.run_command('build_ext')
              File "/opt/conda/lib/python3.8/site-packages/setuptools/_distutils/cmd.py", line 319, in run_command
                self.distribution.run_command(command)  
              File "/opt/conda/lib/python3.8/site-packages/setuptools/dist.py", line 1217, in run_command
                super().run_command(command)
              File "/opt/conda/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 992, in run_command
                cmd_obj.run()
              File "/opt/conda/lib/python3.8/site-packages/setuptools/command/build_ext.py", line 79, in run
                _build_ext.run(self)
              File "/opt/conda/lib/python3.8/site-packages/Cython/Distutils/old_build_ext.py", line 186, in run
                _build_ext.build_ext.run(self)
              File "/opt/conda/lib/python3.8/site-packages/setuptools/_distutils/command/build_ext.py", line 346, in run
                self.build_extensions()
              File "/opt/conda/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 765, in build_extensions
                build_ext.build_extensions(self)
              File "/opt/conda/lib/python3.8/site-packages/Cython/Distutils/old_build_ext.py", line 195, in build_extensions
                _build_ext.build_ext.build_extensions(self)
              File "/opt/conda/lib/python3.8/site-packages/setuptools/_distutils/command/build_ext.py", line 466, in build_extensions
                self._build_extensions_serial()
              File "/opt/conda/lib/python3.8/site-packages/setuptools/_distutils/command/build_ext.py", line 492, in _build_extensions_serial
                self.build_extension(ext)
              File "/opt/conda/lib/python3.8/site-packages/setuptools/command/build_ext.py", line 202, in build_extension
                _build_ext.build_extension(self, ext)   
              File "/opt/conda/lib/python3.8/site-packages/setuptools/_distutils/command/build_ext.py", line 547, in build_extension
                objects = self.compiler.compile(
              File "/opt/conda/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 586, in unix_wrap_ninja_compile
                _write_ninja_file_and_compile_objects(  
              File "/opt/conda/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1487, in _write_ninja_file_and_compile_objects
                _run_ninja_build(
              File "/opt/conda/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1824, in _run_ninja_build
                raise RuntimeError(message) from e
            RuntimeError: Error compiling objects for extension
            [end of output]
    
        note: This error originates from a subprocess, and is likely not a problem with pip.
    error: subprocess-exited-with-error
    
    × python setup.py develop did not run successfully. 
    │ exit code: 1
    ╰─> [124 lines of output]
        Building PyTorch extension for tiny-cuda-nn version 1.6
        Obtained compute capability 86 from PyTorch
        running develop
        running egg_info
        writing tinycudann.egg-info/PKG-INFO
        writing dependency_links to tinycudann.egg-info/dependency_links.txt
        writing top-level names to tinycudann.egg-info/top_level.txt
        reading manifest file 'tinycudann.egg-info/SOURCES.txt'
        writing manifest file 'tinycudann.egg-info/SOURCES.txt'
        running build_ext
        building 'tinycudann_bindings_86._C' extension  
        creating /opt/tiny-cuda-nn/bindings/torch/build/temp.linux-x86_64-cpython-38
        creating /opt/tiny-cuda-nn/bindings/torch/build/temp.linux-x86_64-cpython-38/tinycudann
        Emitting ninja build file /opt/tiny-cuda-nn/bindings/torch/build/temp.linux-x86_64-cpython-38/build.ninja...
        Compiling objects...
        Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
        /opt/conda/lib/python3.8/site-packages/setuptools/command/easy_install.py:144: EasyInstallDeprecationWarning: easy_install command is deprecated. Use build and pip and other standards-based tools.
          warnings.warn(
        /opt/conda/lib/python3.8/site-packages/setuptools/command/install.py:34: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.
          warnings.warn(
        [1/10] c++ -MMD -MF /opt/tiny-cuda-nn/bindings/torch/dependencies/fmt/src/os.o.d -pthread -B /opt/conda/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/opt/tiny-cuda-nn/include -I/opt/tiny-cuda-nn/dependencies -I/opt/tiny-cuda-nn/dependencies/cutlass/include -I/opt/tiny-cuda-nn/dependencies/cutlass/tools/util/include -I/opt/tiny-cuda-nn/dependencies/fmt/include -I/opt/conda/lib/python3.8/site-packages/torch/include -I/opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/lib/python3.8/site-packages/torch/include/TH -I/opt/conda/lib/python3.8/site-packages/torch/include/THC -I/usr/local/cuda/include -I/opt/conda/include/python3.8 -c -c /opt/tiny-cuda-nn/dependencies/fmt/src/os.cc -o /opt/tiny-cuda-nn/bindings/torch/dependencies/fmt/src/os.o -std=c++14 -DTCNN_MIN_GPU_ARCH=86 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0
        cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
        [2/10] c++ -MMD -MF /opt/tiny-cuda-nn/bindings/torch/dependencies/fmt/src/format.o.d -pthread -B /opt/conda/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/opt/tiny-cuda-nn/include -I/opt/tiny-cuda-nn/dependencies -I/opt/tiny-cuda-nn/dependencies/cutlass/include -I/opt/tiny-cuda-nn/dependencies/cutlass/tools/util/include -I/opt/tiny-cuda-nn/dependencies/fmt/include -I/opt/conda/lib/python3.8/site-packages/torch/include -I/opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/lib/python3.8/site-packages/torch/include/TH -I/opt/conda/lib/python3.8/site-packages/torch/include/THC -I/usr/local/cuda/include -I/opt/conda/include/python3.8 -c -c /opt/tiny-cuda-nn/dependencies/fmt/src/format.cc -o /opt/tiny-cuda-nn/bindings/torch/dependencies/fmt/src/format.o -std=c++14 -DTCNN_MIN_GPU_ARCH=86 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0
        cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
        [3/10] /usr/local/cuda/bin/nvcc  -I/opt/tiny-cuda-nn/include -I/opt/tiny-cuda-nn/dependencies -I/opt/tiny-cuda-nn/dependencies/cutlass/include -I/opt/tiny-cuda-nn/dependencies/cutlass/tools/util/include -I/opt/tiny-cuda-nn/dependencies/fmt/include -I/opt/conda/lib/python3.8/site-packages/torch/include -I/opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/lib/python3.8/site-packages/torch/include/TH -I/opt/conda/lib/python3.8/site-packages/torch/include/THC -I/usr/local/cuda/include -I/opt/conda/include/python3.8 -c -c /opt/tiny-cuda-nn/src/common_device.cu -o /opt/tiny-cuda-nn/bindings/torch/src/common_device.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -std=c++14 --extended-lambda --expt-relaxed-constexpr -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 -Xcompiler=-mf16c -Xcompiler=-Wno-float-conversion -Xcompiler=-fno-strict-aliasing -DTCNN_MIN_GPU_ARCH=86 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0
        /opt/tiny-cuda-nn/dependencies/fmt/include/fmt/core.h(286): warning: unrecognized GCC pragma
     
        /opt/tiny-cuda-nn/dependencies/fmt/include/fmt/core.h(286): warning: unrecognized GCC pragma
     
        [4/10] /usr/local/cuda/bin/nvcc  -I/opt/tiny-cuda-nn/include -I/opt/tiny-cuda-nn/dependencies -I/opt/tiny-cuda-nn/dependencies/cutlass/include -I/opt/tiny-cuda-nn/dependencies/cutlass/tools/util/include -I/opt/tiny-cuda-nn/dependencies/fmt/include -I/opt/conda/lib/python3.8/site-packages/torch/include -I/opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/lib/python3.8/site-packages/torch/include/TH -I/opt/conda/lib/python3.8/site-packages/torch/include/THC -I/usr/local/cuda/include -I/opt/conda/include/python3.8 -c -c /opt/tiny-cuda-nn/src/common.cu -o /opt/tiny-cuda-nn/bindings/torch/src/common.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -std=c++14 --extended-lambda --expt-relaxed-constexpr -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 -Xcompiler=-mf16c -Xcompiler=-Wno-float-conversion -Xcompiler=-fno-strict-aliasing -DTCNN_MIN_GPU_ARCH=86 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0
        /opt/tiny-cuda-nn/dependencies/fmt/include/fmt/core.h(286): warning: unrecognized GCC pragma
     
        /opt/tiny-cuda-nn/dependencies/fmt/include/fmt/core.h(286): warning: unrecognized GCC pragma
     
        [5/10] /usr/local/cuda/bin/nvcc  -I/opt/tiny-cuda-nn/include -I/opt/tiny-cuda-nn/dependencies -I/opt/tiny-cuda-nn/dependencies/cutlass/include -I/opt/tiny-cuda-nn/dependencies/cutlass/tools/util/include -I/opt/tiny-cuda-nn/dependencies/fmt/include -I/opt/conda/lib/python3.8/site-packages/torch/include -I/opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/lib/python3.8/site-packages/torch/include/TH -I/opt/conda/lib/python3.8/site-packages/torch/include/THC -I/usr/local/cuda/include -I/opt/conda/include/python3.8 -c -c /opt/tiny-cuda-nn/src/network.cu -o /opt/tiny-cuda-nn/bindings/torch/src/network.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -std=c++14 --extended-lambda --expt-relaxed-constexpr -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 -Xcompiler=-mf16c -Xcompiler=-Wno-float-conversion -Xcompiler=-fno-strict-aliasing -DTCNN_MIN_GPU_ARCH=86 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0
        /opt/tiny-cuda-nn/dependencies/fmt/include/fmt/core.h(286): warning: unrecognized GCC pragma
     
        /opt/tiny-cuda-nn/dependencies/fmt/include/fmt/core.h(286): warning: unrecognized GCC pragma
     
        [6/10] /usr/local/cuda/bin/nvcc  -I/opt/tiny-cuda-nn/include -I/opt/tiny-cuda-nn/dependencies -I/opt/tiny-cuda-nn/dependencies/cutlass/include -I/opt/tiny-cuda-nn/dependencies/cutlass/tools/util/include -I/opt/tiny-cuda-nn/dependencies/fmt/include -I/opt/conda/lib/python3.8/site-packages/torch/include -I/opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/lib/python3.8/site-packages/torch/include/TH -I/opt/conda/lib/python3.8/site-packages/torch/include/THC -I/usr/local/cuda/include -I/opt/conda/include/python3.8 -c -c /opt/tiny-cuda-nn/src/cpp_api.cu -o /opt/tiny-cuda-nn/bindings/torch/src/cpp_api.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -std=c++14 --extended-lambda --expt-relaxed-constexpr -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 -Xcompiler=-mf16c -Xcompiler=-Wno-float-conversion -Xcompiler=-fno-strict-aliasing -DTCNN_MIN_GPU_ARCH=86 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0
        /opt/tiny-cuda-nn/dependencies/fmt/include/fmt/core.h(286): warning: unrecognized GCC pragma
     
        /opt/tiny-cuda-nn/dependencies/fmt/include/fmt/core.h(286): warning: unrecognized GCC pragma
     
        [7/10] c++ -MMD -MF /opt/tiny-cuda-nn/bindings/torch/build/temp.linux-x86_64-cpython-38/tinycudann/bindings.o.d -pthread -B /opt/conda/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/opt/tiny-cuda-nn/include -I/opt/tiny-cuda-nn/dependencies -I/opt/tiny-cuda-nn/dependencies/cutlass/include -I/opt/tiny-cuda-nn/dependencies/cutlass/tools/util/include -I/opt/tiny-cuda-nn/dependencies/fmt/include -I/opt/conda/lib/python3.8/site-packages/torch/include -I/opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/lib/python3.8/site-packages/torch/include/TH -I/opt/conda/lib/python3.8/site-packages/torch/include/THC -I/usr/local/cuda/include -I/opt/conda/include/python3.8 -c -c /opt/tiny-cuda-nn/bindings/torch/tinycudann/bindings.cpp -o /opt/tiny-cuda-nn/bindings/torch/build/temp.linux-x86_64-cpython-38/tinycudann/bindings.o -std=c++14 -DTCNN_MIN_GPU_ARCH=86 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0
        cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
        [8/10] /usr/local/cuda/bin/nvcc  -I/opt/tiny-cuda-nn/include -I/opt/tiny-cuda-nn/dependencies -I/opt/tiny-cuda-nn/dependencies/cutlass/include -I/opt/tiny-cuda-nn/dependencies/cutlass/tools/util/include -I/opt/tiny-cuda-nn/dependencies/fmt/include -I/opt/conda/lib/python3.8/site-packages/torch/include -I/opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/lib/python3.8/site-packages/torch/include/TH -I/opt/conda/lib/python3.8/site-packages/torch/include/THC -I/usr/local/cuda/include -I/opt/conda/include/python3.8 -c -c /opt/tiny-cuda-nn/src/cutlass_mlp.cu -o /opt/tiny-cuda-nn/bindings/torch/src/cutlass_mlp.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -std=c++14 --extended-lambda --expt-relaxed-constexpr -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 -Xcompiler=-mf16c -Xcompiler=-Wno-float-conversion -Xcompiler=-fno-strict-aliasing -DTCNN_MIN_GPU_ARCH=86 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0
        /opt/tiny-cuda-nn/dependencies/fmt/include/fmt/core.h(286): warning: unrecognized GCC pragma
     
        /opt/tiny-cuda-nn/dependencies/fmt/include/fmt/core.h(286): warning: unrecognized GCC pragma
     
        [9/10] /usr/local/cuda/bin/nvcc  -I/opt/tiny-cuda-nn/include -I/opt/tiny-cuda-nn/dependencies -I/opt/tiny-cuda-nn/dependencies/cutlass/include -I/opt/tiny-cuda-nn/dependencies/cutlass/tools/util/include -I/opt/tiny-cuda-nn/dependencies/fmt/include -I/opt/conda/lib/python3.8/site-packages/torch/include -I/opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/lib/python3.8/site-packages/torch/include/TH -I/opt/conda/lib/python3.8/site-packages/torch/include/THC -I/usr/local/cuda/include -I/opt/conda/include/python3.8 -c -c /opt/tiny-cuda-nn/src/fully_fused_mlp.cu -o /opt/tiny-cuda-nn/bindings/torch/src/fully_fused_mlp.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -std=c++14 --extended-lambda --expt-relaxed-constexpr -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 -Xcompiler=-mf16c -Xcompiler=-Wno-float-conversion -Xcompiler=-fno-strict-aliasing -DTCNN_MIN_GPU_ARCH=86 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0
        /opt/tiny-cuda-nn/dependencies/fmt/include/fmt/core.h(286): warning: unrecognized GCC pragma
     
        /opt/tiny-cuda-nn/dependencies/fmt/include/fmt/core.h(286): warning: unrecognized GCC pragma
     
        [10/10] /usr/local/cuda/bin/nvcc  -I/opt/tiny-cuda-nn/include -I/opt/tiny-cuda-nn/dependencies -I/opt/tiny-cuda-nn/dependencies/cutlass/include -I/opt/tiny-cuda-nn/dependencies/cutlass/tools/util/include -I/opt/tiny-cuda-nn/dependencies/fmt/include -I/opt/conda/lib/python3.8/site-packages/torch/include -I/opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/lib/python3.8/site-packages/torch/include/TH -I/opt/conda/lib/python3.8/site-packages/torch/include/THC -I/usr/local/cuda/include -I/opt/conda/include/python3.8 -c -c /opt/tiny-cuda-nn/src/encoding.cu -o /opt/tiny-cuda-nn/bindings/torch/src/encoding.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -std=c++14 --extended-lambda --expt-relaxed-constexpr -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 -Xcompiler=-mf16c -Xcompiler=-Wno-float-conversion -Xcompiler=-fno-strict-aliasing -DTCNN_MIN_GPU_ARCH=86 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0
        FAILED: /opt/tiny-cuda-nn/bindings/torch/src/encoding.o
        /usr/local/cuda/bin/nvcc  -I/opt/tiny-cuda-nn/include -I/opt/tiny-cuda-nn/dependencies -I/opt/tiny-cuda-nn/dependencies/cutlass/include -I/opt/tiny-cuda-nn/dependencies/cutlass/tools/util/include -I/opt/tiny-cuda-nn/dependencies/fmt/include -I/opt/conda/lib/python3.8/site-packages/torch/include -I/opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/lib/python3.8/site-packages/torch/include/TH -I/opt/conda/lib/python3.8/site-packages/torch/include/THC -I/usr/local/cuda/include -I/opt/conda/include/python3.8 -c -c /opt/tiny-cuda-nn/src/encoding.cu -o /opt/tiny-cuda-nn/bindings/torch/src/encoding.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -std=c++14 --extended-lambda --expt-relaxed-constexpr -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 -Xcompiler=-mf16c -Xcompiler=-Wno-float-conversion -Xcompiler=-fno-strict-aliasing -DTCNN_MIN_GPU_ARCH=86 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0
        /opt/tiny-cuda-nn/dependencies/fmt/include/fmt/core.h(286): warning: unrecognized GCC pragma
     
        Killed
        ninja: build stopped: subcommand failed.
        Traceback (most recent call last):
          File "/opt/conda/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1808, in _run_ninja_build
            subprocess.run(
          File "/opt/conda/lib/python3.8/subprocess.py", line 516, in run
            raise CalledProcessError(retcode, process.args,
        subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.
     
        The above exception was the direct cause of the following exception:
     
        Traceback (most recent call last):
          File "<string>", line 2, in <module>
          File "<pip-setuptools-caller>", line 34, in <module>
          File "/opt/tiny-cuda-nn/bindings/torch/setup.py", line 127, in <module>
            setup(
          File "/opt/conda/lib/python3.8/site-packages/setuptools/__init__.py", line 87, in setup
            return distutils.core.setup(**attrs)
          File "/opt/conda/lib/python3.8/site-packages/setuptools/_distutils/core.py", line 185, in setup
            return run_commands(dist)
          File "/opt/conda/lib/python3.8/site-packages/setuptools/_distutils/core.py", line 201, in run_commands
            dist.run_commands()
          File "/opt/conda/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 973, in run_commands
            self.run_command(cmd)
          File "/opt/conda/lib/python3.8/site-packages/setuptools/dist.py", line 1217, in run_command
            super().run_command(command)
          File "/opt/conda/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 992, in run_command
            cmd_obj.run()
          File "/opt/conda/lib/python3.8/site-packages/setuptools/command/develop.py", line 34, in run
            self.install_for_development()
          File "/opt/conda/lib/python3.8/site-packages/setuptools/command/develop.py", line 114, in install_for_development
            self.run_command('build_ext')
          File "/opt/conda/lib/python3.8/site-packages/setuptools/_distutils/cmd.py", line 319, in run_command
            self.distribution.run_command(command)
          File "/opt/conda/lib/python3.8/site-packages/setuptools/dist.py", line 1217, in run_command
            super().run_command(command)
          File "/opt/conda/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 992, in run_command
            cmd_obj.run()
          File "/opt/conda/lib/python3.8/site-packages/setuptools/command/build_ext.py", line 79, in run
            _build_ext.run(self)
          File "/opt/conda/lib/python3.8/site-packages/Cython/Distutils/old_build_ext.py", line 186, in run
            _build_ext.build_ext.run(self)
          File "/opt/conda/lib/python3.8/site-packages/setuptools/_distutils/command/build_ext.py", line 346, in run
            self.build_extensions()
          File "/opt/conda/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 765, in build_extensions
            build_ext.build_extensions(self)
          File "/opt/conda/lib/python3.8/site-packages/Cython/Distutils/old_build_ext.py", line 195, in build_extensions
            _build_ext.build_ext.build_extensions(self) 
          File "/opt/conda/lib/python3.8/site-packages/setuptools/_distutils/command/build_ext.py", line 466, in build_extensions
            self._build_extensions_serial()
          File "/opt/conda/lib/python3.8/site-packages/setuptools/_distutils/command/build_ext.py", line 492, in _build_extensions_serial
            self.build_extension(ext)
          File "/opt/conda/lib/python3.8/site-packages/setuptools/command/build_ext.py", line 202, in build_extension
            _build_ext.build_extension(self, ext)
          File "/opt/conda/lib/python3.8/site-packages/setuptools/_distutils/command/build_ext.py", line 547, in build_extension
            objects = self.compiler.compile(
          File "/opt/conda/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 586, in unix_wrap_ninja_compile
            _write_ninja_file_and_compile_objects(
          File "/opt/conda/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1487, in _write_ninja_file_and_compile_objects
            _run_ninja_build(
          File "/opt/conda/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1824, in _run_ninja_build
            raise RuntimeError(message) from e
        RuntimeError: Error compiling objects for extension
        [end of output]
    
    note: This error originates from a subprocess, and is likely not a problem with pip.
    

    I tried changing the CMake version to 3.22.2, but the library still fails to install.

    I would really appreciate your help with installing it properly. Thanks!

    opened by myaldiz 6
  • Get nested optimizer

    Can we add a method to optimizer that allows us to access the nested optimizer if it exists?

    virtual std::shared_ptr<Optimizer> nested() { return std::shared_ptr<Optimizer>(); }
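
    A minimal sketch of how such an accessor could behave, assuming a simplified stand-in for the optimizer interface. The classes `Optimizer`, `LeafOptimizer`, `WrapperOptimizer` and the helper `innermost` are hypothetical names used only for illustration; they are not the framework's actual types.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical, simplified stand-in for an optimizer interface.
class Optimizer {
public:
    virtual ~Optimizer() = default;
    virtual std::string name() const = 0;
    // By default an optimizer wraps nothing, so an empty pointer is returned.
    virtual std::shared_ptr<Optimizer> nested() { return std::shared_ptr<Optimizer>(); }
};

// Hypothetical optimizer that does not wrap another one.
class LeafOptimizer : public Optimizer {
public:
    std::string name() const override { return "leaf"; }
};

// Hypothetical optimizer that decorates another optimizer and exposes it.
class WrapperOptimizer : public Optimizer {
public:
    explicit WrapperOptimizer(std::shared_ptr<Optimizer> inner) : m_inner(std::move(inner)) {
        assert(m_inner && "a wrapper needs an inner optimizer");
    }
    std::string name() const override { return "wrapper"; }
    std::shared_ptr<Optimizer> nested() override { return m_inner; }

private:
    std::shared_ptr<Optimizer> m_inner;
};

// Follows nested() until an optimizer without a nested one is reached.
std::shared_ptr<Optimizer> innermost(std::shared_ptr<Optimizer> optimizer) {
    while (optimizer) {
        std::shared_ptr<Optimizer> inner = optimizer->nested();
        if (!inner) {
            break;
        }
        optimizer = inner;
    }
    return optimizer;
}
```

    With a default that returns an empty pointer, callers can walk a chain of wrappers without knowing the concrete types, and optimizers that wrap nothing need no changes.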
    
    opened by jc211 6
  • Support for Higher Order Derivatives in `SphericalHarmonics` Encoding

    I have noticed that the SphericalHarmonics encoding does not support second-order (or higher) derivatives, unlike the Grid and Frequency encodings. Given that it mainly involves multiply-add operations, as shown here, would it be possible to support higher-order derivatives in a future release?
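
    To illustrate the point about multiply-add operations, here is a small sketch for a single basis function, the real spherical harmonic with l = 2, m = 0, written as a polynomial in the z coordinate. It assumes z is treated as a free input variable, and the function names are made up for this example; it is not the framework's implementation. Because the basis function is a polynomial, its first and second derivatives are again simple multiply-adds, which the finite-difference helpers can be used to check.

```cpp
#include <cassert>
#include <cmath>

// Normalization constant of the real spherical harmonic l = 2, m = 0: (1/4) * sqrt(5 / pi).
inline double sh20_scale() {
    const double pi = std::acos(-1.0);
    return 0.25 * std::sqrt(5.0 / pi);
}

// Basis function as a polynomial in z: c * (3 z^2 - 1).
inline double sh20(double z) {
    return sh20_scale() * (3.0 * z * z - 1.0);
}

// Analytic first derivative with respect to z: 6 c z.
inline double sh20_dz(double z) {
    return sh20_scale() * 6.0 * z;
}

// Analytic second derivative with respect to z: 6 c (constant in z).
inline double sh20_dz2(double /*z*/) {
    return sh20_scale() * 6.0;
}

// Central finite-difference estimate of the first derivative.
inline double fd_first(double z, double h) {
    assert(h > 0.0);
    return (sh20(z + h) - sh20(z - h)) / (2.0 * h);
}

// Central finite-difference estimate of the second derivative.
inline double fd_second(double z, double h) {
    assert(h > 0.0);
    return (sh20(z + h) - 2.0 * sh20(z) + sh20(z - h)) / (h * h);
}
```

    Comparing `sh20_dz(z)` with `fd_first(z, 1e-3)` and `sh20_dz2(z)` with `fd_second(z, 1e-3)` is one way to sanity-check analytic derivatives of this kind before wiring them into a backward pass.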

    opened by low5545 0
  • pip install: RuntimeError: Error compiling objects for extension

    When I install nerfstudio, it needs tiny-cuda-nn as a requirement, so I installed tiny-cuda-nn following the instructions at https://docs.nerf.studio/en/latest/quickstart/installation.html. However, I ran into this error: `RuntimeError: Error compiling objects for extension`. Has anybody had the same problem?

    opened by Miaosheng1 2
  • AttributeError: module 'tinycudann_bindings_75._C' has no attribute xxx

    Hi, I encountered some problems while using tiny-cuda-nn on an RTX 4090 GPU. It keeps raising AttributeError: module 'tinycudann_bindings_75._C' has no attribute xxx whenever I call ordinary functions such as create_network_with_input_encoding or create_network.

    opened by SarahWeiii 2
  • error installing tinycuda

    I ran the install commands from the nerfstudio page, and the second command fails:

    pip install git+https://github.com/NVlabs/tiny-cuda-nn/#subdirectory=bindings/torch
    

    I get the following error which is quite long:

    Building PyTorch extension for tiny-cuda-nn version 1.7
      Obtained compute capability 75 from PyTorch
      running bdist_wheel
      C:\Users\UserToto\Documents\MyLocalApps\Anaconda3\envs\nerfstudio\lib\site-packages\torch\utils\cpp_extension.py:411: UserWarning: Attempted to use ninja as the BuildExtension backend but we could not find ninja.. Falling back to using the slow distutils backend.
        warnings.warn(msg.format('we could not find ninja.'))
      running build
      running build_py
      creating build
      creating build\lib.win-amd64-cpython-38
      creating build\lib.win-amd64-cpython-38\tinycudann
      copying tinycudann\modules.py -> build\lib.win-amd64-cpython-38\tinycudann
      copying tinycudann\__init__.py -> build\lib.win-amd64-cpython-38\tinycudann
      running egg_info
      creating tinycudann.egg-info
      writing tinycudann.egg-info\PKG-INFO
      writing dependency_links to tinycudann.egg-info\dependency_links.txt
      writing top-level names to tinycudann.egg-info\top_level.txt
      writing manifest file 'tinycudann.egg-info\SOURCES.txt'
      reading manifest file 'tinycudann.egg-info\SOURCES.txt'
      writing manifest file 'tinycudann.egg-info\SOURCES.txt'
      copying tinycudann\bindings.cpp -> build\lib.win-amd64-cpython-38\tinycudann
      running build_ext
      building 'tinycudann_bindings_75._C' extension
      creating build\dependencies
      creating build\dependencies\fmt
      creating build\dependencies\fmt\src
      creating build\src
      creating build\temp.win-amd64-cpython-38
      creating build\temp.win-amd64-cpython-38\Release
      creating build\temp.win-amd64-cpython-38\Release\tinycudann
      "C:\Program Files (x86)\Microsoft Visual Studio\2019\Professional\VC\Tools\MSVC\14.29.30133\bin\HostX86\x64\cl.exe" /c /nologo /O2 /W3 /GL /DNDEBUG /MD -IC:\Users\UserToto\AppData\Local\Temp\pip-req-build-s2124542/include -IC:\Users\UserToto\AppData\Local\Temp\pip-req-build-s2124542/dependencies -IC:\Users\UserToto\AppData\Local\Temp\pip-req-build-s2124542/dependencies/cutlass/include -IC:\Users\UserToto\AppData\Local\Temp\pip-req-build-s2124542/dependencies/cutlass/tools/util/include -IC:\Users\UserToto\AppData\Local\Temp\pip-req-build-s2124542/dependencies/fmt/include -IC:\Users\UserToto\Documents\MyLocalApps\Anaconda3\envs\nerfstudio\lib\site-packages\torch\include -IC:\Users\UserToto\Documents\MyLocalApps\Anaconda3\envs\nerfstudio\lib\site-packages\torch\include\torch\csrc\api\include -IC:\Users\UserToto\Documents\MyLocalApps\Anaconda3\envs\nerfstudio\lib\site-packages\torch\include\TH -IC:\Users\UserToto\Documents\MyLocalApps\Anaconda3\envs\nerfstudio\lib\site-packages\torch\include\THC "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.3\include" -IC:\Users\UserToto\Documents\MyLocalApps\Anaconda3\envs\nerfstudio\include -IC:\Users\UserToto\Documents\MyLocalApps\Anaconda3\envs\nerfstudio\Include "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\Professional\VC\Tools\MSVC\14.29.30133\ATLMFC\include" "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\Professional\VC\Tools\MSVC\14.29.30133\include" "-IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\include\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\cppwinrt" /EHsc /Tp../../dependencies/fmt/src/format.cc /Fobuild\temp.win-amd64-cpython-38\Release\../../dependencies/fmt/src/format.obj /MD /wd4819 
/wd4251 /wd4244 /wd4267 /wd4275 /wd4018 /wd4190 /EHsc /std:c++14 -DTCNN_MIN_GPU_ARCH=75 -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0
      format.cc
      "C:\Program Files (x86)\Microsoft Visual Studio\2019\Professional\VC\Tools\MSVC\14.29.30133\bin\HostX86\x64\cl.exe" /c /nologo /O2 /W3 /GL /DNDEBUG /MD -IC:\Users\UserToto\AppData\Local\Temp\pip-req-build-s2124542/include -IC:\Users\UserToto\AppData\Local\Temp\pip-req-build-s2124542/dependencies -IC:\Users\UserToto\AppData\Local\Temp\pip-req-build-s2124542/dependencies/cutlass/include -IC:\Users\UserToto\AppData\Local\Temp\pip-req-build-s2124542/dependencies/cutlass/tools/util/include -IC:\Users\UserToto\AppData\Local\Temp\pip-req-build-s2124542/dependencies/fmt/include -IC:\Users\UserToto\Documents\MyLocalApps\Anaconda3\envs\nerfstudio\lib\site-packages\torch\include -IC:\Users\UserToto\Documents\MyLocalApps\Anaconda3\envs\nerfstudio\lib\site-packages\torch\include\torch\csrc\api\include -IC:\Users\UserToto\Documents\MyLocalApps\Anaconda3\envs\nerfstudio\lib\site-packages\torch\include\TH -IC:\Users\UserToto\Documents\MyLocalApps\Anaconda3\envs\nerfstudio\lib\site-packages\torch\include\THC "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.3\include" -IC:\Users\UserToto\Documents\MyLocalApps\Anaconda3\envs\nerfstudio\include -IC:\Users\UserToto\Documents\MyLocalApps\Anaconda3\envs\nerfstudio\Include "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\Professional\VC\Tools\MSVC\14.29.30133\ATLMFC\include" "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\Professional\VC\Tools\MSVC\14.29.30133\include" "-IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\include\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\cppwinrt" /EHsc /Tp../../dependencies/fmt/src/os.cc /Fobuild\temp.win-amd64-cpython-38\Release\../../dependencies/fmt/src/os.obj /MD /wd4819 /wd4251 
/wd4244 /wd4267 /wd4275 /wd4018 /wd4190 /EHsc /std:c++14 -DTCNN_MIN_GPU_ARCH=75 -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0
      os.cc
      "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.3\bin\nvcc" -c ../../src/common.cu -o build\temp.win-amd64-cpython-38\Release\../../src/common.obj -IC:\Users\UserToto\AppData\Local\Temp\pip-req-build-s2124542/include -IC:\Users\UserToto\AppData\Local\Temp\pip-req-build-s2124542/dependencies -IC:\Users\UserToto\AppData\Local\Temp\pip-req-build-s2124542/dependencies/cutlass/include -IC:\Users\UserToto\AppData\Local\Temp\pip-req-build-s2124542/dependencies/cutlass/tools/util/include -IC:\Users\UserToto\AppData\Local\Temp\pip-req-build-s2124542/dependencies/fmt/include -IC:\Users\UserToto\Documents\MyLocalApps\Anaconda3\envs\nerfstudio\lib\site-packages\torch\include -IC:\Users\UserToto\Documents\MyLocalApps\Anaconda3\envs\nerfstudio\lib\site-packages\torch\include\torch\csrc\api\include -IC:\Users\UserToto\Documents\MyLocalApps\Anaconda3\envs\nerfstudio\lib\site-packages\torch\include\TH -IC:\Users\UserToto\Documents\MyLocalApps\Anaconda3\envs\nerfstudio\lib\site-packages\torch\include\THC "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.3\include" -IC:\Users\UserToto\Documents\MyLocalApps\Anaconda3\envs\nerfstudio\include -IC:\Users\UserToto\Documents\MyLocalApps\Anaconda3\envs\nerfstudio\Include "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\Professional\VC\Tools\MSVC\14.29.30133\ATLMFC\include" "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\Professional\VC\Tools\MSVC\14.29.30133\include" "-IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\include\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\cppwinrt" -Xcudafe --diag_suppress=dll_interface_conflict_dllexport_assumed -Xcudafe --diag_suppress=dll_interface_conflict_none_assumed -Xcudafe 
--diag_suppress=field_without_dll_interface -Xcudafe --diag_suppress=base_class_has_different_dll_interface -Xcompiler /EHsc -Xcompiler /wd4190 -Xcompiler /wd4018 -Xcompiler /wd4275 -Xcompiler /wd4267 -Xcompiler /wd4244 -Xcompiler /wd4251 -Xcompiler /wd4819 -Xcompiler /MD -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -std=c++14 --extended-lambda --expt-relaxed-constexpr -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -gencode=arch=compute_75,code=compute_75 -gencode=arch=compute_75,code=sm_75 -DTCNN_MIN_GPU_ARCH=75 -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0 --use-local-env
      cl : Command line warning D9025 : overriding '/D__CUDA_NO_HALF_OPERATORS__' with '/U__CUDA_NO_HALF_OPERATORS__'
      cl : Command line warning D9025 : overriding '/D__CUDA_NO_HALF_CONVERSIONS__' with '/U__CUDA_NO_HALF_CONVERSIONS__'
      cl : Command line warning D9025 : overriding '/D__CUDA_NO_HALF2_OPERATORS__' with '/U__CUDA_NO_HALF2_OPERATORS__'
      common.cu
      C:/Program Files (x86)/Microsoft Visual Studio/2019/Professional/VC/Tools/MSVC/14.29.30133/include\xutility(1309): error: expected a "("
                detected during instantiation of "void std::_Adl_verify_range(const _Iter &, const _Sentinel &) [with _Iter=const char *, _Sentinel=const char *]"
      C:/Program Files (x86)/Microsoft Visual Studio/2019/Professional/VC/Tools/MSVC/14.29.30133/include\xlocale(1990): here
      
    
    ....
    
    
      C:/Program Files (x86)/Microsoft Visual Studio/2019/Professional/VC/Tools/MSVC/14.29.30133/include\xutility(131): error: expected a "("
                detected during:
                  instantiation of "void *std::_Voidify_iter(_Iter) [with _Iter=std::string *]"
      C:/Program Files (x86)/Microsoft Visual Studio/2019/Professional/VC/Tools/MSVC/14.29.30133/include\xmemory(714): here
                  instantiation of "void std::_Default_allocator_traits<_Alloc>::construct(_Alloc &, _Objty *, _Types &&...) [with _Alloc=std::allocator<std::string>, _Objty=std::string, _Types=<const char (&)[1]>]"
      C:/Users/UserToto/AppData/Local/Temp/pip-req-build-s2124542/dependencies\json/json.hpp(18440): here
                  instantiation of "T *nlohmann::basic_json<ObjectType, ArrayType, StringType, BooleanType, NumberIntegerType, NumberUnsignedType, NumberFloatType, AllocatorType, JSONSerializer, BinaryType>::create<T,Args...>(Args &&...) [with ObjectType=std::map, ArrayType=std::vector, StringType=std::string, BooleanType=__nv_bool, NumberIntegerType=int64_t, NumberUnsignedType=uint64_t, NumberFloatType=double, AllocatorType=std::allocator, JSONSerializer=nlohmann::adl_serializer, BinaryType=std::vector<uint8_t, std::allocator<uint8_t>>, T=std::string, Args=<const char (&)[1]>]"
      C:/Users/UserToto/AppData/Local/Temp/pip-req-build-s2124542/dependencies\json/json.hpp(18523): here
                  instantiation of "nlohmann::basic_json<ObjectType, ArrayType, StringType, BooleanType, NumberIntegerType, NumberUnsignedType, NumberFloatType, AllocatorType, JSONSerializer, BinaryType>::json_value::json_value(nlohmann::basic_json<ObjectType, ArrayType, StringType, BooleanType, NumberIntegerType, NumberUnsignedType, NumberFloatType, AllocatorType, JSONSerializer, BinaryType>::value_t) [with ObjectType=std::map, ArrayType=std::vector, StringType=std::string, BooleanType=__nv_bool, NumberIntegerType=int64_t, NumberUnsignedType=uint64_t, NumberFloatType=double, AllocatorType=std::allocator, JSONSerializer=nlohmann::adl_serializer, BinaryType=std::vector<uint8_t, std::allocator<uint8_t>>]"
      C:/Users/UserToto/AppData/Local/Temp/pip-req-build-s2124542/dependencies\json/json.hpp(18947): here
                  instantiation of "nlohmann::basic_json<ObjectType, ArrayType, StringType, BooleanType, NumberIntegerType, NumberUnsignedType, NumberFloatType, AllocatorType, JSONSerializer, BinaryType>::basic_json(nlohmann::basic_json<ObjectType, ArrayType, StringType, BooleanType, NumberIntegerType, NumberUnsignedType, NumberFloatType, AllocatorType, JSONSerializer, BinaryType>::value_t) [with ObjectType=std::map, ArrayType=std::vector, StringType=std::string, BooleanType=__nv_bool, NumberIntegerType=int64_t, NumberUnsignedType=uint64_t, NumberFloatType=double, AllocatorType=std::allocator, JSONSerializer=nlohmann::adl_serializer, BinaryType=std::vector<uint8_t, std::allocator<uint8_t>>]"
      C:/Users/UserToto/AppData/Local/Temp/pip-req-build-s2124542/dependencies\json/json.hpp(18971): here
                  instantiation of "nlohmann::basic_json<ObjectType, ArrayType, StringType, BooleanType, NumberIntegerType, NumberUnsignedType, NumberFloatType, AllocatorType, JSONSerializer, BinaryType>::basic_json(std::nullptr_t) [with ObjectType=std::map, ArrayType=std::vector, StringType=std::string, BooleanType=__nv_bool, NumberIntegerType=int64_t, NumberUnsignedType=uint64_t, NumberFloatType=double, AllocatorType=std::allocator, JSONSerializer=nlohmann::adl_serializer, BinaryType=std::vector<uint8_t, std::allocator<uint8_t>>]"
      C:/Users/UserToto/AppData/Local/Temp/pip-req-build-s2124542/dependencies\json/json.hpp(24402): here
                  instantiation of "nlohmann::basic_json<ObjectType, ArrayType, StringType, BooleanType, NumberIntegerType, NumberUnsignedType, NumberFloatType, AllocatorType, JSONSerializer, BinaryType> nlohmann::basic_json<ObjectType, ArrayType, StringType, BooleanType, NumberIntegerType, NumberUnsignedType, NumberFloatType, AllocatorType, JSONSerializer, BinaryType>::parse(IteratorType, IteratorType, nlohmann::basic_json<ObjectType, ArrayType, StringType, BooleanType, NumberIntegerType, NumberUnsignedType, NumberFloatType, AllocatorType, JSONSerializer, BinaryType>::parser_callback_t, __nv_bool, __nv_bool) [with ObjectType=std::map, ArrayType=std::vector, StringType=std::string, BooleanType=__nv_bool, NumberIntegerType=int64_t, NumberUnsignedType=uint64_t, NumberFloatType=double, AllocatorType=std::allocator, JSONSerializer=nlohmann::adl_serializer, BinaryType=std::vector<uint8_t, std::allocator<uint8_t>>, IteratorType=const char *]"
      C:/Users/UserToto/AppData/Local/Temp/pip-req-build-s2124542/dependencies\json/json.hpp(26513): here
      
      Error limit reached.
      100 errors detected in the compilation of "../../src/common.cu".
      Compilation terminated.
      common.cu
      error: command 'C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v11.3\\bin\\nvcc.exe' failed with exit code 1
      [end of output]
      
      note: This error originates from a subprocess, and is likely not a problem with pip.
    error: legacy-install-failure
    
    Encountered error while trying to install package.
    
    tinycudann
    
    note: This is an issue with the package mentioned above, not pip.
    hint: See above for output from the failure.
    
    

    My setup: RTX 2060, Visual Studio 2019

    Any help would be appreciated

    opened by ZAO29 3
Releases(v1.6)
  • v1.6(Dec 15, 2022)

    Given how many improvements have landed since April, and how long tiny-cuda-nn's current state has been stable, I think it's about time for another release.

    Changes Since Last Release

    • Multi-GPU support: tiny-cuda-nn can now run on multiple GPUs simultaneously. It is the user's responsibility to ensure that parameters, inputs, outputs, and streams reside on the currently active CUDA device.
      • PyTorch multi-GPU operation works out-of-the-box.
    • CMake improvements: When using tiny-cuda-nn as a CMake submodule, its include folders and libraries are now tracked as part of its PUBLIC interface. This means the following two lines of CMake are sufficient for a parent project to be able to use tiny-cuda-nn in its CUDA code:
      add_subdirectory(dependencies/tiny-cuda-nn)
      target_link_libraries(<parent project> PUBLIC tiny-cuda-nn)
      
    • Assorted functionality upgrades:
      • AdamOptimizer can now perform weight clipping.
      • A new CompositeOptimizer got added (courtesy of @Solonets). It can optimize different parts of the model (such as encoding and neural net) using different optimizers, e.g. to use different learning rates.
      • CompositeEncoding can now perform sum or product reduction over its nested encodings.
      • Alignment of Encoding's input and output matrices has been simplified and should work automatically in all cases now.
      • Many situations that used to cause undefined behavior are now checked and throw descriptive exceptions.
      • Parameter initialization (model->initialize_params(...)) and parameter setting (model->set_params(...)) have been decoupled. Calling set_params is required before a model can be used. Calling initialize_params no longer influences the parameters of the model and instead merely returns a set of parameters that serves as a good initial state for training.
      • Snapshots are now compatible across CutlassMLP and FullyFusedMLP, as well as across float and __half precision. This means snapshots generated from any GPU can be loaded by any other GPU.
      • The hash function of GridEncoding can now be configured.
    • Countless bug fixes and performance improvements.
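To illustrate what weight clipping in an Adam-style optimizer amounts to, here is a minimal CPU-side sketch. It is not tiny-cuda-nn's AdamOptimizer: the struct names, the field names, the defaults, and the choice to clamp symmetrically to [-clip_magnitude, +clip_magnitude] after each update are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical names and defaults; not the library's API.
struct AdamConfig {
    float learning_rate = 1e-2f;
    float beta1 = 0.9f;
    float beta2 = 0.999f;
    float epsilon = 1e-8f;
    float clip_magnitude = 1.0f; // assumed: weights are clamped to [-clip, +clip]
};

struct AdamState {
    std::vector<float> m; // first-moment (mean) estimate per weight
    std::vector<float> v; // second-moment (uncentered variance) estimate per weight
    int step = 0;
};

// One Adam update followed by clamping every weight into the allowed range.
void adam_step_with_clipping(std::vector<float>& weights,
                             const std::vector<float>& gradients,
                             AdamState& state,
                             const AdamConfig& cfg) {
    assert(weights.size() == gradients.size());
    state.m.resize(weights.size(), 0.0f);
    state.v.resize(weights.size(), 0.0f);
    ++state.step;

    // Bias-correction terms for the exponentially averaged moments.
    const float bias1 = 1.0f - std::pow(cfg.beta1, static_cast<float>(state.step));
    const float bias2 = 1.0f - std::pow(cfg.beta2, static_cast<float>(state.step));

    for (std::size_t i = 0; i < weights.size(); ++i) {
        const float g = gradients[i];
        state.m[i] = cfg.beta1 * state.m[i] + (1.0f - cfg.beta1) * g;
        state.v[i] = cfg.beta2 * state.v[i] + (1.0f - cfg.beta2) * g * g;

        const float m_hat = state.m[i] / bias1;
        const float v_hat = state.v[i] / bias2;
        weights[i] -= cfg.learning_rate * m_hat / (std::sqrt(v_hat) + cfg.epsilon);

        // Weight clipping: keep the updated weight inside [-clip, +clip].
        weights[i] = std::max(-cfg.clip_magnitude, std::min(cfg.clip_magnitude, weights[i]));
    }
}
```

Clamping after the update (rather than clipping the gradient) bounds the parameters themselves, which is the behavior the term "weight clipping" usually refers to.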
  • v1.5(Apr 22, 2022)

    Changes Since Last Release

    • Encodings and neural networks in tiny-cuda-nn now share the same generic API for differentiable objects. This simplifies implementations significantly.
      • As part of this generalization, encodings and neural networks can now take and produce row- and column-major matrices (i.e. both AoS and SoA data). Additionally, input data may be strided arbitrarily, which permits slicing of input matrices without copying.
    • Added GridEncoding support for double-backward, which is useful for e.g. eikonal supervision (courtesy of @ventusff).
    • Dropped the dependency on PyEXR / tinyexr in the sample applications (using imageio / stb_image instead).
    • Fixed many bugs, added several performance improvements, and improved compatibility with older GPUs.
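The point above about arbitrarily strided input, which permits slicing without copying, can be sketched with a small non-owning view. This is a CPU-side illustration under stated assumptions: MatrixView, Layout, at, and slice_rows are hypothetical names and are not tiny-cuda-nn's matrix types.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical non-owning 2D view over a float buffer (not the library's GPUMatrix).
enum class Layout { RowMajor, ColumnMajor };

struct MatrixView {
    float* data;        // first element of the view
    std::size_t rows;
    std::size_t cols;
    std::size_t stride; // elements between consecutive rows (RowMajor) or columns (ColumnMajor)
    Layout layout;

    // Element access that honors both layout and stride.
    float& at(std::size_t r, std::size_t c) const {
        assert(r < rows && c < cols);
        return layout == Layout::RowMajor ? data[r * stride + c]
                                          : data[c * stride + r];
    }

    // View of rows [first, first + count) that shares storage with the parent.
    // Only the base pointer and the row count change; the stride is kept, so no
    // element is copied even when the slice is not contiguous in memory.
    MatrixView slice_rows(std::size_t first, std::size_t count) const {
        assert(first + count <= rows);
        const std::size_t offset = (layout == Layout::RowMajor) ? first * stride : first;
        return MatrixView{data + offset, count, cols, stride, layout};
    }
};
```

For a column-major matrix, a row slice is non-contiguous: each column of the slice starts `stride` elements after the previous one, which is exactly the case a fixed stride makes expressible without a copy.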
  • v1.4(Feb 14, 2022)

    Changes Since Last Release

    Major Changes

    • Added a PyTorch extension for using tiny-cuda-nn from within Python.
      • This functionality is considered to be in a "beta" state. Please do report any issues you come across!
      • See the corresponding section of the README for installation/usage instructions.
      • Caveat: the overheads of Python/PyTorch can be extensive. For example, the bundled mlp_learning_an_image example is ~2x slower through PyTorch than native CUDA. (This is still faster than implementing everything from scratch in Python, but something to be aware of.)
    • Significantly reduced memory usage (sometimes 3x lower)
      • Added a GPU memory arena that permits efficient, stream-ordered allocation and de-allocation of temporary buffers. This circumvents the need for pre-allocation, resulting in often 3x lower memory consumption.
      • The memory arena uses the GPU's virtual memory mapper to get its performance without invalidating pointers or shuffling memory around.
    • All neural networks in tiny-cuda-nn now additionally support row-major input memory layout. This affords higher performance and lower memory usage when transposition was otherwise required.
      • GridEncoding naturally outputs row-major data and is thus sped up by ~20% when followed by a neural network.
    • tiny-cuda-nn now runs on older GPUs down to compute capability 37.

    Minor Changes

    • Sped up the input gradient computation of GridEncoding by ~3x.
    • Sped up SyncedMultiStream.
    • Fixed incorrect gradients of SphericalHarmonicsEncoding.
    • Fixed incorrect gradients of GridEncoding when max_level arguments were provided or Interpolation::Nearest was used.
  • v1.3(Jan 14, 2022)

    Changes Since Last Release

    Major Changes

    • Adds a new encoding: GridEncoding
    • tiny-cuda-nn now runs on CUDA 10.2 (previously required CUDA 11 and higher)
    • tiny-cuda-nn now only requires C++14 (previously C++17)

    Minor Changes

    • This repository now supports continuous integration builds through GitHub Actions.
    • Added support for 16-neuron-wide FullyFusedMLP.
    • Added support for nesting of SyncedMultiStream.
  • v1.2(Dec 15, 2021)

    Changes Since Last Release

    Major Changes

    • Adds three new encodings: (i) TriangleWave, (ii) SphericalHarmonics, (iii) Composite
    • Pitched pointers are now used to parameterize inputs and outputs of all encodings.
      • This feature enables a new Composite encoding that can apply basic encodings to different subsets of input dimensions.
      • This also removes the distinction of "encoded dims" vs. "passthrough_dims". The old behavior of passing through certain dimensions can be achieved by composing with the Identity encoding.
    • tiny-cuda-nn no longer depends on cuRAND and instead uses an implementation of the PCG32 random number generator (derived from https://github.com/wjakob/pcg32) for all randomness.
    • Activation code has been centralized within and across CUTLASS components. All neural network implementations now support all activation functions (except for the ResNet, which still only supports ReLU activations in its hidden layers).
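For readers unfamiliar with PCG32, the generator mentioned above, the following is a compact sketch of the standard PCG32 (XSH-RR) algorithm as published by its author. The struct and method names are illustrative and are not taken from tiny-cuda-nn or from the linked pcg32 repository; the float conversion is one common choice among several.

```cpp
#include <cassert>
#include <cstdint>

// Sketch of the standard PCG32 (XSH-RR 64/32) generator; names are illustrative.
struct Pcg32 {
    std::uint64_t state = 0x853c49e6748fea9bULL; // reference default state
    std::uint64_t inc = 0xda3e39cb94b95bdbULL;   // reference default stream (must be odd)

    Pcg32() = default;
    Pcg32(std::uint64_t init_state, std::uint64_t stream) { seed(init_state, stream); }

    // Reference seeding procedure: select a stream, then mix in the initial state.
    void seed(std::uint64_t init_state, std::uint64_t stream) {
        state = 0;
        inc = (stream << 1u) | 1u;
        next_uint();
        state += init_state;
        next_uint();
    }

    // Advance the 64-bit LCG and output 32 bits via xorshift followed by a random rotation.
    std::uint32_t next_uint() {
        const std::uint64_t old = state;
        state = old * 6364136223846793005ULL + inc;
        const std::uint32_t xorshifted = static_cast<std::uint32_t>(((old >> 18u) ^ old) >> 27u);
        const std::uint32_t rot = static_cast<std::uint32_t>(old >> 59u);
        return (xorshifted >> rot) | (xorshifted << ((~rot + 1u) & 31u));
    }

    // Unbiased integer in [0, bound) using rejection sampling.
    std::uint32_t next_uint(std::uint32_t bound) {
        assert(bound > 0);
        const std::uint32_t threshold = (~bound + 1u) % bound;
        for (;;) {
            const std::uint32_t r = next_uint();
            if (r >= threshold) return r % bound;
        }
    }

    // Uniform float in [0, 1) built from the top 24 bits, which a float represents exactly.
    float next_float() {
        return static_cast<float>(next_uint() >> 8) * (1.0f / 16777216.0f);
    }
};
```

Because the whole generator is two 64-bit integers and a handful of arithmetic operations, each thread can carry its own instance, which is one plausible reason such a generator is convenient in GPU code.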

    Minor Changes

    • Installed GPUs are now correctly automatically detected and targeted by CMake.
    • Samples and benchmarks can now be disabled when tiny-cuda-nn is used as a submodule.
    • The required CUDA version has been relaxed. Future plans include compatibility with CUDA 10.2.
  • v1.1(Oct 30, 2021)

    Changes Since Last Release

    Major Changes

    • tiny-cuda-nn now supports saving and loading snapshots via Trainer::serialize and Trainer::deserialize. These functions produce a nlohmann::json object containing the trained parameters of the model as well as, optionally, the state of the optimizer (to support continued training).

    The intended way to efficiently store the resulting json blob to disk is:

    std::ofstream f("checkpoint.msgpack", std::ios::out | std::ios::binary);
    json::to_msgpack(trainer->serialize(), f);
    

    and to load it again:

    std::ifstream f{"checkpoint.msgpack", std::ios::in | std::ios::binary};
    trainer->deserialize(json::from_msgpack(f));
    
    • tiny-cuda-nn now supports L1-type losses. Four new losses were added: L1, Relative L1, MAPE (Mean Absolute Percentage Error), and SMAPE (Symmetric Mean Absolute Percentage Error).
    • GPUMatrix has been made much less verbose. Column-major matrices now have the type GPUMatrix<T> and row-major matrices GPUMatrix<T, RM>. We also introduced a dynamically laid out matrix type: GPUMatrixDynamic<T>. As a result, the API for dynamically laid out network outputs is now simplified.
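As a reference for the four L1-type losses named above, here is a small sketch of their per-element definitions. It is an illustration under stated assumptions rather than the library's implementation: the stabilizing constant kEpsilon, the choice to normalize Relative L1 by the prediction, and all function names are assumptions.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Assumed stabilizer that avoids division by zero; the library's value may differ.
constexpr float kEpsilon = 1e-2f;

enum class LossType { L1, RelativeL1, Mape, Smape };

// Per-element loss for one prediction/target pair.
float element_loss(LossType type, float prediction, float target) {
    const float abs_diff = std::fabs(prediction - target);
    switch (type) {
        case LossType::L1:
            return abs_diff;
        case LossType::RelativeL1: // assumed: normalized by the prediction magnitude
            return abs_diff / (std::fabs(prediction) + kEpsilon);
        case LossType::Mape:       // Mean Absolute Percentage Error: normalized by the target
            return abs_diff / (std::fabs(target) + kEpsilon);
        case LossType::Smape:      // Symmetric MAPE: normalized by the mean of both magnitudes
            return abs_diff / (0.5f * (std::fabs(prediction) + std::fabs(target)) + kEpsilon);
    }
    return 0.0f; // unreachable for valid enum values
}

// Mean of the per-element losses over a batch.
float mean_loss(LossType type,
                const std::vector<float>& predictions,
                const std::vector<float>& targets) {
    assert(predictions.size() == targets.size() && !predictions.empty());
    float sum = 0.0f;
    for (std::size_t i = 0; i < predictions.size(); ++i) {
        sum += element_loss(type, predictions[i], targets[i]);
    }
    return sum / static_cast<float>(predictions.size());
}
```

The relative variants differ only in what they divide by, which determines whether over- or under-estimation is penalized more heavily.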

    Minor Changes

    • Extends the functionality of Network/NetworkWithInputEncoding to support features such as extraction of neuron activations or gradients of the output w.r.t. the input.
    • Added Squareplus and Softplus activations to FullyFusedMLP.
    • CMake now automatically detects the GPU architecture of the system, simplifying the compilation process for Turing and A100 GPUs (see updated README.md)
    • Removed data_factor from all losses. To achieve the same behavior, please wrap existing losses in a helper class.
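For reference, the two activations named above can be written as follows. This sketch uses the textbook definitions (Softplus as log(1 + e^x), Squareplus as (x + sqrt(x^2 + b)) / 2 with b = 4 as the assumed default); the functions in FullyFusedMLP may be scaled or parameterized differently, and the function names here are illustrative.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Softplus: log(1 + exp(x)), rewritten so that exp() never overflows for large |x|.
float softplus(float x) {
    return std::max(x, 0.0f) + std::log1p(std::exp(-std::fabs(x)));
}

// Derivative of Softplus is the logistic sigmoid.
float softplus_grad(float x) {
    return 1.0f / (1.0f + std::exp(-x));
}

// Squareplus: an algebraic Softplus-like function that needs no exp/log.
// The hyperparameter b controls curvature near zero (b = 4 is an assumed default).
float squareplus(float x, float b = 4.0f) {
    assert(b > 0.0f);
    return 0.5f * (x + std::sqrt(x * x + b));
}

// Derivative of Squareplus with respect to x.
float squareplus_grad(float x, float b = 4.0f) {
    assert(b > 0.0f);
    return 0.5f * (1.0f + x / std::sqrt(x * x + b));
}
```

Both functions are smooth, strictly positive approximations of ReLU; Squareplus avoids transcendental functions, which can make it cheaper to evaluate.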
Owner
NVIDIA Research Projects
A lightweight C library for artificial neural networks

Getting Started # acquire source code and compile git clone https://github.com/attractivechaos/kann cd kann; make # learn unsigned addition (30000 sam

Attractive Chaos 617 Dec 19, 2022
Convolutional Neural Networks

Darknet Darknet is an open source neural network framework written in C and CUDA. It is fast, easy to install, and supports CPU and GPU computation. D

Joseph Redmon 23.7k Jan 9, 2023
Low dependency(C++11 STL only), good portability, header-only, deep neural networks for embedded

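LKYDeepNN LKYDeepNN A trainable deep neural network (DNN) library. Lightweight: the core depends only on the C++11 standard library, so it has few dependencies, ports easily, and is convenient to use on embedded systems. Class diagram Includes a training-visualization demo program The training-visualization program uses OpenCV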

Lin Kao-Yuan 44 Nov 7, 2022
Raspberry Pi guitar pedal using neural networks to emulate real amps and pedals.

NeuralPi NeuralPi is a guitar pedal using neural networks to emulate real amps and pedals on a Raspberry Pi 4. The NeuralPi software is a VST3 plugin

Keith Bloemer 865 Jan 5, 2023
An Efficient Implementation of Analytic Mesh Algorithm for 3D Iso-surface Extraction from Neural Networks

AnalyticMesh Analytic Marching is an exact meshing solution from neural networks. Compared to standard methods, it completely avoids geometric and top

Jiabao Lei 45 Dec 21, 2022
A header-only C++ library for deep neural networks

MiniDNN MiniDNN is a C++ library that implements a number of popular deep neural network (DNN) models. It has a mini codebase but is fully functional

Yixuan Qiu 336 Dec 22, 2022
InsNet Runs Instance-dependent Neural Networks with Padding-free Dynamic Batching.

InsNet documentation InsNet (documentation) is a powerful neural network library aiming at building instance-dependent computation graphs. It is desig

Chauncey Wang 62 Jan 3, 2023
A framework for generic hybrid two-party computation and private inference with neural networks

MOTION2NX -- A Framework for Generic Hybrid Two-Party Computation and Private Inference with Neural Networks This software is an extension of the MOTI

ENCRYPTO 15 Nov 29, 2022
TS-9 guitar pedal clone using neural networks.

TS-M1N3 TS-M1N3 is a guitar plugin clone of the TS-9 Tubescreamer overdrive pedal. Machine learning was used to train a model of both the drive and to

Keith Bloemer 29 Nov 23, 2022
A library for creating Artificial Neural Networks, for use in Machine Learning and Deep Learning algorithms.

iNeural A library for creating Artificial Neural Networks, for use in Machine Learning and Deep Learning algorithms. What is a Neural Network? Work on

Fatih Küçükkarakurt 5 Apr 5, 2022
A Tool for Verifying Neural Networks using SMT-Based Model Checking

Project Title QNNVerifier Description A Tool for Verifying Neural Networks using SMT-Based Model Checking. Using Frama-C and ESBMC as the backends. Yo

null 2 Dec 11, 2021
CoDi is a cellular automaton model for spiking neural networks

CoDi CoDi is a cellular automaton (CA) model for spiking neural networks (SNNs). CoDi is an acronym for Collect and Distribute, referring to the signa

Jett LaRue 6 May 5, 2022
A GPU (CUDA) based Artificial Neural Network library

Updates - 05/10/2017: Added a new example The program "image_generator" is located in the "/src/examples" subdirectory and was submitted by Ben Bogart

Daniel Frenzel 93 Dec 10, 2022
International Business Machines 10 Dec 20, 2022
Grouped Feedback Delay Networks for Coupled Room Modeling

Grouped Feedback Delay Networks Reverb Plugin GFDNs connect multiple spaces with different T60 characteristics and a parameterized mixing matrix to co

Orchisama Das 28 Dec 5, 2022
Parallel library for approximate inference on discrete Bayesian networks

baylib C++ library Baylib is a parallel inference library for discrete Bayesian networks supporting approximate inference algorithms both in CPU and G

Massimiliano Pronesti 26 Dec 7, 2022
Computer Networks, [email protected], taught by Hong Xu

CSCI4430, Computer Networks (Spring 2022) Administrivia Schedule Lectures: Mon 12:30pm -- 2:15pm, ERB LT (Zoom link) Tue 4:30pm -- 5:15pm, ERB LT (Zoo

Hong Xu 15 Dec 15, 2022
GPU Cloth TOP in TouchDesigner using CUDA-enabled NVIDIA Flex

This project demonstrates how to use NVIDIA FleX for GPU cloth simulation in a TouchDesigner Custom Operator. It also shows how to render dynamic meshes from the texture data using custom PBR GLSL material shaders inside TouchDesigner.

Vinícius Ginja 37 Jul 27, 2022