A toolkit for making real world machine learning and data analysis applications in C++


dlib C++ library

Dlib is a modern C++ toolkit containing machine learning algorithms and tools for creating complex software in C++ to solve real world problems. See http://dlib.net for the main project documentation and API reference.

Compiling dlib C++ example programs

Go into the examples folder and type:

mkdir build; cd build; cmake .. ; cmake --build .

That will build all the examples. If you have a CPU that supports AVX instructions then turn them on like this:

mkdir build; cd build; cmake .. -DUSE_AVX_INSTRUCTIONS=1; cmake --build .

Doing so will make some things run faster.

Finally, Visual Studio users should usually do everything in 64-bit mode. By default Visual Studio is 32-bit, both in its outputs and its own execution, so you have to explicitly tell it to use 64 bits. Since it's not the 1990s anymore you probably want to use 64 bits. Do that with a cmake invocation like this:

cmake .. -G "Visual Studio 14 2015 Win64" -T host=x64 

Compiling your own C++ programs that use dlib

The examples folder has a CMake tutorial that tells you what to do. There are also additional instructions on the dlib web site.

Alternatively, if you are using the vcpkg dependency manager you can download and install dlib with CMake integration in a single command:

vcpkg install dlib

Compiling dlib Python API

Before you can run the Python example programs you must compile dlib. Type:

python setup.py install

Running the unit test suite

Type the following to compile and run the dlib unit test suite:

cd dlib/test
mkdir build
cd build
cmake ..
cmake --build . --config Release
./dtest --runall

Note that on Windows your compiler might put the test executable in a subfolder called Release. If that's the case then you have to go to that folder before running the test.

This library is licensed under the Boost Software License, which can be found in dlib/LICENSE.txt. The long and short of the license is that you can use dlib however you like, even in closed source commercial software.

dlib sponsors

This research is based in part upon work supported by the Office of the Director of National Intelligence (ODNI), Intelligence Advanced Research Projects Activity (IARPA) under contract number 2014-14071600010. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of ODNI, IARPA, or the U.S. Government.

  • YOLO loss

    Hi, I've been spending the last few days trying to make a loss_yolo layer for dlib, in particular the loss presented in the YOLOv3 paper.

    I think I came up with a pretty straightforward implementation but, as of now, it still does not work.

    I wondered if you could have a look. I am quite confident the loss implementation is correct, however I think I might be making some assumptions about the dlib API when the loss layer takes several inputs from the network.

    I tried to make the loss similar to the loss_mmod in the way you set the options of the layer, etc.

    So, my question is, does this way of coding the loss in dlib make sense for multiple outputs? Or is dlib doing something I don't expect?

    There's also a simple example program that takes a path containing a training.xml file (like the one from the face or vehicle detection examples).

    Thanks in advance :)

    opened by arrufat 102
  • QUESTION : is yolov3 possible in DLIB

    I am trying to define yolov3 using dlib's dnn module. I'm stuck with the darknet53 backbone, as I want it to output the outputs of the last three layers. So far I have this:

    using namespace dlib;
    template <int outc, int kern, int stride, typename SUBNET> 
    using conv_block = leaky_relu<affine<con<outc,kern,kern,stride,stride,SUBNET>>>;
    template <int inc, typename SUBNET>
    using resblock = add_prev1<conv_block<inc,3,1,conv_block<inc/2,1,1,tag1<SUBNET>>>>;
    template<int nblocks, int outc, typename SUBNET>
    using conv_resblock = repeat<nblocks, resblock<outc,
                          conv_block<outc, 3, 2, SUBNET>>>;
    template<typename SUBNET>
    using darknet53 = tag3<conv_resblock<4, 1024,
                      tag2<conv_resblock<8, 512,
                      tag1<conv_resblock<8, 256,
                      conv_resblock<2, 128,
                      conv_resblock<1, 64,
                      conv_block<32, 3, SUBNET

    Is it possible for darknet53 to output tag1, tag2 and tag3?

    opened by pfeatherstone 83
  • Optimize dlib for POWER8 VSX

    Enable and optimize support for POWER8 VSX SIMD instructions on PPC64LE Linux in dlib/simd.

    $$$$ Financial bounties available. Any reasonable suggested value will be seriously considered.

    I welcome contact / replies from developers in the dlib community who are interested to work on this project.

    opened by edelsohn 79
  • Example of DCGAN

    Hi, I would like to contribute a DCGAN example to dlib.

    I have implemented a version of Pytorch DCGAN for C++.

    However, I would need some guidance with some things I don't know how to do. I am wondering on how I should proceed. Should I attach my current code here (around 150 lines), or make a pull request, even if the code is not able to learn anything? Maybe @edubois can help out, since he stated that he managed to make it work on https://github.com/davisking/dlib/issues/1261

    Thanks for your hard work on dlib.

    opened by arrufat 67
  • Arbitrary sized FFTs using modified kissFFT as default backend and MKL otherwise

    This PR adds a C++ port of kissFFT as the default backend of dlib's fft routines. All the kiss code is inserted into the dlib namespace to avoid conflicts with user code that may or may not be using the original (C) kissFFT. All MKL fft code is put into its own separate header file. Now dlib/matrix_fft.h simply calls the right wrappers depending on a pre-processor macro (as before). I've removed the FFTW wrapper code as per @davisking's request, as the fftw planner isn't thread safe. I've also removed the original fft backend code, again as per @davisking's request.

    Note that this PR isn't quite ready yet. Need more unit tests. Not quite sure why MKL wrappers only worked for matrix<complex<double>> before. There's no reason why it couldn't work for matrix<complex<float>> if using DFTI_SINGLE instead of DFTI_DOUBLE (maybe user code had to do a cast?). So need more unit tests for MKL wrappers.

    Not quite happy with the number of copies at the moment. When passing const matrix_exp<EXP>& data as input, there are 2 copies: one to evaluate the matrix expression to matrix<typename EXP::type> and one to do either the in-place fft (copy is done in the implementation details of the fft functions) or out-of-place fft (copy is done explicitly in dlib/matrix_fft.h). For example, if you want to do the fft of std::vector<std::complex<float>>, you have to use dlib::mat, which returns a matrix expression, then dlib::fft will evaluate that to dlib::matrix<std::complex<float>>. So there is a copy of std::vector to dlib::matrix that is unnecessary. Could add some function overloads to support std::vector. Don't know what the best thing to do is without dirtying the API. Could benefit from @davisking's advice.

    I think if we've gone this far, we should support real FFTs so the requirement that input should be complex vanishes. Output will be complex though. Both KISS and MKL support this so there shouldn't be a lot of leg work. More unit tests.

    opened by pfeatherstone 56
  • Add dnn self supervised learning example

    Hi, recently I've been interested in self-supervised learning (SSL) methods in deep learning. I noticed that dlib has support for unsupervised loss layers by just leaving out the training_label_type typedef and the truth iterator argument to compute_loss_value_and_gradient().

    However, most methods I've read about and tried are quite complicated (hard-negative mining, predictors, stop gradient, exponential moving average of the weights between both architectures) due to the need of breaking the symmetry between both branches to avoid collapse. But recently, I came across a simple method named Barlow Twins: Self-Supervised Learning via Redundancy Reduction. The idea is as follows:

    1. forward two augmented versions of an image
    2. compute the empirical cross-correlation matrix between both feature representations
    3. make that matrix as close to the identity as possible.

    That prevents collapse, since it makes each individual dimension of the feature vectors focus on a different task, and thus avoids redundancies between dimensions (it needs a high dimension for it to work, since it relies on sparse representations).
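The three steps above can be sketched in plain C++ without dlib. This is only an illustration under stated assumptions: the names feature_batch, standardize, cross_correlation and barlow_twins_loss are hypothetical, the features are standardized over the batch before correlating, and the loss is the sum of squared diagonal deviations from 1 plus lambda times the sum of squared off-diagonal entries. It is not the loss layer from this PR.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// batch[b][i] is feature i of sample b (hypothetical layout for this sketch).
using feature_batch = std::vector<std::vector<double>>;

// Standardize every feature dimension over the batch: zero mean, unit variance.
feature_batch standardize(const feature_batch& z, double eps = 1e-12)
{
    assert(!z.empty());
    const std::size_t n = z.size(), d = z.front().size();
    feature_batch out(n, std::vector<double>(d));
    for (std::size_t i = 0; i < d; ++i)
    {
        double mean = 0, var = 0;
        for (std::size_t b = 0; b < n; ++b) mean += z[b][i];
        mean /= n;
        for (std::size_t b = 0; b < n; ++b) var += (z[b][i] - mean) * (z[b][i] - mean);
        var /= n;
        const double inv_std = 1.0 / std::sqrt(var + eps);
        for (std::size_t b = 0; b < n; ++b) out[b][i] = (z[b][i] - mean) * inv_std;
    }
    return out;
}

// Empirical cross-correlation matrix between two standardized views:
// c[i][j] = (1/n) * sum_b za[b][i] * zb[b][j]
feature_batch cross_correlation(const feature_batch& za, const feature_batch& zb)
{
    assert(!za.empty() && za.size() == zb.size());
    const std::size_t n = za.size(), d = za.front().size();
    feature_batch c(d, std::vector<double>(d, 0.0));
    for (std::size_t b = 0; b < n; ++b)
        for (std::size_t i = 0; i < d; ++i)
            for (std::size_t j = 0; j < d; ++j)
                c[i][j] += za[b][i] * zb[b][j];
    for (auto& row : c)
        for (auto& v : row) v /= n;
    return c;
}

// Push the diagonal towards 1 and the off-diagonal terms towards 0.
double barlow_twins_loss(const feature_batch& c, double lambda)
{
    double on_diag = 0, off_diag = 0;
    for (std::size_t i = 0; i < c.size(); ++i)
        for (std::size_t j = 0; j < c.size(); ++j)
        {
            if (i == j) on_diag += (1 - c[i][j]) * (1 - c[i][j]);
            else        off_diag += c[i][j] * c[i][j];
        }
    return on_diag + lambda * off_diag;
}
```

When both views produce identical, decorrelated features, the matrix is the identity and the loss is (numerically) zero. When one view duplicates a feature across dimensions, both the diagonal and off-diagonal terms are penalized.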

    So far, I've implemented:


    It takes a std::vector<std::pair<matrix<rgb_pixel>, matrix<rgb_pixel>>> and puts the first elements of the pairs in the first half of the batch and the second elements of the pairs in the second half of the batch. This allows computing batch normalization on each half efficiently (done on the loss layer).
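The batch layout described here can be illustrated with a small generic helper. This is a minimal sketch, assuming a flat std::vector as the batch; pairs_to_batch is a hypothetical name and does not correspond to the input layer in this PR, which writes into a tensor instead.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Lay out a list of pairs as one batch: all first elements, then all second
// elements, so that sample i of the first half matches sample i of the second.
template <typename T>
std::vector<T> pairs_to_batch(const std::vector<std::pair<T, T>>& pairs)
{
    std::vector<T> batch;
    batch.reserve(2 * pairs.size());
    for (const auto& p : pairs) batch.push_back(p.first);   // first half
    for (const auto& p : pairs) batch.push_back(p.second);  // second half
    return batch;
}
```

With this layout, each half of the batch is a contiguous range, so per-view statistics can be computed over batch[0, n) and batch[n, 2n) without any gathering.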


    I tried to follow the official paper as closely as possible, and made use of the awesome Matrix Calculus site for the gradients (having element wise operations, (off-)diagonal terms, and summations was a bit tedious to do manually 😅)

    But… I am experiencing some difficulties:

    • if I use a dnn_trainer I get a segfault (but if I train manually, like in the code, it works: at least the loss goes down)
    • if I use a batch size larger than 64, I also get a segfault (but I have plenty of RAM/VRAM)

    So maybe I misunderstood something about how to implement the input layer or the unsupervised loss layer… If you could have a look at some point… You can just run the example by giving a path to a folder containing the CIFAR-10 dataset.

    It would be great to have an example of SSL on dlib.

    Thanks in advance (all the code is inside the example program; I will make a proper PR if we manage to get this to work and you are interested, since the method is fairly new)

    opened by arrufat 54
  • Add support for fused convolutions

    I've been playing a bit with the idea of having fused convolutions (convolution + batch_norm) in dlib. I think the first step would be to move all the operations that are done by the affine_ layer into the convolution, that is, update the bias of the convolution and re-scale the filters.
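The arithmetic behind this folding can be sketched without dlib. The following is a simplified illustration, not the helper methods from this PR: each output filter is modelled as a dot product over a flattened input patch, and the names linear_filters, apply, affine and fuse_affine are hypothetical. It assumes a per-filter affine transform y' = gamma[n] * y + beta[n], which gives fused weights gamma[n] * W[n] and a fused bias gamma[n] * b[n] + beta[n]. When the convolution's bias is zero, the fused bias reduces to beta[n].

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct linear_filters
{
    std::vector<std::vector<float>> weights;  // weights[n]: flattened filter n
    std::vector<float> biases;                // biases[n]: bias of filter n
};

// Response of every filter to one flattened input patch.
std::vector<float> apply(const linear_filters& f, const std::vector<float>& patch)
{
    std::vector<float> out(f.weights.size());
    for (std::size_t n = 0; n < f.weights.size(); ++n)
    {
        assert(f.weights[n].size() == patch.size());
        float sum = f.biases[n];
        for (std::size_t i = 0; i < patch.size(); ++i)
            sum += f.weights[n][i] * patch[i];
        out[n] = sum;
    }
    return out;
}

// Per-channel affine transform, as applied after the convolution.
std::vector<float> affine(std::vector<float> y,
                          const std::vector<float>& gamma,
                          const std::vector<float>& beta)
{
    for (std::size_t n = 0; n < y.size(); ++n)
        y[n] = gamma[n] * y[n] + beta[n];
    return y;
}

// Fold the affine transform into the filters so that apply() alone gives
// the same result as apply() followed by affine().
void fuse_affine(linear_filters& f,
                 const std::vector<float>& gamma,
                 const std::vector<float>& beta)
{
    for (std::size_t n = 0; n < f.weights.size(); ++n)
    {
        for (auto& w : f.weights[n]) w *= gamma[n];    // rescale the filter
        f.biases[n] = gamma[n] * f.biases[n] + beta[n]; // update the bias
    }
}
```

After fusing, the affine step can become a no-op (gamma = 1, beta = 0), which is the same idea the question about a pass-through layer is getting at.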

    This PR adds some helper methods that allow doing this. The next step could be adding a new layer that can be constructed from an affine_ layer and it's a no-op, like the tag layers, or add a version of the affine layer that does nothing (just outputs its input, without copying or anything). How would you approach this?

    Finally, here's an example that uses a visitor to update the convolutions that are below an affine layer. It can be built by putting the file into the examples folder and loading the pretrained ResNet-50 from dnn_introduction3_ex.cpp. If we manage to make something interesting out of it, maybe it would be worth having this visitor, too.

    #include "resnet.h"
    #include <dlib/dnn.h>
    #include <dlib/image_io.h>

    using namespace std;
    using namespace dlib;

    class visitor_fuse_convolutions
    {
    public:
        template <typename T> void fuse_convolutions(T&) const
        {
            // disable other layer types
        }

        // handle the standard case (convolutional layer followed by affine)
        template <long nf, long nr, long nc, int sy, int sx, int py, int px, typename U, typename E>
        void fuse_convolutions(add_layer<affine_, add_layer<con_<nf, nr, nc, sy, sx, py, px>, U>, E>& l)
        {
            // get the parameters from the affine layer as alias_tensor_instance
            auto gamma = l.layer_details().get_gamma();
            auto beta = l.layer_details().get_beta();

            // get the convolution below the affine layer and its parameters
            auto& conv = l.subnet().layer_details();
            const long num_filters_out = conv.num_filters();
            const long num_rows = conv.nr();
            const long num_cols = conv.nc();
            tensor& params = conv.get_layer_params();

            // guess the number of input filters
            long num_filters_in;
            if (conv.bias_is_disabled())
                num_filters_in = params.size() / num_filters_out / num_rows / num_cols;
            else
                num_filters_in = (params.size() - num_filters_out) / num_filters_out / num_rows / num_cols;

            // set the new number of parameters for this convolution
            const size_t num_params = num_filters_in * num_filters_out * num_rows * num_cols + num_filters_out;
            alias_tensor filters(num_filters_out, num_filters_in, num_rows, num_cols);
            alias_tensor biases(1, num_filters_out);
            if (conv.bias_is_disabled())
            {
                resizable_tensor new_params = params;
                biases(new_params, filters.size()) = 0;
                params = new_params;
            }

            // update the biases
            auto b = biases(params, filters.size());
            b += mat(beta);

            // rescale the filters
            DLIB_CASSERT(filters.num_samples() == gamma.k());
            auto t = filters(params, 0);
            float* f = t.host();
            const float* g = gamma.host();
            for (long n = 0; n < filters.num_samples(); ++n)
                for (long k = 0; k < filters.k(); ++k)
                    for (long r = 0; r < filters.nr(); ++r)
                        for (long c = 0; c < filters.nc(); ++c)
                            f[tensor_index(t, n, k, r, c)] *= g[n];

            // reset the affine layer
            gamma = 1;
            beta = 0;
        }

        template <typename input_layer_type>
        void operator()(size_t , input_layer_type& l) const
        {
            // ignore other layers
        }

        template <typename T, typename U, typename E>
        void operator()(size_t , add_layer<T, U, E>& l)
        {
            fuse_convolutions(l);
        }
    };

    int main(const int argc, const char** argv)
    try
    {
        resnet::infer_50 net1, net2;
        std::vector<std::string> labels;
        deserialize("resnet50_1000_imagenet_classifier.dnn") >> net1 >> labels;
        net2 = net1;
        matrix<rgb_pixel> image;
        load_image(image, "elephant.jpg");
        const auto& label1 = labels[net1(image)];
        const auto& out1 = net1.subnet().get_output();
        resizable_tensor probs(out1);
        tt::softmax(probs, out1);
        cout << "pred1: " << label1 << " (" << max(mat(probs)) << ")" << endl;

        // fuse the convolutions in the network
        dlib::visit_layers_backwards(net2, visitor_fuse_convolutions());
        const auto& label2 = labels[net2(image)];
        const auto& out2 = net2.subnet().get_output();
        tt::softmax(probs, out2);
        cout << "pred2: " << label2 << " (" << max(mat(probs)) << ")" << endl;

        cout << "max abs difference: " << max(abs(mat(out1) - mat(out2))) << endl;
        DLIB_CASSERT(max(abs(mat(out1) - mat(out2))) < 1e-2);
    }
    catch (const exception& e)
    {
        cout << e.what() << endl;
        return EXIT_FAILURE;
    }

    Output with this image (elephant.jpg):

    pred1: African_elephant (0.962677)
    pred2: African_elephant (0.962623)
    max abs difference: 0.00436211

    UPDATE: make the visitor more generic and show results with a real image

    opened by arrufat 53
  • Semantic Segmentation Functionality

    I'm interested in using dlib for semantic segmentation. I think the only necessary features would be:

    • Loss function - it doesn't look like loss_multiclass_log would work for this
    • "Unpooling" upsampling layer - there are a few ways to do this

    Is this something you're interested in supporting? I'm planning to add this functionality regardless, just wanted to check in.

    Regarding implementation, would the upsampling be something you would want added to the pooling class or its own upsample class?

    enhancement help wanted 
    opened by davidmascharka 53
  • Adding an install target to dlib's CMakeLists

    This PR covers the first part of items discussed in #34, namely, the installation.

    A follow-up PR (hopefully tomorrow) should cover the 2nd part, namely, the generation and installation of dlibConfig.cmake

    opened by severin-lemaignan 51
  • Trying to compile dlib 19.20 with cuda 11 and cudnn 8, or cuda 10.1 and cudnn 7.6.4

    Environment:

    • Pop!_OS (Ubuntu 20.04)
    • GCC 9 and 8 (tried both)
    • CMake 3.16.3
    • dlib 19.20.99
    • Python 3.8.2
    • NVIDIA Quadro P600

    Expected Behavior

    Compiling with cuda..

    Current Behavior

    .....
    -- Found cuDNN: /usr/lib/x86_64-linux-gnu/libcudnn.so
    -- Building a CUDA test project to see if your compiler is compatible with CUDA...
    -- Checking if you have the right version of cuDNN installed.
    -- *** Found cuDNN, but it looks like the wrong version so dlib will not use it. ***
    -- *** Dlib requires cuDNN V5.0 OR GREATER. Since cuDNN is not found DLIB WILL NOT USE CUDA.
    .....

    • Where did you get dlib: git clone https://github.com/davisking/dlib.git

    I first tried downloading the latest (and suggested) version of CUDA from the Nvidia site, which was CUDA 11, then downloaded cuDNN for CUDA 11, which was 8.0.0 (runtime and dev .deb packages), and installed them following the Nvidia method.

    When compiling "cudnn_samples_v8" everything works, so I think the installations went OK, but there is no way to get dlib compiled with CUDA.

    I've tried to uninstall CUDA 11 and cuDNN 8 and install CUDA 10.1 and cuDNN 7.6 (suggested for CUDA 10.1), but the result is the same. Every time, I erased the build directory and compiled in 2 ways, either as per the dlib instructions, using:

    $ sudo python3 setup.py install

    or:

    $ cmake .. -DDLIB_USE_CUDA=1 -DUSE_AVX_INSTRUCTIONS=1

    Any suggestion? Thanks.

    opened by spiderdab 48
  • Arbitrary FFT sizes

    Feature suggestion: allow the fft routine to do arbitrary sized FFTs. Let FFTW, lapack, BLAS or whatever do the magic of choosing the correct kernels for the input
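To make "arbitrary sized" concrete, here is a direct evaluation of the discrete Fourier transform definition, which works for any length, including odd and prime sizes. It is only a reference sketch: naive_dft is a hypothetical name, the cost is O(N^2), and it is not how dlib, FFTW or kissFFT compute the transform. A routine like this is mainly useful as a ground truth when unit testing a fast backend on non power-of-two sizes.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// X[k] = sum_t x[t] * exp(-2*pi*i*k*t/N), evaluated directly for any N.
std::vector<std::complex<double>> naive_dft(const std::vector<std::complex<double>>& x)
{
    const std::size_t n = x.size();
    const double pi = std::acos(-1.0);
    std::vector<std::complex<double>> X(n);
    for (std::size_t k = 0; k < n; ++k)
    {
        std::complex<double> sum(0.0, 0.0);
        for (std::size_t t = 0; t < n; ++t)
        {
            // reduce k*t modulo n so the angle stays in [0, 2*pi)
            const double angle = -2.0 * pi * static_cast<double>((k * t) % n) / static_cast<double>(n);
            sum += x[t] * std::polar(1.0, angle);
        }
        X[k] = sum;
    }
    return X;
}
```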

    opened by pfeatherstone 47
  • ValueError: path 'dlib/CMakeLists.txt/' cannot end with '/'

    I am installing GPU accelerated dlib. My build commands are:

    git clone https://github.com/davisking/dlib.git
    cd dlib
    mkdir build
    cd build
    cmake .. -DDLIB_USE_CUDA=1 -DUSE_AVX_INSTRUCTIONS=1
    cmake --build .
    cd ..
    python setup.py install --set USE_AVX_INSTRUCTIONS=1 --set DLIB_USE_CUDA=1

    The environment is Anaconda, and the Python version is Python 3.6.13 :: Anaconda, Inc.

    I encountered an error: ValueError: path 'dlib/CMakeLists.txt/' cannot end with '/' DlibInstallError

    I do not know how to fix this. I need your help.

    opened by Azheng-tora 0
  • Type erasure tooling

    Still working on dnn2, was implementing storage policies for type erasure and thought it was sufficiently general and useful that it merited its own PR. I've applied it to dlib::any as an example. We could trivially implement dlib::basic_any with the storage policy as a template type if we wanted. Don't care but why not.

    opened by pfeatherstone 7
  • Failure to build with py11

    Expected Behavior

    dlib should be able to build with python 3.11

    Current Behavior

    dlib failed to build with the newest version of Python, 3.11. build.log

    Steps to Reproduce

    Attempt to build dlib with python 3.11. Provided log above to see the failure.

    • Version: 19.24
    • Where did you get dlib: dlib.net via https://src.fedoraproject.org/rpms/dlib/
    • Platform: Fedora Linux 36
    • Compiler: gcc 12.2.1
    opened by luyatshimbalanga 5
  • Extra Random Forest (aka Extremely Randomized Trees) with dlib

    Just wondering if anyone knows how to transform the dlib Random Forest Regressor into the "Extra Random" (aka "ExtraTreesRegressor", "Extremely Randomized Trees", "Extra Random Forest", etc) version of the algorithm? Will probably post here when I figure it out, if no one answers.

    opened by Sheldonfrith 2
  • Attempting to modernise the cmake build script

    This PR requires CMake 3.17 or higher. Maybe this is too new. If anything, this is a good exercise in trying to understand cmake in more detail. The key points are:

    • use the target_xxx() functions
    • don't need variables for needed libraries, includes or sources. You can just use cmake functions to set things on targets
    • use the built-in helpers for finding libraries like PNG, JPEG, GIF, SQLITE, even BLAS, LAPACK and CUDA

    It's not ready yet of course.
    opened by pfeatherstone 12
Davis E. King
A C++ standalone library for machine learning

Flashlight: Fast, Flexible Machine Learning in C++ Quickstart | Installation | Documentation Flashlight is a fast, flexible machine learning library w

Facebook Research 4.5k Sep 25, 2022
mlpack: a scalable C++ machine learning library --

a fast, flexible machine learning library Home | Documentation | Doxygen | Community | Help | IRC Chat Download: current stable version (3.4.2) mlpack

mlpack 4.1k Sep 19, 2022
Flashlight is a C++ standalone library for machine learning

Flashlight is a fast, flexible machine learning library written entirely in C++ from the Facebook AI Research Speech team and the creators of Torch and Deep Speech.

null 4.5k Sep 18, 2022
Edge ML Library - High-performance Compute Library for On-device Machine Learning Inference

Edge ML Library (EMLL) offers optimized basic routines like general matrix multiplications (GEMM) and quantizations, to speed up machine learning (ML) inference on ARM-based devices. EMLL supports fp32, fp16 and int8 data types. EMLL accelerates on-device NMT, ASR and OCR engines of Youdao, Inc.

NetEase Youdao 176 Jul 21, 2022
Machine Learning Framework for Operating Systems - Brings ML to Linux kernel

File systems and Storage Lab (FSL) 182 Aug 26, 2022
ML++ - A library created to revitalize C++ as a machine learning front end

ML++ Machine learning is a vast and exciting discipline, garnering attention from specialists of many fields. Unfortunately, for C++ programmers and

marc 1k Sep 15, 2022
A Modern C++ Data Sciences Toolkit

MeTA: ModErn Text Analysis Please visit our web page for information and tutorials about MeTA! Build Status (by branch) master: develop: Outline Intro

null 647 Sep 19, 2022
A RGB-D SLAM system for structural scenes, which makes use of point-line-plane features and the Manhattan World assumption.

This repo proposes a RGB-D SLAM system specifically designed for structured environments and aimed at improved tracking and mapping accuracy by relying on geometric features that are extracted from the surrounding.

Yanyan Li 242 Sep 21, 2022
Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow

eXtreme Gradient Boosting Community | Documentation | Resources | Contributors | Release Notes XGBoost is an optimized distributed gradient boosting l

Distributed (Deep) Machine Learning Community 23.2k Sep 24, 2022
Samsung Washing Machine replacing OS control unit

hacksung Samsung Washing Machine WS1702 replacing OS control unit More info at https://www.hackster.io/roni-bandini/dead-washing-machine-returns-to-li

null 24 May 12, 2022
R2LIVE is a robust, real-time tightly-coupled multi-sensor fusion framework, which fuses the measurement from the LiDAR, inertial sensor, visual camera to achieve robust, accurate state estimation.

HKU-Mars-Lab 576 Sep 5, 2022
Caffe: a fast open framework for deep learning.

Caffe Caffe is a deep learning framework made with expression, speed, and modularity in mind. It is developed by Berkeley AI Research (BAIR)/The Berke

Berkeley Vision and Learning Center 32.9k Sep 22, 2022
RNNLIB is a recurrent neural network library for sequence learning problems. Forked from Alex Graves work http://sourceforge.net/projects/rnnl/

Origin The original RNNLIB is hosted at http://sourceforge.net/projects/rnnl while this "fork" is created to repeat results for the online handwriting

Sergey Zyrianov 875 Sep 6, 2022
Frog is an integration of memory-based natural language processing (NLP) modules developed for Dutch. All NLP modules are based on Timbl, the Tilburg memory-based learning software package.

Frog - A Tagger-Lemmatizer-Morphological-Analyzer-Dependency-Parser for Dutch Copyright 2006-2020 Ko van der Sloot, Maarten van Gompel, Antal van den

Language Machines 70 Aug 24, 2022
oneAPI Data Analytics Library (oneDAL)

Intel® oneAPI Data Analytics Library Installation | Documentation | Support | Examples | Samples | How to Contribute Intel® oneAPI Data Analytics Libr

oneAPI-SRC 518 Sep 13, 2022
SSL_SLAM2: Lightweight 3-D Localization and Mapping for Solid-State LiDAR (mapping and localization separated) ICRA 2021

SSL_SLAM2 Lightweight 3-D Localization and Mapping for Solid-State LiDAR (Intel Realsense L515 as an example) This repo is an extension work of SSL_SL

Wang Han 王晗 322 Sep 9, 2022