mlpack is an intuitive, fast, and flexible C++ machine learning library with bindings to other languages. It is meant to be a machine learning analog to LAPACK, and aims to implement a wide array of machine learning methods and functions as a "swiss army knife" for machine learning researchers. In addition to its powerful C++ interface, mlpack also provides command-line programs, Python bindings, Julia bindings, Go bindings and R bindings.

mlpack uses an open governance model and is fiscally sponsored by NumFOCUS. Consider making a tax-deductible donation to help the project pay for developer time, professional services, travel, workshops, and a variety of other needs.

0. Contents

  1. Introduction
  2. Citation details
  3. Dependencies
  4. Building mlpack from source
  5. Running mlpack programs
  6. Using mlpack from Python
  7. Further documentation
  8. Bug reporting

1. Introduction

The mlpack website contains numerous tutorials and extensive documentation. This README serves as a guide for what mlpack is, how to install it, how to run it, and where to find more documentation; consult the website for further information.

2. Citation details

If you use mlpack in your research or software, please cite mlpack using the citation below (given in BibTeX format):

    title     = {mlpack 3: a fast, flexible machine learning library},
    author    = {Curtin, Ryan R. and Edel, Marcus and Lozhnikov, Mikhail and
                 Mentekidis, Yannis and Ghaisas, Sumedh and Zhang,
    journal   = {Journal of Open Source Software},
    volume    = {3},
    issue     = {26},
    pages     = {726},
    year      = {2018},
    doi       = {10.21105/joss.00726},
    url       = {}

Citations are beneficial for the growth and improvement of mlpack.

3. Dependencies

mlpack has the following dependencies:

  Armadillo      >= 8.400.0
  Boost (math_c99, spirit) >= 1.58.0
  CMake          >= 3.2.2
  ensmallen      >= 2.10.0
  cereal         >= 1.1.2

All of those should be available in your distribution's package manager. If not, you will have to compile each of them by hand. See the documentation for each of those packages for more information.
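When checking an installed package against the minimums above, note that dotted versions compare component by component, not as plain strings (as strings, "3.10.2" would sort before "3.2.2"). The following is a minimal sketch of such a check. The helper names are hypothetical and not part of mlpack or CMake, and the installed versions are made up for illustration.

```python
def parse_version(text):
    """Turn a dotted version such as '8.400.0' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum."""
    a, b = parse_version(installed), parse_version(minimum)
    # Pad the shorter tuple with zeros so that '1.58' compares equal to '1.58.0'.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


# Minimum versions copied from the dependency list above.
MINIMUMS = {
    "Armadillo": "8.400.0",
    "Boost": "1.58.0",
    "CMake": "3.2.2",
    "ensmallen": "2.10.0",
    "cereal": "1.1.2",
}

# Hypothetical installed versions, for illustration only.
installed = {
    "Armadillo": "9.800.3",
    "Boost": "1.58",
    "CMake": "3.10.2",
    "ensmallen": "2.10.0",
    "cereal": "1.1.0",
}

for name, minimum in MINIMUMS.items():
    status = "ok" if meets_minimum(installed[name], minimum) else "too old"
    print(f"{name:10s} {installed[name]:>9s} (need >= {minimum}): {status}")
```

In practice CMake performs these checks itself during configuration; the sketch only shows how the ">=" requirements are meant to be read.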

If you would like to use or build the mlpack Python bindings, make sure that the following Python packages are installed:

  cython >= 0.24
  pandas >= 0.15.0

If you would like to build the Julia bindings, make sure that Julia >= 1.3.0 is installed.

If you would like to build the Go bindings, make sure that Go >= 1.11.0 is installed with this package:


If you would like to build the R bindings, make sure that R >= 4.0 is installed with these R packages:

  Rcpp >= 0.12.12
  RcppArmadillo >= 0.8.400.0
  RcppEnsmallen >=
  BH >= 1.58

If the STB library headers are available, image loading support will be compiled.

If you are compiling Armadillo by hand, ensure that LAPACK and BLAS are enabled.

4. Building mlpack from source

This document discusses how to build mlpack from source. These build directions will work for any Linux-like shell environment (for example Ubuntu, macOS, FreeBSD etc). However, mlpack is in the repositories of many Linux distributions and so it may be easier to use the package manager for your system. For example, on Ubuntu, you can install the mlpack library and command-line executables (e.g. mlpack_pca, mlpack_kmeans etc.) with the following command:

$ sudo apt-get install libmlpack-dev mlpack-bin

On Fedora or Red Hat (EPEL):

$ sudo dnf install mlpack-devel mlpack-bin

Note: Older Ubuntu versions may not have the most recent version of mlpack available; for instance, at the time of this writing, Ubuntu 16.04 only has mlpack 3.4.2 available. Options include upgrading your Ubuntu version, finding a PPA or other non-official source, or installing with a manual build.

mlpack uses CMake as a build system and allows several flexible build configuration options. You can consult any of the CMake tutorials for further documentation, but this tutorial should be enough to get mlpack built and installed.

First, unpack the mlpack source and change into the unpacked directory. Here we use mlpack-x.y.z where x.y.z is the version.

$ tar -xzf mlpack-x.y.z.tar.gz
$ cd mlpack-x.y.z

Then, make a build directory. The directory can have any name, but 'build' is sufficient.

$ mkdir build
$ cd build

The next step is to run CMake to configure the project. Running CMake is equivalent to running ./configure with autotools. If you run CMake with no options, it will configure the project to build with no debugging symbols and no profiling information:

$ cmake ../

Options can be specified to compile with debugging information and profiling information:

$ cmake -D DEBUG=ON -D PROFILE=ON ../

Options are specified with the -D flag. The allowed options include:

DEBUG=(ON/OFF): compile with debugging symbols
PROFILE=(ON/OFF): compile with profiling symbols
ARMA_EXTRA_DEBUG=(ON/OFF): compile with extra Armadillo debugging symbols
BOOST_ROOT=(/path/to/boost/): path to root of boost installation
ARMADILLO_INCLUDE_DIR=(/path/to/armadillo/include/): path to Armadillo headers
ARMADILLO_LIBRARY=(/path/to/armadillo/ Armadillo library
BUILD_CLI_EXECUTABLES=(ON/OFF): whether or not to build command-line programs
BUILD_PYTHON_BINDINGS=(ON/OFF): whether or not to build Python bindings
PYTHON_EXECUTABLE=(/path/to/python_version): Path to specific Python executable
PYTHON_INSTALL_PREFIX=(/path/to/python/): Path to root of Python installation
BUILD_JULIA_BINDINGS=(ON/OFF): whether or not to build Julia bindings
JULIA_EXECUTABLE=(/path/to/julia): Path to specific Julia executable
BUILD_GO_BINDINGS=(ON/OFF): whether or not to build Go bindings
GO_EXECUTABLE=(/path/to/go): Path to specific Go executable
BUILD_GO_SHLIB=(ON/OFF): whether or not to build shared libraries required by Go bindings
BUILD_R_BINDINGS=(ON/OFF): whether or not to build R bindings
R_EXECUTABLE=(/path/to/R): Path to specific R executable
BUILD_TESTS=(ON/OFF): whether or not to build tests
BUILD_SHARED_LIBS=(ON/OFF): compile shared libraries as opposed to
   static libraries
DISABLE_DOWNLOADS=(ON/OFF): whether to disable all downloads during build
DOWNLOAD_ENSMALLEN=(ON/OFF): If ensmallen is not found, download it
ENSMALLEN_INCLUDE_DIR=(/path/to/ensmallen/include): path to include directory
   for ensmallen
DOWNLOAD_STB_IMAGE=(ON/OFF): If STB is not found, download it
STB_IMAGE_INCLUDE_DIR=(/path/to/stb/include): path to include directory for
   STB image library
USE_OPENMP=(ON/OFF): whether or not to use OpenMP if available
BUILD_DOCS=(ON/OFF): build Doxygen documentation, if Doxygen is available
   (default ON)
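When many options are set at once, or when the configure step is scripted, it can help to assemble the -D flags programmatically. The following is a small sketch under that assumption; the helper cmake_command is hypothetical, and the Python path shown is only an example.

```python
import shlex


def cmake_command(options, source_dir="../"):
    """Build a cmake argument list from a dict of option names to values."""
    args = ["cmake"]
    for name, value in options.items():
        # Booleans map to the ON/OFF spelling used by the options above.
        if isinstance(value, bool):
            value = "ON" if value else "OFF"
        args += ["-D", f"{name}={value}"]
    args.append(source_dir)
    return args


args = cmake_command({
    "DEBUG": True,
    "BUILD_TESTS": False,
    "BUILD_PYTHON_BINDINGS": True,
    "PYTHON_EXECUTABLE": "/usr/bin/python3",  # example path, adjust as needed
})

# Print the command as it would be typed in a shell from the build directory.
print(shlex.join(args))
```

The resulting list can also be passed to subprocess.run(args, cwd="build") instead of being printed.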

Other tools can also be used to configure CMake, but those are not documented here. See the build guide for more details, including a full list of options and their default values.

By default, command-line programs will be built. Python bindings are built when BUILD_PYTHON_BINDINGS=ON is set and the Python dependencies (Cython, setuptools, numpy, pandas) are available; as of mlpack 3.4.0 this option defaults to OFF. OpenMP will be used for parallelization when possible by default.

Once CMake is configured, building the library is as simple as typing 'make'. This will build all library components as well as 'mlpack_test'.

$ make

If you do not want to build everything in the library, individual components of the build can be specified:

$ make mlpack_pca mlpack_knn mlpack_kfn

If the build fails and you cannot figure out why, register an account on GitHub and submit an issue. The mlpack developers will quickly help you figure it out:

mlpack on GitHub

Alternately, mlpack help can be found in IRC at #mlpack on Freenode.

If you wish to install mlpack to /usr/local/include/mlpack/, /usr/local/lib/, and /usr/local/bin/, make sure you have root privileges (or write permissions to those three directories), and simply type

$ make install

You can now run the executables by name, and you can link against mlpack with -lmlpack. The mlpack headers are found in /usr/local/include/mlpack/, and if the Python bindings were built, you can access them with the mlpack package in Python.

If running the programs (e.g. $ mlpack_knn -h) gives an error of the form

error while loading shared libraries: cannot open shared object file: No such file or directory

then be sure that the runtime linker is searching the directory where the mlpack shared library was installed (probably /usr/local/lib/ unless you set it manually). One way to do this, on Linux, is to ensure that the LD_LIBRARY_PATH environment variable includes the directory that contains the library. Using bash, this can be set easily:

export LD_LIBRARY_PATH="/usr/local/lib/:$LD_LIBRARY_PATH"

(or whatever directory the library is installed in.)

5. Running mlpack programs

After building mlpack, the executables will reside in build/bin/. You can call them from there, or you can install the library and (depending on system settings) they should be added to your PATH and you can call them directly. The documentation below assumes the executables are in your PATH.

Consider the 'mlpack_knn' program, which finds the k nearest neighbors in a reference dataset of all the points in a query set. That is, we have a query and a reference dataset. For each point in the query dataset, we wish to know the k points in the reference dataset which are closest to the given query point.

Alternately, if the query and reference datasets are the same, the problem can be stated more simply: for each point in the dataset, we wish to know the k nearest points to that point.
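To make the problem statement concrete, here is a brute-force sketch of k-nearest-neighbor search in plain Python. It is not how mlpack_knn is implemented; it only computes the same kind of answer on a toy dataset. The function name, the Euclidean distance, and the sample points are assumptions chosen for illustration. When no query set is given, the sketch assumes a point is not reported as its own neighbor.

```python
import math


def knn(reference, k, query=None):
    """Return (neighbors, distances): for each query point, the indices of and
    distances to its k nearest reference points, by exhaustive search."""
    same = query is None
    points = reference if same else query
    neighbors, distances = [], []
    for qi, q in enumerate(points):
        # Distance from this query point to every candidate reference point.
        scored = sorted(
            (math.dist(q, r), ri)
            for ri, r in enumerate(reference)
            if not (same and ri == qi)  # skip the point itself
        )
        neighbors.append([ri for _, ri in scored[:k]])
        distances.append([d for d, _ in scored[:k]])
    return neighbors, distances


reference = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (5.0, 5.0)]
query = [(0.9, 0.1), (4.0, 4.0)]

# Separate query and reference sets.
neighbors, distances = knn(reference, k=2, query=query)
print(neighbors)

# Query and reference sets are the same.
self_neighbors, self_distances = knn(reference, k=1)
print(self_neighbors)
```

Exhaustive search costs one distance computation per query/reference pair, which is fine for a toy example but slow for large datasets.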

Each mlpack program has extensive help documentation which details what the method does, what each of the parameters is, and how to use them:

$ mlpack_knn --help

Running mlpack_knn on one dataset (that is, the query and reference datasets are the same) and finding the 5 nearest neighbors is very simple:

$ mlpack_knn -r dataset.csv -n neighbors_out.csv -d distances_out.csv -k 5 -v

The -v (--verbose) flag is optional; it gives informational output. It is not unique to mlpack_knn but is available in all mlpack programs. Verbose output also gives timing output at the end of the program, which can be very useful.
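The two output files can then be read back by any tool that understands CSV. The sketch below assumes that row i of each file corresponds to point i of the dataset and that column j holds the j-th nearest neighbor; check the program's --help output to confirm the layout. So that it runs without mlpack installed, it first writes two small stand-in files with made-up values. The helper load_matrix is hypothetical.

```python
import csv
import os
import tempfile


def load_matrix(path, convert):
    """Read a CSV file into a list of rows, converting every field."""
    with open(path, newline="") as handle:
        return [[convert(field) for field in row] for row in csv.reader(handle)]


# Stand-in files with invented values; real files would come from mlpack_knn.
workdir = tempfile.mkdtemp()
neighbors_path = os.path.join(workdir, "neighbors_out.csv")
distances_path = os.path.join(workdir, "distances_out.csv")
with open(neighbors_path, "w") as handle:
    handle.write("1,2\n0,2\n0,1\n")
with open(distances_path, "w") as handle:
    handle.write("1.0,2.0\n1.0,2.2\n2.0,2.2\n")

# Indices are parsed via float first in case they are written as "1.0".
neighbors = load_matrix(neighbors_path, lambda s: int(float(s)))
distances = load_matrix(distances_path, float)

for point, (idx, dist) in enumerate(zip(neighbors, distances)):
    print(f"point {point}: nearest neighbor {idx[0]} at distance {dist[0]}")
```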

6. Using mlpack from Python

If mlpack is installed to the system, then the mlpack Python bindings should be automatically in your PYTHONPATH, and importing mlpack functionality into Python should be very simple:

>>> from mlpack import knn

Accessing help is easy:

>>> help(knn)

The API is similar to the command-line programs. So, running knn() (k-nearest-neighbor search) on the numpy matrix dataset and finding the 5 nearest neighbors is very simple:

>>> output = knn(reference=dataset, k=5, verbose=True)

This will store the output neighbors in output['neighbors'] and the output distances in output['distances']. Other mlpack bindings function similarly, and the input/output parameters exactly match those of the command-line programs.

7. Further documentation

The documentation given here is only a fraction of the available documentation for mlpack. If doxygen is installed, you can type make doc to build the documentation locally. Alternately, up-to-date documentation, as well as documentation for older versions of mlpack, is available on the mlpack website.

8. Bug reporting

(see also mlpack help)

If you find a bug in mlpack or have any problems, numerous routes are available for help.

GitHub is used for bug tracking. It is easy to register an account and file a bug there, and the mlpack development team will try to quickly resolve your issue.

In addition, mailing lists are available: the mlpack discussion list and the git commit list.

Lastly, the IRC channel #mlpack on Freenode can be used to get help.

  • 3.4.2(Oct 28, 2020)

    Released Oct. 28, 2020.

    • Added Mean Absolute Percentage Error.
    • Added Softmin activation function as layer in ann/layer.
    • Fix spurious ARMA_64BIT_WORD compilation warnings on 32-bit systems (#2665).
  • 3.4.1(Sep 7, 2020)

    Released Sep. 7, 2020.

    • Fix incorrect parsing of required matrix/model parameters for command-line bindings (#2600).

    • Add manual type specification support to data::Load() and data::Save() (#2084, #2135, #2602).

    • Remove use of internal Armadillo functionality (#2596, #2601, #2602).

  • 3.4.0(Sep 1, 2020)

    Released Sept. 1st, 2020.

    • Issue warnings when metrics produce NaNs in KFoldCV (#2595).

    • Added bindings for R during Google Summer of Code (#2556).

    • Added common striptype function for all bindings (#2556).

    • Refactored common utility function of bindings to bindings/util (#2556).

    • Renamed InformationGain to HoeffdingInformationGain in methods/hoeffding_trees/information_gain.hpp (#2556).

    • Added macro for changing stream of printing and warnings/errors (#2556).

    • Added Spatial Dropout layer (#2564).

    • Force CMake to show error when it didn't find Python/modules (#2568).

    • Refactor ProgramInfo() to separate out all the different information (#2558).

    • Add bindings for one-hot encoding (#2325).

    • Added Soft Actor-Critic to RL methods (#2487).

    • Added Categorical DQN to q_networks (#2454).

    • Added N-step DQN to q_networks (#2461).

    • Add Silhouette Score metric and Pairwise Distances (#2406).

    • Add Go bindings for some missed models (#2460).

    • Replace boost program_options dependency with CLI11 (#2459).

    • Additional functionality for the ARFF loader (#2486); use case sensitive categories (#2516).

    • Add bayesian_linear_regression binding for the command-line, Python, Julia, and Go. Also called "Bayesian Ridge", this is equivalent to a version of linear regression where the regularization parameter is automatically tuned (#2030).

    • Fix defeatist search for spill tree traversals (#2566, #1269).

    • Fix incremental training of logistic regression models (#2560).

    • Change default configuration of BUILD_PYTHON_BINDINGS to OFF (#2575).

  • 3.3.2(Jun 18, 2020)

    Released June 18, 2020.

    • Added Noisy DQN to q_networks (#2446).

    • Add [preview release of] Go bindings (#1884).

    • Added Dueling DQN to q_networks, Noisy linear layer to ann/layer and Empty loss to ann/loss_functions (#2414).

    • Storing and adding accessor method for action in q_learning (#2413).

    • Added accessor methods for ANN layers (#2321).

    • Addition of Elliot activation function (#2268).

    • Add adaptive max pooling and adaptive mean pooling layers (#2195).

    • Add parameter to avoid shuffling of data in preprocess_split (#2293).

    • Add MatType parameter to LSHSearch, allowing sparse matrices to be used for search (#2395).

    • Documentation fixes to resolve Doxygen warnings and issues (#2400).

    • Add Load and Save of Sparse Matrix (#2344).

    • Add Intersection over Union (IoU) metric for bounding boxes (#2402).

    • Add Non Maximal Suppression (NMS) metric for bounding boxes (#2410).

    • Fix no_intercept and probability computation for linear SVM bindings (#2419).

    • Fix incorrect neighbors for k > 1 searches in approx_kfn binding, for the QDAFN algorithm (#2448).

    • Add RBF layer in ann module to make RBFN architecture (#2261).

  • 3.3.1(Apr 30, 2020)

    Released April 29th, 2020.

    • Minor Julia and Python documentation fixes (#2373).

    • Updated terminal state and fixed bugs for Pendulum environment (#2354, #2369).

    • Added EliSH activation function (#2323).

    • Add L1 Loss function (#2203).

    • Pass CMAKE_CXX_FLAGS (compilation options) correctly to Python build (#2367).

    • Expose ensmallen Callbacks for sparseautoencoder (#2198).

    • Bugfix for LARS class causing invalid read (#2374).

    • Add serialization support from Julia; use mlpack.serialize() and mlpack.deserialize() to save and load from IOBuffers.

  • 3.3.0(Apr 7, 2020)

    Released April 7th, 2020.

    • Templated return type of Forward function of loss functions (#2339).

    • Added R2 Score regression metric (#2323).

    • Added mean squared logarithmic error loss function for neural networks (#2210).

    • Added mean bias loss function for neural networks (#2210).

    • The DecisionStump class has been marked deprecated; use the DecisionTree class with NoRecursion=true or use ID3DecisionStump instead (#2099).

    • Added probabilities_file parameter to get the probabilities matrix of AdaBoost classifier (#2050).

    • Fix STB header search paths (#2104).

    • Add DISABLE_DOWNLOADS CMake configuration option (#2104).

    • Add padding layer in TransposedConvolutionLayer (#2082).

    • Fix pkgconfig generation on non-Linux systems (#2101).

    • Use log-space to represent HMM initial state and transition probabilities (#2081).

    • Add functions to access parameters of Convolution and AtrousConvolution layers (#1985).

    • Add Compute Error function in lars regression and changing Train function to return computed error (#2139).

    • Add Julia bindings (#1949). Build settings can be controlled with the BUILD_JULIA_BINDINGS=(ON/OFF) and JULIA_EXECUTABLE=/path/to/julia CMake parameters.

    • CMake fix for finding STB include directory (#2145).

    • Add bindings for loading and saving images (#2019); mlpack_image_converter from the command-line, mlpack.image_converter() from Python.

    • Add normalization support for CF binding (#2136).

    • Add Mish activation function (#2158).

    • Update init_rules in AMF to allow users to merge two initialization rules (#2151).

    • Add GELU activation function (#2183).

    • Better error handling of eigendecompositions and Cholesky decompositions (#2088, #1840).

    • Add LiSHT activation function (#2182).

    • Add Valid and Same Padding for Transposed Convolution layer (#2163).

    • Add CELU activation function (#2191)

    • Add Log-Hyperbolic-Cosine Loss function (#2207)

    • Change neural network types to avoid unnecessary use of rvalue references (#2259).

    • Bump minimum Boost version to 1.58 (#2305).

    • Refactor STB support so HAS_STB macro is not needed when compiling against mlpack (#2312).

    • Add Hard Shrink Activation Function (#2186).

    • Add Soft Shrink Activation Function (#2174).

    • Add Hinge Embedding Loss Function (#2229).

    • Add Cosine Embedding Loss Function (#2209).

    • Add Margin Ranking Loss Function (#2264).

    • Bugfix for incorrect parameter vector sizes in logistic regression and softmax regression (#2359).

  • 3.2.1(Nov 26, 2019)

    Released Oct. 1, 2019. (But I forgot to release it on Github; sorry about that.)

    • Enforce CMake version check for ensmallen #2032.
    • Fix CMake check for Armadillo version #2029.
    • Better handling of when STB is not installed #2033.
    • Fix Naive Bayes classifier computations in high dimensions #2022.
  • 3.2.0(Sep 26, 2019)

    Released Sept. 25, 2019.

    • Fix occasionally-failing RADICAL test (#1924).

    • Fix gcc 9 OpenMP compilation issue (#1970).

    • Added support for loading and saving of images (#1903).

    • Add Multiple Pole Balancing Environment (#1901, #1951).

    • Added functionality for scaling of data (#1876); see the command-line binding mlpack_preprocess_scale or Python binding preprocess_scale().

    • Add new parameter maximum_depth to decision tree and random forest bindings (#1916).

    • Fix prediction output of softmax regression when test set accuracy is calculated (#1922).

    • Pendulum environment now checks for termination. All RL environments now have an option to terminate after a set number of time steps (no limit by default) (#1941).

    • Add support for probabilistic KDE (kernel density estimation) error bounds when using the Gaussian kernel (#1934).

    • Fix negative distances for cover tree computation (#1979).

    • Fix cover tree building when all pairwise distances are 0 (#1986).

    • Improve KDE pruning by reclaiming not used error tolerance (#1954, #1984).

    • Optimizations for sparse matrix accesses in z-score normalization for CF (#1989).

    • Add kmeans_max_iterations option to GMM training binding gmm_train_main.

    • Bump minimum Armadillo version to 8.400.0 due to ensmallen dependency requirement (#2015).

  • mlpack-3.1.1(May 27, 2019)

    Released May 26, 2019.

    • Fix random forest bug for numerical-only data (#1887).
    • Significant speedups for random forest (#1887).
    • Random forest now has minimum_gain_split and subspace_dim parameters (#1887).
    • Decision tree parameter print_training_error deprecated in favor of print_training_accuracy.
    • output option changed to predictions for adaboost and perceptron binding. Old options are now deprecated and will be preserved until mlpack 4.0.0 (#1882).
    • Concatenated ReLU layer (#1843).
    • Accelerate NormalizeLabels function using hashing instead of linear search (see src/mlpack/core/data/normalize_labels_impl.hpp) (#1780).
    • Add ConfusionMatrix() function for checking performance of classifiers (#1798).
    • Install ensmallen headers when it is downloaded during build (#1900).
  • mlpack-3.1.0(Apr 26, 2019)

    Released April 25, 2019.

    • Add DiagonalGaussianDistribution and DiagonalGMM classes to speed up the diagonal covariance computation and deprecate DiagonalConstraint (#1666).

    • Add kernel density estimation (KDE) implementation with bindings to other languages (#1301).

    • Where relevant, all models with a Train() method now return a double value representing the goodness of fit (i.e. final objective value, error, etc.) (#1678).

    • Add implementation for linear support vector machine (see src/mlpack/methods/linear_svm).

    • Change DBSCAN to use PointSelectionPolicy and add OrderedPointSelection (#1625).

    • Residual block support (#1594).

    • Bidirectional RNN (#1626).

    • Dice loss layer (#1674, #1714) and hard sigmoid layer (#1776).

    • output option changed to predictions and output_probabilities to probabilities for Naive Bayes binding (mlpack_nbc/nbc()). Old options are now deprecated and will be preserved until mlpack 4.0.0 (#1616).

    • Add support for Diagonal GMMs to HMM code (#1658, #1666). This can provide large speedup when a diagonal GMM is acceptable as an emission probability distribution.

    • Python binding improvements: check parameter type (#1717), avoid copying Pandas dataframes (#1711), handle Pandas Series objects (#1700).

  • mlpack-3.0.4(Nov 13, 2018)

    Released November 13, 2018.

    • Bump minimum CMake version to 3.3.2.
    • CMake fixes for Ninja generator by Marc Espie (#1550, #1537, #1523).
    • More efficient linear regression implementation (#1500).
    • Serialization fixes for neural networks (#1508, #1535).
    • Mean shift now allows single-point clusters (#1536).
  • mlpack-3.0.3(Jul 29, 2018)

    Released July 27th, 2018.

    • Fix Visual Studio compilation issue (#1443).
    • Allow running local_coordinate_coding binding with no initial_dictionary parameter when input_model is not specified (#1457).
    • Make use of OpenMP optional via the CMake USE_OPENMP configuration variable (#1474).
    • Accelerate FNN training by 20-30% by avoiding redundant calculations (#1467).
    • Fix math::RandomSeed() usage in tests (#1462, #1440).
    • Generate better Python with documentation (#1460).
  • mlpack-3.0.2(Jun 9, 2018)

    Released June 8th, 2018.

    • Documentation generation fixes for Python bindings (#1421).
    • Fix build error for man pages if command-line bindings are not being built (#1424).
    • Add shuffle parameter and Shuffle() method to KFoldCV (#1412). This will shuffle the data when the object is constructed, or when Shuffle() is called.
    • Added neural network layers: AtrousConvolution (#1390), Embedding (#1401), and LayerNorm (layer normalization) (#1389).
    • Add Pendulum environment for reinforcement learning (#1388) and update Mountain Car environment (#1394).
  • mlpack-3.0.1(May 11, 2018)

    Released May 10th, 2018.

    • Fix intermittently failing tests (#1387).
    • Add Big-Batch SGD (BBSGD) optimizer in src/mlpack/core/optimizers/bigbatch_sgd (#1131).
    • Fix simple compiler warnings (#1380, #1373).
    • Simplify NeighborSearch constructor and Train() overloads (#1378).
    • Add warning for OpenMP setting differences (#1358/#1382). When mlpack is compiled with OpenMP but another application linking against mlpack is not (or vice versa), a compilation warning will now be issued.
    • Restructured loss functions in src/mlpack/methods/ann/ (#1365).
    • Add environments for reinforcement learning tests (#1368, #1370, #1329).
    • Allow single outputs for multiple timestep inputs for recurrent neural networks (#1348).
    • Neural networks: add He and LeCun normal initializations (#1342), add FReLU and SELU activation functions (#1346, #1341), add alpha-dropout (#1349).
  • mlpack-3.0.0(Mar 31, 2018)

    Released March 30th, 2018.

    • Speed and memory improvements for DBSCAN. --single_mode can now be used for situations where previously RAM usage was too high.
    • Bump minimum required version of Armadillo to 6.500.0.
    • Add automatically generated Python bindings. These have the same interface as the command-line programs.
    • Add deep learning infrastructure in src/mlpack/methods/ann/.
    • Add reinforcement learning infrastructure in src/mlpack/methods/reinforcement_learning/.
    • Add optimizers: AdaGrad, CMAES, CNE, FrankWolfe, GradientDescent, GridSearch, IQN, Katyusha, LineSearch, ParallelSGD, SARAH, SCD, SGDR, SMORMS3, SPALeRA, SVRG.
    • Add hyperparameter tuning infrastructure and cross-validation infrastructure in src/mlpack/core/cv/ and src/mlpack/core/hpt/.
    • Fix bug in mean shift.
    • Add random forests (see src/mlpack/methods/random_forest).
    • Numerous other bugfixes and testing improvements.
    • Add randomized Krylov SVD and Block Krylov SVD.
  • mlpack-2.2.5(Aug 26, 2017)

  • mlpack-2.2.4(Jul 19, 2017)

    Released July 18th, 2017.

    • Speed and memory improvements for DBSCAN. --single_mode can now be used for situations where previously RAM usage was too high.
    • Fix bug in CF causing incorrect recommendations.
  • mlpack-2.2.3(May 24, 2017)

  • mlpack-2.2.2(May 5, 2017)

    Released May 4th, 2017.

    • Install backwards-compatibility mlpack_allknn and mlpack_allkfn programs; note they are deprecated and will be removed in mlpack 3.0.0 (#992).
    • Fix RStarTree bug that surfaced on OS X only (#964).
    • Small fixes for MiniBatchSGD and SGD and tests.
  • mlpack-2.2.1(Apr 13, 2017)

  • mlpack-2.2.0(Mar 21, 2017)

    Released Mar. 21st, 2017.

    • Bugfix for mlpack_knn program (#816).
    • Add decision tree implementation in methods/decision_tree/. This is very similar to a C4.5 tree learner.
    • Add DBSCAN implementation in methods/dbscan/.
    • Add support for multidimensional discrete distributions (#810, #830).
    • Better output for Log::Debug/Log::Info/Log::Warn/Log::Fatal for Armadillo objects (#895, #928).
    • Refactor categorical CSV loading with boost::spirit for faster loading (#681).
  • mlpack-2.1.1(Dec 22, 2016)

    Released Dec. 22nd, 2016.

    • HMMs now use random initialization; this should fix some convergence issues (#828).
    • HMMs now initialize emissions according to the distribution of observations (#833).
    • Minor fix for formatted output (#814).
    • Fix DecisionStump to properly work with any input type.
  • mlpack-2.1.0(Oct 31, 2016)

    Released Oct. 31st, 2016.

    • Fixed CoverTree to properly handle single-point datasets.
    • Fixed a bug in CosineTree (and thus QUIC-SVD) that caused split failures for some datasets (#717).
    • Added mlpack_preprocess_describe program, which can be used to print statistics on a given dataset (#742).
    • Fix prioritized recursion for k-furthest-neighbor search (mlpack_kfn and the KFN class), leading to orders-of-magnitude speedups in some cases.
    • Bump minimum required version of Armadillo to 4.200.0.
    • Added simple Gradient Descent optimizer, found in src/mlpack/core/optimizers/gradient_descent/ (#792).
    • Added approximate furthest neighbor search algorithms QDAFN and DrusillaSelect in src/mlpack/methods/approx_kfn/, with command-line program mlpack_approx_kfn.
  • mlpack-2.0.3(Jul 21, 2016)

    Released July 21st, 2016.

    • Standardize some parameter names for programs (old names are kept for reverse compatibility, but warnings will now be issued).
    • RectangleTree optimizations (#721).
    • Fix memory leak in NeighborSearch (#731).
    • Documentation fix for k-means tutorial (#730).
    • Fix TreeTraits for BallTree (#727).
    • Fix incorrect parameter checks for some command-line programs.
    • Fix error in HMM training with probabilities for each point (#636).
  • mlpack-2.0.2(Jun 20, 2016)

    Released June 20th, 2016.

    • Added the function LSHSearch::Projections(), which returns an arma::cube with each projection table in a slice (#663). Instead of Projection(i), you should now use Projections().slice(i).
    • A new constructor has been added to LSHSearch that creates objects using projection tables provided in an arma::cube (#663).
    • LSHSearch projection tables refactored for speed (#675).
    • Handle zero-variance dimensions in DET (#515).
    • Add MiniBatchSGD optimizer (src/mlpack/core/optimizers/minibatch_sgd/) and allow its use in mlpack_logistic_regression and mlpack_nca programs.
    • Add better backtrace support from Grzegorz Krajewski for Log::Fatal messages when compiled with debugging and profiling symbols. This requires libbfd and libdl to be present during compilation.
    • CosineTree test fix from Mikhail Lozhnikov (#358).
    • Fixed HMM initial state estimation (#600).
    • Changed versioning macros __MLPACK_VERSION_MAJOR, __MLPACK_VERSION_MINOR, and __MLPACK_VERSION_PATCH to MLPACK_VERSION_MAJOR, MLPACK_VERSION_MINOR, and MLPACK_VERSION_PATCH. The old names will remain in place until mlpack 3.0.0.
    • Renamed mlpack_allknn, mlpack_allkfn, and mlpack_allkrann to mlpack_knn, mlpack_kfn, and mlpack_krann. The mlpack_allknn, mlpack_allkfn, and mlpack_allkrann programs will remain as copies until mlpack 3.0.0.
    • Add --random_initialization option to mlpack_hmm_train, for use when no labels are provided.
    • Add --kill_empty_clusters option to mlpack_kmeans and KillEmptyClusters policy for the KMeans class (#595, #596).
  • mlpack-2.0.1(Mar 3, 2016)

    Released Feb. 4th, 2016.

    • Fix CMake to properly detect when MKL is being used with Armadillo.
    • Minor parameter handling fixes to mlpack_logistic_regression (#504, #505).
    • Properly install arma_config.hpp.
    • Memory handling fixes for Hoeffding tree code.
    • Add functions that allow changing training-time parameters to HoeffdingTree class.
    • Fix infinite loop in sparse coding test.
    • Documentation spelling fixes (#501).
    • Properly handle covariances for Gaussians with large condition number (#496), preventing GMMs from filling with NaNs during training (and also HMMs that use GMMs).
    • CMake fixes for finding LAPACK and BLAS as Armadillo dependencies when ATLAS is used.
    • CMake fix for projects using mlpack's CMake configuration from elsewhere (#512).
  • mlpack-2.0.0 (Dec 24, 2015)

    Released Dec. 23rd, 2015.

    • Removed overclustering support from k-means because it was not well tested, may have been buggy, and appears to be unused. If you were relying on this support, open a bug or get in touch with us; it would not be hard for us to reimplement it.
    • Refactored KMeans to allow different types of Lloyd iterations.
    • Added implementations of k-means: Elkan's algorithm, Hamerly's algorithm, Pelleg-Moore's algorithm, and the DTNN (dual-tree nearest neighbor) algorithm.
    • Significant acceleration of LRSDP via the use of accu(a % b) instead of trace(a * b).
    • Added MatrixCompletion class (matrix_completion), which performs nuclear norm minimization to fill unknown values of an input matrix.
    • No more dependence on Boost.Random; now we use C++11 STL random support.
    • Add softmax regression, contributed by Siddharth Agrawal and QiaoAn Chen.
    • Changed NeighborSearch, RangeSearch, FastMKS, LSH, and RASearch API; these classes now take the query sets in the Search() method, instead of in the constructor.
    • Use OpenMP, if available. For now OpenMP support is only available in the DET training code.
    • Add support for predicting new test point values to LARS and the command-line 'lars' program.
    • Add serialization support for Perceptron and LogisticRegression.
    • Refactor SoftmaxRegression to predict into an arma::Row<size_t> object, and add a softmax_regression program.
    • Refactor LSH to allow loading and saving of models.
    • ToString() is removed entirely (#487).
    • Add --input_model_file and --output_model_file options to appropriate machine learning algorithms.
    • Rename all executables to start with an "mlpack" prefix (#229).
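
    The LRSDP speedup listed above rests on a standard identity: trace(a * b) is the sum over i and j of a(i, j) * b(j, i), so when b is symmetric it equals the sum of the elementwise product, which is what accu(a % b) computes. Forming the full product costs O(n^3) operations for n x n matrices, while the elementwise sum costs O(n^2). The following is a minimal sketch of that identity using plain std::vector instead of Armadillo; the Matrix alias and both function names are illustrative and are not mlpack or Armadillo API, and the inputs are assumed to be square and of equal size.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative dense row-major matrix type; not an Armadillo type.
typedef std::vector<std::vector<double>> Matrix;

// trace(a * b) computed the slow way: form the full product, then sum its
// diagonal. This takes O(n^3) multiplications for n x n inputs.
double TraceOfProduct(const Matrix& a, const Matrix& b)
{
  const std::size_t n = a.size();
  assert(b.size() == n);
  Matrix product(n, std::vector<double>(n, 0.0));
  for (std::size_t i = 0; i < n; ++i)
    for (std::size_t j = 0; j < n; ++j)
      for (std::size_t k = 0; k < n; ++k)
        product[i][j] += a[i][k] * b[k][j];

  double trace = 0.0;
  for (std::size_t i = 0; i < n; ++i)
    trace += product[i][i];
  return trace;
}

// Sum of the elementwise product, the analogue of accu(a % b). This takes
// O(n^2) multiplications and never forms the product matrix.
double SumOfElementwiseProduct(const Matrix& a, const Matrix& b)
{
  const std::size_t n = a.size();
  assert(b.size() == n);
  double sum = 0.0;
  for (std::size_t i = 0; i < n; ++i)
    for (std::size_t j = 0; j < n; ++j)
      sum += a[i][j] * b[i][j];
  return sum;
}
```

    The two functions agree whenever b is symmetric. For a general b, the elementwise sum corresponds to trace(a * b^T) instead, so the substitution is only valid under the symmetry assumption.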
  • mlpack-1.0.12 (Jan 7, 2015)

  • mlpack-1.0.9 (Dec 22, 2014)

    Released July 28th, 2014.

    • GMM initialization is now safer and provides a working GMM when constructed with only the dimensionality and number of Gaussians (#314).
    • Check for division by 0 in Forward-Backward algorithm in HMMs (#314).
    • Fixed implementation of Viterbi algorithm in HMM::Predict() (#316).
    • Significant speedups for dual-tree algorithms using the cover tree (#243, #329) including a faster implementation of FastMKS.
    • CF (collaborative filtering) now expects users and items to be zero-indexed, not one-indexed (#324).
    • CF::GetRecommendations() API change: now requires the number of recommendations as the first parameter. The number of users in the local neighborhood should be specified with CF::NumUsersForSimilarity().
    • Removed incorrect PeriodicHRectBound (#30).
    • Refactor LRSDP into LRSDP class and standalone function to be optimized (#318).
    • Fix for centering in kernel PCA (#355).
    • Added simulated annealing (SA) optimizer, contributed by Zhihao Lou.
    • HMMs now support initial state probabilities; these can be set in the constructor, trained, or set manually with HMM::Initial() (#315).
    • Added Nyström method for kernel matrix approximation by Marcus Edel.
    • Kernel PCA now supports using the Nyström method for approximation.
    • Ball trees now work with dual-tree algorithms, via the BallBound<> bound structure (#320); fixed by Yash Vadalia.
    • The NMF class is now AMF<>, and supports far more types of factorizations, by Sumedh Ghaisas.
    • A QUIC-SVD implementation has returned, written by Siddharth Agrawal and based on older code from Mudit Gupta.
    • Added perceptron and decision stump by Udit Saxena (these are weak learners for an eventual AdaBoost class).
    • Sparse autoencoder added by Siddharth Agrawal.
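
    Because CF now expects zero-indexed users and items, rating data prepared with one-indexed IDs needs each ID shifted down by one before use. The following is a minimal sketch of that conversion on plain structs; the Rating type and the ToZeroIndexed() function are hypothetical helpers for illustration and are not mlpack API, and how the converted data is then handed to CF is outside the scope of the sketch.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical (user, item, rating) triple; not an mlpack type.
struct Rating
{
  std::size_t user;
  std::size_t item;
  double value;
};

// Converts one-indexed user and item IDs to zero-indexed IDs. An ID of 0 in
// the input means the data was not one-indexed, so it is rejected instead of
// being wrapped around to a huge unsigned value.
std::vector<Rating> ToZeroIndexed(const std::vector<Rating>& oneIndexed)
{
  std::vector<Rating> result;
  result.reserve(oneIndexed.size());
  for (const Rating& r : oneIndexed)
  {
    if (r.user == 0 || r.item == 0)
      throw std::invalid_argument("user and item IDs must be >= 1");
    result.push_back(Rating{r.user - 1, r.item - 1, r.value});
  }
  return result;
}
```

    Rejecting an ID of 0 guards against running the conversion twice on the same data.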
  • mlpack-1.0.8 (Dec 22, 2014)

    Released January 6th, 2014.

    • Memory leak in NeighborSearch index-mapping code fixed.
    • GMMs can be trained using the existing model as a starting point by specifying an additional boolean parameter to GMM::Estimate().
    • Logistic regression implementation added in methods/logistic_regression.
    • Version information is now obtainable via mlpack::util::GetVersion() or the __MLPACK_VERSION_MAJOR, __MLPACK_VERSION_MINOR, and __MLPACK_VERSION_PATCH macros.
    • Fix typos in allkfn and allkrann output.
