libsequence: a C++ class library for evolutionary genetic analysis


Copyright (C) 2002 Kevin Thornton

libsequence2 is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA

Comments are welcome.

- Kevin Thornton 

User's group

Please post to the libsequence user group for help.



If you use the library for your research, please cite:

@article{libsequence, author = {Thornton, Kevin}, title = {{Libsequence: a C++ class library for evolutionary genetic analysis.}}, journal = {Bioinformatics (Oxford, England)}, year = {2003}, volume = {19}, number = {17}, pages = {2325--2327}, month = nov }

The manuscript is available online at

Revision history.

The revision history of the library is here. The document describes what changed for a given release.

Obtaining the source code

Obtaining the master branch

You have a few options:

Obtaining a specific release

Again, a few options:

  • Click on "Releases" on the project's repository page, then download the one you want
  • Clone the repo (see previous section)
  • Get a list of releases by saying "git tag -l"
  • Check out the release you want, for example "git checkout 1.8.0"



  1. A C++11-compliant compiler (see next section)


I support the following compilers:

I'd appreciate success/failure reports on Intel's icc compiler. As it is no longer free for academic use, I'm no longer able to test it.

Simplest installation instructions

./configure
make
sudo make install

The build conditions can be adjusted via the usual environment variables. To compile an optimized "release" build:

./configure CXXFLAGS="-O3 -DNDEBUG"

To compile a debugger-friendly build:

./configure CXXFLAGS="-O0 -g"

To change the compiler, set the C and C++ compiler variables:

./configure CC=gcc CXX=g++

Compiling unit tests and examples

To compile the unit testing suite and example programs:

make check


cd test
make check

Note that the library must be built prior to "make check", but you do not have to install the library prior to "make check". The examples and unit tests are statically linked to the version of the library that will be found in src/.libs after a "make" command. I do this so that one can perform unit tests without having to install the library. I use static linking here to avoid any possible confusion with an existing libsequence installation.

Running the unit tests

cd test && sh

More complex installation scenarios

Some users may not have the dependent libraries installed in the standard locations on their systems. Note that "standard" means wherever the compiler system looks for header files during compilation and libraries during linking. This scenario is common on OS X systems where users have used some sort of "system" to install various libraries rather than installing from source directly. In order to accommodate such situations, the user must provide the correct path to the include and lib directories. For example, assume that the dependent libraries are in /opt on your system. You would install libsequence as follows:

CPPFLAGS=-I/opt/include LDFLAGS="$LDFLAGS -L/opt/lib" ./configure


make
sudo make install

Note that the modification of LDFLAGS prepends the current value of LDFLAGS if it exists. This allows for scenarios where the system's search path for libraries may have been modified by the user or sysadmin via a modification of that shell variable. (One could also do the same with CPPFLAGS, FYI.)
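
To see what the shell hands to configure in each case, the expansion can be checked by hand. This is a minimal sketch for a scratch POSIX shell session; the starting value -L/usr/local/lib is only an illustrative assumption, not something the library requires:

```shell
# Run this in a scratch shell: it overwrites LDFLAGS for the demonstration.

# Case 1: LDFLAGS is unset, so "$LDFLAGS" expands to an empty string and
# only the new search path (after a harmless leading space) remains.
unset LDFLAGS
unset_case="$LDFLAGS -L/opt/lib"
echo "[$unset_case]"

# Case 2: LDFLAGS already holds a search path (hypothetical value).
# The existing value stays in front and /opt/lib is added after it.
LDFLAGS="-L/usr/local/lib"
set_case="$LDFLAGS -L/opt/lib"
echo "[$set_case]"
```

In both cases the existing value, if any, comes first, so search paths that were already configured are still consulted before /opt/lib.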

Installing libsequence locally

If you do not have permission to "sudo make install", you can install the library in your $HOME:

./configure --prefix=$HOME

Then, when compiling any program using libsequence, you need to add

-I$HOME/include

to any compilation commands and

-L$HOME/lib -Wl,-rpath,$HOME/lib

to any linking commands.

When running programs linking to any of the above run-time libraries, and depending on your system, you may also need to adjust variables like LD_LIBRARY_PATH to prepend $HOME/lib to them, etc., but you'll need to figure that out on a case-by-case basis, as different systems can behave quite differently.
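
One way to do the prepending is sketched below, assuming a POSIX shell on Linux (on OS X the analogous variable is DYLD_LIBRARY_PATH). The helper name prepend_dir is hypothetical and is not part of libsequence:

```shell
# prepend_dir DIR EXISTING: print DIR placed in front of EXISTING.
# The ':' separator is added only when EXISTING is non-empty, which avoids
# a trailing ':' (an empty entry that the loader may treat as the current
# directory).
prepend_dir() {
    printf '%s%s\n' "$1" "${2:+:$2}"
}

# Apply it to the current environment; ${LD_LIBRARY_PATH:-} is safe even
# when the variable has never been set.
LD_LIBRARY_PATH=$(prepend_dir "$HOME/lib" "${LD_LIBRARY_PATH:-}")
export LD_LIBRARY_PATH
echo "$LD_LIBRARY_PATH"
```

Whether this is needed at all depends on the system; if the program was linked with the -Wl,-rpath option shown above, the run-time loader can often find the library without it.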

Installation via Bioconda

Libsequence is available for installation via bioconda:

conda install -c bioconda libsequence

The above command will give you the most recent stable release on OS X or Linux.

Using libsequence to compile other programs

If libsequence is not installed in a standard path, then you must provide the appropriate include (-I) and link path (-L) commands to your compiler. This may be done in various ways, e.g., via a configure script or your own Makefile.

A program that depends on libsequence must provide at least the following libraries to the linker:

-lsequence -lz 
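
Putting the include path, link path, and libraries together, a complete compile line might look like the following dry run. It is a sketch under stated assumptions: the library was installed with --prefix=$HOME, the compiler driver is called c++, and myprog.cc and myprog are hypothetical file names:

```shell
# Hypothetical source and output names; adjust to your project.
SRC=myprog.cc
OUT=myprog
PREFIX="$HOME"   # assumes ./configure --prefix=$HOME was used

# The include path comes before the source file. The link path and the
# libraries come after it, because linkers resolve symbols left to right
# and this order matters when linking statically.
CMD="c++ -std=c++11 -O2 -I$PREFIX/include -o $OUT $SRC -L$PREFIX/lib -Wl,-rpath,$PREFIX/lib -lsequence -lz"

# Dry run: print the command instead of executing it.
echo "$CMD"
```

Once the printed command looks right, it can be pasted into the shell or moved into a Makefile rule. If libsequence lives in a standard location, the -I, -L, and -Wl,-rpath parts can be dropped.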
  • Use lambda instead of bind of unresolved overload

    The standard library bind(f,...) requires a sufficiently resolved callable type; the overloads of notDifferent with default parameters prevent selection of the intended overload.

    opened by zao 19
  • Build fails on macOS

    I tried submitting PR brewsci/homebrew-bio/pull/535 for upgrading to 1.9.6, but CI fails on macOS. I guess changes in path strings (like summstats vs SummStats) are preventing proper tracking of diffs on case-insensitive file systems. On macOS, unstaged changes show up right after git clone.

    git status:

    On branch master
    Your branch is up to date with 'origin/master'.
    Changes not staged for commit:
      (use "git add <file>..." to update what will be committed)
      (use "git checkout -- <file>..." to discard changes in working directory)
            modified:   Sequence/SummStats.hpp
            modified:   Sequence/SummStats/Garud.hpp
            modified:   Sequence/SummStats/
            modified:   Sequence/SummStats/
            modified:   Sequence/SummStats/lHaf.hpp
            modified:   Sequence/SummStats/nSL.hpp
            modified:   src/SummStats/
            modified:   src/SummStats/
            modified:   src/SummStats/

    git diff Sequence/SummStats.hpp:

    diff --git a/Sequence/SummStats.hpp b/Sequence/SummStats.hpp
    index a0bda1a1..e9810ac8 100644
    --- a/Sequence/SummStats.hpp
    +++ b/Sequence/SummStats.hpp
    @@ -1,13 +1,23 @@
    +/// @file Sequence/summstats.hpp
    +/// \brief Include all summary statistic functions and types
    -/*! \file SummStats.hpp
    -  Header file for summary statistic of variation data.
    + *  \defgroup popgenanalysis Analysis of molecular population genetic data
    + *  \brief Summary statistics and other analysis of Sequence::VariantMatrix
    + *  \ingroup popgen
    + *
    + *  See @ref md_md_tutorial.
    + *
    -#include <Sequence/SummStats/nSL.hpp>
    -#include <Sequence/SummStats/Garud.hpp>
    -#include <Sequence/SummStats/lHaf.hpp>
    -#include <Sequence/SummStats/Snn.hpp>
    +#include "summstats/generic.hpp"
    +#include "summstats/classics.hpp"
    +#include "summstats/nsl.hpp"
    +#include "summstats/nslx.hpp"
    +#include "summstats/ld.hpp"
    +#include "summstats/lhaf.hpp"
    +#include "summstats/garud.hpp"
    opened by heavywatal 10
  • Avoid putting std::isfinite in global namespace

    Some compilers/stdlibs (Intel 15 / libstdc++ 4.6) have different isfinite functions in the global namespace and in namespace std.

    Using "using" statements to put std::isfinite in the global namespace or a global unnamed namespace makes unqualified isfinite(x) calls ambiguous.

    Dealing with cmath vs. math.h and where they put their definitions is a proper mess; putting the alias in the closest namespace yields the right result on both GCC 4.9.1 and Intel 15.

    Also, stdlib.h is needed for malloc/free.

    opened by zao 9
  • Installation improved, ignore files created in build process

    Hi Kevin,

    Blimey, great work, way beyond 'it just works'! I have had some minor trouble installing this, so it could already be a small contribution.

    Keep up the good work, Richel Bilderbeek

    opened by richelbilderbeek 7
  • make error


    I need to compile libsequence because it's a dependency for ms-stat.

    I'm on CentOS6 and I use devtoolset-1.1 to have access to gcc 4.7.2, but I got this message during the make step:

    make[2]: Entering directory `/usr/local/install/src/genome2/libsequence-1.8.9/src'
    depbase=`echo Seq/Fasta.lo | sed 's|[^/]*$|.deps/&|;s|\.lo$||'`;\
        /bin/sh ../libtool  --tag=CXX   --mode=compile g++ -DHAVE_CONFIG_H -I. -I..    -Wall -W -Woverloaded-virtual -Wnon-virtual-dtor -Wcast-qual -Wconversion -Wsign-conversion -Wsign-promo -Wsynth -ffor-scope   -DNDEBUG -g -O2 -std=c++11 -MT Seq/Fasta.lo -MD -MP -MF $depbase.Tpo -c -o Seq/Fasta.lo Seq/ &&\
        mv -f $depbase.Tpo $depbase.Plo
    libtool: compile:  g++ -DHAVE_CONFIG_H -I. -I.. -Wall -W -Woverloaded-virtual -Wnon-virtual-dtor -Wcast-qual -Wconversion -Wsign-conversion -Wsign-promo -Wsynth -ffor-scope -DNDEBUG -g -O2 -std=c++11 -MT Seq/Fasta.lo -MD -MP -MF Seq/.deps/Fasta.Tpo -c Seq/  -fPIC -DPIC -o Seq/.libs/Fasta.o
    In file included from Seq/
    ../Sequence/Fasta.hpp:53:18: error: ‘Sequence::Seq::Seq’ names constructor
    In file included from ../Sequence/Fasta.hpp:45:0,
                     from Seq/
    ../Sequence/Seq.hpp:58:7: warning: unused parameter ‘seq’ [-Wunused-parameter]
    Seq/ In constructor ‘Sequence::Fasta::Fasta(const Sequence::Seq&)’:
    Seq/ note: synthesized method ‘Sequence::Seq::Seq(const Sequence::Seq&)’ first required here
    In file included from ../Sequence/Fasta.hpp:45:0,
                     from Seq/
    ../Sequence/Seq.hpp: At global scope:
    ../Sequence/Seq.hpp:59:7: warning: unused parameter ‘seq’ [-Wunused-parameter]
    Seq/ In constructor ‘Sequence::Fasta::Fasta(Sequence::Seq&&)’:
    Seq/ note: synthesized method ‘Sequence::Seq::Seq(Sequence::Seq&&)’ first required here
    make[2]: *** [Seq/Fasta.lo] Error 1
    make[2]: Leaving directory `/usr/local/install/src/genome2/libsequence-1.8.9/src'
    make[1]: *** [all-recursive] Error 1
    make[1]: Leaving directory `/usr/local/install/src/genome2/libsequence-1.8.9'
    make: *** [all] Error 2

    Do you have an idea?


    opened by lecorguille 4
  • reserved identifier violation

    opened by elfring 3
  • Prevent use of bamrecord if HTS not present

    Several functions consuming bamrecords were not guarded by the feature detection macro HAVE_HTSLIB that Sequence::bamrecord's definition was guarded with.

    opened by zao 2
  • Deprecate use of htslib

    I'm not willing to track htslib development, and prefer to refocus libsequence on pop gen.

    • [ ] add a --no-htslib option to configure
    • [ ] mark anything using htslib as deprecated.
    opened by molpopgen 1
  • Conan package

    Hello, do you know about Conan? Conan is a modern dependency manager for C++. It would be great if your library were available to other developers via a package manager.

    Here you can find an example of how to create a package for the library.

    If you have any questions, just ask :-)

    opened by zamazan4ik 1
  • FASTQ input with spaces in read name

    FASTQ files are incorrectly read in if there is whitespace in the sequence name, which apparently NCBI/SRA adds, apparently because they don't feel good unless they've modified what people upload...

    opened by molpopgen 1
  • bug in Sequence::label_haplotypes

    New unit tests have revealed a bug in this function.

    The previous unit tests were too simple.

    See #43 for additional background.

    This function is the basis for many haplotype statistics, which will all be a bit off as a result.

    opened by molpopgen 0
  • Summary statistics needed for 2.0

    • [x] nSL/iHS
    • [x] H1, H12 et al.
    • [ ] Hudson's Snn
    • [ ] Fst
    • [x] lHaf
    • [ ] Wall's B, Q, etc.
    • [ ] r^2/D/D'

    This will leave us with pretty much everything except the Fu/Li statistics? There are also some from Rozas that msstats did but never made it into libsequence.

    opened by molpopgen 0
  • 1.9.8(Jun 18, 2019)

    • Refactor VariantMatrix to manage memory via Sequence::GenotypeCapsule and Sequence::PositionCapsule
    • Windows of VariantMatrix objects now do not require copies, and instead use Sequence::NonOwningGenotypeCapsule and Sequence::NonOwningPositionCapsule.
    • A bug in haplotype labelling is fixed. Issue 59. Statistics like number of haplotypes, haplotype diversity, etc., were affected by this issue, but the errors were small for larger data sets.
  • 1.9.7(Apr 1, 2019)

    A bug in l-Haf from #53 was fixed. Thanks to Alex Nater for spotting that.

    The library compiles on OS X again, which required that the path names for deprecated summary stats be changed.

  • 1.9.6(Nov 6, 2018)

    • Fix GitHub issue PR50 via PR51.
    • Added very efficient overload of Sequence::nsl. PR51 and PR52
    • PR52 added a first implementation of Sequence::nslx, a series of back-end changes to some of the summary statistic code as well as some more testing.
    • Added Sequence::lhaf. PR53
    • Include Sequence/variant_matrix/msformat.hpp when installing the library.
    • Fix GitHub issue 54
  • 1.9.5(Sep 13, 2018)

  • 1.9.4(Aug 2, 2018)

  • 1.9.3(Jul 12, 2018)

    • Refactor unit tests to be much faster to compile
    • Remove dependency on htslib.
    • The coalescent simulation machinery is no longer compiled or installed.
    • Mark a lot of code as deprecated
    • Travis CI is now Linux-only
    • Add Sequence::VariantMatrix and Sequence::StateCounts

    This release includes the following GitHub PRs: #11, #12, #13, #14, #15, #25, #26, #30, #31, and #32.

  • 1.9.2(Oct 16, 2017)

    This release simplifies the nSL/iHS calculations. There is no longer a function to standardize/bin results. Rather, a vector of (nSL/iHS/derived mutation count) tuples is returned.

  • 1.9.1(Apr 19, 2017)

    • Sequence::SeqException was removed. Exceptions from namespace std are preferred, and are easier to wrap in other languages.
    • Sequence::PolySNP::ThetaL throws an exception if an outgroup is not present
  • 1.9.0(Oct 24, 2016)

    • Fixed issues with Sequence::Comeron95 that made it impossible to allocate on the stack.
    • Updated threaded implementation of the l-Haf statistic to use TBB.
    • The weight on stop codons used in Grantham distance calculations is now configurable and defaults to the max value of a double. Previous library versions arbitrarily used 999.0.
    • PolySIM::ThetaL now correctly excludes fixed differences from the calculation.
    • nSL/iHS, H1, H12, H2H1, and haplotype homozygosity statistics are now calculated in parallel.
    • Sequence::Disequilibrium parallelized.
    • Intel's TBB is now a dependency.
  • 1.8.9(Feb 15, 2016)

    • Issue #8 fixed
    • Sequence::PolyTableSlice will throw std::runtime_error if input range is not properly sorted
    • War on "mutable". The use of this keyword has been removed from the library to the best extent possible.
    • The API for calculations involving codons has been modernized. This includes Sequence::Comeron95, Sequence::RedundancyCom95, Sequence::WeightingScheme2 (and derived types), Sequence::WeightingScheme3 (and derived types), Sequence::TwoSubs, Sequence::ThreeSubs, functions in Sequence/PathwayHelper.hpp
    • Sequence::PolyTable (and derived types) have been refactored. The fundamental idea is the same, but the API is modernized. IMO, it is still imperfect, and can be further changed to reflect more idiomatic C++11, but that'll have to wait.
    • Private data members for classes have been hidden using the PIMPL idiom. This goes a long way to future-proofing the ABI compatibility of these types against further implementation changes such as bug fixes.
    • Sequence/SummStats/classic.hpp provides a sneak preview of how summary statistics will work in the future, once the deprecated Sequence::PolySNP and Sequence::PolySIM can finally be removed.
  • 1.8.8(Nov 9, 2015)

    • l-HAF statistic added (Sequence/SummStats/lHaf.hpp)
    • Garud et al.'s H1, H2H1, etc. added (Sequence/SummStats/Garud.hpp)
    • nSL added (Sequence/SummStats/nSL.hpp)
    • fixed implementation of Sequence::invalidPolyChar, which was checking the wrong alphabet
    • Various documentation fixes
    • Sequence::FST functions shared, Private, and fixed now throw an exception if deme indexes are out of range. Previously, empty return values were sent, which could be confused with there being no sites in a category.
    • Various code cleanups, esp. removal of commented-out code blocks
    • The 8-bit encoding stuff has been removed. This was never used in real-world programs, and suffered from some design issues.
    • Sequence::PolyTableSlice has several updates. First, a bug in "fixed-S" windows was identified through unit testing and fixed. The previous version would drop the last window in some cases. This probably didn't affect many people, but the bug was there for years. (In practice, most 'windows' are a fixed distance, not a fixed number of variable sites, hence my belief that most previous analyses are ok.) A new constructor supports 'chunking' a PolyTable into equal-sized windows (based on number of variable sites). The class no longer contains a data member of type T, which was never necessary anyway.
    • auto_ptr replaced with unique_ptr in src/
    • binning of nSL/iHS statistics is improved, and better handling of non-finite values implemented.
  • 1.8.7(Sep 23, 2015)

    Improve behavior of sliding window calculator. Programs using that class must now input more info and do not have to do as much manual post-processing.

  • 1.8.6(Aug 5, 2015)

Kevin R. Thornton
Faculty member in Ecology & Evolutionary Biology at UC Irvine. @ThorntonLab is our group's GitHub account.