Main gperftools repository

(originally Google Performance Tools)

The fastest malloc we’ve seen; works particularly well with threads
and STL. Also: thread-friendly heap-checker, heap-profiler, and
cpu-profiler.


gperftools is a collection of a high-performance multi-threaded
malloc() implementation, plus some pretty nifty performance analysis
tools.

gperftools is distributed under the terms of the BSD License. Join our
mailing list at [email protected] for updates:!forum/gperftools

gperftools was the original home of the pprof program. Note, however,
that the original pprof (which is still included with gperftools) is
now deprecated in favor of the Go version at

Just link in -ltcmalloc or -ltcmalloc_minimal to get the advantages of
tcmalloc -- a replacement for malloc and new.  See below for some
environment variables you can use with tcmalloc, as well.

tcmalloc functionality is available on all systems we've tested; see
INSTALL for more details.  See README_windows.txt for instructions on
using tcmalloc on Windows.

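NOTE: When compiling programs with gcc that you plan to link with
libtcmalloc, it's safest to pass in the flags

 -fno-builtin-malloc -fno-builtin-calloc -fno-builtin-realloc -fno-builtin-free

when compiling.  gcc makes some optimizations assuming it is using its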
own, built-in malloc; that assumption obviously isn't true with
tcmalloc.  In practice, we haven't seen any problems with this, but
the expected risk is highest for users who register their own malloc
hooks with tcmalloc (using gperftools/malloc_hook.h).  The risk is
lowest for folks who use tcmalloc_minimal (or, of course, who pass in
the above flags :-) ).

See docs/heapprofile.html for information about how to use tcmalloc's
heap profiler and analyze its output.

As a quick-start, do the following after installing this package:

1) Link your executable with -ltcmalloc
2) Run your executable with the HEAPPROFILE environment var set:
     $ HEAPPROFILE=/tmp/heapprof <path/to/binary> [binary args]
3) Run pprof to analyze the heap usage
     $ pprof <path/to/binary> /tmp/heapprof.0045.heap  # run 'ls' to see options
     $ pprof --gv <path/to/binary> /tmp/heapprof.0045.heap

You can also use LD_PRELOAD to heap-profile an executable that you
didn't compile.

There are other environment variables, besides HEAPPROFILE, you can
set to adjust the heap-profiler behavior; cf. "ENVIRONMENT VARIABLES"
below.

The heap profiler is available on all unix-based systems we've tested;
see INSTALL for more details.  It is not currently available on Windows.

See docs/heap_checker.html for information about how to use tcmalloc's
heap checker.

In order to catch all heap leaks, tcmalloc must be linked *last* into
your executable.  The heap checker may mischaracterize some memory
accesses in libraries listed after it on the link line.  For instance,
it may report these libraries as leaking memory when they're not.
(See the source code for more details.)

As a quick-start, do the following after installing this package:

1) Link your executable with -ltcmalloc
2) Run your executable with the HEAPCHECK environment var set:
     $ HEAPCHECK=1 <path/to/binary> [binary args]

Other values for HEAPCHECK: normal (equivalent to "1"), strict, draconian

You can also use LD_PRELOAD to heap-check an executable that you
didn't compile.

The heap checker is only available on Linux at this time; see INSTALL
for more details.

See docs/cpuprofile.html for information about how to use the CPU
profiler and analyze its output.

As a quick-start, do the following after installing this package:

1) Link your executable with -lprofiler
2) Run your executable with the CPUPROFILE environment var set:
     $ CPUPROFILE=/tmp/prof.out <path/to/binary> [binary args]
3) Run pprof to analyze the CPU usage
     $ pprof <path/to/binary> /tmp/prof.out      # -pg-like text output
     $ pprof --gv <path/to/binary> /tmp/prof.out # really cool graphical output

There are other environment variables, besides CPUPROFILE, you can set
to adjust the cpu-profiler behavior; cf. "ENVIRONMENT VARIABLES" below.

The CPU profiler is available on all unix-based systems we've tested;
see INSTALL for more details.  It is not currently available on Windows.

NOTE: CPU profiling doesn't work after fork (unless you immediately
      do an exec()-like call afterwards).  Furthermore, if you do
      fork, and the child calls exit(), it may corrupt the profile
      data.  You can use _exit() to work around this.  We hope to have
      a fix for both problems in the next release of perftools
      (hopefully perftools 1.2).

If you want the CPU profiler, heap profiler, and heap leak-checker to
all be available for your application, you can do:
   gcc -o myapp ... -lprofiler -ltcmalloc

However, if you have a reason to use the static versions of the
library, this two-library linking won't work:
   gcc -o myapp ... /usr/lib/libprofiler.a /usr/lib/libtcmalloc.a  # errors!

Instead, use the special libtcmalloc_and_profiler library, which we
make for just this purpose:
   gcc -o myapp ... /usr/lib/libtcmalloc_and_profiler.a

For advanced users, there are several flags you can pass to
'./configure' that tweak tcmalloc performance.  (These are in addition
to the environment variables you can set at runtime to affect
tcmalloc, described below.)  See the INSTALL file for details.

The cpu profiler, heap checker, and heap profiler will lie dormant,
using no memory or CPU, until you turn them on.  (Thus, there's no
harm in linking -lprofiler into every application, and also -ltcmalloc
assuming you're ok using the non-libc malloc library.)

The easiest way to turn them on is by setting the appropriate
environment variables.  We have several variables that let you
enable/disable features as well as tweak parameters.

Here are some of the most important variables:

HEAPPROFILE=<pre> -- turns on heap profiling and dumps data using this prefix
HEAPCHECK=<type>  -- turns on heap checking with strictness 'type'
CPUPROFILE=<file> -- turns on cpu profiling and dumps data to this file.
PROFILESELECTED=1 -- if set, cpu-profiler will only profile regions of code
                     surrounded with ProfilerEnable()/ProfilerDisable().
CPUPROFILE_FREQUENCY=x -- how many interrupts/second the cpu-profiler samples.

PERFTOOLS_VERBOSE=<level> -- the higher the level, the more messages malloc emits
MALLOCSTATS=<level>    -- prints memory-use stats at program-exit

For a full list of variables, see the documentation pages:


Perftools was developed and tested on x86 Linux systems, and it works
in its full generality only on those systems.  However, we've
successfully ported much of the tcmalloc library to FreeBSD, Solaris
x86, and Darwin (Mac OS X) x86 and ppc; and we've ported the basic
functionality in tcmalloc_minimal to Windows.  See INSTALL for details.
See README_windows.txt for details on the Windows port.


If you're interested in some third-party comparisons of tcmalloc to
other malloc libraries, here are a few web pages that have been
brought to our attention.  The first discusses the effect of using
various malloc libraries on OpenLDAP.  The second compares tcmalloc to
win32's malloc.

It's possible to build tcmalloc in a way that gains faster
performance (particularly for deletes) at the cost of more memory
fragmentation (that is, more unusable memory on your system).  See the
INSTALL file for details.


When compiling perftools on some old systems, like RedHat 8, you may
get an error like this:
    ___tls_get_addr: symbol not found

This means that you have a system where some parts are updated enough
to support Thread Local Storage, but others are not.  The perftools
configure script can't always detect this kind of case, leading to
that error.  To fix it, just comment out (or delete) the line
   #define HAVE_TLS 1
in your config.h file before building.


There are two issues that can cause program hangs or crashes on x86_64
64-bit systems, which use the libunwind library to get stack-traces.
Neither issue should affect the core tcmalloc library; they both
affect the perftools tools such as cpu-profiler, heap-checker, and
heap-profiler.

1) Some libc's -- at least glibc 2.4 on x86_64 -- have a bug where the
libc function dl_iterate_phdr() acquires its locks in the wrong
order.  This bug should not affect tcmalloc, but may cause occasional
deadlock with the cpu-profiler, heap-profiler, and heap-checker.
The likelihood increases with the number of dlopen() calls an executable makes.
Most executables don't have any, though several library routines like
getgrgid() call dlopen() behind the scenes.

2) On x86-64 64-bit systems, while tcmalloc itself works fine, the
cpu-profiler tool is unreliable: it will sometimes work, but sometimes
cause a segfault.  I'll explain the problem first, and then some
workarounds.

Note that this only affects the cpu-profiler, which is a
gperftools feature you must turn on manually by setting the
CPUPROFILE environment variable.  If you do not turn on cpu-profiling,
you shouldn't see any crashes due to perftools.

The gory details: The underlying problem is in the backtrace()
function, which is a built-in function in libc.
Backtracing is fairly straightforward in the normal case, but can run
into problems when having to backtrace across a signal frame.
Unfortunately, the cpu-profiler uses signals in order to register a
profiling event, so every backtrace that the profiler does crosses a
signal frame.

In our experience, the only time there is trouble is when the signal
fires in the middle of pthread_mutex_lock.  pthread_mutex_lock is
called quite a bit from system libraries, particularly at program
startup and when creating a new thread.

The solution: The dwarf debugging format has support for 'cfi
annotations', which make it easy to recognize a signal frame.  Some OS
distributions, such as Fedora and gentoo 2007.0, already have added
cfi annotations to their libc.  A future version of libunwind should
recognize these annotations; these systems should not see any
crashes.

Workarounds: If you see problems with crashes when running the
cpu-profiler, consider inserting ProfilerStart()/ProfilerStop() into
your code, rather than setting CPUPROFILE.  This will profile only
those sections of the codebase.  Though we haven't done much testing,
in theory this should reduce the chance of crashes by limiting the
signal generation to only a small part of the codebase.  Ideally, you
would not use ProfilerStart()/ProfilerStop() around code that spawns
new threads, or is otherwise likely to cause a call to

17 May 2011
  • CreateToolhelp32Snapshot fails spuriously in an MT application

    Originally reported on Google Code with ID 200

    What steps will reproduce the problem?
    I have a multithreaded application that calls LoadLibrary from multiple 
    threads. Sometimes the library fails to load with the following error on 
    the console:
    Check failed: sidestep::SIDESTEP_SUCCESS == 
    PreamblePatcher::Patch(windows_fn_[i], perftools_fn_[i], &origstub_fn_[i])
    What is the expected output? What do you see instead?
    The expected behavior is consistent loading of the libraries, with the same 
    result as if without tcmalloc.
    What version of the product are you using? On what operating system?
    Windows XP, MSVC 7.1, STLPort 5.1.4, perftools 1.4 + patch from issue 199 
    and minor modifications for STLPort.
    Please provide any additional information below.
    The problem is related to issue 199: origstub_fn_[i] is dirty when calling 
    PreamblePatcher::RawPatch from LibcInfoWithPatchFunctions<T>::Patch. 
    However, the proposed patch from that issue does not help in my case 
    because after cleaning origstub_fn_[i] and patching the functions, the 
    resulting functions become double-patched. This leads to an infinite loop 
    when calling, e.g. operator delete - it calls 
    LibcInfoWithPatchFunctions<0>::Perftools_delete, which calls 
    LibcInfoWithPatchFunctions<1>::Perftools_delete, which calls 
    LibcInfoWithPatchFunctions<0>::Perftools_delete and so on.
    The root of the problem is in the PatchAllModules function, which is called 
    from LoadLibrary. It appears that CreateToolhelp32Snapshot can fail 
    occasionally, which leads to assuming the process has no modules at all and 
    marking all module_libcs as invalid. The next call to LoadLibrary, for 
    which CreateToolhelp32Snapshot succeeds, will double-patch the modules and 
    lead to the described behavior.
    I'm not sure what causes CreateToolhelp32Snapshot to fail (in my case 
    GetLastError returns 8, Not enough storage is available to process this 
    command, but I clearly have enough memory). MSDN mentions that it may 
    spuriously return ERROR_BAD_LENGTH (24). Probably, when multiple threads 
    call LoadLibrary, CreateToolhelp32Snapshot fails to synchronize with the 
    library manager properly, even though the call is for the current process.
    One other thing I noticed. The patch_all_modules_lock in PatchAllModules is 
    locked after traversing the modules snapshot. But module_libcs is used 
    during the traverse, which may lead to a race condition. However, moving 
    patch_all_modules_lock before it does not help my situation.

    Reported by andrey.semashev on 2009-12-24 07:16:39

    Type-Defect Priority-Medium Status-Fixed 
    opened by alk 96
  • [tcmalloc] O(n) address-ordered best-fit over PageHeap::large_ becomes major scalability bottleneck on fragmented heap

    Originally reported on Google Code with ID 532

    We run tcmalloc on our very high throughput MySQL master. Join buffers > kMaxPages are
    allocated by a large percentage of client threads. Over time, as fragmentation increases,
    the large_.returned list grows to contain thousands of Spans, which makes PageHeap::AllocLarge
    extremely slow and the pageheap_lock becomes a major source of contention, eventually
    slowing down MySQL considerably.
    I have written a patch which maintains a doubly linked skiplist to calculate address-ordered
    best-fit in amortized O(log(n)) instead of O(n) when the combined size of the large_
    normal and returned lists exceed a configurable threshold. We're running it in production
    now and according to perf top, the percentage of time spent in PageHeap::AllocLarge
    has gone from ~8% overall to ~0.4%.
    However, while the time complexity of the skiplist is favorable, its space complexity
    (each node is 168 bytes) is obviously substantially less favorable than something like
    a red/black tree. I haven't had a chance to do my own measurements yet, but it seems
    from everything I've read that allocator metadata size has been shown to have a substantial
    impact on CPU cache efficiency, so while this patch may lessen a scalability bottleneck,
    it may have negative performance effects. I'm up for implementing a different data
    structure for this if that has a better chance of being accepted.
    I've also written a small test program that attempts to create a sizeable large_ list,
    and then times 100000 allocations and frees from the large list. It's far from perfect,
    but it's something.
    Test program:

    Reported by jamesgolick on 2013-05-21 23:38:00

    Type-Defect Priority-Medium Status-Accepted 
    opened by alk 70
  • patch for heap profiling on windows

    Originally reported on Google Code with ID 83

    This is a patch file that takes steps towards finishing the 
    windows port of the heap profiler.  It doesn't address CPU profiling 
    or heap checking.  It has been tested against some fairly large profilees 
    on windows.  Some of the changes in the patch are fixes to issues that 
    would affect tcmalloc even in the absence of profiling.
    It's possible I may have lost some changes when creating the patch (we 
    have some other modifications I am not sharing at this time). 
    Please ask questions.
    - Use the HANDLE API to dump the profile instead of file descriptors. 
    - Fix a couple bugs where there were dependences on the global 
    constructor order, which is undefined 
    - Make sure the pad member of the Central_Freelist is not of length 0 
    - Do not unpatch the windows functions, ever.  This is dangerous since 
    you would be depending on global destructor order. 
    - Do not report large allocations using printf (or at all for now) 
    since printf isn't so safe to call in some processes 
    - If we are asked to free something allocated by the CRT heap, then do 
    a nop.  This is necessary because a bunch of stuff gets allocated 
    before our hooks are in place (global c++ constructors, parts of 
    environ, other things?) 
    - Hook _msize and _expand.  Otherwise, very subtle bugs can occur. 
    Even in programs that don't use these, since the linker/compiler 
    introduces calls to _msize for global c++ constructors occurring in 
    - Make realloc work if it is given something that was originally 
    allocated by the CRT heap. 
    - Make pprof forgiving of carriage returns.  Editing pprof is the sum 
    total of my life experience using perl, so I've probably done 
    something stupid. 
    - Added stack walker using RtlCaptureStackBackTrace, which doesn't 
    work so well with FPO (I turned this walker on by default) 
    - Wrote ad-hoc stack walker, which sometimes works with FPO, but 
    sometimes gets a snapshot from an earlier time due to left over stack junk.  
    It is possible that this one is faster than the previous one if you turn 
    off the heuristics -- I didn't test that. 
    - Leak all spinlocks since you cannot count on the global destructor 
    - Hook all instances of the CRT in the process, not just an arbitrary 
    - Hook LoadLibraryExW in order to detect dynamic loads of the CRT 
    - WART: Using FreeLibrary to free the CRT is dangerous if you later 
    LoadLibrary it again.  So we aren't perfect yet. 
    - Fix some compile errors with VC8 
    - Wrote nm and addr2line programs that use PDB symbols.  Did not add 
    these to any build scripts.  You must download Debugging Tools For 
    Windows and copy dbghelp.dll and symsrv.dll into the directory of 
    nm.exe and addr2line.exe for them to function.  There exist old 
    versions of these dlls that do not function. 
    - Modified pprof to fall back on these pdb versions of nm and 
    addr2line if they are siblings of pprof.  Processes linking in a 
    mixture of gcc compiled and cl compiled dlls seem to get the union of 
    the symbols. 

    Reported by dvitek on 2008-10-16 23:27:13

    Attachment: google-perftools-0.99.2.msvc8_i386_hprof.patch

    Priority-Medium Status-Fixed Type-Patch OpSys-Windows 
    opened by alk 66
  • test suite failures on 1.5 and trunk, Mac OS X 10.6.3

    Originally reported on Google Code with ID 243

    What steps will reproduce the problem?
    1. configure
    2. make
    3. make check
    What is the expected output? What do you see instead?
    0 tests should fail. Instead, 3 out of 38 tests fail.
    What version of the product are you using? On what operating system?
    Tried both 1.5 and trunk on Mac OS X 10.6.3.
    Please provide any additional information below.
    Running the 'heap-profiler_unittest' binary manually gives this output:
    Starting tracking the heap
    Had other new/delete MallocHook-s set. Are you using the heap leak checker? Use
    --heap_check="" to avoid this conflict.
    Abort trap

    Reported by neunon on 2010-05-20 23:15:05

    Attachments: make-check-1.5.txt.gz, make-check-trunk.txt.gz

    Type-Defect Priority-Medium Status-Fixed 
    opened by alk 46
  • Heap checking currently does not support FreeBSD6 (requesting feedback on porting)

    Originally reported on Google Code with ID 311

    What steps will reproduce the problem?
    1. build the tools with --enable-heap-check
    2. write a small program. Compile and link it with libtcmalloc.
    #include <stdio.h>
    #include <stdlib.h>
    int main(int argc, char** argv)
    {
        printf("********** start **********\n");
        int* i1 = new int(1000);
        int* i2 = new int(1000);
        int* i3 = new int(1000);
        int* i4 = new int(1000);
        int* i5 = new int(1000);
        int* i6 = new int(1000);
        printf("*********** end ***********\n");
        return 0;
    }
    $ g++ -L/usr/local/lib main.cpp -o main -ltcmalloc
    3. Run the heap checker as follows: 
    $ export PERFTOOLS_VERBOSE=10
    $ HEAPCHECK=normal ./main
    Unable to open /proc/self/environ, falling back on getenv("HEAPCHECK"), which may not
    MemoryRegionMap Init
    MemoryRegionMap Init done
    Starting tracking the heap
    Found hooked allocator at 3: 0x680a2a69 <- 0x682a5b8f
    Found hooked allocator at 2: 0x680a2a69 <- 0x682a3114
    Found hooked allocator at 2: 0x680a2a69 <- 0x6808713a
    Found hooked allocator at 2: 0x680a2a69 <- 0x682a1b51
    Found hooked allocator at 2: 0x680a2a69 <- 0x68087154
    Found hooked allocator at 2: 0x680a0e03 <- 0x68087172
    Found hooked allocator at 2: 0x680a2a69 <- 0x682a3114
    Found hooked allocator at 2: 0x680a2a69 <- 0x682a7a04
    Found hooked allocator at 2: 0x680a0e03 <- 0x68097b4a
    Going to ignore live object at 0x80cd000 of 4 bytes
    Found hooked allocator at 2: 0x680a2a69 <- 0x682a698a
    Found hooked allocator at 2: 0x680a0e03 <- 0x681330ff
    Found hooked allocator at 2: 0x680a0e03 <- 0x68095fae
    Found hooked allocator at 2: 0x680a0e03 <- 0x681330ff
    Found hooked allocator at 2: 0x680a0e03 <- 0x681330ff
    Found hooked allocator at 2: 0x680a0e03 <- 0x681330ff
    No shared libs detected. Will likely report false leak positives for statically linked
    executables.
    Turning perftools heap leak checking off
    MemoryRegionMap Shutdown
    MemoryRegionMap Shutdown done
    ********** start **********
    *********** end ***********
    What is the expected output? What do you see instead?
    As you can see, we hit a condition in the heap check startup/init that causes heap
    checking to be disabled as indicated by the message:
    No shared libs detected. Will likely report false leak positives for statically linked
    executables.
    Turning perftools heap leak checking off
    What version of the product are you using? On what operating system?
    Please provide any additional information below.
    Here is the code snippet where the heap check is bailing out:
    HeapLeakChecker::ProcMapsResult HeapLeakChecker::UseProcMapsLocked(
                                      ProcMapsTask proc_maps_task) {
      RAW_DCHECK(heap_checker_lock.IsHeld(), "");
      // Need to provide own scratch memory to ProcMapsIterator:
      ProcMapsIterator::Buffer buffer;
      ProcMapsIterator it(0, &buffer);
      if (!it.Valid()) {
        int errsv = errno;
        RAW_LOG(ERROR, "Could not open /proc/self/maps: errno=%d. "
                       "Libraries will not be handled correctly.", errsv);
        return CANT_OPEN_PROC_MAPS;
      }
      uint64 start_address, end_address, file_offset;
      int64 inode;
      char *permissions, *filename;
      bool saw_shared_lib = false;
      while (it.Next(&start_address, &end_address, &permissions,
                     &file_offset, &inode, &filename)) {
        if (start_address >= end_address) {
          // Warn if a line we can be interested in is ill-formed:
          if (inode != 0) {
            RAW_LOG(ERROR, "Errors reading /proc/self/maps. "
                           "Some global memory regions will not "
                           "be handled correctly.");
          }
          // Silently skip other ill-formed lines: some are possible
          // probably due to the interplay of how /proc/self/maps is updated
          // while we read it in chunks in ProcMapsIterator and
          // do things in this loop.
          continue;
        }
        // Determine if any shared libraries are present.
        if (inode != 0 && strstr(filename, "lib") && strstr(filename, ".so")) {
          saw_shared_lib = true;
        }
        switch (proc_maps_task) {
          case DISABLE_LIBRARY_ALLOCS:
            // All lines starting like
            // "401dc000-4030f000 r??p 00132000 03:01 13991972  lib/bin"
            // identify a data and code sections of a shared library or our binary
            if (inode != 0 && strncmp(permissions, "r-xp", 4) == 0) {
              DisableLibraryAllocsLocked(filename, start_address, end_address);
            }
            break;
          case RECORD_GLOBAL_DATA:
            RecordGlobalDataLocked(start_address, end_address,
                                   permissions, filename);
            break;
          default:
            RAW_CHECK(0, "");
        }
      }
      // ******************************* HERE *********************************
      if (!saw_shared_lib) {
        RAW_LOG(ERROR, "No shared libs detected. Will likely report false leak "
                       "positives for statically linked executables.");
        return NO_SHARED_LIBS_IN_PROC_MAPS;
      }
      return PROC_MAPS_USED;
    }
    The documentation seems to suggest that the heap check is only currently supported
    on Linux. I suspect that whatever linux specific code could be made platform independent
    or easily ported to work with FreeBSD. I would like to add support for FreeBSD but
    need a bit of a background overview on the Linux specific bits that would need to change.
    Is someone willing to get me up to speed here to get this work underway? 

    Reported by chappedm on 2011-02-11 22:01:19

    Priority-Medium Status-Fixed Type-Enhancement 
    opened by alk 40
  • RHEL4. Random crashes in run time on symbol resolution.

    Originally reported on Google Code with ID 246

    What steps will reproduce the problem?
    The problem is not stable, I have no concrete guide to reproduce it. It 
    does seem, however, to happen when an application attempts to start a thread 
    via pthread_create. I've seen other crashes with no threading involved (at 
    least, on the application level).
    What is the expected output? What do you see instead?
    The application should not crash
    What version of the product are you using? On what operating system?
    PerfTools 1.4, patched according to ticket #201. RHEL4.
    uname -a
    Linux ~~~~~~ 2.6.9-78.ELsmp #1 SMP Wed Jul 9 15:39:47 EDT 2008 i686 i686 
    i386 GNU/Linux
    rpm -qa | grep glibc
    GNU C Library stable release version 2.3.4, by Roland McGrath et al.
    Copyright (C) 2005 Free Software Foundation, Inc.
    This is free software; see the source for copying conditions.
    There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
    Compiled by GNU CC version 3.4.6 20060404 (Red Hat 3.4.6-9).
    Compiled on a Linux 2.4.20 system on 2008-04-15.
    Available extensions:
            GNU libio by Per Bothner
            crypt add-on version 2.1 by Michael Glad and others
            linuxthreads-0.10 by Xavier Leroy
            The C stubs add-on version 2.1.2.
            NIS(YP)/NIS+ NSS modules 0.19 by Thorsten Kukuk
            Glibc-2.0 compatibility add-on by Cristian Gafton 
            GNU Libidn by Simon Josefsson
            libthread_db work sponsored by Alpha Processor Inc
    Thread-local storage support included.
    For bug reporting instructions, please see:
    Please provide any additional information below.
    Our tests on RHEL4 are randomly crashing. The tests involve different code, 
    some are multithreaded, some are not. All modules are linked with
    I managed to recover a stack of one crash (which is not very easy, since 
    it's on a remote server):
    27/05/10 05:28:45 Using host libthread_db library 
    27/05/10 05:28:45 Core was generated by `/home/nb/KERN-
    27/05/10 05:28:45 Program terminated with signal 11, Segmentation fault.
    27/05/10 05:28:47 #0  0x002c199e in do_lookup_x () from /lib/
    27/05/10 05:28:47 
    27/05/10 05:28:47 Thread 1 (process 23039):
    27/05/10 05:28:47 #0  0x002c199e in do_lookup_x () from /lib/
    27/05/10 05:28:47 #1  0x002c1e22 in _dl_lookup_symbol_x () from /lib/ld-
    27/05/10 05:28:47 #2  0x002c51d6 in fixup () from /lib/
    27/05/10 05:28:47 #3  0x002c5110 in _dl_runtime_resolve () from /lib/ld-
    27/05/10 05:28:47 #4  0x08066419 in CrashHandler (sig=11) at 
    27/05/10 05:28:47 #5  <signal handler called>
    27/05/10 05:28:47 #6  0x002c199e in do_lookup_x () from /lib/
    27/05/10 05:28:47 #7  0x002c1e22 in _dl_lookup_symbol_x () from /lib/ld-
    27/05/10 05:28:47 #8  0x002c51d6 in fixup () from /lib/
    27/05/10 05:28:47 #9  0x002c5110 in _dl_runtime_resolve () from /lib/ld-
    27/05/10 05:28:47 #10 0x003d4fb9 in tcmalloc::CentralFreeList::InsertRange 
    27/05/10 05:28:47    from /home/nb/KERN-
    27/05/10 05:28:47 #11 0x003d8683 in 
    tcmalloc::ThreadCache::ReleaseToCentralCache ()
    27/05/10 05:28:47    from /home/nb/KERN-
    27/05/10 05:28:47 #12 0x003d879a in tcmalloc::ThreadCache::Scavenge ()
    27/05/10 05:28:47    from /home/nb/KERN-
    27/05/10 05:28:47 #13 0x003cfb67 in (anonymous 
    namespace)::do_free_with_callback ()
    27/05/10 05:28:47    from /home/nb/KERN-
    27/05/10 05:28:47 #14 0x003d1f6f in MallocBlock::ProcessFreeQueue ()
    27/05/10 05:28:47    from /home/nb/KERN-
    27/05/10 05:28:47 #15 0x003ce407 in DebugDeallocate ()
    27/05/10 05:28:47    from /home/nb/KERN-
    27/05/10 05:28:47 #16 0x003de860 in realloc ()
    27/05/10 05:28:47    from /home/nb/KERN-
    27/05/10 05:28:47 #17 0x00525c44 in pthread_create@@GLIBC_2.1 () from 
    27/05/10 05:28:47 #18 0x0806895f in Impl (this=0x942f990) at 
    27/05/10 05:28:47 #19 0x0806751e in SignalHandler (this=0x942f9b0)
    27/05/10 05:28:47     at ./src/SignalHandlerPosix.cpp:344
    27/05/10 05:28:47 #20 0x080598ca in StartUp (lg=@0xbffaff30, 
    27/05/10 05:28:47     fileName=0x946b3e0 "localsettings.xml") at 
    27/05/10 05:28:47 #21 0x080559a6 in (anonymous namespace)::COMMain 
    27/05/10 05:28:47     at ./src/CBOSSinMain.cpp:81
    27/05/10 05:28:47 #22 0x08055c83 in main (argc=1, argv=0xbffb0074) at 
    Note that is actually a renamed I did so in attempt to debug our 
    applications and this particular problem.
    Here the application is at its very startup, it attempts to create a thread 
    to wait for signals. Apparently, the crash appears when the dynamic linker 
    attempts to resolve a symbol in run time. The crash handler installed by 
    the application fails to do anything useful since it also triggers symbol 
    resolution. In other crashes the application was also in its very early 
    stage, but I don't have stacks from these.
    The problem is very hard to reproduce, out of ~1800 tests we run each 
    night, about 5-10 of them fail with these symptoms, almost every time 
    different ones.
    I'm suspecting that this is a bug in glibc, but I'm not sure. I'm not very 
    knowledgeable in the interworkings of the dynamic linker to go digging into 
    I have a suggestion of a possible workaround in perftools, though. I tried 
    to compile it with an additional linker flag "-Wl,-z -Wl,now", which would 
    force the linker to resolve all symbols in the tcmalloc library immediately 
    at its load time, instead of lazily, which is the default. I did one test 
    turnaround and it did not show any tests with the described crash. I'll 
    keep monitoring, though. However, I suggest adding the mentioned flag to 
    the library makefiles.

    Reported by andrey.semashev on 2010-05-28 05:22:47

    Type-Defect Priority-Medium Status-Invalid 
    opened by alk 40
  • [patch] Add preliminary MIPS32/MIPS64 atomicops support

    Originally reported on Google Code with ID 361

    What steps will reproduce the problem?
    1. cd google-perftools-1.8.2/ && patch -p1 < ../google-perftools-1.8.2-mips.patch
    2. -- configure --  *ensure that you read the README_mips.txt
    3. make
    4. Copy the shared/static libraries and link
    What is the expected output? What do you see instead?
    Existing MIPS support is not available. No atomicops are present, and some of the system
    call infrastructure needs tweaking. This patch addresses these issues. This patch DOES
    NOT add full featured perftools support.
    What version of the product are you using? On what operating system?
    This patch is generated against 1.8.2 of the google perf tools (which is the latest
    at the time of this writing). 
    Please provide any additional information below.
    This patch has only been thoroughly tested on the octeon hardware platform, with mips32
    application running under linux. Mips64 has been tested only insofar as the unit tests
    seem to pass. Your mileage may vary.

    Reported by [email protected] on 2011-08-23 16:19:37

    Priority-Medium Type-Patch Status-Obsolete 
    opened by alk 37
  • malloc_usable_size/tc_malloc_size hangs under debug malloc when custom operator new implementation calls malloc_usable_size

    attempting to use malloc_usable_size() with tcmalloc_debug simply hangs here:

    #0 pthread_once () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_once.S:94
    #1 0x00007f10b280bb27 in MallocExtension::instance () at src/
    #2 0x00007f10b28159e9 in tc_malloc_size (ptr=0x1010020) at src/

    (gdb) f 1
    #1 0x00007f10b280bb27 in MallocExtension::instance () at src/
    207 perftools_pthread_once(&module_init, InitModule);
    (gdb) p module_init
    $1 = 1
    (gdb) p InitModule
    $2 = {void (void)} 0x7f10b280b580 <InitModule()>
    (gdb) f 0
    #0 pthread_once () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_once.S:94
    94 jmp 6b
    (gdb) l
    89 orl %fs:PRIVATE_FUTEX, %esi
    90 # endif
    91 #endif
    92 movl $SYS_futex, %eax
    93 syscall
    94 jmp 6b
    95
    96 /* Preserve the pointer to the control variable. */
    97 3: pushq %rdi
    98 cfi_adjust_cfa_offset(8)

    any ideas on what could be going wrong here..

    opened by gaurabpaul 36
  • Bus error on ARM


    Thanks for great tool.

    I cross-compile it for powerPC with GNU compilers 4.1.1 and it works fine.

    Problem is that I also need it for ARM. I configure it with the options --host=arm-926ejs-linux-gnueabi --with-pic=no --enable-static=no --disable-cpu-profiler --disable-heap-checker --disable-debugalloc. Once I use it on the ARM machine with LD_PRELOAD, Linux reports a bus error.

    Do you have any idea what can be wrong?

    best regards Stevo

    opened by sbellus 35
  • google-perftools-2.5 compile error on gentoo for mips loongson2f CPU

    The error message:

    In file included from /var/tmp/portage/dev-util/google-perftools-2.5/work/gperftools-2.5/src/
    /var/tmp/portage/dev-util/google-perftools-2.5/work/gperftools-2.5/src/malloc_hook_mmap_linux.h: In function 'void* do_mmap64(void*, size_t, int, int, int, __off64_t)':

    /var/tmp/portage/dev-util/google-perftools-2.5/work/gperftools-2.5/src/malloc_hook_mmap_linux.h:86:30: error: 'SYS_mmap2' was not declared in this scope
     result = (void *)syscall(SYS_mmap2,

    Makefile:4734: recipe for target 'src/libtcmalloc_minimal_internal_la-malloc_hook.lo' failed
    make: *** [src/libtcmalloc_minimal_internal_la-malloc_hook.lo] Error 1

    • ERROR: dev-util/google-perftools-2.5::gentoo failed (compile phase):
    • emake failed
    • If you need support, post the output of emerge --info '=dev-util/google-perftools-2.5::gentoo',
    • the complete build log and the output of emerge -pqv '=dev-util/google-perftools-2.5::gentoo'.
    • The complete build log is located at '/var/tmp/portage/dev-util/google-perftools-2.5/temp/build.log'.
    • The ebuild environment file is located at '/var/tmp/portage/dev-util/google-perftools-2.5/temp/environment'.
    • Working directory: '/var/tmp/portage/dev-util/google-perftools-2.5/work/gperftools-2.5-abi_mips_n32.n32'
    • S: '/var/tmp/portage/dev-util/google-perftools-2.5/work/gperftools-2.5'
    opened by emtone 34
  • Compiling fails on snow leopard

    Originally reported on Google Code with ID 168

    What steps will reproduce the problem?
    1. Compiling on Snow Leopard
    What is the expected output? What do you see instead?
    A working version
    What version of the product are you using? On what operating system?
    Snow Leopard, Perftools 1.4
    Please provide any additional information below.
    Compiling fails with the following error:
    (cd .libs && rm -f && ln -s ../
    if /bin/sh ./libtool --tag=CXX --mode=compile g++ -DHAVE_CONFIG_H -I. -I.
    -I./src  -I./src   -Wall -Wwrite-strings -Woverloaded-virtual
    -Wno-sign-compare  -DNO_FRAME_POINTER -g -O2 -MT profiler.lo -MD -MP -MF
    ".deps/profiler.Tpo" -c -o profiler.lo `test -f 'src/' || echo
    './'`src/; \
        then mv -f ".deps/profiler.Tpo" ".deps/profiler.Plo"; else rm -f
    ".deps/profiler.Tpo"; exit 1; fi
     g++ -DHAVE_CONFIG_H -I. -I. -I./src -I./src -Wall -Wwrite-strings
    -Woverloaded-virtual -Wno-sign-compare -DNO_FRAME_POINTER -g -O2 -MT
    profiler.lo -MD -MP -MF .deps/profiler.Tpo -c src/  -fno-common
    -DPIC -o .libs/profiler.o
    In file included from src/
    src/getpc.h:178: error: expected ‘,’ or ‘...’ before ‘&’ token
    src/getpc.h:178: error: ISO C++ forbids declaration of ‘ucontext_t’ with no
    src/getpc.h: In function ‘void* GetPC(int)’:
    src/getpc.h:179: error: ‘signal_ucontext’ was not declared in this scope
    src/ At global scope:
    src/ error: conflicting declaration ‘typedef int ucontext_t’
    /usr/include/sys/_structs.h:227: error: ‘ucontext_t’ has a previous
    declaration as ‘typedef struct __darwin_ucontext ucontext_t’
    In file included from src/base/commandlineflags.h:55,
                     from src/
    ./src/base/basictypes.h: In constructor
    ‘AssignAttributeStartEnd::AssignAttributeStartEnd(const char*, char**,
    ./src/base/basictypes.h:251: warning: ‘_dyld_present’ is deprecated
    (declared at /usr/include/mach-o/dyld.h:237)
    ./src/base/basictypes.h:251: warning: ‘_dyld_present’ is deprecated
    (declared at /usr/include/mach-o/dyld.h:237)
    src/ In static member function ‘static void
    CpuProfiler::prof_handler(int, siginfo_t*, void*, void*)’:
    src/ error: cannot convert ‘__darwin_ucontext’ to ‘int’ for
    argument ‘1’ to ‘void* GetPC(int)’
    make: *** [profiler.lo] Error 1

    Reported by sebastian.hillig on 2009-09-16 09:07:56

    Type-Defect Priority-Medium Status-Fixed 
    opened by alk 33
  • failed to build due to std::random_shuffle removed from new C++ standard

    build error when using -std=c++17 or -std=c++20: benchmark/ error: no member named 'random_shuffle' in namespace 'std'

    reason: std::random_shuffle is removed in the new C++ standard.

    solution: as the example here:
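
    The linked example is not reproduced here. As a minimal sketch of the usual replacement, assuming the benchmark only needs a repeatable permutation, std::shuffle with an explicit random engine can stand in for std::random_shuffle. The helper name shuffled_indices and its seed parameter are illustrative and do not come from the gperftools sources.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Returns the values 0..n-1 in a pseudo-random order.
// Unlike the removed std::random_shuffle, std::shuffle takes the random
// engine as an explicit argument.
std::vector<int> shuffled_indices(std::size_t n, unsigned seed) {
  std::vector<int> v(n);
  std::iota(v.begin(), v.end(), 0);          // fill with 0, 1, ..., n-1
  std::mt19937 engine(seed);                 // fixed seed keeps runs repeatable
  std::shuffle(v.begin(), v.end(), engine);  // available since C++11
  return v;
}
```

    Because std::shuffle has existed since C++11, this form should build under the older standards as well as under -std=c++17 and -std=c++20.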

    opened by akofer 0
  • pprof: request: exclude allocations that have since been freed

    So, in pprof there's a --base argument to exclude allocations that were not made until after a certain point.

    I'd like to request a similar argument that allows us to exclude allocations that were not live after a given point.

    For example, if I have 3 heap files: 0001.heap, 0002.heap and 0003.heap, I only want to show the memory that was allocated when 0002.heap was dumped, and specifically exclude those allocations that were freed before 0003.heap was allocated.

    opened by Spongman 0
  • How to release all freelists back to system?

    Invoking ReleaseFreeMemory seems to only release the "page heap freelist". How can I release the other freelists?

    MALLOC:         784216 (    0.7 MiB) Bytes in use by application
    MALLOC: +        49152 (    0.0 MiB) Bytes in page heap freelist
    MALLOC: +       465456 (    0.4 MiB) Bytes in central cache freelist
    MALLOC: +       144640 (    0.1 MiB) Bytes in transfer cache freelist
    MALLOC: +      1169784 (    1.1 MiB) Bytes in thread cache freelists
    MALLOC: +       409600 (    0.4 MiB) Bytes in malloc metadata
    MALLOC:   ------------
    MALLOC: =      3022848 (    2.9 MiB) Actual memory used (physical + swap)
    MALLOC: +       532480 (    0.5 MiB) Bytes released to OS (aka unmapped)
    MALLOC:   ------------
    MALLOC: =      3555328 (    3.4 MiB) Virtual address space used
    MALLOC:            391              Spans in use
    MALLOC:              6              Thread heaps in use
    MALLOC:           4096              Tcmalloc page size
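
    To make the question concrete, the following sketch simply redoes the arithmetic of the report above. The struct and function names are hypothetical; the byte counts are copied from the quoted output. It shows how much memory sits in the freelists other than the page heap freelist, which is the part the reporter says ReleaseFreeMemory does not touch.

```cpp
#include <cassert>
#include <cstdint>

// Byte counts copied from the "MALLOC:" report quoted above.
struct HeapStats {
  std::uint64_t in_use_by_app = 784216;
  std::uint64_t page_heap_free = 49152;
  std::uint64_t central_cache_free = 465456;
  std::uint64_t transfer_cache_free = 144640;
  std::uint64_t thread_cache_free = 1169784;
  std::uint64_t metadata = 409600;
  std::uint64_t released_to_os = 532480;
};

// Sum behind the "Actual memory used (physical + swap)" line.
std::uint64_t actual_memory_used(const HeapStats& s) {
  return s.in_use_by_app + s.page_heap_free + s.central_cache_free +
         s.transfer_cache_free + s.thread_cache_free + s.metadata;
}

// Sum behind the "Virtual address space used" line.
std::uint64_t virtual_address_space_used(const HeapStats& s) {
  return actual_memory_used(s) + s.released_to_os;
}

// Free bytes held in the central, transfer and thread caches,
// i.e. every freelist except the page heap freelist.
std::uint64_t free_outside_page_heap(const HeapStats& s) {
  return s.central_cache_free + s.transfer_cache_free + s.thread_cache_free;
}
```

    With these numbers the three cache freelists hold 1779880 bytes (about 1.7 MiB), compared with 49152 bytes in the page heap freelist.
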
    opened by infn-ke 0
  • Page size 4 kB - how to set?

    How do I set a page size of 4 kB? I can see there are at least two different build switches that I believe refer to the same thing.

    There is a parameter called TCMALLOC_PAGE_SIZE_SHIFT that seems to allow me to change to 4 kB (shift of 12)

    The top cmake recipe has a parameter called gperftools_tcmalloc_pagesize but that only allows setting a page size down to 8 kB.

    What is the correct way to set 4 kB page size?
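
    The relation between a "shift" and a page size that the question relies on can be sketched as below, assuming the shift is the base-2 exponent of the page size in bytes (which is what "shift of 12" for 4 kB implies). The function names are made up for this illustration, and the sketch does not say which build switch is the right one.

```cpp
#include <cassert>
#include <cstddef>

// Page size in bytes for a given shift: 1 << shift.
std::size_t page_size_for_shift(unsigned shift) {
  return static_cast<std::size_t>(1) << shift;
}

// Inverse mapping for a page size given in KiB.
// Returns -1 when the size is zero or not a power of two.
int shift_for_page_kib(std::size_t kib) {
  if (kib == 0 || (kib & (kib - 1)) != 0) return -1;
  int shift = 10;  // 1 KiB == 1 << 10 bytes
  while (kib > 1) {
    kib >>= 1;
    ++shift;
  }
  return shift;
}
```

    Under that assumption a 4 kB page corresponds to a shift of 12 and an 8 kB page to a shift of 13.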

    opened by infn-ke 1
  • Any idea how to use with php?

    Anyone used this software for php? If so how? I compile release version 2.10 using:

    make -j9
    make install

    I put this in php.ini:


    And then I run php and get:

    Unable to load dynamic library ''

    And all around google it seems that no one ever discussed this error on any forums! What does the compile script do? What does it install? How to use this software?

    opened by aario-k24 0
  • gperftools-2.10(May 31, 2022)

    30 May 2022

    gperftools 2.10 is out!

    Here are notable changes:

    • Matt T. Proud contributed documentation fix to call Go programming language by its true name instead of golang.
    • Robert Scott contributed debugallocator feature to use readable (PROT_READ) fence pages. This is activated by TCMALLOC_PAGE_FENCE_READABLE environment variable.
    • User stdpain contributed fix for cmake detection of libunwind.
    • Natale Patriciello contributed fix for OSX Monterey support.
    • Volodymyr Nikolaichuk contributed support for returning memory back to OS by using mmap with MAP_FIXED and PROT_NONE. It is off by default and enabled by preprocessor define: FREE_MMAP_PROT_NONE. This should help OSes that don't support Linux-style madvise MADV_DONTNEED or BSD-style MADV_FREE.
    • Jingyun Hua has contributed basic support for LoongArch.
    • Github issue #1338 of failing to build on some recent musl versions has been fixed.
    • Github issue #1321 of failing to ship cmake bits with .tar.gz archive has been fixed.
    Source code(tar.gz)
    Source code(zip)
    gperftools-2.10.tar.gz (1.54 MB)
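
    The MAP_FIXED plus PROT_NONE approach mentioned in the notes above can be sketched as follows. This illustrates the general POSIX technique only and is not the code behind FREE_MMAP_PROT_NONE; the function names map_region, release_by_remap and recommit are invented for the example.

```cpp
#include <sys/mman.h>

#include <cassert>
#include <cstddef>
#include <cstring>

// Maps `len` bytes of anonymous read/write memory; nullptr on failure.
void* map_region(std::size_t len) {
  void* p = mmap(nullptr, len, PROT_READ | PROT_WRITE,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  return p == MAP_FAILED ? nullptr : p;
}

// Replaces [addr, addr + len) with a fresh, inaccessible anonymous mapping
// at the same address. The old pages are dropped, so the OS can reclaim
// them, while the address range itself stays reserved.
bool release_by_remap(void* addr, std::size_t len) {
  void* p = mmap(addr, len, PROT_NONE,
                 MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
  return p == addr;
}

// Makes the range usable again; its pages now read as zero.
bool recommit(void* addr, std::size_t len) {
  return mprotect(addr, len, PROT_READ | PROT_WRITE) == 0;
}
```

    Both addr and len have to be page-aligned here. Unlike madvise-based release, the range must be made accessible again (as recommit does) before it is reused.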
  • gperftools-2.9.1(Mar 3, 2021)

  • gperftools-2.9(Feb 21, 2021)

    Few more changes landed compared to rc:

    • Venkatesh Srinivas has contributed thread-safety annotations support.
    • couple more unit test bugs that caused tcmalloc_unittest to fail on recent clang have been fixed.
    • usage of unsupportable linux_syscall_support.h has been removed from few places. Building with --disable-heap-checker now completely avoids it. Expect complete death of this header in next major release.
    Source code(tar.gz)
    Source code(zip)
    gperftools-2.9.0.tar.gz (1.50 MB)
  • gperftools-2.8.90(Feb 15, 2021)

    gperftools 2.9rc is out!

    Here are notable changes:

    • Jarno Rajahalme has contributed a fix for a crashing bug in syscalls support for aarch64.
    • User SSE4 has contributed basic support for Elbrus 2000 architecture (!)
    • Venkatesh Srinivas has contributed cleanup to atomic ops.
    • Đoàn Trần Công Danh has fixed cpu profiler compilation on musl.
    • there is now better backtracing support for aarch64 and riscv. x86-64 with frame pointers now also defaults to this new "generic" frame pointer backtracer.
    • emergency malloc is now enabled by default. Fixes hang on musl when libgcc backtracer is enabled.
    • bunch of legacy config tests have been removed
    Source code(tar.gz)
    Source code(zip)
    gperftools-2.8.90.tar.gz (1.50 MB)
  • gperftools-2.8.1(Dec 21, 2020)

    gperftools-2.8.1 is out!

    Here are notable changes:

    • previous release contained change to release memory without page heap lock, but this change had at least one bug that caused crashes and corruption when running under aggressive decommit mode (this is not default). While we check for other bugs, this feature was reverted. See github issue #1204 and issue #1227.

    • stack traces depth captured by gperftools is now up to 254 levels deep. Thanks to Kerrick Staley for this small but useful tweak.

    • Levon Ter-Grigoryan has contributed small fix for compiler warning.

    • Grant Henke has contributed updated detection of program counter register for OS X on arm64.

    • Tim Gates has contributed small typo fix.

    • Steve Langasek has contributed basic build fixes for riscv64.

    • Isaac Hier and okhowang have contributed preliminary port of build infrastructure to cmake. This works, but it is very preliminary. Autotools-based build is the only officially supported build for now.

    Source code(tar.gz)
    Source code(zip)
    gperftools-2.8.1.tar.gz (1.52 MB)
  • gperftools-2.8(Jul 6, 2020)

    gperftools 2.8 is out!

    Here are notable changes:

    • ProfilerGetStackTrace is now officially supported API for libprofiler. Contributed by Kirill Müller.

    • Build failures on mingw were fixed. This fixed issue #1108.

    • Build failure of page_heap_test on MSVC was fixed.

    • Ryan Macnak contributed fix for compiling linux syscall support on i386 and recent GCCs. This fixed issue #1076.

    • test failures caused by new gcc 10 optimizations were fixed. Same change also fixed tests on clang.

    Source code(tar.gz)
    Source code(zip)
    gperftools-2.8.tar.gz (1.52 MB)
  • gperftools-2.7.90(Mar 9, 2020)

    gperftools 2.8rc is out!

    Here are notable changes:

    • building code now requires c++11 or later. Bundled MSVC project was converted to Visual Studio 2015.
    • User obones contributed fix for windows x64 TLS callbacks. This fixed leak of thread caches on thread exits in 64-bit windows.
    • releasing memory back to kernel is now made with page heap lock dropped.
    • HolyWu contributed fix for correct malloc patching on debug builds on windows. This configuration previously crashed.
    • Romain Geissler contributed fix for tls access during early tls initialization on dlopen.
    • large allocation reports are now silenced by default. Since not all programs want their stderr polluted by those messages. Contributed by Junhao Li.
    • HolyWu contributed improvements to MSVC project files. Notably, there is now project for "overriding" version of tcmalloc.
    • MS-specific _recalloc is now correctly zeroing only malloced part. This fix was contributed by HolyWu.
    • Brian Silverman contributed correctness fix to sampler_test.
    • Gabriel Marin ported few fixes from chromium's fork. As part of those fixes, we reduced number of static initializers (forbidden in chromium). Also we now make syscalls via the syscall function instead of reimplementing a direct way to make syscalls on each platform.
    • Brian Silverman fixed flakiness in page heap test.
    • There is now configure flag to skip installing perl pprof, since external golang pprof is much superior. --disable-deprecated-pprof is the flag.
    • Fabrice Fontaine contributed fixes to drop use of nonstandard __off64_t type.
    • Fabrice Fontaine contributed build fix to check for presence of nonstandard __sbrk functions. It is only used by mmap hooks code and (rightfully) not available on musl.
    • Fabrice Fontaine contributed build fix around mmap64 macro and function conflict in some cases.
    • there is now configure time option to enable aggressive decommit by default. Contributed by Laurent Stacul. --enable-aggressive-decommit-by-default is the flag.
    • Tulio Magno Quites Machado Filho contributed build fixes for ppc around ucontext access.
    • User pkubaj contributed couple build fixes for FreeBSD/ppc.
    • configure now always assumes we have mmap. This fixes configure failures on some linux guests inside virtualbox. This fixed issue #1008.
    • User shipujin contributed syscall support fixes for mips64 (big and little endian).
    • Henrik Edin contributed configurable support for wide range of malloc page sizes. 4K, 8K, 16K, 32K, 64K, 128K and 256K are now supported via existing --with-tcmalloc-pagesize flag to configure.
    • Jon Kohler added overheads fields to per-size-class textual stats. Stats that are available via MallocExtension::instance()->GetStats().
    • tcmalloc can now avoid fallback from memfs to default sys allocator. TCMALLOC_MEMFS_DISABLE_FALLBACK switches this on. This was contributed by Jon Kohler.
    • Ilya Leoshkevich fixed mmap syscall support on s390.
    • Todd Lipcon contributed small build warning fix.
    • User prehistoricpenguin contributed misc source file mode fixes (we still had a few c++ files marked executable).
    • User invalid_ms_user contributed fix for typo.
    • Jakub Wilk contributed typos fixes.
    Source code(tar.gz)
    Source code(zip)
    gperftools-2.7.90.tar.gz (1.51 MB)
  • gperftools-2.7(Apr 30, 2018)

    gperftools 2.7 is out!

    Few people contributed minor, but important fixes since rc.


    • bug in span stats printing introduced by new scalable page heap change was fixed.
    • Christoph Müllner has contributed couple warnings fixes and initial support for aarch64_ilp32 architecture.
    • Ben Dang contributed documentation fix for heap checker.
    • Fabrice Fontaine contributed fix for linking benchmarks with --disable-static.
    • Holy Wu has added sized deallocation unit tests.
    • Holy Wu has enabled support of sized deallocation (c++14) on recent MSVC.
    • Holy Wu has fixed MSVC build in WIN32_OVERRIDE_ALLOCATORS mode. This closed issue #716.
    • Holy Wu has contributed cleanup of config.h used on windows.
    • Mao Huang has contributed couple simple tcmalloc changes from chromium code base. Making our tcmalloc forks a tiny bit closer.
    • issue #946 that caused compilation failures on some Linux clang installations has been fixed. Much thanks to github user htuch for helping to diagnose issue and proposing a fix.
    • Tulio Magno Quites Machado Filho has contributed build-time fix for PPC (for problem introduced in one of commits since RC).
    Source code(tar.gz)
    Source code(zip)
    gperftools-2.7.tar.gz (1.45 MB)
  • gperftools-2.6.90(Mar 18, 2018)

    gperftools 2.7rc is out!


    • Most notable change in this release is that very large allocations (>1MiB) are now handled by an O(log n) implementation. This is contributed by Todd Lipcon based on earlier work by Aliaksei Kandratsenka and James Golick. Special thanks to Alexey Serbin for contributing OSX fix for that commit.

    • detection of sized deallocation support is improved. Which should fix another set of issues building on OSX. Much thanks to Alexey Serbin for reporting the issue, suggesting a fix and verifying it.

    • Todd Lipcon made a change to extend page heaps freelists to 1 MiB (up from 1MiB - 8KiB). This may help a little for some workloads.

    • Ishan Arora contributed typo fix to docs
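
    The notes do not describe how the O(log n) handling of very large allocations is implemented. As a rough sketch of one way to get logarithmic best-fit lookup, and not a description of gperftools' own data structure, free spans can be kept in an ordered set keyed by length. The class FreeSpanIndex and its methods are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <utility>

// Hypothetical index of free spans, ordered by (length in pages, start page).
class FreeSpanIndex {
 public:
  void Insert(std::uintptr_t start_page, std::size_t pages) {
    spans_.insert(std::make_pair(pages, start_page));
  }

  // Finds the smallest free span with at least `pages` pages, removes it
  // from the index and reports its start and length. Returns false if no
  // span is large enough. The lookup costs O(log n) in the number of spans.
  bool TakeBestFit(std::size_t pages, std::uintptr_t* start_page,
                   std::size_t* found_pages) {
    auto it = spans_.lower_bound(
        std::make_pair(pages, static_cast<std::uintptr_t>(0)));
    if (it == spans_.end()) return false;
    *found_pages = it->first;
    *start_page = it->second;
    spans_.erase(it);
    return true;
  }

  std::size_t size() const { return spans_.size(); }

 private:
  std::set<std::pair<std::size_t, std::uintptr_t>> spans_;
};
```

    A linear scan over a list of large free spans would find the same span, but its cost grows with the number of spans, which is the case an ordered structure avoids.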

    Source code(tar.gz)
    Source code(zip)
    gperftools-2.6.90.tar.gz (1.44 MB)
  • gperftools-2.6.3(Dec 9, 2017)

  • gperftools-2.6.2(Nov 30, 2017)

    gperftools 2.6.2 is out!

    Most notable change is recently added support for C++17 over-aligned allocation operators contributed by Andrey Semashev. I've extended his implementation to have roughly same performance as malloc/new. This release also has native support for C11 aligned_alloc.
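
    For readers unfamiliar with the two features, here is a small sketch of what they look like from the calling side. It assumes a C++17 compiler and standard library; the type CacheLinePadded and the helper names are illustrative and are not part of gperftools.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <memory>

// A type whose alignment exceeds the default operator new alignment.
struct alignas(64) CacheLinePadded {
  char data[64];
};

bool is_aligned(const void* p, std::size_t alignment) {
  return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// In C++17, allocating an over-aligned type goes through
// operator new(std::size_t, std::align_val_t).
bool overaligned_new_is_aligned() {
  auto obj = std::make_unique<CacheLinePadded>();
  return is_aligned(obj.get(), alignof(CacheLinePadded));
}

// C11 aligned_alloc (std::aligned_alloc in C++17): the size should be a
// multiple of the alignment; the block is released with free().
bool aligned_alloc_is_aligned() {
  void* p = std::aligned_alloc(64, 128);
  if (p == nullptr) return false;
  const bool ok = is_aligned(p, 64);
  std::free(p);
  return ok;
}
```

    When a malloc replacement is linked in, both calls are served by that allocator, so it has to honour the requested alignment.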

    Rest is mostly bug fixes:

    • Jianbo Yang has contributed a fix for potentially severe data race introduced by malloc fast-path work in gperftools 2.6. This race could cause occasional violation of total thread cache size constraint. See issue #929 for more details.

    • Correct behavior in out-of-memory condition in fast-path cases was restored. This was another bug introduced by fast-path optimization in gperftools 2.6 which caused operator new to silently return NULL instead of doing correct C++ OOM handling (calling new_handler and throwing bad_alloc).

    • Khem Raj has contributed couple build fixes for newer glibcs (ucontext_t vs struct ucontext and loff_t definition)

    • Piotr Sikora has contributed build fix for OSX (not building unwind benchmark). This was issue #910 (thanks to Yuriy Solovyov for reporting it).

    • Dorin Lazăr has contributed fix for compiler warning

    • issue #912 (occasional deadlocking calling getenv too early on windows) was fixed. Thanks to github user shangcangriluo for reporting it.

    • Couple earlier lsan-related commits still causing occasional issues linking on OSX have been reverted. See issue #901.

    • Volodimir Krylov has contributed GetProgramInvocationName for FreeBSD

    • changsu lee has contributed couple minor correctness fixes (missing va_end() and missing free() call in rarely executed Symbolize path)

    • Andrew C. Morrow has contributed some more page heap stats. See issue #935.

    • some cases of build-time warnings from various gcc/clang versions about throw() declarations have been fixed.
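
    The "correct C++ OOM handling" referred to in the list above (calling the new_handler and throwing bad_alloc) can be observed with a short sketch like the one below. It assumes an allocator that fails cleanly on an impossibly large request; the handler and function names are invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <new>

int g_handler_calls = 0;

// A new_handler with nothing to free: it records the call and uninstalls
// itself, so the retry inside operator new ends in std::bad_alloc.
void give_up_handler() {
  ++g_handler_calls;
  std::set_new_handler(nullptr);
}

// Requests an impossibly large block and reports whether operator new
// signalled the failure by throwing std::bad_alloc.
bool huge_allocation_throws() {
  std::set_new_handler(give_up_handler);
  try {
    void* p = ::operator new(std::numeric_limits<std::size_t>::max() / 2);
    ::operator delete(p);
    return false;
  } catch (const std::bad_alloc&) {
    return true;
  }
}
```

    A throwing operator new that returned NULL instead would make huge_allocation_throws() return false, which is the kind of silent failure the fix above describes.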

    Source code(tar.gz)
    Source code(zip)
    gperftools-2.6.2.tar.gz (1.43 MB)
  • gperftools-2.6.1(Jul 9, 2017)

    gperftools 2.6.1 is out! This is mostly bug-fixes release.

    • issue #901: build issue on OSX introduced in last-time commit in 2.6 was fixed (contributed by Francis Ricci).

    • tcmalloc_minimal now works on 32-bit ABI of mips64. This is issue #845. Much thanks to Adhemerval Zanella and github user mtone.

    • Romain Geissler contributed build fix for -std=c++17. This is pull request #897.

    • As part of fixing issue #904, tcmalloc atfork handler is now installed early. This should fix slight chance of hitting deadlocks at fork in some cases.

    Source code(tar.gz)
    Source code(zip)
    gperftools-2.6.1.tar.gz (1.43 MB)
  • gperftools-2.6(Jul 5, 2017)

    gperftools 2.6 is out! See NEWS entries of pre-releases for major new features.

    • Kim Gräsman contributed documentation update for HEAPPROFILESIGNAL environment variable

    • KernelMaker contributed fix for population of min_object_size field returned by MallocExtension::GetFreeListSizes

    • commit 8c3dc52fcfe0 "issue-654: [pprof] handle split text segments" was reverted. Some OSX users reported issues with this commit. Given our pprof implementation is strongly deprecated, it is best to drop recently introduced features rather than breaking it badly.

    • Francis Ricci contributed improvement for interaction with leak sanitizer

    Source code(tar.gz)
    Source code(zip)
    gperftools-2.6.tar.gz (1.42 MB)
  • gperftools-2.5.93(May 23, 2017)

  • gperftools-2.5.92(May 22, 2017)

  • gperftools-2.5.91(May 15, 2017)

  • gperftools-2.5.90(May 15, 2017)

    gperftools 2.6rc is out!

    Highlights of this release are performance work on malloc fast-path, support for more modern visual studio runtimes, and deprecation of bundled pprof. Other significant performance-affecting changes are reverting central free list transfer batch size back to 32 and disabling of aggressive decommit mode by default.

    Note, while we still ship perl implementation of pprof, everyone is strongly advised to use golang reimplementation of pprof from

    Here are notable changes in more details (and see ChangeLog for full details):

    • a bunch of performance tweaks to tcmalloc fast-path were merged. This speeds up critical path of tcmalloc by few tens of %. Well tuned and allocation-heavy programs should see substantial performance boost (should apply to all modern elf platforms). This is based on Google-internal tcmalloc changes for fast-path (with obvious exception of lacking per-cpu mode, of course). Original changes were made by Aliaksei Kandratsenka. And Andrew Hunter, Dmitry Vyukov and Sanjay Ghemawat contributed with reviews and discussions.

    • Architectures with 48 bits address space (x86-64 and aarch64) now use faster 2 level page map. This was ported from Google-internal change by Sanjay Ghemawat.

    • Default value of TCMALLOC_TRANSFER_NUM_OBJ was returned back to 32. Larger values have been found to hurt certain programs (but help some other benchmarks). Value can still be tweaked at run time via environment variable.

    • tcmalloc aggressive decommit mode is now disabled by default again. It was found to degrade performance of certain tensorflow benchmarks. Users who prefer smaller heap over small performance win can still set environment variable TCMALLOC_AGGRESSIVE_DECOMMIT=t.

    • runtime switchable sized delete support has been fixed and re-enabled (on GNU/Linux). Programs that use C++ 14 or later that use sized delete can again be sped up by setting environment variable TCMALLOC_ENABLE_SIZED_DELETE=t. Support for enabling sized deallocation support at compile-time is still present, of course.

    • tcmalloc now explicitly avoids use of MADV_FREE on Linux, unless TCMALLOC_USE_MADV_FREE is defined at compile time. This is because performance impact of MADV_FREE is not well known. Original issue #780 raised by Mathias Stearn.

    • issue #786 with occasional deadlocks in stack trace capturing via libunwind was fixed. It was originally reported as Ceph issue:

    • ChangeLog is now automatically generated from git log. Old ChangeLog is now ChangeLog.old.

    • tcmalloc now provides implementation of nallocx. Function was originally introduced by jemalloc and can be used to return real allocation size given allocation request size. This is ported from Google-internal tcmalloc change contributed by Dmitry Vyukov.

    • issue #843 which made tcmalloc crash when used with erlang runtime was fixed.

    • issue #839 which caused tcmalloc's aggressive decommit mode to degrade performance in some corner cases was fixed.

    • Bryan Chan contributed support for 31-bit s390.

    • Brian Silverman contributed compilation fix for 32-bit ARMs

    • Issue #817 that was causing tcmalloc to fail on windows 10 and later, as well as on recent msvc was fixed. We now patch _free_base as well.

    • a bunch of minor documentation/typo fixes by: Mike Gaffney, iivlev, savefromgoogle, John McDole, zmertens, Kirill Müller, Eugene, Ola Olsson, Mostyn Bramley-Moore

    • Tulio Magno Quites Machado Filho has contributed removal of deprecated glibc malloc hooks.

    • Issue #827 that caused intercepting malloc on osx 10.12 to fail was fixed, by copying fix made by Mike Hommey to jemalloc. Much thanks to Koichi Shiraishi and David Ribeiro Alves for reporting it and testing fix.

    • Aman Gupta and Kenton Varda contributed minor fixes to pprof (but note again that pprof is deprecated)

    • Ryan Macnak contributed compilation fix for aarch64

    • Francis Ricci has fixed unaligned memory access in debug allocator

    • TCMALLOC_PAGE_FENCE_NEVER_RECLAIM now actually works thanks to contribution by Andrew Morrow.

    Source code(tar.gz)
    Source code(zip)
    gperftools-2.5.90.tar.gz (1.42 MB)
  • gperftools-2.5(Mar 12, 2016)

  • gperftools-2.4.91(Mar 6, 2016)

  • gperftools-2.4.90(Feb 22, 2016)

    gperftools 2.5rc is out!

    Here are major changes since 2.4:

    • we've moved to github!
    • Bryan Chan has contributed s390x support
    • stacktrace capturing via libgcc's _Unwind_Backtrace was implemented (for architectures with missing or broken libunwind).
    • "emergency malloc" was implemented. Which unbreaks recursive calls to malloc/free from stacktrace capturing functions (such as glibc's backtrace() or libunwind on arm). It is enabled by --enable-emergency-malloc configure flag or by default on arm when --enable-stacktrace-via-backtrace is given. It is another fix for a number of common issues people had on platforms with missing or broken libunwind.
    • C++14 sized-deallocation is now supported (on gcc 5 and recent clangs). It is off by default and can be enabled at configure time via --enable-sized-delete. On GNU/Linux it can also be enabled at run-time by either TCMALLOC_ENABLE_SIZED_DELETE environment variable or by defining tcmalloc_sized_delete_enabled function which should return 1 to enable it.
    • we've lowered default value of transfer batch size to 512. Previous value (bumped up in 2.1) was too high and caused performance regression for some users. 512 should still give us performance boost for workloads that need higher transfer batch size while not penalizing other workloads too much.
    • Brian Silverman's patch finally stopped arming profiling timer unless profiling is started.
    • Andrew Morrow has contributed support for obtaining cache size of the current thread and softer idling (for use in MongoDB).
    • we've implemented few minor performance improvements, particularly on malloc fast-path.
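
    For context on the sized-deallocation item above: with sized delete, the deallocation function receives the size of the object being freed. The sketch below shows that with a class-level operator delete, which works without any special compiler flag. The release notes are about the global sized operator delete that tcmalloc provides, so this illustrates the idea rather than the tcmalloc feature itself; the type Tracked and the global counter are invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

std::size_t g_last_deleted_size = 0;

struct Tracked {
  double payload[4];

  static void* operator new(std::size_t n) {
    void* p = std::malloc(n);
    if (p == nullptr) throw std::bad_alloc();
    return p;
  }

  // Sized form: the compiler passes the size of the object being deleted,
  // so the allocator does not have to look it up from the pointer.
  static void operator delete(void* p, std::size_t n) noexcept {
    g_last_deleted_size = n;
    std::free(p);
  }
};

// Allocates and deletes one object; returns the size seen by operator delete.
std::size_t delete_and_report_size() {
  Tracked* t = new Tracked;
  delete t;
  return g_last_deleted_size;
}
```

    Here delete_and_report_size() returns sizeof(Tracked), the value an allocator could use to pick the right size class directly.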

    A number of smaller fixes were made. Many of them were contributed:

    • issue that caused spurious failures was fixed.
    • Jonathan Lambrechts contributed improved callgrind format support to pprof.
    • Matt Cross contributed better support for debug symbols in separate files to pprof.
    • Matt Cross contributed support for printing collapsed stack frame from pprof aimed at producing flame graphs.
    • Angus Gratton has contributed documentation fix mentioning that on windows only tcmalloc_minimal is supported.
    • Anton Samokhvalov has made tcmalloc use mi_force_{un,}lock on OSX instead of pthread_atfork. Which apparently fixes forking issues tcmalloc had on OSX.
    • Milton Chiang has contributed support for building 32-bit gperftools on arm8.
    • Patrick LoPresti has contributed support for specifying alternative profiling signal via CPUPROFILE_TIMER_SIGNAL environment variable.
    • Paolo Bonzini has contributed support configuring filename for sending malloc tracing output via TCMALLOC_TRACE_FILE environment variable.
    • user spotrh has enabled use of futex on arm.
    • user mitchblank has contributed better declaration for arg-less profiler functions.
    • Tom Conerly contributed proper freeing of memory allocated in HeapProfileTable::FillOrderedProfile on error paths.
    • user fdeweerdt has contributed curl arguments handling fix in pprof
    • Fredrik Mellbin fixed tcmalloc's idea of mangled new and delete symbols on windows x64
    • Dair Grant has contributed cacheline alignment for ThreadCache objects
    • Fredrik Mellbin has contributed updated windows/config.h for Visual Studio 2015 and other windows fixes.
    • we're not linking libpthread to libtcmalloc_minimal anymore. Instead libtcmalloc_minimal links to pthread symbols weakly. As a result single-threaded programs remain single-threaded when linking to or preloading
    • Boris Sazonov has contributed mips compilation fix and a fix for printf misuse in pprof.
    • Adhemerval Zanella has contributed alignment fixes for statically allocated variables.
    • Jens Rosenboom has contributed fixes for
    • gshirishfree has contributed better description for GetStats method.
    • cyshi has contributed spinlock pause fix.
    • Chris Mayo has contributed --docdir argument support for configure.
    • Duncan Sands has contributed fix for function aliases.
    • Simon Que contributed better include for malloc_hook_c.h
    • user wmamrak contributed struct timespec fix for Visual Studio 2015.
    • user ssubotin contributed a typo fix in PrintAvailability code.
  • gperftools-2.4(Aug 15, 2015)

  • gperftools-2.3.90(Aug 15, 2015)

    gperftools 2.4rc is out!

    Here are changes since 2.3:

    • enabled aggressive decommit option by default. It was found to significantly reduce memory fragmentation with negligible impact on performance. (Thanks to investigation work performed by Adhemerval Zanella)
    • added ./configure flags for tcmalloc pagesize and tcmalloc allocation alignment. Larger page sizes have been reported to improve performance occasionally. (Patch by Raphael Moreira Zinsly)
    • sped up the hot path of malloc/free by about 5% on the static library and about 10% on the shared library, mainly due to more efficient checking of malloc hooks.
    • improved accuracy of stacktrace capturing in cpu profiler (due to issue found by Arun Sharma). As part of that issue pprof's handling of cpu profiles was also improved.
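    The notes above do not name the new configure flags. The sketch below assumes they are spelled --with-tcmalloc-pagesize and --with-tcmalloc-alignment, with the page size given in KiB and the alignment in bytes; ./configure --help shows the exact names and allowed values for a given release. The helper name is made up for this example.

```shell
# Hypothetical helper (not part of gperftools): configure the source tree in
# the current directory with a given tcmalloc page size and alignment.
# Assumptions: flag spellings as below, page size in KiB, alignment in bytes.
configure_tcmalloc() {
    ./configure --with-tcmalloc-pagesize="$1" --with-tcmalloc-alignment="$2"
}

# Example, from the top of an unpacked gperftools tree:
#   configure_tcmalloc 32 16 && make
```

    Keeping both values as arguments makes it easy to rebuild with several page sizes and compare them, since the reported gains are only occasional.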
  • gperftools-2.3(Aug 15, 2015)

    gperftools 2.3 is out!

    Here are changes since 2.3rc:

    • ( issue 658 ) correctly close socketpair fds on failure (patch by glider)
    • libunwind integration can be disabled at configure time (patch by Raphael Moreira Zinsly)
    • libunwind integration is disabled by default for ppc64 (patch by Raphael Moreira Zinsly)
    • libunwind integration is force-disabled for OSX. It was not used by default anyway. This fixes a compilation issue I saw.
  • gperftools-2.2.90(Aug 15, 2015)

    gperftools 2.3rc is out!

    Most small improvements in this release were made to the pprof tool.

    The new experimental, Linux-only (for now) cpu profiling mode is a notable big improvement.

    Here are notable changes since 2.2.1:

    • ( issue 631 ) fixed debugallocation miscompilation on mmap-less platforms (courtesy of user iamxujian)
    • ( issue 630 ) reference to wrong PROFILE (vs. correct CPUPROFILE) environment variable was fixed (courtesy of WenSheng He)
    • pprof now has option to display stack traces in output for heap checker (courtesy of Michael Pasieka)
    • ( issue 636 ) pprof web command now works on mingw
    • ( issue 635 ) pprof now handles library paths that contain spaces (courtesy of user [email protected])
    • ( issue 637 ) pprof now has an option to not strip template arguments (patch by jiakai)
    • ( issue 644 ) possible out-of-bounds access in GetenvBeforeMain was fixed (thanks to user abyss.7)
    • ( issue 641 ) pprof now has an option --show_addresses (thanks to user yurivict). New option prints instruction address in addition to function name in stack traces
    • ( issue 646 ) pprof now works around some addr2line issues reported when the DWARF v4 format is used (patch by Adam McNeeney)
    • ( issue 645 ) heap profiler exit message now includes remaining memory allocated info (patch by user yurivict)
    • pprof code that finds location of /proc/pid/maps in cpu profile files is now fixed (patch by Ricardo M. Correia)
    • ( issue 654 ) pprof now handles "split text segments" feature of Chromium for Android (patch by simonb)
    • ( issue 655 ) potential deadlock on windows caused by early call to getenv in malloc initialization code was fixed (bug reported and fix proposed by user zndmitry)
    • incorrect detection of arm 6zk instruction set support (-mcpu=arm1176jzf-s) was fixed. (Reported by pedronavf on old issue-493)
    • new cpu profiling mode on Linux is now implemented. It sets up separate profiling timers for separate threads, which improves accuracy of profiling on Linux a lot. It is off by default and is enabled only if both librt is loaded and the CPUPROFILE_PER_THREAD_TIMERS environment variable is set. But note that all threads need to be registered via ProfilerRegisterThread.
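    A minimal sketch of requesting the per-thread timer mode from the shell, assuming the program is linked with the cpu profiler and already calls ProfilerRegisterThread in every thread. The helper name is hypothetical, and the value 1 for CPUPROFILE_PER_THREAD_TIMERS is an assumption. Arranging for librt to be loaded is left to the build.

```shell
# Hypothetical helper (not part of gperftools): run a command with the cpu
# profile written to file $1 and the per-thread timer mode requested.
# Assumption: setting CPUPROFILE_PER_THREAD_TIMERS to 1 is enough to enable it.
profile_per_thread() (
    out=$1
    shift
    # env applies the variables to this one command only
    env CPUPROFILE="$out" CPUPROFILE_PER_THREAD_TIMERS=1 "$@"
)

# Example (./myapp is a placeholder for your own binary):
#   profile_per_thread /tmp/myapp.prof ./myapp
```

    Because the mode is off by default, scoping the variable to a single command this way avoids changing the behavior of other profiled programs started from the same shell.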
  • gperftools-2.2.1(Aug 15, 2015)

  • gperftools-2.2(Aug 15, 2015)

  • gperftools-2.1.90(Aug 15, 2015)

    gperftools 2.2rc is out!

    Here are notable changes since 2.1:

    • a number of fixes for a number of compilers and platforms. Notably Visual Studio 2013, recent mingw with c++ threads, and some OSX fixes.
    • we now have mips and mips64 support! (courtesy of Jovan Zelincevic, Jean Lee, user xiaoyur347 and others)
    • we now have aarch64 (aka arm64) support! (contributed by Riku Voipio)
    • there's now support for ppc64-le (by Raphael Moreira Zinsly and Adhemerval Zanella)
    • there's now some support of uclibc (contributed by user xiaoyur347)
    • google/ headers will now give you a deprecation warning. They have been deprecated since 2.0
    • there's now a new api: tc_malloc_skip_new_handler (ported from chromium fork)
    • issue 557 : added support for dumping heap profile via signal (by Jean Lee)
    • issue 567 : Petr Hosek contributed SysAllocator support for windows
    • Joonsoo Kim contributed several speedups for central freelist code
    • TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES environment variable now works
    • configure scripts are now using AM_MAINTAINER_MODE. It'll only affect folks who modify source from .tar.gz and want automake to automatically rebuild Makefile-s. See automake documentation for that.
    • issue 586 : detect main executable even if PIE is active (based on patch by user themastermind1). Notably, it fixes profiler use with ruby.
    • there is now support for switching backtrace capturing method at runtime (via TCMALLOC_STACKTRACE_METHOD and TCMALLOC_STACKTRACE_METHOD_VERBOSE environment variables)
    • there is new backtrace capturing method using -finstrument-functions prologues contributed by user xiaoyur347
    • a few cases of crashes/deadlocks in profiler were addressed. See (famous) issue-66, issue 547 and issue 579 .
    • issue 464 (memory corruption in debugalloc's realloc after memalign) is now fixed
    • tcmalloc is now able to release memory back to OS on windows ( issue 489 ). The code was ported from chromium fork (by a number of authors).
    • Together with issue 489 we ported chromium's "aggressive decommit" mode. In this mode (settable via malloc extension and via environment variable TCMALLOC_AGGRESSIVE_DECOMMIT), free pages are returned back to OS immediately.
    • MallocExtension::instance() is now faster (based on patch by Adhemerval Zanella)
    • issue 610 (hangs on windows in multibyte locales) is now fixed
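    Two of the environment variables named in this list can be combined on one command line. The sketch below uses a hypothetical helper. The value 1 for TCMALLOC_AGGRESSIVE_DECOMMIT is an assumption about the accepted spelling of a boolean setting.

```shell
# Hypothetical helper (not part of gperftools): run a command with tcmalloc's
# aggressive decommit mode requested and a cap, in bytes ($1), on the total
# size of all thread caches.
# Assumption: 1 is an accepted "true" value for TCMALLOC_AGGRESSIVE_DECOMMIT.
run_with_tcmalloc_tuning() (
    cache_bytes=$1
    shift
    env TCMALLOC_AGGRESSIVE_DECOMMIT=1 \
        TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES="$cache_bytes" \
        "$@"
)

# Example (./myapp is a placeholder for a binary linked with -ltcmalloc):
#   run_with_tcmalloc_tuning 33554432 ./myapp
```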

    The following people helped with ideas or patches (based on git log, some contributions purely in bugtracker might be missing): Andrew C. Morrow, yurivict, Wang YanQing, Thomas Klausner, [email protected], Dai MIKURUBE, Joon-Sung Um, Jovan Zelincevic, Jean Lee, Petr Hosek, Ben Avison, drussel, Joonsoo Kim, Hannes Weisbach, xiaoyur347, Riku Voipio, Adhemerval Zanella, Raphael Moreira Zinsly

  • gperftools-2.1(Aug 15, 2015)
