Lightweight profiler library for C++

Overview

easy_profiler 2.1.0 2.x.x

License: MIT License

  1. About
  2. Key features
  3. Usage
  4. Build
  5. Notes about major release (1.0 -> 2.0)
  6. License

About

Lightweight cross-platform profiler library for C++

You can profile any function in your code. The library also measures the duration of any block of code. Memory usage is modest: for example, data for 12 million blocks takes less than 300 MB. With the profiler enabled, your application runs only about 1-2% slower.

Average overhead is about 15 ns per block (tested on an Intel Core i7-5930K 3.5 GHz, Windows 7).
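
As a rough back-of-the-envelope check, the figures quoted above can be turned into a per-block budget. The sketch below is only arithmetic on those numbers; the helper names are hypothetical and it assumes decimal megabytes, so treat the results as order-of-magnitude estimates rather than guarantees.

```cpp
// Figures quoted in this README; treat them as rough bounds, not guarantees.
constexpr double kBlocks      = 12e6;   // 12 million blocks
constexpr double kMemoryBytes = 300e6;  // "less than 300 MB" (decimal megabytes assumed)
constexpr double kOverheadNs  = 15.0;   // average overhead per block, in nanoseconds

// Approximate upper bound on the memory used per stored block, in bytes.
constexpr double bytes_per_block() {
    return kMemoryBytes / kBlocks;
}

// Approximate time spent inside the profiler for `blocks` blocks, in milliseconds.
constexpr double overhead_ms(double blocks) {
    return blocks * kOverheadNs / 1e6;
}
```

With these inputs, bytes_per_block() evaluates to 25 and overhead_ms(1e6) to 15, i.e. roughly 25 bytes and 15 ns of overhead per recorded block.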

A disabled profiler does not affect your application's execution in any way. You can leave it in your Release build and enable it at run time, at any moment while the application is running, to see what is happening right then.

The library can also capture the system's thread context-switch events. Context-switch information includes the duration, target thread id, thread owner process id, and thread owner process name.

You can view the measurement results in a simple GUI application that provides full statistics and renders a timeline.

[GUI screenshot: profiling the CryEngine SDK example]

[GUI screenshot: new UI style in version 2.0]

Key features

  • Extremely low overhead
  • Low additional memory usage
  • Cross-platform
  • Profiling over network
  • Capture thread context-switch events
  • Store user variables (both single values and arrays)
  • GUI can be connected to an application which is already profiling (so you can profile initialization of your application)
  • Monitor main thread fps in real time in the GUI even if profiling is disabled, or draw your own HUD/fps-plot directly in your application using data provided by the profiler
  • Save a snapshot (selected area) of profiled data from a file
  • Add bookmarks at any place on the timeline
  • Configurable timer type with CMakeLists or preprocessor macros

Usage

Integration

General

First, specify the path to the include directory (which contains the include/profiler directory) and define the macro BUILD_WITH_EASY_PROFILER. To link with easy_profiler, specify the path to the library.

If using CMake

If you are using CMake, set CMAKE_PREFIX_PATH to the lib/cmake/easy_profiler directory (from the release package) and use find_package(easy_profiler) together with target_link_libraries(... easy_profiler).

Example:

project(my_application)

set(SOURCES
    main.cpp
)

# CMAKE_PREFIX_PATH should be set to <easy_profiler-release_dir>/lib/cmake/easy_profiler
find_package(easy_profiler REQUIRED)  # STEP 1 #########################

add_executable(my_application ${SOURCES})

target_link_libraries(my_application easy_profiler)  # STEP 2 ##########

Inserting blocks

Example of usage.

#include <easy/profiler.h>

void foo() {
    EASY_FUNCTION(profiler::colors::Magenta); // Magenta block with name "foo"

    EASY_BLOCK("Calculating sum"); // Begin block with default color == Amber100
    int sum = 0;
    for (int i = 0; i < 10; ++i) {
        EASY_BLOCK("Addition", profiler::colors::Red); // Scoped red block (no EASY_END_BLOCK needed)
        sum += i;
    }
    EASY_END_BLOCK; // End of "Calculating sum" block

    EASY_BLOCK("Calculating multiplication", profiler::colors::Blue500); // Blue block
    int mul = 1;
    for (int i = 1; i < 11; ++i)
        mul *= i;
    //EASY_END_BLOCK; // Not needed: open blocks are ended by their destructors when the enclosing scope closes
}

void bar() {
    EASY_FUNCTION(0xfff080aa); // Function block with custom ARGB color
}

void baz() {
    EASY_FUNCTION(); // Function block with default color == Amber100
}
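
The scoping rule used in foo() (a block ends when its enclosing braces close, unless it is ended explicitly first) is ordinary RAII. The sketch below illustrates that idea with std::chrono only. It is not easy_profiler's implementation: ScopedBlock, FinishedBlock, finished_blocks and sum_with_blocks are hypothetical names made up for this illustration.

```cpp
#include <cassert>
#include <chrono>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record of a finished block; NOT an easy_profiler type.
struct FinishedBlock {
    std::string name;
    std::chrono::nanoseconds duration;
};

// Finished blocks are appended here in the order in which they end.
inline std::vector<FinishedBlock>& finished_blocks() {
    static std::vector<FinishedBlock> blocks;
    return blocks;
}

// Hypothetical scoped timer that mimics the behaviour of a scoped block.
class ScopedBlock {
public:
    explicit ScopedBlock(std::string name)
        : m_name(std::move(name)), m_begin(std::chrono::steady_clock::now()) {
        assert(!m_name.empty()); // assumption of this sketch: blocks are named
    }

    // Ending the block in the destructor is what makes an explicit end optional.
    ~ScopedBlock() { end(); }

    // Explicit early end; calling it twice (or before destruction) is harmless.
    void end() {
        if (m_finished) return;
        m_finished = true;
        const auto elapsed = std::chrono::steady_clock::now() - m_begin;
        finished_blocks().push_back(
            {m_name, std::chrono::duration_cast<std::chrono::nanoseconds>(elapsed)});
    }

    ScopedBlock(const ScopedBlock&) = delete;
    ScopedBlock& operator=(const ScopedBlock&) = delete;

private:
    std::string m_name;
    std::chrono::steady_clock::time_point m_begin;
    bool m_finished = false;
};

// Same shape as foo(): an outer block with a nested, loop-scoped block.
inline int sum_with_blocks() {
    ScopedBlock function_block("sum_with_blocks");
    int sum = 0;
    for (int i = 0; i < 3; ++i) {
        ScopedBlock addition("Addition"); // ends at the closing brace of each iteration
        sum += i;
    }
    return sum; // function_block ends here, after all nested blocks
}
```

Calling sum_with_blocks() records three "Addition" entries followed by one "sum_with_blocks" entry, which is the nesting order a timeline view would show.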

EasyProfiler uses the Google Material Design color palette, but you can also pass custom colors in ARGB format (as bar() does above with 0xfff080aa).
The default color is Amber100 (it is used when you do not specify a color explicitly).
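
If you prefer to build a custom color from separate channels rather than writing the hexadecimal literal by hand, a small helper can do the packing. This is a minimal sketch that assumes the conventional ARGB layout (alpha in the most significant byte); argb is a hypothetical helper, not part of easy_profiler.

```cpp
#include <cstdint>

// Packs four 8-bit channels into one 32-bit ARGB value (alpha in the top byte).
// Hypothetical helper for illustration; not provided by the library.
constexpr std::uint32_t argb(std::uint8_t a, std::uint8_t r,
                             std::uint8_t g, std::uint8_t b) {
    return (static_cast<std::uint32_t>(a) << 24) |
           (static_cast<std::uint32_t>(r) << 16) |
           (static_cast<std::uint32_t>(g) << 8)  |
            static_cast<std::uint32_t>(b);
}

// The literal used in bar() above, rebuilt from its channels.
static_assert(argb(0xff, 0xf0, 0x80, 0xaa) == 0xfff080aa,
              "channels pack into the ARGB literal");
```

The result is a plain 32-bit integer, so it can be passed wherever the example above passes 0xfff080aa.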

Storing variables

Example of storing variables:

#include <cstdint> // uint32_t
#include <easy/profiler.h>
#include <easy/arbitrary_value.h> // EASY_VALUE, EASY_ARRAY are defined here

class Object {
    Vector3 m_position; // Let's suppose Vector3 is a struct { float x, y, z; };
    unsigned int  m_id;
public:
    void act() {
        EASY_FUNCTION(profiler::colors::Cyan);

        // Dump variables values
        constexpr auto Size = sizeof(Vector3) / sizeof(float);
        EASY_VALUE("id", m_id);
        EASY_ARRAY("position", &m_position.x, Size, profiler::colors::Red);

        // Do something ...
    }

    void loop(uint32_t N) {
        EASY_FUNCTION();
        EASY_VALUE("N", N, EASY_VIN("N")); /* EASY_VIN is used here to ensure
                                            that this value id will always be
                                            the same, because the address of N
                                            can change */
        for (uint32_t i = 0; i < N; ++i) {
            // Do something
        }
    }
};

Collect profiling data

There are two ways to collect profiling data: streaming over network and dumping data to file.

Streaming over network

This is the preferred and most convenient method in many cases.

  1. (In profiled app) Invoke profiler::startListen(). This starts a new thread that listens on port 28077 for the start-capture signal from profiler_gui.
  2. (In UI) Connect profiler_gui to your application using its hostname or IP address.
  3. (In UI) Press the Start capture button in profiler_gui.
  4. (In UI) Press the Stop capture button in profiler_gui to stop capturing, then wait until the profiled data has been passed over the network.
  5. (Optional step) (In profiled app) Invoke profiler::stopListen() to stop listening.

Example:

int main() {
    profiler::startListen();
    /* do work */
}

Dump to file

  1. (Profiled application) Start capturing by putting the EASY_PROFILER_ENABLE macro somewhere in the code.
  2. (Profiled application) Dump the profiled data to a file wherever you want by calling profiler::dumpBlocksToFile("test_profile.prof").

Example:

int main() {
    EASY_PROFILER_ENABLE;
    /* do work */
    profiler::dumpBlocksToFile("test_profile.prof");
}

Note about thread context-switch events

To capture thread context-switch events:

  • On Windows: launch your application "as Administrator".
  • On Linux: launch the special systemtap script with root privileges as follows (example on Fedora):
#stap -o /tmp/cs_profiling_info.log scripts/context_switch_logger.stp name APPLICATION_NAME

APPLICATION_NAME is the name of your application.

There are some known issues on Linux-based systems (for more information see the wiki).

Profiling application startup

To profile your application's startup (when using the network method), add the EASY_PROFILER_ENABLE macro to the code together with profiler::startListen().

Example:

int main() {
    EASY_PROFILER_ENABLE;
    profiler::startListen();
    /* do work */
}

This allows you to collect profiling data before profiler_gui connects. profiler_gui automatically displays the capturing dialog window after it successfully connects to the profiled application.

Build

Prerequisites

  • CMake 3.0 or higher
  • Compiler with C++11 support
    • on Unix systems, a compiler with thread_local support is highly recommended: GCC >= 4.8, Clang >= 3.3

Additional requirements for GUI:

  • Qt 5.3.0 or higher

Linux

$ mkdir build
$ cd build
$ cmake -DCMAKE_BUILD_TYPE="Release" ..
$ make

MacOS

$ mkdir build
$ cd build
$ cmake -DCMAKE_CXX_COMPILER=g++-5 -DCMAKE_C_COMPILER=gcc-5 -DCMAKE_BUILD_TYPE="Release" ..
$ make

Windows

If you are using the QtCreator IDE, you can just open the CMakeLists.txt file in the root directory. If you are using Visual Studio, you can generate a solution with a cmake generator command. The examples below show how to generate a Win64 solution for Visual Studio 2013. To generate for another version, use the appropriate cmake generator (-G "name of generator").

Way 1

Specify path to cmake scripts in Qt5 dir (usually in lib/cmake subdir) and execute cmake generator command, for example:

$ mkdir build
$ cd build
$ cmake -DCMAKE_PREFIX_PATH="C:\Qt\5.3\msvc2013_64\lib\cmake" .. -G "Visual Studio 12 2013 Win64"

Way 2

Create a system variable "Qt5Widgets_DIR" and set its value to "[path-to-Qt5-binaries]\lib\cmake\Qt5Widgets", for example "C:\Qt\5.3\msvc2013_64\lib\cmake\Qt5Widgets". Then run the cmake generator as follows:

$ mkdir build
$ cd build
$ cmake .. -G "Visual Studio 12 2013 Win64"

QNX

$ source $QNX_ENVIRONMENT
$ mkdir build
$ cd build
$ cmake -DCMAKE_TOOLCHAIN_FILE=/path/to/QNXToolchain.cmake ..

For more information and an example of QNXToolchain.cmake, see this PR

Android

You can build a native library for Android using the NDK and a standalone toolchain. See the comment on this PR for more detailed instructions.

Status

The develop branch contains all v2.0.0 features and the new UI style.
Please note that the .prof file header changed in v2.0.0:

struct EasyFileHeader {
    uint32_t signature = 0;
    uint32_t version = 0;
    profiler::processid_t pid = 0;
    int64_t cpu_frequency = 0;
    profiler::timestamp_t begin_time = 0;
    profiler::timestamp_t end_time = 0;
    
    // Changed order of memory_size and blocks_number relative to v1.3.0
    uint64_t memory_size = 0;
    uint64_t descriptors_memory_size = 0;
    uint32_t total_blocks_number = 0;
    uint32_t total_descriptors_number = 0;
};

License

Licensed under either of

at your option.

Issues
  • Profiling arbitrary values

    Would it be possible to keep track of arbitrary float/integer values? That would be useful for engine to track object counts. It could be represented simply as FPS graph. API could be something like profiler::sample(valueType, value).

    Unrelated: we would love to track object lifetimes as well, however i have no idea how that could be presented in UI. Just something to think about.

    feature implemented core ui 
    opened by rokups 20
  • Document Stream Format

    I'm having an issue with a serialized block size having a zero in it, despite there actually being a block there. I'm having a really hard time debugging it without fully understanding the contents of a stream.

    question wiki 
    opened by rationalcoder 16
  • Compile Problems

    Hello,

    so I am trying to test-profile an application of mine on a Windows 7 x64 machine.

    I use MinGW compiler version 4.9.1 .

    So here is the code:

    #include <iostream>
    #include <easy/profiler.h>
    
    ...
    
    int main(){
    	EASY_PROFILER_ENABLE;
    	printf("%d",fibonacci(30));
    	profiler::dumpBlocksToFile("test_profile.prof");
    }
    

    Since I am using eclipse together with MinGW , i get to chose some arguments for the compilation...

    When I compile as follows: g++ -std=c++0x -DBUILD_WITH_EASY_PROFILER "-IC:\\Users\\EMAKMEL\\Desktop\\gradle-test\\54profiling\\easy_profiler-v1.2.0-msvc12-win64\\bin" "-IC:\\Users\\EMAKMEL\\Desktop\\gradle-test\\54profiling\\easy_profiler-v1.2.0-msvc12-win64\\include" -O0 -g3 -Wall -c -fmessage-length=0 -o file.o "..\\file.cpp"

    Everything is okay so far... However when eclipse tries to execute:

    g++ -DBUILD_WITH_EASY_PROFILER "-LC:\\Users\\EMAKMEL\\Desktop\\gradle-test\\54profiling\\easy_profiler-v1.2.0-msvc12-win64\\bin" "-LC:\\Users\\EMAKMEL\\Desktop\\gradle-test\\54profiling\\easy_profiler-v1.2.0-msvc12-win64\\include" -o profiledproject.exe file.o

    I get : undefined reference to '__imp_setEnabled' and undefined reference to '__imp_dumpBlocksToFile'

    If I take out the -DBUILD_WITH_EASY_PROFILER then it all compiles nicely and executes....

    Thank you in advance for your help, Regards, Maksim Melnik

    question delayed 
    opened by ghost 13
  • GUI output after profiling

    Hello, I am compiling my application using Cmake and setting the prefix path to easyprofile directory, after doing a make I get an object file which just gives me the terminal output, can someone please help me on how can I get GUI output.

    question 
    opened by abhi1212 12
  • Unscoped thread crashes when dumping blocks

    Version:

    Public v1.2.0, built locally with source and integrated into my project that way (rather than with the provided static libraries).

    Callstack:

    easy_profiler.dll!operator delete(void * block) Line 21 C++
    easy_profiler.dll!std::_Deallocate(void * _Ptr, unsigned __int64 _Count, unsigned __int64 _Sz) Line 133 C++
    easy_profiler.dll!std::allocator<std::reference_wrapper<profiler::Block> >::deallocate(std::reference_wrapper<profiler::Block> * _Ptr, unsigned __int64 _Count) Line 721 C++
    easy_profiler.dll!std::_Wrap_alloc<std::allocator<std::reference_wrapper<profiler::Block> > >::deallocate(std::reference_wrapper<profiler::Block> * _Ptr, unsigned __int64 _Count) Line 988 C++
    easy_profiler.dll!std::vector<std::reference_wrapper<profiler::Block>,std::allocator<std::reference_wrapper<profiler::Block> > >::_Reallocate(unsigned __int64 _Count) Line 1619 C++
    easy_profiler.dll!std::vector<std::reference_wrapper<profiler::Block>,std::allocator<std::reference_wrapper<profiler::Block> > >::_Reserve(unsigned __int64 _Count) Line 1633 C++
    easy_profiler.dll!std::vector<std::reference_wrapper<profiler::Block>,std::allocator<std::reference_wrapper<profiler::Block> > >::emplace_back<profiler::Block & __ptr64>(profiler::Block & <_Val_0>) Line 928 C++
    easy_profiler.dll!ProfileManager::beginBlock(profiler::Block & _block) Line 968 C++
    easy_profiler.dll!beginBlock(profiler::Block & _block) Line 304 C++
    TomatoGame.exe!Katgine::App::Win32AppWindow::Run(Katgine::App::IApp * app) Line 81 C++
    ...

    Steps to reproduce:

    1. Have two threads, Win32 and Game. Win32 is declared with EASY_THREAD("Win32"), and the game thread is declared with EASY_THREAD("Main"). Note that this bug only occurs when the Win32 thread is not guarded with a scoped thread.
    2. The Win32 thread is blocking in the Win32 message loop, e.g. sending no data, but it has sent data since starting, so the THIS_THREAD is correct.
    3. The main thread is sending data regularly, e.g. once every 16 ms but the interval really doesn't matter.
    4. Connect over the network and gather data.
    5. Observe that no data is captured on the Win32 thread because the thread is still blocking.
    6. End network capture of the data.
    7. Notice how ProfileManager::dumpBlocksToStream is called, and the Win32 thread is removed from m_threads at line 1429.
    8. Mouse over the window, which causes the Win32 message loop to leave its blocked state, and data is once again captured on that thread.
    9. Observe crash because THIS_THREAD is now pointing to freed memory, because the thread has been freed, but not reallocated because THIS_THREAD was never set to null.

    Workaround:

    1. Use scoped thread macro instead, which marks the thread as guarded so it isn't freed when blocks are dumped by the network profiler.

    Possible solutions:

    1. m_threads now stores pointers to pointers, which would allow the code which removes threads from the collection to also nullify the thread-local storage pointer. This may be an issue though because I bet thread-local storage is freed when a thread dies, so you can't nullify the pointer without a segfault.
    2. Do not free threads for unscoped threads when dumping blocks.
    3. Removed unguarded thread support. (Extreme!)
    bug resolved core 
    opened by Liareth 11
  • Create, begin, end, store blocks manually

    Hi,

    thanks for your awesome tool. I love it so much that I am planning to (ab)use it in a slightly different way. I would like to measure the execution of tasks and not just functions/methods.

    Ideally I need to be able to manually do the following operations:

    1. Create a profiler::Block
    2. Call start() on it manually.
    3. Call finish() manually.

    In other words, I don't want the ProfileManager to take care of this for me, but, on the other hand, the ProfileManager is the one that provides the socket interface and the dump to file.

    I am positive that what I described can be done, but it would be nice if you can give any hint

    Regards

    Davide

    feature implemented 
    opened by facontidavide 9
  • Undefined Behavior

    In profile_manager.h:chunk_allocator::allocate() / emplace_back() and elsewhere, there are lines like *(uint16_t*)(data + n) = 0; and *(uint16_t*)last->data = 0; that violate strict-aliasing rules:

    /easy_profiler/easy_profiler_core/profile_manager.h:175:36: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing] *(uint16_t*)last->data = 0;

    I have confirmed that easy_profiler does not work correctly when cross compiled (with gcc-4.9.4) for a single core ARMv5 (32 bit) Linux, with pthreads enabled. It works, however, on Linux x86_64 with the same compiler version on the same machine used for cross-compiling. This could be direct result of the UB, as I have made similar mistakes that caused code to work on x86 but not the ARM platform I am using.

    Even if my problems aren't caused by the UB, it still needs to be fixed anywhere it is found. I would suggest the use of std::memcpy / std::memset for such things.

    EDIT the line *(uint16_t*)(data + n) = 0 doesn't violate strict-aliasing, only the second one.

    bug resolved core 
    opened by rationalcoder 8
  • Design problem with StackBuffer

    I am in the process of integrating easy_profiler into AtomicGameEngine. Along the way we ran into several nasty issues which I think should be addressed. The profiler does not build on the MacOS platform or on Windows with the MSVC compiler. We worked out some patches and will submit a PR once the code is in shape.

    Now on to a real problem:

    MacOS is having a hard time (meaning crash) due to manual string destructor invocation. This right there is a sign of very bad design and needs to be addressed.

    I started digging deeper into the problem and discovered that this destructor call is a result of the custom StackBuffer class. As I understand it, this class was created for performance reasons, but I think it is completely unnecessary. Did you try using std::vector? We can minimize memory reallocations by reserving space in a vector. I usually reserve double the vector size when capacity is reached. It wastes some memory, but progressively reduces memory reallocations, and we can use RAII and avoid ugly manual calls to destructors. To avoid copying on insertion we can just .push_back(std::move(NonscopedBlock(...))). This should perform much the same as StackBuffer while being much safer and cleaner.

    I started changing StackBuffer to std::vector and noticed that move constructor of Block modifies state of object it is stealing state from. Why? Move constructor is supposed to steal state from the other object and that leaves this object in undetermined state, basically destined for a trashcan. Here something else is happening which seems very wrong. I do not think i can fix this problem without understanding why it was written this way.

    warning resolved 
    opened by rokups 8
  • Make histogram max value depend on visible region

    Consider a case where some very slow event happens: [image: hist1] Now if we scroll the view to the right and no longer see the slow region, we get this: [image: hist2] The visible data still hangs at the very bottom of the chart, making it not really useful. I think the histogram range in zoom mode should adapt to the visible data instead of the entire dataset.

    feature implemented 
    opened by rokups 8
  • can't build UI in ubuntu 14.04

    I run cmake -DCMAKE_BUILD_TYPE="Release" .. and it is OK. Then I run the make command, and it fails.

    [ 30%] Building CXX object profiler_gui/CMakeFiles/profiler_gui.dir/arbitrary_value_inspector.cpp.o
    In file included from easy_profiler/profiler_gui/dialog.h:55:0,
    from easy_profiler/profiler_gui/arbitrary_value_inspector.cpp:79:
    easy_profiler/profiler_gui/window_header.h:87:19: error: ISO C++ forbids declaration of 'Q_FLAG' with no type [-fpermissive]
    Q_FLAG(Buttons)
    ^
    easy_profiler/profiler_gui/window_header.h:87:19: error: expected ';' at end of member declaration
    make[2]: *** [profiler_gui/CMakeFiles/profiler_gui.dir/arbitrary_value_inspector.cpp.o] Error 1
    make[1]: *** [profiler_gui/CMakeFiles/profiler_gui.dir/all] Error 2
    make: *** [all] Error 2

    I thought it was a gcc/Qt version problem, but after upgrading gcc and Qt it still doesn't work.

    gcc version 5.5.0 20171010 (Ubuntu 5.5.0-12ubuntu1~14.04) Using Qt version 5.9.1 in /opt/Qt5.9.1/5.9.1/gcc_64/lib

    I don't know what I can do now to make it work.

    opened by zezhou 7
  • x86 Binaries

    Hi,

    Would it be possible to include x86/Win32 Windows binaries into the next release? Previously I compiled the source myself but now the code triggers an internal compiler error so I cannot build the library...

    Silveryard

    bug resolved core 
    opened by Silveryard 7
  • Fails to build on i386: static_assert failed due to requirement 'get_aligned_size<65526>::Size == 65536 - EASY_ALIGN_SIZE'

    Fails to build on i386: static_assert failed due to requirement 'get_aligned_size<65526>::Size == 65536 - EASY_ALIGN_SIZE' "wrong get_aligned_size"

    In file included from /wrkdirs/usr/ports/devel/easy-profiler/work/easy_profiler-2.1.0-41-g3104dd4/easy_profiler_core/block.cpp:52:
    In file included from /wrkdirs/usr/ports/devel/easy-profiler/work/easy_profiler-2.1.0-41-g3104dd4/easy_profiler_core/profile_manager.h:56:
    In file included from /wrkdirs/usr/ports/devel/easy-profiler/work/easy_profiler-2.1.0-41-g3104dd4/easy_profiler_core/thread_storage.h:55:
    /wrkdirs/usr/ports/devel/easy-profiler/work/easy_profiler-2.1.0-41-g3104dd4/easy_profiler_core/chunk_allocator.h:408:1: error: static_assert failed due to requirement 'get_aligned_size<65526>::Size == 65536 - EASY_ALIGN_SIZE' "wrong get_aligned_size"
    static_assert(get_aligned_size<65526>::Size == 65536 - EASY_ALIGN_SIZE, "wrong get_aligned_size");
    ^             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    1 warning and 1 error generated.
    

    clang-13 FreeBSD 13.1

    build 
    opened by yurivict 0
  • Doesn't tracking in infinity loops

    Hello, is there a way to track code blocks inside infinite loops?

    auto job = JobSystem::Scheduler::CreateJob("Test", []() {
        while (true) {
            EASY_BLOCK("Test");
            Sleep(100);
        }
    });
    JobSystem::Scheduler::GetInstance()->Run(job);

    The code above runs in a separate thread, and EASY_BLOCK does not track Sleep(100). If you change the loop to a 'one shot', everything is fine.

    opened by aantropov 0
  • CMake: fix install target for bundles (iOS/tvOS/watchOS)

    executables in install target need a BUNDLE DESTINATION if cross-build to iOS/tvOS/watchOS

    By removing RUNTIME, the same DESTINATION is set for all types, including BUNDLE (and since there are only executables in those install commands, we don't care of ARCHIVE and LIBRARY destination).

    see https://cmake.org/cmake/help/latest/policy/CMP0006.html (and MACOSX_BUNDLE is ON by default for iOS/tvOS/watchOS: https://cmake.org/cmake/help/latest/variable/CMAKE_MACOSX_BUNDLE.html#variable:CMAKE_MACOSX_BUNDLE)

    opened by SpaceIm 0
  • Run slower

    import clang.cindex
    from clang.cindex import Index
    from clang.cindex import Config
    from clang.cindex import CursorKind
    from clang.cindex import TypeKind
    from glob2 import glob
    
    libclangPath = r'C:\Program Files\LLVM\bin\libclang.dll'
    if Config.loaded == True:
        print("Config.loaded == True:")
    else:
        Config.set_library_file(libclangPath)
        print("install path")
    
    stmt = False
    
    
    def preorder_travers_AST(cursor, stmt_list):
        for cur in cursor.get_children():
            # do something
            global stmt
            # print(cur.spelling, cur.location, cur.kind)
            if (cur.kind == CursorKind.FUNCTION_DECL or cur.kind == CursorKind.CXX_METHOD) and (
                    cur.location.file.name[-2:] == ".c" or cur.location.file.name[-4:] == ".cpp"):
                print(cur.spelling, cur.kind, cur.type.kind, cur.location)
                stmt = True
            if stmt == True and cur.kind == CursorKind.COMPOUND_STMT:
                print(cur.spelling, cur.kind, cur.type.kind, cur.location)
                stmt_list.append((cur.location.line, cur.location.column))
                stmt = False
            preorder_travers_AST(cur, stmt_list)
    
    
    def add_profiler_for_file(path, stmt_list):
        index = Index.create()
        tu = index.parse(path)
        tu = index.parse(path, ['c++', '-std=c++11', '-DCALL_METHOD=', '-D__linux__'])
        AST_root_node = tu.cursor
        preorder_travers_AST(AST_root_node, stmt_list)
    
    
    def add_profiler(path, stmt_list):
        profiler_head = "#include <easy/profiler.h>\n"
        profiler = "EASY_FUNCTION();"
        try:
            data = open(path, "r").readlines()
        except:
            data = open(path, "r", encoding="utf-8").readlines()
        if data[0] == profiler_head:
            return
    
        for i, j in stmt_list:
            print(data[i - 1][j - 1])
            data[i - 1] = data[i - 1].replace("{", "{" + profiler)
        data.insert(0, profiler_head)
    
        f = open(path, "w+", encoding='utf-8')
        f.writelines(data)
        f.close()
    
    
    if __name__ == "__main__":
        src_dir = r"H:\xxxxxx"  # raw string: "\x" would otherwise be an invalid escape
        cpp_list = glob(src_dir + r"\**\*.cpp")
    
        print(len(cpp_list))
        for i, file_path in enumerate(cpp_list):
            print(file_path)
            stmt_list = []
            add_profiler_for_file(file_path, stmt_list)
            print(stmt_list)
            add_profiler(file_path, stmt_list)
    

    I use a Python script to add 'EASY_FUNCTION();' to each function. However, the program then takes three times longer to run. This is confusing to me.

    opened by Tianxiaomo 0
Releases(v2.1.0)
Owner
Sergey Yagovtsev