Single C file, Realtime CPU/GPU Profiler with Remote Web Viewer

Overview

Remotery

A realtime CPU/GPU profiler hosted in a single C file with a viewer that runs in a web browser.

Supported Platforms:

  • Windows
  • Windows UWP (Hololens)
  • Linux
  • OSX
  • iOS
  • Android
  • XBox One
  • FreeBSD

Supported GPU Profiling APIs:

  • D3D 11
  • OpenGL
  • CUDA
  • Metal

Features:

  • Lightweight instrumentation of multiple threads running on the CPU.
  • Web viewer that runs in Chrome, Firefox and Safari. Custom WebSockets server transmits sample data to the browser on a latent thread.
  • Profiles itself and shows how it's performing in the viewer.
  • Console output for logging text.
  • Console input for sending commands to your game.

Compiling

  • Windows (MSVC) - add lib/Remotery.c and lib/Remotery.h to your program. Set include directories to add Remotery/lib path. The required library ws2_32.lib should be picked up through the use of the #pragma comment(lib, "ws2_32.lib") directive in Remotery.c.

  • Mac OS X (XCode) - simply add lib/Remotery.c, lib/Remotery.h and lib/Remotery.mm to your program.

  • Linux (GCC) - add the source in the lib folder. Compiling the code requires -pthread for library linkage. For example, to compile the sample, run: cc lib/Remotery.c sample/sample.c -I lib -pthread -lm

  • FreeBSD - the easiest way is to take a look at the official port (devel/remotery) and modify the port's Makefile if needed. There is also a package available via pkg install remotery.

You can define some extra macros to modify what features are compiled into Remotery:

Macro               Default     Description

RMT_ENABLED         1           Disable this to not include any bits of Remotery in your build
RMT_USE_TINYCRT     0           Used by the Celtoys TinyCRT library (not released yet)
RMT_USE_CUDA        0           Assuming CUDA headers/libs are setup, allow CUDA profiling
RMT_USE_D3D11       0           Assuming Direct3D 11 headers/libs are setup, allow D3D11 GPU profiling
RMT_USE_OPENGL      0           Allow OpenGL GPU profiling (dynamically links OpenGL libraries on available platforms)
RMT_USE_METAL       0           Allow Metal profiling of command buffers
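
As an illustration of what a compile-time switch such as RMT_ENABLED buys you, here is a minimal sketch of the general pattern. It is not Remotery's own code: DEMO_PROFILER_ENABLED, DEMO_BEGIN_SAMPLE and the helper functions are hypothetical names invented for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical switch standing in for RMT_ENABLED.
   Such switches are usually set with -D on the compiler command line. */
#ifndef DEMO_PROFILER_ENABLED
#define DEMO_PROFILER_ENABLED 0
#endif

/* Number of samples the hypothetical profiler has recorded. */
static int g_demo_samples = 0;

#if DEMO_PROFILER_ENABLED
static void DemoBeginSample(const char* name)
{
    assert(name != NULL);
    g_demo_samples++;
}
#define DEMO_BEGIN_SAMPLE(name) DemoBeginSample(#name)
#else
/* Disabled: the call site compiles to nothing, so no profiler code is referenced. */
#define DEMO_BEGIN_SAMPLE(name) ((void)0)
#endif

/* Instrumented function: the call site looks the same in both configurations. */
static int DemoWork(int x)
{
    DEMO_BEGIN_SAMPLE(DemoWork);
    return x * 2;
}
```

With the switch left at 0, DemoWork contains no profiling code at all. The Remotery macros in the table above are typically overridden the same way, for example by adding a -D option for the macro to the compile line, assuming the header only supplies defaults for macros you have not already defined.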

Basic Use

See the sample directory for further examples. A quick example:

int main()
{
    // Create the main instance of Remotery.
    // You need only do this once per program.
    Remotery* rmt;
    rmt_CreateGlobalInstance(&rmt);

    // Explicit begin/end for C
    {
        rmt_BeginCPUSample(LogText, 0);
        rmt_LogText("Time me, please!");
        rmt_EndCPUSample();
    }

    // Scoped begin/end for C++
    {
        rmt_ScopedCPUSample(LogText, 0);
        rmt_LogText("Time me, too!");
    }

    // Destroy the main instance of Remotery.
    rmt_DestroyGlobalInstance(rmt);
}

Running the Viewer

Double-click or launch vis/index.html from the browser.

Sampling CUDA GPU activity

Remotery allows for profiling multiple threads of CUDA execution using different asynchronous streams that must all share the same context. After initialising both Remotery and CUDA you need to bind the two together using the call:

rmtCUDABind bind;
bind.context = m_Context;
bind.CtxSetCurrent = &cuCtxSetCurrent;
bind.CtxGetCurrent = &cuCtxGetCurrent;
bind.EventCreate = &cuEventCreate;
bind.EventDestroy = &cuEventDestroy;
bind.EventRecord = &cuEventRecord;
bind.EventQuery = &cuEventQuery;
bind.EventElapsedTime = &cuEventElapsedTime;
rmt_BindCUDA(&bind);

Explicitly pointing to the CUDA interface allows Remotery to be included anywhere in your project without need for you to link with the required CUDA libraries. After the bind completes you can safely sample any CUDA activity:

CUstream stream;

// Explicit begin/end for C
{
    rmt_BeginCUDASample(UnscopedSample, stream);
    // ... CUDA code ...
    rmt_EndCUDASample(stream);
}

// Scoped begin/end for C++
{
    rmt_ScopedCUDASample(ScopedSample, stream);
    // ... CUDA code ...
}

Remotery supports only one context for all threads and will use cuCtxGetCurrent and cuCtxSetCurrent to ensure the current thread has the context you specify in rmtCUDABind.context.

Sampling Direct3D 11 GPU activity

Remotery allows sampling of D3D11 GPU activity on multiple devices on multiple threads. After initialising Remotery, you need to bind it to D3D11 with a single call from the thread that owns the device context:

// Parameters are ID3D11Device* and ID3D11DeviceContext*
rmt_BindD3D11(d3d11_device, d3d11_context);

Sampling is then a simple case of:

// Explicit begin/end for C
{
    rmt_BeginD3D11Sample(UnscopedSample);
    // ... D3D code ...
    rmt_EndD3D11Sample();
}

// Scoped begin/end for C++
{
    rmt_ScopedD3D11Sample(ScopedSample);
    // ... D3D code ...
}

Subsequent sampling calls from the same thread will use that device/context combination. When you shut down your D3D11 device and context, ensure you notify Remotery before shutting down Remotery itself:

rmt_UnbindD3D11();

Sampling OpenGL GPU activity

Remotery allows sampling of GPU activity on your main OpenGL context. After initialising Remotery, you need to bind it to OpenGL with the single call:

rmt_BindOpenGL();

Sampling is then a simple case of:

// Explicit begin/end for C
{
    rmt_BeginOpenGLSample(UnscopedSample);
    // ... OpenGL code ...
    rmt_EndOpenGLSample();
}

// Scoped begin/end for C++
{
    rmt_ScopedOpenGLSample(ScopedSample);
    // ... OpenGL code ...
}

Support for multiple contexts can be added pretty easily if there is demand for the feature. When you shut down your OpenGL device and context, ensure you notify Remotery before shutting down Remotery itself:

rmt_UnbindOpenGL();

Sampling Metal GPU activity

Remotery can sample Metal command buffers issued to the GPU from multiple threads. As the Metal API does not support finer grained profiling, samples will return only the timing of the bound command buffer, irrespective of how many you issue. As such, make sure you bind and sample the command buffer for each call site:

rmt_BindMetal(mtl_command_buffer);
rmt_ScopedMetalSample(command_buffer_name);

The C API supports begin/end also:

rmt_BindMetal(mtl_command_buffer);
rmt_BeginMetalSample(command_buffer_name);
...
rmt_EndMetalSample();

Applying Configuration Settings

Before creating your Remotery instance, you can configure its behaviour by retrieving its settings object:

rmtSettings* settings = rmt_Settings();

Some important settings are:

// Redirect any Remotery allocations to your own malloc/free, with an additional context pointer
// that gets passed to your callbacks.
settings->malloc;
settings->free;
settings->mm_context;

// Specify an input handler that receives text input from the Remotery console, with an additional
// context pointer that gets passed to your callback.
// The handler will be called from the Remotery thread so synchronization with a mutex or atomics
// might be needed to avoid race conditions with your threads.
settings->input_handler;
settings->input_handler_context;
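
The callbacks named above could be sketched as follows. This is a hedged example, not code from Remotery: the callback shapes (context pointer first for malloc/free, text then context for the input handler) are assumptions that should be checked against Remotery.h, and AllocStats, CommandMailbox and every function name here are hypothetical. The mailbox shows one way to hand console text from the Remotery thread to your own thread under a mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Allocation statistics, passed to the callbacks as the context pointer. */
typedef struct
{
    size_t live_blocks;
} AllocStats;

/* Assumed malloc callback shape: context first, then size. */
static void* CountingMalloc(void* mm_context, unsigned int size)
{
    AllocStats* stats = (AllocStats*)mm_context;
    void* ptr = malloc(size);
    if (ptr != NULL)
        stats->live_blocks++;
    return ptr;
}

/* Assumed free callback shape: context first, then the pointer. */
static void CountingFree(void* mm_context, void* ptr)
{
    AllocStats* stats = (AllocStats*)mm_context;
    if (ptr != NULL)
        stats->live_blocks--;
    free(ptr);
}

/* Mailbox shared between the Remotery thread and your own thread. */
typedef struct
{
    pthread_mutex_t lock;
    char text[256];
    int has_text;
} CommandMailbox;

static void CommandMailbox_Init(CommandMailbox* box)
{
    pthread_mutex_init(&box->lock, NULL);
    box->text[0] = '\0';
    box->has_text = 0;
}

/* Assumed input handler shape: text, then context. Runs on the Remotery thread. */
static void OnConsoleInput(const char* text, void* context)
{
    CommandMailbox* box = (CommandMailbox*)context;
    assert(text != NULL);
    pthread_mutex_lock(&box->lock);
    strncpy(box->text, text, sizeof(box->text) - 1);
    box->text[sizeof(box->text) - 1] = '\0';
    box->has_text = 1;
    pthread_mutex_unlock(&box->lock);
}

/* Called from your own thread, e.g. once per frame. Returns 1 if a command was copied out. */
static int CommandMailbox_Poll(CommandMailbox* box, char* out, size_t out_size)
{
    int got = 0;
    assert(out_size > 0);
    pthread_mutex_lock(&box->lock);
    if (box->has_text)
    {
        strncpy(out, box->text, out_size - 1);
        out[out_size - 1] = '\0';
        box->has_text = 0;
        got = 1;
    }
    pthread_mutex_unlock(&box->lock);
    return got;
}
```

Under these assumptions you would assign CountingMalloc, CountingFree and an AllocStats pointer to settings->malloc, settings->free and settings->mm_context, and OnConsoleInput and a CommandMailbox pointer to settings->input_handler and settings->input_handler_context, all before creating the Remotery instance. The mailbox keeps only the most recent command; a queue would be needed if commands can arrive faster than they are polled.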
Comments
  • rmt_UnbindOpenGL blocks indefinitely

    This happens after creating Remotery and an OpenGL context on the same thread and calling rmt_BindOpenGL on that thread. A call to rmt_UnbindOpenGL placed before shutting down Remotery blocks indefinitely in Remotery_BlockingDeleteSampleTree because of the following code:

                // Wait around until the Remotery server thread has sent all sample trees
                // of this type to the client
                while (sample_tree->allocator->nb_inuse > 1)
                    msSleep(1);
    

    The only way to get the application to close is to load up the Remotery web console in a web browser and wait for all the events to be sent to it. Is there a way to "jump the gun" and terminate the application even if there are samples pending?

    opened by graphitemaster 53
  • Access violation calling Thread32Next when lots of threads present

    Remotery version: https://github.com/Celtoys/Remotery/commit/bf7cffb72122dc1a34ea8070c278b0af72ea232c Platform: Windows 10.0.19041 Hardware: Intel® Core™ i9-7980XE (18 Cores 36 Threads) Example project: https://github.com/dougbinks/enkiTSExamples/tree/remotery_issue

    Reproduction steps:

    1. Clone the remotery_issue branch of https://github.com/dougbinks/enkiTSExamples
    2. Build as per https://github.com/dougbinks/enkiTSExamples/blob/master/README.md#building
    3. Run enkiTSRemoteryExample.
    4. On my PC an exception occurs after a few runs of the main loop, on what appears to be the second run of the GatherThreads function.

    The example has 32 enkiTS task threads created, I found anything >=21 threads caused an issue on my PC.

    I note that you don't call CloseHandle( handle );. I believe this is required as per the documented example, but adding it does not fix the issue for me.

    opened by dougbinks 40
  • Connection problems on Win32

    In about 25% of launches, Remotery starts fine but after a few seconds the Remotery connection log starts showing connection errors every 2 seconds.

    [11:08:11] Connecting to ws://127.0.0.1:17815/rmt
    [11:08:12] Connection Error 
    [11:08:12] Disconnected
    [11:08:13] Connecting to ws://127.0.0.1:17815/rmt
    [11:08:14] Connection Error 
    [11:08:14] Disconnected
    [11:08:15] Connecting to ws://127.0.0.1:17815/rmt
    [11:08:16] Connection Error 
    [11:08:16] Disconnected
    [11:08:17] Connecting to ws://127.0.0.1:17815/rmt
    [11:08:18] Connection Error 
    [11:08:18] Disconnected
    [11:08:19] Connecting to ws://127.0.0.1:17815/rmt
    [11:08:20] Connection Error 
    [11:08:20] Disconnected
    

    Looking at the Chrome console, I can see this:

    WebSocket connection to 'ws://127.0.0.1:17815/rmt' failed: Could not decode a text frame as UTF-8.
    WebSocketConnection.js:54 WebSocket connection to 'ws://127.0.0.1:17815/rmt' failed: Invalid frame header
    WebSocketConnection.js:54 WebSocket connection to 'ws://127.0.0.1:17815/rmt' failed: Could not decode a text frame as UTF-8.
    WebSocketConnection.js:54 WebSocket connection to 'ws://127.0.0.1:17815/rmt' failed: Invalid frame header
    WebSocketConnection.js:54 WebSocket connection to 'ws://127.0.0.1:17815/rmt' failed: Could not decode a text frame as UTF-8.
    WebSocketConnection.js:54 WebSocket connection to 'ws://127.0.0.1:17815/rmt' failed: Invalid frame header
    WebSocketConnection.js:54 WebSocket connection to 'ws://127.0.0.1:17815/rmt' failed: Could not decode a text frame as UTF-8.
    WebSocketConnection.js:89 WebSocket connection to 'ws://127.0.0.1:17815/rmt' failed: Invalid frame headerWebSocketConnection.js:89 OnOpen
    WebSocketConnection.js:89 WebSocket connection to 'ws://127.0.0.1:17815/rmt' failed: Could not decode a text frame as UTF-8.
    9WebSocketConnection.js:54 WebSocket connection to 'ws://127.0.0.1:17815/rmt' failed: Could not decode a text frame as UTF-8.
    

    Line 54 of WebSocketConnection.js is: this.Socket = new WebSocket(address);

    These errors are reported by Chrome:

    WebSocket connection to 'ws://127.0.0.1:17815/rmt' failed: Could not decode a text frame as UTF-8.
    WebSocket connection to 'ws://127.0.0.1:17815/rmt' failed: Invalid frame header
    

    Is there anything I could test to track down this issue?

    I wonder if I have too many marks, as I have changed my current profiler marks to call Remotery...
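
Both browser errors are raised while parsing WebSocket frames. As background only, and not as a diagnosis of this issue, here is a sketch of the server-to-client frame header layout from RFC 6455 that the browser validates. WriteFrameHeader and the enum names are hypothetical; this is not Remotery's server code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Opcodes from RFC 6455: text payloads must be valid UTF-8, binary payloads are opaque. */
enum { WS_OPCODE_TEXT = 0x1, WS_OPCODE_BINARY = 0x2 };

/*
 * Writes an unmasked, final (FIN=1) frame header into out, which must have
 * room for at least 10 bytes. Returns the number of header bytes written.
 */
static size_t WriteFrameHeader(uint8_t* out, uint8_t opcode, uint64_t payload_len)
{
    size_t n = 0;
    int shift;
    assert(out != NULL && opcode <= 0xF);

    /* Byte 0: FIN bit plus the 4-bit opcode; reserved bits stay zero. */
    out[n++] = (uint8_t)(0x80 | opcode);

    /* Byte 1 onwards: mask bit clear, then a 7-bit, 16-bit or 64-bit length. */
    if (payload_len <= 125)
    {
        out[n++] = (uint8_t)payload_len;
    }
    else if (payload_len <= 0xFFFF)
    {
        out[n++] = 126;
        out[n++] = (uint8_t)(payload_len >> 8);
        out[n++] = (uint8_t)(payload_len & 0xFF);
    }
    else
    {
        out[n++] = 127;
        for (shift = 56; shift >= 0; shift -= 8)
            out[n++] = (uint8_t)((payload_len >> shift) & 0xFF);
    }
    return n;
}
```

If the bytes on the wire ever stop lining up with this layout, for instance because a length does not match the payload actually sent, the browser starts reading payload bytes as a header and can report errors of the kind shown above.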

    opened by ulanda 26
  • Add first proof of concept for rmt_StatI32

    The iterator example now outputs stats as well:

    ~/code/Remotery $ ./test
    // ********************   DUMP TREE: Thread0   ************************
    SAMPLE: delay 1  time: 12  self: 12 type: 0  color: 0xddf9f8
      SAMPLE: recursive 6  time: 0  self: 0 type: 0  color: 0x6e745f
      SAMPLE: aggregate 3  time: 0  self: 0 type: 0  color: 0xcfd6ff
        STAT: MyCounter type: RMT_StatType_I32 value: 3  desc: test
    

    Also, here's a first screenshot of a statistic in action: [Screenshot 2022-04-14 at 16:06:15]

    opened by JCash 19
  • Custom port configuration

    Allow the user to specify the server port when creating a Remotery global instance. Allows a user to run multiple instances of a program on the same computer and view it in Remotery or deal with port conflicts.
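
To run several instances side by side, each one needs its port chosen from outside the program. A minimal sketch of parsing such a value with a safe fallback follows. PortFromText and DEMO_DEFAULT_PORT are hypothetical names; the default of 17815 is taken from the ws://127.0.0.1:17815/rmt address in the viewer log quoted earlier.

```c
#include <assert.h>
#include <stdlib.h>

/* Default taken from the ws://127.0.0.1:17815/rmt address seen in the viewer log. */
#define DEMO_DEFAULT_PORT 17815

/* Parses a decimal port from text; returns fallback when text is missing, malformed or out of range. */
static unsigned short PortFromText(const char* text, unsigned short fallback)
{
    char* end = NULL;
    long value;
    assert(fallback != 0);
    if (text == NULL || *text == '\0')
        return fallback;
    value = strtol(text, &end, 10);
    if (*end != '\0' || value < 1 || value > 65535)
        return fallback;
    return (unsigned short)value;
}
```

The text could come from a command-line argument or an environment variable. Where the result goes depends on your Remotery version: this assumes the settings object exposes a port field that is read when the global instance is created, which is what this request asks for.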

    opened by nickkorn 14
  • Adding manual sample entries

    In our use case, we have projects that have large Lua code bases, and sometimes they need to figure out where the time is spent in their scripts.

    Lua has its own way of adding debug callbacks, which allows the client to gather the data they need. I think it would be beneficial to be able to add these "externally" collected samples into the profiler data.

    On the C side, it might be good to mimic the current C code flow with the Begin/End pairs:

    rmt_BeginManualSample(name, flags)
    
        rmt_ManualSampleSetTime(time)
        rmt_ManualSampleSetStart(start)
    
    rmt_EndManualSample()
    

    As a first step, the question is what do you think about the idea, and if it would fit Remotery?

    opened by JCash 13
  • thread sanitizer warnings

    Hello,

    With -fsanitize=thread, I get the following warning from Remotery code. Looking at this discussion, it seems that volatile is not enough to guarantee a safe read in static rmtU32 LoadAcquire(rmtU32* volatile address). A mutex may be required... I'm not sure which one to use, so I cannot provide a quick fix, if it would be a fix at all.

    Best,

    Stéphane

    WARNING: ThreadSanitizer: data race (pid=14734)
      Read of size 4 at 0x000108c05354 by thread T6 (mutexes: write M397):
        #0 LoadAcquire Remotery.c:585 (libdjnn-core.dylib:arm64+0x7ec6c)
        #1 rmtMessageQueue_AllocMessage Remotery.c:3999 (libdjnn-core.dylib:arm64+0x83dc0)
        #2 QueueThreadName Remotery.c:4894 (libdjnn-core.dylib:arm64+0x87cc4)
        #3 ThreadProfiler_Constructor Remotery.c:4937 (libdjnn-core.dylib:arm64+0x87bc8)
        #4 ThreadProfilers_GetThreadProfiler Remotery.c:5184 (libdjnn-core.dylib:arm64+0x87984)
        #5 ThreadProfilers_GetCurrentThreadProfiler Remotery.c:5210 (libdjnn-core.dylib:arm64+0x7f9e0)
        #6 _rmt_SetCurrentThreadName Remotery.c:6797 (libdjnn-core.dylib:arm64+0x7f928)
        #7 GatherThreadsLoop Remotery.c:5300 (libdjnn-core.dylib:arm64+0x83598)
        #8 StartFunc Remotery.c:1972 (libdjnn-core.dylib:arm64+0x842b4)
    
      Previous atomic write of size 4 at 0x000108c05354 by thread T5:
        #0 __tsan_atomic32_compare_exchange_val <null>:31683908 (libclang_rt.tsan_osx_dynamic.dylib:arm64e+0x577a4)
        #1 AtomicCompareAndSwap Remotery.c:521 (libdjnn-core.dylib:arm64+0x83f84)
        #2 rmtMessageQueue_AllocMessage Remotery.c:4008 (libdjnn-core.dylib:arm64+0x83e10)
        #3 QueueSampleTree Remotery.c:4796 (libdjnn-core.dylib:arm64+0x834a8)
        #4 ThreadProfiler_Pop Remotery.c:5000 (libdjnn-core.dylib:arm64+0x80404)
        #5 _rmt_EndCPUSample Remotery.c:6945 (libdjnn-core.dylib:arm64+0x80260)
        #6 Remotery_ThreadMain Remotery.c:6360 (libdjnn-core.dylib:arm64+0x81830)
        #7 StartFunc Remotery.c:1972 (libdjnn-core.dylib:arm64+0x842b4)
    
      Location is heap block of size 24 at 0x000108c05340 allocated by main thread:
        #0 malloc <null>:31683908 (libclang_rt.tsan_osx_dynamic.dylib:arm64e+0x592cc)
        #1 CRTMalloc Remotery.c:6603 (libdjnn-core.dylib:arm64+0x7edc0)
        #2 rmtMalloc Remotery.c:182 (libdjnn-core.dylib:arm64+0x7ef44)
        #3 Remotery_Constructor Remotery.c:6490 (libdjnn-core.dylib:arm64+0x7f0a4)
        #4 _rmt_CreateGlobalInstance Remotery.c:6657 (libdjnn-core.dylib:arm64+0x7ee8c)
        #5 djnn::init_core() core.cpp:74 (libdjnn-core.dylib:arm64+0x2cc4)
        #6 main <null>:31683908 (volta:arm64+0x100033ff4)
    
      Mutex M397 (0x000111c0a830) created at:
        #0 pthread_mutex_init <null>:31683908 (libclang_rt.tsan_osx_dynamic.dylib:arm64e+0x2d1f4)
        #1 mtxInit Remotery.c:474 (libdjnn-core.dylib:arm64+0x831c8)
        #2 ThreadProfilers_Constructor Remotery.c:5097 (libdjnn-core.dylib:arm64+0x81450)
        #3 Remotery_Constructor Remotery.c:6553 (libdjnn-core.dylib:arm64+0x7f3f8)
        #4 _rmt_CreateGlobalInstance Remotery.c:6657 (libdjnn-core.dylib:arm64+0x7ee8c)
        #5 djnn::init_core() core.cpp:74 (libdjnn-core.dylib:arm64+0x2cc4)
        #6 main <null>:31683908 (volta:arm64+0x100033ff4)
    
      Thread T6 (tid=53203, running) created by thread T4 at:
        #0 pthread_create <null>:31683908 (libclang_rt.tsan_osx_dynamic.dylib:arm64e+0x2bbe8)
        #1 rmtThread_Constructor Remotery.c:2013 (libdjnn-core.dylib:arm64+0x8169c)
        #2 InitThreadSampling Remotery.c:5576 (libdjnn-core.dylib:arm64+0x83324)
        #3 SampleThreadsLoop Remotery.c:5607 (libdjnn-core.dylib:arm64+0x8329c)
        #4 StartFunc Remotery.c:1972 (libdjnn-core.dylib:arm64+0x842b4)
    
      Thread T5 (tid=53202, running) created by main thread at:
        #0 pthread_create <null>:31683908 (libclang_rt.tsan_osx_dynamic.dylib:arm64e+0x2bbe8)
        #1 rmtThread_Constructor Remotery.c:2013 (libdjnn-core.dylib:arm64+0x8169c)
        #2 Remotery_Constructor Remotery.c:6565 (libdjnn-core.dylib:arm64+0x7f4b4)
        #3 _rmt_CreateGlobalInstance Remotery.c:6657 (libdjnn-core.dylib:arm64+0x7ee8c)
        #4 djnn::init_core() core.cpp:74 (libdjnn-core.dylib:arm64+0x2cc4)
        #5 main <null>:31683908 (volta:arm64+0x100033ff4)
    
    opened by conversy 11
  • Api for iterating the profiler info

    In our current profile library, we have support for iterating the samples and counters. This allows us to display the info quickly at runtime.

    Is this a feature you've considered before, and if so, what were/are your current thoughts around such an api in remotery?

    Regards, Mathias

    opened by JCash 11
  • Adding support for counters

    As I'm currently testing out the api, one thing that our current api has support for is counters (vertex buffer sizes, number of draw calls, number of sprites etc).

    Looking at the remotery code, it shouldn't be impossible, but I'd like to see if you already have had thoughts/ideas around this topic.

    Api

    Our current api looks like this, which is quite similar to the current remotery api as well: #define DM_COUNTER(name, amount)

    E.g something like this should probably work well:

    #define rmt_AddCount(name, amount)
    

    Not sure about the name. I chose "Add" since it is a verb (looking at the names "LogText" and "BeginCpuSample").

    UI

    Ideally, these counters should be possible to be presented in the web UI

    • just like samples are displayed today: name + count
    • if selected with a check box: display the counter as a graph over time (to more easily see spikes) (Stretch goal)

    For the UI, I guess the simplest way is to put it in the same thread window, at the top or bottom.

    Currently, I have to scroll the sample window to see all samples of our main thread. So it might be good to present the counters in a separate window

    Regards, Mathias

    opened by JCash 11
  • GL_TIMESTAMP not supported on OSX

    Hi,

    I can't figure out how to get usable traces from my OpenGL application, even if I follow the description in readme.md. Incidentally, I use glad as a crossplatform OpenGL functions loader, it might be the reason why I see nothing useful reported by remotery.

    So, could it be my use of glad? If so, I could give a try at understanding how it's supposed to work and provide a PR for glad support. Or should I use OpenGL Query objects explicitly in my app? In this case, would it be possible to add a sample, or link to an existing example somewhere to understand how it is supposed to work?

    Best.

    opened by conversy 11
  • OSX: Missing NSGLGetProcAddress

    I had to work around the missing NSGLGetProcAddress function in newer OSX: https://github.com/bkaradzic/bgfx/commit/c3dd88767a8642e030916527937624a27cf3b3eb

    opened by bkaradzic 11
  • Sample window font in Firefox (macOS) is blurry

    [Screenshot 2022-07-20 at 09:46:07] [Screenshot 2022-07-20 at 10:29:56]

    The W and S are most visibly cut off.

    I tried to verify the texture with the Spector.js extension, but it wouldn't run for me. I expect the shader code is slightly off here, though.

    opened by JCash 9
  • Q: Generic long running events?

    I wish to be able to log events that span multiple frames. e.g. the loading of a resource in the game.

    Example: https://i.stack.imgur.com/odD1S.png

    By the looks of it in the Remotery example image (in the Readme.md), the "Processor timeline" is something related to that, but I'm not sure?

    Is this something that is currently supported? If not, what would be the recommended way to do it (assuming it fits within the scope/vision of Remotery)?

    /Mathias

    opened by JCash 17
  • Q: Visualizer configurability?

    Regarding the visualizer, I'm thinking about ideas on how to specify the initial view. Each time I connect the engine+visualizer, I get the threads list populated. The Remotery thread shows up first, then threads in "random" order. I'd like to be able to sort the list (putting my main thread first, and Remotery last), giving a better default view for our users.

    Such settings could be done using the properties url (like the addr property). E.g. url?sort=main,sound,Remotery

    Thinking more about it, another idea is to be able to save settings from the visualizer into a cookie or something. This could be presented via a dropdown list. It would further minimize the effort in continuing where the last debug session left off. (e.g. expanding the flame graph of the threads to the desired depth)

    Have you had any thoughts around this?

    opened by JCash 2
  • Q: Create internal names for properties?

    I wonder if creating "internal" variable names is preferable, in order to avoid name clashes. When I started porting to this api, I immediately got a clash where I had to change my prop name to "EngineProps", since there already was a struct named Engine. The drawback is that I'd rather print "Engine" when presenting the properties.

        rmt_PropertyDefine_Group(Engine, "Engine properties");
        rmt_PropertyDefine_U32(FrameCount, 0, FrameReset, "# frames", &Engine);
    

    An alternative would be just passing the name (not a pointer to a variable), allowing the macro to create internal names for the variable:

        rmt_PropertyDefine_U32(FrameCount, 0, FrameReset, "# frames", Engine);
    

    E.g. the variables could be prefixed and end up like _rmt_prop_Engine and _rmt_prop_FrameCount.

    What are your thoughts on this?
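
The naming idea above can be sketched with token pasting and stringizing. This is a hypothetical illustration of the proposal, not Remotery's property macros: DemoProperty and the DEMO_ macros are invented here, and the sketch uses a demo_prop_ prefix because identifiers that begin with an underscore are reserved at file scope in C.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical property record; a real profiler property would carry more fields. */
typedef struct DemoProperty
{
    const char* name;                   /* display name, free of the internal prefix */
    const char* description;
    const struct DemoProperty* parent;  /* NULL for a top-level group */
} DemoProperty;

/* Token pasting builds the internal variable name from the property name. */
#define DEMO_PROP_VAR(name) demo_prop_##name

/* Stringizing keeps the unprefixed name for display. */
#define DEMO_PROPERTY_GROUP(name, desc) \
    static DemoProperty DEMO_PROP_VAR(name) = { #name, desc, NULL }

/* The parent is passed by name, not as a pointer to a variable. */
#define DEMO_PROPERTY_U32(name, desc, parent_name) \
    static DemoProperty DEMO_PROP_VAR(name) = { #name, desc, &DEMO_PROP_VAR(parent_name) }

/* An unrelated type called Engine no longer clashes with the property group. */
typedef struct Engine
{
    int frame;
} Engine;

DEMO_PROPERTY_GROUP(Engine, "Engine properties");
DEMO_PROPERTY_U32(FrameCount, "# frames", Engine);

/* Returns the display name of a property's parent group, or "" for a top-level group. */
static const char* DemoParentName(const DemoProperty* prop)
{
    assert(prop != NULL);
    return prop->parent != NULL ? prop->parent->name : "";
}
```

The property still prints as "Engine" even though the variable behind it is demo_prop_Engine, which addresses the drawback described above.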

    opened by JCash 8
  • Thread sampling hanging with debug CRT?

    Hi,

    I know it's silly to profile a program linked with the debug CRT, but I do sometimes use Remotery simply to see what's going on in our app, and that means sometimes in Debug. I noticed when upgrading to the latest Remotery that with the thread sampler on, I end up with deadlocks.

    When this happens I'm in CheckForStallingSamples, which allocates (via rmtMalloc); we usually end up in NtWaitForAlertByThreadId (from ucrtbased.dll!__acrt_lock, called from ucrtbased.dll!malloc).

    Once this happens to our main thread, we basically hang.

    For now I'm going to disable the thread sampler by default when building our debug builds. However, I feel this should at least be documented in Remotery's readme, or better still, fixed.

    opened by uucidl 3
Releases(v1.2.1)
Owner
Celtoys