Toy LLVM obfuscator pass

Overview

ToyObfuscator

A simple obfuscator ;) (based on llvm-10)

Compile

Build out-of-tree pass

git clone https://github.com/veritas501/ToyObfuscator.git
cd ToyObfuscator
mkdir build && cd build
cmake .. -DLLVM_DIR=/usr/lib/llvm-10/lib/cmake/llvm/
make -j`nproc`

The compiled pass is at "ToyObfuscator/build/src/libLLVMToyObfuscator.so"

Build in-tree pass

# clone llvm-10.0.1
git clone https://github.com/llvm/llvm-project.git --depth 1 -b llvmorg-10.0.1
# apply custom patch
./build_clang.sh <DIR_TO_llvm-project>
# build clang and llvm as normal
cd <DIR_TO_llvm-project>
mkdir build && cd build
cmake -DLLVM_ENABLE_PROJECTS=clang -DCMAKE_BUILD_TYPE=Release -G "Unix Makefiles" ../llvm
make -j`nproc` # or 'make clang -j`nproc`' to build only clang

Pass flags

  • -fla_plus: control flow graph flattening, plus version
    • -dont_fla_invoke: used with -fla_plus; flatten every function except those that contain an InvokeInst (default=false)
    • -fla_cnt=X: used with -fla_plus; flatten X times (default=1, max=3)
  • -bcf: bogus control flow
    • -bcf_rate: the probability that each basic block will be obfuscated (default=30, max=100)

Quickstart

  • demo.c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

void foo1() {
    puts("argc == 1");
}

void foo2(int argc) {
    for (int i = 0; i < argc; i++) {
        puts("argc != 1");
    }
}

int main(int argc, char **argv) {
    if (argc == 1) {
        foo1();
    } else {
        foo2(argc);
    }
    return argc;
}

Use the out-of-tree pass to obfuscate it.

clang -emit-llvm -c demo.c -o demo.bc
opt -load ./libLLVMToyObfuscator.so -fla_plus demo.bc -o demo_obf.bc
clang demo_obf.bc -o demo_obf

Pass design

fla_plus

Let's start with the flattening in traditional ollvm.

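In this flattening scheme, the switch block decides where to jump based on a label. Every block corresponds to one label, and the label of the next block is set when a block finishes. An attacker can therefore first collect the mapping between labels and basic blocks from the switch block, then read the new label set at the end of each basic block to work out which block comes next (for a conditional jump, the two successor blocks and the corresponding condition). Alternatively, the attacker can locate all useful blocks by their features and then use a symbolic execution tool such as angr to find each block's one or two successors. In other words, given A->B, knowing A is enough: the switch information plus the label at the end of A yields the successor B.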
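As a rough illustration of the classic scheme described above, here is a hand-flattened loop in C. It is only a sketch, not ollvm's actual output: the function name `sum_flat_classic` and the label constants are made up for this example. Each block ends by storing a constant label, which is exactly what an attacker reads to recover the successor.

```c
#include <stdint.h>

/* Hypothetical label values; a real pass would pick them at random. */
#define L_INIT 0x11u
#define L_COND 0x22u
#define L_BODY 0x33u
#define L_EXIT 0x44u

/* Flattened form of: r = 0; for (i = 0; i < n; i++) r += i; return r; */
uint32_t sum_flat_classic(uint32_t n) {
    uint32_t label = L_INIT;
    uint32_t i = 0, r = 0;
    for (;;) {
        switch (label) {                 /* dispatcher */
        case L_INIT:
            i = 0; r = 0;
            label = L_COND;              /* successor visible as a constant */
            break;
        case L_COND:
            label = (i < n) ? L_BODY : L_EXIT;  /* two successors plus condition */
            break;
        case L_BODY:
            r += i; i++;
            label = L_COND;
            break;
        case L_EXIT:
            return r;
        default:
            return 0;                    /* not reached in this sketch */
        }
    }
}
```

Reading the constant assigned to `label` at the end of each case is enough to rebuild the original control flow graph, one block at a time.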

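To counter this recovery approach, I introduce state variables into the A->B transition, so that the jump from A to B depends on the state on entry to A.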

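Here I introduce three variables, x, y, and label (currently designed as three uint32_t values). At a glance, every useful block sets label to label1 at its end, and label1 points to trans-1. The switch contains not only useful blocks but also translate blocks. A translate block computes a function f, because label and x are related by label = f(x). So in this scheme, A->B is determined not by a label value written at the end of the block, but by the value of x.

One might ask: if I take x from A, can't I compute the corresponding label and recover A->B? It is not that easy. x is not obtained by a plain assignment but by xor: x = y ^ imm32_const1; y = x ^ imm32_const2. To know the value of x after A executes, you must therefore know the value of y on entry to A, and y appears neither in the label computation nor in the switch dispatch.

So to find A's successor you need the value of y on entry to A, and that value cannot be obtained by analyzing A in isolation, because it depends on the runtime state of the program. The only way to get it is to simulate the flattened function in full, from the prologue to the start of block A, just as in normal execution. This method is therefore not a silver bullet either: it can still be broken, just not by pulling out blocks and defeating them one by one as before.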
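The idea can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the pass's real implementation: the mixing function `F`, the `X_*`/`Y_*` constants, the `GO` macro, and the function names are all hypothetical. The only parts taken from the description above are the three variables, the translate block computing label = f(x), and the update x = y ^ imm; y = x ^ imm.

```c
#include <stdint.h>

/* label = f(x); any injective mixing function will do for this sketch. */
#define F(x) ((((uint32_t)(x)) ^ 0x5A5A5A5Au) + 0x1234u)

/* Hypothetical constants: X_* selects a block through F,
 * Y_* is the value y holds on entry to that block. */
#define X_INIT 0x0404u
#define Y_INIT 0x0F0Fu
#define X_COND 0x1001u
#define Y_COND 0xA1A1u
#define X_BODY 0x2002u
#define Y_BODY 0xB2B2u
#define X_EXIT 0x3003u
#define Y_EXIT 0xC3C3u
#define L_TRANS 0xFFFFFFFFu   /* the single label every useful block sets */

/* End of a useful block: x = y ^ imm1; y = x ^ imm2; label = L_TRANS.
 * Both immediates fold to plain 32-bit constants at compile time. */
#define GO(from_y, to_x, to_y)           \
    do {                                 \
        x = y ^ ((from_y) ^ (to_x));     \
        y = x ^ ((to_x) ^ (to_y));       \
        label = L_TRANS;                 \
    } while (0)

/* Flattened form of: r = 0; for (i = 0; i < n; i++) r += i; return r; */
uint32_t sum_flat_plus(uint32_t n) {
    uint32_t x = X_INIT, y = Y_INIT, label = L_TRANS;   /* prologue */
    uint32_t i = 0, r = 0;
    for (;;) {
        switch (label) {
        case L_TRANS:                    /* translate block */
            label = F(x);
            break;
        case F(X_INIT):
            i = 0; r = 0;
            GO(Y_INIT, X_COND, Y_COND);
            break;
        case F(X_COND):
            if (i < n) GO(Y_COND, X_BODY, Y_BODY);
            else       GO(Y_COND, X_EXIT, Y_EXIT);
            break;
        case F(X_BODY):
            r += i; i++;
            GO(Y_BODY, X_COND, Y_COND);
            break;
        case F(X_EXIT):
            return r;
        default:
            return 0;                    /* not reached in this sketch */
        }
    }
}

/* What an analyst must compute to find a successor: it needs y on entry. */
uint32_t successor_label(uint32_t y_on_entry, uint32_t imm1) {
    return F(y_on_entry ^ imm1);
}
```

Looking at the body block alone, an analyst sees only the folded immediate `Y_BODY ^ X_COND`; without the entry value of y, `successor_label` gives a label that matches no real block.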

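In addition, fla_plus supports InvokeInst, which appears in C++ functions that use try...catch (and apparently in some other cases as well). ollvm simply skips invoke: if a function contains one, that function is not flattened at all.

The approach is as follows. The unwind branch of an invoke starts with a LandingPadInst. We find every block that contains a landingpad, together with the successors of those blocks, and exclude them from the subsequent flattening; the remaining blocks are flattened as usual. Like a branch, invoke is a terminator, so no instructions can be added after it. The only option is to redirect its default (normal) destination to a newly created trampoline and put the flattening logic in the trampoline. The fixStack logic also needs some changes, which are not covered here.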

bcf

Bogus control flow is usually implemented with opaque predicates.

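For example, ollvm uses (y < 10 || x * (x + 1) % 2 == 0), where x and y are global variables initialized to 0, so the expression always holds. From the binary's point of view, x and y are placed in .bss, so IDA does not treat them as the constant 0 and does not simplify the expression.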
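A quick way to convince yourself that this predicate always holds is to evaluate it directly. The sketch below is not ollvm code: the function names are made up, x and y are passed as parameters instead of being globals, and they are unsigned so that wrap-around is well defined in C (the product of two consecutive integers stays even modulo 2^32).

```c
#include <stdbool.h>
#include <stdint.h>

/* The ollvm-style predicate, with x and y as parameters for testing. */
bool ollvm_predicate(uint32_t x, uint32_t y) {
    return y < 10 || x * (x + 1) % 2 == 0;
}

/* Count how often the predicate is false for x in [0, limit).
 * y is fixed to 100 so the "y < 10" shortcut never hides a failure. */
uint32_t count_ollvm_predicate_failures(uint32_t limit) {
    uint32_t failures = 0;
    for (uint32_t x = 0; x < limit; x++) {
        if (!ollvm_predicate(x, 100u)) {
            failures++;
        }
    }
    return failures;
}
```

The second disjunct is true for every x, so the globals only serve to stop the decompiler from folding the expression.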

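There are clearly better candidates for opaque predicates. One example is constructing the operands of sqrt(b^2-4ac) so that b^2-4ac is always less than 0; IDA cannot yet work out the result of such an expression.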

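Or, as described in the article I borrowed from, take two distinct primes p1 and p2, two distinct positive integer constants a1 and a2, and two int variables v1 and v2 picked at random from the program. Then the following inequality always holds (^2 denotes squaring): p1*((v1|a1)^2) != p2*((v2|a2)^2)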
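The inequality can be checked with a small C sketch. The primes 7 and 11, the constants 3 and 5, and the function names are arbitrary choices for illustration, and `^2` is read as squaring. As a deliberate assumption, the inputs are limited to 16 bits and the arithmetic is done in 64 bits, so no wrap-around occurs; whether the inequality survives overflow in fixed-width arithmetic is not addressed here.

```c
#include <stdbool.h>
#include <stdint.h>

/* p1 * (v1|a1)^2 != p2 * (v2|a2)^2, evaluated without overflow. */
bool prime_predicate(uint16_t v1, uint16_t v2) {
    const uint64_t p1 = 7, p2 = 11;   /* two distinct primes */
    const uint64_t a1 = 3, a2 = 5;    /* two distinct positive constants */
    uint64_t lhs = v1 | a1;           /* OR with a1 > 0 keeps it non-zero */
    uint64_t rhs = v2 | a2;
    return p1 * lhs * lhs != p2 * rhs * rhs;
}

/* Count how often the predicate is false for v1, v2 in [0, limit). */
uint32_t count_prime_predicate_failures(uint32_t limit) {
    uint32_t failures = 0;
    for (uint32_t v1 = 0; v1 < limit; v1++) {
        for (uint32_t v2 = 0; v2 < limit; v2++) {
            if (!prime_predicate((uint16_t)v1, (uint16_t)v2)) {
                failures++;
            }
        }
    }
    return failures;
}
```

Over the integers the two sides can never be equal: the prime p1 divides the left side an odd number of times and the right side an even number of times.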

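In addition, I made some adjustments to ollvm's bogus control flow. The bcf pattern in traditional ollvm is as follows:

There are several ways to analyze this kind of bcf:

  1. Since x and y are global variables, they are easy to identify. Marking their segment as read-only in IDA is enough to let IDA resolve the opaque predicate automatically. See, for example, this article.
  2. Note also that in this scheme the Junk block is never executed at runtime, so another approach is to remove every block that is never executed. See, for example, this article.
  3. Search for the fixed pattern and remove the opaque predicate directly. See, for example, this article.

There may be other approaches; they are not listed here.

To counter approach 1, I do not introduce global variables. The opaque predicates I use involve only constants and variables the program already uses.

To counter approach 2, every bogus branch introduced by an opaque predicate targets a block that genuinely exists in the function. Although the bogus jump is never taken, its target block is really executed while the program runs.

Approach 3 is not countered yet. There is currently only one kind of opaque predicate, so once its pattern is recognized it is easy to remove.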
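A source-level sketch of that idea might look like the following. It is an illustration only, not output of the pass: the functions `scale_clamp` and `scale_clamp_bcf`, the helper `opaque_true`, and the 16-bit masking (used here to keep the arithmetic free of overflow) are all assumptions made for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Always-true predicate built only from constants and live variables. */
static bool opaque_true(uint32_t v1, uint32_t v2) {
    uint64_t a = (v1 & 0xFFFFu) | 3u;   /* masked so the squares cannot overflow */
    uint64_t b = (v2 & 0xFFFFu) | 5u;
    return 7u * a * a != 11u * b * b;
}

/* Original function: three straight-line blocks A, B, C. */
uint32_t scale_clamp(uint32_t v1, uint32_t v2) {
    uint32_t r = v1 + v2;        /* block A */
    r *= 2;                      /* block B */
    if (r > 100) r = 100;        /* block C */
    return r;
}

/* Same function with a bogus edge A -> C guarded by the opaque predicate.
 * The edge is never taken, but block C is still executed on every call,
 * so "remove blocks that never run" does not remove anything. */
uint32_t scale_clamp_bcf(uint32_t v1, uint32_t v2) {
    uint32_t r = v1 + v2;        /* block A */
    if (!opaque_true(v1, v2)) {
        goto block_c;            /* bogus jump to a real block */
    }
    r *= 2;                      /* block B */
block_c:
    if (r > 100) r = 100;        /* block C */
    return r;
}
```

Both functions return the same result for every input; only the control flow graph differs.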


Comments
  • Flattening InvokeInst can cause compilation to fail

    While testing flattening on LIEF (a C++ library), I found that when InvokeInst is flattened, a null pointer dereference can occur later in the Greedy Register Allocator, which crashes clang.

    $ make
    [  2%] Built target lief_libjson
    [  4%] Built target lief_frozen
    [  6%] Built target lief_mbed_tls
    [  8%] Built target lief_leaf
    [ 10%] Built target lief_utfcpp
    [ 12%] Built target lief_spdlog_project
    [ 12%] Building CXX object CMakeFiles/LIB_LIEF.dir/src/ELF/Builder.cpp.o
    Stack dump:
    0.      Program arguments: /home/veritas/src/llvm-project/build/bin/clang++ -DLIEF_STATIC -DMBEDTLS_MD2_C -DMBEDTLS_MD4_C -DMBEDTLS_PEM_PARSE_C -DMBEDTLS_PEM_WRITE_C -DMBEDTLS_PKCS1_V15 -DMBEDTLS_PKCS1_V21 -DMBEDTLS_X509_ALLOW_UNSUPPORTED_CRITICAL_EXTENSION -DMBEDTLS_X509_CRT_PARSE_C -DSPDLOG_DISABLE_DEFAULT_LOGGER -DSPDLOG_FUNCTION= -D_GLIBCXX_USE_CXX11_ABI=1 -I/home/veritas/src/LIEF/include -I/home/veritas/src/LIEF/api/c/include -I/home/veritas/src/LIEF/build/include -I/home/veritas/src/LIEF/build/lief_frozen-prefix/src/lief_frozen/include -I/home/veritas/src/LIEF/src -I/home/veritas/src/LIEF/build -I/home/veritas/src/LIEF/include/LIEF -isystem /home/veritas/src/LIEF/build/mbed_tls/src/lief_mbed_tls/include -isystem /home/veritas/src/LIEF/build/lief_spdlog_project-prefix/src/lief_spdlog_project/include -mllvm -fla_plus -O3 -DNDEBUG -fPIC -fvisibility=hidden -Wall -Wextra -Wpedantic -fno-stack-protector -fomit-frame-pointer -fno-strict-aliasing -fexceptions -fvisibility=hidden -Wno-expansion-to-defined -fdiagnostics-color=always -fcolor-diagnostics -std=gnu++14 -o CMakeFiles/LIB_LIEF.dir/src/ELF/Builder.cpp.o -c /home/veritas/src/LIEF/src/ELF/Builder.cpp
    1.      <eof> parser at end of file
    2.      Code generation
    3.      Running pass 'Function Pass Manager' on module '/home/veritas/src/LIEF/src/ELF/Builder.cpp'.
    4.      Running pass 'Greedy Register Allocator' on function '@_ZN4LIEF3ELF7Builder20build_symbol_versionINS0_5ELF32EEEvv'
     #0 0x000055711d3f311e llvm::sys::PrintStackTrace(llvm::raw_ostream&) (/home/veritas/src/llvm-project/build/bin/clang+++0x296711e)
     #1 0x000055711d3f0e64 llvm::sys::RunSignalHandlers() (/home/veritas/src/llvm-project/build/bin/clang+++0x2964e64)
     #2 0x000055711d3f10e1 llvm::sys::CleanupOnSignal(unsigned long) (/home/veritas/src/llvm-project/build/bin/clang+++0x29650e1)
     #3 0x000055711d36c908 CrashRecoverySignalHandler(int) (/home/veritas/src/llvm-project/build/bin/clang+++0x28e0908)
     #4 0x00007fb221eaf3c0 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x153c0)
     #5 0x000055711cb8171f (anonymous namespace)::HoistSpillHelper::getVisitOrders(llvm::MachineBasicBlock*, llvm::SmallPtrSet<llvm::MachineInstr*, 16u>&, llvm::SmallVectorImpl<llvm::DomTreeNodeBase<llvm::MachineBasicBlock>*>&, llvm::SmallVectorImpl<llvm::MachineInstr*>&, llvm::DenseMap<llvm::DomTreeNodeBase<llvm::MachineBasicBlock>*, unsigned int, llvm::DenseMapInfo<llvm::DomTreeNodeBase<llvm::MachineBasicBlock>*>, llvm::detail::DenseMapPair<llvm::DomTreeNodeBase<llvm::MachineBasicBlock>*, unsigned int> >&, llvm::DenseMap<llvm::DomTreeNodeBase<llvm::MachineBasicBlock>*, llvm::MachineInstr*, llvm::DenseMapInfo<llvm::DomTreeNodeBase<llvm::MachineBasicBlock>*>, llvm::detail::DenseMapPair<llvm::DomTreeNodeBase<llvm::MachineBasicBlock>*, llvm::MachineInstr*> >&) (.isra.0) (/home/veritas/src/llvm-project/build/bin/clang+++0x20f571f)
     #6 0x000055711cb82db5 (anonymous namespace)::HoistSpillHelper::runHoistSpills(llvm::LiveInterval&, llvm::VNInfo&, llvm::SmallPtrSet<llvm::MachineInstr*, 16u>&, llvm::SmallVectorImpl<llvm::MachineInstr*>&, llvm::DenseMap<llvm::MachineBasicBlock*, unsigned int, llvm::DenseMapInfo<llvm::MachineBasicBlock*>, llvm::detail::DenseMapPair<llvm::MachineBasicBlock*, unsigned int> >&) (.isra.0) (/home/veritas/src/llvm-project/build/bin/clang+++0x20f6db5)
     #7 0x000055711cb8b498 (anonymous namespace)::HoistSpillHelper::hoistAllSpills() (/home/veritas/src/llvm-project/build/bin/clang+++0x20ff498)
     #8 0x000055711cbff382 llvm::RegAllocBase::postOptimization() (/home/veritas/src/llvm-project/build/bin/clang+++0x2173382)
     #9 0x000055711cbcdfbd (anonymous namespace)::RAGreedy::runOnMachineFunction(llvm::MachineFunction&) (/home/veritas/src/llvm-project/build/bin/clang+++0x2141fbd)
    #10 0x000055711c95acec llvm::MachineFunctionPass::runOnFunction(llvm::Function&) (/home/veritas/src/llvm-project/build/bin/clang+++0x1ececec)
    #11 0x000055711cce73d8 llvm::FPPassManager::runOnFunction(llvm::Function&) (/home/veritas/src/llvm-project/build/bin/clang+++0x225b3d8)
    #12 0x000055711cce8999 llvm::FPPassManager::runOnModule(llvm::Module&) (/home/veritas/src/llvm-project/build/bin/clang+++0x225c999)
    #13 0x000055711cce8d60 llvm::legacy::PassManagerImpl::run(llvm::Module&) (/home/veritas/src/llvm-project/build/bin/clang+++0x225cd60)
    #14 0x000055711d658b3c clang::EmitBackendOutput(clang::DiagnosticsEngine&, clang::HeaderSearchOptions const&, clang::CodeGenOptions const&, clang::TargetOptions const&, clang::LangOptions const&, llvm::DataLayout const&, llvm::Module*, clang::BackendAction, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream> >) (/home/veritas/src/llvm-project/build/bin/clang+++0x2bccb3c)
    #15 0x000055711e26ed79 clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) (/home/veritas/src/llvm-project/build/bin/clang+++0x37e2d79)
    #16 0x000055711edecf61 clang::ParseAST(clang::Sema&, bool, bool) (/home/veritas/src/llvm-project/build/bin/clang+++0x4360f61)
    #17 0x000055711dc2fbf9 clang::FrontendAction::Execute() (/home/veritas/src/llvm-project/build/bin/clang+++0x31a3bf9)
    #18 0x000055711dbe75eb clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (/home/veritas/src/llvm-project/build/bin/clang+++0x315b5eb)
    #19 0x000055711dd064ab clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (/home/veritas/src/llvm-project/build/bin/clang+++0x327a4ab)
    #20 0x000055711b6973a1 cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (/home/veritas/src/llvm-project/build/bin/clang+++0xc0b3a1)
    #21 0x000055711b694fea ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&) (/home/veritas/src/llvm-project/build/bin/clang+++0xc08fea)
    #22 0x000055711dab1c89 void llvm::function_ref<void ()>::callback_fn<clang::driver::CC1Command::Execute(llvm::ArrayRef<llvm::Optional<llvm::StringRef> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, bool*) const::'lambda'()>(long) (/home/veritas/src/llvm-project/build/bin/clang+++0x3025c89)
    #23 0x000055711d36ca07 llvm::CrashRecoveryContext::RunSafely(llvm::function_ref<void ()>) (/home/veritas/src/llvm-project/build/bin/clang+++0x28e0a07)
    #24 0x000055711dab289e clang::driver::CC1Command::Execute(llvm::ArrayRef<llvm::Optional<llvm::StringRef> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, bool*) const (.part.0) (/home/veritas/src/llvm-project/build/bin/clang+++0x302689e)
    #25 0x000055711da892bc clang::driver::Compilation::ExecuteCommand(clang::driver::Command const&, clang::driver::Command const*&) const (/home/veritas/src/llvm-project/build/bin/clang+++0x2ffd2bc)
    #26 0x000055711da89bb9 clang::driver::Compilation::ExecuteJobs(clang::driver::JobList const&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*> >&) const (/home/veritas/src/llvm-project/build/bin/clang+++0x2ffdbb9)
    #27 0x000055711da9193f clang::driver::Driver::ExecuteCompilation(clang::driver::Compilation&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*> >&) (/home/veritas/src/llvm-project/build/bin/clang+++0x300593f)
    #28 0x000055711b60cd1e main (/home/veritas/src/llvm-project/build/bin/clang+++0xb80d1e)
    #29 0x00007fb2203900b3 __libc_start_main /build/glibc-eX1tMB/glibc-2.31/csu/../csu/libc-start.c:342:3
    #30 0x000055711b694bce _start (/home/veritas/src/llvm-project/build/bin/clang+++0xc08bce)
    clang-10: error: clang frontend command failed due to signal (use -v to see invocation)
    clang version 10.0.1 (https://github.com/llvm/llvm-project.git ef32c611aa214dea855364efd7ba451ec5ec3f74)
    Target: x86_64-unknown-linux-gnu
    Thread model: posix
    InstalledDir: /home/veritas/src/llvm-project/build/bin
    clang-10: note: diagnostic msg: PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace, preprocessed source, and associated run script.
    clang-10: note: diagnostic msg:
    ********************
    
    PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT:
    Preprocessed source(s) and associated run script(s) are located at:
    clang-10: note: diagnostic msg: /tmp/Builder-975698.cpp
    clang-10: note: diagnostic msg: /tmp/Builder-975698.sh
    clang-10: note: diagnostic msg:
    
    ********************
    make[2]: *** [CMakeFiles/LIB_LIEF.dir/build.make:1804: CMakeFiles/LIB_LIEF.dir/src/ELF/Builder.cpp.o] Error 254
    make[1]: *** [CMakeFiles/Makefile2:373: CMakeFiles/LIB_LIEF.dir/all] Error 2
    make: *** [Makefile:152: all] Error 2
    
    bug 
    opened by veritas501 0
Owner
veritas501