Programming language that compiles into a x86 ELF executable.

Overview

ocean

Programming language that compiles into a x86 ELF executable.

The main goal at the moment is to create a C compiler, which can atleast compile itself.

TODO

In no particular order and may be planned in the near or far future. Not all of them are definitive aswell and may be scrapped if other better structurally sound suggestions / ideas may arise.

It does not list every however small it be functionality of C, like statements/expressions or what may be implemented already or isn't done yet. I may compile a list of those in the future to keep track of what to work on once most of functionality you expect from C has been implemented.

Short term

  • basic features of a programming language, conditionals/loops/variables/types/order of precedence e.g (1 + 2) * 3 - 4 / 5 + (6 * 8)
  • standard library
  • other compile targets e.g https://docs.microsoft.com/en-us/windows/win32/debug/pe-format
  • self-hosting
  • a working C compiler (excluding any use of GCC/clang specific features like attribute or such)

Long(er) term

  • Add new or borrow from other language(s) features/ideas ontop of C, deviate away from just C

    • defer
    • operator overloading
    • constexpr/consteval from c++ or some #run directive (https://github.com/BSVino/JaiPrimer/blob/master/JaiPrimer.md)
    • learn from other similar projects (e.g https://news.ycombinator.com/item?id=27890888)
    • array of struct to struct of arrays conversion and vice versa https://en.wikipedia.org/wiki/AoS_and_SoA
    • something like lisp ' to pass around expressions without evaluating them
    • builtin string type that keeps track of it's capacity/size/hash and when comparing first just compares the hash then if true the actual data or index based strings
    • struct members default values
    • coroutines by either saving entire stack for that function to heap or incorporating some message system notify/waittill (https://github.com/riicchhaarrd/gsc) sort of like sleeping functions that will wake up once you call notify on some key
    • more math operators / integration
      • cross/dot operator (e.g a cross b dot c)

      • backtick or other operator e.g math to evaluate math expressions as they are in math (e.g ` or math(a . c x b) either with x . or × and ⋅ like sizeof(int))

      • parse LaTeX (e.g [ \vec{a}\cdot\vec{b} = |\vec{a}||\vec{b}|\cos\theta ] $W = \vec{F}\cdot\vec{d} = F\cos\theta d$)

      • integrate some form of a symbolic language like wolfram or error on results that can't be solved/indeterminate/don't make sense in code

  • target other ISA (instruction set architectures) e.g ARM/x86_64

  • better error handling / memory cleanup

  • LLVM or other IR backend (atm just targeting x86 and it's a great learning exercise)

Example

Example of code (may be subject to change)

int strlen(const char *s)
{
    int i = 0;
    while(s[i])
    {
        i+=1;
    }    
    return i;
}

void print(const char *s)
{
    write(1, s, strlen(s) + 1);
}

int main()
{
    print("hello, world!\n");
    
    char msg[32];
    msg[0] = 'h';
    msg[1] = 'e';
    msg[2] = 'l';
    msg[3] = 'l';
    msg[4] = 'o';
    msg[5] = ',';
    msg[6] = ' ';
    msg[7] = 'w';
    msg[8] = 'o';
    msg[9] = 'r';
    msg[10] = 'l';
    msg[11] = 'd';
    msg[12] = '!';
    msg[13] = 10;
    msg[14] = 0;

    int i;
    
    for(i = 0; i < 10; i += 1)
    {
        if(i % 2 ==0)
        {
            print(msg);
        }
    }
}

Which compiles into (no relocations, when compiling to ELF strings will be relocated and 0xcccccccc would be replaced by the actual location)

mov eax, 0
call eax
xor ebx, ebx
xor eax, eax
inc eax
int 0x80
push ebp
mov ebp, esp
sub esp, 4
mov eax, 0
push eax
lea ebx, [ebp - 4]
pop eax
mov dword [ebx], eax
mov ebx, dword [ebp + 8]
mov eax, dword [ebp - 4]
add ebx, eax
movzx eax, byte [ebx]
test eax, eax
je 0x47
mov eax, 1
push eax
lea ebx, [ebp - 4]
pop eax
add dword [ebx], eax
jmp 0x23
mov eax, dword [ebp - 4]
mov esp, ebp
pop ebp
ret
mov esp, ebp
pop ebp
ret
push ebp
mov ebp, esp
sub esp, 0
mov eax, 1
push eax
mov eax, dword [ebp + 8]
push eax
mov eax, dword [ebp + 8]
push eax
call 0xe
add esp, 4
push eax
mov eax, 1
mov ecx, eax
pop eax
xor edx, edx
add eax, ecx
mov edx, eax
pop ecx
pop ebx
mov eax, 4
int 0x80
mov esp, ebp
pop ebp
ret
push ebp
mov ebp, esp
sub esp, 0x24
mov eax, 0xcccccccc
push eax
call 0x52
add esp, 4
mov eax, 0x68
push eax
push eax
lea ebx, [ebp - 0x20]
mov eax, 0
add ebx, eax
pop eax
pop eax
mov byte [ebx], al
mov eax, 0x65
push eax
push eax
lea ebx, [ebp - 0x20]
mov eax, 1
add ebx, eax
pop eax
pop eax
mov byte [ebx], al
mov eax, 0x6c
push eax
push eax
lea ebx, [ebp - 0x20]
mov eax, 2
add ebx, eax
pop eax
pop eax
mov byte [ebx], al
mov eax, 0x6c
push eax
push eax
lea ebx, [ebp - 0x20]
mov eax, 3
add ebx, eax
pop eax
pop eax
mov byte [ebx], al
mov eax, 0x6f
push eax
push eax
lea ebx, [ebp - 0x20]
mov eax, 4
add ebx, eax
pop eax
pop eax
mov byte [ebx], al
mov eax, 0x2c
push eax
push eax
lea ebx, [ebp - 0x20]
mov eax, 5
add ebx, eax
pop eax
pop eax
mov byte [ebx], al
mov eax, 0x20
push eax
push eax
lea ebx, [ebp - 0x20]
mov eax, 6
add ebx, eax
pop eax
pop eax
mov byte [ebx], al
mov eax, 0x77
push eax
push eax
lea ebx, [ebp - 0x20]
mov eax, 7
add ebx, eax
pop eax
pop eax
mov byte [ebx], al
mov eax, 0x6f
push eax
push eax
lea ebx, [ebp - 0x20]
mov eax, 8
add ebx, eax
pop eax
pop eax
mov byte [ebx], al
mov eax, 0x72
push eax
push eax
lea ebx, [ebp - 0x20]
mov eax, 9
add ebx, eax
pop eax
pop eax
mov byte [ebx], al
mov eax, 0x6c
push eax
push eax
lea ebx, [ebp - 0x20]
mov eax, 0xa
add ebx, eax
pop eax
pop eax
mov byte [ebx], al
mov eax, 0x64
push eax
push eax
lea ebx, [ebp - 0x20]
mov eax, 0xb
add ebx, eax
pop eax
pop eax
mov byte [ebx], al
mov eax, 0x21
push eax
push eax
lea ebx, [ebp - 0x20]
mov eax, 0xc
add ebx, eax
pop eax
pop eax
mov byte [ebx], al
mov eax, 0xa
push eax
push eax
lea ebx, [ebp - 0x20]
mov eax, 0xd
add ebx, eax
pop eax
pop eax
mov byte [ebx], al
mov eax, 0
push eax
push eax
lea ebx, [ebp - 0x20]
mov eax, 0xe
add ebx, eax
pop eax
pop eax
mov byte [ebx], al
mov eax, 0
push eax
lea ebx, [ebp - 0x24]
pop eax
mov dword [ebx], eax
mov eax, dword [ebp - 0x24]
push eax
mov eax, 0xa
mov ecx, eax
pop eax
xor edx, edx
cmp eax, ecx
jge 0x202
xor eax, eax
inc eax
jmp 0x204
xor eax, eax
test eax, eax
je 0x256
mov eax, dword [ebp - 0x24]
push eax
mov eax, 2
mov ecx, eax
pop eax
xor edx, edx
idiv ecx
mov eax, edx
push eax
mov eax, 0
mov ecx, eax
pop eax
xor edx, edx
cmp eax, ecx
jne 0x232
xor eax, eax
inc eax
jmp 0x234
xor eax, eax
cmp eax, 0
je 0x245
lea eax, [ebp - 0x20]
push eax
call 0x52
add esp, 4
mov eax, 1
push eax
lea ebx, [ebp - 0x24]
pop eax
add dword [ebx], eax
jmp 0x1eb
mov esp, ebp
pop ebp
ret
You might also like...
Volatile ELF payloads generator with Metasploit integrations for testing GNU/Linux ecosystems against low-level threats
Volatile ELF payloads generator with Metasploit integrations for testing GNU/Linux ecosystems against low-level threats

Revenant Intro This tool combines SCC runtime, rofi, Msfvenom, Ngrok and a dynamic template processor, offering an easy to use interface for compiling

Easy Dump ELF libil2cpp.so from Android Process Memory

PAD (Process Android Dumper) This dumper is made for il2cpp game but you can use it in any app you want How To Use Run the process Open PADumper Put p

Tool to convert ELF (S)hared (O)bject to Nintendo (R)elocatable (S)hared (O)bject

elf2rso Tool to convert ELF (S)hared (O)bject to Nintendo (R)elocatable (S)hared (O)bject Command Line Options -i or --input - It's the ELF File to be

A simple PoC to demonstrate that is possible to write Non writable memory and execute Non executable memory on Windows

WindowsPermsPoC A simple PoC to demonstrate that is possible to write Non writable memory and execute Non executable memory on Windows You can build i

Cobalt Strike beacon object file implementation for trusted path UAC bypass. The target executable will be called without involving
Cobalt Strike beacon object file implementation for trusted path UAC bypass. The target executable will be called without involving

Beacon object file implementation for trusted path UAC bypass. The target executable will be called without involving "cmd.exe" by using DCOM object.

The C source code was RESTORED by disassembling the original executable file OPTIM.COM from the Hi-Tech v3.09 compiler.

The C source code was RESTORED by disassembling the original executable file OPTIM.COM from the Hi-Tech v3.09 compiler. This file is compiled by Hi-Te

Resolve DOS MZ executable symbols at runtime
Resolve DOS MZ executable symbols at runtime

NtSymbol Resolve DOS MZ executable symbols at runtime Example You no longer have not have to use memory pattern scan inside your sneaky rootkit. Pass

Simple one file header for hijacking windows version.dll for desired executable to do 3rd party modifying without dll injection.

Version-Hijack Simple one file header for hijacking windows version.dll for desired executable to do 3rd party modifying without dll injection. Usage

Manual mapper that uses PTE manipulation, Virtual Address Descriptor (VAD) manipulation, and forceful memory allocation to hide executable pages. (VAD hide / NX bit swapping)
Manual mapper that uses PTE manipulation, Virtual Address Descriptor (VAD) manipulation, and forceful memory allocation to hide executable pages. (VAD hide / NX bit swapping)

Stealthy Kernel-mode Injector Manual mapper that uses PTE manipulation, Virtual Address Descriptor (VAD) manipulation, and forceful memory allocation

Comments
  • Error-based recommendation

    Error-based recommendation

    Add some system where only functions marked as such can error. For example, opening a file might error, but if the error is handled, the parent function could be the equivalent of noexcept. It would definitely be safer to assume noexcept by default though. Unbounded array access should be marked as except, so that you have to implement some sort of bounds check or error handler to be noexcept. If the main() function isn't marked as except then this should guarantee that the program wont crash

    opened by ramonGonzEdu 3
  • Rust-like features

    Rust-like features

    Another idea is adding rusty features like enums with data and better switch (maybe with alternate keyword for support). This would replace something like std::variant and std::visit of c++.

    opened by ramonGonzEdu 2
  • Ability to define where to put a strict member

    Ability to define where to put a strict member

    Add a way to tell a strict member to always be at a constant offset from the strict start and fit everything else in the empty space. This would allow structs to define a common offset of N where there is a string Id so that there would only have to be 1 function for all the structs without templating.

    opened by ramonGonzEdu 0
  • fixed bug with eax, needed to push and restore, added while test,

    fixed bug with eax, needed to push and restore, added while test,

    ternary operator ast, type_qualifiers preamble, fixed infinite for loop, !=, == operator, fixed empty strings which pointer to constant memory location now, fixed array subscript stuff and some reorganizing

    opened by riicchhaarrd 0
Releases(2021-11-23)
Owner
Richard
intrigued
Richard
A port of the Linux x86 IOLI crackme challenges to x86-64

This is a port of the original Linux x86 IOLI crackme binaries to x86-64. The original set of IOLI crackmes can be found here: https://github.com/Maij

Julian Daeumer 4 Mar 19, 2022
Bungie's Oni modified so it compiles with Microsoft Visual Studio 2019.

OniFoxed What's this? This is a modified variant of the recently leaked Oni source code so that it compiles under Microsoft Visual Studio 2019 with so

Mark Sowden 59 Dec 2, 2022
Compiles c files automatically on any change

Compiles c files automatically on any change

notaweeb 9 Dec 5, 2022
PLP Project Programming Language | Programming for projects and computer science and research on computer and programming.

PLPv2b PLP Project Programming Language Programming Language for projects and computer science and research on computer and programming. What is PLP L

PLP Language 5 Aug 20, 2022
StarkScript - or the Stark programming language - is a compiled C-based programming language that aims to offer the same usability as that of JavaScript's and TypeScript's

StarkScript StarkScript - or the Stark programming language - is a compiled C-based programming language that aims to offer the same usability as that

EnderCommunity 5 May 10, 2022
Jaws is an invisible programming language! Inject invisible code into other languages and files! Created for security research -- see blog post

Jaws is an invisible interpreted programming language that was created for antivirus research. Since Jaws code is composed entirely of whitespace char

C.J. May 208 Dec 9, 2022
Remap ELF LOAD segments to huge pages

Quick start Not recommended as a production solution, but it's a very fast way to benchmark if your application benefits from remapping your text and

null 19 Dec 21, 2022
Elven relativism -- relocation and execution of aarch64 ELF relocatable objects (REL)

elvenrel Elven Relativism -- relocation and execution of aarch64 ELF relocatable objects (REL) on Linux and macOS. Program loads a multitude of ELF RE

Martin Krastev 15 Oct 15, 2022
A repository for experimenting with elf loading and in-place patching of android native libraries on non-android operating systems.

droidports: A repository for experimenting with elf loading and in-place patching of android native libraries on non-android operating systems. Discla

João Henrique 26 Dec 15, 2022
A utility to run ELF files in memory.

execelf - A utility to execute ELF files in memory. execelf is small utility for running ELF files in memory, without touching the disk! Installation

null 11 Nov 21, 2022