Simple Virtual Machine with its own Bytecode and Assembly language.

Overview

BM

Build

We use the nobuild build system, which requires a bootstrapping step with any reasonably standards-compliant C compiler.

On Linux/MacOS/FreeBSD/literally any OS on the planet Earth except Windows with MSVC:

$ cc -o nobuild nobuild.c
$ ./nobuild help

If you still want to use MSVC on Windows, run vcvarsall.bat and then, from within the MSVC development environment:

> cl.exe nobuild.c
> nobuild.exe help

Building the libbm Library

$ ./nobuild lib

The static library will be placed in ./build/library/.

Building the Toolchain

$ ./nobuild tools

The binaries of the toolchain will be placed in ./build/toolchain/.

Building and Running Examples

$ ./nobuild examples

The examples will be placed in ./build/examples/.

To run the examples, use the bme executable from the toolchain:

$ ./build/toolchain/bme -i ./build/examples/hello.bm
$ ./build/toolchain/bme -i ./build/examples/fib.bm
$ ./build/toolchain/bme -i ./build/examples/e.bm
$ ./build/toolchain/bme -i ./build/examples/pi.bm

Adding More Examples

nobuild examples automatically builds all the ./examples/*.basm files. So if you want to add a new example to the build, just add a *.basm file to ./examples/.
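
As a rough illustration of this kind of pattern-based selection, the C sketch below checks whether a file name ends in .basm. The helper name has_suffix is hypothetical and is not taken from nobuild; it only shows the sort of check a build script could apply to each entry of ./examples/.

```c
#include <stdbool.h>
#include <string.h>

// Hypothetical helper (not part of nobuild): returns true when `name`
// ends with `suffix`, e.g. has_suffix("fib.basm", ".basm").
bool has_suffix(const char *name, const char *suffix)
{
    size_t name_len = strlen(name);
    size_t suffix_len = strlen(suffix);
    if (suffix_len > name_len) {
        return false;
    }
    // Compare only the tail of `name` against `suffix`.
    return memcmp(name + name_len - suffix_len, suffix, suffix_len) == 0;
}
```

With such a check, a file like ./examples/fib.basm would be selected, while fib.bm or a README would be skipped.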

Running and Recording Tests

TBD

Toolchain

basm

Assembly language for the Virtual Machine. For examples, see the ./examples/ folder.

bme

BM emulator. Used to run programs generated by basm.

bdb

BM debugger. Used to step-debug programs generated by basm.

debasm

Disassembler for the binary files generated by basm.

bmr

BM recorder. Used to record the output of binary files generated by basm and to compare that output to the expected one. We use this tool for integration testing.
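
The comparison step can be pictured with a minimal C sketch. This is not bmr's actual code: the function name outputs_match and its signature are assumptions made for illustration, and the real tool may compare outputs differently.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

// Hypothetical helper (not bmr's actual code): a recorded output matches
// the expected one only when both have the same length and the same bytes.
bool outputs_match(const char *actual, size_t actual_size,
                   const char *expected, size_t expected_size)
{
    return actual_size == expected_size &&
           memcmp(actual, expected, actual_size) == 0;
}
```

Comparing sizes first means a program that prints a correct prefix and then stops early is still reported as a mismatch.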

basm2nasm

An experimental tool that translates BM files generated by basm to assembly files in the NASM dialect for x86_64 Linux.

expr2dot

Accepts a BASM expression as a command line argument and dumps its AST in dot format, which you can later render with graphviz.

$ ./build/toolchain/expr2dot "f(a) + g(b) + 69 > 420" | dot -Tsvg > ast.svg
$ iexplore.exe ast.svg

Editor Support

Emacs

An Emacs mode is available in ./tools/basm-mode.el. Until the language stabilizes and we upload the mode to MELPA, you need to install it manually.

Add the following lines to your .emacs file:

(add-to-list 'load-path "/path/to/basm-mode/")
(require 'basm-mode)

Vim

Copy ./tools/basm.vim to .vim/syntax/basm.vim. Add the following line to your .vimrc file:

autocmd BufRead,BufNewFile *.basm set filetype=basm
Comments
  • [bang] SIGSEGV upon nullptr dereference on FreeBSD

    511daade172717fac238b9b2c8673df77d38ea67 breaks

    See:

    [[email protected] ~/src/bm]$ gg build/toolchain/bang -t nasm-freebsd-x86-64 examples/while.bang 
    Reading symbols from build/toolchain/bang...
    (gdb) r
    Starting program: /usr/home/nico/src/bm/build/toolchain/bang -t nasm-freebsd-x86-64 examples/while.bang
    
    Program received signal SIGSEGV, Segmentation fault.
    0x00000000002161a8 in precompute_label_locations (basm=0x22c610 <main.basm>) at src/library/nasm_sysv_x86_64.c:44
    44          for (size_t i = 0; i < basm->global_scope->bindings_size; i++) {
    (gdb) bt full
    #0  0x00000000002161a8 in precompute_label_locations (basm=0x22c610 <main.basm>) at src/library/nasm_sysv_x86_64.c:44
            i = 0
            label_locations = 0x800a09de8
    #1  0x0000000000213560 in basm_save_to_file_as_nasm_sysv_x86_64 (basm=0x22c610 <main.basm>, os_target=OS_TARGET_FREEBSD, 
        output_file_path=0x800a09668 "./while.S") at src/library/nasm_sysv_x86_64.c:108
            output = 0x800499d00
            label_locations = 0x22c610 <main.basm>
            jmp_count = 3
    #2  0x000000000020ef3f in basm_save_to_file_as_target (basm=0x22c610 <main.basm>, output_file_path=0x800a09668 "./while.S", 
        target=TARGET_NASM_FREEBSD_X86_64) at src/library/basm.c:255
    No locals.
    #3  0x000000000020bb9d in main (argc=0, argv=0x7fffffffe5b0) at src/toolchain/bang.c:112
            basm = <error reading variable basm (value of type `Basm' requires 1103064 bytes, which is more than max-value-size)>
            program = 0x7fffffffe8d8 "/usr/home/nico/src/bm/build/toolchain/bang"
            input_file_path = 0x7fffffffe91a "examples/while.bang"
            output_file_path = 0x800a09668 "./while.S"
            output_target = TARGET_NASM_FREEBSD_X86_64
            content = {count = 133, 
              data = 0x800a09690 "var i: i64;\n\nproc main() {\n  write(\"Begin\\n\");\n  i = 0;\n  while i < 10 {\n    write(\"Hey!\\n\");\n    i = i + 1;\n  }\n  write(\"End\\n\");\n}\n"}
            lexer = {content = {count = 0, data = 0x800a09715 ""}, line_start = 0x800a09713 "}\n", line = {count = 0, 
                data = 0x800a09714 "\n"}, file_path = 0x7fffffffe91a "examples/while.bang", row = 11, peek_buffer = {tokens = {{
                    kind = BANG_TOKEN_KIND_SEMICOLON, text = {count = 1, data = 0x800a09711 ";\n}\n"}, loc = {row = 10, col = 17, 
                      file_path = 0x7fffffffe91a "examples/while.bang"}}, {kind = BANG_TOKEN_KIND_CLOSE_CURLY, text = {count = 1, 
                      data = 0x800a09713 "}\n"}, loc = {row = 11, col = 1, file_path = 0x7fffffffe91a "examples/while.bang"}}}, 
                begin = 0, count = 0}}
            module = {tops_begin = 0x800a09718, tops_end = 0x800a09758}
            bang = <error reading variable bang (value of type `Bang' requires 106544 bytes, which is more than max-value-size)>
    (gdb) p basm->global_scope 
    $1 = (Scope *) 0x0
    (gdb) 
    

    I have no idea how this is even supposed to work. Obviously it must segfault because basm->global_scope is never initialized.

    #21 should definitely be on a priority list now.
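
The crash in the backtrace is a read of bindings_size through a NULL global_scope. A defensive C sketch of that access follows. The Scope and Basm types here are simplified, hypothetical stand-ins that model only the two fields visible in the gdb output, and treating a missing scope as empty is an assumption rather than the project's actual fix.

```c
#include <stddef.h>

// Simplified, hypothetical stand-ins for the real types in the repo;
// only the two fields visible in the backtrace are modelled.
typedef struct {
    size_t bindings_size;
} Scope;

typedef struct {
    Scope *global_scope;
} Basm;

// Returns the number of global bindings, treating a missing
// (never initialized) global scope as empty instead of dereferencing NULL.
size_t global_bindings_count(const Basm *basm)
{
    if (basm == NULL || basm->global_scope == NULL) {
        return 0;
    }
    return basm->global_scope->bindings_size;
}
```

A loop such as the one at nasm_sysv_x86_64.c:44 could then iterate up to this count without faulting, although initializing global_scope properly is likely the better long-term fix.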

    opened by herrhotzenplotz 5
  • (#373) Fix pi = 4 in amd64 assembly-gen

    • Clean out rax before doing the comparison to sweep out the upper bits as only the lower 16 bits get set (Thanks to abridgewater)
    • Fix cleaning xmm0 (Thanks to kolumbetko)
    opened by herrhotzenplotz 4
  • (#295) Add SDL binary dependency to the repo

    Close #295. Just added the files from the include and lib folders of https://www.libsdl.org/release/SDL2-devel-2.0.12-VC.zip. The lib files are the 64-bit versions. Didn't include SDL2test.lib and SDL2.dll. But now on Windows the sdl example works if I execute these commands:

    cl.exe nobuild.c
    nobuild examples
    build\toolchain\bme build\examples\sdl.bm -n build\wrappers\libbm_sdl.dll
    
    opened by kolumb 4
  • Segmentation Fault in amd64 executables

    In the Linuxulator I see:

    [[email protected] ~/src/bm]$ uname -ap
    FreeBSD triton.herrhotzenplotz.geek 13.0-ALPHA1 FreeBSD 13.0-ALPHA1 #2 main-c111-g6eebda3bb: Thu Jan 14 21:17:40 CET 2021     [email protected]:/usr/obj/usr/src/amd64.amd64/sys/GENERIC-NODEBUG  amd64 amd64
    [[email protected] ~/src/bm]$ 
    [[email protected] ~/src/bm]$ ./build-x86_64.sh  
    ...
    [[email protected] ~/src/bm]$ brandelf -t Linux ./build/examples/fib.exe
    [[email protected] ~/src/bm]$ ./build/examples/fib.exe
    [5] + Segmentation fault - core dumped ./build/examples/fib.exe
    [[email protected] ~/src/bm]$ file ./build/examples/fib.exe
    ./build/examples/fib.exe: ELF 64-bit LSB executable, x86-64, version 1 (GNU/Linux), statically linked, with debug_info, not stripped
    [[email protected] ~/src/bm]$ gdb ./build/examples/fib.exe fib.exe.core 
    Reading symbols from ./build/examples/fib.exe...
    
    warning: core file may not match specified executable file.
    [New LWP 100639]
    Core was generated by `./build/examples/fib.exe'.
    Program terminated with signal SIGSEGV, Segmentation fault.
    #0  0x000000000020177c in inst_49 ()
    (gdb) where
    #0  0x000000000020177c in inst_49 ()
    #1  0x0000000000000001 in ?? ()
    #2  0x00007fffffffd7d0 in ?? ()
    #3  0x0000000000000000 in ?? ()
    (gdb) x/i $pc
    => 0x20177c <inst_49+8>:    movq   $0xa,(%rsi)
    (gdb) p/x $rsi
    $1 = 0x2a2000
    (gdb) 
    

    Same applies to 123i.exe. No idea whether that is expected or not. If you need the corefile or the gdb trace of 123i.exe for debugging purposes, please let me know :-)

    opened by herrhotzenplotz 4
  • Build on FreeBSD 12.2-RELEASE-p3 is broken

    Compile log: http://paste.debian.net/plainh/5d93a9c7

    Other details:

    [[email protected] ~/src/bm]$ uname -apKU
    FreeBSD hades.herrhotzenplotz.geek 12.2-RELEASE-p3 FreeBSD 12.2-RELEASE-p3 GENERIC  amd64 amd64 1202000 1202000
    [[email protected] ~/src/bm]$ cc --version
    FreeBSD clang version 10.0.1 ([email protected]:llvm/llvm-project.git llvmorg-10.0.1-0-gef32c611aa2)
    Target: x86_64-unknown-freebsd12.2
    Thread model: posix
    InstalledDir: /usr/bin
    [[email protected] ~/src/bm]$ 
    

    I have no idea what is going on with the static assert, but the stat errors seem to be related to preprocessor magic before the includes. See: https://www.freebsd.org/cgi/man.cgi?query=stat&apropos=0&sektion=2&manpath=FreeBSD+12.2-RELEASE+and+Ports&arch=default&format=html

    EDIT: It does compile, if I comment out on the top of basm.c:

    //#    define _DEFAULT_SOURCE
    //#    define _POSIX_C_SOURCE 200112L
    
    opened by herrhotzenplotz 3
  • Compilation warnings on clang

    When compiling nobuild with clang version 11.0.1 it spits out a lot of warnings about using assignment as a condition.

    ❯ clang nobuild.c -o nobuild
    In file included from nobuild.c:2:
    ././nobuild.h:544:1: warning: non-void function does not return a value [-Wreturn-type]
    }
    ^
    ././nobuild.h:634:23: warning: using the result of an assignment as a condition without parentheses [-Wparentheses]
                if (errno = ENOENT) {
                    ~~~~~~^~~~~~~~
    ././nobuild.h:634:23: note: place parentheses around the assignment to silence this warning
                if (errno = ENOENT) {
                          ^
                    (             )
    ././nobuild.h:634:23: note: use '==' to turn this assignment into an equality comparison
                if (errno = ENOENT) {
                          ^
                          ==
    ././nobuild.h:643:23: warning: using the result of an assignment as a condition without parentheses [-Wparentheses]
                if (errno = ENOENT) {
                    ~~~~~~^~~~~~~~
    ././nobuild.h:643:23: note: place parentheses around the assignment to silence this warning
                if (errno = ENOENT) {
                          ^
                    (             )
    ././nobuild.h:643:23: note: use '==' to turn this assignment into an equality comparison
                if (errno = ENOENT) {
                          ^
                          ==
    nobuild.c:170:40: warning: field width should have type 'int', but argument has type 'unsigned long' [-Wformat]
            fprintf(stream, "./nobuild %s%*s - %s\n",
                                         ~~^
    4 warnings generated.
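
The -Wparentheses warnings above point at `if (errno = ENOENT)`, which assigns ENOENT to errno and is therefore always true, since ENOENT is nonzero. A minimal sketch of the comparison the compiler suggests is shown below. The function name failed_with_enoent is made up for illustration and does not appear in nobuild.h.

```c
#include <errno.h>
#include <stdbool.h>

// Hypothetical helper: reports whether errno currently holds ENOENT
// ("No such file or directory"). Note the `==` comparison, not `=`,
// so errno itself is left untouched.
bool failed_with_enoent(void)
{
    return errno == ENOENT;
}
```

Unlike the assignment form, this check does not overwrite errno, so a different error such as EACCES is still visible to later code.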
    
    opened by aodhneine 3
  • file(1) maintainer deems magic marker "really weak" for 2021

    The maintainer of file(1) command has accepted a patch to support the Birtual Machine.

    However, he states

    I added it, but the magic is really weak. It is 2021 and we are making 2 byte magic entries? Please let the developers know that they should have at least a 4 byte magic. If there are conflicts/complaints in the future it will be removed.

    Perhaps the magic marker should be "bomb", for Born Of Mighty Birtualization.

    I checked, "bomb" is not taken yet.
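
To see why a 2-byte marker is weak, consider this C sketch of a prefix check. The function name starts_with_magic is hypothetical, and "bomb" is used only because it is the value proposed above, not because the project adopted it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

// Hypothetical helper: returns true when `data` begins with the
// `magic_size` bytes of `magic`. Buffers shorter than the magic never match.
bool starts_with_magic(const unsigned char *data, size_t data_size,
                       const char *magic, size_t magic_size)
{
    return data_size >= magic_size &&
           memcmp(data, magic, magic_size) == 0;
}
```

A PC Bitmap file also begins with the bytes 'B' 'M', so it passes the 2-byte check for "BM" but fails a 4-byte check for "bomb".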

    low 
    opened by catull 3
  • Magic marker "BM" is already taken

    In trying to encode the magic(5) definitions for the file(1) utility, I discovered that the magic marker 'BM' is already used by PC Bitmap files. It is documented in the Bitmap file header.

    Perhaps the Birtual Machine magic marker could be redefined as БМ or as 'bm'.

    opened by catull 3
  • Differing definitions of `int64_t` on macos

    On macos (no idea about "real" BSD), int64_t is defined as long long, while on linux it's defined as long. This causes issues with the printf formatting generating warnings (it won't actually affect the runtime since long is still also 8 bytes)

    There's inttypes.h and PRId64 nonsense, but that might get ugly...
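
For reference, the inttypes.h approach looks roughly like this. The wrapper name format_i64 is hypothetical; PRId64 and snprintf are standard C99.

```c
#include <inttypes.h>
#include <stdio.h>

// Formats `value` into `buffer` portably: PRId64 expands to the correct
// length modifier whether int64_t is `long` or `long long`.
// Returns what snprintf returns (the number of characters needed,
// excluding the terminating NUL).
int format_i64(char *buffer, size_t buffer_size, int64_t value)
{
    return snprintf(buffer, buffer_size, "%" PRId64, value);
}
```

The format string is built by string-literal concatenation, so the same source compiles without -Wformat warnings on both Linux and macOS.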

    opened by zhiayang 3
  • Add FreeBSD target for Native Codegen via NASM

    Adapts syscall-IDs for FreeBSD so we don't have to use `brandelf'.

    I renamed `basm_save_to_file_as_nasm_linux_x86_64' to `basm_save_to_file_as_nasm_sysv_x86_64' since effectively we're using the SystemV AMD64 calling convention here. The only thing that differs is the syscall ids, which are passed via the additional argument that the function now takes.

    Also, I'm not quite sure about the type name `Syscall_Target' so if anyone has got a better name: go ahead!

    opened by herrhotzenplotz 2
  • Make basm2nasm a bit better

    Call basm_translate_root_source_file instead of basm_translate_source_file so it doesn't crash because of scopes

    All FIXME's are now straight up asserts

    Also now it can handle include paths, instructions for comparing integers (gei/geu, gti/gtu, lei/leu, lti/ltu, nei/neu and equ) and multu

    HACKERMANS

    opened by bit9tream 2
  • Introduce the memory_base field in the BM file metadata

    This makes bm bytecode compiled from bang significantly smaller because the stack is no longer included in the file. Furthermore, debasm looks way cleaner because it doesn't print the stack anymore.

    The behavior of native targets remains the same.

    opened by redoste 1
Owner
Tsoding
Recreational Programming