A C compiler written in Zig.

Aro

A C compiler with the goal of providing fast compilation and low memory usage with good diagnostics.

Currently it can preprocess, parse, and semantically analyze ~85% of standard C17; work is still needed to support all of the usual extensions.

Basic code generation is supported for x86-64 Linux and can produce a valid hello world program:

$ cat hello.c
extern int printf(const char *restrict fmt, ...);
int main(void) {
    printf("Hello, world!\n");
    return 0;
}
$ zig build run -- hello.c -c
$ zig run hello.o -lc
Hello, world!
$

Future plans for the project include making it the C backend of Zig's translate-c feature and making it an optional C frontend for the self-hosted Zig compiler.

#define MAIN ma##in

#ifndef FOO
int *something[5];
#endif

#if defined MAIN
int MAIN(int argc, const char *argv[]) {
    return (argc * (char)4)[argv];
}
#endif

var: '[5]*int'
 name: something

fn_def: 'fn (argc: int, argv: **const char) int'
 name: main
 body:
  compound_stmt_two: 'void'
    return_stmt: 'void'
     expr:
      array_access_expr: '*const char' lvalue
       lhs:
        lval_to_rval: '**const char'
          decl_ref_expr: '**const char' lvalue
           name: argv
       index:
        paren_expr: 'int'
         operand:
          mul_expr: 'int'
           lhs:
            lval_to_rval: 'int'
              decl_ref_expr: 'int' lvalue
               name: argc
           rhs:
            int_cast: 'int' (value: 4)
              cast_expr: 'char'
               operand:
                int_literal: 'int'
                 value: 4

Types are printed in Zig style because C types are more confusing than necessary; actual error messages contain proper C-like types.
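
The `array_access_expr` node in the dump has `argv` as its `lhs` even though the source writes `(argc * (char)4)[argv]`. This works because C defines `E1[E2]` as `*((E1) + (E2))`, so the operands may appear in either order. A minimal sketch of that rule, using a hypothetical helper `pick` that is not part of Aro:

```c
#include <assert.h>

/* Hypothetical helper, not part of Aro: returns the element at `index`
 * using the reversed subscript form exercised by the example above. */
static const char *pick(const char *items[], int index) {
    /* E1[E2] is defined as *((E1) + (E2)), so the integer may come first.
     * The (char) operand is promoted to int before the multiplication. */
    const char *reversed = (index * (char)1)[items];
    assert(reversed == items[index]);
    return reversed;
}
```

Calling `pick(words, 2)` on an array of three strings yields the same pointer as `words[2]`.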

Comments
  • WIP: String interning for types

    Initial pass at #315

    If this approach looks good I can apply it to the other []const u8 fields in Type.zig.

    One slightly annoying thing is juggling the string_interner in the test runner, to make sure it doesn't get de-inited. I could make the string interner not be owned by the Compilation, but I'm not sure that feels right either.

    opened by ehaas 16
  • Record layout

    Here is my first cut at record layout ported from this Rust code https://github.com/mahkoh/repr-c.

    I'm looking for feedback and/or a punch list before merging.

    It's a bit delicate. If you look at it sideways one of the tests will likely break. I'm skipping a good % of the test cases for various reasons (#300, #301), as well as others I haven't dug into yet. There are also some tests that clang-13 fails on, so there is some discrepancy in Rust tests and clang. I need to look into those.

    The tests are auto-generated from the Rust code. Because of that test/test_records.zig is doing the filtering for which tests to skip (or parts of tests to skip).

    I shoved a bunch of stuff into Type.Record and Type.Record.Field. I think I can get rid of some of that in the future, but I'd like to leave it now as it makes debugging easier.

    I haven't done any work towards getting other targets to work. I'm currently just trying to mimic clang on x86 linux.

    There are some asserts as well as panics, and I haven't done any errors or warnings yet.

    I'm happy for any feedback, even nitpicky "that's not very zig-like" feedback. It's a new-ish language to me. Or major "this is all wrong" is fine as well.

    opened by TwoClocks 11
  • Basic validation for `aligned` attribute

    Based on doing this, I think we should probably do #40 first to make token locations better and make it easier to get the values we'll be working with.

    opened by ehaas 11
  • Calculate record size and alignment

    Currently both just default to 1.

    Note: the size calculation should happen after the attributes after the type are parsed since they may contain a packed attribute.

    enhancement 
    opened by Vexu 10
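
The note about parsing trailing attributes first can be illustrated with a small sketch. It assumes the GCC/Clang `__attribute__((packed))` extension; the struct names and the helper `padding_saved` are hypothetical and not taken from Aro:

```c
#include <assert.h>
#include <stddef.h>

/* Natural layout: the compiler may insert padding after `tag`. */
struct plain {
    char tag;
    int value;
};

/* GCC/Clang extension: the attribute follows the member list, so the final
 * size is only known once the attribute has been parsed. */
struct packed_rec {
    char tag;
    int value;
} __attribute__((packed));

/* Hypothetical helper: bytes of padding removed by packing. */
static size_t padding_saved(void) {
    return sizeof(struct plain) - sizeof(struct packed_rec);
}

_Static_assert(sizeof(struct packed_rec) == sizeof(char) + sizeof(int),
               "packed record has no padding");
_Static_assert(_Alignof(struct packed_rec) == 1,
               "packed record is byte-aligned");
```

On x86-64 Linux with GCC or Clang the unpacked struct is typically 8 bytes and the packed one 5, so computing the size before seeing the attribute would give the wrong answer.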
  • Parser: improve typeof support

    Add two new type specifiers: typeof_type and typeof_expr, which are the types returned by typeof (depending on whether it's called with a type or an expression)

    This allows us to track the underlying type or expression that was used.

    opened by ehaas 10
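
The two forms of `typeof` that the new specifiers distinguish look like this in C source. This is a sketch using the GCC/Clang spelling `__typeof__`; the typedef names and the `SWAP` macro are hypothetical illustrations, not Aro code:

```c
#include <assert.h>

/* Operand is a type name (the typeof_type case). */
typedef __typeof__(int *) from_type;

/* Operand is an expression (the typeof_expr case); 1.5 + 1 has type double. */
typedef __typeof__(1.5 + 1) from_expr;

/* Hypothetical macro: swaps two values of whatever type `a` has. */
#define SWAP(a, b) \
    do { __typeof__(a) tmp_ = (a); (a) = (b); (b) = tmp_; } while (0)

static int swapped_first(int x, int y) {
    SWAP(x, y);
    return x; /* holds the original value of y */
}
```

Keeping the original operand around, as the pull request describes, would let a compiler print the type as written rather than only its resolved form.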
  • Compiling a simple hello world program

    ❯ arocc test.c -o test
    thread 12150 panic: integer cast truncated bits
    /home/varlad/arocc/src/Parser.zig:226:19: 0x2b86bc in Parser.addList (arocc)
        const start = @intCast(u32, p.data.items.len);
                      ^
    /home/varlad/arocc/src/Parser.zig:3275:71: 0x2a317c in Parser.condExpr (arocc)
            .data = .{ .if3 = .{ .cond = cond.node, .body = (try p.addList(&.{ then_expr.node, else_expr.node })).start } },
                                                                          ^
    /home/varlad/arocc/src/Parser.zig:2491:27: 0x28a5c0 in Parser.macroExpr (arocc)
        const res = p.condExpr() catch |e| switch (e) {
                              ^
    /home/varlad/arocc/src/Preprocessor.zig:415:28: 0x26ef1f in Preprocessor.expr (arocc)
        return parser.macroExpr();
                               ^
    /home/varlad/arocc/src/Preprocessor.zig:121:40: 0x267f80 in Preprocessor.preprocess (arocc)
                            if (try pp.expr(&tokenizer)) {
                                           ^
    /home/varlad/arocc/src/Preprocessor.zig:1098:22: 0x26fecb in Preprocessor.include (arocc)
        try pp.preprocess(new_source);
                         ^
    /home/varlad/arocc/src/Preprocessor.zig:205:55: 0x268a8a in Preprocessor.preprocess (arocc)
                        .keyword_include => try pp.include(&tokenizer),
                                                          ^
    /home/varlad/arocc/src/Preprocessor.zig:1098:22: 0x26fecb in Preprocessor.include (arocc)
        try pp.preprocess(new_source);
                         ^
    /home/varlad/arocc/src/Preprocessor.zig:205:55: 0x268a8a in Preprocessor.preprocess (arocc)
                        .keyword_include => try pp.include(&tokenizer),
                                                          ^
    /home/varlad/arocc/src/Preprocessor.zig:1098:22: 0x26fecb in Preprocessor.include (arocc)
        try pp.preprocess(new_source);
                         ^
    /home/varlad/arocc/src/Preprocessor.zig:205:55: 0x268a8a in Preprocessor.preprocess (arocc)
                        .keyword_include => try pp.include(&tokenizer),
                                                          ^
    /home/varlad/arocc/src/main.zig:209:22: 0x25f5ac in processSource (arocc)
        try pp.preprocess(source);
                         ^
    /home/varlad/arocc/src/main.zig:196:22: 0x25884e in handleArgs (arocc)
            processSource(comp, source, builtin, user_macros) catch |e| switch (e) {
                         ^
    /home/varlad/arocc/src/main.zig:33:15: 0x24f8c2 in main (arocc)
        handleArgs(&comp, args) catch |err| switch (err) {
                  ^
    /home/varlad/zig-master/lib/std/start.zig:507:29: 0x2481cc in std.start.callMain (arocc)
                return root.main();
                                ^
    /home/varlad/zig-master/lib/std/start.zig:452:12: 0x22c1ce in std.start.callMainWithArgs (arocc)
        return @call(.{ .modifier = .always_inline }, callMain, .{});
               ^
    /home/varlad/zig-master/lib/std/start.zig:366:17: 0x22b1f6 in std.start.posixCallMainAndExit (arocc)
        std.os.exit(@call(.{ .modifier = .always_inline }, callMainWithArgs, .{ argc, argv, envp }));
                    ^
    /home/varlad/zig-master/lib/std/start.zig:279:5: 0x22b002 in std.start._start (arocc)
        @call(.{ .modifier = .never_inline }, posixCallMainAndExit, .{});
        ^
    fish: Job 1, 'arocc test.c -o test' terminated by signal SIGABRT (Abort)
    

    where test.c reads

    #include "stdio.h"
    
    int main()
    {
    	printf("Well, heck");
    	return 0;
    }
    

    Built with Zig v0.9.0-dev.861+311797f68 via zig build

    enhancement 
    opened by VarLad 9
  • Parser: Compute sizeof / alignof struct and union types

    Compute sizeof/alignof records.

    This isn't a perfect solution - I'm unsure if it works identically to other compilers in all cases & it definitely doesn't factor in packed attributes. But it's good enough - I wanted to load netinet/in.h, which fails with the following error otherwise:

    Compiler error:
    /usr/include/netinet/in.h:244:53: warning: overflow in expression; result is '18446744073709551615' [-Winteger-overflow]
        unsigned char sin_zero[sizeof (struct sockaddr) -
                                                        ^
    /usr/include/netinet/in.h:244:27: error: array is too large
        unsigned char sin_zero[sizeof (struct sockaddr) -
    
    
    // Because of this expression:
    unsigned char sin_zero[sizeof (struct sockaddr) -
                            __SOCKADDR_COMMON_SIZE -
                            sizeof (in_port_t) -
                            sizeof (struct in_addr)];
    

    Two questions:

    1. How can I report a compiler error? See line 1711. I poked around Compilation.zig but nothing jumped out.
    2. The test is obviously flawed, I just matched whatever GCC produced on my system. I can either remove the test, or maybe there's a way to skip it on non-x64-linux systems?
    opened by tomc1998 8
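
The overflow warning quoted above is what unsigned wraparound looks like: `sizeof` yields a `size_t`, so subtracting more bytes than the record is believed to contain wraps around instead of going negative. A sketch with a hypothetical helper, assuming the record size was a small placeholder value at the time (the sizes passed in are illustrative, not read from the glibc headers):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the array-length expression: `record_size` plays
 * the role of sizeof(struct sockaddr), `used` the bytes subtracted from it. */
static size_t remaining_bytes(size_t record_size, size_t used) {
    /* size_t is unsigned, so a too-small record_size wraps around to a
     * huge value rather than producing a negative number. */
    return record_size - used;
}
```

With a placeholder size of 1 and 2 bytes subtracted, the result is `SIZE_MAX`, which on a 64-bit target is 18446744073709551615, the value shown in the warning.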
  • Tree: store cast kind as part of implicit/explicit cast nodes

    Briefly discussed here: https://github.com/Vexu/arocc/pull/232#issuecomment-1030530859

    If you like this approach I can fill out the fromExplicitCast function.

    opened by ehaas 8
  • Parser: Rudimentary attribute parsing

    Putting this up as a draft to see what you think; right now it just consumes attributes (in some but not all places where they are allowed) but doesn't do anything with them.

    Remaining work includes but is not limited to:

    • [x] Implement __has_attribute preprocessor function
    • [ ] merge consecutive attribute specifiers?
    • [ ] Verify arg count and types for each attribute
    • [ ] Strategy for attributed statements, var decls, function decls, types, etc
    • [ ] Actually do something with the attributes
    opened by ehaas 8
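
For context, `__has_attribute` is typically used from C code to guard attribute use behind a feature test. A minimal sketch, where the macro names `HAVE_ALIGNED` and `ALIGNED_16` and the `buffer` type are hypothetical:

```c
#include <assert.h>

/* Fallback for compilers without the __has_attribute preprocessor function. */
#ifndef __has_attribute
#define __has_attribute(name) 0
#endif

#if __has_attribute(aligned)
#define HAVE_ALIGNED 1
#define ALIGNED_16 __attribute__((aligned(16)))
#else
#define HAVE_ALIGNED 0
#define ALIGNED_16
#endif

/* Hypothetical type used only for illustration. */
struct ALIGNED_16 buffer {
    char bytes[4];
};

/* Returns nonzero when the attribute, if available, took effect. */
static int buffer_alignment_ok(void) {
    return HAVE_ALIGNED ? _Alignof(struct buffer) == 16 : 1;
}
```

A compiler that consumes the attribute but does nothing with it, as the draft describes, would still accept this code but report the natural alignment.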
  • record layout test suite

    Adds a test suite for record layout.

    • Runs permutations of targets and tests
    • Has expected failures for some tests due to various zig/arocc issues, including parser failures.
    • will error if a test passes when it currently expects to fail
    • Can run a specific platform/test for local debugging
    • readme for more info
    opened by TwoClocks 6
  • Should not warn if control flow reaches closing `}` of main

    Since C99, if the return type of main is compatible with int and control flow reaches the closing }, then main implicitly returns 0. GCC and clang issue no warnings.

    Skipping the warning is easy enough, but how should we handle the AST - it looks like .implicit_return nodes don't have any associated data. Should we have a separate node type for implicit_main_return?

    enhancement 
    opened by ehaas 6
  • MSVC takes required alignment of child types into account

    MSVC seems to walk down the type stack.

    __declspec(align(8)) typedef int I1;
    __declspec(align(1)) typedef I1 I2;
    
    _Static_assert(sizeof(I1) == 4, "");
    _Static_assert(_Alignof(I1) == 8, "");
    _Static_assert(sizeof(I2) == 4, "");
    _Static_assert(_Alignof(I2) == 8, "");
    

    where GCC/Clang don't

    typedef int I1 __attribute__((aligned(8)));
    typedef I1 I2 __attribute__((aligned(1)));
    
    _Static_assert(sizeof(I1) == 4, "");
    _Static_assert(_Alignof(I1) == 8, "");
    _Static_assert(sizeof(I2) == 4, "");
    _Static_assert(_Alignof(I2) == 1, "");
    
    
    enhancement 
    opened by TwoClocks 0
  • Attribute: handle typedef decreasing record alignment

    This addresses the GCC/clang side of #367. It's also a pretty ugly hack, so feel free to close it if you'd prefer to handle it some other way. Short of reworking the entire Type system, another way I think it could be handled would be adding a ?u29 alignment field to Type.Attributed to store typedef alignments.

    The fundamental problem is that after parse time there's no way to distinguish "struct with attributes" from "typedef which adds attributes to a struct". It's relevant because attributes directly on a struct have a different effect on alignment than attributes on a typedef.

    struct A {
        int x;
    };
    struct __attribute__((aligned(2))) B {
        int x;
    };
    typedef __attribute__((aligned(2))) struct A Aligned_A;
    
    _Static_assert(_Alignof(struct A) == 4, "");
    _Static_assert(_Alignof(struct B) == 4, "");
    _Static_assert(_Alignof(Aligned_A) == 2, "");
    

    Both struct B and Aligned_A are represented as "attributed record", where the record has a single int field, but they have different alignment because the alignment on a typedef is allowed to be lower than the natural alignment.

    opened by ehaas 6
  • C23 feature support checklist

    Based on https://en.cppreference.com/w/c/23

    • [x] _Static_assert with no message
    • [x] [[nodiscard]]
    • [x] [[maybe_unused]]
    • [x] [[deprecated]]
    • [x] Attributes
    • [ ] IEEE 754 decimal floating-point types
    • [x] [[fallthrough]]
    • [ ] u8 character constants
    • [ ] Removal of function definitions without prototype
    • [x] [[nodiscard]] with message
    • [x] Unnamed parameters in function definitions
    • [ ] Labels before declarations and end of blocks
    • [x] Binary integer constants
    • [ ] __has_c_attribute in preprocessor conditionals
    • [x] Allow duplicate attributes
    • [ ] IEEE 754 interchange and extended types
    • [x] Digit separators
    • [ ] #elifdef and #elifndef
    • [ ] Type change of u8 string literals
    • [ ] [[maybe_unused]] for labels
    • [x] #warning
    • [x] Bit-precise integer types (_BitInt)
    • [x] [[noreturn]]
    • [ ] Suffixes for bit-precise integer constants
    • [x] __has_include in preprocessor conditionals
    • [ ] Removal of function declarations without prototype
    • [x] Empty initializers
    • [x] typeof ...
    • [ ] ... and typeof_unqual
    • [x] New spelling of keywords
    • [x] Predefined true and false
    • [ ] [[unsequenced]]
    • [ ] [[reproducible]]
    • [ ] Relax requirements for variadic parameter list
    • [ ] Type inference in object definitions (auto)
    • [x] constexpr objects
    • [ ] nullptr
    • [ ] #embed
    enhancement 
    opened by Vexu 1
  • Improve error for assigning to const

    int foo(void) {
        const int a;
        a = 1;
        return a;
    }
    

    Arocc output:

    ./a.c:40:7: error: expression is not assignable
        a = 1;
          ^
    1 error generated.
    

    Clang output:

    a.c:40:7: error: cannot assign to variable 'a' with const-qualified type 'const int'
        a = 1;
        ~ ^
    a.c:39:15: note: variable 'a' declared const here
        const int a;
        ~~~~~~~~~~^
    1 error generated.
    
    enhancement 
    opened by Vexu 0
  • Implicit int diagnostics

    Starting with C23, implicit-int type specifiers are no longer allowed:

    // test.c
    static x = 5;
    
    ➜  ~/local/llvm15-release/bin/clang -std=c2x test.c
    test.c:1:8: error: a type specifier is required for all declarations
    static x = 5;
    ~~~~~~ ^
    1 error generated.
    

    This will need a new error diagnostic in Diagnostics.zig and an update to Type.zig where we issue .missing_type_specifier.

    enhancement good first issue 
    opened by ehaas 0
Owner
Veikka Tuominen
Doing Zig stuff.
Compiler Design Project: Simulation of front-end phase of C Compiler involving switch-case construct.

CSPC41 Compiler Design Project Assignment Compiler Design Project: Simulation of front-end phase of C Compiler involving switch-case construct. Using

Adeep Hande 1 Dec 15, 2021
This is a compiler written from scratch in C

C Compiler This is a compiler written from scratch in C, with full C18 support as a goal. It can currently compile itself, and most simple program

null 27 Oct 8, 2022
A compiler written in C++ to convert .hoi code to hoi4 code

What is HC4? HC4 is a compiler that converts .hoic filenames to Hearts of Iron IV's .txt. Usage Use hc4 in the terminal (./hc4 if on Unix) and it will

SaCode 1 Jul 31, 2022
A Small C Compiler

8cc C Compiler Note: 8cc is no longer an active project. The successor is chibicc. 8cc is a compiler for the C programming language. It's intended to

Rui Ueyama 5.8k Nov 25, 2022
A C compiler for LLVM. Supports C++11/14/1z C11

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies

LLVM 16.9k Nov 26, 2022
GNU Prolog is a native Prolog compiler

GNU Prolog is a native Prolog compiler with constraint solving over finite domains (FD)

Daniel Diaz 65 Nov 21, 2022
Take your first step in writing a compiler.

first-step Take your first step in writing a compiler. Building from Source Before building first-step, please make sure you have installed the follow

PKU Compiler Course 28 Aug 20, 2022
NCC is an ANSI/ISO-compliant optimizing C compiler.

The compiler is retargetable by design, but, at present, it only produces binaries for Linux/x86_64. As the compiler ABI differs somewhat from the System V ABI used by Linux, its code cannot be linked against Linux system libraries. It does, however, provide its own (incomplete) standard ANSI/Posix C library.

Charles E. Youse 0 Apr 1, 2022
nanoc is a tiny subset of C and a tiny compiler that targets 32-bit x86 machines.

nanoc is a tiny subset of C and a tiny compiler that targets 32-bit x86 machines. Tiny? The only types are: int (32-bit signed integer) char (8-

Ajay Tatachar 19 Nov 28, 2022
Smaller C is a simple and small single-pass C compiler

Smaller C is a simple and small single-pass C compiler, currently supporting most of the C language common between C89/ANSI C and C99 (minus some C89 and plus some C99 features).

Alexey Frunze 1.2k Nov 22, 2022
Microvm is a virtual machine and compiler

The aim of this project is to create a stack-based language and virtual machine for microcontrollers. A mix of approaches is used. Separate memory is used for program and variable space (Harvard architecture). An interpreter, virtual machine and compiler are available. A demonstration of the interpreter in action is presented below.

null 10 Aug 14, 2022
yadcc - Yet Another Distributed C++ Compiler

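Yet Another Distributed C++ Compiler. yadcc is a distributed compilation system developed in-house by Tencent Ads to support Tencent Ads' day-to-day development and pipelines. Compared with existing solutions of the same kind, we have optimized performance, reliability, and ease of use for real industrial production environments.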

Tencent 271 Nov 23, 2022
Aheui JIT compiler for PC and web

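아희짓 (Aheui JIT) Overview: 아희짓 is a JIT (Just in Time) compiler for the Aheui language. It implements the JIT from scratch, with no dependency on external libraries other than an assembler and a utility library. Supported environments: 64-bit Windows, Mac, Linux (x86 architecture), WebAssem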

Sunho Kim 28 Sep 23, 2022
C implementation of the Tiny BASIC compiler found in an article by Dr. Austin Henley

Teeny Tiny Basic A C implementation of the Tiny BASIC compiler found in this article and this github repo by Dr. Austin Henley. I did pretty well in A

Gavin Morris 7 Oct 4, 2022
bcc is an interactive compiler of a language called b.

bcc is an interactive compiler of a language called b.

kparc 18 Nov 7, 2022
mrcceppc is a reimplementation project for the Metrowerks mwcceppc compiler.

Compiler | mrcceppc mrcceppc is a reimplementation project for the Metrowerks mwcceppc compiler. Compiling Run generate_{version}.bat for which versio

null 9 Nov 21, 2022
A C header that allows users to compile brainfuck programs within a C compiler.

brainfuck.h A C header that allows users to compile brainfuck programs within a C compiler. You can insert the header into the top of your brainfuck so

null 2 Apr 18, 2022
Gilbraltar is a version of the OCaml compiler to be able to build a MirageOS for RaspberryPi 4.

Gilbraltar is a version of the OCaml compiler to be able to build a MirageOS for RaspberryPi 4. It's a work in progress repository to provide a dune's toolchain (as ocaml-freestanding) specialized for Raspberry Pi 4.

Calascibetta Romain 49 Oct 30, 2022
NaiveCC: a compiler frontend for a subset of C

NaiveCC: a compiler frontend for a subset of C This is a toy compiler frontend for a subset of the C programming language based on the LR(1) parsing t

Yuxiang Wei 1 Nov 15, 2021