Multi-format archive and compression library

Overview

Welcome to libarchive!

The libarchive project develops a portable, efficient C library that can read and write streaming archives in a variety of formats. It also includes implementations of the common tar, cpio, and zcat command-line tools that use the libarchive library.

Contents of the Distribution

This distribution bundle includes the following major components:

  • libarchive: a library for reading and writing streaming archives
  • tar: the 'bsdtar' program is a full-featured 'tar' implementation built on libarchive
  • cpio: the 'bsdcpio' program is a different interface to essentially the same functionality
  • cat: the 'bsdcat' program is a simple replacement tool for zcat, bzcat, xzcat, and such
  • examples: Some small example programs that you may find useful.
  • examples/minitar: a compact sample demonstrating use of libarchive.
  • contrib: Various items sent to me by third parties; please contact the authors with any questions.

The top-level directory contains the following information files:

  • NEWS - highlights of recent changes
  • COPYING - what you can do with this
  • INSTALL - installation instructions
  • README - this file
  • CMakeLists.txt - input for "cmake" build tool, see INSTALL
  • configure - configuration script, see INSTALL for details. If your copy of the source lacks a configure script, you can try to construct it by running the script in build/autogen.sh (or use cmake).

The following files in the top-level directory are used by the 'configure' script:

  • Makefile.am, aclocal.m4, configure.ac - used to build this distribution, only needed by maintainers
  • Makefile.in, config.h.in - templates used by configure script

Documentation

In addition to the informational articles and documentation in the online libarchive Wiki, the distribution also includes a number of manual pages:

  • bsdtar.1 explains the use of the bsdtar program
  • bsdcpio.1 explains the use of the bsdcpio program
  • bsdcat.1 explains the use of the bsdcat program
  • libarchive.3 gives an overview of the library as a whole
  • archive_read.3, archive_write.3, archive_write_disk.3, and archive_read_disk.3 provide detailed calling sequences for the read and write APIs
  • archive_entry.3 details the "struct archive_entry" utility class
  • archive_internals.3 provides some insight into libarchive's internal structure and operation.
  • libarchive-formats.5 documents the file formats supported by the library
  • cpio.5, mtree.5, and tar.5 provide detailed information about these popular archive formats, including hard-to-find details about modern cpio and tar variants.

The manual pages above are provided in the 'doc' directory in a number of different formats.

You should also read the copious comments in archive.h and the source code for the sample programs for more details. Please let us know about any errors or omissions you find.

Supported Formats

Currently, the library automatically detects and reads the following formats:

  • Old V7 tar archives
  • POSIX ustar
  • GNU tar format (including GNU long filenames, long link names, and sparse files)
  • Solaris 9 extended tar format (including ACLs)
  • POSIX pax interchange format
  • POSIX octet-oriented cpio
  • SVR4 ASCII cpio
  • Binary cpio (big-endian or little-endian)
  • ISO9660 CD-ROM images (with optional Rockridge or Joliet extensions)
  • ZIP archives (with uncompressed or "deflate" compressed entries, including support for encrypted Zip archives)
  • ZIPX archives (with support for bzip2, ppmd8, lzma and xz compressed entries)
  • GNU and BSD 'ar' archives
  • 'mtree' format
  • 7-Zip archives
  • Microsoft CAB format
  • LHA and LZH archives
  • RAR and RAR 5.0 archives (with some limitations due to RAR's proprietary status)
  • XAR archives

The library also detects and handles any of the following before evaluating the archive:

  • uuencoded files
  • files with RPM wrapper
  • gzip compression
  • bzip2 compression
  • compress/LZW compression
  • lzma, lzip, and xz compression
  • lz4 compression
  • lzop compression
  • zstandard compression

The library can create archives in any of the following formats:

  • POSIX ustar
  • POSIX pax interchange format
  • "restricted" pax format, which will create ustar archives except for entries that require pax extensions (for long filenames, ACLs, etc).
  • Old GNU tar format
  • Old V7 tar format
  • POSIX octet-oriented cpio
  • SVR4 "newc" cpio
  • shar archives
  • ZIP archives (with uncompressed or "deflate" compressed entries)
  • GNU and BSD 'ar' archives
  • 'mtree' format
  • ISO9660 format
  • 7-Zip archives
  • XAR archives

When creating archives, the result can be filtered with any of the following:

  • uuencode
  • gzip compression
  • bzip2 compression
  • compress/LZW compression
  • lzma, lzip, and xz compression
  • lz4 compression
  • lzop compression
  • zstandard compression

Notes about the Library Design

The following notes address many of the most common questions we are asked about libarchive:

  • This is a heavily stream-oriented system. That means that it is optimized to read or write the archive in a single pass from beginning to end. For example, this allows libarchive to process archives too large to store on disk by processing them on-the-fly as they are read from or written to a network or tape drive. This also makes libarchive useful for tools that need to produce archives on-the-fly (such as webservers that provide archived contents of a user's account).

  • In-place modification and random access to the contents of an archive are not directly supported. For some formats, this is not an issue: For example, tar.gz archives are not designed for random access. In some other cases, libarchive can re-open an archive and scan it from the beginning quickly enough to provide the needed abilities even without true random access. Of course, some applications do require true random access; those applications should consider alternatives to libarchive.

  • The library is designed to be extended with new compression and archive formats. The only requirement is that the format be readable or writable as a stream and that each archive entry be independent. There are articles on the libarchive Wiki explaining how to extend libarchive.

  • On read, compression and format are always detected automatically.

  • The same API is used for all formats; it should be very easy for software using libarchive to transparently handle any of libarchive's archiving formats.

  • Libarchive's automatic support for decompression can be used without archiving by explicitly selecting the "raw" and "empty" formats.

  • I've attempted to minimize static link pollution. If you don't explicitly invoke a particular feature (such as support for a particular compression or format), it won't get pulled into statically-linked programs. In particular, if you don't explicitly enable a particular compression or decompression support, you won't need to link against the corresponding compression or decompression libraries. This also reduces the size of statically-linked binaries in environments where that matters.

  • The library is generally thread safe depending on the platform: it does not define any global variables of its own. However, some platforms do not provide fully thread-safe versions of key C library functions. On those platforms, libarchive will use the non-thread-safe functions. Patches to improve this are of great interest to us.

  • In particular, libarchive's modules to read or write a directory tree do use chdir() to optimize the directory traversals. This can cause problems for programs that expect to do disk access from multiple threads. Of course, those modules are completely optional and you can use the rest of libarchive without them.

  • The library is not thread aware, however. It does no locking or thread management of any kind. If you create a libarchive object and need to access it from multiple threads, you will need to provide your own locking.

  • On read, the library accepts whatever blocks you hand it. Your read callback is free to pass the library a byte at a time or mmap the entire archive and give it to the library at once. On write, the library always produces correctly-blocked output.

  • The object-style approach allows you to have multiple archive streams open at once. bsdtar uses this in its "@archive" extension.

  • The archive itself is read/written using callback functions. You can read an archive directly from an in-memory buffer or write it to a socket, if you wish. There are some utility functions to provide easy-to-use "open file," etc, capabilities.

  • The read/write APIs are designed to allow individual entries to be read or written to any data source: You can create a block of data in memory and add it to a tar archive without first writing a temporary file. You can also read an entry from an archive and write the data directly to a socket. If you want to read/write entries to disk, there are convenience functions to make this especially easy.

  • Note: The "pax interchange format" is a POSIX standard extended tar format that should be used when the older ustar format is not appropriate. It has many advantages over other tar formats (including the legacy GNU tar format) and is widely supported by current tar implementations.

Issues
  • The libarchive library has a READ memory access vulnerability

    Hello. While using libFuzzer to write a harness that calls the archive_read_data function, I found a READ memory access vulnerability (see the picture). The lzma_decode function crashed while decoding my test case. [image]

    opened by icycityone 47
  • Support for RAR

    Original issue 40 created by Google Code user ondra.pelech on 2009-10-05T16:05:49.000Z:

    Hi,
    
    it would be great if libarchive supported the RAR format; even if it would
    be passworded archive.
    
    This is just a wish/enhancement, not a bug; and I know it's probably not
    easy to implement and may take a long time. And thanks for this great
    project, I use it through GNOME's gvfs-mount.
    
    Type-Enhancement OpSys-All Milestone-Later Component-libarchive Priority-None 
    opened by kwrobot 45
  • libarchive's CMakeLists.txt finds major() when it shouldn't

    Original issue 125 created by Google Code user audiofanatic on 2011-01-04T08:44:44.000Z:

    When building the cmlibarchive project with LSB compilers (LSB = Linux Standards Base), the archive_entry.c file generates compiler errors because it relies on the following functions which are not provided by the LSB:
    
    major
    minor
    makedev
    
    On linux, these are generally implemented as macros which forward to functions like gnu_dev_makedev, etc., but they actually have very simple inlineable implementations. In fact, with certain GCC flags, these macros/functions *are* fully inlined. It would seem that this has been noted by the cmlibarchive developers too, since they have recently added the following to archive_entry.c
    
    #if !defined(HAVE_MAJOR) && !defined(major)
    /* Replacement for major/minor/makedev. */
    #define major(x) ((int)(0x00ff & ((x) >> 8)))
    #define minor(x) ((int)(0xffff00ff & (x)))
    #define makedev(maj,min) ((0xff00 & ((maj)<<8)) | (0xffff00ff & (min)))
    #endif
    
    The HAVE_MAJOR switch is the problem for LSB compilers. It is set earlier in archive_entry.c and merely depends on one of MAJOR_IN_MKDEV or MAJOR_IN_SYSMACROS being defined. Unfortunately, the top level CMakeLists.txt file does this detection without considering LSB compilers: the LSB provides neither mkdev.h nor sysmacros.h, yet the detection can still find the system versions of these headers, which is incorrect and dangerous when using LSB compilers. This is easy to fix with the attached patch to the top level CMakeLists.txt file.
    
    Note that this bug was originally reported to KitWare since it affects CMake itself. They have requested that this issue be fixed in cmlibarchive itself since they import cmlibarchive sources. For reference, see here:
    
    http://public.kitware.com/Bug/view.php?id=11648

    See attachment: CMakeLists.txt.patch

    Type-Defect Priority-Medium OpSys-All 
    opened by kwrobot 34
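To make the arithmetic of the replacement macros quoted in this issue concrete, here is a small sketch that restates them as functions. The sketch_ names are hypothetical and are used only to avoid clashing with any system-provided major/minor/makedev; the bit masks are copied from the macros above.

```c
#include <assert.h>

/* Major number: bits 8..15 of the device value. */
static unsigned int
sketch_major(unsigned long dev)
{
    return (unsigned int)(0x00ffUL & (dev >> 8));
}

/* Minor number: every bit covered by 0xffff00ff (bits 0..7 and 16..31). */
static unsigned int
sketch_minor(unsigned long dev)
{
    return (unsigned int)(0xffff00ffUL & dev);
}

/* Pack a major/minor pair into one device value with the same layout. */
static unsigned long
sketch_makedev(unsigned int maj, unsigned int min)
{
    return (0xff00UL & ((unsigned long)maj << 8)) | (0xffff00ffUL & min);
}
```

For example, sketch_makedev(8, 1) yields 0x0801, from which sketch_major recovers 8 and sketch_minor recovers 1. Note that with this layout a minor number cannot use bits 8..15, so a value such as 0x100 is masked to 0.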
  • Add support for extracting SCHILY.xattr extended attributes

    This patch adds support for extracting SCHILY.xattr extended attributes found in the PAX extended header. Since some of the attributes found there can be binary data, we extend the parser to support binary data.

    One example for an attribute with binary data is SCHILY.xattr.security.ima, which contains a digital signature.

    Signed-off-by: Stefan Berger [email protected]

    Type-Feature 
    opened by stefanberger 26
  • Unicode filenames inside RAR not working

    Original issue 247 created by Google Code user [email protected] on 2012-03-06T03:13:12.000Z:

    What steps will reproduce the problem?
    Attached RAR file contains one file called "テスト3.xlsx".
    Read filenames in attached file with archive_read_next_header and archive_entry_pathname_w.

    What is the expected output? What do you see instead?
    Expected filename == "テスト3.xlsx", but get "テスト3".

    What version are you using?
    3.0.3

    On what operating system?
    Win7-64

    How did you build?  (cmake, configure, or pre-packaged binary)
    Cmake

    What compiler or development environment (please include version)?
    VS2010

    Please provide any additional information below.
    64-bit build.
    

    See attachment: unicode-subfile.rar

    Type-Defect Priority-Medium OpSys-All 
    opened by kwrobot 24
  • [meta] Reporting potential security problems

    (copy of https://groups.google.com/d/topic/libarchive-discuss/zFtqsPhNcQ0/discussion)

    Our fuzzing effort (read more at our home page: https://github.com/google/oss-fuzz) has detected several crashes (two buffer overruns and one null dereference) in libarchive trunk using the fuzz target that we developed:

    https://github.com/google/oss-fuzz/blob/master/targets/libarchive/libarchive_fuzzer.cc

    These crashes are now filed in a security-protected monorail tracker (https://bugs.chromium.org/p/oss-fuzz/issues/list) and we'd like to find libarchive engineers to take a look at them.

    We'd like to CC developers on libarchive issues to give them access to stack traces and reproducer data. For that we'd only need an e-mail with associated gmail account. We can set up the process to auto-CC these e-mails when we find more issues.

    opened by mikea 23
  • build fails on SCO 5

    Original issue 129 created by Google Code user brianchina60221 on 2011-01-18T18:34:53.000Z:

    libarchive/archive.h defines some types appropriate to the platform, but those types aren't used elsewhere in the code. There are many direct uses of, e.g., uint32_t.
    
    The attached patch against trunk is big, but it just rearranges some stuff in archive.h, archive_entry.h, and archive_platform.h, and then seds the C99 types in libarchive/* to use the internal names.
    
    Thank you.
    

    See attachment: types.patch

    Type-Defect Priority-Medium OpSys-Other 
    opened by kwrobot 22
  • libarchive fails to process zip files with garbage padding at end

    Original issue 257 created by Google Code user alexkozlov0 on 2012-04-11T00:29:38.000Z:

    What steps will reproduce the problem?
    1. wget ftp://ftp.adobe.com/pub/adobe/magic/acrobatviewer/unix/1.x/viewer.bin
    2. bsdtar tvf viewer.bin

    What is the expected output?
    The zip file listing.

    What do you see instead?
    bsdtar: Invalid central directory signature
    bsdtar: Error exit delayed from previous errors.

    What version are you using?
    3.0.4, git

    On what operating system?
    FreeBSD 9.x

    How did you build?  (cmake, configure, or pre-packaged binary)
    configure

    What compiler or development environment (please include version)?
    gcc 4.2

    Please provide any additional information below.
    Now that libarchive has a seekable zip reader, it can fall back to the central directory at the end of the zip file instead of terminating with an error.
    
    Type-Defect Priority-Medium OpSys-All 
    opened by kwrobot 18
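The fallback suggested in this issue depends on locating the ZIP end-of-central-directory (EOCD) record even when junk follows it. The sketch below shows one way to do that on a buffer holding the tail of the file: scan backwards for the EOCD signature bytes "PK\005\006". The function name find_eocd is hypothetical, and this is not how libarchive's seekable zip reader is implemented; a real reader would also validate the fields that follow the signature.

```c
#include <assert.h>
#include <stddef.h>

/* Size of the fixed part of the end-of-central-directory record. */
#define EOCD_MIN_SIZE 22

/*
 * Return the offset of the last EOCD signature in buf, or -1 if none is
 * found. Trailing garbage after the record is tolerated because the scan
 * only requires that a full fixed-size record fits after the signature.
 */
static long
find_eocd(const unsigned char *buf, size_t len)
{
    size_t i;

    if (buf == NULL || len < EOCD_MIN_SIZE)
        return -1;
    /* Start at the last position where a whole record could still fit. */
    for (i = len - EOCD_MIN_SIZE + 1; i-- > 0; ) {
        if (buf[i] == 'P' && buf[i + 1] == 'K' &&
            buf[i + 2] == 5 && buf[i + 3] == 6)
            return (long)i;
    }
    return -1;
}
```

Once the record is found, the central directory offset stored inside it tells the reader where to seek, so the padding at the end of the file never needs to be parsed.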
  • Support reading from multiple data objects (multivolume reading)

    Original issue 166 created by Google Code user mcitadel on 2011-08-13T18:39:46.000Z:

    RAR archives can be split into multiple files (to provide multivolume support). Each file contains the RAR signature header, a main archive header, and the optional EOF header. The data blocks are split arbitrarily between each file in a multivolume set of files. Currently, libarchive doesn't handle reading from multiple files.
    
    This patch would introduce reading from multiple files by way of reading from multiple client objects. What would happen is that there is a chain of client objects, each with the callbacks and data necessary to open, read, skip, and close each object it's reading from (such as different files). Data is read from each of these clients as one large stream. I plan on implementing multivolume reading support of RAR files once general reading from multiple streams is accepted and committed to trunk.
    
    I introduced a new callback (switch callback) that can be used to switch from reading of one client to the next or previous client. I needed some way to determine whether a file should be closed because it's going to open the next file, or if it's being closed because libarchive is done reading from the file set. The latter would mean that I also need to free all memory allocated for all data objects of each client.
    
    I've introduced some test cases already for reading from these multiple clients. The test files are simply some reused test rar files that have been split using the 'split' program. There's also a test case for supplying custom callbacks and multiple client objects. This custom callbacks test case is essentially the way I see of using libarchive to read from multiple files with custom callbacks. I plan on using libarchive in this way in another application (XBMC).

    This patch also updates test_fuzz so it can read from multiple files. Currently, the multiple files used in test_fuzz would have the same result as "test_read_format_rar.rar" would. Once I have RARv3 multivolume reading support implemented, this would provide a better test for test_fuzz.
    
    
    Priority-Medium Type-Review 
    opened by kwrobot 18
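The idea in this issue of reading several data objects "as one large stream" can be illustrated independently of libarchive. The sketch below chains in-memory volumes and switches to the next one when the current one is exhausted. The volume and volume_chain types and the chain_read function are hypothetical stand-ins for the client objects and switch callback the patch describes; they are not the patch's actual API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One data object (for example, one file of a multivolume set). */
struct volume {
    const unsigned char *data;
    size_t len;
};

/* Read state across the whole chain of volumes. */
struct volume_chain {
    const struct volume *vols;
    size_t count;   /* number of volumes */
    size_t current; /* index of the volume being read */
    size_t pos;     /* read position inside the current volume */
};

/*
 * Copy up to `want` bytes into dst, moving to the next volume whenever
 * the current one runs out. Returns the number of bytes copied; 0 means
 * the end of the last volume has been reached.
 */
static size_t
chain_read(struct volume_chain *c, void *dst, size_t want)
{
    unsigned char *out = dst;
    size_t done = 0;

    while (done < want && c->current < c->count) {
        const struct volume *v = &c->vols[c->current];
        size_t avail = v->len - c->pos;
        size_t n;

        if (avail == 0) {
            /* The "switch" step: advance to the next volume. */
            c->current++;
            c->pos = 0;
            continue;
        }
        n = (want - done < avail) ? want - done : avail;
        memcpy(out + done, v->data + c->pos, n);
        c->pos += n;
        done += n;
    }
    return done;
}
```

A consumer that calls chain_read repeatedly never sees the volume boundaries, which is the property a format reader needs when entry data is split arbitrarily between files.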
  • Wrong locale defaults for windows

    Original issue 132 created by Google Code user repalov on 2011-01-31T18:08:58.000Z:

    What version are you using?
    trunk / revision 2953

    On what operating system?
    Windows 7

    How did you build?  (cmake, configure, or pre-packaged binary)
    cmake

    What compiler or development environment (please include version)?
    Visual Studio 2010

    I have two comments.

    1) Line 464 of archive_string.c (http://code.google.com/p/libarchive/source/browse/trunk/libarchive/archive_string.c#464) uses the ACP code page, but ZIP and TAR (and maybe other) archives created on Windows use the CP_OEMCP (OEM) character set for filenames. At least for Russian this is true: ACP is code page 1251, but the names in the archive are in code page 866.

    2) Using the _system_ default locale to convert mbstring<->wcstring is a mistake. If I have an archive with Russian filenames from FreeBSD, its filenames are in koi8-r, and if I cannot specify the charset for the archive, I cannot get the proper names.
    Another problem: if I build libarchive as a DLL, the DLL has its own locale and I cannot change it from the program at all.

    So I think a mechanism is needed to change the locale used for string conversion, either for the library (at minimum) or per archive (optimal).

    At this time, none of the archivers I tested (7zip, WinRar, bsdtar from libarchive) extracted Russian names correctly from a tar.bz2 archive created on FreeBSD 8.1.
    
    Type-Defect OpSys-All Priority-Critical Milestone-3.0 
    opened by kwrobot 18
  • Hide private symbols in libarchive.so

    Libarchive.so presently exports 281 symbols (over 50%, full list attached) which are not present in libarchive's headers and thus are not supposed to be used by clients.

    Removing these symbols would allow the compiler to optimize code more aggressively (.text reduced by 1%), speed up the dynamic linker on Linux, and prevent clients from inadvertently using internal APIs.

    I attached a simple patch that hides private symbols. It passes make check (I can do additional testing if needed). Would something like this be interesting for the project?

    0001-Hide-private-symbols.patch.txt private_syms.txt

    The issue was found using ShlibVisibilityChecker.

    opened by yugr 17
  • Missing archive_read_support_filter_by_code API and tests

    As spotted in https://github.com/libarchive/libarchive/pull/1751, the cmake build (like the Android one) does not include the libarchive/archive_read_support_filter_by_code.c file in the build. As such, the API is not exposed.

    To make it even worse, there are no tests or in-tree users for this API (unlike read_format_by_code).

    We should fix that up by:

    • adding the cmake and android bits
    • adding new test
    • optional: check through (and if needed fix) the rest of the API/files
    • optional: add some misc API tests, like the ones in mesa that I wrote :-P

    Alternatively, we could remove the API

    • audit for potential users
      • some/all(?) Linux distros are using the cmake build
      • what about BSDs?
      • check around for users - distros, github, general search?
    • check when the code was added - see with author for actual users
    • if seemingly safe and maintainers are happy - remove it
    opened by evelikov 1
  • how can i get file attribute when reading archive file content

    Dear all,

    I am trying to read an archive's contents with the code below. Is there any way to get file attributes (such as hidden)? For example, if a file in the archive has the hidden attribute, I want to be able to detect that.

    #include <archive.h>
    #include <archive_entry.h>
    #include <stdio.h>
    #include <stdlib.h>

    static void listContent(void)
    {
        struct archive *a;
        struct archive_entry *entry;
        int r;
    
        a = archive_read_new();
        archive_read_support_filter_all(a);
        archive_read_support_format_all(a);
        r = archive_read_open_filename(a, "1.zip", 10240); // Note 1
        if (r != ARCHIVE_OK)
            exit(1);
        while (archive_read_next_header(a, &entry) == ARCHIVE_OK) {
            printf("%s\n", archive_entry_pathname(entry));
            archive_read_data_skip(a); // Note 2
        }
        r = archive_read_free(a); // Note 3
        if (r != ARCHIVE_OK)
            exit(1);
    }
    
    opened by mohsenomidi 2
  • Incorrect passphrase for Zip archive

    Hello! I am trying to open a zip archive whose password is in Chinese, but I get an error when reading an entry.

    • Password: 夜亱线强爲為为
    • File: clean_one_layer_password_in_utf8.zip
    • OS: linux
    • libarchive: 3.6.1
    • LANG: en_US.utf8

    Simple code to reproduce:

    #include <archive.h>
    #include <archive_entry.h>
    #include <stdio.h>
    #include <sys/stat.h> /* S_ISREG */
    
    #define CHECK_AND_EXIT(result, msg)                                            \
      {                                                                            \
        if (result) {                                                              \
          printf(msg);                                                             \
          return 1;                                                                \
        }                                                                          \
      }
    
    int main(int argc, char **argv) {
      int r;
      char buff[8192];
      ssize_t len;
      FILE *out;
      struct archive *ina;
      struct archive_entry *entry;
      char *input, *passphrase;
    
      input = argv[1];
      passphrase = argv[2];
    
      CHECK_AND_EXIT((ina = archive_read_new()) == NULL,
                     "Cannot create archive reader");
      CHECK_AND_EXIT(archive_read_support_filter_all(ina) != ARCHIVE_OK,
                     "Cannot enable decompression");
      CHECK_AND_EXIT(archive_read_support_format_all(ina) != ARCHIVE_OK,
                     "Cannot enable read formats")
      CHECK_AND_EXIT(archive_read_add_passphrase(ina, passphrase) != ARCHIVE_OK,
                     "Cannot add passphrase");
      CHECK_AND_EXIT(archive_read_open_filename(ina, input, 10240) != ARCHIVE_OK,
                     "Cannot open archive");
      CHECK_AND_EXIT((out = fopen(argv[3], "wb")) == NULL,
                     "Cannot open output file");
    
      while ((r = archive_read_next_header(ina, &entry)) == ARCHIVE_OK) {
        printf("%s: ", archive_entry_pathname(entry));
        /* Skip anything that isn't a regular file. */
        if (!S_ISREG(archive_entry_mode(entry))) {
          printf("skipped\n");
          continue;
        }
        if (archive_entry_size(entry) > 0) {
          do {
    
            len = archive_read_data(ina, buff, sizeof(buff));
            if (len == 0) {
              printf("copied\n");
              break;
            }
            if (len < 0) {
              printf("Error reading input archive: retcode=%zi string=%s\n", len,
                     archive_error_string(ina));
              break;
            }
            /* Write the raw bytes; buff is data, not a printf format string. */
            fwrite(buff, 1, (size_t)len, out);
          } while (1);
        }
      }
    
      CHECK_AND_EXIT(r != ARCHIVE_EOF, "Error reading archive");
      /* Close the archives.  */
      CHECK_AND_EXIT(archive_read_free(ina) != ARCHIVE_OK,
                     "Error closing input archive");
      return 0;
    }
    
    opened by ikrivosheev 0
  • RAR5: unexpected EOF not reported as failure

    Since truncated files are invalid, archive_read_data_block() should return ARCHIVE_FATAL instead of ARCHIVE_EOF in such cases. This is what happens when truncated .7z, .zip, .tar or .tar.gz archives are given as input to the code below. But truncated RAR5 archives cause ARCHIVE_EOF to be returned, except if the compression method is store.

    Example program:

    #include <archive.h>
    #include <archive_entry.h>
    #include <stdio.h>
    
    int main(int argc, char **argv)
    {
        struct archive *a;
        struct archive_entry *e;
        size_t read_total;
    
        a = archive_read_new();
        archive_read_support_format_all(a);
        archive_read_support_filter_all(a);
        if (archive_read_open_filename(a, argv[1], BUFSIZ) != ARCHIVE_OK)
            return 1;
        if (archive_read_next_header(a, &e) != ARCHIVE_OK)
            return 1;
    
        read_total = 0;
        for (;;) {
            const void *buf;
            size_t len;
            la_int64_t offset; /* archive_read_data_block() expects la_int64_t * */
            int ret = archive_read_data_block(a, &buf, &len, &offset);
            if (ret == ARCHIVE_OK) {
                read_total = offset + len;
            } else if (ret == ARCHIVE_EOF) {
                break;
            } else {
                fprintf(stderr, "error reported (%d): %s\n", ret, archive_error_string(a));
                return 1;
            }
        }
    
        if (archive_entry_size_is_set(e)) {
            la_int64_t s = archive_entry_size(e);
            if (s < 0 || (size_t)s != read_total) {
                fprintf(stderr, "no error reported, but size is wrong: want %lld, have %zu\n", (long long)s, read_total);
                return 1;
            }
        }
    
        fprintf(stderr, "no error reported\n");
        return 0;
    }
    

    To reproduce (I used libarchive 1385cd9c5126d9b681b7396ad2f353779ad143ba plus #1745, so the offset is always initialized):

    $ cat lorem
    Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum
    
    $ rar a -m0 store.rar lorem     #from https://www.rarlab.com/download.htm
    $ rar a -m5 comp.rar lorem
    $ wc -c lorem *.rar
     445 lorem
     367 comp.rar
     516 store.rar
    1328 total
    
    $ dd if=store.rar bs=250 count=1 of=trunc-store.rar
    $ dd if=comp.rar bs=250 count=1 of=trunc-comp.rar
    
    $ ./a.out store.rar
    no error reported
    $ ./a.out comp.rar
    no error reported
    $ ./a.out trunc-store.rar
    error reported (-30): I/O error when unstoring file
    $ ./a.out trunc-comp.rar
    no error reported, but size is wrong: want 445, have 0     # `read_total` > 0 when the input is bigger
    
    $ bsdtar -x -f trunc-comp.rar
    # no error is displayed, the output size is correct but it contains only zeroes
    

    base64 store.rar:

    UmFyIRoHAQAzkrXlCgEFBgAFAQGAgACNZOsbIwIDC70DBL0DpIICbCGxpIAAAQVsb3JlbQoDE0xX
    wGLYp1MiTG9yZW0gaXBzdW0gZG9sb3Igc2l0IGFtZXQsIGNvbnNlY3RldHVyIGFkaXBpc2Npbmcg
    ZWxpdCwgc2VkIGRvIGVpdXNtb2QgdGVtcG9yIGluY2lkaWR1bnQgdXQgbGFib3JlIGV0IGRvbG9y
    ZSBtYWduYSBhbGlxdWEuIFV0IGVuaW0gYWQgbWluaW0gdmVuaWFtLCBxdWlzIG5vc3RydWQgZXhl
    cmNpdGF0aW9uIHVsbGFtY28gbGFib3JpcyBuaXNpIHV0IGFsaXF1aXAgZXggZWEgY29tbW9kbyBj
    b25zZXF1YXQuIER1aXMgYXV0ZSBpcnVyZSBkb2xvciBpbiByZXByZWhlbmRlcml0IGluIHZvbHVw
    dGF0ZSB2ZWxpdCBlc3NlIGNpbGx1bSBkb2xvcmUgZXUgZnVnaWF0IG51bGxhIHBhcmlhdHVyLiBF
    eGNlcHRldXIgc2ludCBvY2NhZWNhdCBjdXBpZGF0YXQgbm9uIHByb2lkZW50LCBzdW50IGluIGN1
    bHBhIHF1aSBvZmZpY2lhIGRlc2VydW50IG1vbGxpdCBhbmltIGlkIGVzdCBsYWJvcnVtCh13VlED
    BQQA
    

    base64 comp.rar:

    UmFyIRoHAQAzkrXlCgEFBgAFAQGAgACPcI+2IwIDC6gCBL0DpIICbCGxpIAFAQVsb3JlbQoDE0xX
    wGLYp1MizLMkAUZjMjQ/dU97wo4BcvCrfO+t4AUCSvMgEsA3N7/Nz9GzeOkyMiTSTc5Gm02MTcjP
    vuP/deJTc29sTdGux1dZrufOV/d0K0tKFyrB9XxCWtBqaulsyEAYqwdGxadSoMz3V/A4XCo4oIcH
    Nn/qXQm/S2UzfrR4mCD/w/4d+X5UaaXABs0dl2HsXAUa3Ura/GAKXeXAsXfdiG8thTMjXSrJg8bL
    LyrkmR+8BgETUTXrAJruNnzlXeFECwsKC6yweXFe5z+H7EaVn1fDsA43TbJQ1ZIT1hCImH761MsR
    h2mNSVglMxDt38LoLwbn13gJ4uzWCNGg1hB8jhRAV2ie82ywkdqnUd1slakngH2EjOdFvpow35WD
    j84yKYIZ1+XWoFyob1trl7Add1ZRAwUEAA==
    

    gdb showed that do_unpack() returns ARCHIVE_FATAL at https://github.com/libarchive/libarchive/blob/1385cd9c5126d9b681b7396ad2f353779ad143ba/libarchive/archive_read_support_format_rar5.c#L3903 if the compression method is store, otherwise returns ARCHIVE_EOF at https://github.com/libarchive/libarchive/blob/1385cd9c5126d9b681b7396ad2f353779ad143ba/libarchive/archive_read_support_format_rar5.c#L3914. So it looks like do_unstore_file() correctly handles unexpected EOF, but uncompress_file() does not.

    I don't know what happens if the file is not truncated but one of the entries is, or if it is truncated but no entry is, because I don't know how to generate such files.

    @antekone

    opened by r-ricci 1
  • Crash

    Crash

    The crash occurs at this line in archive_write_set_format_zip.c:

    if (!is_all_ascii(archive_entry_pathname(zip->entry))) {

    It is caused by a null pointer being passed to is_all_ascii(). is_all_ascii() should check for a null pointer.

    I am still not sure why this is happening. It only happens when adding a file with Unicode characters in its filename to a zip file.

    opened by Snuff-Daddy 0
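
The crash report above suggests that is_all_ascii() should tolerate a null pointer. A minimal sketch of such a guard follows. The name `is_all_ascii_safe` is hypothetical and this is not the libarchive implementation. Treating a missing pathname as "contains no non-ASCII bytes" is an assumption made here for illustration; the report does not establish why the pathname is NULL, and a caller may prefer to reject the entry with an error instead.

```c
#include <stddef.h>

/*
 * Hypothetical null-safe ASCII check (illustrative only, not libarchive code).
 * Returns 1 if every byte of the string is in the 7-bit ASCII range, or if
 * the pointer is NULL (assumption: a missing name has no non-ASCII bytes).
 * Returns 0 as soon as a byte above 127 is found.
 */
static int
is_all_ascii_safe(const char *p)
{
	const unsigned char *up = (const unsigned char *)p;

	if (up == NULL)
		return 1;	/* guard: never dereference a null pathname */

	while (*up != '\0') {
		if (*up > 127)
			return 0;	/* byte outside the ASCII range */
		up++;
	}
	return 1;
}
```

With a guard like this, a call of the form `is_all_ascii_safe(archive_entry_pathname(entry))` would no longer dereference a null pointer, although the underlying reason for the missing pathname would still need to be investigated.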
Releases(v3.6.1)
Brotli compression format

SECURITY NOTE Please consider updating brotli to version 1.0.9 (latest). Version 1.0.9 contains a fix to "integer overflow" problem. This happens when

Google 11.3k Aug 6, 2022
Compression abstraction library and utilities

Squash - Compression Abstraction Library

null 364 Jul 31, 2022
Superfast compression library

DENSITY Superfast compression library DENSITY is a free C99, open-source, BSD licensed compression library. It is focused on high-speed compression, a

Centaurean 979 Jul 27, 2022
data compression library for embedded/real-time systems

heatshrink A data compression/decompression library for embedded/real-time systems. Key Features: Low memory usage (as low as 50 bytes) It is useful f

Atomic Object 1.1k Jul 30, 2022
Small strings compression library

SMAZ - compression for very small strings. Smaz is a simple compression library suitable for compressing ver

Salvatore Sanfilippo 999 Aug 8, 2022
Heavily optimized zlib compression algorithm

Optimized version of longest_match for zlib Summary Fast zlib longest_match function. Produces slightly smaller compressed files for significantly fas

Konstantin Nosov 117 May 10, 2022
Fastest Integer Compression

TurboPFor: Fastest Integer Compression TurboPFor: The new synonym for "integer compression" (2019.11) ALL functions now available for 64 bits ARMv8

powturbo 623 Jul 31, 2022
is a c++20 compile and runtime Struct Reflections header only library.

is a c++20 compile and runtime Struct Reflections header only library. It allows you to iterate over aggregate type's member variables.

RedSkittleFox 4 Apr 18, 2022
A simple C library for compressing lists of integers using binary packing

The SIMDComp library A simple C library for compressing lists of integers using binary packing and SIMD instructions. The assumption is either that yo

Daniel Lemire 391 Aug 7, 2022
A portable, simple zip library written in C

A portable (OSX/Linux/Windows), simple zip library written in C This is done by hacking awesome miniz library and layering functions on top of the min

Kuba Podgórski 943 Jul 31, 2022
Compile and execute C "scripts" in one go!

c "There isn't much that's special about C. That's one of the reasons why it's fast." I love C for its raw speed (although it does have its drawbacks)

Ryan Jacobs 2k Jul 29, 2022
distributed builds for C, C++ and Objective C

distcc -- a free distributed C/C++ compiler system by Martin Pool Current Documents: https://distcc.github.io/ Formerly http://distcc.org/ "pump" func

distcc 1.7k Aug 4, 2022
Roaring bitmaps in C (and C++)

CRoaring Portable Roaring bitmaps in C (and C++) with full support for your favorite compiler (GNU GCC, LLVM's clang, Visual Studio). Included in the

Roaring bitmaps: A better compressed bitset 1k Aug 5, 2022
New generation entropy codecs : Finite State Entropy and Huff0

New Generation Entropy coders This library proposes two high speed entropy coders : Huff0, a Huffman codec designed for modern CPU, featuring OoO (Out

Yann Collet 1.1k Jul 24, 2022
Easing the task of comparing code generated by cc65, vbcc, and 6502-gcc

6502 C compilers benchmark Easing the way to compare code generated by cc65, 6502-gcc, vbcc, and KickC. This repository contains scripts to: Compile t

Sylvain Gadrat 16 Dec 15, 2021
Secure ECC-based DID intersection in Go, Java and C.

SecureUnionID Secure ECC-based DID intersection. ABSTRACT This project is used to protect device ID using Elliptic Curve Cryptography algorithm. The d

Volcengine 19 Jul 22, 2022
nanoc is a tiny subset of C and a tiny compiler that targets 32-bit x86 machines.

nanoc is a tiny subset of C and a tiny compiler that targets 32-bit x86 machines. Tiny? The only types are: int (32-bit signed integer) char (8-

Ajay Tatachar 16 Feb 13, 2022
Smaller C is a simple and small single-pass C compiler

Smaller C is a simple and small single-pass C compiler, currently supporting most of the C language common between C89/ANSI C and C99 (minus some C89 and plus some C99 features).

Alexey Frunze 1.1k Aug 9, 2022
Microvm is a virtual machine and compiler

The aim of this project is to create a stack-based language and virtual machine for microcontrollers. A mix of approaches is used. Separate memory is used for program and variable space (Harvard architecture). An interpreter, virtual machine and compiler are available. A demonstration of the interpreter in action is presented below.

null 11 Jun 21, 2022