Using PLT trampolines to provide a BLAS and LAPACK demuxing library.

Overview

libblastrampoline

All problems in computer science can be solved by another level of indirection

Basic usage

Build libblastrampoline.so, then link your BLAS-using library against it instead of libblas.so. When libblastrampoline is loaded, it inspects the LBT_DEFAULT_LIBS environment variable and attempts to forward the BLAS calls it receives to that library. The variable may hold a semicolon-separated list of libraries if your backing implementation is split across several of them, as with separate BLAS and LAPACK libraries. At any time, you may call lbt_forward(libname, clear, verbose) to redirect forwarding to a new BLAS library. If clear is 1, all previous mappings are cleared before the new ones are set; if it is 0, symbols that do not exist within libname keep their existing mappings. This is used to implement layering of libraries, such as between a split BLAS and LAPACK library:

lbt_forward("libblas.so", 1, 0);
lbt_forward("liblapack.so", 0, 0);
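The effect of the clear argument can be modelled with a small table. The following is a toy sketch, not libblastrampoline's implementation: the three-symbol list, the toy_lib and toy_table types, and the choice to return the number of forwarded symbols are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define N_SYMBOLS 3

/* Hypothetical exported symbols; the real list is far longer. */
static const char *const SYMBOLS[N_SYMBOLS] = {"dgemm_", "ddot_", "dgesv_"};

/* A toy "library": a name plus flags saying which symbols it defines. */
typedef struct {
    const char *name;
    int provides[N_SYMBOLS];
} toy_lib;

/* Forwarding table: which library (if any) each symbol points to. */
typedef struct {
    const char *target[N_SYMBOLS];
} toy_table;

/* Mimics the documented `clear` semantics.
 * Returns the number of symbols forwarded (an assumption of this toy). */
static int toy_forward(toy_table *t, const toy_lib *lib, int clear)
{
    assert(t != NULL && lib != NULL);
    int forwarded = 0;
    for (size_t i = 0; i < N_SYMBOLS; i++) {
        if (clear)
            t->target[i] = NULL;        /* drop every previous mapping */
        if (lib->provides[i]) {
            t->target[i] = lib->name;   /* new library takes this symbol */
            forwarded++;
        }
        /* clear == 0 and symbol missing: the old mapping is left alone */
    }
    return forwarded;
}

/* Current provider of `symbol`, or NULL if nothing is mapped. */
static const char *toy_lookup(const toy_table *t, const char *symbol)
{
    for (size_t i = 0; i < N_SYMBOLS; i++)
        if (strcmp(SYMBOLS[i], symbol) == 0)
            return t->target[i];
    return NULL;
}
```

Forwarding a BLAS-only toy library with clear set to 1 and then a LAPACK-only one with clear set to 0 leaves dgemm_ pointing at the first and dgesv_ at the second, which is the layering described above.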

ABI standard

libblastrampoline exports a consistent ABI for applications to link against. In particular, we export both a 32-bit (LP64) and a 64-bit (ILP64) interface, allowing applications that use one or the other (or both!) to link against the library. Applications that wish to use the 64-bit interface must append _64 to their function calls, e.g. instead of calling dgemm() they must call dgemm_64(). The BLAS/LAPACK symbol list we re-export comes from the gensymbol script contained within OpenBLAS. See ext/gensymbol for more. We also have an experimental Clang.jl-based symbol extractor that extracts only those symbols that are defined within the headers shipped with OpenBLAS. However, hundreds of symbols that gensymbol knows about (and that are indeed exported from the shared library libopenblas.so) are not included in the public C headers, so we take the conservative approach and export the gensymbol-sourced symbols.
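Client code that wants to switch between the two interfaces at build time can hide the suffix behind a macro. This is a minimal sketch under stated assumptions: USE_ILP64 is a hypothetical build flag, and BLAS_FUNC, blas_int and the helper functions are names invented here, not part of libblastrampoline. It follows the dgemm()/dgemm_64() spelling used in the text.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Name mangling for the two interfaces, as described in the text. */
#define BLAS_LP64(name)  name
#define BLAS_ILP64(name) name##_64

#ifdef USE_ILP64                 /* hypothetical flag set by the client build */
#define BLAS_FUNC(name) BLAS_ILP64(name)
typedef int64_t blas_int;        /* integer width must match the interface */
#else
#define BLAS_FUNC(name) BLAS_LP64(name)
typedef int32_t blas_int;
#endif

/* Two-step stringification so the argument is macro-expanded first. */
#define TOY_STR_(x) #x
#define TOY_STR(x)  TOY_STR_(x)

/* Name of the dgemm entry point this translation unit would call. */
static const char *selected_dgemm_symbol(void)
{
    return TOY_STR(BLAS_FUNC(dgemm));
}

/* Returns 1 when the build selected the `_64` names, 0 otherwise. */
static int uses_ilp64_names(void)
{
    int ilp64 = strcmp(selected_dgemm_symbol(), "dgemm_64") == 0;
    /* The symbol suffix and the integer width have to agree. */
    assert(sizeof(blas_int) == (ilp64 ? 8u : 4u));
    return ilp64;
}
```

A call site would then be written as BLAS_FUNC(dgemm)(...) with blas_int arguments, so the same source compiles against either interface.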

Because we export both the 32-bit (LP64) and 64-bit (ILP64) interfaces, if clients need header files defining the various BLAS/LAPACK functions, they must include headers defining the appropriate ABI. We provide headers broken down by interface (LP64 vs. ILP64) as well as target (e.g. x86_64-linux-gnu), so to properly compile your code with headers provided by libblastrampoline you must add the appropriate -I${prefix}/include/${interface}/${target} flags.

When libblastrampoline loads a BLAS/LAPACK library, it will inspect it to determine whether it is a 32-bit (LP64) or 64-bit (ILP64) library, and depending on the result, it will forward from its own 32-bit/64-bit names to the names declared in the library it is forwarding to. This allows automatic usage of multiple libraries with different interfaces but the same symbol names.
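One way such an inspection can work is to call a routine with a length argument that means different things at different integer widths. The sketch below illustrates that general idea with toy routines; it is not taken from libblastrampoline's source, every name in it is hypothetical, and it assumes a little-endian machine.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy index-of-max routines; each reads `n` at its own integer width. */
typedef int64_t (*toy_isamax_fn)(const void *n, const float *x);

static int64_t toy_isamax_core(int64_t n, const float *x)
{
    if (n <= 0)
        return 0;                      /* BLAS convention: 0 when n <= 0 */
    int64_t best = 0;
    for (int64_t i = 1; i < n; i++) {
        float a = x[i] < 0 ? -x[i] : x[i];
        float b = x[best] < 0 ? -x[best] : x[best];
        if (a > b)
            best = i;
    }
    return best + 1;                   /* Fortran-style 1-based index */
}

static int64_t toy_isamax_lp64(const void *n, const float *x)
{
    int32_t n32;
    memcpy(&n32, n, sizeof n32);       /* an LP64 library reads 4 bytes */
    return toy_isamax_core(n32, x);
}

static int64_t toy_isamax_ilp64(const void *n, const float *x)
{
    int64_t n64;
    memcpy(&n64, n, sizeof n64);       /* an ILP64 library reads 8 bytes */
    return toy_isamax_core(n64, x);
}

/* Returns 32 or 64. Assumes little-endian byte order. */
static int toy_detect_interface(toy_isamax_fn isamax)
{
    assert(isamax != NULL);
    const float x[3] = {1.0f, -3.0f, 2.0f};
    /* Bit pattern 0xFFFFFFFF00000003: the low 32 bits read as 3,
     * while the full 64-bit value is negative. */
    const int64_t n = INT64_C(-4294967293);
    /* A 64-bit reader sees n < 0 and returns 0; a 32-bit reader sees 3. */
    return isamax(&n, x) == 0 ? 64 : 32;
}
```

The probe never needs to know the library's name or suffix, only how the routine reacts to the crafted length.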

libblastrampoline is also cognizant of the f2c calling convention incompatibilities introduced by some libraries such as Apple's Accelerate. It will automatically probe the library to determine its calling convention and employ a return-value conversion routine to fix the float/double return value differences. This support is only available on the x86_64 and i686 architectures; to our knowledge, however, these are the only systems on which the incompatibility exists.
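The float/double difference can be illustrated without touching a real ABI by treating the return register as eight raw bytes. This is a toy model only: the toy_reg type stands in for a floating-point return register, the two toy_sdot_* routines are hypothetical, and the probe shows the general idea rather than libblastrampoline's actual detection code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Eight raw bytes standing in for a floating-point return register. */
typedef uint64_t toy_reg;
typedef toy_reg (*toy_sdot_fn)(int n, const float *x, const float *y);

static float toy_dot(int n, const float *x, const float *y)
{
    float acc = 0.0f;
    for (int i = 0; i < n; i++)
        acc += x[i] * y[i];
    return acc;
}

/* gfortran style: a float occupying the low 4 bytes. */
static toy_reg toy_sdot_gfortran(int n, const float *x, const float *y)
{
    float r = toy_dot(n, x, y);
    uint32_t bits;
    memcpy(&bits, &r, sizeof bits);
    return (toy_reg)bits;
}

/* f2c style: the result is promoted to double and fills all 8 bytes. */
static toy_reg toy_sdot_f2c(int n, const float *x, const float *y)
{
    double r = (double)toy_dot(n, x, y);
    toy_reg reg;
    memcpy(&reg, &r, sizeof reg);
    return reg;
}

static float toy_reg_as_float(toy_reg reg)
{
    uint32_t bits = (uint32_t)reg;     /* low 32 bits only */
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

static double toy_reg_as_double(toy_reg reg)
{
    double d;
    memcpy(&d, &reg, sizeof d);
    return d;
}

/* Probe with a known answer: returns 1 for f2c style, 0 otherwise. */
static int toy_is_f2c(toy_sdot_fn sdot)
{
    assert(sdot != NULL);
    const float x[2] = {1.0f, 2.0f}, y[2] = {3.0f, 4.0f};  /* dot = 11 */
    toy_reg reg = sdot(2, x, y);
    if (toy_reg_as_float(reg) == 11.0f)
        return 0;                      /* already a plain float */
    return toy_reg_as_double(reg) == 11.0;
}

/* Return-value conversion: always hand the caller a float. */
static float toy_sdot_adapted(toy_sdot_fn sdot, int is_f2c,
                              int n, const float *x, const float *y)
{
    toy_reg reg = sdot(n, x, y);
    return is_f2c ? (float)toy_reg_as_double(reg) : toy_reg_as_float(reg);
}
```

Reading an f2c-style result as a float yields garbage (here 0.0, because the low half of the double 11.0 is all zero bits), which is why a conversion step is needed before the value reaches the caller.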

libblastrampoline-specific API

libblastrampoline exports a simple configuration API including lbt_forward(), lbt_get_config(), lbt_{set,get}_num_threads(), and more. See the public header file for the most up-to-date documentation on the libblastrampoline API.

Note: all lbt_* functions should be considered thread-unsafe. Do not attempt to load two BLAS libraries on two different threads at the same time.

Limitations

This library has the ability to work with a mixture of LP64 and ILP64 BLAS libraries, but is slightly hampered on certain platforms that do not have the capability to perform RTLD_DEEPBIND-style linking. As of the time of this writing, this includes FreeBSD and musl Linux. The impact of this is that you are unable to load an ILP64 BLAS that exports the typical LP64 names (e.g. dgemm_) at the same time as an actual LP64 BLAS (with any naming scheme). This is because without RTLD_DEEPBIND-style linking semantics, when the ILP64 BLAS tries to call one of its own functions, it will call the function exported by libblastrampoline itself, which will result in incorrect values and segfaults. To address this, libblastrampoline will detect if you attempt to do this and refuse to load a library that would cause this kind of confusion. You can always tell if your system is limited in this fashion by calling lbt_get_config() and checking the build_flags member for the LBT_BUILDFLAGS_DEEPBINDLESS flag.
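The check described above amounts to a single bit test followed by a refusal rule. In the sketch below the names lbt_get_config(), build_flags and LBT_BUILDFLAGS_DEEPBINDLESS come from the text, but the toy_config struct, the flag's numeric value and the toy_can_mix helper are stand-ins assumed for illustration; the real definitions live in the public header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for LBT_BUILDFLAGS_DEEPBINDLESS; the value is an assumption. */
#define TOY_BUILDFLAGS_DEEPBINDLESS 0x01u

/* Stand-in for the structure returned by lbt_get_config(). */
typedef struct {
    uint32_t build_flags;
} toy_config;

/* Returns 1 if this build lacks RTLD_DEEPBIND-style linking. */
static int toy_is_deepbindless(const toy_config *cfg)
{
    assert(cfg != NULL);
    return (cfg->build_flags & TOY_BUILDFLAGS_DEEPBINDLESS) != 0;
}

/*
 * Models the refusal rule from the text: on a deepbindless system, an
 * ILP64 library that exports LP64-style names (e.g. dgemm_) must not be
 * loaded alongside a real LP64 library. Returns 1 if the load is safe.
 */
static int toy_can_mix(const toy_config *cfg,
                       int ilp64_uses_lp64_names,
                       int lp64_already_loaded)
{
    if (!toy_is_deepbindless(cfg))
        return 1;                      /* deep binding keeps symbols apart */
    return !(ilp64_uses_lp64_names && lp64_already_loaded);
}
```

An application can run a check like this before attempting a mixed LP64/ILP64 setup, rather than discovering the limitation from a refused load.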

Version History

v3.0.2 - Fix MKL threading interface to use properly-capitalized names to get the C ABI.

v3.0.1 - Don't dlclose() libraries; this can cause crashes due to not knowing when resources are truly freed.

v3.0.0 - Added active_forwards field to lbt_libinfo_t and exported_symbols to lbt_config_t.

v2.2.0 - Removed useless exit(1) in src/dl_utils.c.

v2.1.0 - Added threading getter/setter API, direct setting API and default function API.

v2.0.0 - Added f2c autodetection for Accelerate, changed public API to lbt_forward() from load_blas_funcs().

v1.0.0 - February 2021: Initial release with basic autodetection, LP64/ILP64 mixing and trampoline support.

Comments
  • Missing dot(ComplexF32, ComplexF32) for LBT4 + MKL

    Using the vs/lp64 branch of MKL.jl

    julia> using LinearAlgebra
    
    julia> x = rand(ComplexF32, 10); dot(x,x)
    6.305651f0 + 0.0f0im
    
    julia> using MKL
    
    julia> x = rand(ComplexF32, 10); dot(x,x)
    Error: no BLAS/LAPACK library loaded!
    1.0f-45 + 4.5916f-41im
    
    julia> BLAS.dotc(x,x)
    Error: no BLAS/LAPACK library loaded!
    2.5705f-41 + 0.0f0im
    
    opened by ViralBShah 12
  • Force setting of some `CFLAGS` and `LDFLAGS`

    Fix #44. @nalimilan could you please test this out? I think this is now doing what you want:

    % LANG=C make -C src CFLAGS="" LDFLAGS="" VERBOSE=1
    make: Entering directory '/home/mose/repo/libblastrampoline/src'
    cc -o build/libblastrampoline.o -fPIC -DF2C_AUTODETECTION -c libblastrampoline.c
    cc -o build/dl_utils.o -fPIC -DF2C_AUTODETECTION -c dl_utils.c
    cc -o build/config.o -fPIC -DF2C_AUTODETECTION -c config.c
    cc -o build/autodetection.o -fPIC -DF2C_AUTODETECTION -c autodetection.c
    cc -o build/threading.o -fPIC -DF2C_AUTODETECTION -c threading.c
    cc -o build/deepbindless_surrogates.o -fPIC -DF2C_AUTODETECTION -c deepbindless_surrogates.c
    cc -o build/trampolines/trampolines_x86_64.o -fPIC -DF2C_AUTODETECTION -c trampolines/trampolines_x86_64.S
    cc -o build/f2c_adapters.o -fPIC -DF2C_AUTODETECTION -c f2c_adapters.c
    cc -o build/libblastrampoline.so -fPIC -DF2C_AUTODETECTION build/libblastrampoline.o build/dl_utils.o build/config.o build/autodetection.o build/threading.o build/deepbindless_surrogates.o build/trampolines/trampolines_x86_64.o build/f2c_adapters.o -shared -ldl
    make: Leaving directory '/home/mose/repo/libblastrampoline/src'
    
    opened by giordano 10
  • Support Apple Accelerate

    macOS provides Accelerate.framework, which we can use to provide BLAS on macOS and especially on Apple silicon. Unfortunately there seem to be some ABI differences specifically related to return codes on these libraries. Since we want to provide the gfortran-compatible interface to all client code, we'll need to add some return code-altering shims and forward to those first, providing some small argument/return code massaging.

    We should be able to auto-detect this similarly to how we auto-detect bitwidth; we can call an affected function (such as sdot) and look at the return code to see how the return codes are being passed back.

    X-ref: https://github.com/tenomoto/dotwrp

    opened by staticfloat 9
  • Add gemmt symbols

    https://www.intel.com/content/www/us/en/develop/documentation/onemkl-developer-reference-fortran/top/blas-and-sparse-blas-routines/blas-like-extensions/gemmt.html

    opened by amontoison 8
  • Stop building with `-Werror`

    -Werror should not be used in default build flags as it makes the compilation fail every time somebody tries to build with a new version of the compiler which triggers a new warning. It should only be set when running CI.

    I saw this when trying to build the RPM for Julia 1.8.0-rc3 on Fedora rawhide (with GCC 12.1.1):

    threading.c: In function 'lbt_register_thread_interface':
    threading.c:44:5: error: 'strcpy' writing one too many bytes into a region of a size that depends on 'strlen' [-Werror=stringop-overflow=]
       44 |     strcpy(getter_names[idx], getter);
          |     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    threading.c:43:34: note: destination object of size [0, 9223372036854775805] allocated by 'malloc'
       43 |     getter_names[idx] = (char *) malloc(strlen(getter));
          |                                  ^~~~~~~~~~~~~~~~~~~~~~
    threading.c:46:5: error: 'strcpy' writing one too many bytes into a region of a size that depends on 'strlen' [-Werror=stringop-overflow=]
       46 |     strcpy(setter_names[idx], setter);
          |     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    threading.c:45:34: note: destination object of size [0, 9223372036854775805] allocated by 'malloc'
       45 |     setter_names[idx] = (char *) malloc(strlen(setter));
          |                                  ^~~~~~~~~~~~~~~~~~~~~~
    cc1: all warnings being treated as errors
    make[1]: *** [Makefile:37: build/threading.o] Error 1
    make[1]: *** Waiting for unfinished jobs....
    make[1]: Leaving directory '/builddir/build/BUILD/julia-1.8.0-beta3/deps/scratch/blastrampoline-d32042273719672c6669f6442a0be5605d434b70/src'
    make: *** [/builddir/build/BUILD/julia-1.8.0-beta3/deps/blastrampoline.mk:14: scratch/blastrampoline-d32042273719672c6669f6442a0be5605d434b70/build-compiled] Error 2
    make: Leaving directory '/builddir/build/BUILD/julia-1.8.0-beta3/deps'
    

    The warning probably deserves some attention, but the problem isn't new, I guess, so there's no reason to worry the particular user who happened to build with a new compiler version.

    opened by nalimilan 8
  • Use the properly-capitalized C-ABI MKL threading interface

    It appears that the all-lowercase threading symbols are actually all FORTRAN ABI, as can be seen through dlsym():

    julia> dlsym(libmkl_rt, "mkl_set_num_threads")
    Ptr{Nothing} @0x00007fa64f24e3b0
    
    julia> dlsym(libmkl_rt, "mkl_set_num_threads_")
    Ptr{Nothing} @0x00007fa64f24e3b0
    

    The C-ABI versions are not the non-underscore-suffixed ones, but the capitalized ones, e.g. MKL_Set_Num_Threads. The more you know.

    opened by staticfloat 6
  • Actually set the soname

    Ref: https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/59#discussion_r795090192

    Also, use major version instead of full version number, ref: https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/59#discussion_r795040249

    opened by giordano 5
  • Use `_64` ILP64 symbols for MKL

    MKL 2022.0.1 now has ILP64 symbols with a _64 suffix. I've verified this on Linux, and presumably this has been implemented on Mac and Windows too.

    $ nm libmkl_intel_ilp64.so | grep -i _64 | grep -i dpotrf
    000000000036f7c0 T dpotrf2_64
    000000000036f7c0 T DPOTRF2_64
    000000000035f900 T dpotrf_64
    000000000035f900 T DPOTRF_64
    
    opened by ViralBShah 5
  • Issues with blis_jll

    blis_jll now has ILP64 the way we want it. But the library does not have enough symbols. I always believed it had a complete set of BLAS symbols. I haven't yet investigated if the issue is in BLIS, the way it is built, or LBT.

    https://github.com/JuliaLinearAlgebra/BLIS.jl/issues/3 https://github.com/JuliaPackaging/Yggdrasil/pull/2666

    julia> using LinearAlgebra, blis_jll
    julia> BLAS.lbt_forward(blis_jll.blis, clear=true)
    33
    
    opened by ViralBShah 5
  • FlexiBLAS

    I tried to use FlexiBLAS as a BLAS/LAPACK backend for LBT. While lbt_forward seems to work (What does 2191 mean?) using it doesn't:

    julia> libflexiblas = "/opt/software/pc2/EB-SW/software/FlexiBLAS/3.0.4-GCC-11.2.0/lib/libflexiblas.so"
    "/opt/software/pc2/EB-SW/software/FlexiBLAS/3.0.4-GCC-11.2.0/lib/libflexiblas.so"
    
    julia> using LinearAlgebra
    
    julia> BLAS.lbt_forward(libflexiblas; clear=true)
    2191
    
    julia> BLAS.get_config()
    LinearAlgebra.BLAS.LBTConfig
    Libraries:
    └ [ LP64] libflexiblas.so
    
    julia> A = rand(1000,1000);
    
    julia> A*A;
    Error: no BLAS/LAPACK library loaded!
    

    Why doesn't this work given that FlexiBLAS claims to have 100% compatibility with BLAS/LAPACK ABI/API?

    opened by carstenbauer 4
  • Add `LBT_USE_RTLD_DEEPBIND` capability

    Certain tools such as sanitizers don't like us loading libraries with RTLD_DEEPBIND, so since we already have the workarounds in place for systems that don't have RTLD_DEEPBIND at all, let's change these from using compile-time constants to instead use a runtime switch that can be overridden through setting the environment variable LBT_USE_RTLD_DEEPBIND=0 before running our program.

    opened by staticfloat 4
  • On Windows only build library with major soversion

    Keeping multiple copies of LBT on Windows is simply useless, wasteful, and error prone.

    Refs

    • https://github.com/JuliaLang/julia/issues/46872#issue-1383915227
    • https://github.com/JuliaLang/julia/pull/47676#issuecomment-1367481852
    • https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/88#issuecomment-1367485884

    I haven't tested whether it actually builds, I'm hoping CI will tell me.

    opened by giordano 1
  • Resolving 32/64 and 64/64 libraries without function suffix

    I am attempting to port our BLAS/LAPACK shared libraries to work with the lbt_forward command. Our library is split into 4 shared libraries, built with the (serial or SMP) and (32/64 or 64/64) options. However, our 64/64 libraries do not have any additional naming convention; both the 32/64 and 64/64 libraries use the same function names (i.e. dgemm()).

    From my understanding, any pre-packaged Julia distribution forces users to use 64/64 BLAS. Since our 64/64 libraries do not have a suffix in any function, the lbt_forward command seems to call the 32/64 variant with 64-bit integers and causes a segfault at setup. From the ABI section of the README, it sounds like LBT requires BLAS libraries to use the 64/64 OpenBLAS naming convention via the suffix. While possible on our end, is there any other way to resolve this?

    Current version: 1.7.0-rc2 (last version prebuilt for powerpcle). Possible to move to 1.7.2 or higher if needed.

    opened by BrandonGroth 0
  • Error: no BLAS/LAPACK library loaded!

    julia> if Base.USE_BLAS64
               # Load ILP64 forwards
               BLAS.lbt_forward(libkmlblas; clear=true, suffix_hint="64")
               # Load LP64 forward
               BLAS.lbt_forward(libkmlblas; suffix_hint="")
           else
               BLAS.lbt_forward(libkmlblas; clear=true, suffix_hint="")
           end
    325
    
    julia> BLAS.get_config()
    LinearAlgebra.BLAS.LBTConfig
    Libraries: 
    └ [ LP64] libkblas.so
    
    julia> Base.USE_BLAS64
    true
    
    julia> aa * bb
    Error: no BLAS/LAPACK library loaded!
    

    Sorry to bother you. I'm trying to use BLAS.lbt_forward to implement my own BLAS switching. I wrote a BLAS library and installed Julia on an ARM-based machine. I didn't use BinaryBuilder to build my binaries; instead, I built the shared library directly on this machine. As the session above shows, BLAS.lbt_forward links to my library successfully, but calling a matrix multiplication fails with the error shown. My library was written to the MKL BLAS specification, and the switching method follows the one used by MKL.jl. Please tell me where the problem lies.

    opened by liushang0322 4
  • Null pointers in LP64 on Windows

    Building Ipopt and MUMPS_seq with LBT using LP64 results in null pointer issues only on Windows (when using OpenBLAS32), both win64 and win32. Works fine on the other platforms.

    Discussed in https://github.com/jump-dev/Ipopt.jl/pull/327

    opened by ViralBShah 1
  • Some problems with library on Windows

    There are currently two orthogonal issues with the library on Windows (spotted in https://github.com/jump-dev/Ipopt.jl/pull/327#issuecomment-1233458190):

    • the soversion of the library, hard-coded in src/Make.inc, is stale: https://github.com/JuliaLinearAlgebra/libblastrampoline/blob/b829b0e1583da58bc80acbcc0cfb83b482530b92/src/Make.inc#L26-L28 but the latest release at the moment is v5.1.1
    • https://github.com/JuliaLinearAlgebra/libblastrampoline/blob/b829b0e1583da58bc80acbcc0cfb83b482530b92/src/Makefile#L45-L46 generates the import library for $@, which is LIB_FULL_VERSION, which is a bad idea because it means the ABI is broken in every single version (well, every single version where the soversion is actually bumped, see point above)

    An overall solution is probably to stop generating the library with the full name on Windows, or at least generating the import library for LIB_MAJOR_VERSION instead of LIB_FULL_VERSION.

    There remains the problem of actually remembering to update the soversion in src/Make.inc when doing a new release.

    CC: @ViralBShah

    bug 
    opened by giordano 5
  • 32-bit Windows CI is broken

    All GitHub Actions builds for Windows x86 since July 2022 have failed, and the first one to fail was associated with the merge commit for #80, which literally only touched the README. All other platforms seem just fine.

    opened by ararslan 1
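The stringop-overflow diagnostic quoted in the "Stop building with -Werror" thread above points at a buffer allocated with malloc(strlen(getter)), which leaves no room for the terminating NUL byte that strcpy writes. A minimal sketch of a correct copy follows; copy_name is a hypothetical helper, not the function used upstream. The v5.1.1 release notes list "Use strdup to copy strings", which may address the same code, though that connection is an assumption here.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return a heap-allocated copy of `src`, or NULL if allocation fails.
 * The caller owns the result and must free() it.
 */
static char *copy_name(const char *src)
{
    assert(src != NULL);
    size_t len = strlen(src) + 1;      /* +1 for the terminating '\0' */
    char *dst = malloc(len);
    if (dst != NULL)
        memcpy(dst, src, len);         /* copies the '\0' as well */
    return dst;
}
```

On POSIX systems strdup() does the same job in one call; the explicit version is shown to make the off-by-one visible.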
Releases(v5.3.0)
  • v5.2.0(Oct 15, 2022)

    What's Changed

    Other Changes

    • Remove version history from the README by @ViralBShah in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/80
    • Use Cirrus for FreeBSD CI rather than Travis by @ararslan in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/85
    • Mention Julia packages based on LBT in README.md by @carstenbauer in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/83
    • Generate Windows import library for with major version only by @giordano in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/88
    • Fix composition of f2c and complex return style in Apple Accelerate by @staticfloat in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/82
    • Added quotes around linked libs for ld, removed extra whitespace. by @apaz-cli in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/90
    • Print exit code when command fails by @staticfloat in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/91
    • Change to buildkite CI by @staticfloat in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/92

    New Contributors

    • @ararslan made their first contribution in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/85
    • @carstenbauer made their first contribution in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/83
    • @apaz-cli made their first contribution in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/90

    Full Changelog: https://github.com/JuliaLinearAlgebra/libblastrampoline/compare/v5.1.1...v5.2.0

  • v5.1.1(Jun 16, 2022)

    What's Changed

    • Add riscv64 support by @alexfanqi in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/68
    • Use strdup to copy strings by @yuyichao in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/71
    • Remove const qualifier on return type by @yuyichao in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/72
    • Add support for DESTDIR by @inkydragon in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/76
    • Fix incompatible pointer conversion on gcc 12.1.0 by @metab0t in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/77
    • Correctly identify OS when cross compile with MinGW on Linux by @metab0t in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/79

    New Contributors

    • @alexfanqi made their first contribution in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/68
    • @yuyichao made their first contribution in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/71
    • @inkydragon made their first contribution in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/76
    • @metab0t made their first contribution in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/77

    Full Changelog: https://github.com/JuliaLinearAlgebra/libblastrampoline/compare/v5.1.0...v5.1.1

  • v5.1.0(Mar 12, 2022)

    What's Changed

    • Allow complex return style autodetection on Windows as well by @staticfloat in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/67

    Full Changelog: https://github.com/JuliaLinearAlgebra/libblastrampoline/compare/v5.0.2...v5.1.0

  • v5.0.2(Mar 8, 2022)

    What's Changed

    • Fix build for OpenBSD by @cmburn in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/64
    • Add cblas_sdot and cblas_ddot to our CBLAS workaround list by @staticfloat in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/66

    New Contributors

    • @cmburn made their first contribution in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/64

    Full Changelog: https://github.com/JuliaLinearAlgebra/libblastrampoline/compare/v5.0.1...v5.0.2

  • v5.0.1(Feb 2, 2022)

    What's Changed

    • Release v5.0.1, fixing complex retstyle infinite loop bug by @staticfloat in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/62

    Full Changelog: https://github.com/JuliaLinearAlgebra/libblastrampoline/compare/v5.0.0...v5.0.1

  • v5.0.0(Feb 2, 2022)

    What's Changed

    • Add complex return style detection and cblas_*_sub workaround by @staticfloat in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/61

    Full Changelog: https://github.com/JuliaLinearAlgebra/libblastrampoline/compare/v4.1.0...v5.0.0

  • v4.1.0(Feb 2, 2022)

    What's Changed

    • Add CI job to ensure that our symbol list is up to date by @staticfloat in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/57
    • Add LBT_STRICT environment variable by @staticfloat in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/58
    • Add SONAME/SOVERSION by @staticfloat in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/59
    • Actually set the soname by @giordano in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/60

    Full Changelog: https://github.com/JuliaLinearAlgebra/libblastrampoline/compare/v4.0.0...v4.1.0

  • v4.0.0(Jan 20, 2022)

    What's Changed

    • Force setting of some CFLAGS and LDFLAGS by @giordano in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/45
    • Mark stack as non executable by @nalimilan in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/51
    • Revert mistaken removal of LBT_LDFLAGS line by @nalimilan in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/53
    • Add test for how to directly load new MKL ILP64 by @staticfloat in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/54

    New Contributors

    • @nalimilan made their first contribution in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/51

    Full Changelog: https://github.com/JuliaLinearAlgebra/libblastrampoline/compare/v3.1.0...v4.0.0

  • v3.1.0(Oct 10, 2021)

    What's Changed

    • Run CI for armv7l on Drone by @giordano in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/40
    • Remove vendor-specific APIs from include/ by @staticfloat in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/42
    • Build on Haiku by @vazub in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/43
    • Add -fvisibility=protected on Linux by @staticfloat in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/46
    • Add LBT_USE_RTLD_DEEPBIND capability by @staticfloat in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/48
    • Use Alpine 3.13 and remove -fvisibility=protected -fuse-ld=gold flags by @giordano in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/49

    New Contributors

    • @vazub made their first contribution in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/43

    Full Changelog: https://github.com/JuliaLinearAlgebra/libblastrampoline/compare/v3.0.4...v3.1.0

  • v3.0.4(Apr 7, 2021)

    What's Changed

    • Add missing underscore by @pabloferz in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/37
    • Fix armv7l BLAS interface autodetection by @staticfloat in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/38

    New Contributors

    • @pabloferz made their first contribution in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/37

    Full Changelog: https://github.com/JuliaLinearAlgebra/libblastrampoline/compare/v3.0.3...v3.0.4

  • v3.0.3(Apr 7, 2021)

    What's Changed

    • Test package on Alpine Linux by @giordano in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/33
    • Fix armv7l trampoline by @staticfloat in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/34

    Full Changelog: https://github.com/JuliaLinearAlgebra/libblastrampoline/compare/v3.0.2...v3.0.3

  • v3.0.2(Mar 2, 2021)

    What's Changed

    • Use the properly-capitalized C-ABI MKL threading interface by @staticfloat in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/32

    Full Changelog: https://github.com/JuliaLinearAlgebra/libblastrampoline/compare/v3.0.1...v3.0.2

  • v3.0.1(Feb 27, 2021)

    Apparently, mixing MKL and OpenBLAS causes crashes if we dlclose() things.

    What's Changed

    • Minor improvements to README.md files by @giordano in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/30

    Full Changelog: https://github.com/JuliaLinearAlgebra/libblastrampoline/compare/v3.0.0...v3.0.1

  • v3.0.0(Feb 24, 2021)

    What's Changed

    • Add active_forwards in config info by @staticfloat in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/31

    Full Changelog: https://github.com/JuliaLinearAlgebra/libblastrampoline/compare/v2.2.0...v3.0.0

  • v2.2.0(Feb 20, 2021)

  • v2.1.0(Feb 20, 2021)

    What's Changed

    • Install libblas64 and liblapack64 on Ubuntu by @staticfloat in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/25
    • Add f2c autodetection, test against Accelerate on macOS by @staticfloat in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/24
    • Only use surrogate lsame on RTLD_DEEPBIND-less systems by @staticfloat in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/26
    • Add lbt_get_config() and get/set threads APIs and some tests showing how to use it from Julia by @staticfloat in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/27
    • APIs, APIs, APIs by @staticfloat in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/29

    Full Changelog: https://github.com/JuliaLinearAlgebra/libblastrampoline/compare/v2.0.0...v2.1.0

  • v2.0.0(Feb 16, 2021)

    What's Changed

    • Test MKL ILP64 through environment variable by @staticfloat in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/18
    • [WIP] Add MKL exports by @ViralBShah in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/19
    • Rename load_blas_funcs to lbt_forward by @ViralBShah in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/21

    New Contributors

    • @ViralBShah made their first contribution in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/19

    Full Changelog: https://github.com/JuliaLinearAlgebra/libblastrampoline/compare/v1.0.0...v2.0.0

  • v1.0.0(Feb 13, 2021)

    What's Changed

    • Add CI with GitHub Actions by @giordano in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/3
    • Fix CI by @staticfloat in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/5
    • Enable RTLD_DEEPBIND by default, add workaround for musl and FreeBSD by @staticfloat in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/6
    • Exporting doesn't use underscores by @staticfloat in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/8
    • Add LICENSE file by @giordano in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/10
    • Add CI for aarch64 and FreeBSD by @giordano in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/9
    • Fix name of branch for CI by @giordano in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/11
    • Install compiler for 32-bit Windows by @giordano in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/12
    • Install build tools in Drone by @giordano in https://github.com/JuliaLinearAlgebra/libblastrampoline/pull/13

    Full Changelog: https://github.com/JuliaLinearAlgebra/libblastrampoline/commits/v1.0.0

Owner
Elliot Saba
Senior Research Engineer