Beringei is a high performance, in-memory storage engine for time series data.

Overview

** THIS REPO HAS BEEN ARCHIVED AND IS NO LONGER BEING ACTIVELY MAINTAINED **

Beringei

In the fall of 2015, we published the paper “Gorilla: A Fast, Scalable, In-Memory Time Series Database” at VLDB 2015. Beringei is the open source representation of the ideas presented in this paper.

Beringei is a high performance time series storage engine. Time series are commonly used as a representation of statistics, gauges, and counters for monitoring performance and health of a system.

Features

Beringei has the following features:

  • Support for very fast, in-memory storage, backed by disk for persistence. Queries to the storage engine are always served out of memory for extremely fast query performance, but backed to disk so the process can be restarted or migrated with very little down time and no data loss.
  • Extremely efficient streaming compression algorithm. Our streaming compression algorithm is able to compress real world time series data by over 90%. The delta of delta compression algorithm used by Beringei is also fast - we see that a single machine is able to compress more than 1.5 million datapoints/second.
  • Reference sharded service implementation, including a client implementation.
  • Reference http service implementation that enables direct Grafana integration.

How can I use Beringei?

Beringei can be used in one of two ways.

  1. We have created a simple sharded service and reference client implementation that can store and serve time series query requests.
  2. You can use Beringei as an embedded library to handle the low-level details of efficiently storing time series data. Using Beringei in this way is similar to RocksDB - the Beringei library can be the high performance storage system underlying your performance monitoring solution.

Requirements

Beringei is tested and working on:

  • Ubuntu 16.10

We also depend on these open source projects:

Building Beringei

Our instructions are for Ubuntu 16.10, but you can probably modify the install scripts and directions to work with other Linux distros.

  • Run sudo ./setup_ubuntu.sh.

  • Build beringei.

mkdir build && cd build && cmake .. && make
  • Generate a beringei configuration file.
./beringei/tools/beringei_configuration_generator --host_names $(hostname) --file_path /tmp/beringei.json
  • Start beringei.
./beringei/service/beringei_main \
    -beringei_configuration_path /tmp/beringei.json \
    -create_directories \
    -sleep_between_bucket_finalization_secs 60 \
    -allowed_timestamp_behind 300 \
    -bucket_size 600 \
    -buckets $((86400/600)) \
    -logtostderr \
    -v=2
  • Send data.
while [[ 1 ]]; do
    ./beringei/tools/beringei_put \
        -beringei_configuration_path /tmp/beringei.json \
        testkey ${RANDOM} \
        -logtostderr -v 3
    sleep 30
done
  • Read the data back.
./beringei/tools/beringei_get \
    -beringei_configuration_path /tmp/beringei.json \
    testkey \
    -logtostderr -v 3

License

Beringei is BSD-licensed. We also provide an additional patent grant.

Issues
  • plain text server

    plain text server

    Add a fetch endpoint: curl "127.0.0.1:9990/fetch?start=2016-10-31T06:33:44.866Z&end=2018-10-31T06:33:44.866Z&key=testkey"

    Add an update endpoint: curl "127.0.0.1:9990/update" --data-binary @metrics

    Metrics file:

    testkey 1 1484620850
    testkey 3 1484620883
    
    CLA Signed 
    opened by d33d33 9
  • [beringei] Fix the build

    [beringei] Fix the build

    Summary: 7138b7d exposed the fact that our dependencies are all very old.

    We still need to come up with a better solution than manually updating when things break, but for now, this gets things working again by bumping the version numbers.

    Updating the dependencies also broke two things that required fixing:

    • fbthrift now generates extra files that need compiling
    • folly::Singleton doesn't like being used before folly::init()

    Test Plan:

    • make test passes
    • beringei_main runs without crashing
    CLA Signed 
    opened by scottfranklin 5
  • [beringei] Improve CMake dependency tree

    [beringei] Improve CMake dependency tree

    Summary:

    • Require gtest headers to be downloaded before building unit tests
    • Turn thrift generation into a proper build target

    Test Plan:

    • A clean build with -j10 now succeeds instead of failing
    • Repeated make now does not recompile everything
    • Touching any .thrift file or build_thrift.sh still causes a rebuild
    • make clean now deletes thrift-generated files
    • Build now works with ninja: mkdir build && cd build && cmake -GNinja .. && ninja
    CLA Signed 
    opened by scottfranklin 4
  • [beringei] Automate the build/generation of thrift files

    [beringei] Automate the build/generation of thrift files

    Summary: Currently we expect the user/developer to build/generate thrift files before running 'cmake ..'. This also means that users/developers need to remember to build thrift files on every change. This diff automates the process.

    execute_process() is run on 'cmake ..'; add_custom_command() is run on 'make'.

    The one disadvantage of this is that 'make' will now always rebuild every target that depends on these thrift files.

    Test Plan:

    1. Test 'cmake ..' with no build errors

    OUTPUT:

      -- FOLLY: /usr/local/facebook/include
      -- WANGLE: /usr/local/facebook/include
      -- PROXYGEN: /usr/local/facebook/include
      Building Required Thrift Files
      Building file: beringei_data.thrift
      Building file: beringei_grafana.thrift
      Building file: beringei.thrift
      -- Configuring done
      -- Generating done

    2. Test 'cmake ..' with build error

    OUTPUT:

      -- FOLLY: /usr/local/facebook/include
      -- WANGLE: /usr/local/facebook/include
      -- PROXYGEN: /usr/local/facebook/include
      Building Required Thrift Files
      Building file: beringei_data.thrift
      [ERROR:beringei_data.thrift:13] (last token was 'sruct') syntax error
      [FAILURE:beringei_data.thrift:13] Parser error during include pass.
      CMake Error at CMakeLists.txt:56 (message): Could not build thrfft file.

    3. Test 'make' with no build errors -> Success

    4. Test 'make' with build errors

    OUTPUT:

      [ 15%] Built target beringei_test_util
      [ 16%] Building CXX object beringei/if/CMakeFiles/beringei_thrift.dir/gen-cpp2/BeringeiService_processmap_compact.cpp.o
      [ 17%] Building CXX object beringei/if/CMakeFiles/beringei_thrift.dir/gen-cpp2/beringei_constants.cpp.o
      [ 18%] Building CXX object beringei/if/CMakeFiles/beringei_thrift.dir/gen-cpp2/beringei_data_types.cpp.o
      [ 19%] Building CXX object beringei/if/CMakeFiles/beringei_thrift.dir/gen-cpp2/beringei_data_constants.cpp.o
      [ 20%] Building CXX object beringei/if/CMakeFiles/beringei_thrift.dir/gen-cpp2/beringei_types.cpp.o
      [ 21%] Building CXX object beringei/if/CMakeFiles/beringei_thrift.dir/gen-cpp2/beringei_grafana_constants.cpp.o
      [ 22%] Building CXX object beringei/if/CMakeFiles/beringei_thrift.dir/gen-cpp2/beringei_grafana_types.cpp.o
      [ 23%] Linking CXX static library libberingei_thrift.a
      Building file: beringei_data.thrift
      Building file: beringei_grafana.thrift
      Building file: beringei.thrift
      [ERROR:beringei.thrift:15] (last token was 'ervice') syntax error
      [FAILURE:beringei.thrift:15] Parser error during include pass.
      beringei/if/CMakeFiles/beringei_thrift.dir/build.make:328: recipe for target 'beringei/if/libberingei_thrift.a' failed
      make[2]: *** [beringei/if/libberingei_thrift.a] Error 1
      CMakeFiles/Makefile2:173: recipe for target 'beringei/if/CMakeFiles/beringei_thrift.dir/all' failed
      make[1]: *** [beringei/if/CMakeFiles/beringei_thrift.dir/all] Error 2
      Makefile:138: recipe for target 'all' failed
      make: *** [all] Error 2

    opened by kiranisaac 4
  • [beringei] Bump FB_VERSION to 2017.05.22.00

    [beringei] Bump FB_VERSION to 2017.05.22.00

    Summary: Update dependencies to be a month newer.

    This also forces .thrift files to be recompiled when the thrift libraries are updated, avoiding mismatches between old generated thrift files and new thrift common libraries after pulling in new dependencies with setup_ubuntu.sh.

    Test Plan: CircleCI

    CLA Signed 
    opened by scottfranklin 3
  • [beringei] Continuous build using CircleCI and Docker

    [beringei] Continuous build using CircleCI and Docker

    Summary:

    Continuous build integration for Beringei using CircleCI and Docker. CircleCI only supports Ubuntu 14. Hence we set up a Docker Ubuntu 16.10 container and run the build/test inside the Docker container.

    I should have stacked my diffs/pull requests. This Pull request is built on top of https://github.com/facebookincubator/beringei/pull/4 (automate build for thrift files).

    Test Plan: CircleCI.

    opened by kiranisaac 3
  • Fix avg mcs too low

    Fix avg mcs too low

    Found in SJC: the MCS average was around 8.1 even though MCS never drops below 9. The likely reason is that I was not considering missing data. This diff considers the missing data when calculating average SNR, MCS and txPower.

    opened by csmodlin 2
  • Add Vagrantfile

    Add Vagrantfile

    This PR adds a Vagrantfile for using Vagrant. It's friendlier for development than Docker: you just need vagrant up and it will mount the folder into /vagrant in the guest VM. I've tested it on several host OSes; maybe someone with a Mac can help test whether it works on macOS.

    • when you run vagrant up for the first time, it will run setup_ubuntu.sh
    • tested host
      • Fedora 25
      • Ubuntu 16.04 LTS
      • Windows 10

    I don't know if it's possible to have a hack folder like k8s has, to put the environment setup, test scripts (from the readme), and format-check scripts together. There is only setup_ubuntu.sh for now, but there could be more in the future.

    beringei
      build
      beringei
      hack
        setup_ubuntu.sh
        gen_config.sh
        start_beringei.sh
        insert_random.sh
        read.sh
        check_format.sh
    CLA Signed 
    opened by at15 2
  • Add type-ahead query handler to beringei query service T22220874

    Add type-ahead query handler to beringei query service T22220874

    Adapted from queryHelper.js

    • accept a topology json struct which includes nodes, links, sites, etc.
    • fetch all key ids for the given topology name from the db
    • allow a formatter to take the key names in the DB and transform them into their short-names
    • return a list of the key names

    opened by mahyali 1
  • Add .dockerignore file

    Add .dockerignore file

    This is a very simple PR. It just adds a .dockerignore file. This helps with Docker caching: since we add the whole directory while building the Docker image, ignoring those files avoids triggering new build steps. As a result, cached steps are invalidated only when the code itself changes.

    CLA Signed 
    opened by commixon 1
  • Add cleaner thread to beringei

    Add cleaner thread to beringei

    Looks like we missed porting the cleaner thread to beringei (which periodically goes through and compacts the keylist and deletes old block files). This should make the service more stable/efficient over time.

    Tested by running the service for > 24 hours while constantly pushing data into it:

      ./beringei/service/beringei_main \
        -beringei_configuration_path /tmp/beringei.json \
        -create_directories \
        -sleep_between_bucket_finalization_secs 60 \
        -allowed_timestamp_behind 300 \
        -bucket_size 600 \
        -buckets $((86400/600)) \
        -logtostderr \
        -v=2
    

    In another terminal:

      while [[ 1 ]]; do
        ./beringei/tools/beringei_put \
            -beringei_configuration_path /tmp/beringei.json \
            testkey ${RANDOM} \
            -logtostderr -v 3
        sleep 30
    done
    
    CLA Signed 
    opened by jteller 1
  • setup_ubuntu.sh works on 17.10, Dockerfile builds

    setup_ubuntu.sh works on 17.10, Dockerfile builds

    This change allows building Beringei with a known good version of the upstream files. setup_ubuntu.sh works and the Dockerfile builds cleanly. I've taken the setup TG is using and wrapped it in #ifdef's to peg the versions.

    CLA Signed 
    opened by michaelnugent 0
  • Update dependency versions

    Update dependency versions

    Summary: Pull in the latest changes from various dependencies.

    • Update FB dependencies to a newer date
    • Use a commit hash for wangle, since its version tagging seems to have failed this week
    • Update to the latest zstd version

    Test Plan:

    • It builds
    • Tests pass
    CLA Signed 
    opened by scottfranklin 4