Machine Learning Framework for Operating Systems - Brings ML to Linux kernel

Overview
logo

KML: A Machine Learning Framework for Operating Systems & Storage Systems

CircleCI codecov

Storage systems and their OS components are designed to accommodate a wide variety of applications and dynamic workloads. Storage components inside the OS contain various heuristic algorithms to provide high performance and adaptability for different workloads. These heuristics may be tunable via parameters, and some system calls allow users to optimize their system performance. These parameters are often predetermined based on experiments with limited applications and hardware. Thus, storage systems often run with these predetermined and possibly suboptimal values. Tuning these parameters manually is impractical: one needs an adaptive, intelligent system to handle dynamic and complex workloads. Machine learning (ML) techniques are capable of recognizing patterns, abstracting them, and making predictions on new data. ML can be a key component to optimize and adapt storage systems. We propose KML, an ML framework for operating systems & storage systems. We implemented a prototype and demonstrated its capabilities on the well-known problem of tuning optimal readahead values. Our results show that KML has a small memory footprint, introduces negligible overhead, and yet enhances throughput by as much as 2.3×.

For more information on the KML project, please see our papers

KML is under development by Ibrahim Umit Akgun of the File Systems and Storage Lab (FSL) at Stony Brook University under Professor Erez Zadok.

Table of Contents

Setup

Clone KML

# SSH
git clone --recurse-submodules [email protected]:sbu-fsl/kernel-ml.git

# HTTPS
git clone --recurse-submodules https://github.com/sbu-fsl/kernel-ml.git

Build Dependencies

KML depends on the following third-party repositories:

# Create and enter a directory for dependencies
mkdir dependencies
cd dependencies

# Clone repositories
git clone https://github.com/google/benchmark.git
git clone https://github.com/google/googletest.git

# Build google/benchmark
cd benchmark
mkdir build
cd build
cmake -DCMAKE_BUILD_TYPE=Release ../
make
sudo make install

# Build google/googletest
cd ../googletest
mkdir build
cd build
cmake -DCMAKE_BUILD_TYPE=Release ../
make
sudo make install
cd ../..

Install KML Linux Kernel Modifications

KML requires Linux kernel modifications to function. We recommend allocating at least 25 GiB of disk space before beginning the installation process.

  1. Navigate to the kernel-ml/kernel-ml-linux directory. This repository was recursively cloned during setup
    cd kernel-ml-linux
  2. Install the following packages
    git fakeroot build-essential ncurses-dev xz-utils libssl-dev bc flex libelf-dev bison
    
  3. Install the modified kernel as normal. No changes are required for make menuconfig
    cp /boot/config-$(uname -r) .config
    make menuconfig
    make -j$(nproc)
    sudo make modules_install -j$(nproc)
    sudo make install -j$(nproc)
  4. Restart your machine
    sudo reboot
    
  5. Confirm that you now have Linux version 4.19.51+ installed
    uname -a

Specify Kernel Header Location

Edit kernel-ml/cmake/FindKernelHeaders.cmake to specify the absolute path to the aforementioned kernel-ml/kernel-ml-linux directory. For example, if kernel-ml-linux lives in /home/kernel-ml/kernel-ml-linux:

...

# Find the headers
find_path(KERNELHEADERS_DIR
        include/linux/user.h
        PATHS /home/kernel-ml/kernel-ml-linux
)

...

Build KML

# Create a build directory for KML
mkdir build
cd build 
cmake -DCMAKE_BUILD_TYPE=Release -DCMAKE_CXX_FLAGS="-Werror" ..
make

Double Check

In order to check everything is OK, we can run tests and benchmarks.

cd build
ctest --verbose

Design

kernel-design

Example

Citing KML

To cite this repository:

@TECHREPORT{umit21kml-tr,
  AUTHOR =       "Ibrahim Umit Akgun and Ali Selman Aydin and Aadil Shaikh and Lukas Velikov and Andrew Burford and Michael McNeill and Michael Arkhangelskiy and Erez Zadok",
  TITLE =        "KML: Using Machine Learning to Improve Storage Systems",
  INSTITUTION =  "Computer Science Department, Stony Brook University",
  YEAR =         "2021",
  MONTH =        "Nov",
  NUMBER =       "FSL-21-02",
}
@INPROCEEDINGS{hotstorage21kml,
  TITLE =        "A Machine Learning Framework to Improve Storage System Performance",
  AUTHOR =       "Ibrahim 'Umit' Akgun and Ali Selman Aydin and Aadil Shaikh and Lukas Velikov and Erez Zadok",
  NOTE =         "To appear",
  BOOKTITLE =    "HotStorage '21: Proceedings of the 13th ACM Workshop on Hot Topics in Storage",
  MONTH =        "July",
  YEAR =         "2021",
  PUBLISHER =    "ACM",
  ADDRESS =      "Virtual",
  KEY =          "HOTSTORAGE 2021",
}
Releases(v0.0.1)
Owner
File systems and Storage Lab (FSL)
Researchers and students in the FSL group perform research in operating systems with focus on file systems, storage, security, and networking.
File systems and Storage Lab (FSL)
A lightweight C++ machine learning library for embedded electronics and robotics.

Fido Fido is an lightweight, highly modular C++ machine learning library for embedded electronics and robotics. Fido is especially suited for robotic

The Fido Project 411 May 31, 2022
A C++ standalone library for machine learning

Flashlight: Fast, Flexible Machine Learning in C++ Quickstart | Installation | Documentation Flashlight is a fast, flexible machine learning library w

Facebook Research 4.3k Jun 30, 2022
mlpack: a scalable C++ machine learning library --

a fast, flexible machine learning library Home | Documentation | Doxygen | Community | Help | IRC Chat Download: current stable version (3.4.2) mlpack

mlpack 4k Jun 23, 2022
null 5.5k Jul 1, 2022
Flashlight is a C++ standalone library for machine learning

Flashlight is a fast, flexible machine learning library written entirely in C++ from the Facebook AI Research Speech team and the creators of Torch and Deep Speech.

null 4.3k Jul 3, 2022
Edge ML Library - High-performance Compute Library for On-device Machine Learning Inference

Edge ML Library (EMLL) offers optimized basic routines like general matrix multiplications (GEMM) and quantizations, to speed up machine learning (ML) inference on ARM-based devices. EMLL supports fp32, fp16 and int8 data types. EMLL accelerates on-device NMT, ASR and OCR engines of Youdao, Inc.

NetEase Youdao 176 Jun 17, 2022
ML++ - A library created to revitalize C++ as a machine learning front end

ML++ Machine learning is a vast and exiciting discipline, garnering attention from specialists of many fields. Unfortunately, for C++ programmers and

marc 999 Jun 22, 2022
Caffe: a fast open framework for deep learning.

Caffe Caffe is a deep learning framework made with expression, speed, and modularity in mind. It is developed by Berkeley AI Research (BAIR)/The Berke

Berkeley Vision and Learning Center 32.7k Jun 24, 2022
Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow

eXtreme Gradient Boosting Community | Documentation | Resources | Contributors | Release Notes XGBoost is an optimized distributed gradient boosting l

Distributed (Deep) Machine Learning Community 22.9k Jul 1, 2022
Samsung Washing Machine replacing OS control unit

hacksung Samsung Washing Machine WS1702 replacing OS control unit More info at https://www.hackster.io/roni-bandini/dead-washing-machine-returns-to-li

null 24 May 12, 2022
Distributed (Deep) Machine Learning Community 677 Apr 14, 2022
RNNLIB is a recurrent neural network library for sequence learning problems. Forked from Alex Graves work http://sourceforge.net/projects/rnnl/

Origin The original RNNLIB is hosted at http://sourceforge.net/projects/rnnl while this "fork" is created to repeat results for the online handwriting

Sergey Zyrianov 869 Jun 26, 2022
Frog is an integration of memory-based natural language processing (NLP) modules developed for Dutch. All NLP modules are based on Timbl, the Tilburg memory-based learning software package.

Frog - A Tagger-Lemmatizer-Morphological-Analyzer-Dependency-Parser for Dutch Copyright 2006-2020 Ko van der Sloot, Maarten van Gompel, Antal van den

Language Machines 69 Jun 20, 2022
R2LIVE is a robust, real-time tightly-coupled multi-sensor fusion framework, which fuses the measurement from the LiDAR, inertial sensor, visual camera to achieve robust, accurate state estimation.

R2LIVE is a robust, real-time tightly-coupled multi-sensor fusion framework, which fuses the measurement from the LiDAR, inertial sensor, visual camera to achieve robust, accurate state estimation.

HKU-Mars-Lab 568 Jun 21, 2022
A Modular Framework for LiDAR-based Lifelong Mapping

LT-mapper News July 2021 A preprint manuscript is available (download the preprint). LT-SLAM module is released.

Giseop Kim 209 Jun 23, 2022
[RSS 2021] An End-to-End Differentiable Framework for Contact-Aware Robot Design

DiffHand This repository contains the implementation for the paper An End-to-End Differentiable Framework for Contact-Aware Robot Design (RSS 2021). I

Jie Xu 47 Jun 24, 2022
Ingescape - Model-based framework for broker-free distributed software environments

Ingescape - Model-based framework for broker-free distributed software environments Overview Scope and Goals Ownership and License Dependencies with o

The ZeroMQ project 25 Jun 10, 2022
Vowpal Wabbit is a machine learning system which pushes the frontier of machine learning with techniques such as online, hashing, allreduce, reductions, learning2search, active, and interactive learning.

This is the Vowpal Wabbit fast online learning code. Why Vowpal Wabbit? Vowpal Wabbit is a machine learning system which pushes the frontier of machin

Vowpal Wabbit 8k Jul 1, 2022
kvm-host is a minimalist type 2 hypervisor using Linux Kernel-based Virtual Machine (KVM), capable of running Linux kernel partially.

kvm-host kvm-host is a minimalist type 2 hypervisor using Linux Kernel-based Virtual Machine (KVM), capable of running Linux kernel partially. Build a

null 65 Jun 23, 2022
Corvusoft's Restbed framework brings asynchronous RESTful functionality to C++14 applications.

Restbed Restbed is a comprehensive and consistent programming model for building applications that require seamless and secure communication over HTTP

Corvusoft 1.7k Jun 21, 2022
Corvusoft's Restbed framework brings asynchronous RESTful functionality to C++14 applications.

Restbed Restbed is a comprehensive and consistent programming model for building applications that require seamless and secure communication over HTTP

Corvusoft 1.7k Jun 24, 2022
Corvusoft's Restbed framework brings asynchronous RESTful functionality to C++14 applications.

Restbed Restbed is a comprehensive and consistent programming model for building applications that require seamless and secure communication over HTTP

Corvusoft 1.4k Mar 8, 2021
This is the Arduino® compatible port of the AIfES machine learning framework, developed and maintained by Fraunhofer Institute for Microelectronic Circuits and Systems.

AIfES for Arduino® AIfES (Artificial Intelligence for Embedded Systems) is a platform-independent and standalone AI software framework optimized for e

null 129 Jun 27, 2022
Dining philosophers problem is a problem created by Edsger Wybe Dijkstra in 1965 to explain the deadlock state of an operating system, which is traditionally commonly introduced in lectures on operating systems

42-philosophers Dining philosophers problem is a problem created by Edsger Wybe Dijkstra in 1965 to explain the deadlock state of an operating system,

Lavrenova Maria 24 Jun 27, 2022
This software brings you the possibility to Read and Write the internal Flash of the Nordic nRF52 series with an ESP32

ESP32 nRF52 SWD flasher This software brings you the possibility to Read and Write the internal Flash of the Nordic nRF52 series with an ESP32 using t

null 117 Jun 22, 2022
MHPatches is a plugin that brings some of PS2 features of Manhunt to the PC.

MHPatches Intro MHPatches is a plugin that brings some of PS2 features of Manhunt to the PC. Requirements UAL (https://github.com/ThirteenAG/Ultimate-

Fire_Head 22 May 27, 2022
Minimal Linux Live (MLL) is a tiny educational Linux distribution, which is designed to be built from scratch by using a collection of automated shell scripts. Minimal Linux Live offers a core environment with just the Linux kernel, GNU C library, and Busybox userland utilities.

Minimal Linux Live (MLL) is a tiny educational Linux distribution, which is designed to be built from scratch by using a collection of automated shell scripts. Minimal Linux Live offers a core environment with just the Linux kernel, GNU C library, and Busybox userland utilities.

John Davidson 1.3k Jun 20, 2022
A very minimal type-2 hypervisor built using Linux Kernel Virtual Machine for Linux.

wiser A very minimal type-2 hypervisor built using Linux Kernel Virtual Machine for Linux. Following project is under-development expect unfinished co

flouthoc 245 Jun 24, 2022
CS:APP is an excellent material for learning computer systems and systems programming

CS:APP is an excellent material for learning computer systems and systems programming. However, it is inconvenient to use a virtual machine for self-learners. In this repo, I build a Docker image with most pre-requistes installed and attached all lab materials in it.

Guochao Xie 47 Jun 1, 2022