ESE is an embedded, ISAM-based database engine that provides rudimentary table and indexed access.

Overview

Extensible-Storage-Engine

A Non-SQL Database Engine

The Extensible Storage Engine (ESE) is one of those rare codebases that has proven to have a serviceable lifetime of more than 25 years. It first shipped in Windows NT 3.51 and shortly thereafter in Exchange 4.0, was rewritten twice in the 90s, and was heavily updated over the subsequent two decades. It remains a core Microsoft asset to this day.

  • It runs on hundreds of thousands of machines and millions of disks for the Office 365 Mailbox Storage Backend servers
  • It also runs on large SMP systems with TB of memory for large Active Directory deployments
  • Every single Windows Client computer has several database instances running in low memory modes. ESE has been in use in Windows client SKUs since Windows XP and runs on over 1 billion Windows 10 devices today

ESE enables applications to store data in, and retrieve data from, tables using indexed or sequential cursor navigation. It supports denormalized schemas including wide tables with numerous sparse columns, multi-valued columns, and sparse and rich indexes. ESE gives applications a consistent data state through transacted data update and retrieval. A crash recovery mechanism maintains data consistency even in the event of a system crash. ESE provides ACID (Atomic, Consistent, Isolated, Durable) transactions over data and schema by way of a write-ahead log and a snapshot isolation model.
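
To make the write-ahead-log idea a little more concrete, here is a toy C++ sketch. It is not ESE's implementation and has nothing to do with ESE's log format; `LogRecord`, `Recover`, and `RecoverAfterSimulatedCrash` are hypothetical names invented for this illustration. It only shows the general principle that updates are logged before they count, and that recovery keeps the work of transactions whose commit record reached the log.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy log record; the layout is hypothetical and unrelated to ESE's log format.
struct LogRecord {
    enum class Kind { Set, Commit } kind;
    int txn;
    std::string key;
    std::string value;
};

using Table = std::map<std::string, std::string>;

// Rebuilds table state from the log, applying only updates whose
// transaction also has a Commit record in the log.
Table Recover(const std::vector<LogRecord>& log) {
    std::set<int> committed;
    for (const LogRecord& r : log) {
        if (r.kind == LogRecord::Kind::Commit) committed.insert(r.txn);
    }

    Table table;
    for (const LogRecord& r : log) {
        if (r.kind == LogRecord::Kind::Set && committed.count(r.txn) != 0) {
            table[r.key] = r.value;
        }
    }
    return table;
}

// Simulates a crash: transaction 1 commits, transaction 2 is cut off
// before its Commit record is written.
Table RecoverAfterSimulatedCrash() {
    std::vector<LogRecord> log = {
        {LogRecord::Kind::Set, 1, "alice", "inbox=3"},
        {LogRecord::Kind::Set, 1, "bob", "inbox=7"},
        {LogRecord::Kind::Commit, 1, "", ""},
        {LogRecord::Kind::Set, 2, "alice", "inbox=99"},
    };
    return Recover(log);
}
```

In this toy, the half-finished update from transaction 2 is simply ignored on recovery, so readers never observe a partially applied transaction. A real engine also has to handle pages already flushed to disk, checkpoints, and concurrent readers, none of which is modeled here.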

The library provides many other strongly layered and, thus, reusable sub-facilities as well:

  • A synchronization and locking library
  • An STL-like data structures library
  • An OS abstraction layer
  • A Block / Cache Manager

All this is in addition to the full-blown database engine itself.

The source we post here will likely be slightly ahead of the version compiled into the latest Windows update, so the JET API documentation may lag behind it.

Is this the JET database / engine?

No. Well ... it depends ... the question is not quite correct. Most people do not know that JET was an acronym for an API set, not a specific database format or engine. Just as there is no such thing as "the SQL engine", because there are many implementations of the protocol, there is no "JET engine" or "JET database". It is in the acronym: "Joint Engine Technology". As such, there are two separate implementations of the JET API. This is the JET Blue engine implementation (see Notes in here). The origin of the colors has an amusing source, by the way. Most people think of the "JET engine" as JET Red, which shipped under Microsoft Access. This is not that "JET engine". We renamed it to ESE to try to avoid this confusion, but it seems that the confusion continues to this day.

Future Plans

Comments

You may notice the initial code is without comments! This codebase has a long history of internal development at Microsoft, so, in order to stay on the safe side with the very first release of the source code, we have temporarily removed all comments and excluded certain file types. We will be pushing enhanced and cleaned up comments as we are able to review them.

CMake

We will also be pushing build files, codegen scripts, and a little more infrastructure to get ESE building. Right now, the code is provided for instructional purposes only.

Tests

We are initially withholding the test code, and, as with the comments and the codegen scripts, we will be gradually releasing the tests, as well as adding Azure pipelines to run them.

Issues
  • CMakeLists: fix dependency on Microsoft-ETW-ESE.man

    The CMakeLists.txt configuration in src\_etw specifies a non-existent dependency when calling the message compiler for Microsoft-ETW-ESE.man.

    This causes the following error during rebuild in my environment, most likely due to an incorrect build order:

    Generating ../../../../gen/_etw/Microsoft-ETW-ESE.h, ../../../../gen/_etw/Microsoft-ETW-ESE.rc
    mc : error : 0x2 trying to open file <[...]/build/gen/_etw/Microsoft-ETW-ESE.man>.
    

    This PR fixes it by making the command depend on Microsoft-ETW-ESE.man.

    opened by evgenykotkov 2
  • cpage_test.cxx: Fix build error caused by mismatching defines

    The PageValidationBig() test uses #ifdef DEBUG and #ifndef RTM as if they were interchangeable, but they are not, at least in the RelWithDebInfo CMake configuration.

    This currently causes the following errors when building the RelWithDebInfo configuration:

    error C2065: 'fPreviouslySet': undeclared identifier
    

    This PR fixes it by consistently using #ifndef RTM, as it's already being used in the other similar tests in cpage_test.cxx.

    (Perhaps, the RelWithDebInfo configuration could use the same defines as Release, but the code discrepancy seems to be worth fixing by itself.)

    opened by evgenykotkov 1
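
The guard mismatch described in the cpage_test.cxx pull request above can be sketched in a few lines of C++. This is a reconstruction under assumptions, not the actual test code: it assumes that a RelWithDebInfo-style build defines neither DEBUG nor RTM, and that the variable was declared under one guard and used under the other. `GuardedFlagSketch` is a hypothetical function name.

```cpp
#include <cassert>

// Assumption for this sketch: neither DEBUG nor RTM is defined,
// as in a RelWithDebInfo-style build.

// Broken pattern (kept as a comment so the sketch still compiles):
//   #ifdef DEBUG
//       bool fPreviouslySet = false;   // declared only when DEBUG is defined
//   #endif
//   #ifndef RTM
//       fPreviouslySet = true;         // used whenever RTM is NOT defined
//   #endif
// With both macros undefined, the use survives preprocessing but the
// declaration does not, which yields "undeclared identifier".

// Consistent pattern: the declaration and the use share the same guard.
bool GuardedFlagSketch() {
#ifndef RTM
    bool fPreviouslySet = false;
#endif
    bool result = false;
#ifndef RTM
    fPreviouslySet = true;
    result = fPreviouslySet;
#endif
    return result;
}
```

With RTM undefined, `GuardedFlagSketch()` returns true; with RTM defined, both guarded regions disappear together and it returns false, so either configuration compiles.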
  • Big treasure.

    Happy to see Microsoft open-source this good database engine.

    I hope you continue to open-source more and more projects.

    Very good move to open it up for all developers.

    We can all benefit from your open-source codebase.

    opened by netroby 1
  • Fix an out-of-bounds write in CResource::ErrGetParam()

    The current implementation of the CResource::ErrGetParam() method may result in an out-of-bounds write under certain circumstances.

    This method unconditionally sets *pdwParam to 0, where pdwParam is a DWORD_PTR. On x64, this translates into an 8-byte write to the memory location pointed to by pdwParam. However, for the JET_resoperTag param, pdwParam is allowed to point to a smaller buffer, because the tag's length (JET_resTagSize) is 4 bytes.

    For example, CResource::PvAlloc_() passes a buffer of size JET_resTagSize + 1 (= 5). So the remaining part of the 8-byte write happens into a memory location after the allocated buffer.

    This particular out-of-bounds write has been confirmed to not have any actual security implications, so I am posting this as a public PR.

    The attached patch fixes the problem by adding the same check as in the CResourceManager::ErrGetParam() method.

    opened by evgenykotkov 0
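
The size mismatch described in the CResource::ErrGetParam() pull request above can be illustrated with a small C++ sketch. It is not the actual patch, and it does not reproduce the check used in CResourceManager::ErrGetParam(). `kTagSizeSketch`, `DwordPtrSketch`, `ClearOutParamSketch`, and `TagBufferStaysIntact` are hypothetical stand-ins. The point is only that an output buffer should be cleared according to the size the caller actually provided, not according to the widest type it might hold.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-ins; the real JET_resTagSize and DWORD_PTR come from
// the ESE and Windows headers.
constexpr std::size_t kTagSizeSketch = 4;
using DwordPtrSketch = std::uintptr_t;  // 8 bytes on typical 64-bit targets

// Zero-initializes an output buffer without writing past cbOut bytes.
// Returns the number of bytes that were cleared.
std::size_t ClearOutParamSketch(void* pvOut, std::size_t cbOut, bool isTagParam) {
    const std::size_t cbWanted = isTagParam ? kTagSizeSketch : sizeof(DwordPtrSketch);
    const std::size_t cbClear = cbWanted < cbOut ? cbWanted : cbOut;
    std::memset(pvOut, 0, cbClear);
    return cbClear;
}

// Mirrors the 5-byte tag buffer from the description and verifies that
// nothing beyond the 4 tag bytes is modified.
bool TagBufferStaysIntact() {
    unsigned char buffer[kTagSizeSketch + 1 + sizeof(DwordPtrSketch)];
    std::memset(buffer, 0xAB, sizeof(buffer));
    ClearOutParamSketch(buffer, kTagSizeSketch + 1, /*isTagParam=*/true);
    for (std::size_t i = 0; i < kTagSizeSketch; ++i) {
        if (buffer[i] != 0) return false;      // tag bytes must be cleared
    }
    for (std::size_t i = kTagSizeSketch; i < sizeof(buffer); ++i) {
        if (buffer[i] != 0xAB) return false;   // everything else must be untouched
    }
    return true;
}
```

An unconditional `*pdwParam = 0` corresponds to always clearing `sizeof(DwordPtrSketch)` bytes, which on a 64-bit target would overrun the 5-byte buffer by 3 bytes.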
  • CMake project for ESE, plus unit tests

    This PR includes:

    • CMakeLists.txt files necessary to build the core ESE libraries
    • Additional files to make building possible, such as the resource files, codegen scripts, etc.
    • Unit tests
      • the test subtree
      • there are some in the dev subtree: see embeddedunittest
    • The initial BUILDING.md guide
    opened by 2BitSalute 0
  • Create CONTRIBUTING.md and update README.md accordingly

    This moves the contribution section into its own file and adds more details about our current contribution policy. I hope the wording makes it clear that we can accept, e.g., PRs fixing typos in README.md.

    I'm also adding the blurb about the CLA and the process of signing it.

    opened by 2BitSalute 0
  • Migrate FabricBot Tasks to Config-as-Code

    TL;DR; Requesting to add FabricBot configuration associated with your repository to .github/fabricbot.json.

    Context

    FabricBot is now a config-as-code-only platform. As a result, while you can still use the FabricBot Configuration Portal to modify your FabricBot configuration, you can no longer save the changes. The only way to save changes to your configuration at the moment is to export configuration from the portal and upload the exported configuration to .github/fabricbot.json in your repository. In this pull request, we are adding your FabricBot configuration to your repository at .github/fabricbot.json so that you can make changes to it going forward.

    While the FabricBot Configuration Portal is the only way to modify your FabricBot configuration at the moment, we have a feature on our backlog to publish the JSON schema defining the structure of the FabricBot configuration file. With the JSON schema, you can (1) use a plaintext editor of your choice to modify the FabricBot configuration file and use the schema to validate the file after editing or (2) configure VS Code to use the schema when editing FabricBot configuration file to take advantage of convenience features such as automatic code completion and field description on mouseover.

    Pull Request Create, a MerlinBot Extension, was used to automatically create this pull request. If you have any questions or concerns with this pull request, please contact MerlinBot Expert DRI.

    opened by msftbot[bot] 0
  • Why does the _TLS instance ptls need to duplicate OSTLS's memory?

    https://github.com/microsoft/Extensible-Storage-Engine/blob/104bfcefaeb946403c43e87aecc1d17d7d7441c9/dev/ese/src/os/thread.cxx#L173

    ostls in _TLS is a struct, not a pointer, so sizeof(_TLS) already contains sizeof(OSTLS). Why does the _TLS instance ptls need to duplicate OSTLS's memory? I tried deleting sizeof(OSTLS) and running the sample, and everything seems fine.

    opened by fuxiuyin 0
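
The language rule behind the question above can be checked with a short C++ sketch. `OstlsSketch` and `TlsSketch` are hypothetical stand-ins, not the real OSTLS and _TLS definitions, and the sketch says nothing about whether the extra bytes in the real allocation serve some other purpose. It only shows that a member embedded by value is already counted in the enclosing struct's size.

```cpp
#include <cassert>
#include <cstddef>

struct OstlsSketch {        // hypothetical stand-in for OSTLS
    void* pvContext;
    unsigned long threadId;
};

struct TlsSketch {          // hypothetical stand-in for _TLS
    OstlsSketch ostls;      // embedded by value, so its bytes live inside TlsSketch
    TlsSketch* ptlsNext;
};

// The enclosing struct is at least as large as the sum of its members.
static_assert(sizeof(TlsSketch) >= sizeof(OstlsSketch) + sizeof(TlsSketch*),
              "a by-value member is already counted in sizeof(TlsSketch)");

// Bytes allocated beyond the struct itself if sizeof(OstlsSketch) is added
// on top of sizeof(TlsSketch) when sizing an allocation.
constexpr std::size_t ExtraBytesIfAddedAgain() {
    return (sizeof(TlsSketch) + sizeof(OstlsSketch)) - sizeof(TlsSketch);
}
```
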
  • Reimplement CSemaphore using WaitOnAddress() for better performance under contention

    Some time ago I found out that ESE's write throughput is severely capped by the number of concurrent writers.

    The attached jet_concurrent_write_stall.zip example illustrates this. Detailed benchmarks are below, but in short, all cases with more than 5 concurrent writers run slower than a single writer.

    Apparently, the throughput is limited by the current implementation of CSemaphore. The existing implementation uses a spinlock that falls back to a (slow!) kernel semaphore. When contention is high and cannot be alleviated by the spinlock, this approach fails to deliver adequate throughput. CSemaphore is a building block for other primitives such as CCriticalSection, which are in turn used in multiple performance-critical parts such as the log buffer writer.

    This PR reimplements CSemaphore using WaitOnAddress() / WakeByAddress() as the underlying primitive.

    The new implementation delivers up to a 20x improvement in the synthetic benchmark. Since CSemaphore is used as a building block for other synchronization primitives, such as CCriticalSection, this translates into up to a ~5.4x throughput improvement in a benchmark with multiple concurrent db writers.

    The benchmarks below were conducted in an environment with an i9-9900K CPU and an NVMe SSD.

    Synthetic benchmark based on the adjusted semaphoreperf.cpp test:

    1 thread:    93850 KOps/Thread  →  93950 KOps/Thread
    2 threads:   20445 KOps/Thread  →  30165 KOps/Thread  (+48 %)
    3 threads:    6083 KOps/Thread  →  10897 KOps/Thread  (+79 %)
    8 threads:     102 KOps/Thread  →   2058 KOps/Thread  (+1917 %)
    16 threads:     45 KOps/Thread  →    719 KOps/Thread  (+1498 %)
    

    Real benchmark with multiple concurrent db writers:

    (The numbers represent the amount of complete "write sequences" and are meaningful only as a relative measurement of throughput)

    1 writer:     8443 Sequences  →   8562 Sequences
    2 writers:   14934 Sequences  →  15331 Sequences
    3 writers:   17434 Sequences  →  17386 Sequences
    4 writers:   16583 Sequences  →  17997 Sequences
    5 writers:   15924 Sequences  →  17431 Sequences
    6 writers:    8437 Sequences  →  17223 Sequences  (+104 %)
    7 writers:    5020 Sequences  →  18145 Sequences  (+261 %)
    8 writers:    4131 Sequences  →  17670 Sequences  (+339 %)
    9 writers:    4168 Sequences  →  17438 Sequences  (+318 %)
    10 writers:   3670 Sequences  →  19709 Sequences  (+437 %)
    

    The patch passes all relevant tests in my environment. CSemaphore is thoroughly tested in semaphore.cxx and indirectly by various other tests.

    opened by evgenykotkov 2
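
The wait-on-address approach from the CSemaphore pull request above can be approximated in portable C++. The sketch below is not the code from that pull request and does not call the Windows API. It uses `std::atomic<int>::wait` and `notify_one` from C++20, which offer a similar "sleep until this memory location changes" model, and falls back to yielding when compiled as pre-C++20. `SemaphoreSketch` and `RunContendedCounter` are hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical counting semaphore for illustration; this is not ESE's CSemaphore.
class SemaphoreSketch {
public:
    explicit SemaphoreSketch(int initial) : count_(initial) {}

    void Acquire() {
        int current = count_.load(std::memory_order_relaxed);
        for (;;) {
            if (current > 0) {
                // Fast path: take a token with a single compare-and-swap.
                if (count_.compare_exchange_weak(current, current - 1,
                                                 std::memory_order_acquire,
                                                 std::memory_order_relaxed)) {
                    return;
                }
                continue;  // the failed CAS refreshed `current`
            }
            WaitWhileZero();
            current = count_.load(std::memory_order_relaxed);
        }
    }

    void Release() {
        count_.fetch_add(1, std::memory_order_release);
        WakeOne();
    }

private:
    void WaitWhileZero() {
#if defined(__cpp_lib_atomic_wait)
        count_.wait(0, std::memory_order_relaxed);  // blocks while the value is 0
#else
        std::this_thread::yield();  // pre-C++20 fallback: no address-wait available
#endif
    }

    void WakeOne() {
#if defined(__cpp_lib_atomic_wait)
        count_.notify_one();
#endif
    }

    std::atomic<int> count_;
};

// Uses the semaphore as a lock around a plain counter and returns the final value.
long RunContendedCounter(int threads, int iterations) {
    SemaphoreSketch gate(1);
    long counter = 0;
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                gate.Acquire();
                ++counter;
                gate.Release();
            }
        });
    }
    for (std::thread& w : workers) w.join();
    return counter;
}
```

The uncontended path never leaves user mode, and a waiter sleeps directly on the counter's address instead of on a separate kernel object. Whether this matches the performance characteristics reported above would have to be measured; the sketch makes no such claim.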