
Overview


What is YugabyteDB?

YugabyteDB is a high-performance, cloud-native distributed SQL database that aims to support all PostgreSQL features. It is a best fit for cloud-native OLTP (i.e., real-time, business-critical) applications that need absolute data correctness and require at least one of the following: scalability, high tolerance to failures, or globally-distributed deployments.

The core features of YugabyteDB include:

  • Powerful RDBMS capabilities Yugabyte SQL (YSQL for short) reuses the query layer of PostgreSQL (similar to Amazon Aurora PostgreSQL), thereby supporting most of its features (datatypes, queries, expressions, operators and functions, stored procedures, triggers, extensions, etc). Here is a detailed list of features currently supported by YSQL.

  • Distributed transactions The transaction design is based on the Google Spanner architecture. Strong consistency of writes is achieved by using Raft consensus for replication and cluster-wide distributed ACID transactions using hybrid logical clocks. Snapshot and serializable isolation levels are supported. Reads (queries) have strong consistency by default, but can be tuned dynamically to read from followers and read-replicas.

  • Continuous availability YugabyteDB is extremely resilient to common outages with native failover and repair. YugabyteDB can be configured to tolerate disk, node, zone, region, and cloud failures automatically. For a typical deployment where a YugabyteDB cluster is deployed in one region across multiple zones on a public cloud, the RPO is 0 (meaning no data is lost on failure) and the RTO is 3 seconds (meaning the data being served by the failed node is available in 3 seconds).

  • Horizontal scalability Scaling a YugabyteDB cluster to achieve more IOPS or data storage is as simple as adding nodes to the cluster.

  • Geo-distributed, multi-cloud YugabyteDB can be deployed in public clouds and natively inside Kubernetes. It supports deployments that span three or more fault domains, such as multi-zone, multi-region, and multi-cloud deployments. It also supports xCluster asynchronous replication with unidirectional master-slave and bidirectional multi-master configurations that can be leveraged in two-region deployments. To serve (stale) data with low latencies, read replicas are also a supported feature.

  • Multi-API design The query layer of YugabyteDB is built to be extensible. Currently, YugabyteDB supports two distributed SQL APIs: Yugabyte SQL (YSQL), a fully relational API that reuses the query layer of PostgreSQL, and Yugabyte Cloud QL (YCQL), a semi-relational, SQL-like API with document and indexing support and Apache Cassandra QL roots.

  • 100% open source YugabyteDB is fully open-source under the Apache 2.0 license. The open-source version has powerful enterprise features such as distributed backups, encryption of data-at-rest, in-flight TLS encryption, change data capture, read replicas, and more.
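The hybrid logical clocks mentioned under distributed transactions can be sketched in a few lines. The following is an illustrative toy of the general HLC idea, not YugabyteDB's actual implementation: each timestamp is a pair of a wall-clock component `l` and a counter `c` that breaks ties whenever the physical clock has not advanced.

```python
# Minimal hybrid logical clock (HLC) sketch -- illustrative only,
# not YugabyteDB's implementation. Timestamps are (l, c) pairs.
class HLC:
    def __init__(self, physical_clock):
        self.now = physical_clock  # callable returning physical time
        self.l = 0                 # wall-clock component
        self.c = 0                 # logical counter component

    def send(self):
        """Timestamp a local or send event."""
        pt = self.now()
        if pt > self.l:
            self.l, self.c = pt, 0
        else:
            self.c += 1            # clock hasn't advanced; bump counter
        return (self.l, self.c)

    def receive(self, ml, mc):
        """Merge a timestamp (ml, mc) carried by an incoming message."""
        pt = self.now()
        new_l = max(self.l, ml, pt)
        if new_l == self.l == ml:
            self.c = max(self.c, mc) + 1
        elif new_l == self.l:
            self.c += 1
        elif new_l == ml:
            self.c = mc + 1
        else:
            self.c = 0             # physical clock dominates; reset counter
        self.l = new_l
        return (self.l, self.c)
```

Ordering timestamps lexicographically by (l, c) yields a total order consistent with causality while keeping l close to physical time, which is what lets reads pick a consistent snapshot across nodes.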

Read more about YugabyteDB in our Docs.

Get Started

Cannot find what you are looking for? Have a question? Please post your questions or comments on our Community Slack or Forum.

Build Apps

YugabyteDB supports several languages and client drivers. Below is a brief list.

| Language | ORM | YSQL Drivers | YCQL Drivers |
| -------- | --- | ------------ | ------------ |
| Java | Spring/Hibernate | PostgreSQL JDBC | cassandra-driver-core-yb |
| Go | Gorm | pq | gocql |
| NodeJS | Sequelize | pg | cassandra-driver |
| Python | SQLAlchemy | psycopg2 | yb-cassandra-driver |
| Ruby | ActiveRecord | pg | yugabyte-ycql-driver |
| C# | EntityFramework | npgsql | CassandraCSharpDriver |
| C++ | Not tested | libpqxx | cassandra-cpp-driver |
| C | Not tested | libpq | Not tested |
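Because YSQL speaks the PostgreSQL wire protocol, any of the drivers above work unmodified. A hypothetical Python sketch using psycopg2 (the host, port, and credentials are assumptions; adjust them for your cluster, noting that YSQL listens on port 5433 by default):

```python
# Hypothetical example of connecting to YSQL with psycopg2, the Python
# YSQL driver listed above. Host/port/credentials are assumptions.

def ysql_dsn(host="127.0.0.1", port=5433, dbname="yugabyte",
             user="yugabyte", password="yugabyte"):
    """Build a libpq-style connection string for a YSQL endpoint."""
    return (f"host={host} port={port} dbname={dbname} "
            f"user={user} password={password}")

def run_sample_query(dsn):
    """Run a plain PostgreSQL statement against YSQL (needs a live cluster)."""
    import psycopg2  # pip install psycopg2-binary
    with psycopg2.connect(dsn) as conn:
        with conn.cursor() as cur:
            cur.execute("SELECT version()")
            return cur.fetchone()[0]

# Against a running cluster:
#   print(run_sample_query(ysql_dsn(host="127.0.0.1")))
```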

What's being worked on?

This section was last updated in March 2021.

Current roadmap

Here is a list of some of the key features being worked on for the upcoming releases (the v2.7 "latest" release is expected in April 2021; the v2.4 "stable" release shipped in January 2021).

| Feature | Status | Release Target | Progress | Comments |
| ------- | ------ | -------------- | -------- | -------- |
| Automatic tablet splitting enabled by default | PROGRESS | v2.6, v2.7 | Track | |
| Point-in-time restores | PROGRESS | v2.6, v2.7 | Track | |
| YSQL: table statistics and CBO | PROGRESS | v2.7 | Track | |
| Support GIN indexes | PROGRESS | v2.7 | Track | |
| [YSQL] Feature support - ALTER TABLE | PROGRESS | v2.6, v2.7 | Track | |
| Online schema migration | PROGRESS | v2.7 | Track | |
| Support Liquibase, Flyway, ORM schema migrations | PROGRESS | v2.7 | | |
| Support Spark 3 on YCQL | DONE | v2.5, v2.6 | Track | |
| Incorporate PostgreSQL 12 features | PLANNING | v2.7 | Track | |
| Improving day-2 operations of Yugabyte Platform | PROGRESS | v2.5 | Track | |
| Row-level geo-partitioning | PROGRESS | v2.7 | Track | Enhance YSQL language support |
| Improve TPC-C benchmarking | PROGRESS | v2.7 | Track | |
| Transparently restart transactions | PROGRESS | v2.5 | Track | Decrease the incidence of transaction restart errors seen in various scenarios |
| Pessimistic locking | PROGRESS | v2.7 | Track | |

Planned additions to the roadmap

The following items are being planned as additions to the roadmap.

| Feature | Status | Release Target | Progress | Comments |
| ------- | ------ | -------------- | -------- | -------- |
| Support pgloader to migrate from MySQL | PLANNING | | Track | |
| Make COLOCATED tables default for YSQL | PLANNING | | Track | |
| Support Kafka as source and sink | PLANNING | | | Support source and sink for both YSQL and YCQL |
| Support for transactions in async xCluster replication | PLANNING | | Track | Apply transactions atomically on the consumer cluster |

Recently released features

| Feature | Status | Release Target | Docs / Enhancements | Comments |
| ------- | ------ | -------------- | ------------------- | -------- |
| Support ALTER TABLE add primary key | PROGRESS | v2.6, v2.7 | Track | |
| Identity and access management in YSQL | DONE | v2.5 | Track | LDAP and Active Directory support |
| Follower reads in YSQL | BETA | v2.5 | Issue | Ability to perform follower reads for YSQL and for transactional tables in YCQL |
| YSQL cluster administration - node-level statistics | DONE | v2.5 | Issue | Per-node view of currently active queries: find which queries are slow, what active connections are doing, etc. |
| Support loading large data sets into YSQL using COPY | DONE | v2.5 | Issue | Improves transactions with a very large number of operations and provides options to batch-load data more efficiently |
| Database runtime activity monitoring | DONE | v2.5 | Issue | Activity monitoring, audit logging, inactivity monitoring |
| Online rebuild of indexes | DONE | v2.2 | Docs coming soon; see pending enhancements | |
| DEFERRED constraints in YSQL | DONE | v2.2 | Docs coming soon; see pending enhancements | |
| COLOCATED tables GA | DONE | v2.2 | Docs coming soon | |
| Online schema migration framework | DONE | v2.2 | | Framework implementation only; see planned enhancements in this area |
| Distributed backups for transactional tables | DONE | v2.2 | Docs coming soon; see pending enhancements | |
| IPv6 support for YugabyteDB | DONE | v2.2 | Docs coming soon | |
| Automatic tablet splitting | BETA | v2.2 | Docs | See further enhancements |
| Change data capture | BETA | | | This feature is currently available but in beta |
| xCluster replication (async cross-cluster replication) | DONE | v2.1 | Docs | |
| Encryption of data at rest | DONE | v2.1 | Docs | |

Architecture

YugabyteDB Architecture

Review detailed architecture in our Docs.

Need Help?

Contribute

As an open-source project with a strong focus on the user community, we welcome contributions as GitHub pull requests. See our Contributor Guides to get going. Discussions and RFCs for features happen in the design discussions section of our Forum.

License

Source code in this repository is variously licensed under the Apache License 2.0 and the Polyform Free Trial License 1.0.0. A copy of each license can be found in the licenses directory.

The build produces two sets of binaries:

  • The entire database, with all its features (including the enterprise ones), is licensed under the Apache License 2.0.
  • The binaries that contain -managed in the artifact name and help run a managed service are licensed under the Polyform Free Trial License 1.0.0.

By default, the build options generate only the Apache License 2.0 binaries.

Read More

Comments
  • [#7349] Platform: Support for multiple backup configs.

    Description: Currently, the platform does not support multiple backup configs for any of the backup config types (viz. S3, NFS, GCS, Azure). This feature will let users create multiple backup configs and edit/delete/show the linked universes.

    We are adding one column, config_name, to store the name of the config.

    In case of any migration failure, run

        alter table customer_config drop column config_name;

    to go back to a stable previous state.

    Test Plan:

    1. Go to configs and open the backup tab.
    2. Click the create backup button and fill in all the required details.
    3. Click the save button; you are redirected to the backup config page view.
    4. You'll see an action button for each backup config, with Edit/Delete/Show Universes options.
    5. Use these options to perform the respective task.
    6. Take some backups for S3 and NFS.

    (Screenshot: Create_NFS, creating an NFS backup config.)

    [info] Test com.yugabyte.yw.commissioner.TaskGarbageCollectorTest.testPurge_invalidData started
    [info] Test run finished: 0 failed, 0 ignored, 7 total, 0.138s
    [error] Failed: Total 1359, Failed 1, Errors 0, Passed 1354, Skipped 4
    [error] Failed tests:
    [error] 	com.yugabyte.yw.commissioner.tasks.RemoveNodeFromUniverseTest
    [error] (Test / test) sbt.TestsFailedException: Tests unsuccessful
    [error] Total time: 2080 s (34:40), completed 2 Jun, 2021 5:44:57 PM
    [email protected] managed % sbt "testOnly com.yugabyte.yw.commissioner.tasks.RemoveNodeFromUniverseTest"
    [info] welcome to sbt 1.3.13 (AdoptOpenJDK Java 1.8.0_292)
    [info] loading settings for project managed-build from plugins.sbt ...
    [info] loading project definition from /Users/mahebhat/Projects/yugabyte/yugabyte-db/managed/project
    [info] loading settings for project root from ui-build.sbt,build.sbt ...
    [info] set current project to yugaware (in build file:/Users/mahebhat/Projects/yugabyte/yugabyte-db/managed/)
    [Yugabyte sbt log] [Resolver] Maven cache server (such as Nexus or Artifactory), specified by YB_MVN_CACHE_URL: List()
    [Yugabyte sbt log] [Resolver] Local resolver (enabled by USE_MAVEN_LOCAL, path can be customized with YB_MVN_LOCAL_REPO): List(cache:Maven2 Local: /Users/mahebhat/.m2/repository)
    [Yugabyte sbt log] [Resolver] Default resolver: Vector(FileRepository(local, Patterns(ivyPatterns=Vector(${ivy.home}/local/[organisation]/[module]/(scala_[scalaVersion]/)(sbt_[sbtVersion]/)([branch]/)[revision]/[type]s/[artifact](-[classifier]).[ext]), artifactPatterns=Vector(${ivy.home}/local/[organisation]/[module]/(scala_[scalaVersion]/)(sbt_[sbtVersion]/)([branch]/)[revision]/[type]s/[artifact](-[classifier]).[ext]), isMavenCompatible=false, descriptorOptional=false, skipConsistencyCheck=false), FileConfiguration(true, None)), public: https://repo1.maven.org/maven2/)
    [Yugabyte sbt log] [Resolver] Snapshot resolver for yb-client jar (used when USE_MAVEN_LOCAL is not set, mostly during local development, configured with YB_MVN_SNAPSHOT_URL): List()
    [info] Instrumenting 737 classes to /Users/mahebhat/Projects/yugabyte/yugabyte-db/managed/target/scala-2.12/jacoco/instrumented-classes
    [info] Test run started
    [info] Test com.yugabyte.yw.commissioner.tasks.RemoveNodeFromUniverseTest.testRemoveNodeSuccess started
    [info] Test com.yugabyte.yw.commissioner.tasks.RemoveNodeFromUniverseTest.testRemoveNodeWithMaster started
    [info] Test com.yugabyte.yw.commissioner.tasks.RemoveNodeFromUniverseTest.testRemoveNonExistentNode started
    [info] Test com.yugabyte.yw.commissioner.tasks.RemoveNodeFromUniverseTest.testRemoveNodeRF5 started
    [info] Test com.yugabyte.yw.commissioner.tasks.RemoveNodeFromUniverseTest.testRemoveNodeWithNoDataMoveRF5 started
    [info] Test com.yugabyte.yw.commissioner.tasks.RemoveNodeFromUniverseTest.testRemoveNodeWithNoDataMove started
    [info] Test com.yugabyte.yw.commissioner.tasks.RemoveNodeFromUniverseTest.testRemoveUnknownNode started
    [info] Test run finished: 0 failed, 0 ignored, 7 total, 26.982s
    [info] Passed: Total 7, Failed 0, Errors 0, Passed 7
    [success] Total time: 31 s, completed 2 Jun, 2021 5:45:51 PM
    [email protected] managed %
    opened by nishantSharma459 35
  • Very poor performance in comparison to Postgresql

    The test with YugabyteDB 2.9.0 performs about 300 times slower than Postgres :( with rf=1 and 1 master/tablet server, with rf=3 and 3 masters/tablet servers, and with rf=1, one master, and 3 tablet servers.

    I have a simple test schema: basically a user with UUID primary can subscribe any number of channels also with UUID primary key.


    The script create_data.py spawns Python processes, each with its own connection; each process creates a user and a channel, and then subscribes the user to a random number of previously created channels (ranging from 1 to a few thousand); channel_ids are kept in memory for performance.

    (I attached all the test python files in this gist The schema is created here. )

    I run either python create_data.py YUGABYTE_URL 100 (run on Yugabyte, with 100 parallel connections using 100 Python processes) or python create_data.py POSTGRES_URL 100 (same, on Postgres). (Connection URLs are kept in .env file and loaded with dotenv)

    PostgreSQL achieves about 40,000 subscriptions per second, while YugaByte can't get any higher than ~150 subscriptions/second.

    Disabling indexes does not help. Adding tablets does not help.

    Restarting is done with python create_data.py YUGABYTE_URL --drop, then re-init with python create_data.py YUGABYTE_URL --init.
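    The per-process workload described above might look roughly like this (a hypothetical reconstruction; the actual scripts are in the linked gist, and the function and table names here are mine):

```python
import random
import uuid

# Hypothetical reconstruction of one step of create_data.py's loop:
# one new user (UUID key) plus rows subscribing it to a random number
# of existing channels, from 1 up to a few thousand.
def make_batch(channel_ids, rng=random):
    user_id = str(uuid.uuid4())
    n = rng.randint(1, min(3000, len(channel_ids)))
    return user_id, [(user_id, cid) for cid in rng.sample(channel_ids, n)]
```

    Each worker process would then open its own connection and insert the rows, e.g. with psycopg2's `cur.executemany("INSERT INTO subscriptions (user_id, channel_id) VALUES (%s, %s)", rows)`.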

    The worst part is that tablet servers barely show any load:

    (Screenshot: tablet server load graphs, nearly flat.)

    No other processes are running. I run YugaByte 2.9.0 as shown in either rf1-master.sh, rf1-tablet.sh (for replication factor 1) or using rf3-master{1,2,3}.sh rf3-tablet{1,2,3}.sh (obviously each one on their own server).

    The servers have 16 vCPUs and 30G memory (about 1G used). All servers have SSDs. Ubuntu 20.04.

    The load (create_data.py) is created from the 4th machine.

    All are connected using private IPs, and the UI correctly shows 3 masters and 3 tablet servers (or 1 master and 1/3 tablet servers in the rf=1 case). The *.WARNING logs don't show anything interesting.

    With Python when I go from 1 connection to 100 connections, I get a proportional increase up until about 100 processes, where I get a sustained rate of ~40,000 subscriptions per second...

    (Throughput screenshots for Postgres and YugaByte 2.9.0 omitted.)

    With YugaByte I get about 110 subscriptions/sec with 1 connection and about 150/sec with 3 connections, and then it plateaus: it doesn't matter if I have 100 connections, or even if I connect to random nodes. It also does not matter whether the connection stays open or is reopened for each "batch" of ten users.

    No transactions are performed.

    The situation is the same with rf=1 and tablet_servers=3; there's really no benefit in running 2 more servers. The worst part is that they are all basically idle:

    (Screenshot: idle tablet servers.)

    The weird part is that inserting 10 random users takes about the same time on Postgres and YugaByteDB: 1-5 seconds on YugaByteDB, 0.4-2.5 seconds on Postgres... But the concurrency on Postgres is much higher.


    Is this expected? Am I missing something here?

    Thank you!

    area/ysql 
    opened by slava-vishnyakov 32
  • [YSQL] invalid locale name "en_US.UTF-8"

    [EDIT] See @mbautin 's update here: https://github.com/YugaByte/yugabyte-db/issues/3732#issuecomment-653261887

    Root Cause: << When the installation directory is a symlink, post_install.sh currently is skipping a major part of what it is supposed to be doing (replacing substrings in files to point to the actual installation location). >>


    Manually setting up YB 2.0.11 tservers on Ubuntu 16.04 on AWS and getting the following errors:

    In YugaByte DB, setting LC_COLLATE to C and all other locale settings to en_US.UTF-8 by default. Locale support will be enhanced as part of addressing https://github.com/YugaByte/yugabyte-db/issues/1557
    initdb: invalid locale name "en_US.UTF-8"
    

    I've followed the advice and troubleshooting from the following links and github issues:

    • https://github.com/yugabyte/yugabyte-db/issues/2000#issuecomment-520945181
    • https://github.com/yugabyte/yugabyte-db/issues/1557
    • https://askubuntu.com/questions/76013/how-do-i-add-locale-to-ubuntu-server
    • https://www.digitalocean.com/community/questions/language-problem-on-ubuntu-14-04
    • https://stackoverflow.com/questions/17712700/postgres-locale-error
    • https://www.postgresql.org/docs/9.4/locale.html
    • etc.

    I've also tried performing variations on the following to ensure the locale of en_US.UTF-8 is installed on the machines:

    locale-gen en_US.UTF-8
    dpkg-reconfigure locales --frontend=noninteractive
    update-locale LC_ALL="en_US.UTF-8" LANG="en_US.UTF-8"
    

    and unfortunately still getting the invalid locale name error.

    Here are the results of locale -a:

    C
    C.UTF-8
    en_US.utf8
    POSIX
    

    and of locale:

    LANG=en_US.UTF-8
    LANGUAGE=
    LC_CTYPE="en_US.UTF-8"
    LC_NUMERIC="en_US.UTF-8"
    LC_TIME="en_US.UTF-8"
    LC_COLLATE="en_US.UTF-8"
    LC_MONETARY="en_US.UTF-8"
    LC_MESSAGES="en_US.UTF-8"
    LC_PAPER="en_US.UTF-8"
    LC_NAME="en_US.UTF-8"
    LC_ADDRESS="en_US.UTF-8"
    LC_TELEPHONE="en_US.UTF-8"
    LC_MEASUREMENT="en_US.UTF-8"
    LC_IDENTIFICATION="en_US.UTF-8"
    LC_ALL=
    

    if we cat out /etc/locale.gen we'll see en_US.UTF-8 UTF-8 is uncommented and should be available for when we run locale-gen en_US.UTF-8:

    cat /etc/locale.gen
    ...
    # en_SG.UTF-8 UTF-8
    # en_US ISO-8859-1
    # en_US.ISO-8859-15 ISO-8859-15
    en_US.UTF-8 UTF-8
    # en_ZA ISO-8859-1
    # en_ZA.UTF-8 UTF-8
    ...
    
    locale-gen en_US.UTF-8
    Generating locales (this might take a while)...
      en_US.UTF-8... done
    Generation complete.
    

    and when checking etc/default/locale, I see the following:

    cat /etc/default/locale 
    LANG=en_US.UTF-8
    

    Finally, when cat-ing out the i18n SUPPORTED locales, we see en_US.UTF-8:

    cat /usr/share/i18n/SUPPORTED | grep "en_"
    ...
    en_SG.UTF-8 UTF-8
    en_SG ISO-8859-1
    en_US.UTF-8 UTF-8
    en_US ISO-8859-1
    ...
    

    I've tested running initdb the way pg_wrapper would invoke it, and the only success I've had is when running with --locale C, --locale POSIX, or --no-locale (same as --locale C):

    reminder, results of `locale -a`:
    C
    C.UTF-8
    en_US.utf8
    POSIX
    
    # these options work fine
    ./yugabyte/postgres/bin/initdb --locale C -D /tmp/ysqltest
    ./yugabyte/postgres/bin/initdb --locale POSIX -D /tmp/ysqltest
    ./yugabyte/postgres/bin/initdb --no-locale -D /tmp/ysqltest
    
    ## all result in...
    
    The database cluster will be initialized with locale "C".
    The default database encoding has accordingly been set to "SQL_ASCII".
    The default text search configuration will be set to "english".
    
    # fails
    ./yugabyte/postgres/bin/initdb --locale en_US.UTF-8 -D /tmp/ysqltest
    ./yugabyte/postgres/bin/initdb --locale en_US.utf8 -D /tmp/ysqltest
    ./yugabyte/postgres/bin/initdb --locale C.UTF-8.utf8 -D /tmp/ysqltest
    
    ## all result in...
    
    initdb: invalid locale name "C.UTF-8" # or some variant
    

    I'm assuming that the section of the initdb operations that configures the locale is going to force the locale to en_US.UTF-8 by default; however, I could really use some advice to figure out why en_US.UTF-8 is not being recognized as available on Ubuntu 16.04.
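    One detail worth noting from the listings above: `locale -a` prints en_US.utf8 while initdb asks for en_US.UTF-8. glibc treats these spellings as equivalent after normalizing the codeset, which can be demonstrated with Python's locale module:

```python
import locale

# locale.normalize() applies the same kind of alias/codeset normalization
# glibc performs, mapping the "utf8" codeset spelling to canonical "UTF-8".
for name in ("en_US.utf8", "en_US.UTF-8"):
    print(name, "->", locale.normalize(name))
```

    So the two spellings name the same locale; the failure here comes from how this build of initdb resolves locale names (per the root cause above, post_install.sh skipping path fix-ups under a symlinked installation directory), not from the locale being missing.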

    Thanks in advance 👍

    opened by aegershman 32
  • [Jepsen] Recovery after network partitions seems to take a while

    This may not be a real issue in YB; I'm still exploring behavior and collecting more evidence!

    On Yugabyte CE, version 1.1.10.0, recovery from network partitions seems to take a little longer than one might expect. In particular, when performing a series of single-row CQL inserts during a mixture of randomized 3/2 split network partitions, followed by the end of all partitions, 30 seconds for the cluster to heal, and a final read of all rows, that read consistently times out.

    This cluster is running on 5 nodes, with replication factor 3. n1, n2, and n3 are masters; n1, n2, n3, n4, and n5 are tservers.

    For instance, in this test, the second partition (the shaded region in this plot) appears to prevent all clients from performing writes, and even with 30 seconds of recovery before the final read, that read times out:

    latency-raw 64

    This may be expected behavior, or might be some sort of client issue, so I'm going to keep exploring. I'll update this issue as I find out more. :)

    community/request 
    opened by aphyr 31
  • [YSQL] Data lost when use range partition index

    I'm migrating an application from MySQL to YugabyteDB. After importing data dumped from MySQL, the application ran into an error:

    SELECT COUNT(id) from order_items where PurchaseDate > '2020-11-20';
    
    Query error: Given ybctid is not associated with any row in table.
    

    The order_items table has about 7,000,000 records and an index on PurchaseDate:

    "idx_33539_idx_oi_PurchaseDate" lsm ("PurchaseDate" DESC)
    
    area/ysql area/docdb 
    opened by iswarezwp 30
  • add SimpleQueryTest CQL testsuite

    This pull request adds the SimpleQueryTest suite to Yugabyte-DB's CQL testsuite. I have successfully run the first 7 tests of Cassandra's CQL test suite SimpleQueryTest.java (under the Apache license) with Yugabyte-DB. The changes I had to make are as follows:

    SimpleQueryTest.java

    This is a testsuite with many individual tests. The first 7 tests currently run fine, the rest are commented out for now. Goal was to keep the changes to this file to the minimum so that new individual tests can be run easily. The changes to this file are:

    • replaced the inclusion of cassandra package with org.yb.cql
    • calls to flush() have been commented out, as I do not know the equivalent in Yugabyte-DB. The flush() in Cassandra's testsuite is a public method of the CQLTester base class that calls the ColumnFamilyStore method forceBlockingFlush()

    CQLTester.java
    The class SimpleQueryTest inherits from CQLTester. I modified CQLTester.java (also under the Apache license) much more extensively, to include only the minimum necessary to successfully compile and run SimpleQueryTest. These modifications were important for keeping the changes in SimpleQueryTest to a minimum. The CQLTester.java changes are:

    • import yugabyte-db relevant classes/package
    • made CQLTester a sub-class of BaseCQLTest to pull in yugabyte-db boilerplate
    • modified the createTable and execute methods to provide the behavior expected by the individual tests in SimpleQueryTest. For example, the individual tests do not hardcode the table name; instead it gets generated. During execution of statements (e.g. select, insert), the CQL statement string gets built with the looked-up table name
    • modified the assertRows() methods, which are an easy way to verify that the rows inserted/deleted match the rows selected. These methods required more modifications
    • a comparison with Cassandra's CQLTester.java can be used to understand the modifications

    Thanks!

    kind/enhancement 
    opened by ananyaks 25
  • [YCQL] EXPLAIN SELECT causes tserver to crash

    I'm running release version 2.0.1. When running EXPLAIN SELECT * FROM accounts_api.logins WHERE ... via cqlsh, the local tserver crashes with:

    Oct 15 17:37:25 watson bash[16013]: *** Aborted at 1571161045 (unix time) try "date -d @1571161045" if you are using GNU date ***
    Oct 15 17:37:25 watson bash[16013]: PC: @     0x7f51308454e6 std::_Maybe_get_result_type<>
    Oct 15 17:37:25 watson bash[16013]: *** SIGILL (@0x7f51308454e6) received by PID 16013 (TID 0x7f50fa4b8700) from PID 813978854; stack trace: ***
    Oct 15 17:37:25 watson bash[16013]: @     0x7f51231d5ba0 (unknown)
    Oct 15 17:37:25 watson bash[16013]: @     0x7f51308454e5  std::_Maybe_get_result_type<>
    Oct 15 17:37:25 watson bash[16013]: @     0x7f513135663e  yb::ql::QLProcessor::ExecuteAsync()
    Oct 15 17:37:25 watson bash[16013]: @     0x7f513135680d  yb::ql::QLProcessor::RunAsync()
    Oct 15 17:37:25 watson bash[16013]: @     0x7f5131febf97  yb::cqlserver::CQLProcessor::ProcessRequest()
    Oct 15 17:37:25 watson bash[16013]: @     0x7f5131fefb64  yb::cqlserver::CQLProcessor::ProcessRequest()
    Oct 15 17:37:25 watson bash[16013]: @     0x7f5131fefed4  yb::cqlserver::CQLProcessor::ProcessCall()
    Oct 15 17:37:25 watson bash[16013]: @     0x7f51320097ad  yb::cqlserver::CQLServiceImpl::Handle()
    Oct 15 17:37:25 watson bash[16013]: @     0x7f512a0dd0d0  yb::rpc::ServicePoolImpl::Handle()
    Oct 15 17:37:25 watson bash[16013]: @     0x7f512a089763  yb::rpc::InboundCall::InboundCallTask::Run()
    Oct 15 17:37:25 watson bash[16013]: @     0x7f512a0e8c57  yb::rpc::(anonymous namespace)::Worker::Execute()
    Oct 15 17:37:25 watson bash[16013]: @     0x7f51286cea18  yb::Thread::SuperviseThread()
    Oct 15 17:37:25 watson bash[16013]: @     0x7f51231cd694 start_thread
    Oct 15 17:37:25 watson bash[16013]: @     0x7f512290a41d __clone
    Oct 15 17:37:25 watson bash[16013]: @                0x0 (unknown)
    Oct 15 17:37:25 watson systemd[1]: yugabyte-tserver.service: main process exited, code=killed, status=4/ILL
    Oct 15 17:37:25 watson systemd[1]: Unit yugabyte-tserver.service entered failed state.
    

    I haven't tried other tables or other queries to see if they cause it to crash.

    Here's the table, if that helps:

    > DESCRIBE TABLE accounts_api.logins;
    CREATE TABLE accounts_api.logins (
        auth_id text,
        auth_type smallint,
        scope text,
        auth_desc text,
        auth_secret text,
        auth_email text,
        auth_updated timestamp,
        scope_string_id text,
        scope_int_id bigint,
        created timestamp,
        updated timestamp,
        last_used timestamp,
        account_ids map<uuid, smallint>,
        flags int,
        alias authidtype,
        PRIMARY KEY (auth_id, auth_type, scope)
    ) WITH CLUSTERING ORDER BY (auth_type ASC, scope ASC)
        AND default_time_to_live = 0
        AND transactions = {'enabled': 'false'};
    
    community/request 
    opened by jameshartig 24
  • Too many network connections - high CPU load after update

    Problem: high CPU load on each tserver. Reason: it appeared after updating from the previous version to the latest. No WARN, ERROR, or FATAL messages appear, yet the load is continuous.

    TCP packet content: Yugabyte consensus Update.

    About 400 KiB/s of bandwidth between the servers is creating the CPU load.

    community/request 
    opened by kutayzorlu 23
  • [Platform] Old backups are not deleted by a schedule

    Backup records in the DB (table backup) have empty/null values for schedule_uuid, but old backups are selected for further deletion using the following finder:

      public static List<Backup> getExpiredBackups(UUID scheduleUUID) {
        // Get current timestamp.
        Date now = new Date();
        return Backup.find.query().where()
          .eq("schedule_uuid", scheduleUUID)
          .lt("expiry", now)
          .eq("state", BackupState.Completed)
          .findList();
      }
    
    

    So no backups can be selected if they don't have schedule_uuid specified. This also explains why the customer is able to delete old backups manually. A possible reason is that the schedule in the task_params field has the value "scheduleUUID":null.
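    A toy model of that finder (a Python sketch, not the actual platform code) shows why rows with a missing schedule_uuid are never selected by the equality filter:

```python
from datetime import datetime

# Toy model of getExpiredBackups(): an equality filter on schedule_uuid
# can never match rows where the value is missing (None/NULL), mirroring
# the behavior described in this issue.
def get_expired_backups(backups, schedule_uuid, now):
    return [b for b in backups
            if b["schedule_uuid"] == schedule_uuid
            and b["expiry"] < now
            and b["state"] == "Completed"]

now = datetime(2021, 6, 1)
backups = [
    {"schedule_uuid": None,  "expiry": datetime(2021, 1, 1), "state": "Completed"},
    {"schedule_uuid": "s-1", "expiry": datetime(2021, 1, 1), "state": "Completed"},
]
# Only the row that actually carries the schedule UUID is returned; the
# equally expired row with a null schedule_uuid is silently skipped.
print(get_expired_backups(backups, "s-1", now))
```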

    If the task which creates backups takes schedule_uuid from these parameters, then this could be the root cause. Additional information is inside ticket 661 (application.log and the YW db dump).

    priority/high area/platform 2.2 Backport Required 2.4 Backport Completed 
    opened by SergeyPotachev 22
  • No support for time with micro-second precision

    Example:

    cqlsh> SELECT totimestamp(now()) FROM myapp.stock_market;
    
     totimestamp(now())
    ---------------------------------
     2018-07-02 16:28:44.433000+0000
     2018-07-02 16:28:44.433000+0000
     2018-07-02 16:28:44.434000+0000
     2018-07-02 16:28:44.434000+0000
     2018-07-02 16:28:44.434000+0000
     2018-07-02 16:28:44.434000+0000
    

    As one can see, even though there are 6 digits after the second, only the first 3 are non-zero. Therefore, the minimal time precision is only a millisecond. Is it possible to get it to the level of a microsecond?

    I find this useful in my application. Suppose I have a high rate of requests from many users on multiple servers, so that a single millisecond can easily contain multiple requests. Then how do I know the order of the requests? Does TIMEUUID guarantee the correct order? Additionally, if I want the users to know the order of the requests of all users (say I run a blockchain and want the users to be able to verify the results), how could that be done without adding much overhead (they do not have YugaByte installed)? Is it possible to add an id field corresponding to the order of insertion with an ACID guarantee? Thanks!
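    On the TIMEUUID question: a version-1 (time-based) UUID embeds a 100-nanosecond-resolution timestamp, so it carries far more precision than the milliseconds that totimestamp() displays. A Python sketch of extracting it (this illustrates the general UUIDv1 format, not YugabyteDB's specific generator):

```python
import time
import uuid

# A version-1 UUID stores its timestamp as 100-ns ticks since the
# Gregorian epoch (1582-10-15); convert to Unix seconds.
GREGORIAN_TO_UNIX = 0x01B21DD213814000

def uuid1_unix_seconds(u):
    """Extract the embedded timestamp of a version-1 UUID as Unix seconds."""
    return (u.time - GREGORIAN_TO_UNIX) / 1e7

u = uuid.uuid1()
print(uuid1_unix_seconds(u))  # sub-millisecond resolution
```

    Within one generator, these timestamps follow generation order, but across servers the ordering is only as good as the clock synchronization, so a TIMEUUID alone cannot guarantee a global insertion order.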

    kind/question wontfix 
    opened by yjiangnan 22
  • [YSQL] Foreign Data Wrapper support

    This PR adds support for postgres_fdw. It targets the following GitHub issue: https://github.com/yugabyte/yugabyte-db/issues/5714. At the current stage, it is possible to do the following:

    bash-4.2$ ysqlsh
    ysqlsh (11.2-YB-2.7.3.0-b0)
    Type "help" for help.
    
    yugabyte=# create role tenant1 with login password 'tenant1' connection limit 10;
    CREATE ROLE
    yugabyte=# create database tenant1db owner = tenant1;
    
    CREATE DATABASE
    yugabyte=# create role tenant2 with login password 'tenant2' connection limit 10;
    CREATE ROLE
    yugabyte=# create database tenant2db owner = tenant2;
    CREATE DATABASE
    yugabyte=# \connect tenant2db
    You are now connected to database "tenant2db" as user "yugabyte".
    tenant2db=# set role tenant2;
    SET
    tenant2db=> create table table1tenant2 (rowid int, rowval text);
    CREATE TABLE
    tenant2db=> set role yugabyte;
    SET
    tenant2db=# grant select, insert on table table1tenant2 to yugabyte;
    GRANT
    tenant2db=# \connect tenant1db
    You are now connected to database "tenant1db" as user "yugabyte".
    tenant1db=# create extension postgres_fdw;
    CREATE EXTENSION
    tenant1db=# CREATE SERVER foreigndb_fdw FOREIGN DATA WRAPPER postgres_fdw OPTIONS (host '127.0.0.1', port '5433', dbname 'tenant2db');
    CREATE SERVER
    tenant1db=# GRANT USAGE ON FOREIGN SERVER foreigndb_fdw TO yugabyte;
    GRANT
    tenant1db=# CREATE USER MAPPING FOR yugabyte SERVER foreigndb_fdw OPTIONS (user 'tenant2', password 'tenant2');
    CREATE USER MAPPING
    tenant1db=# CREATE FOREIGN TABLE table1tenant2 (
    tenant1db(#   rowid integer OPTIONS (column_name 'rowid'),
    tenant1db(#   rowval text OPTIONS (column_name 'rowval')
    tenant1db(# ) SERVER foreigndb_fdw
    tenant1db-# OPTIONS (schema_name 'public', table_name 'table1tenant2');
    CREATE FOREIGN TABLE
    tenant1db=# select * from table1tenant2;
     rowid | rowval
    -------+--------
    (0 rows)
    
    tenant1db=# insert into table1tenant2 (rowid, rowval) values (1, 'aaa');
    INSERT 0 1
    tenant1db=# select * from table1tenant2;
     rowid | rowval
    -------+--------
         1 | aaa
    (1 row)
    
    tenant1db=# \connect tenant2db
    You are now connected to database "tenant2db" as user "yugabyte".
    tenant2db=# set role tenant2;
    SET
    tenant2db=> select * from table1tenant2;
     rowid | rowval
    -------+--------
         1 | aaa
    (1 row)
    
    tenant2db=>
    

    There are some immediate issues:

    1. I am not yet able to import foreign schema:
    tenant1db=# IMPORT FOREIGN SCHEMA public LIMIT TO (table1tenant2) FROM SERVER foreigndb_fdw INTO public;
    ERROR:  COLLATE not supported yet
    LINE 3:   rowval text OPTIONS (column_name 'rowval') COLLATE pg_cata...
                                                         ^
    HINT:  See https://github.com/YugaByte/yugabyte-db/issues/1127. Click '+' on the description to raise its priority
    QUERY:  CREATE FOREIGN TABLE table1tenant2 (
      rowid integer OPTIONS (column_name 'rowid'),
      rowval text OPTIONS (column_name 'rowval') COLLATE pg_catalog."default"
    ) SERVER foreigndb_fdw
    OPTIONS (schema_name 'public', table_name 'table1tenant2');
    

    ~~It seems that I was not able to locate the correct unsupported COLLATE case (https://github.com/YugaByte/yugabyte-db/issues/1127).~~ This is due to the following: https://github.com/yugabyte/yugabyte-db/blob/v2.7.2/src/postgres/src/backend/parser/gram.y#L3722. As an interim solution, I can configure and compile YugabyteDB with the YB_POSTGRES_WITH_ICU=[true | 1 | t | yes | y] environment variable set, and then it all works:

    tenant1db=# IMPORT FOREIGN SCHEMA public LIMIT TO (table1tenant2) FROM SERVER foreigndb_fdw INTO public;
    IMPORT FOREIGN SCHEMA
    
    2. Not sure if this is a real issue:
    tenant1db=> IMPORT FOREIGN SCHEMA public LIMIT TO (table1tenant2) FROM SERVER foreigndb_fdw INTO public;
    ERROR:  password is required
    DETAIL:  Non-superuser cannot connect if the server does not request a password.
    HINT:  Target server's authentication method must be changed.
    
    3. Currently, the postgres_fdw extension is hard-coded in the shared_preload_libraries configuration directive. Preferably, this should be a user-configurable switch. However, for that to be the case, the following PR would have to go in first: https://github.com/yugabyte/yugabyte-db/pull/9576.
    4. Regression tests have to be adapted.
    5. I would assume a test suite more comprehensive than the regression test suite might be needed.
    6. Obviously, there might be some limitations on the DocDB side, which I am not aware of, that lead to this not working in a distributed environment.
    7. This needs to be tested in a distributed cluster. It is currently tested using a single node launched via yugabyted.

    Early feedback welcomed.

    opened by radekg 21
  • [DocDB] Enable lease revocation

    [DocDB] Enable lease revocation

    Jira Link: DB-4778

    Description

    GH13611 disabled the lease revocation feature to work around issues like GH14225. With GH14225 fixed, we can enable lease revocation again.

    kind/bug area/docdb priority/medium 2.12 Backport Required 2.14 Backport Required 2.16 Backport Required 
    opened by rthallamko3 0
  • [YSQL] Scan filters don't prevent YB Batched Nested Loop joins from being created when they should

    [YSQL] Scan filters don't prevent YB Batched Nested Loop joins from being created when they should

    Jira Link: DB-4776

    Description

    Consider the following query

    CREATE TABLE p1 (a int, b int, c varchar, primary key(a,b));
    INSERT INTO p1 SELECT i, i % 25, to_char(i, 'FM0000') FROM generate_series(0, 599) i WHERE i % 2 = 0;
    CREATE INDEX p1_b_idx ON p1 (b ASC);
    SET enable_hashjoin = off;
    SET enable_mergejoin = off;
    SET enable_material = off;
    
    SET yb_bnl_batch_size = 3;
    /*+ IndexScan(p1 p1_b_idx) Leading((p2 p1)) */ EXPLAIN (COSTS OFF) SELECT * FROM p1 JOIN p2 ON p1.a = p2.b AND p2.a = p1.b;
                         QUERY PLAN                      
    -----------------------------------------------------
     YB Batched Nested Loop Join
       Join Filter: ((p1.a = p2.b) AND (p1.b = p2.a))
       ->  Seq Scan on p2
       ->  Index Scan using p1_b_idx on p1
             Index Cond: (b = ANY (ARRAY[p2.a, $1, $2]))
             Filter: (p2.b = a)
    

    We see that while the index condition handles batches of p2, the filter does not. This will lead to incorrect results as we see here:

    SET yb_bnl_batch_size = 3;
    /*+ IndexScan(p1 p1_b_idx) Leading((p2 p1)) */ SELECT * FROM p1 JOIN p2 ON p1.a = p2.b AND p2.a = p1.b;
     a  | b  |  c   | a  | b  |  c   
    ----+----+------+----+----+------
     18 | 18 | 0018 | 18 | 18 | 0018
    (1 row)
    

    If we disable batching, we get the correct results:

    SET yb_bnl_batch_size = 1;
    /*+ IndexScan(p1 p1_b_idx) Leading((p2 p1)) */ SELECT * FROM p1 JOIN p2 ON p1.a = p2.b AND p2.a = p1.b;
     a  | b  |  c   | a  | b  |  c   
    ----+----+------+----+----+------
     12 | 12 | 0012 | 12 | 12 | 0012
      6 |  6 | 0006 |  6 |  6 | 0006
     18 | 18 | 0018 | 18 | 18 | 0018
      0 |  0 | 0000 |  0 |  0 | 0000
     24 | 24 | 0024 | 24 | 24 | 0024
    (5 rows)
    

    This is because yb_get_batched_index_paths in indxpath.c no longer looks through the filter clauses to see whether they should inhibit batching. This functionality existed before commit 5bccceff7f583e93046a0a4882feccb50949b880 but was unintentionally removed.

    kind/bug area/ysql priority/medium status/awaiting-triage 
    opened by tanujnay112 0
  • [docs][PLAT-6511] Document Helm overrides for YBA chart

    [docs][PLAT-6511] Document Helm overrides for YBA chart

    • Adds a section for each override, so that we can link to them.
    • Switches to using a values YAML file instead of just the --set flag. The flag is hard to use in most cases, and some users tend to forget everything they had applied. A YAML file, on the other hand, is easier to use and can be version controlled.
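
A minimal sketch of the pattern the second point recommends; the file name and override keys below are illustrative only, so consult the YBA chart's values.yaml for the real keys:

```shell
# Keep Helm overrides in a version-controlled YAML file instead of ad-hoc
# --set flags. The keys below are hypothetical examples, not real chart keys.
cat > yba-overrides.yaml <<'EOF'
image:
  tag: "2.16.0.0-b58"
EOF
# Then apply the whole file in one place (command shown for illustration):
# helm upgrade yw-test yugabytedb/yugaware -f yba-overrides.yaml
```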

    Preview: https://deploy-preview-15495--infallible-bardeen-164bc9.netlify.app/preview/yugabyte-platform/install-yugabyte-platform/install-software/kubernetes

    opened by bhavin192 1
  • [DocDB] Do not log version edits by default

    [DocDB] Do not log version edits by default

    Jira Link: DB-4775

    Description

    Do not log version edits by default. We currently log each RocksDB version edit at the time it is appended. It frequently contains many schema packing versions, overflowing the glog buffer. It would also be good to avoid printing RocksDB keys in the log.

    kind/bug area/docdb priority/medium status/awaiting-triage 
    opened by mbautin 0
  • [docs] Quote urls in wget command to work under macos zsh shell because it's interpreting the `?` in the URL as a globbing wildcard

    [docs] Quote urls in wget command to work under macos zsh shell because it's interpreting the `?` in the URL as a globbing wildcard

    fixes #15467

    Searching for "wget zsh: no matches found" brought me to https://old.reddit.com/r/zsh/comments/h9mdvc/why_do_i_get_this_error_zsh_no_matches_found/ so I quoted the URL.

    Works on Linux (Ubuntu 18.04); it should work on macOS too.
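
The underlying shell behavior, sketched with a placeholder URL:

```shell
# zsh expands an unquoted '?' as a single-character filename wildcard, so
#   wget https://example.com/file?platform=mac
# fails with "zsh: no matches found" when no local file matches the pattern.
# Quoting makes every POSIX shell pass the URL through to wget literally:
url='https://example.com/file?platform=mac'
printf '%s\n' "$url"   # prints the URL with the '?' intact
```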

    cc @yushenng please try and confirm the changes?

    opened by ddorian 1
Releases(v2.16.0.0)
  • v2.16.0.0(Dec 15, 2022)

  • v2.8.11.0(Dec 15, 2022)

  • v2.17.0.0(Dec 14, 2022)

  • v2.14.5.0(Nov 18, 2022)

  • v2.15.3.2(Nov 9, 2022)

  • v2.14.4.0(Oct 28, 2022)

  • v2.15.3.0(Oct 28, 2022)

  • v2.8.10.0(Oct 28, 2022)

  • v2.14.2.1(Oct 27, 2022)

  • v2.8.9.1(Oct 20, 2022)

  • v2.15.2.1(Oct 11, 2022)

  • v2.14.3.1(Oct 12, 2022)

  • v2.14.3.0(Oct 11, 2022)

  • v2.6.20.0(Oct 11, 2022)

  • v2.14.2.0(Sep 14, 2022)

  • v2.15.2.0(Sep 14, 2022)

  • v2.12.10.0(Sep 14, 2022)

  • v2.12.9.2(Sep 1, 2022)

  • v2.8.9.0(Sep 7, 2022)

  • v2.14.1.0(Aug 10, 2022)

  • v2.6.19.0(Aug 22, 2022)

  • v2.15.1.0(Jul 22, 2022)

  • v2.12.9.1(Jul 22, 2022)

  • v2.14.0.0(Jul 14, 2022)

  • v2.12.6.1(Jul 12, 2022)

  • v2.12.8.0(Jul 7, 2022)

  • v2.15.0.0(Jun 30, 2022)

  • v2.12.7.0(Jun 25, 2022)

  • v2.12.4.2(Jun 18, 2022)

  • v2.8.7.0(Jun 30, 2022)

Owner
yugabyte
The high-performance distributed SQL database for global, internet-scale apps.
HybridSE (Hybrid SQL Engine) is an LLVM-based, hybrid-execution and high-performance SQL engine

HybridSE (Hybrid SQL Engine) is an LLVM-based, hybrid-execution, high-performance SQL engine. It can provide fast and consistent execution on heterogeneous SQL data systems, e.g., OLAP databases, HTAP systems, SparkSQL, and Flink Stream SQL.

4Paradigm 45 Sep 12, 2021
PGSpider: High-Performance SQL Cluster Engine for distributed big data.

PGSpider: High-Performance SQL Cluster Engine for distributed big data.

PGSpider 132 Sep 8, 2022
PolarDB for PostgreSQL (PolarDB for short) is an open source database system based on PostgreSQL.

PolarDB for PostgreSQL (PolarDB for short) is an open source database system based on PostgreSQL. It extends PostgreSQL into a shared-nothing distributed database that supports global data consistency and ACID across database nodes, distributed SQL processing, and data redundancy and high availability through Paxos-based replication. PolarDB is designed to add value and new features to PostgreSQL in the dimensions of high performance, scalability, high availability, and elasticity. At the same time, PolarDB remains SQL-compatible with single-node PostgreSQL on a best-effort basis.

Alibaba 2.5k Dec 31, 2022
pgagroal is a high-performance protocol-native connection pool for PostgreSQL.

pgagroal is a high-performance protocol-native connection pool for PostgreSQL.

Agroal 555 Dec 27, 2022
Kunlun distributed DBMS is a NewSQL OLTP relational distributed database management system

Kunlun distributed DBMS is a NewSQL OLTP relational distributed database management system. Application developers can use Kunlun to build IT systems that handle terabytes of data, without any effort on their part to implement data sharding, distributed transaction processing, distributed query processing, crash safety, high availability, strong consistency, or horizontal scalability. All these powerful features are provided by Kunlun.

zettadb 114 Dec 26, 2022
dqlite is a C library that implements an embeddable and replicated SQL database engine with high-availability and automatic failover

dqlite is a C library that implements an embeddable and replicated SQL database engine with high availability and automatic failover.

Canonical 3.3k Jan 9, 2023
A PostgreSQL extension providing an async networking interface accessible via SQL using a background worker and curl.

pg_net is a PostgreSQL extension exposing a SQL interface for async networking with a focus on scalability and UX.

Supabase 49 Dec 14, 2022
High-performance time-series aggregation for PostgreSQL

PipelineDB has joined Confluent; read the blog post here. PipelineDB will not have new releases beyond 1.0.0, although critical bugs will still be fixed.

PipelineDB 2.5k Dec 26, 2022
Nebula Graph is a distributed, fast open-source graph database featuring horizontal scalability and high availability

Nebula Graph is an open-source graph database capable of hosting super large scale graphs with dozens of billions of vertices (nodes) and trillions of edges, with milliseconds of latency.

vesoft inc. 834 Dec 24, 2022
Distributed PostgreSQL as an extension

What is Citus? Citus is a PostgreSQL extension that transforms Postgres into a distributed database—so you can achieve high performance at any scale.

Citus Data 7.7k Dec 30, 2022
DuckDB is an in-process SQL OLAP Database Management System

DuckDB is an in-process SQL OLAP Database Management System

DuckDB 7.8k Jan 3, 2023
TimescaleDB is an open-source database designed to make SQL scalable for time-series data.

An open-source time-series SQL database optimized for fast ingest and complex queries. Packaged as a PostgreSQL extension.

Timescale 14.3k Jan 2, 2023
A friendly and lightweight C++ database library for MySQL, PostgreSQL, SQLite and ODBC.

QTL is a lightweight C++ library for accessing SQL databases; it currently supports MySQL, SQLite, PostgreSQL, and ODBC.

null 173 Dec 12, 2022
upstream module that allows nginx to communicate directly with PostgreSQL database.

ngx_postgres is an upstream module that allows nginx to communicate directly with a PostgreSQL database.

RekGRpth 1 Apr 29, 2022
A framework to monitor and improve the performance of PostgreSQL using Machine Learning methods.

pg_plan_inspector is being developed as a framework to monitor and improve the performance of PostgreSQL using Machine Learning methods.

suzuki hironobu 198 Dec 27, 2022
BaikalDB, A Distributed HTAP Database.

BaikalDB supports sequential and randomized real-time reads/writes of structured data at petabyte scale. BaikalDB is compatible with the MySQL protocol and supports a MySQL-style SQL dialect, by which users can migrate their data storage from MySQL to BaikalDB seamlessly.

Baidu 1k Dec 28, 2022
GalaxyEngine is a MySQL branch originated from Alibaba Group, especially supports large-scale distributed database system.

GalaxyEngine is a MySQL branch originated from Alibaba Group, especially supports large-scale distributed database system.

null 281 Jan 4, 2023
Test any type of cloud database on Android apps. No need of a dedicated backend.

DB Kong - Database connections simplified. DB Kong is an Android library that allows you to connect to, interact with, and test any type of cloud database on Android.

Arjun 9 May 9, 2022
OrioleDB – building a modern cloud-native storage engine

OrioleDB is a new storage engine for PostgreSQL, bringing a modern approach to database capacity, capabilities and performance to the world's most-loved database platform.

OrioleDB 1.3k Dec 31, 2022