Kunlun distributed DBMS is a NewSQL OLTP distributed relational database management system

Overview

Introduction

For more information, resources, and documentation on Kunlun distributed RDBMS, please visit www.zettadb.com. See ReleaseNotes.md for the list of features released in each version of Kunlun distributed DBMS. Join our Slack channel kunlun_users to get involved in technical discussions with our developers and community.

To build the computing node program from source, use build.sh directly or refer to it for instructions. To build kunlun-storage from source, see kunlun-storage/INSTALL.kunlun.md for instructions. To build cluster_mgr from source, see cluster_mgr/README for instructions.

Refer to INSTALL.kunlun.md to install a Kunlun distributed DBMS cluster.

KunLun Distributed Database Cluster

KunLun distributed database cluster (Kunlun) is a distributed relational database management system developed to manage massive amounts (terabytes up to petabytes) of relational data and to serve massive concurrent read and/or write workloads with low latency and high throughput. It provides robust ACID transaction guarantees, high scalability, high availability, transparent data partitioning, elastic horizontal scale-out, and standard SQL query support over distributed or partitioned data. Together, these features are known as NewSQL capabilities, i.e. Kunlun is a NewSQL OLTP distributed RDBMS.

Users and applications can connect to Kunlun using JDBC/ODBC and the client libraries of PostgreSQL (and MySQL in future), as well as client libraries for scripting languages like PHP/Python/Go/Ruby/.NET, and interact with Kunlun exactly as they would with a standalone MySQL or PostgreSQL database instance, using standard SQL, and get all the above NewSQL benefits without any work or effort on the client side. In particular, applications can use OR-mapping tools like Hibernate and MyBatis to access relational data and so avoid manually writing SQL statements in application code.
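
For instance, a client can connect with psql or any stock PostgreSQL driver and run ordinary SQL; a minimal sketch (host, port, user and table names here are hypothetical):

    -- Connect with any PostgreSQL client, e.g.: psql -h 192.168.0.132 -p 5432 -U abc postgres
    CREATE TABLE orders (id serial PRIMARY KEY, buyer varchar(64), total numeric(12,2));
    INSERT INTO orders (buyer, total) VALUES ('alice', 99.50), ('bob', 12.00);
    SELECT buyer, sum(total) FROM orders GROUP BY buyer;  -- sharding is transparent to the client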

By using Kunlun, users can quickly develop robust, highly available and highly scalable information systems that are capable of processing hundreds of terabytes of data or more, with no engineering effort needed to implement the NewSQL features. All the technical and engineering challenges are conquered by Kunlun, which greatly reduces the cost, difficulty and time required to develop such systems and improves their overall quality (availability, robustness, stability, scalability, and performance).

Architecture

A KunLun distributed database cluster consists of two types of components: one or more computing nodes and one or more storage shards. It also shares a cluster_manager program and a meta-data cluster with other Kunlun clusters.

This piece of software is Kunlun's computing node. Users are expected to use the kunlun-storage software, a deeply engineered branch of percona-mysql-8.0, to set up their storage shards and metadata shards, because Kunlun needs some supporting features that only exist in kunlun-storage, and kunlun-storage fixes all of community MySQL-8.0's XA transaction crash-safety bugs and pitfalls. kunlun-storage also delivers more than 50% higher XA transaction processing performance than the same version of MySQL-8.0.x.

A computing node accepts and validates client connections using the PostgreSQL client protocol (the MySQL protocol will be supported in future), and executes SQL statements from connected clients by interacting with the cluster's storage shards. Users can add more computing nodes as their workloads grow; each and every computing node can serve user read/write requests. A Kunlun cluster's computing nodes locally store the metadata of all database objects (tables, views, materialized views, sequences, stored procedures/functions, users/roles and privileges, etc.), but they don't store user data locally. Instead, user data lives in the storage shards.

To execute a client SQL query, a computing node parses it, optimizes it and, at execution time, forms one or more SQL queries to send to the target storage shards that contain the portions of data the client query needs. If the query is a SELECT or an INSERT/DELETE/UPDATE...RETURNING statement rather than a bare INSERT/DELETE/UPDATE statement, the computing node gets partial results from all target storage shards and assembles them into the final result to return to the client.
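
As an illustration (the table and its distribution are hypothetical, and the pushed-down SQL shown is conceptual, not the exact text Kunlun generates):

    -- Client query against a table whose rows are spread across two shards:
    SELECT buyer, count(*) FROM orders WHERE total > 100 GROUP BY buyer;

    -- Conceptually, the computing node sends each target shard a query like:
    --   SELECT buyer, count(*) FROM <shard-local orders> WHERE total > 100 GROUP BY buyer;
    -- then merges the per-shard partial aggregates into the final result.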

User data is stored in one or more storage shards, not in computing nodes. Each storage shard stores a subset of all user data in the KunLun cluster; data in different storage shards doesn't overlap (i.e. shared nothing). Users can grow or shrink the number of shards as their data volumes and workloads grow or shrink. A storage shard is a MySQL binlog replication cluster, which currently uses MySQL Group Replication (MGR) to achieve high availability. The primary node of each shard receives write and/or read SQL queries from computing nodes to insert/update/delete user data or return target user data, executes those queries, and returns the results to the requesting computing node.

A meta-data shard is also a kunlun-storage cluster. It stores the meta-data of a Kunlun cluster, and multiple KunLun clusters can share the same metadata cluster. A cluster manager program runs as a daemon process to maintain the correct running status of one or more Kunlun clusters; it consumes few computing resources during its work.

Advantages

Kunlun is currently developed on the basis of PostgreSQL-11.5. In order to support advanced features such as automatic DDL synchronization and distributed transaction processing, we modified PostgreSQL code extensively rather than simply using its FDW. We modified PostgreSQL in a modular and minimally intrusive way so that we can easily keep up with official upstream PostgreSQL releases.

Kunlun distributed database cluster is built for high scalability, high availability, ACID guarantees for distributed transactions, full-fledged distributed query processing, and elastic horizontal scalability.

Highly Scalable

Kunlun clusters are highly scalable. They not only scale up but also scale out: users can add more computing nodes to gain more query processing power, and every computing node can serve both write and read workloads. Users (DBAs) can also add more storage shards for more data storage and transaction processing capacity, and Kunlun will automatically move parts of the data to the new shards to balance workloads.

Highly Available(HA)

Kunlun clusters are highly available; partial node failures won't harm the availability of the cluster. Any single storage shard or meta-data cluster of 2*N+1 MySQL nodes can tolerate N simultaneous node failures and remain writable (e.g. a three-node shard tolerates one failure), and it remains readable as long as one kunlun-storage node is still working.

As for computing nodes, as long as one computing node is working, a Kunlun cluster can keep serving clients. The crash/stall/hang of one computing node doesn't affect any other part of a Kunlun cluster. A computing node doesn't need replicas for HA, because a computing node's entire state can be rebuilt from the metadata cluster. DBAs can add an empty computing node to a Kunlun cluster at any time, and the new node will automatically bring itself up to the latest local state by connecting to the metadata cluster and replaying the accumulated DDL logs. When executing concurrent DDLs, computing nodes are coordinated so that every computing node executes exactly the same sequence of DDL operations, keeping their local states always identical.

Distributed Transaction&Query Processing

Kunlun's distributed query processing aims to relieve users from having to write SQL queries according to their data's distribution, i.e. it partitions user data transparently. This is achieved partly via its distributed transaction processing features and partly via its distributed query processing features. With the help of these features, users can simply write SQL queries as if they were using a traditional standalone PostgreSQL/MySQL database; they don't have to know or consider which storage shards hold which portions of data in order to write a working SQL query or transaction.

Kunlun distributed database cluster automatically performs distributed transaction processing using the robust and well-validated two-phase commit (2PC) protocol, and as long as the storage shards are fully "XA resilient", a distributed transaction has ACID guarantees. However, currently no official release of MySQL community server is fully "XA resilient"; they all have a list of unsupported XA features (the MySQL documentation lists the limitations that make official MySQL not XA resilient). When you use official MySQL (including Percona-mysql), if one or more primary nodes and/or replica nodes go down or MySQL binlog replication is broken/stopped, it's possible that some distributed transactions lose partially committed changes or become inconsistent, or that MySQL binlog replication fails to resume working. In Kunlun we provide kunlun-storage, an enhanced MySQL branch that has proven to be fully XA resilient; all these XA resilience issues are well solved in it.
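
For example, a transaction that touches rows on two different shards still commits atomically; a sketch (the table name and row placement are hypothetical):

    BEGIN;
    UPDATE accounts SET balance = balance - 100 WHERE id = 1;  -- row lives on shard 1
    UPDATE accounts SET balance = balance + 100 WHERE id = 2;  -- row lives on shard 2
    COMMIT;  -- the computing node prepares the XA branches on both shards, then commits them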

Kunlun's distributed query processing is an integral part of PostgreSQL's query parse/optimize/execute pipeline. Optimization of remote queries takes into account the network transfer cost of partial data from storage shards to the initiating computing node, and the extra cost inside the computing node to process such partial results; our query optimization tries to minimize the overall cost of remote query processing.

As of this latest version, Kunlun can handle cross-shard table joins and aggregate queries, as well as regular insert/delete/update/select statements, and it supports prepared statements, sequences, and all regular DDLs. More query optimization work is ongoing and will be released soon.
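
A short sketch of these capabilities (table and sequence names are hypothetical):

    CREATE SEQUENCE order_seq;
    CREATE TABLE orders2 (id bigint DEFAULT nextval('order_seq'), note text);
    PREPARE add_order (text) AS INSERT INTO orders2 (note) VALUES ($1);
    EXECUTE add_order('first');
    EXECUTE add_order('second');
    -- a join plus aggregate, which may span shards:
    SELECT count(*) FROM orders2 a JOIN orders2 b ON a.id < b.id;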

With transparent SQL compatibility, application developers can utilize standard SQL workflows and tool chains for premium efficiency and productivity. For example, they can use OR-mapping tools like Hibernate or MyBatis to avoid writing SQL statements by hand, which would be impossible if they were using sharding middleware, doing sharding in application code, or using some other sharding solution that isn't fully SQL compatible. Our aim is to keep Kunlun as SQL compatible as the PostgreSQL version we are based on, except for the features that we explicitly choose not to support, such as triggers, foreign keys, etc.

SQL features NOT supported in Kunlun

create table ... select from
select into ... from ...
foreign keys
triggers
multi-table update/delete statements, i.e. updating/deleting rows of multiple tables in one statement, or updating/deleting rows of one table by identifying target rows through joins with other tables

All other standard SQL syntax will be supported.
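
Each unsupported form usually has a supported equivalent; for instance (a sketch with hypothetical table names), CREATE TABLE ... SELECT can be split into a plain CREATE TABLE followed by INSERT ... SELECT:

    -- Not supported:  CREATE TABLE t2 AS SELECT * FROM t1;
    -- Supported equivalent:
    CREATE TABLE t2 (a int, b varchar(50));
    INSERT INTO t2 SELECT a, b FROM t1;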

Standard SQL Data types supported

All standard SQL data types supported by PostgreSQL, and some PostgreSQL extended data types, are supported, as detailed below.

All integer and numeric types: bool, smallint, int, bigint, numeric(P,S), money, float/real, double;
All text types: char(N), varchar(N), text, blob, user-defined enum types, and bit(N)/varbit(N);
Most date and/or time types, including date, time, timetz, timestamp, timestamptz; interval is not yet supported.
Some PostgreSQL private types are supported: Name, Oid, CID, TID, XID, LSN, macaddr, macaddr8, cidr, uuid.
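
A sketch exercising several of the supported types listed above (the table name is hypothetical):

    CREATE TABLE type_demo (
        id       bigint,
        price    numeric(12,2),
        ratio    float,
        name     varchar(100),
        body     text,
        flags    bit(8),
        created  timestamptz,
        birthday date,
        node_mac macaddr,
        token    uuid
    );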

PostgreSQL specific SQL features that Kunlun won't support

Cursor statements

DECLARE, FETCH and MOVE statements, and the use of cursors in UPDATE/DELETE statements, will never be supported; the result of using them is undefined.

Unsupported table options

The ON COMMIT clause of CREATE TABLE is not supported, and CREATE TABLE using table inheritance is not supported. The behavior of using either of these is undefined.

Tablespaces

CREATE/ALTER TABLESPACE statements, and tablespace settings used anywhere, including in CREATE/ALTER DATABASE/TABLE/INDEX, are not supported.

Indexing settings

EXCLUDE and INCLUDE clauses, COLLATE settings, and partial indexes. Specifying them produces an error; they will never be supported because of MySQL restrictions.

Storage settings for db/table/index

These include the 'WITH OIDS' table option, tablespace settings, and storage parameters for databases, tables and indexes. Since user tables are not 'stored' in computing nodes, no such storage-related settings are supported. Specifying the 'WITH OIDS' option, or any table storage parameter other than the newly added 'shard' parameter, produces an error. Storage parameters for indexes and attributes are simply ignored, and all tablespace settings are ignored.
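
A sketch of the distinction; the exact syntax of the Kunlun-specific 'shard' parameter is assumed here to follow PostgreSQL's WITH (...) storage-parameter form:

    -- Assumed form of the newly added 'shard' storage parameter:
    CREATE TABLE t_pinned (a int) WITH (shard = 1);
    -- Any other table storage parameter, or WITH OIDS, produces an error:
    -- CREATE TABLE t_bad (a int) WITH (oids = true);  -- ERROR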

PostgreSQL specific data types

Some of PostgreSQL's private extensions are NOT and will never be supported, including arrays, ranges, vectors, composite types, row types, table inheritance, table/relation types, etc.
Domains and user-defined types that derive from basic numeric or text types are not supported now but may be supported in future; all other user-defined domains/types except enum types will not be supported.
json and spatial types will be supported in future.

Automatic DDL synchronization

Kunlun DDC (distributed database cluster) supports automatic DDL synchronization: any DDL statement executed on one computing node automatically takes effect on all computing nodes of the same cluster, and the DDLs that need to run on the relevant storage shards are automatically executed as part of that DDL statement's execution. Such operations are crash safe, so that if any computing node(s) or storage shard node(s) terminate/exit for any reason during the execution of such a DDL, the entire system's data and metadata remain consistent and intact. This feature greatly relieves DBAs and devops engineers of repetitive and error-prone routines and greatly improves their productivity.

For example, suppose connection conn1 is connected to computing node CN1, and connection conn2 to computing node CN2. Now we create a table tx via conn1. During the DDL execution, the table is properly defined in CN1's catalog, and tx's storage table is automatically created in the selected storage shard; after the "create table" statement completes, immediately (within a configurable period of time) we can use/access table tx in CN2 and all other computing nodes.
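
In SQL terms, a sketch of the two sessions (names taken from the example above; the column list is hypothetical):

    -- conn1, connected to CN1:
    CREATE TABLE tx (a int PRIMARY KEY, b text);

    -- conn2, connected to CN2, within the configurable propagation period:
    INSERT INTO tx VALUES (1, 'visible cluster-wide');
    SELECT * FROM tx;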

We name this feature 'automatic DDL synchronization'. All DDLs that involve storage shards are well supported, including create/drop/alter table/index/partition/sequence/view/materialized view/database/schema statements, and all commonly used DDLs in PostgreSQL are supported by Kunlun DDC.

Cautions

Kunlun DDC is under active development and not yet suitable for production use, but it is ready for POC now. You are encouraged to try it out and report any requirements or issues to us.

Do not manually modify anything (tables, stored procedures, etc.) in the Kunlun_Metadata_DB database of the meta-data shard; otherwise Kunlun DDC may not work correctly and you may lose your data. Likewise, do not manually modify any metadata tables (i.e. those whose names start with pg_) in computing nodes, such as pg_shard and pg_shard_node; otherwise Kunlun DDC may not work correctly and you may lose your data. Only modify system metadata using the SQL commands and/or scripts provided by Kunlun DDC.

Contact

Although we already have many useful features, some very useful features are still under active development, and you are welcome to give us feedback, bug reports and feature requests on this GitHub page. Also, please visit www.zettadb.com for more information about Kunlun distributed DBMS, downloads.zettadb.com to download docker images, prebuilt binaries and PDF docs about Kunlun distributed database, and bugs.zettadb.com for the bugs and tasks we have completed and plan to work on.

You are welcome to join our Slack channel kunlun_users: search for 'kunlun_users' (without the quotes) in Slack, or click this link to join: https://join.slack.com/t/kunlunusers/shared_invite/zt-yslip1ev-R9N~7UdsTKX~huPDtHZHEw

Comments
  • Lateral expression crash in table join queries

    Issue migrated from trac ticket # 43

    component: computing nodes | priority: major | resolution: fixed

    2021-04-15 16:54:38: @jd-zhang created the issue


    sql code:

    CREATE TABLE INT4_TBL(f1 int4); CREATE TABLE INT8_TBL(q1 int8, q2 int8); CREATE TABLE TEXT_TBL (f1 text);

    explain (verbose, costs off) select * from text_tbl t1 left join int8_tbl i8 on i8.q2 = 123, lateral (select i8.q1, t2.f1 from text_tbl t2 limit 1) as ss1, lateral (select ss1.* from text_tbl t3 limit 1) as ss2 where t1.f1 = ss2.f1;

    psql: server closed the connection unexpectedly. This probably means the server terminated abnormally before or while processing the request.

    -- AND no explain--

    select * from text_tbl t1 left join int8_tbl i8 on i8.q2 = 123, lateral (select i8.q1, t2.f1 from text_tbl t2 limit 1) as ss1, lateral (select ss1.* from text_tbl t3 limit 1) as ss2 where t1.f1 = ss2.f1;

    psql: server closed the connection unexpectedly. This probably means the server terminated abnormally before or while processing the request.

    -- AND --

    select ss2.* from int4_tbl i41 left join int8_tbl i8 join (select i42.f1 as c1, i43.f1 as c2, 42 as c3 from int4_tbl i42, int4_tbl i43) ss1 on i8.q1 = ss1.c2 on i41.f1 = ss1.c1, lateral (select i41.*, i8.*, ss1.* from text_tbl limit 1) ss2 where ss1.c2 = 0;

    --- others are not listed here ---

    kunlun fixed defect @high #minor 
    opened by jd-zhang 17
  • Bad results for partition join query when SET enable_partitionwise_join to true

    Issue migrated from trac ticket # 118

    component: computing nodes | priority: major | resolution: fixed

    2021-06-24 14:44:59: @jd-zhang created the issue


    SET enable_partitionwise_join TO true;
    drop table if exists pagg_tab1;
    CREATE TABLE pagg_tab1(x int, y int) PARTITION BY RANGE(x);
    CREATE TABLE pagg_tab1_p1 PARTITION OF pagg_tab1 FOR VALUES FROM (0) TO (10);
    CREATE TABLE pagg_tab1_p2 PARTITION OF pagg_tab1 FOR VALUES FROM (10) TO (20);
    CREATE TABLE pagg_tab1_p3 PARTITION OF pagg_tab1 FOR VALUES FROM (20) TO (30);
    drop table if exists pagg_tab2;
    CREATE TABLE pagg_tab2(x int, y int) PARTITION BY RANGE(y);
    CREATE TABLE pagg_tab2_p1 PARTITION OF pagg_tab2 FOR VALUES FROM (0) TO (10);
    CREATE TABLE pagg_tab2_p2 PARTITION OF pagg_tab2 FOR VALUES FROM (10) TO (20);
    CREATE TABLE pagg_tab2_p3 PARTITION OF pagg_tab2 FOR VALUES FROM (20) TO (30);
    INSERT INTO pagg_tab1 SELECT i % 30, i % 20 FROM generate_series(0, 299, 2) i;
    INSERT INTO pagg_tab2 SELECT i % 20, i % 30 FROM generate_series(0, 299, 3) i;
    SELECT t1.x, sum(t1.y), count(*) FROM pagg_tab1 t1, pagg_tab2 t2 WHERE t1.x = t2.y GROUP BY t1.x ORDER BY 1, 2, 3;

    For kunlun, the result is:

        x  | sum  | count
        0  | 500  | 100
        6  | 1100 | 100
        24 | 900  | 100

    This result is not correct.

    For official pg, the result is:

        x  | sum  | count
        0  | 500  | 100
        6  | 1100 | 100
        12 | 700  | 100
        18 | 1300 | 100
        24 | 900  | 100

    This only happens when enable_partitionwise_join is set to true.

    similar error queries:

    SELECT t1.y, sum(t1.x), count(*) FROM pagg_tab1 t1, pagg_tab2 t2 WHERE t1.x = t2.y GROUP BY t1.y HAVING avg(t1.x) > 10 ORDER BY 1, 2, 3;

    SELECT b.y, sum(a.y) FROM pagg_tab1 a LEFT JOIN pagg_tab2 b ON a.x = b.y GROUP BY b.y ORDER BY 1 NULLS LAST;

    SELECT b.y, sum(a.y) FROM pagg_tab1 a RIGHT JOIN pagg_tab2 b ON a.x = b.y GROUP BY b.y ORDER BY 1 NULLS LAST;

    kunlun fixed defect @high #major 
    opened by jd-zhang 14
  • no_rep cluster --> restore_new_cluster API: when restoring the computing node succeeds, logging into the computing node and creating a database fails; mgr clusters also have this problem.

    Issue migrated from trac ticket # 580

    component: computing nodes | priority: major | resolution: fixed

    2022-04-15 16:15:18: [email protected] created the issue


    1. Call the restore_new_cluster API with cluster_info like this:

       datas = { "ver":"0.1", "job_id":job_id, "job_type":"create_cluster", "shards":"2", "nodes":"1", "comps":"1", "ha_mode":"no_rep", "max_storage_size":"20", "max_connections":"6", "cpu_cores":"8", "innodb_size":"1", "user_name":"test1", "machinelist": [ {"hostaddr":"192.168.0.132"} ] }

       or

       datas = { "ver":"0.1", "job_id":job_id, "job_type":"create_cluster", "shards":"1", "nodes":"1", "comps":"1", "ha_mode":"no_rep", "max_storage_size":"20", "max_connections":"6", "cpu_cores":"8", "innodb_size":"1", "user_name":"test1", "machinelist": [ {"hostaddr":"192.168.0.132"} ] }

    2. Restoring the computing node succeeds.

    3. psql postgres://abc:[email protected]:59033/postgres, input "create database testdb;", and it gets stuck.


    mgr cluster info: { "ver":"0.1", "job_id":job_id, "job_type":"create_cluster", "shards":"2", "nodes":"3", "comps":"1", "ha_mode":"mgr", "max_storage_size":"20", "max_connections":"6", "cpu_cores":"8", "innodb_size":"4", "user_name":"test1", "machinelist": [ {"hostaddr":"192.168.0.132"} ] }

    kunlun fixed defect @high #trivial 
    opened by jd-zhang 13
  • Crash when no column needed from a RemoteJoin node

    Issue migrated from trac ticket # 215

    component: computing nodes | priority: blocker | resolution: fixed

    2021-11-10 14:18:25: @vito4172 created the issue


    sql code:

    drop table if exists atest5; CREATE TABLE atest5 (one int, two int unique, three int, four int unique); INSERT INTO atest5 VALUES (1,2,3); SELECT 1 FROM atest5 a JOIN atest5 b USING (one);

    psql:privileges.sql:286: server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. psql:privileges.sql:286: fatal: connection to server was lost

    Comment on line 286 of privileges.sql

    kunlun fixed defect @high #minor 
    opened by jd-zhang 13
  • ERROR: invalid attnum for relation

    Issue migrated from trac ticket # 69

    component: computing nodes | priority: major | resolution: fixed

    2021-04-28 18:25:19: @jd-zhang created the issue


    sql code:

    drop table if exists pagg_tab_m;
    CREATE TABLE pagg_tab_m (a int, b int, c int) PARTITION BY RANGE(a, ((a+b)/2));
    CREATE TABLE pagg_tab_m_p1 PARTITION OF pagg_tab_m FOR VALUES FROM (0, 0) TO (10, 10);
    CREATE TABLE pagg_tab_m_p2 PARTITION OF pagg_tab_m FOR VALUES FROM (10, 10) TO (20, 20);
    CREATE TABLE pagg_tab_m_p3 PARTITION OF pagg_tab_m FOR VALUES FROM (20, 20) TO (30, 30);
    INSERT INTO pagg_tab_m SELECT i % 30, i % 40, i % 50 FROM generate_series(0, 2999) i;
    EXPLAIN (COSTS OFF) SELECT a, sum(b), avg(c), count(*) FROM pagg_tab_m GROUP BY a, (a+b)/2 HAVING sum(b) < 50 ORDER BY 1, 2, 3;

    ERROR: invalid attnum 4 for relation "pagg_tab_m"

    Notice, this only happens when there are 3 columns; it does not happen when the table has 4 or more columns.

    kunlun fixed defect @high #minor 
    opened by jd-zhang 12
  • error in partition join query

    Issue migrated from trac ticket # 66

    component: computing nodes | priority: major | resolution: fixed

    2021-04-28 12:29:34: @jd-zhang created the issue


    sql code:

    DROP TABLE if exists prt1;
    DROP TABLE if exists prt2;
    CREATE TABLE prt1 (a int, b int, c varchar) PARTITION BY RANGE(a);
    CREATE TABLE prt1_p1 PARTITION OF prt1 FOR VALUES FROM (0) TO (250);
    CREATE TABLE prt1_p3 PARTITION OF prt1 FOR VALUES FROM (500) TO (600);
    CREATE TABLE prt1_p2 PARTITION OF prt1 FOR VALUES FROM (250) TO (500);
    CREATE TABLE prt2 (a int, b int, c varchar) PARTITION BY RANGE(b);
    CREATE TABLE prt2_p1 PARTITION OF prt2 FOR VALUES FROM (0) TO (250);
    CREATE TABLE prt2_p2 PARTITION OF prt2 FOR VALUES FROM (250) TO (500);
    CREATE TABLE prt2_p3 PARTITION OF prt2 FOR VALUES FROM (500) TO (600);
    SELECT t1, t2 FROM prt1 t1 LEFT JOIN prt2 t2 ON t1.a = t2.b WHERE t1.b = 0 ORDER BY t1.a, t2.b;

    error output: ERROR: MySQL storage node (1, 1) returned error: 1064, You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the rig, a from tdb1_$$_public.prt1' at line 1. ERROR: MySQL storage node (2, 2) returned error: 1399, XAER_RMFAIL: The command cannot be executed when global transaction is in the NON-EXISTING state.

    The error does not happen if no data is populated.

    A similar query sql is:

    SELECT t1.a, t2.b FROM prt1 t1, prt2 t2 WHERE t1::text = t2::text AND t1.a = t2.b ORDER BY t1.a;

    kunlun fixed defect @high 
    opened by jd-zhang 12
  • Explain update on view fails

    Issue migrated from trac ticket # 28

    component: computing nodes | priority: major | resolution: fixed

    2021-04-15 12:22:18: @jd-zhang created the issue


    tdb1=# CREATE TABLE base_tbl (a int PRIMARY KEY, b varchar(50) DEFAULT 'Unspecified');
    CREATE TABLE
    tdb1=# INSERT INTO base_tbl SELECT i, 'Row ' || i FROM generate_series(-2, 2) g(i);
    INSERT 0 5
    tdb1=# CREATE VIEW rw_view1 AS SELECT * FROM base_tbl WHERE a>0;
    CREATE VIEW
    tdb1=# EXPLAIN (costs off) UPDATE rw_view1 SET a=6 WHERE a=5;
    server closed the connection unexpectedly
    This probably means the server terminated abnormally before or while processing the request.
    The connection to the server was lost. Attempting reset: Succeeded.

    kunlun fixed defect @high 
    opened by jd-zhang 12
  • Bad processing for join query

    Issue migrated from trac ticket # 81 (www.kunlunbase.com)

    component: computing nodes | priority: major | resolution: fixed

    2021-04-29 17:36:31: [email protected] created the issue


    sql code:

    drop table if exists INT8_TBL;
    CREATE TABLE INT8_TBL(q1 int8, q2 int8);
    INSERT INTO INT8_TBL VALUES(' 123 ',' 456');
    INSERT INTO INT8_TBL VALUES('123 ','4567890123456789');
    INSERT INTO INT8_TBL VALUES('4567890123456789','123');
    INSERT INTO INT8_TBL VALUES(+4567890123456789,'4567890123456789');
    INSERT INTO INT8_TBL VALUES('+4567890123456789','-4567890123456789');
    select t1.q2, count(t2.*) from int8_tbl t1 left join (select * from int8_tbl) t2 on (t1.q2 = t2.q1) group by t1.q2 order by 1;

    ERROR: invalid input syntax for integer: "▒▒▒"

    similar query is: select t1.q2, count(t2.*) from int8_tbl t1 left join (select q1, case when q2=1 then 1 else q2 end as q2 from int8_tbl) t2 on (t1.q2 = t2.q1) group by t1.q2 order by 1

    enhancement kunlun fixed @high 
    opened by jd-zhang 11
  • crash when select from information_schema.foreign_data_wrapper_options

    Issue migrated from trac ticket # 565

    component: computing nodes | priority: minor | resolution: fixed

    2022-04-08 18:21:35: [email protected] created the issue


    stack:

    (gdb) bt
    #0  0x00007fb89869467b in kill () from /lib64/libc.so.6
    #1  0x000000000090304b in my_write_core (sig=11) at stacktrace.c:420
    #2  0x0000000000902b56 in handle_fatal_signal (sig=11) at signal_handler.c:148
    #3  <signal handler called>
    #4  0x00000000007149c5 in ExecHashJoinOuterGetTuple (outerNode=0x7fb895fcfe08, hjstate=0x7fb895fcfb30, hashvalue=0x7fffd1b2a28c) at nodeHashjoin.c:864
    #5  0x0000000000713d79 in ExecHashJoinImpl (pstate=0x7fb895fcfb30, parallel=false) at nodeHashjoin.c:361
    #6  0x00000000007142d9 in ExecHashJoin (pstate=0x7fb895fcfb30) at nodeHashjoin.c:572
    #7  0x00000000006f65b1 in ExecProcNodeFirst (node=0x7fb895fcfb30) at execProcnode.c:469
    #8  0x000000000070cc1d in ExecProcNode (node=0x7fb895fcfb30) at ../../../src/include/executor/executor.h:249
    #9  0x000000000070d26b in gather_getnext (gatherstate=0x7fb895fcf958) at nodeGather.c:276
    #10 0x000000000070d0ef in ExecGather (pstate=0x7fb895fcf958) at nodeGather.c:207
    #11 0x00000000006f65b1 in ExecProcNodeFirst (node=0x7fb895fcf958) at execProcnode.c:469
    #12 0x0000000000725e11 in ExecProcNode (node=0x7fb895fcf958) at ../../../src/include/executor/executor.h:249
    #13 0x0000000000725eeb in ExecProjectSet (pstate=0x7fb895fcf780) at nodeProjectSet.c:94
    #14 0x00000000006f65b1 in ExecProcNodeFirst (node=0x7fb895fcf780) at execProcnode.c:469
    #15 0x0000000000726ec0 in ExecProcNode (node=0x7fb895fcf780) at ../../../src/include/executor/executor.h:249
    #16 0x00000000007270f0 in ExecResult (pstate=0x7fb895fcf640) at nodeResult.c:115
    #17 0x00000000006f65b1 in ExecProcNodeFirst (node=0x7fb895fcf640) at execProcnode.c:469
    #18 0x00000000007298f4 in ExecProcNode (node=0x7fb895fcf640) at ../../../src/include/executor/executor.h:249
    #19 0x0000000000729aa5 in ExecSort (pstate=0x7fb895fcf3c0) at nodeSort.c:118
    #20 0x00000000006f65b1 in ExecProcNodeFirst (node=0x7fb895fcf3c0) at execProcnode.c:469
    #21 0x00000000006eaed7 in ExecProcNode (node=0x7fb895fcf3c0) at ../../../src/include/executor/executor.h:249
    #22 0x00000000006ed7f5 in ExecutePlan (estate=0x7fb895fcf118, planstate=0x7fb895fcf3c0, use_parallel_mode=true, operation=CMD_SELECT, sendTuples=true, numberTuples=0, direction=ForwardScanDirection,
        dest=0x7fb8838d9bb8, execute_once=true) at execMain.c:1735
    #23 0x00000000006eb4a9 in standard_ExecutorRun (queryDesc=0x7fb895fe3518, direction=ForwardScanDirection, count=0, execute_once=true) at execMain.c:366
    #24 0x00000000006eb2d7 in ExecutorRun (queryDesc=0x7fb895fe3518, direction=ForwardScanDirection, count=0, execute_once=true) at execMain.c:309
    #25 0x00000000008fd085 in PortalRunSelect (portal=0x7fb895dea118, forward=true, count=0, dest=0x7fb8838d9bb8) at pquery.c:956
    #26 0x00000000008fcd2a in PortalRun (portal=0x7fb895dea118, count=9223372036854775807, isTopLevel=true, run_once=true, dest=0x7fb8838d9bb8, altdest=0x7fb8838d9bb8, completionTag=0x7fffd1b2a8f0 "")
        at pquery.c:797
    #27 0x00000000008f6b65 in exec_simple_query (query_string=0x7fb895cdb118 "SELECT * FROM information_schema.foreign_data_wrapper_options\nORDER BY 1, 2, 3;") at postgres.c:1163
    #28 0x00000000008faf15 in PostgresMain (argc=1, argv=0x7fb895d5f150, dbname=0x7fb895d5f078 "regress", username=0x7fb895d5f058 "abc") at postgres.c:4235
    #29 0x000000000084cf93 in BackendRun (port=0x7fb895c65540) at postmaster.c:4444
    #30 0x000000000084c6dd in BackendStartup (port=0x7fb895c65540) at postmaster.c:4106
    #31 0x0000000000848847 in ServerLoop () at postmaster.c:1713
    #32 0x00000000008480f8 in PostmasterMain (argc=3, argv=0x7fb895c250c0) at postmaster.c:1386
    #33 0x00000000007620f3 in main (argc=3, argv=0x7fb895c250c0) at main.c:233
    
    
    kunlun fixed defect #minor @low 
    opened by jd-zhang 11
  • Memory overrun in an internal array

    Issue migrated from trac ticket # 304

    component: computing nodes | priority: major | resolution: fixed

    2021-12-14 17:09:52: @vito4172 created the issue


    sql code:

    drop table if exists pagg_tab;
    CREATE TABLE pagg_tab (a int, b int, c text, d int) PARTITION BY LIST(c);
    CREATE TABLE pagg_tab_p1 PARTITION OF pagg_tab FOR VALUES IN ('0000', '0001', '0002', '0003');
    CREATE TABLE pagg_tab_p2 PARTITION OF pagg_tab FOR VALUES IN ('0004', '0005', '0006', '0007');
    CREATE TABLE pagg_tab_p3 PARTITION OF pagg_tab FOR VALUES IN ('0008', '0009', '0010', '0011');
    INSERT INTO pagg_tab SELECT i % 20, i % 30, to_char(i % 12, 'FM0000'), i % 30 FROM generate_series(0, 2999) i;

    EXPLAIN (COSTS OFF) SELECT a FROM pagg_tab WHERE a < 3 GROUP BY a ORDER BY 1; SELECT a FROM pagg_tab WHERE a < 3 GROUP BY a ORDER BY 1;

    WARNING: problem in alloc set MessageContext: detected write past chunk end in block 0x7fabcd794000, chunk 0x7fabcd797c90 WARNING: problem in alloc set MessageContext: detected write past chunk end in block 0x7fabcd794000, chunk 0x7fabcd797c90

    kunlun fixed defect @high 
    opened by jd-zhang 11
  • Crash when no column needed from a remote node

    Issue migrated from trac ticket # 223

    component: computing nodes | priority: blocker | resolution: fixed

    2021-11-15 10:35:07: @vito4172 created the issue


    sqlcode:

    drop table if exists onek;
    CREATE TABLE onek (
        unique1 int4, unique2 int4, two int4, four int4, ten int4, twenty int4,
        hundred int4, thousand int4, twothousand int4, fivethous int4, tenthous int4,
        odd int4, even int4, stringu1 name, stringu2 name, string4 name
    );
    select ten, sum(distinct four) from onek a group by grouping sets((ten,four),(ten)) having exists (select 1 from onek b where sum(distinct a.four) = b.four);

    groupingsets.sql:229: connection to server was lost

    kunlun fixed defect @high #major 
    opened by jd-zhang 11
  • Error reported when inserting multiple DEFAULT values into an auto-increment column

    Issue migrated from trac ticket # 900 (www.kunlunbase.com)

    component: computing nodes | priority: major

    2022-07-13 12:50:30: smith created the issue


    Reproduction

    create table t(a int auto_increment);
    insert into t values(default), (default);
    
    kunlun defect @high #trivial 
    opened by jd-zhang 0
  • Errors from the lexer_quotes file on the MySQL computing node

    Issue migrated from trac ticket # 899 (www.kunlunbase.com)

    component: computing nodes | priority: major

    2022-07-12 17:44:11: [email protected] created the issue


    Multiple statements are treated as a single statement:

    select """;
    select """";
    select """"";
    
    +-------------------------+
    | ?column?                |
    +-------------------------+
    | ";
    select "";
    select "" |
    +-------------------------+
    1 row in set (0.00 sec)
    
    
    select ''';
    select '''';
    select *'*;
    
    +-------------------------+
    | ?column?                |
    +-------------------------+
    | ';
    select '';
    select '' |
    +-------------------------+
    1 row in set (0.00 sec)
    
    select """'''"";
    select """'''\"'";
    select *'"""*;
    
    +--------------------------------+
    | ?column?                       |
    +--------------------------------+
    | "'''";
    select "'";
    select '''" |
    +--------------------------------+
    1 row in set (0.00 sec)
    
    

    Column names differ, and empty values vs \0 are handled differently: select '\n','\t','\0','\1','\a','\j';

    +----------+----------+----------+----------+----------+----------+
    | ?column? | ?column? | ?column? | ?column? | ?column? | ?column? |
    +----------+----------+----------+----------+----------+----------+
    |
            |               | \0       | \1       | a        | j        |
    +----------+----------+----------+----------+----------+----------+
    1 row in set (0.00 sec)
    
    

    Expected:

    +---+---+---+---+---+---+
    |   |   |   | 1 | a | j |
    +---+---+---+---+---+---+
    |
     |       |   | 1 | a | j |
    +---+---+---+---+---+---+
    
    

    create view "t""1" as select*from t1; drop view "t""1";

    ERROR 1149 (42000): syntax error at or near ""t""1"" at character: 13
    
    FILE:  scan.l    LINE: 1488     FUNCTION:  scanner_yyerror
    
    kunlun defect @high 
    opened by jd-zhang 0
  • Update the syntax help feature

    Issue migrated from trac ticket # 898 (www.kunlunbase.com)

    component: computing nodes | priority: major

    2022-07-12 14:31:23: smith created the issue


    Add the newly supported syntax to the syntax lookup function, namely:

    help;
    insert ignore;
    insert ... on duplicate key update;
    update/delete ... order by ... limit ...;
    replace into 
    

    For example, looking up the SHOW and REPLACE syntax:

    mysql> \h show
    Name: 'SHOW'
    Description:
    show the value of a run-time parameter
    Examples:
    SHOW name
    SHOW ALL
    SHOW DATABASES
    SHOW SCHEMAS
    SHOW TABLES
    SHOW PROCESSLIST
    
    mysql> \h replace
    Name: 'REPLACE'
    Description:
    If the old row in the table has a unique constraint violation with the new row, the old row in the table is replaced; otherwise, the new row is inserted.
    Examples:
    [ WITH [ RECURSIVE ] with_query [, ...] ]
    REPLACE INTO table_name [ AS alias ] [ ( column_name [, ...] ) ]
        [ OVERRIDING { SYSTEM | USER} VALUE ]
        { DEFAULT VALUES | VALUES ( { expression | DEFAULT } [, ...] ) [, ...] | query }
        [ RETURNING * | output_expression [ [ AS ] output_name ] [, ...] ]
    
    mysql>
    
    kunlun defect @high #trivial 
    opened by jd-zhang 4
  • Support the syntax create database if not exists

    Issue migrated from trac ticket # 897 (www.kunlunbase.com)

    component: computing nodes | priority: major

    2022-07-12 13:38:53: smith created the issue


    abc=# create database if not exists abc;
    WARNING:  database "abc" already exists
    CREATE DATABASE
    abc=#
    
    kunlun defect @high #trivial 
    opened by jd-zhang 1
  • Incorrect pushdown of type casts; aggregate functions not pushed down

    Issue migrated from trac ticket # 895 (www.kunlunbase.com)

    component: computing nodes | priority: major

    2022-07-11 17:51:40: smith created the issue


    Reproduction

    Problem 1

    create table t1(a int, b varchar(10));
    
    insert into t1 select 1, ' 01';
    
    select a::text::float = b::float from t1; -- expected: true
    

    Problem 2

    explain select count(distinct a) from t1; -- expect count(distinct a) to be fully pushed down to the storage node
    
    kunlun defect @high #trivial 
    opened by jd-zhang 3
  • analyze causes DDL replay errors on computing nodes

    Issue migrated from trac ticket # 894 (www.kunlunbase.com)

    component: computing nodes | priority: major

    2022-07-11 15:25:43: smith created the issue


    Execution logic of analyze

    1. The computing node sends the analyze command to the storage nodes;
    2. It reads the table's statistics from the storage nodes;
    3. Based on (2), it generates update/insert statements against the system tables, for example:
     UPDATE pg_class SET relpages=97,
             reltuples=16341
      WHERE relname='t1'
            AND relnamespace = 
        (SELECT oid
        FROM pg_namespace
        WHERE nspname = 'public')
    

    Problems

    1. Step 1 does not check the user's privileges;
    2. When other computing nodes replay the SQL of step (3), they run it as the user who originally executed analyze, and that user may lack the required privileges, causing DDL replication to get stuck.
    

    Reproduction

    create user test;
    create table t1(a int, b int);
    set role test;
    analyze t1; -- expected: a warning
    create table t2(a int, b int);
    analyze t2; -- expect the DDL replication process on other computing nodes to get stuck
    
    kunlun defect @high #trivial 
    opened by jd-zhang 2
Owner

zettadb (泽拓科技(深圳)有限责任公司, Zettadb Technology (Shenzhen) Co., Ltd.)