Overview

StarRocks

StarRocks, formerly known as DorisDB, is a next-generation, sub-second MPP database for all analytics scenarios, including multi-dimensional analytics, real-time analytics, and ad-hoc queries.

Technology

  • Native vectorized SQL engine: StarRocks adopts vectorization technology to make full use of the parallel computing power of CPU, achieving sub-second query returns in multi-dimensional analyses, which is 5 to 10 times faster than previous systems.
  • Simple architecture: StarRocks does not rely on any external systems. The simple architecture makes it easy to deploy, maintain and scale out. StarRocks also provides high availability, reliability, scalability and fault tolerance.
  • Standard SQL: StarRocks supports ANSI SQL syntax (with full support for TPC-H and TPC-DS). It is also compatible with the MySQL protocol. Various clients and BI software can be used to access StarRocks.
  • Smart query optimization: StarRocks can optimize complex queries through CBO (Cost Based Optimizer). With a better execution plan, the data analysis efficiency will be greatly improved.
  • Real-time update: The update model of StarRocks can perform upsert/delete operations according to the primary key, and achieves efficient queries even under concurrent updates.
  • Intelligent materialized view: The materialized view of StarRocks can be automatically updated during the data import and automatically selected when the query is executed.
  • Convenient query federation: StarRocks allows direct access to data from Hive, MySQL and Elasticsearch without importing.
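The vectorized-engine idea in the first bullet can be sketched in plain Python (an illustrative analogy, not StarRocks code): instead of evaluating an expression row by row, the engine evaluates it over column batches, which keeps the inner loop tight and cache-friendly.

```python
# Illustrative sketch only: row-at-a-time vs. column-batch (vectorized) evaluation.
def row_at_a_time(rows):
    """Evaluate price * discount one row at a time."""
    return [price * discount for price, discount in rows]

def vectorized(price_col, discount_col, batch_size=4096):
    """Evaluate the same expression over column batches."""
    out = []
    for i in range(0, len(price_col), batch_size):
        p = price_col[i:i + batch_size]
        d = discount_col[i:i + batch_size]
        out.extend(x * y for x, y in zip(p, d))
    return out
```

Both produce the same result; the batched form is what makes full use of the CPU in a real columnar engine.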

Use cases

  • StarRocks supports not only high-concurrency, low-latency point queries, but also high-throughput ad-hoc queries.
  • StarRocks unifies batch and near-real-time streaming data ingestion.
  • Pre-aggregations, flat tables, star and snowflake schemas are supported and all run at enhanced speed.
  • StarRocks hybridizes serving and analytical processing (HSAP) in an easy way. The minimalist architectural design reduces the complexity and maintenance cost of StarRocks and increases its reliability and scalability.

Install

Download the current release here.
For detailed instructions, please refer to deploy.

Links

LICENSE

Code in this repository is provided under the Elastic License 2.0. Some portions are available under open source licenses. Please see our FAQ.

Contributing to StarRocks

Many thanks for your attention to StarRocks! To have your pull request accepted, please follow CONTRIBUTING.md.

Issues
  • [Feature] Make StarRocks support FQDN

    What type of PR is this:

    • [ ] bug
    • [ ] feature
    • [x] enhancement
    • [ ] others

    Which issues of this PR fixes :

    Fixes #5527 #401

    Problem Summary(Required):

    StarRocks previously supported only IP addresses, but some users' network environments require FQDN support. You can refer to the following instructions to use FQDNs in your StarRocks cluster. First, configure the FQDN information of all cluster machines in each machine's /etc/hosts file. If you are creating a new cluster with FQDNs, nothing more is needed: this feature makes the cluster use FQDNs automatically. If you want to upgrade an old cluster, follow the steps below.
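For example, the /etc/hosts entries on every machine might look like this (hostnames and addresses below are hypothetical):

```
# /etc/hosts on every machine in the cluster (example values)
192.0.2.11  fe1.example.com
192.0.2.12  fe2.example.com
192.0.2.21  be1.example.com
```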

    Change BE IP to FQDN

    1. BE: reboot directly after upgrading to a version that contains this feature
    2. FE: execute the SQL alter system modify backend host "192.1.1.0" to "testFqdn"

    Upgrade FE from IP to FQDN

    • FE cluster upgrades with fewer than three followers are not supported
    • Change the follower nodes first, then the master node

    How to change a follower's IP to FQDN?

    1. Stop the node whose IP you want to change to FQDN, and wait for its alive state to become false
    2. Execute the following SQL: alter system modify frontend host "127.1.1.0" to "sandbox16"
    3. After the FE is upgraded to a version that includes this feature, restart it with sh bin/start_fe.sh --host_type=fqdn --daemon and wait for its alive state to become true

    How to change the master's IP to FQDN?

    When all the followers besides the master in the cluster have been upgraded, shut down the master and wait for the new master to be elected. Then restart the original master and wait for it to become an alive follower. The new follower can then be processed according to the steps above to upgrade its IP to FQDN.

    Other:

    This feature also lets you start an FE with a specified host type, for example sh bin/start_fe.sh --host_type=fqdn or sh bin/start_fe.sh --host_type=ip.

    opened by waittttting 32
  • Enable pipeline engine in 2.2 in default

    What type of PR is this:

    • [ ] bug
    • [ ] feature
    • [ ] enhancement
    • [ ] others

    Which issues of this PR fixes :

    Problem Summary(Required) :

    opened by satanson 29
  • Access s3/oss via new `S3Client` of aws-sdk-cpp

    What type of PR is this:

    • [ ] bug
    • [x] feature
    • [ ] enhancement
    • [ ] others

    Which issues of this PR fixes :

    To access s3/oss via new S3Client

    Problem Summary(Required) :

    ref: #2533 #1336

    opened by dirtysalt 29
  • Hive external execute failed,how can I set User?

    starrocks version:1.18.1


    log:

    2021-10-20 11:34:15 ERROR TThreadPoolServer:321 - Thrift Error occurred during processing of message.
    org.apache.thrift.transport.TTransportException: java.net.SocketException: Connection reset
        at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
        at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
        at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:455)
        at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:354)
        at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:243)
        at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:27)
        at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:313)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
    Caused by: java.net.SocketException: Connection reset
        at java.net.SocketInputStream.read(SocketInputStream.java:210)
        at java.net.SocketInputStream.read(SocketInputStream.java:141)
        at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
        at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
        at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:125)
        ... 9 more

    Ranger is configured for Hive but Kerberos is not enabled, how can I set User?

    type/question 
    opened by winfys 20
  • [BugFix] Call `hdfsCloseFile` in pthread

    What type of PR is this:

    • [x] bug
    • [ ] feature
    • [ ] enhancement
    • [ ] refactor
    • [ ] others

    Which issues of this PR fixes :

    Fixes #6621

    Problem Summary(Required) :

    JNI does not work well with bthread.

    • In most cases, JNI functions run in a std::thread (or pthread).
    • But when a JNI function runs in a bthread, as in some cases like cancel_fragment, it fails.

    https://github.com/apache/incubator-brpc/blob/master/docs/cn/server.md
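The workaround pattern named in the PR title, running the problematic call on a real OS thread instead of a bthread, can be sketched in Python (illustrative only; the actual fix lives in the C++ BE):

```python
# Illustrative sketch: run a call that must not execute on a cooperative
# scheduler's worker (a bthread) on a dedicated OS thread instead.
import threading

def run_in_os_thread(fn, *args):
    result = {}
    def target():
        result["value"] = fn(*args)
    t = threading.Thread(target=target)
    t.start()
    t.join()  # block the caller until the OS thread finishes
    return result["value"]
```

The caller pays a thread hand-off, but the callee (here, the JNI-backed close) always sees a genuine pthread.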

    #60 starrocks::pipeline::FragmentContextManager::~FragmentContextManager (this=0x176f25440, __in_chrg=<optimized out>) at /root/starrocks/be/src/exec/pipeline/fragment_context.h:156
    #61 std::default_delete<starrocks::pipeline::FragmentContextManager>::operator() (__ptr=0x176f25440, this=0xef96f290) at /usr/include/c++/10.3.0/bits/unique_ptr.h:85
    #62 std::__uniq_ptr_impl<starrocks::pipeline::FragmentContextManager, std::default_delete<starrocks::pipeline::FragmentContextManager> >::reset (__p=0x0, this=0xef96f290) at /usr/include/c++/10.3.0/bits/unique_ptr.h:182
    #63 std::unique_ptr<starrocks::pipeline::FragmentContextManager, std::default_delete<starrocks::pipeline::FragmentContextManager> >::reset (__p=0x0, this=0xef96f290) at /usr/include/c++/10.3.0/bits/unique_ptr.h:456
    #64 starrocks::pipeline::QueryContext::~QueryContext (this=0xef96f270, __in_chrg=<optimized out>) at /root/starrocks/be/src/exec/pipeline/query_context.cpp:31
    #65 0x0000000001974b6a in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0xef96f260) at /usr/include/c++/10.3.0/ext/atomicity.h:70
    #66 std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0xef96f260) at /usr/include/c++/10.3.0/bits/shared_ptr_base.h:151
    #67 0x0000000003322cff in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=0x7f00f6d2fb98, __in_chrg=<optimized out>) at /root/starrocks/be/src/exec/pipeline/fragment_context.h:80
    #68 std::__shared_ptr<starrocks::pipeline::QueryContext, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (this=0x7f00f6d2fb90, __in_chrg=<optimized out>) at /usr/include/c++/10.3.0/bits/shared_ptr_base.h:1183
    #69 std::shared_ptr<starrocks::pipeline::QueryContext>::~shared_ptr (this=0x7f00f6d2fb90, __in_chrg=<optimized out>) at /usr/include/c++/10.3.0/bits/shared_ptr.h:121
    #70 starrocks::PInternalServiceImplBase<doris::PBackendService>::cancel_plan_fragment (this=<optimized out>, cntl_base=<optimized out>, request=0x354e2630, result=0x1b208b540, done=<optimized out>) at /root/starrocks/be/src/service/internal_service.cpp:243
    #71 0x000000000409e82e in brpc::policy::ProcessRpcRequest (msg_base=<optimized out>) at /var/local/thirdparty/src/incubator-brpc-0.9.7/src/brpc/policy/baidu_rpc_protocol.cpp:496
    #72 0x0000000004095297 in brpc::ProcessInputMessage ([email protected]=0x45bdd900) at /var/local/thirdparty/src/incubator-brpc-0.9.7/src/brpc/input_messenger.cpp:135
    #73 0x0000000004096143 in brpc::RunLastMessage::operator() (last_msg=0x45bdd900, this=<synthetic pointer>) at /var/local/thirdparty/src/incubator-brpc-0.9.7/src/brpc/input_messenger.cpp:141
    #74 std::unique_ptr<brpc::InputMessageBase, brpc::RunLastMessage>::~unique_ptr (this=<synthetic pointer>, __in_chrg=<optimized out>) at /usr/include/c++/10.3.0/bits/unique_ptr.h:361
    #75 brpc::InputMessenger::OnNewMessages (m=0x40b00000) at /usr/include/c++/10.3.0/bits/unique_ptr.h:355
    #76 0x000000000413ce0e in brpc::Socket::ProcessEvent (arg=0x40b00000) at /var/local/thirdparty/src/incubator-brpc-0.9.7/src/brpc/socket.cpp:1017
    #77 0x000000000404ad9f in bthread::TaskGroup::task_runner (skip_remained=<optimized out>) at /var/local/thirdparty/src/incubator-brpc-0.9.7/src/bthread/task_group.cpp:296
    #78 0x00000000041d3581 in bthread_make_fcontext ()
    
    sig/DLA jni 
    opened by dirtysalt 17
  • Support new Jar UDF in FE side

    ref: #2473

    To differentiate the new UDF from the old one written in C++, we add properties and a new workflow:

    1. type: if it is 'StarrocksJar', the function is packaged in a JAR file
    2. The UDF class is located in the JAR file by searching for the name given in symbol
    3. The UDF class should have an evaluate method that matches the function signature of the definition
    4. There should be exactly one evaluate method, and it must be non-static
    5. In Thrift, we add another function binary type, SRJAR (abbreviation of StarrocksJar)

    These keywords are not finalized yet and still need discussion.

    ======

    It works like

    
    CREATE FUNCTION my_add4(INT, INT) RETURNS INT PROPERTIES (
    "symbol" = "com.dirlt.SimpleUDF",
    "type" = "StarrocksJar",
    "file" = "http://localhost:41006/work/myudf.jar"
    );
    
    

    and Java source file is

    package com.dirlt;
    
    public class SimpleUDF {
    
        public Integer evaluate(Integer a, Integer b) {
            if (a != null && b != null) {
                return a + b;
            }
            return null;
        }
    }
    
    

    By executing show full function, you can see the detailed information:

    +------------------+-------------+---------------+-------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------+
    | Signature        | Return Type | Function Type | Intermediate Type | Properties                                                                                                                                              |
    +------------------+-------------+---------------+-------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------+
    | my_add4(INT,INT) | INT         | Scalar        | NULL              | {"symbol":"com.dirlt.SimpleUDF","file":"http://localhost:41006/work/myudf.jar","type":"SRJAR","md5":"716c6b08413b8ddc39517c0b3becd754"} |
    +------------------+-------------+---------------+-------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------+
    
    opened by dirtysalt 17
  • [Feature] support extract fields from parquet struct

    What type of PR is this:

    • [ ] bug
    • [x] feature
    • [ ] enhancement
    • [ ] refactor
    • [ ] others

    Which issues of this PR fixes :

    Fixes #

    Problem Summary(Required) :

    Support loading parquet structs into StarRocks with SQL such as:

    LOAD LABEL xxx
    (
        DATA INFILE("xxx")
        INTO TABLE xxx 
        FORMAT AS "parquet"
        SET (
            col_json_int=get_json_int(`col_parquet_struct`, "$.s0")
        )
    )
    WITH BROKER xxx;
    

    In the above statement:

    1. The parquet struct type is first converted to the StarRocks JSON type
    2. Then get_json_int extracts $.s0 from the JSON
    3. Finally, the column col_json_int is filled with the value of $.s0

    This PR contains several modifications:

    • Allow converting from parquet struct/list/map to JSON as default type
    • Make get_json_string support JSON type as argument, to reduce implicit cast during load
    • Support several get_json_int functions in loading
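The extraction step can be sketched in Python (a simplified model of get_json_int for the "$.field" paths used above; not the BE implementation):

```python
import json

def get_json_int(json_str, path):
    """Extract an integer field from a JSON document for a simple "$.field" path."""
    assert path.startswith("$.")
    value = json.loads(json_str).get(path[2:])
    return int(value) if value is not None else None
```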
    opened by mofeiatwork 16
  • Aws sdk can be slow when get ec2 metadata in Aws::Client::ClientConfiguration

    What type of PR is this:

    • [ ] bug
    • [ ] feature
    • [x] enhancement
    • [ ] others

    Which issues of this PR fixes :

    Fixes #

    Problem Summary(Required) :

    When we initialize ClientConfiguration with the default constructor, it tries to determine the current region using EC2 metadata. There is also an exclusive lock in the SDK code, which makes queries slow when we have many files to scan.

    ClientConfiguration::ClientConfiguration()
    {
        setLegacyClientConfigurationParameters(*this);
        retryStrategy = InitRetryStrategy();
    
        if (region.empty() &&
            Aws::Utils::StringUtils::ToLower(Aws::Environment::GetEnv("AWS_EC2_METADATA_DISABLED").c_str()) != "true")
        {
            auto client = Aws::Internal::GetEC2MetadataClient();
            if (client)
            {
                region = client->GetCurrentRegion();
            }
        }
        if (!region.empty())
        {
            return;
        }
        region = Aws::String(Aws::Region::US_EAST_1);
    }
    
    Aws::String EC2MetadataClient::GetCurrentRegion() const
            {
                if (!m_region.empty())
                {
                    return m_region;
                }
    
                AWS_LOGSTREAM_TRACE(m_logtag.c_str(), "Getting current region for ec2 instance");
    
                Aws::StringStream ss;
                ss << m_endpoint << EC2_REGION_RESOURCE;
                std::shared_ptr<HttpRequest> regionRequest(CreateHttpRequest(ss.str(), HttpMethod::HTTP_GET,
                                                                             Aws::Utils::Stream::DefaultResponseStreamFactoryMethod));
                {
                    std::lock_guard<std::recursive_mutex> locker(m_tokenMutex);
                    if (m_tokenRequired)
                    {
                        regionRequest->SetHeaderValue(EC2_IMDS_TOKEN_HEADER, m_token);
                    }
                }
    

    I have test results: for the query below, the time cost can be optimized from 200s+ to under 10s. The reason is that after this PR we can increase the concurrency, and the time-costly operation during ClientConfiguration initialization is removed.

    MySQL [iceberg_10t]> SELECT sum(LO_EXTENDEDPRICE * LO_DISCOUNT) AS revenue FROM lineorder_flat WHERE LO_ORDERDATE = 19970101 AND LO_DISCOUNT BETWEEN 1 AND 3 AND LO_QUANTITY < 25;
    

    you can also refer to: https://github.com/aws/aws-sdk-cpp/issues/1511 https://github.com/aws/aws-sdk-cpp/issues/1440
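Per the quoted constructor, the EC2 metadata call is skipped entirely when the AWS_EC2_METADATA_DISABLED environment variable is "true" (the region then falls back to us-east-1 unless set explicitly). A sketch of that escape hatch:

```python
# Sketch: the quoted ClientConfiguration constructor only queries EC2 metadata
# when AWS_EC2_METADATA_DISABLED is not "true", so setting it avoids the
# slow IMDS round trip (the region then falls back to us-east-1).
import os

def region_lookup_disabled():
    return os.environ.get("AWS_EC2_METADATA_DISABLED", "").lower() == "true"

os.environ["AWS_EC2_METADATA_DISABLED"] = "true"
```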

    opened by caneGuy 16
  • Attempt to free invalid pointer 0x623ff80

    Steps to reproduce the behavior (Required)

    1. Start a cluster that includes 3 FEs and 4 BEs, using the release here.
    2. Build a BE from commit 92d8297667cf0ffaeb1406c1d7443ec68e5da3df of the main branch.
    3. Replace one BE of the cluster with the one built in step 2.
    4. After about one hour, the BE hits an exception and exits.

    Expected behavior (Required)

    The BE keeps running normally.

    Real behavior (Required)

    I0914 17:40:26.355070 1166569 internal_service.cpp:241] exec plan fragment, fragment_instance_id=ca55666a-153f-11ec-ba58-6c92bf928229, coord=TNetworkAddress(hostname=********, port=9020), backend=7 is_pipeline 0
    I0914 17:40:26.355098 1166569 plan_fragment_executor.cpp:68] Prepare(): query_id=ca55666a-153f-11ec-ba58-6c92bf928226 fragment_instance_id=ca55666a-153f-11ec-ba58-6c92bf928229 backend_num=7
    I0914 17:40:26.355139 1166569 plan_fragment_executor.cpp:124] Using query memory limit: 2.00 GB
    I0914 17:40:26.355481 1166418 plan_fragment_executor.cpp:228] Open(): fragment_instance_id=ca55666a-153f-11ec-ba58-6c92bf928229
    src/tcmalloc.cc:332] Attempt to free invalid pointer 0x623ff80
    *** Aborted at 1631612426 (unix time) try "date -d @1631612426" if you are using GNU date ***
    PC: @     0x7effb7190387 __GI_raise
    *** SIGABRT (@0x1f80011cb88) received by PID 1166216 (TID 0x7efc00568700) from PID 1166216; stack trace: ***
        @          0x3145132 google::(anonymous namespace)::FailureSignalHandler()
        @     0x7effb8e3b630 (unknown)
        @     0x7effb7190387 __GI_raise
        @     0x7effb7191a78 __GI_abort
        @          0x140b0a0 _ZN8tcmalloc3LogENS_7LogModeEPKciNS_7LogItemES3_S3_S3_.cold
        @          0x4247609 (anonymous namespace)::InvalidFree()
        @          0x1edb505 starrocks::vectorized::timestamp::to_string()
        @          0x1ed9265 starrocks::vectorized::TimestampValue::to_string()
        @          0x23778c3 starrocks::cast_to_string<>()
        @          0x238a4af starrocks::ColumnValueRange<>::to_olap_filter()
        @          0x249bdc5 starrocks::vectorized::OlapScanNode::_start_scan()
        @          0x249c7ca starrocks::vectorized::OlapScanNode::get_next()
        @          0x25c0669 starrocks::vectorized::ProjectNode::get_next()
        @          0x247a6a4 starrocks::vectorized::DistinctStreamingNode::get_next()
        @          0x1ef22c7 starrocks::PlanFragmentExecutor::_get_next_internal_vectorized()
        @          0x1ef34ef starrocks::PlanFragmentExecutor::_open_internal_vectorized()
        @          0x1ef40eb starrocks::PlanFragmentExecutor::open()
        @          0x1e8f77f starrocks::FragmentExecState::execute()
        @          0x1e909a0 starrocks::FragmentMgr::exec_actual()
        @          0x1e958dc std::_Function_handler<>::_M_invoke()
        @          0x1fd8754 starrocks::ThreadPool::dispatch_thread()
        @          0x1fe6bf5 starrocks::Thread::supervise_thread()
        @     0x7effb8e33ea5 start_thread
        @     0x7effb72589fd __clone
    

    StarRocks version (Required)

    • commit 92d8297
    • OS CentOS7.2
    • gcc (GCC) 9.3.1 20200408 (Red Hat 9.3.1-2)
    type/bug 
    opened by mchades 16
  • Refactor decimal multiplication

    What type of PR is this:

    • [x] bug
    • [ ] feature
    • [ ] enhancement
    • [ ] others

    Which issues of this PR fixes :

    Problem Summary(Required) :

    The old decimal multiplication scale-adjustment rule is not robust, because a narrow decimal multiplied by another narrow decimal yields a narrow decimal too. For example, decimal64(10,2) * decimal64(10,2) yields decimal64(18,4), and the result may overflow. So we must cast the narrow decimal to a wide decimal when necessary to avoid overflow.

    The new decimal multiplication scale-adjustment rules are as follows: decimal(p1,s1) * decimal(p2,s2) => decimal(p,s). First, denote p' = p1 + p2 and s' = s1 + s2.

    1. In case p' <= 38, the result never overflows; use the narrowest decimal type that can hold the result, so p = p1 + p2 and s = s1 + s2. Examples: decimal32(4,3) * decimal32(4,3) => decimal32(8,6); decimal64(15,3) * decimal32(9,4) => decimal128(24,7).
    2. In case p' > 38 and s' <= 38, the multiplication is computable but the result may overflow, so use decimal128 arithmetic and adopt the maximum decimal precision (38) as the precision of the result, so p = 38 and s = s1 + s2. Example: decimal128(23,5) * decimal64(18,4) => decimal128(38,9).
    3. In case s' > 38, the result is too large to hold in a decimal128, so report the error: decimal's scale exceeds limit 38.

    Decimal arithmetic has been tested with overflow checking both disabled and enabled in the BE, so we now turn on overflow checking on the BE for decimal arithmetic operations.
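The three rules above can be sketched as a small Python function (illustrative; the decimal type names and width caps follow the examples in this PR description):

```python
def mul_result_type(p1, s1, p2, s2, limit=38):
    """Result type of decimal(p1,s1) * decimal(p2,s2) per the rules above."""
    p, s = p1 + p2, s1 + s2
    if s > limit:                      # rule 3: scale too large even for decimal128
        raise ValueError("decimal's scale exceeds limit 38")
    p = min(p, limit)                  # rule 2: clamp precision to 38
    for name, cap in (("decimal32", 9), ("decimal64", 18), ("decimal128", 38)):
        if p <= cap:                   # rule 1: narrowest type that holds the result
            return f"{name}({p},{s})"
```

For instance, mul_result_type(15, 3, 9, 4) reproduces the decimal128(24,7) example given above.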

    opened by satanson 15
  • [Cherrypick] Fix crash of sha2 function with null parameter

    What type of PR is this:

    • [x] bug
    • [ ] feature
    • [ ] enhancement
    • [ ] others

    Which issues of this PR fixes :

    Problem Summary(Required) :

    Cherrypicked from #4217.

    Fix sha2 function crash on a null parameter.

    opened by mofeiatwork 14
  • [Refactor] ddlStmt dealed by ddlExecutor and then transmited to metadata

    Signed-off-by: zombee0 [email protected]

    What type of PR is this:

    • [x] refactor

    Which issues of this PR fixes :

    Fixes #

    Problem Summary(Required) :

    Refactor: DDL statements are handled by ddlExecutor and then transmitted to the metadata layer. This part covers createDbStmt.

    opened by zombee0 1
  • start be failed.

    Steps to reproduce the behavior (Required)

    ./start_be.sh

    Expected behavior (Required)

    start successfully.

    Real behavior (Required)

    start time: Sat Jun 25 23:12:32 CST 2022
    *** Check failure stack trace: ***
        @          0x3f714ad  google::LogMessage::Fail()
        @          0x3f7391f  google::LogMessage::SendToLog()
        @          0x3f70ffe  google::LogMessage::Flush()
        @          0x3f73f29  google::LogMessageFatal::~LogMessageFatal()
        @          0x1a48ebc  starrocks::TabletManager::load_tablet_from_meta()
        @          0x1a29ec4  _ZNSt17_Function_handlerIFbllSt17basic_string_viewIcSt11char_traitsIcEEEZN9starrocks7DataDir4loadEvEUlliS3_E0_E9_M_invokeERKSt9_Any_dataOlSC_OS3_
        @          0x1a59395  _ZNSt17_Function_handlerIFbSt17basic_string_viewIcSt11char_traitsIcEES3_EZN9starrocks17TabletMetaManager4walkEPNS5_7KVStoreERKSt8functionIFbllS3_EEEUlS3_S3_E_E9_M_invokeERKSt9_Any_dataOS3_SJ_
        @          0x1bf39e0  starrocks::KVStore::iterate()
        @          0x1a5b89d  starrocks::TabletMetaManager::walk()
        @          0x1a2b5b1  starrocks::DataDir::load()
        @          0x1a14bf6  _ZNSt6thread11_State_implINS_8_InvokerISt5tupleIJZN9starrocks13StorageEngine14load_data_dirsERKSt6vectorIPNS3_7DataDirESaIS7_EEEUlvE_EEEEE6_M_runEv
        @          0x59fb4d0  execute_native_thread_routine
        @     0x7f4f36fd5e65  start_thread
        @     0x7f4f365f088d  __clone
        @              (nil)  (unknown)
    

    StarRocks version (Required)

    • You can get the StarRocks version by executing SQL select current_version()
    • d8c635c
    type/bug 
    opened by yongbingwang 0
  • [Feature][MaterializedView]add RefreshMaterializedViewStatement for MaterializedView

    What type of PR is this:

    • [ ] bug
    • [x] feature
    • [ ] enhancement
    • [ ] refactor
    • [ ] others

    Which issues of this PR fixes :

    Fixes #4783

    Problem Summary(Required) :

    opened by zhuxt2015 1
  • [Enhancement] optimize function if for nullable input

    What type of PR is this:

    • [ ] bug
    • [ ] feature
    • [x] enhancement
    • [ ] refactor
    • [ ] others

    Problem Summary(Required) :

    Processed rows: 565115211. Baseline:

    PROJECT (plan_node_id=4):
    CommonMetrics:
     - CloseTime: 501ns
     - OperatorTotalTime: 5s112ms
    

    patched:

    PROJECT (plan_node_id=4):
    CommonMetrics:
     - CloseTime: 4.357us
     - OperatorTotalTime: 2s123ms
    
    opened by stdpain 0
  • [BugFix] try to parse it when casting string to json (backport #7835)

    This is an automatic backport of pull request #7835 done by Mergify.



    opened by mergify[bot] 0
  • starrocks_be_resource_group_mem_allocated_bytes in metric for resource group Display is not accurate

    Steps to reproduce the behavior (Required)

    1. CREATE TABLE '...'
    2. INSERT INTO '....'
    3. SELECT '....'

    Expected behavior (Required)

    Real behavior (Required)

    StarRocks version (Required)

    • You can get the StarRocks version by executing SQL select current_version()
    mysql> select current_version();
    +----------------------------+
    | current_version()          |
    +----------------------------+
    | BRANCH-2.3-RELEASE 648bbc5 |
    +----------------------------+
    
    type/bug 
    opened by colorfulu 0
Releases(2.1.8)
  • 2.1.8(Jun 8, 2022)

    Release date: June 10, 2022

    Improvements

    • The concurrency control mechanism used for internal processing workloads such as schema changes is optimized to reduce the pressure on frontend (FE) metadata. As such, load jobs are less likely to pile up and slow down if these load jobs are concurrently run to load a large amount of data. #6560 #6804
    • The performance of StarRocks in loading data at a high frequency is improved. #6532 #6533

    Bug Fixes

    The following bugs are fixed:

    • ALTER operation logs do not record all information about LOAD statements. Therefore, after you perform an ALTER operation on a routine load job, the metadata of the job is lost after checkpoints are created. #6936
    • A deadlock may occur if you stop a routine load job. #6450
    • By default, a backend (BE) uses the default UTC+8 time zone for a load job. If your server uses the UTC time zone, 8 hours are added to the timestamps in the DateTime column of the table that is loaded by using a Spark load job. #6592
    • The GET_JSON_STRING function cannot process non-JSON strings. If you extract a JSON value from a JSON object or array, the function returns NULL. The function has been optimized to return an equivalent JSON-formatted STRING value for a JSON object or array. #6426
    • If the data volume is large, a schema change may fail due to excessive memory consumption. Optimizations have been made to allow you to specify memory consumption limits at all stages of a schema change. #6705
    • If the number of duplicate values in a column of a table that is being compacted exceeds 0x40000000, the compaction is suspended. #6513
    • After an FE restarts, it encounters high I/O and abnormally increasing disk usage due to a few issues in BDB JE v7.3.8 and shows no sign of restoring to normal. The FE is restored to normal after it rolls back to BDB JE v7.3.7. #6634
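The documented GET_JSON_STRING behavior change can be modeled in Python (a sketch for simple "$.key" paths; not the StarRocks implementation):

```python
import json

def get_json_string(doc, path):
    """Scalar values come back as strings; objects/arrays come back as JSON text."""
    value = json.loads(doc).get(path[2:])  # handles simple "$.key" paths only
    if value is None:
        return None
    return value if isinstance(value, str) else json.dumps(value)
```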

    Full Changelog: https://github.com/StarRocks/starrocks/compare/2.1.7...2.1.8

    Thanks to

    @Astralidea, @Linkerist, @RowenWoo, @chaoyli, @gengjun-git, @meegoo, @mergify, @rickif, @stdpain, @xiaoyong-z

    Source code(tar.gz)
    Source code(zip)
  • 2.0.7(Jun 13, 2022)

    Release date: June 13, 2022

    Bug Fixes

    The following bugs are fixed:

    • If the number of duplicate values in a column of a table that is being compacted exceeds 0x40000000, the compaction is suspended. #6513
    • After an FE restarts, it encounters high I/O and abnormally increasing disk usage due to a few issues in BDB JE v7.3.8 and shows no sign of restoring to normal. The FE is restored to normal after it rolls back to BDB JE v7.3.7. #6634

    Full Changelog: https://github.com/StarRocks/starrocks/compare/2.0.6...2.0.7

    Thanks to

    @Astralidea, @RowenWoo, @gengjun-git, @meegoo, @mergify, @xiaoyong-z

    Source code(tar.gz)
    Source code(zip)
  • 2.2.1(Jun 6, 2022)

    Release date: June 2, 2022

    Improvements

    • Optimized the data loading performance and reduced long tail latency by reconstructing part of the hotspot code and reducing lock granularity. #6641
    • Added the CPU and memory usage information of the machines on which BEs are deployed for each query to the FE audit log. #6208 #6209
    • Supported JSON data types in the tables that use the Primary Key model and tables that use the Unique Key model. #6544
    • Reduced FE load by reducing lock granularity and deduplicating BE report requests. Optimized the report performance when a large number of BEs are deployed, and solved the issue of Routine Load tasks getting stuck in a large cluster. #6293

    Bug Fixes

    The following bugs are fixed:

    • An error occurs when StarRocks parses the escape characters specified in the SHOW FULL TABLES FROM DatabaseName statement. #6559
    • FE disk space usage rises sharply (Fix this bug by rolling back the BDBJE version). #6708
    • BEs become faulty because relevant fields cannot be found in the data returned after columnar scanning is enabled (enable_docvalue_scan=true). #6600
    Source code(tar.gz)
    Source code(zip)
  • 2.1.7(May 26, 2022)

    Release date: May 26, 2022

    Improvements

    For window functions in which the frame is set to ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, if the partition involved in a calculation is large, StarRocks caches all data of the partition before it performs the calculation. In this situation, a large number of memory resources are consumed. StarRocks has been optimized not to cache all data of the partition in this situation. #5829
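The optimization above rests on the fact that a ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW frame can be computed incrementally, without first materializing the whole partition. A sketch for a running sum:

```python
def running_sum(partition):
    """Running aggregate over an UNBOUNDED PRECEDING .. CURRENT ROW frame."""
    out, acc = [], 0
    for value in partition:  # single pass; no need to cache the partition
        acc += value
        out.append(acc)
    return out
```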

    Bug Fixes

    The following bugs are fixed:

    • When data is loaded into a table that uses the Primary Key model, data processing errors may occur if the creation time of each data version stored in the system does not monotonically increase due to reasons such as backward-moved system time and related unknown bugs. Such data processing errors cause backends (BEs) to stop. #6046
    • Some graphical user interface (GUI) tools automatically configure the set_sql_limit variable. As a result, the SQL statement ORDER BY LIMIT is ignored, and consequently an incorrect number of rows are returned for queries. #5966
    • If the DROP SCHEMA statement is executed on a database, the database is forcibly deleted and cannot be restored. #6201
    • When JSON-formatted data is loaded, BEs stop if the data contains JSON format errors. For example, key-value pairs are not separated by commas (,). #6098
    • When a large amount of data is being loaded in a highly concurrent manner, tasks that are run to write data to disks are piled up on BEs. In this situation, the BEs may stop. #3877
    • StarRocks estimates the amount of memory that is required before it performs a schema change on a table. If the table contains a large number of STRING fields, the memory estimation result may be inaccurate. In this situation, if the estimated amount of memory that is required exceeds the maximum memory that is allowed for a single schema change operation, schema change operations that are supposed to be properly run encounter errors. #6322
    • After a schema change is performed on a table that uses the Primary Key model, a "duplicate key xxx" error may occur when data is loaded into that table. #5878
    • If low-cardinality optimization is performed during Shuffle Join operations, partitioning errors may occur. #4890
    • If a colocation group (CG) contains a large number of tables and data is frequently loaded into the tables, the CG may not be able to stay in the stable state. In this case, the JOIN statement does not support Colocate Join operations. StarRocks has been optimized to wait for a little longer during data loading. This way, the integrity of the tablet replicas to which data is loaded can be maximized.

    Full Changelog: https://github.com/StarRocks/starrocks/compare/2.1.6...2.1.7

    Thanks to

    @Astralidea, @HangyuanLiu, @Linkerist, @Youngwb, @chaoyli, @decster, @dirtysalt, @gengjun-git, @meegoo, @rickif, @sevev, @stdpain, @trueeyu, @xiaoyong-z

  • 2.0.6(May 26, 2022)

    Release date: May 25, 2022

    Bug Fixes

    The following bugs are fixed:

    • Some graphical user interface (GUI) tools automatically configure the set_sql_limit variable. As a result, the SQL statement ORDER BY LIMIT is ignored, and consequently an incorrect number of rows are returned for queries. #5966
    • If a colocation group (CG) contains a large number of tables and data is frequently loaded into the tables, the CG may not be able to stay in the stable state. In this case, the JOIN statement does not support Colocate Join operations. StarRocks has been optimized to wait for a little longer during data loading. This way, the integrity of the tablet replicas to which data is loaded can be maximized.
    • If a few replicas fail to be loaded due to reasons such as heavy loads or high network latencies, cloning on these replicas is triggered. In this case, deadlocks may occur, which may cause a situation in which the loads on processes are low but a large number of requests time out. #5646 #6290
    • After the schema of a table that uses the Primary Key model is changed, a "duplicate key xxx" error may occur when data is loaded into that table. #5878
    • If the DROP SCHEMA statement is executed on a database, the database is forcibly deleted and cannot be restored. #6201

    Full Changelog: https://github.com/StarRocks/starrocks/compare/2.0.5...2.0.6

    Thanks to

    @Astralidea, @Linkerist, @chaoyli, @decster, @dirtysalt, @gengjun-git, @sevev, @stdpain

  • 2.2.0(May 25, 2022)

    New Features

    • [Preview] Resource groups are supported. By using resource groups to control CPU and memory usage, StarRocks can achieve resource isolation and rational use of resources when different tenants perform complex and simple queries in the same cluster.
    • [Preview] Java UDFs (user-defined functions) are supported. StarRocks supports writing UDFs in Java, extending StarRocks' functionality.
    • [Preview] Primary key model supports partial updates when data is loaded to the primary key model using Stream Load, Broker Load, and Routine Load. In real-time data update scenarios such as updating orders and joining multiple streams, partial updates allow users to update only a few columns.
    • [Preview] JSON data types and JSON functions are supported.
    • External tables based on Apache Hudi are supported, which further improves data lake analytics experience.
    • The following functions are supported:
      • ARRAY functions, including array_agg, array_sort, array_distinct, array_join, reverse, array_slice, array_concat, array_difference, array_overlap, and array_intersect
      • BITMAP functions, including bitmap_max and bitmap_min
      • Other functions, including retention and square
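
As a rough illustration of what a few of these ARRAY functions compute, here are Python sketches of the assumed semantics (modeled on common SQL array-function behavior, not taken from StarRocks source):

```python
def array_distinct(arr):
    """Remove duplicates while preserving first-seen order."""
    seen, out = set(), []
    for x in arr:
        if x not in seen:
            seen.add(x)
            out.append(x)
    return out

def array_join(arr, sep):
    """Concatenate elements into one string with a separator."""
    return sep.join(str(x) for x in arr)

def array_intersect(a, b):
    """Distinct elements present in both arrays."""
    b_set = set(b)
    return [x for x in array_distinct(a) if x in b_set]

print(array_distinct([1, 2, 2, 3, 1]))        # [1, 2, 3]
print(array_join([2021, 12, 1], "-"))         # 2021-12-1
print(array_intersect([1, 2, 3], [3, 4, 2]))  # [2, 3]
```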

    Improvement

    • The CBO's Parser and Analyzer have been refactored, the code structure has been optimized, and syntax such as INSERT with CTE is now supported. As a result, the performance of complex queries is improved, such as queries that reuse common table expressions (CTEs).
    • The query performance of object storage-based (AWS S3, Alibaba Cloud OSS, Tencent COS) Apache Hive external table is optimized. After optimization, the performance of object storage-based queries is comparable to that of HDFS-based queries. Also, late materialization of ORC files is supported, improving query performance of small files.
    • When external tables are used to query Apache Hive, StarRocks supports automatic and incremental updating of cached metastore data by consuming Hive Metastore events, such as data changes and partition changes. Moreover, it also supports querying DECIMAL and ARRAY data in Apache Hive.
    • The performance of the UNION ALL operator is optimized, delivering improvements of up to 2 to 25 times.
    • The pipeline engine, which can adaptively adjust query parallelism, is released, and its profile is optimized. The pipeline engine improves performance for small queries in high-concurrency scenarios.
    • StarRocks supports the loading of CSV files with multi-character row delimiters.
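
Many CSV parsers, like Python's built-in csv module, only accept single-character delimiters; that is roughly the limitation this feature lifts. A minimal sketch of multi-character splitting (the || column separator and \r\n row delimiter here are illustrative choices, not StarRocks defaults):

```python
def parse_rows(text, column_sep="||", row_sep="\r\n"):
    # Multi-character separators: plain string splitting handles what
    # csv.reader (single-character delimiters only) cannot.
    return [row.split(column_sep) for row in text.split(row_sep) if row]

print(parse_rows("a||b||c\r\n1||2||3"))  # [['a', 'b', 'c'], ['1', '2', '3']]
```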

    Bug Fixes

    The following bugs are fixed:

    • Deadlocks occur when data is loaded and changes are committed into tables based on Primary Key model. #4998
    • Some FE (including BDBJE) stability issues. #4428, #4666, #2
    • The return value overflows when the SUM function is used to calculate a large amount of data. #3944
    • The return values of the ROUND and TRUNCATE functions have precision issues. #4256
    • Some bugs detected by SQLancer. Please see SQLancer related issues.

    Others

    • The Flink connector flink-connector-starrocks supports Flink 1.14.
  • 2.0.5(May 14, 2022)

    Release date: May 13, 2022

    Upgrade recommendation: Some critical bugs related to the correctness of stored data or query results have been fixed in this version. It is recommended that you upgrade your StarRocks cluster promptly.

    Bug Fixes

    The following bugs are fixed:

    • [Critical Bug] Data may be lost as a result of BE failures. This bug is fixed by introducing a mechanism that is used to publish a specific version to multiple BEs at a time. #3140
    • [Critical Bug] If tablets are migrated in specific data ingestion phases, data continues to be written to the original disk on which the tablets are stored. As a result, data is lost, and queries cannot be run properly. #5160
    • [Critical Bug] When you run queries after you perform multiple DELETE operations, you may obtain incorrect query results if optimization on low-cardinality columns is performed for the queries. #5712
    • [Critical Bug] If a query contains a JOIN clause that is used to combine a column with DOUBLE values and a column with VARCHAR values, the query result may be incorrect. #5809
    • In certain circumstances, when you load data into your StarRocks cluster, some replicas of specific versions are marked as valid by the FEs before taking effect. At this time, if you query data of the specific versions, StarRocks cannot find the data and reports errors. #5153
    • If a parameter in the SPLIT function is set to NULL, the BEs of your StarRocks cluster may stop running. #4092
    • After your cluster is upgraded from Apache Doris 0.13 to StarRocks 1.19.x and keeps running for a period of time, a further upgrade to StarRocks 2.0.1 may fail. #5309

    Thanks to:

    @ABingHuang, @Astralidea, @HangyuanLiu, @Pslydhh, @Seaven, @Youngwb, @adzfolc, @decster, @gengjun-git, @kangkaisen, @mergify, @miomiocat, @mofeiatwork, @rickif, @satanson, @sevev, @stdpain

  • 2.1.6(May 11, 2022)

    Release date: May 10, 2022

    Bug Fixes

    The following bugs are fixed:

    • When you run queries after you perform multiple DELETE operations, you may obtain incorrect query results if optimization on low-cardinality columns is performed for the queries. #5712
    • If tablets are migrated in specific data ingestion phases, data continues to be written to the original disk on which the tablets are stored. As a result, data is lost, and queries cannot be run properly. #5160
    • If you convert values between the DECIMAL and STRING data types, the return values may have unexpected precision. #5608
    • If you multiply a DECIMAL value by a BIGINT value, an arithmetic overflow may occur. A few adjustments and optimizations are made to fix this bug. #4211

    Thanks to

    @ABingHuang, @Astralidea, @HangyuanLiu, @Seaven, @ZiheLiu, @caneGuy, @gengjun-git, @mergify, @satanson, @sevev, @silverbullet233, @stdpain

  • 2.1.5(Apr 27, 2022)

    Release date: April 27, 2022

    BugFix

    The following bugs are fixed:

    • The calculation result is incorrect when decimal multiplication overflows. After the fix, NULL is returned when decimal multiplication overflows.
    • When collected statistics deviate considerably from the actual data, the priority of Colocate Join can be lower than that of Broadcast Join. As a result, the query planner may not choose Colocate Join even when it is the more appropriate join strategy. #4817
    • Queries fail because the plan for complex expressions is wrong when more than four tables are joined.
    • BEs may stop working under Shuffle Join when the shuffle column is a low-cardinality column. #4890
    • BEs may stop working when the SPLIT function uses a NULL parameter. #4092
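
The fixed behavior in the first item above, overflow yielding NULL rather than a wrapped wrong value, can be modeled like this (the 38-digit limit is an assumption based on common DECIMAL128 semantics, not stated in the release notes):

```python
MAX_DIGITS = 38  # assumed DECIMAL128-style precision limit

def checked_decimal_mul(a: int, b: int):
    """Multiply two unscaled decimal integers; None models SQL NULL."""
    product = a * b
    if len(str(abs(product))) > MAX_DIGITS:
        return None  # overflow: return NULL instead of a wrong result
    return product

print(checked_decimal_mul(12, 34))          # 408
print(checked_decimal_mul(10**20, 10**20))  # None (10**40 has 41 digits)
```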

    Thanks to:

    @ABingHuang, @Astralidea, @HangyuanLiu, @Linkerist, @Seaven, @Youngwb, @adzfolc, @chaoyli, @decster, @gengjun-git, @kangkaisen, @liuyehcf, @meegoo, @mergify, @miomiocat, @mofeiatwork, @rickif, @satanson, @sevev, @stdpain, @trueeyu, @wyb

  • 2.0.4(Apr 18, 2022)

    Release date: April 18, 2022

    Bug Fixes

    The following bugs are fixed:

    • After deleting columns, adding new partitions, and cloning tablets, the columns' unique ids in old and new tablets may not be the same, which may cause BE to stop working because the system uses a shared tablet schema. #4514
    • When data is being loaded to a StarRocks external table, if the configured FE of the target StarRocks cluster is not the Leader, the FE stops working. #4573
    • Query results may be incorrect when a schema change is performed and a materialized view is created on a Duplicate Key table at the same time. #4839
    • The problem of possible data loss due to BE failure (solved by using Batch publish version). #3140
  • 2.1.4(Apr 12, 2022)

    Release date: April 8, 2022

    New Feature

    • The UUID_NUMERIC function is supported, which returns a LARGEINT value. Compared with the UUID function, UUID_NUMERIC delivers nearly two orders of magnitude better performance.
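
The speedup comes largely from skipping string formatting; a Python analogy of a numeric UUID (illustrative only — LARGEINT is StarRocks' 128-bit integer type, and the actual generation scheme is not specified here):

```python
import uuid

def uuid_numeric():
    # Return the 128-bit integer form of a random UUID directly,
    # skipping the hyphenated-string formatting that UUID() implies.
    return uuid.uuid4().int

n = uuid_numeric()
print(0 <= n < 2**128)  # True: fits in a 128-bit integer
```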

    BugFix

    The following bugs are fixed:

    • After deleting columns, adding new partitions, and cloning tablets, the columns' unique ids in old and new tablets may not be the same, which may cause BE to stop working because the system uses a shared tablet schema. #4514
    • When data is being loaded to a StarRocks external table, if the configured FE of the target StarRocks cluster is not the Leader, the FE stops working. #4573
    • The results of the CAST function differ between StarRocks versions 1.19 and 2.1. #4701
    • Query results may be incorrect when a schema change is performed and a materialized view is created on a Duplicate Key table at the same time. #4839
  • 2.1.3(Apr 12, 2022)

    Release date: March 19, 2022

    Bug Fixes

    The following bugs are fixed:

    • The problem of possible data loss due to BE failure (solved by using Batch publish version). #3140
    • Some queries may cause memory limit exceeded errors due to inappropriate execution plans.
    • The checksum between replicas may be inconsistent in different compaction processes. #3438
    • Queries may fail in some situations when JSON reorder projection is not processed correctly. #4056
  • 2.0.3(Mar 14, 2022)

    Release date: March 14, 2022

    BugFix

    The following bugs are fixed:

    • Queries fail when BE nodes hang (suspended animation).
    • Query fails when there is no appropriate execution plan for single-tablet table joins. #3854
    • A deadlock problem may occur when an FE node collects information to build a global dictionary for low-cardinality optimization. #3839
  • 2.1.2(Mar 14, 2022)

    Release date: March 14, 2022

    BugFix

    The following bugs are fixed:

    • In a rolling upgrade from version 1.19 to 2.1, BE nodes stop working because of unmatched chunk sizes between the two versions. #3834
    • Loading tasks may fail while StarRocks is updating from version 2.0 to 2.1. #3828
    • Query fails when there is no appropriate execution plan for single-tablet table joins. #3854
    • A deadlock problem may occur when an FE node collects information to build a global dictionary for low-cardinality optimization. #3839
    • Queries fail when BE nodes hang due to deadlock.
    • BI tools cannot connect to StarRocks when the show variables command fails. #3708
  • 2.1.0(Feb 28, 2022)

    New Features

    • [Preview] StarRocks now supports Iceberg external tables.
    • [Preview] The pipeline engine is now available. It is a new execution engine designed for multicore scheduling. The query parallelism can be adaptively adjusted without the need to set the parallel_fragment_exec_instance_num parameter. This also improves performance in high concurrency scenarios.
    • The CTAS (Create Table As Select) function is supported, making ETL and table creation easier.
    • SQL fingerprints are supported. A fingerprint is generated for each SQL statement in audit.log, which makes it easier to locate slow queries.
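
A generic sketch of how an SQL fingerprint works: literals are normalized away and the result is hashed, so structurally identical slow queries group under one fingerprint. This illustrates the general technique, not StarRocks' actual algorithm:

```python
import hashlib
import re

def sql_fingerprint(sql: str) -> str:
    s = sql.strip().lower()
    s = re.sub(r"'[^']*'", "?", s)   # string literals -> placeholder
    s = re.sub(r"\b\d+\b", "?", s)   # numeric literals -> placeholder
    s = re.sub(r"\s+", " ", s)       # normalize whitespace
    return hashlib.md5(s.encode()).hexdigest()

a = sql_fingerprint("SELECT name FROM users WHERE id = 42")
b = sql_fingerprint("select name  FROM users WHERE id = 7")
print(a == b)  # True: same query shape, same fingerprint
```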

    Improvements

    • Compaction is optimized. A flat table can contain up to 10,000 columns.
    • The performance of first-time scan and page cache is optimized. Random I/O is reduced to improve first-time scan performance. The improvement is more noticeable if first-time scan occurs on SATA disks. StarRocks' page cache can store original data, which eliminates the need for bitshuffle encoding and unnecessary decoding. This improves the cache hit rate and query efficiency.
    • Schema change is supported in the Primary Key model. You can add, delete, and modify bitmap indexes by using ALTER TABLE.
    • [Preview] The size of a string can be up to 1 MB.
    • JSON load performance is optimized. You can load more than 100 MB JSON data in a single file.
    • Bitmap index performance is optimized.
    • The performance of StarRocks Hive external tables is optimized. Data in the CSV format can be read.
    • DEFAULT CURRENT_TIMESTAMP is supported in the create table statement. #1193
    • StarRocks supports the loading of CSV files with multiple delimiters.

    BugFix

    The following bugs are fixed:

    • Auto __op mapping does not take effect if jsonpaths is specified in the command used for loading JSON data. #3405
    • BE nodes fail because the source data changes during data loading using Broker Load. #3481
    • Some SQL statements report errors after materialized views are created. #2975
    • The routine load does not work due to quoted jsonpaths. #2488
    • Query concurrency decreases sharply when the number of columns to query exceeds 200.

    Behavior Changes

    • The API for disabling a Colocation Group is changed from DELETE /api/colocate/group_stable to POST /api/colocate/group_unstable.

    Others

    • flink-connector-starrocks is now available for Flink to read StarRocks data in batches. This improves data read efficiency compared to the JDBC connector.
  • 2.0.2(Mar 2, 2022)

    Improvement

    • Memory usage is optimized. Users can specify the label_keep_max_num parameter to control the maximum number of loading jobs to retain within a period of time. This prevents full GC caused by high memory usage of FE during frequent data loading. #2410

    BugFix

    The following bugs are fixed:

    • BE nodes fail when the column decoder encounters an exception. #3510
    • Auto __op mapping does not take effect when jsonpaths is specified in the command used for loading JSON data. #3405
    • BE nodes fail because the source data changes during data loading using Broker Load. #3481
    • Some SQL statements report errors after materialized views are created. #3053
    • Query may fail if an SQL clause contains a predicate that supports global dictionary for low-cardinality optimization and a predicate that does not. #3421
  • 2.0.1(Jan 22, 2022)

    Release date: Jan 21, 2022

    Improvement

    • Add implicit_cast in Hive external table (#2829)
    • Use a read/write lock to avoid excessive CPU cost when collecting metrics (#2901)
    • Optimize some statistics collection.

    Bugfix

    • Fix the global dictionary when replicas are inconsistent. (#2700) (#2765)
    • Add exec_mem_limit for Stream Load (#2693)
    • Fix OOM when loading into the Primary Key model. (#2743) (#2777)
    • Fix BE hangs when querying external MySQL tables (#2881)
  • 2.0.0-GA(Jan 4, 2022)

    Release date: Jan 4, 2022

    New Feature

    • External Table
      • [Experimental Function] Support for Hive external tables on S3
      • DecimalV3 support for external table #425
    • Complex expressions can now be pushed down to the storage layer for computation, delivering performance gains
    • The Primary Key model is officially released. It supports Stream Load, Broker Load, and Routine Load, and also provides a second-level (seconds of latency) synchronization tool for MySQL data based on Flink-cdc

    Improvement

    • Arithmetic operators optimization
      • Optimize the performance of low-cardinality dictionaries #791
      • Optimize single-table INT scan performance #273
      • Optimize the performance of count(distinct int) with high cardinality #139 #250 #544 #570
      • Implementation-level optimization and refinement of Group by 2 int / limit / case when / not equal
    • Memory management optimization
      • Refactor the memory statistics and control framework to accurately track memory usage and resolve OOM issues
      • Optimize metadata memory usage
      • Solve the problem of large memory releases blocking execution threads for long periods
      • Add process graceful exit mechanism and support memory leak check #1093

    Bugfix

    • Fix the problem that fetching a large amount of metadata for Hive external tables times out.
    • Fix unclear error messages during materialized view creation.
    • Fix the implementation of like in the vectorized engine #722
    • Fix the error when parsing the is predicate in ALTER TABLE #725
    • Fix the problem that the curdate function cannot format dates.
  • 1.19.5(Dec 20, 2021)

  • 1.19.4(Dec 10, 2021)

    Improvement

    • Support cast(varchar as bitmap) (#1941)
    • Modify Hive external table scan scheduling strategy (#1394) (#1807)

    Bugfix

    • Fix cross join losing predicates in JoinAssociativityRule (#1918)
    • Fix the cast to decimal(0,0) bug (#1709) (#1738)
    • Fix replicate join in the plan fragment builder (#1727)
    • Fix several planner cost calculation issues
  • 1.19.3(Dec 2, 2021)

    Improvement

    Upgrade jprotobuf to enhance security (#1506)

    Major Bugfix

    • Fix some CBO bad cases and correctness issues.
    • Fix grouping sets with same column bug (#1395) (#1119)
    • Fix some date function issues (#1385) (#1627)
    • Fix streaming aggregation issue (#1584)

  • 1.19.2(Dec 2, 2021)

    Improvement

    • Bucket Shuffle Join supports right join and full outer join (#1209) (#1234)

    Major Bugfix

    • Support predicate pushdown through the repeat node (#1410) (#1417)
    • Fix a Routine Load data loss bug when the FE master changes (#1074) (#1272)
    • Fix CREATE VIEW failures with UNION (#1083)
    • Fix some Hive external table stability issues (#1408)
    • Fix errors in SELECT ... GROUP BY queries on views (#1231)
  • 1.19.1(Nov 2, 2021)

    Improvement

    • Optimize the performance of show frontends. #507 #984
    • Add monitoring of slow queries and meta logs. #502 #891
    • Optimize the fetching of Hive external metadata to achieve parallel fetching. #425 #451

    BugFix

    • Fix a Thrift protocol compatibility problem so that Hive external tables can be connected with Kerberos. #184 #947 #995 #999
    • Fix several bugs in view creation. #972 #987 #1001
    • Fix the problem that FEs cannot be upgraded in a rolling (grayscale) manner. #485 #890
  • 1.19.0(Nov 3, 2021)

    New Feature

    • Implement Global Runtime Filter, which can enable runtime filter for shuffle join.
    • CBO Planner is enabled by default, improved colocated join, bucket shuffle, statistical information estimation, etc.
    • [Experimental Function] Primary Key model release: To better support real-time/frequent update features, StarRocks has added a new table type: primary key model. The model supports Stream Load, Broker Load, Routine Load, JSON import, and also provides a second-level synchronization tool for MySQL data based on Flink-cdc.
    • [Experimental Function] Support write function for external tables. Support writing data to another StarRocks cluster table by external tables to solve the read/write separation requirement and provide better resource isolation.

    Improvement

    • Performance optimization.
      • count distinct int statement
      • group by int statement
      • or statement
    • Optimize disk balance algorithm. Data can be automatically balanced after adding disks to a single machine.
    • Support partial column export.
    • Optimize show processlist to show specific SQL.
    • Support multiple variable settings in SET_VAR.
    • Improve the error reporting information, including table_sink, routine load, creation of materialized view, etc.

    Bugfix

    • Fix the issue that a dynamic partition table cannot be created automatically after a data recovery operation is completed. #337
    • Fix the problem that the row_number function reports an error after CBO is enabled.
    • Fix the problem of the FE getting stuck due to statistics collection.
    • Fix the problem that set_var takes effect for the session but not for statements.
    • Fix the problem that select count(*) returns abnormal results on Hive partitioned external tables.
GridDB is a next-generation open source database that makes time series IoT and big data fast,and easy.

Overview GridDB is Database for IoT with both NoSQL interface and SQL Interface. Please refer to GridDB Features Reference for functionality. This rep

GridDB 1.8k Jun 27, 2022
Velox is a new C++ vectorized database acceleration library aimed to optimizing query engines and data processing systems.

Velox is a C++ database acceleration library which provides reusable, extensible, and high-performance data processing components

Facebook Incubator 893 Jun 27, 2022
A very fast lightweight embedded database engine with a built-in query language.

upscaledb 2.2.1 Fr 10. Mär 21:33:03 CET 2017 (C) Christoph Rupp, [email protected]; http://www.upscaledb.com This is t

Christoph Rupp 531 Jun 20, 2022
A redis module, similar to redis zset, but you can set multiple scores for each member to support multi-dimensional sorting

TairZset: Support multi-score sorting zset Introduction Chinese TairZset is a data structure developed based on the redis module. Compared with the na

Alibaba 43 Jun 15, 2022
The database built for IoT streaming data storage and real-time stream processing.

The database built for IoT streaming data storage and real-time stream processing.

HStreamDB 501 Jun 29, 2022
A mini database for learning database

A mini database for learning database

Chuckie Tan 3 Nov 3, 2021
SiriDB is a highly-scalable, robust and super fast time series database

SiriDB is a highly-scalable, robust and super fast time series database. Build from the ground up SiriDB uses a unique mechanism to operate without a global index and allows server resources to be added on the fly. SiriDB's unique query language includes dynamic grouping of time series for easy analysis over large amounts of time series.

SiriDB 464 Jun 13, 2022
TimescaleDB is an open-source database designed to make SQL scalable for time-series data.

An open-source time-series SQL database optimized for fast ingest and complex queries. Packaged as a PostgreSQL extension.

Timescale 13.3k Jun 27, 2022
An open-source time series database designed for simplicity, ease of use, and high performance; supports Linux and Windows, Time Series Database

PinusDB (松果时序数据库) PinusDB is a time series database designed for small and medium scale scenarios (fewer than 100,000 devices and less than 1 billion data points generated per day), with simplicity, ease of use, and high performance as its design goals. It uses SQL statements for interaction, has an extremely low learning and usage cost, and provides rich functionality and high performance. Our goal is to be the simplest, easiest-to-use, and most robust single-machine time series

null 94 Apr 27, 2022
A friendly and lightweight C++ database library for MySQL, PostgreSQL, SQLite and ODBC.

QTL QTL is a C ++ library for accessing SQL databases and currently supports MySQL, SQLite, PostgreSQL and ODBC. QTL is a lightweight library that con

null 155 Jun 26, 2022
ObjectBox C and C++: super-fast database for objects and structs

ObjectBox Embedded Database for C and C++ ObjectBox is a superfast C and C++ database for embedded devices (mobile and IoT), desktop and server apps.

ObjectBox 131 Jun 17, 2022
dqlite is a C library that implements an embeddable and replicated SQL database engine with high-availability and automatic failover

dqlite dqlite is a C library that implements an embeddable and replicated SQL database engine with high-availability and automatic failover. The acron

Canonical 3k Jun 27, 2022
ESE is an embedded / ISAM-based database engine, that provides rudimentary table and indexed access.

Extensible-Storage-Engine A Non-SQL Database Engine The Extensible Storage Engine (ESE) is one of those rare codebases having proven to have a more th

Microsoft 780 Jun 13, 2022
Nebula Graph is a distributed, fast open-source graph database featuring horizontal scalability and high availability

Nebula Graph is an open-source graph database capable of hosting super large scale graphs with dozens of billions of vertices (nodes) and trillions of edges, with milliseconds of latency.

vesoft inc. 807 Jun 30, 2022
OceanBase is an enterprise distributed relational database with high availability, high performance, horizontal scalability, and compatibility with SQL standards.

What is OceanBase database OceanBase Database is a native distributed relational database. It is developed entirely by Alibaba and Ant Group. OceanBas

OceanBase 4.4k Jun 27, 2022
Config and tools for config of tasmota devices from mysql database

tasmota-sql Tools for management of tasmota devices based on mysql. The tasconfig command can load config from tasmota and store in sql, or load from

RevK 3 Jan 8, 2022
Serverless SQLite database read from and write to Object Storage Service, run on FaaS platform.

serverless-sqlite Serverless SQLite database read from and write to Object Storage Service, run on FaaS platform. NOTES: This repository is still in t

老雷 7 May 12, 2022
Trilogy is a client library for MySQL-compatible database servers, designed for performance, flexibility, and ease of embedding.

Trilogy is a client library for MySQL-compatible database servers, designed for performance, flexibility, and ease of embedding.

GitHub 248 Jun 11, 2022
DB Browser for SQLite (DB4S) is a high quality, visual, open source tool to create, design, and edit database files compatible with SQLite.

DB Browser for SQLite What it is DB Browser for SQLite (DB4S) is a high quality, visual, open source tool to create, design, and edit database files c

null 16.7k Jun 27, 2022