Scylla

What is Scylla?

Scylla is the real-time big data database that is API-compatible with Apache Cassandra and Amazon DynamoDB. Scylla embraces a shared-nothing approach that increases throughput and storage capacity to realize order-of-magnitude performance improvements and reduce hardware costs.

For more information, please see the ScyllaDB web site.

Build Prerequisites

Scylla is fairly fussy about its build environment, requiring very recent versions of a C++20 compiler and of many libraries. The document HACKING.md includes detailed information on building and developing Scylla, but to get Scylla building quickly on (almost) any build machine, Scylla offers a frozen toolchain. This is a pre-configured Docker image that includes recent versions of all the required compilers, libraries and build tools. Using the frozen toolchain means you do not need to change anything on your build machine to meet Scylla's requirements - you only need to meet the frozen toolchain's prerequisites (mostly, having Docker or Podman available).

Building Scylla

Building Scylla with the frozen toolchain dbuild is as easy as:

$ git submodule update --init --force --recursive
$ ./tools/toolchain/dbuild ./configure.py
$ ./tools/toolchain/dbuild ninja build/release/scylla

For further information on the build system and developer workflow, please see HACKING.md.

Running Scylla

To start the Scylla server, run:

$ ./tools/toolchain/dbuild ./build/release/scylla --workdir tmp --smp 1 --developer-mode 1

This will start a Scylla node with one CPU core allocated to it and data files stored in the tmp directory. The --developer-mode flag is needed to disable the various checks Scylla performs at startup to ensure the machine is configured for maximum performance (these checks are not relevant on development workstations). Please note that you need to run Scylla with dbuild if you built it with the frozen toolchain.

For more run options, run:

$ ./tools/toolchain/dbuild ./build/release/scylla --help
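
Once the node is up, you can connect to it with a CQL client. A minimal sketch, assuming cqlsh is installed on the host and the node is listening on the default CQL address and port (127.0.0.1:9042):

$ cqlsh 127.0.0.1 9042
cqlsh> CREATE KEYSPACE ks WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
cqlsh> CREATE TABLE ks.t (pk int PRIMARY KEY, v text);
cqlsh> INSERT INTO ks.t (pk, v) VALUES (1, 'hello');
cqlsh> SELECT * FROM ks.t;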

Testing

See the test.py manual.
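
As a quick illustration, the tests can also be run through the frozen toolchain; the invocation below is a sketch, and the --mode flag and its value are assumptions here - consult the test.py manual for the supported options:

$ ./tools/toolchain/dbuild ./test.py --mode release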

Scylla APIs and compatibility

By default, Scylla is compatible with Apache Cassandra and its APIs - CQL and Thrift. There is also support for the API of Amazon DynamoDB™, which needs to be enabled and configured in order to be used. For more information on how to enable the DynamoDB™ API in Scylla, and the current compatibility of this feature as well as Scylla-specific extensions, see Alternator and Getting started with Alternator.
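
As a rough sketch of what this involves (the option names alternator_port and alternator_write_isolation, the port value, and the dummy credentials below are assumptions - see the Alternator documentation for the authoritative settings), you set the relevant options in scylla.yaml and then point any DynamoDB client at the node:

# in scylla.yaml (assumed option names and values):
#   alternator_port: 8000
#   alternator_write_isolation: only_rmw_uses_lwt

$ AWS_ACCESS_KEY_ID=none AWS_SECRET_ACCESS_KEY=none \
  aws dynamodb list-tables --endpoint-url http://127.0.0.1:8000 --region us-east-1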

Documentation

Documentation can be found here. Seastar documentation can be found here. User documentation can be found here.

Training

Training material and online courses can be found at Scylla University. The courses are free, self-paced and include hands-on examples. They cover a variety of topics including Scylla data modeling, administration, architecture, basic NoSQL concepts, using drivers for application development, Scylla setup, failover, compactions, multi-datacenters and how Scylla integrates with third-party applications.

Contributing to Scylla

If you want to report a bug or submit a pull request or a patch, please read the contribution guidelines.

If you are a developer working on Scylla, please read the developer guidelines.

Contact

  • The users mailing list and Slack channel are for users to discuss configuration, management, and operation of the ScyllaDB open-source database.
  • The developers mailing list is for developers and people interested in following the development of ScyllaDB to discuss technical topics.
Comments
  • c-s latency caused by high latency from peer node

    1. Start 2 nodes n1, n2 using recent scylla master 1fd701e
    2. Enable slow query tracing:
       curl -X POST "http://127.0.0.1:10000/storage_service/slow_query?enable=true&fast=false&threshold=80000"
       curl -X POST "http://127.0.0.2:10000/storage_service/slow_query?enable=true&fast=false&threshold=80000"
    3. Start c-s:
       cassandra-stress write no-warmup cl=TWO n=5000000 -schema 'replication(factor=2)' -port jmx=6868 -mode cql3 native -rate threads=200 -col 'size=FIXED(5) n=FIXED(8)' -pop seq=1500000000..2500000000
    4. Run repair to make c-s latency high enough to trigger the slow query tracing

    See the following trace: node 127.0.0.2 applies the write very quickly (less than 100 us), while the response from the remote node 127.0.0.1 took 295677 us. This means the ~300 ms c-s latency seen by the client (c-s) was mostly contributed by the remote node. Due to the tracing issues I reported in https://github.com/scylladb/scylla/issues/9403, we do not know where the time was spent on the remote node. It might be disk, network or cpu contention. But I have a feeling the contention comes from the network while repair runs, since we do not have a network scheduler. So the theory is that the remote node applies the write very quickly, but either the rpc message carrying the request or the one carrying the response is contended, so in the end node 127.0.0.2 sees the response with a high latency.

    cqlsh> SELECT * from system_traces.events WHERE session_id=ea0a5cc0-2021-11ec-be32-b254958ec4a2;
    
     session_id                           | event_id                             | activity                                                                                           | scylla_parent_id | scylla_span_id  | source    | source_elapsed | thread
    --------------------------------------+--------------------------------------+----------------------------------------------------------------------------------------------------+------------------+-----------------+-----------+----------------+---------
     ea0a5cc0-2021-11ec-be32-b254958ec4a2 | ea0a770a-2021-11ec-be32-b254958ec4a2 |                                                                                    Checking bounds |                0 | 373048741859841 | 127.0.0.2 |              0 | shard 0
     ea0a5cc0-2021-11ec-be32-b254958ec4a2 | ea0a770f-2021-11ec-be32-b254958ec4a2 |                                                                             Processing a statement |                0 | 373048741859841 | 127.0.0.2 |              0 | shard 0
     ea0a5cc0-2021-11ec-be32-b254958ec4a2 | ea0a781b-2021-11ec-be32-b254958ec4a2 | Creating write handler for token: -6493410074079723942 natural: {127.0.0.1, 127.0.0.2} pending: {} |                0 | 373048741859841 | 127.0.0.2 |             27 | shard 0
     ea0a5cc0-2021-11ec-be32-b254958ec4a2 | ea0a782e-2021-11ec-be32-b254958ec4a2 |                                  Creating write handler with live: {127.0.0.1, 127.0.0.2} dead: {} |                0 | 373048741859841 | 127.0.0.2 |             29 | shard 0
     ea0a5cc0-2021-11ec-be32-b254958ec4a2 | ea0a7850-2021-11ec-be32-b254958ec4a2 |                                                                 X Sending a mutation to /127.0.0.1 |                0 | 373048741859841 | 127.0.0.2 |             32 | shard 0
     ea0a5cc0-2021-11ec-be32-b254958ec4a2 | ea0a786a-2021-11ec-be32-b254958ec4a2 |                                                                     X Executing a mutation locally |                0 | 373048741859841 | 127.0.0.2 |             35 | shard 0
     ea0a5cc0-2021-11ec-be32-b254958ec4a2 | ea0a7993-2021-11ec-be32-b254958ec4a2 |                                                            Z Finished executing a mutation locally |                0 | 373048741859841 | 127.0.0.2 |             65 | shard 0
     ea0a5cc0-2021-11ec-be32-b254958ec4a2 | ea0a799c-2021-11ec-be32-b254958ec4a2 |                                                                     Got a response from /127.0.0.2 |                0 | 373048741859841 | 127.0.0.2 |             65 | shard 0
     ea0a5cc0-2021-11ec-be32-b254958ec4a2 | ea0a79e0-2021-11ec-be32-b254958ec4a2 |                                                        Z Finished Sending a mutation to /127.0.0.1 |                0 | 373048741859841 | 127.0.0.2 |             72 | shard 0
     ea0a5cc0-2021-11ec-be32-b254958ec4a2 | ea3794ed-2021-11ec-be32-b254958ec4a2 |                                                                     Got a response from /127.0.0.1 |                0 | 373048741859841 | 127.0.0.2 |         295677 | shard 0
     ea0a5cc0-2021-11ec-be32-b254958ec4a2 | ea3794f3-2021-11ec-be32-b254958ec4a2 |                                       Delay decision due to throttling: do not delay, resuming now |                0 | 373048741859841 | 127.0.0.2 |         295677 | shard 0
     ea0a5cc0-2021-11ec-be32-b254958ec4a2 | ea3797f8-2021-11ec-be32-b254958ec4a2 |                                                                    Mutation successfully completed |                0 | 373048741859841 | 127.0.0.2 |         295755 | shard 0
     ea0a5cc0-2021-11ec-be32-b254958ec4a2 | ea379808-2021-11ec-be32-b254958ec4a2 |                                                               Done processing - preparing a result |                0 | 373048741859841 | 127.0.0.2 |         295756 | shard 0
    
    (13 rows)
    
    latency Backport candidate Eng-3 
    opened by asias 203
  • Node stuck 12 hours in decommission

    Installation details
    Scylla version (or git commit hash): 3.1.0.rc5-0.20190902.623ea5e3d
    Cluster size: 4
    OS (RHEL/CentOS/Ubuntu/AWS AMI): ami-02055ad6b0af5669b

    We see that the Thrift and CQL ports are closed, but the nodetool command is stuck:

    Sep 04 22:43:31 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] compaction - Compacted 1 sstables to [/var/lib/scylla/data/keyspace1/standard1-2b9793b0ce8f11e98b9a000000000009/mc-280840-big-Data.db:level=2,
    Sep 04 22:43:31 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] LeveledManifest - Adding high-level (L3) /var/lib/scylla/data/keyspace1/standard1-2b9793b0ce8f11e98b9a000000000009/mc-279118-big-Data.db to ca
    Sep 04 22:43:31 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] compaction - Compacting [/var/lib/scylla/data/keyspace1/standard1-2b9793b0ce8f11e98b9a000000000009/mc-280714-big-Data.db:level=1, /var/lib/scy
    Sep 04 22:43:31 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] LeveledManifest - Adding high-level (L3) /var/lib/scylla/data/keyspace1/standard1-2b9793b0ce8f11e98b9a000000000009/mc-277690-big-Data.db to ca
    Sep 04 22:43:31 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] LeveledManifest - Adding high-level (L3) /var/lib/scylla/data/keyspace1/standard1-2b9793b0ce8f11e98b9a000000000009/mc-277690-big-Data.db to ca
    Sep 04 22:43:31 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] LeveledManifest - Adding high-level (L3) /var/lib/scylla/data/keyspace1/standard1-2b9793b0ce8f11e98b9a000000000009/mc-277690-big-Data.db to ca
    Sep 04 22:43:31 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] LeveledManifest - Adding high-level (L3) /var/lib/scylla/data/keyspace1/standard1-2b9793b0ce8f11e98b9a000000000009/mc-277690-big-Data.db to ca
    Sep 04 22:43:31 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] LeveledManifest - Adding high-level (L3) /var/lib/scylla/data/keyspace1/standard1-2b9793b0ce8f11e98b9a000000000009/mc-277690-big-Data.db to ca
    Sep 04 22:43:31 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] LeveledManifest - Adding high-level (L3) /var/lib/scylla/data/keyspace1/standard1-2b9793b0ce8f11e98b9a000000000009/mc-277690-big-Data.db to ca
    Sep 04 22:43:31 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] LeveledManifest - Adding high-level (L3) /var/lib/scylla/data/keyspace1/standard1-2b9793b0ce8f11e98b9a000000000009/mc-277690-big-Data.db to ca
    Sep 04 22:43:31 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] LeveledManifest - Adding high-level (L3) /var/lib/scylla/data/keyspace1/standard1-2b9793b0ce8f11e98b9a000000000009/mc-277690-big-Data.db to ca
    Sep 04 22:43:31 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] LeveledManifest - Adding high-level (L3) /var/lib/scylla/data/keyspace1/standard1-2b9793b0ce8f11e98b9a000000000009/mc-277690-big-Data.db to ca
    Sep 04 22:43:31 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] LeveledManifest - Adding high-level (L3) /var/lib/scylla/data/keyspace1/standard1-2b9793b0ce8f11e98b9a000000000009/mc-277690-big-Data.db to ca
    Sep 04 22:43:31 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] LeveledManifest - Adding high-level (L3) /var/lib/scylla/data/keyspace1/standard1-2b9793b0ce8f11e98b9a000000000009/mc-277690-big-Data.db to ca
    Sep 04 22:43:31 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] LeveledManifest - Adding high-level (L3) /var/lib/scylla/data/keyspace1/standard1-2b9793b0ce8f11e98b9a000000000009/mc-277690-big-Data.db to ca
    Sep 04 22:43:31 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] LeveledManifest - Adding high-level (L3) /var/lib/scylla/data/keyspace1/standard1-2b9793b0ce8f11e98b9a000000000009/mc-277690-big-Data.db to ca
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] storage_service - DECOMMISSIONING: unbootstrap done
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] storage_service - Thrift server stopped
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] storage_service - CQL server stopped
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] storage_service - DECOMMISSIONING: shutdown rpc and cql server done
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] storage_service - DECOMMISSIONING: stop batchlog_manager done
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] gossip - My status = LEFT
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] gossip - No local state or state is in silent shutdown, not announcing shutdown
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] storage_service - DECOMMISSIONING: stop_gossiping done
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 12] rpc - client 10.0.87.51:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 10] rpc - client 10.0.63.72:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 9] rpc - client 10.0.87.51:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 8] rpc - client 10.0.63.72:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 9] rpc - client 10.0.63.72:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 13] rpc - client 10.0.63.72:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] rpc - client 10.0.113.188:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 6] rpc - client 10.0.87.51:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] rpc - client 10.0.87.51:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 13] rpc - client 10.0.87.51:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] rpc - client 10.0.87.51:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 4] rpc - client 10.0.87.51:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 13] rpc - client 10.0.63.72:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] rpc - client 10.0.113.188:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 4] rpc - client 10.0.63.72:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 13] rpc - client 10.0.87.51:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 2] rpc - client 10.0.87.51:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 3] rpc - client 10.0.113.188:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 7] rpc - client 10.0.63.72:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 2] rpc - client 10.0.63.72:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 3] rpc - client 10.0.63.72:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 5] rpc - client 10.0.63.72:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 5] rpc - client 10.0.113.188:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 5] rpc - client 10.0.63.72:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] rpc - client 10.0.87.51:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] rpc - client 10.0.113.188:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] rpc - client 10.0.63.72:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 6] compaction - Compacted 1 sstables to [/var/lib/scylla/data/keyspace1/standard1-2b9793b0ce8f11e98b9a000000000009/mc-286124-big-Data.db:level=2,
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 6] compaction - Compacting [/var/lib/scylla/data/keyspace1/standard1-2b9793b0ce8f11e98b9a000000000009/mc-285942-big-Data.db:level=1, ]
    Sep 04 22:43:34 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 6] compaction - Compacted 1 sstables to [/var/lib/scylla/data/keyspace1/standard1-2b9793b0ce8f11e98b9a000000000009/mc-286138-big-Data.db:level=2,
    Sep 04 22:43:34 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 6] compaction - Compacting [/var/lib/scylla/data/keyspace1/standard1-2b9793b0ce8f11e98b9a000000000009/mc-285956-big-Data.db:level=1, ]
    Sep 04 22:43:34 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 7] compaction - Compacted 9 sstables to [/var/lib/scylla/data/keyspace1/standard1-2b9793b0ce8f11e98b9a000000000009/mc-282149-big-Data.db:level=3,
    Sep 04 22:43:34 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 7] compaction - Compacting [/var/lib/scylla/data/keyspace1/standard1-2b9793b0ce8f11e98b9a000000000009/mc-278747-big-Data.db:level=3, /var/lib/scy
    Sep 04 22:43:34 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 10] compaction - Compacting [/var/lib/scylla/data/system/large_rows-40550f66085839a09430f27fc08034e9/mc-4252-big-Data.db:level=0, /var/lib/scylla
    Sep 04 22:43:35 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 10] compaction - Compacted 2 sstables to [/var/lib/scylla/data/system/large_rows-40550f66085839a09430f27fc08034e9/mc-4266-big-Data.db:level=0, ].
    

    nodetool has been stuck for more than 12 hours:

    [[email protected] centos]# ps -fp 119286
    UID         PID   PPID  C STIME TTY          TIME CMD
    centos   119286   1759  0 Sep04 ?        00:00:00 /bin/sh /usr/bin/nodetool -u cassandra -pw cassandra decommission
    [[email protected] centos]# date
    Thu Sep  5 12:57:52 UTC 2019
    [[email protected] centos]#
    

    Probably related to the nodetool drain stuck issue #4891 and the old issue #961.

    bug Regression 
    opened by bentsi 197
  • resharding + alternator LWT -> Scylla service takes 36 minutes to start

    Installation details

    Kernel Version: 5.13.0-1021-aws

    Scylla version (or git commit hash): 2022.1~rc3-20220406.5cc3b678c with build-id 48dfae0735cd8efc4ae2f5c777beaee2a1e89f4a

    Cluster size: 4 nodes (i3.4xlarge)

    Scylla Nodes used in this run:

    • alternator-48h-2022-1-db-node-81cb61d9-5 (34.241.246.188 | 10.0.2.75) (shards: 14)
    • alternator-48h-2022-1-db-node-81cb61d9-4 (52.30.41.107 | 10.0.3.6) (shards: 14)
    • alternator-48h-2022-1-db-node-81cb61d9-3 (52.214.185.121 | 10.0.1.89) (shards: 14)
    • alternator-48h-2022-1-db-node-81cb61d9-2 (34.242.68.250 | 10.0.1.112) (shards: 14)
    • alternator-48h-2022-1-db-node-81cb61d9-1 (176.34.90.117 | 10.0.0.237) (shards: 14)

    OS / Image: ami-071c70d20f0fdbb2c (aws: eu-west-1)

    Test: longevity-alternator-200gb-48h-test

    Test id: 81cb61d9-8d3f-45ae-8b50-f7882b4a6af8

    Test name: longevity/longevity-alternator-200gb-48h-test

    Test config file(s):

    Issue description

    At 2022-04-16 09:34:34.496 a restart with resharding nemesis started on node 4. The nemesis shuts down the scylla service, edits the murmur3_partitioner_ignore_msb_bits config value to force resharding, and starts the scylla service again, expecting initialization to take 5 minutes at most. When the service was started this time, however, it took 36 minutes for Scylla to start:

    2022-04-16T09:36:11+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - installing SIGHUP handler
    2022-04-16T09:36:12+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - Scylla version 2022.1.rc3-0.20220406.5cc3b678c with build-id 48dfae0735cd8efc4ae2f5c777beaee2a1e89f4a starting ...
    2022-04-16T09:36:12+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting prometheus API server
    2022-04-16T09:36:12+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting tokens manager
    2022-04-16T09:36:12+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting effective_replication_map factory
    2022-04-16T09:36:12+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting migration manager notifier
    2022-04-16T09:36:12+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting lifecycle notifier
    2022-04-16T09:36:12+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - creating tracing
    2022-04-16T09:36:12+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - creating snitch
    2022-04-16T09:36:12+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting API server
    2022-04-16T09:36:12+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - Scylla API server listening on 127.0.0.1:10000 ...
    2022-04-16T09:36:12+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - initializing storage service
    2022-04-16T09:36:12+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting gossiper
    2022-04-16T09:36:12+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - seeds={10.0.0.237}, listen_address=10.0.3.6, broadcast_address=10.0.3.6
    2022-04-16T09:36:12+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - initializing storage service
    2022-04-16T09:36:12+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting per-shard database core
    2022-04-16T09:36:12+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - creating and verifying directories
    2022-04-16T09:36:12+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting database
    2022-04-16T09:40:31+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting storage proxy
    2022-04-16T09:40:31+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting migration manager
    2022-04-16T09:40:31+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting query processor
    2022-04-16T09:40:31+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - initializing batchlog manager
    2022-04-16T09:40:31+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - loading system sstables
    2022-04-16T09:40:31+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - loading non-system sstables
    2022-04-16T09:50:19+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting view update generator
    2022-04-16T09:50:19+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - setting up system keyspace
    2022-04-16T09:50:19+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting commit log
    2022-04-16T09:50:19+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - initializing migration manager RPC verbs
    2022-04-16T09:50:19+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - initializing storage proxy RPC verbs
    2022-04-16T09:50:19+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting streaming service
    2022-04-16T09:50:19+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting hinted handoff manager
    2022-04-16T09:50:19+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting messaging service
    2022-04-16T09:51:20+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting CDC Generation Management service
    2022-04-16T09:51:20+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting CDC log service
    2022-04-16T09:51:20+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting storage service
    2022-04-16T09:51:20+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting sstables loader
    2022-04-16T10:07:47+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting system distributed keyspace
    2022-04-16T10:11:47+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting tracing
    2022-04-16T10:11:48+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - SSTable data integrity checker is disabled.
    2022-04-16T10:11:48+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting auth service
    2022-04-16T10:11:50+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting batchlog manager
    2022-04-16T10:11:50+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting load meter
    2022-04-16T10:11:50+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting cf cache hit rate calculator
    2022-04-16T10:11:50+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting view update backlog broker
    2022-04-16T10:11:53+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - Waiting for gossip to settle before accepting client requests...
    2022-04-16T10:12:06+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - allow replaying hints
    2022-04-16T10:12:07+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - Launching generate_mv_updates for non system tables
    2022-04-16T10:12:07+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting the view builder
    2022-04-16T10:12:25+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting native transport
    2022-04-16T10:12:26+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting the expiration service
    2022-04-16T10:12:27+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - serving
    2022-04-16T10:12:27+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - Scylla version 2022.1.rc3-0.20220406.5cc3b678c initialization completed.
    

    Namely, the loading phases took way longer than usual.
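
    For reference, the restart-with-resharding step described above boils down to roughly the following sequence (a sketch only; the nemesis is implemented in the test framework, and the sed command and msb value here are illustrative, not the actual implementation):

    sudo systemctl stop scylla-server
    sudo sed -i 's/^murmur3_partitioner_ignore_msb_bits:.*/murmur3_partitioner_ignore_msb_bits: 15/' /etc/scylla/scylla.yaml   # illustrative value
    sudo systemctl start scylla-server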

    • Restore Monitor Stack command: $ hydra investigate show-monitor 81cb61d9-8d3f-45ae-8b50-f7882b4a6af8
    • Restore monitor on AWS instance using Jenkins job
    • Show all stored logs command: $ hydra investigate show-logs 81cb61d9-8d3f-45ae-8b50-f7882b4a6af8

    Logs:

    db-cluster: https://cloudius-jenkins-test.s3.amazonaws.com/81cb61d9-8d3f-45ae-8b50-f7882b4a6af8/20220424_100353/db-cluster-81cb61d9.tar.gz
    loader-set: https://cloudius-jenkins-test.s3.amazonaws.com/81cb61d9-8d3f-45ae-8b50-f7882b4a6af8/20220424_100353/loader-set-81cb61d9.tar.gz
    monitor-set: https://cloudius-jenkins-test.s3.amazonaws.com/81cb61d9-8d3f-45ae-8b50-f7882b4a6af8/20220424_100353/monitor-set-81cb61d9.tar.gz

    Jenkins job URL

    high Regression compaction resharding 
    opened by ShlomiBalalis 143
  • Coredumps during restart_then_repair_node nemesis

    This is Scylla's bug tracker, to be used for reporting bugs only. If you have a question about Scylla, and not a bug, please ask it in our mailing-list at [email protected] or in our slack channel.

    • [x] I have read the disclaimer above, and I am reporting a suspected malfunction in Scylla.

    Installation details
    Scylla version (or git commit hash): 3.1.0.rc4-0.20190826.e4a39ed31
    Cluster size: 4
    OS (RHEL/CentOS/Ubuntu/AWS AMI): ami-0ececa5cacea302a8

    During restart_then_repair_node, the target node (#5) suffered from streaming exceptions:

    (DatabaseLogEvent Severity.CRITICAL): type=DATABASE_ERROR regex=Exception  line_number=26510 node=Node longevity-large-partitions-4d-3-1-db-node-49dc20d4-5 [52.50.193.198 | 10.0.133.1] (seed: False)
    2019-08-27T22:03:51+00:00  ip-10-0-133-1 !WARNING | scylla: [shard 0] range_streamer - Bootstrap with 10.0.10.203 for keyspace=scylla_bench failed, took 773.173 seconds: streaming::stream_exception (Stream failed)
    

    While 2 other nodes suffered from semaphore timeouts (could be related to #4615):

    (DatabaseLogEvent Severity.CRITICAL): type=DATABASE_ERROR regex=Exception  line_number=14442 node=Node longevity-large-partitions-4d-3-1-db-node-49dc20d4-4 [34.245.137.134 | 10.0.178.144] (seed: False)
    2019-08-27T22:06:09+00:00  ip-10-0-178-144 !ERR     | scylla: [shard 7] storage_proxy - Exception when communicating with 10.0.178.144: seastar::semaphore_timed_out (Semaphore timedout)
    

    and created coredumps like so:

    (CoreDumpEvent Severity.CRITICAL): node=Node longevity-large-partitions-4d-3-1-db-node-49dc20d4-4 [34.245.137.134 | 10.0.178.144] (seed: False)
    corefile_urls=
    https://storage.cloud.google.com/upload.scylladb.com/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.4406.1566942687000000.gz/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.4406.1566942687000000.gz.aa
    backtrace=           PID: 4406 (scylla)
               UID: 996 (scylla)
               GID: 1001 (scylla)
            Signal: 6 (ABRT)
         Timestamp: Tue 2019-08-27 21:51:27 UTC (1min 55s ago)
      Command Line: /usr/bin/scylla --blocked-reactor-notify-ms 500 --abort-on-lsa-bad-alloc 1 --abort-on-seastar-bad-alloc --log-to-syslog 1 --log-to-stdout 0 --default-log-level info --network-stack posix --io-properties-file=/etc/scylla.d/io_properties.yaml --cpuset 0-11
        Executable: /opt/scylladb/libexec/scylla
     Control Group: /
           Boot ID: 9f0393fe20f04dfab829e5bb5cc4bdad
        Machine ID: df877a200226bc47d06f26dae0736ec9
          Hostname: ip-10-0-178-144.eu-west-1.compute.internal
          Coredump: /var/lib/systemd/coredump/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.4406.1566942687000000
           Message: Process 4406 (scylla) of user 996 dumped core.
                    
                    Stack trace of thread 4430:
                    #0  0x00007f95cfcc953f raise (libc.so.6)
                    #1  0x00007f95cfcb395e abort (libc.so.6)
                    #2  0x00000000040219ab on_allocation_failure (scylla)
    

    Here I'll add links to all coredumps of this kind; there is currently a bit of an issue with uploading them, so hopefully at least one of them was uploaded correctly:

    https://storage.cloud.google.com/upload.scylladb.com/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.16744.1566943317000000.gz/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.16744.1566943317000000.gz.aa
    https://storage.cloud.google.com/upload.scylladb.com/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.17150.1566944278000000.gz/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.17150.1566944278000000.gz.aa
    https://storage.cloud.google.com/upload.scylladb.com/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.17686.1566944862000000.gz/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.17686.1566944862000000.gz.aa
    https://storage.cloud.google.com/upload.scylladb.com/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.18167.1566945503000000.gz/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.18167.1566945503000000.gz.aa
    https://storage.cloud.google.com/upload.scylladb.com/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.18731.1566946375000000.gz/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.18731.1566946375000000.gz.aa
    https://storage.cloud.google.com/upload.scylladb.com/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.19423.1566947108000000.gz/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.19423.1566947108000000.gz.aa
    https://storage.cloud.google.com/upload.scylladb.com/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.20078.1566947830000000.gz/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.20078.1566947830000000.gz.aa
    
    
    (CoreDumpEvent Severity.CRITICAL): node=Node longevity-large-partitions-4d-3-1-db-node-49dc20d4-4 [34.245.137.134 | 10.0.178.144] (seed: False)
    corefile_urls=
    https://storage.cloud.google.com/upload.scylladb.com/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.20078.1566947830000000.gz/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.20078.1566947830000000.gz.aa
    backtrace=           PID: 20078 (scylla)
               UID: 996 (scylla)
               GID: 1001 (scylla)
            Signal: 6 (ABRT)
         Timestamp: Tue 2019-08-27 23:17:10 UTC (1min 57s ago)
      Command Line: /usr/bin/scylla --blocked-reactor-notify-ms 500 --abort-on-lsa-bad-alloc 1 --abort-on-seastar-bad-alloc --log-to-syslog 1 --log-to-stdout 0 --default-log-level info --network-stack posix --io-properties-file=/etc/scylla.d/io_properties.yaml --cpuset 0-11
        Executable: /opt/scylladb/libexec/scylla
     Control Group: /
           Boot ID: 9f0393fe20f04dfab829e5bb5cc4bdad
        Machine ID: df877a200226bc47d06f26dae0736ec9
          Hostname: ip-10-0-178-144.eu-west-1.compute.internal
          Coredump: /var/lib/systemd/coredump/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.20078.1566947830000000
           Message: Process 20078 (scylla) of user 996 dumped core.
                    
                    Stack trace of thread 20089:
                    #0  0x00007fa119b2853f raise (libc.so.6)
                    #1  0x00007fa119b1295e abort (libc.so.6)
                    #2  0x0000000000469b8e _ZN8logalloc18allocating_section7reserveEv (scylla)
    

    Different backtraces and their translations during the run:

    Aug 27 22:03:29 ip-10-0-10-203.eu-west-1.compute.internal scylla[5160]:  [shard 10] seastar - Failed to allocate 851968 bytes
    0x00000000041808b2
    0x000000000406d935
    0x000000000406dc35
    0x000000000406dce3
    0x00007f7420b4602f
    /opt/scylladb/libreloc/libc.so.6+0x000000000003853e
    /opt/scylladb/libreloc/libc.so.6+0x0000000000022894
    0x00000000040219aa
    0x0000000004022a0e
    0x000000000131bcb3
    0x000000000137d78f
    0x000000000131725f
    0x000000000136c8b1
    0x00000000014555b5
    0x0000000001296442
    0x000000000145c35a
    0x000000000406ae21
    0x000000000406b01e
    0x000000000414d06d
    0x00000000041776ab
    0x00000000040390dd
    /opt/scylladb/libreloc/libpthread.so.0+0x000000000000858d
    /opt/scylladb/libreloc/libc.so.6+0x00000000000fd6a2
    
    

    translated:

    void seastar::backtrace<seastar::backtrace_buffer::append_backtrace()::{lambda(seastar::frame)#1}>(seastar::backtrace_buffer::append_backtrace()::{lambda(seastar::frame)#1}&&) at /usr/include/boost/program_options/variables_map.hpp:146
    seastar::backtrace_buffer::append_backtrace() at /usr/include/boost/program_options/variables_map.hpp:146
     (inlined by) print_with_backtrace at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../src/core/reactor.cc:1104
    seastar::print_with_backtrace(char const*) at /usr/include/boost/program_options/variables_map.hpp:146
    sigabrt_action at /usr/include/boost/program_options/variables_map.hpp:146
     (inlined by) operator() at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../src/core/reactor.cc:5012
     (inlined by) _FUN at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../src/core/reactor.cc:5008
    ?? ??:0
    ?? ??:0
    ?? ??:0
    seastar::memory::on_allocation_failure(unsigned long) at memory.cc:?
    operator new(unsigned long) at ??:?
     (inlined by) operator new(unsigned long) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../src/core/memory.cc:1674
    seastar::circular_buffer<sstables::promoted_index_block, std::allocator<sstables::promoted_index_block> >::expand(unsigned long) at crtstuff.c:?
     (inlined by) seastar::circular_buffer<sstables::promoted_index_block, std::allocator<sstables::promoted_index_block> >::expand(unsigned long) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/circular_buffer.hh:301
    sstables::promoted_index_blocks_reader::process_state(seastar::temporary_buffer<char>&, sstables::promoted_index_blocks_reader::m_parser_context&) at crtstuff.c:?
     (inlined by) seastar::circular_buffer<sstables::promoted_index_block, std::allocator<sstables::promoted_index_block> >::maybe_expand(unsigned long) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/circular_buffer.hh:331
     (inlined by) void seastar::circular_buffer<sstables::promoted_index_block, std::allocator<sstables::promoted_index_block> >::emplace_back<position_in_partition, position_in_partition, unsigned long&, unsigned long&, std::optional<sstables::deletion_time> >(position_in_partition&&, position_in_partition&&, unsigned long&, unsigned long&, std::optional<sstables::deletion_time>&&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/circular_buffer.hh:391
     (inlined by) sstables::promoted_index_blocks_reader::process_state(seastar::temporary_buffer<char>&, sstables::promoted_index_blocks_reader::m_parser_context&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/./sstables/index_entry.hh:416
    data_consumer::continuous_data_consumer<sstables::promoted_index_blocks_reader>::process(seastar::temporary_buffer<char>&) at crtstuff.c:?
     (inlined by) sstables::promoted_index_blocks_reader::process_state(seastar::temporary_buffer<char>&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/./sstables/index_entry.hh:456
     (inlined by) data_consumer::continuous_data_consumer<sstables::promoted_index_blocks_reader>::process(seastar::temporary_buffer<char>&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/sstables/consumer.hh:404
    seastar::future<> seastar::repeat<seastar::future<> seastar::input_stream<char>::consume<std::reference_wrapper<sstables::promoted_index_blocks_reader> >(std::reference_wrapper<sstables::promoted_index_blocks_reader>&&)::{lambda()#1}>(seastar::future<> seastar::input_stream<char>::consume<std::reference_wrapper<sstables::promoted_index_blocks_reader> >(std::reference_wrapper<sstables::promoted_index_blocks_reader>&&)::{lambda()#1}) at crtstuff.c:?
     (inlined by) seastar::future<seastar::consumption_result<char> > std::__invoke_impl<seastar::future<seastar::consumption_result<char> >, sstables::promoted_index_blocks_reader&, seastar::temporary_buffer<char> >(std::__invoke_other, sstables::promoted_index_blocks_reader&, seastar::temporary_buffer<char>&&) at /usr/include/c++/8/bits/invoke.h:60
     (inlined by) std::__invoke_result<sstables::promoted_index_blocks_reader&, seastar::temporary_buffer<char> >::type std::__invoke<sstables::promoted_index_blocks_reader&, seastar::temporary_buffer<char> >(sstables::promoted_index_blocks_reader&, seastar::temporary_buffer<char>&&) at /usr/include/c++/8/bits/invoke.h:96
     (inlined by) std::result_of<sstables::promoted_index_blocks_reader& (seastar::temporary_buffer<char>&&)>::type std::reference_wrapper<sstables::promoted_index_blocks_reader>::operator()<seastar::temporary_buffer<char> >(seastar::temporary_buffer<char>&&) const at /usr/include/c++/8/bits/refwrap.h:319
     (inlined by) seastar::future<> seastar::input_stream<char>::consume<std::reference_wrapper<sstables::promoted_index_blocks_reader> >(std::reference_wrapper<sstables::promoted_index_blocks_reader>&&)::{lambda()#1}::operator()() at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/iostream-impl.hh:227
     (inlined by) seastar::future<> seastar::repeat<seastar::future<> seastar::input_stream<char>::consume<std::reference_wrapper<sstables::promoted_index_blocks_reader> >(std::reference_wrapper<sstables::promoted_index_blocks_reader>&&)::{lambda()#1}>(seastar::future<> seastar::input_stream<char>::consume<std::reference_wrapper<sstables::promoted_index_blocks_reader> >(std::reference_wrapper<sstables::promoted_index_blocks_reader>&&)::{lambda()#1}) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future-util.hh:285
    sstables::index_reader::advance_upper_past(position_in_partition_view) at crtstuff.c:?
     (inlined by) seastar::future<> seastar::input_stream<char>::consume<sstables::promoted_index_blocks_reader>(sstables::promoted_index_blocks_reader&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/iostream-impl.hh:236
     (inlined by) data_consumer::continuous_data_consumer<sstables::promoted_index_blocks_reader>::consume_input() at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/sstables/consumer.hh:377
     (inlined by) sstables::index_entry::get_next_pi_blocks() at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/./sstables/index_entry.hh:614
     (inlined by) sstables::index_reader::advance_upper_past(position_in_partition_view) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/sstables/index_reader.hh:582
    seastar::future<bool> seastar::futurize<seastar::future<bool> >::apply<sstables::index_reader::advance_lower_and_check_if_present(dht::ring_position_view, std::optional<position_in_partition_view>)::{lambda()#1}::operator()() const::{lambda()#1}>(sstables::index_reader::advance_lower_and_check_if_present(dht::ring_position_view, std::optional<position_in_partition_view>)::{lambda()#1}::operator()() const::{lambda()#1}&&, std::tuple<>&&) [clone .constprop.7996] at sstables.cc:?
     (inlined by) seastar::apply_helper<sstables::index_reader::advance_lower_and_check_if_present(dht::ring_position_view, std::optional<position_in_partition_view>)::{lambda()#1}::operator()() const::{lambda()#1}, std::tuple<>&&, std::integer_sequence<unsigned long> >::apply({lambda()#1}&&, std::tuple) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/apply.hh:35
     (inlined by) auto seastar::apply<sstables::index_reader::advance_lower_and_check_if_present(dht::ring_position_view, std::optional<position_in_partition_view>)::{lambda()#1}::operator()() const::{lambda()#1}>(sstables::index_reader::advance_lower_and_check_if_present(dht::ring_position_view, std::optional<position_in_partition_view>)::{lambda()#1}::operator()() const::{lambda()#1}&&, std::tuple<>&&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/apply.hh:43
     (inlined by) seastar::future<bool> seastar::futurize<seastar::future<bool> >::apply<sstables::index_reader::advance_lower_and_check_if_present(dht::ring_position_view, std::optional<position_in_partition_view>)::{lambda()#1}::operator()() const::{lambda()#1}>(sstables::index_reader::advance_lower_and_check_if_present(dht::ring_position_view, std::optional<position_in_partition_view>)::{lambda()#1}::operator()() const::{lambda()#1}&&, std::tuple<>&&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:1392
    _ZN7seastar12continuationIZZNS_6futureIJEE9then_implIZN8sstables12index_reader34advance_lower_and_check_if_presentEN3dht18ring_position_viewESt8optionalI26position_in_partition_viewEEUlvE_NS1_IJbEEEEET0_OT_ENKUlvE_clEvEUlSF_E_JEE15run_and_disposeEv at crtstuff.c:?
     (inlined by) seastar::future<bool> seastar::future<>::then<sstables::index_reader::advance_lower_and_check_if_present(dht::ring_position_view, std::optional<position_in_partition_view>)::{lambda()#1}::operator()() const::{lambda()#1}, seastar::future<bool> >(sstables::index_reader::advance_lower_and_check_if_present(dht::ring_position_view, std::optional<position_in_partition_view>)::{lambda()#1}::operator()() const::{lambda()#1}&&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:917
     (inlined by) sstables::index_reader::advance_lower_and_check_if_present(dht::ring_position_view, std::optional<position_in_partition_view>)::{lambda()#1}::operator()() const at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/sstables/index_reader.hh:775
     (inlined by) seastar::apply_helper<sstables::index_reader::advance_lower_and_check_if_present(dht::ring_position_view, std::optional<position_in_partition_view>)::{lambda()#1}, std::tuple<>&&, std::integer_sequence<unsigned long> >::apply({lambda()#1}&&, std::tuple<>) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/apply.hh:35
     (inlined by) auto seastar::apply<sstables::index_reader::advance_lower_and_check_if_present(dht::ring_position_view, std::optional<position_in_partition_view>)::{lambda()#1}>(sstables::index_reader::advance_lower_and_check_if_present(dht::ring_position_view, std::optional<position_in_partition_view>)::{lambda()#1}&&, std::tuple<>&&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/apply.hh:43
     (inlined by) seastar::future<bool> seastar::futurize<seastar::future<bool> >::apply<sstables::index_reader::advance_lower_and_check_if_present(dht::ring_position_view, std::optional<position_in_partition_view>)::{lambda()#1}>(sstables::index_reader::advance_lower_and_check_if_present(dht::ring_position_view, std::optional<position_in_partition_view>)::{lambda()#1}&&, std::tuple<>&&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:1392
     (inlined by) _ZZZN7seastar6futureIJEE9then_implIZN8sstables12index_reader34advance_lower_and_check_if_presentEN3dht18ring_position_viewESt8optionalI26position_in_partition_viewEEUlvE_NS0_IJbEEEEET0_OT_ENKUlvE_clEvENUlSE_E_clINS_12future_stateIJEEEEEDaSE_ at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:950
     (inlined by) _ZN7seastar12continuationIZZNS_6futureIJEE9then_implIZN8sstables12index_reader34advance_lower_and_check_if_presentEN3dht18ring_position_viewESt8optionalI26position_in_partition_viewEEUlvE_NS1_IJbEEEEET0_OT_ENKUlvE_clEvEUlSF_E_JEE15run_and_disposeEv at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:377
    seastar::reactor::run_tasks(seastar::reactor::task_queue&) at /usr/include/boost/program_options/variables_map.hpp:146
    seastar::reactor::run_some_tasks() at /usr/include/boost/program_options/variables_map.hpp:146
    seastar::reactor::run_some_tasks() at /usr/include/boost/program_options/variables_map.hpp:146
     (inlined by) seastar::reactor::run() at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../src/core/reactor.cc:4243
    seastar::smp::configure(boost::program_options::variables_map, seastar::reactor_config)::{lambda()#3}::operator()() const at /usr/include/boost/program_options/variables_map.hpp:146
    std::function<void ()>::operator()() const at /usr/include/c++/8/bits/std_function.h:687
     (inlined by) seastar::posix_thread::start_routine(void*) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../src/core/posix.cc:52
    
    Aug 27 22:18:38 ip-10-0-10-203.eu-west-1.compute.internal scylla[31878]:  [shard 0] seastar - Failed to allocate 131072 bytes
    0x00000000041808b2
    0x000000000406d935
    0x000000000406dc35
    0x000000000406dce3
    0x00007f2e2d4c002f
    /opt/scylladb/libreloc/libc.so.6+0x000000000003853e
    /opt/scylladb/libreloc/libc.so.6+0x0000000000022894
    0x00000000040219aa
    0x00000000040235f4
    0x0000000004124fac
    0x0000000000cf0f7d
    0x0000000000cf1027
    0x00000000036ec4bb
    0x0000000004000a85
    0x00000000040030f4
    0x0000000001523df0
    0x0000000001581d82
    0x00000000015a5249
    0x00000000015a7e14
    0x0000000001094cdf
    0x000000000109798d
    0x000000000109872d
    0x000000000109b983
    0x000000000109d0d5
    0x00000000010c786f
    0x00000000010985c1
    0x000000000109b983
    0x000000000109d0d5
    0x00000000010e9aae
    0x00000000010eaa19
    0x0000000000e52db4
    0x000000000406ae21
    0x000000000406b01e
    0x000000000414d06d
    0x0000000003fd51d6
    0x0000000003fd6922
    0x00000000007d9d69
    /opt/scylladb/libreloc/libc.so.6+0x0000000000024412
    0x000000000083a1fd
    
    

    Translation:

    void seastar::backtrace<seastar::backtrace_buffer::append_backtrace()::{lambda(seastar::frame)#1}>(seastar::backtrace_buffer::append_backtrace()::{lambda(seastar::frame)#1}&&) at /usr/include/boost/program_options/variables_map.hpp:146
    seastar::backtrace_buffer::append_backtrace() at /usr/include/boost/program_options/variables_map.hpp:146
     (inlined by) print_with_backtrace at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../src/core/reactor.cc:1104
    seastar::print_with_backtrace(char const*) at /usr/include/boost/program_options/variables_map.hpp:146
    sigabrt_action at /usr/include/boost/program_options/variables_map.hpp:146
     (inlined by) operator() at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../src/core/reactor.cc:5012
     (inlined by) _FUN at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../src/core/reactor.cc:5008
    ?? ??:0
    ?? ??:0
    ?? ??:0
    seastar::memory::on_allocation_failure(unsigned long) at memory.cc:?
    __libc_posix_memalign at ??:?
     (inlined by) posix_memalign at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../src/core/memory.cc:1601
    seastar::temporary_buffer<unsigned char>::aligned(unsigned long, unsigned long) at /usr/include/boost/program_options/variables_map.hpp:146
     (inlined by) seastar::file::read_state<unsigned char>::read_state(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../include/seastar/core/file.hh:481
     (inlined by) seastar::shared_ptr_no_esft<seastar::file::read_state<unsigned char> >::shared_ptr_no_esft<unsigned long&, unsigned long&, unsigned long&, unsigned int&, unsigned int&>(unsigned long&, unsigned long&, unsigned long&, unsigned int&, unsigned int&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../include/seastar/core/shared_ptr.hh:164
     (inlined by) seastar::lw_shared_ptr<seastar::file::read_state<unsigned char> > seastar::lw_shared_ptr<seastar::file::read_state<unsigned char> >::make<unsigned long&, unsigned long&, unsigned long&, unsigned int&, unsigned int&>(unsigned long&, unsigned long&, unsigned long&, unsigned int&, unsigned int&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../include/seastar/core/shared_ptr.hh:266
     (inlined by) seastar::lw_shared_ptr<seastar::file::read_state<unsigned char> > seastar::make_lw_shared<seastar::file::read_state<unsigned char>, unsigned long&, unsigned long&, unsigned long&, unsigned int&, unsigned int&>(unsigned long&, unsigned long&, unsigned long&, unsigned int&, unsigned int&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../include/seastar/core/shared_ptr.hh:416
     (inlined by) seastar::posix_file_impl::dma_read_bulk(unsigned long, unsigned long, seastar::io_priority_class const&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../src/core/reactor.cc:2352
    checked_file_impl::dma_read_bulk(unsigned long, unsigned long, seastar::io_priority_class const&)::{lambda()#1}::operator()() const at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/sstring.hh:257
     (inlined by) auto do_io_check<checked_file_impl::dma_read_bulk(unsigned long, unsigned long, seastar::io_priority_class const&)::{lambda()#1}, , void>(std::function<void (std::__exception_ptr::exception_ptr)> const&, checked_file_impl::dma_read_bulk(unsigned long, unsigned long, seastar::io_priority_class const&)::{lambda()#1}&&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/disk-error-handler.hh:73
    checked_file_impl::dma_read_bulk(unsigned long, unsigned long, seastar::io_priority_class const&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/sstring.hh:257
    tracking_file_impl::dma_read_bulk(unsigned long, unsigned long, seastar::io_priority_class const&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/reader_concurrency_semaphore.cc:184
    seastar::future<seastar::temporary_buffer<char> > seastar::file::dma_read_bulk<char>(unsigned long, unsigned long, seastar::io_priority_class const&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../include/seastar/core/file.hh:421
     (inlined by) seastar::file_data_source_impl::issue_read_aheads(unsigned int)::{lambda()#1}::operator()() const at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../src/core/fstream.cc:256
     (inlined by) seastar::future<seastar::temporary_buffer<char> > seastar::futurize<seastar::future<seastar::temporary_buffer<char> > >::apply<seastar::file_data_source_impl::issue_read_aheads(unsigned int)::{lambda()#1}>(seastar::file_data_source_impl::issue_read_aheads(unsigned int)::{lambda()#1}&&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../include/seastar/core/future.hh:1402
     (inlined by) seastar::file_data_source_impl::issue_read_aheads(unsigned int) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../src/core/fstream.cc:255
    seastar::file_data_source_impl::get() at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../src/core/fstream.cc:173
    seastar::data_source::get() at /usr/include/c++/8/variant:1356
     (inlined by) seastar::future<> seastar::input_stream<char>::consume<std::reference_wrapper<sstables::data_consume_rows_context_m> >(std::reference_wrapper<sstables::data_consume_rows_context_m>&&)::{lambda()#1}::operator()() at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/iostream-impl.hh:206
     (inlined by) seastar::future<> seastar::repeat<seastar::future<> seastar::input_stream<char>::consume<std::reference_wrapper<sstables::data_consume_rows_context_m> >(std::reference_wrapper<sstables::data_consume_rows_context_m>&&)::{lambda()#1}>(seastar::future<> seastar::input_stream<char>::consume<std::reference_wrapper<sstables::data_consume_rows_context_m> >(std::reference_wrapper<sstables::data_consume_rows_context_m>&&)::{lambda()#1}) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future-util.hh:285
    sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}::operator()() const::{lambda()#2}::operator()() const at crtstuff.c:?
     (inlined by) seastar::future<> seastar::input_stream<char>::consume<sstables::data_consume_rows_context_m>(sstables::data_consume_rows_context_m&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/iostream-impl.hh:236
     (inlined by) data_consumer::continuous_data_consumer<sstables::data_consume_rows_context_m>::consume_input() at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/sstables/consumer.hh:377
     (inlined by) sstables::data_consume_context<sstables::data_consume_rows_context_m>::read() at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/sstables/data_consume_context.hh:98
     (inlined by) sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}::operator()() const::{lambda()#2}::operator()() const::{lambda()#1}::operator()() const at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/sstables/partition.cc:479
     (inlined by) seastar::apply_helper<sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}::operator()() const::{lambda()#2}::operator()() const::{lambda()#1}, std::tuple<>&&, std::integer_sequence<unsigned long> >::apply({lambda()#2}&&, sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}::operator()() const::{lambda()#2}::operator()() const::{lambda()#1}) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/apply.hh:35
     (inlined by) auto seastar::apply<sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}::operator()() const::{lambda()#2}::operator()() const::{lambda()#1}>(sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}::operator()() const::{lambda()#2}::operator()() const::{lambda()#1}&&, std::tuple<>&&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/apply.hh:43
     (inlined by) seastar::future<> seastar::futurize<seastar::future<> >::apply<sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}::operator()() const::{lambda()#2}::operator()() const::{lambda()#1}>(sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}::operator()() const::{lambda()#2}::operator()() const::{lambda()#1}&&, std::tuple<>&&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:1392
     (inlined by) seastar::future<> seastar::future<>::then_impl<sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}::operator()() const::{lambda()#2}::operator()() const::{lambda()#1}, seastar::future<> >(sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}::operator()() const::{lambda()#2}::operator()() const::{lambda()#1}&&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:936
     (inlined by) seastar::future<> seastar::future<>::then<sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}::operator()() const::{lambda()#2}::operator()() const::{lambda()#1}, seastar::future<> >(sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}::operator()() const::{lambda()#2}::operator()() const::{lambda()#1}&&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:917
     (inlined by) sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}::operator()() const::{lambda()#2}::operator()() const at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/sstables/partition.cc:480
    seastar::future<> seastar::do_until<sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}, sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#2}>(sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#2}, sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}) at crtstuff.c:?
     (inlined by) seastar::future<> seastar::futurize<void>::apply<sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}::operator()() const::{lambda()#2}&>(sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}::operator()() const::{lambda()#2}&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:1385
     (inlined by) seastar::future<> seastar::do_until<sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}::operator()() const::{lambda()#2}, sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}::operator()() const::{lambda()#1}>(sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}::operator()() const::{lambda()#1}, sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}::operator()() const::{lambda()#2}) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future-util.hh:507
     (inlined by) sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}::operator()() const at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/sstables/partition.cc:481
     (inlined by) seastar::future<> seastar::do_void_futurize_helper<seastar::future<> >::apply<sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}&>(sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:1359
     (inlined by) seastar::future<> seastar::futurize<void>::apply<sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}&>(sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:1385
     (inlined by) seastar::future<> seastar::do_until<sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}, sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#2}>(sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#2}, sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future-util.hh:507
    sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at crtstuff.c:?
    flat_mutation_reader::impl::operator()(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:768
     (inlined by) flat_mutation_reader::operator()(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/flat_mutation_reader.hh:337
     (inlined by) operator() at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/mutation_reader.cc:308
     (inlined by) apply<mutation_reader_merger::prepare_next(seastar::lowres_clock::time_point)::<lambda(mutation_reader_merger::reader_and_last_fragment_kind)>, mutation_reader_merger::reader_and_last_fragment_kind&> at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:1402
     (inlined by) futurize_apply<mutation_reader_merger::prepare_next(seastar::lowres_clock::time_point)::<lambda(mutation_reader_merger::reader_and_last_fragment_kind)>, mutation_reader_merger::reader_and_last_fragment_kind&> at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:1474
     (inlined by) parallel_for_each<mutation_reader_merger::reader_and_last_fragment_kind*, mutation_reader_merger::prepare_next(seastar::lowres_clock::time_point)::<lambda(mutation_reader_merger::reader_and_last_fragment_kind)> > at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future-util.hh:129
    parallel_for_each<utils::small_vector<mutation_reader_merger::reader_and_last_fragment_kind, 4>&, mutation_reader_merger::prepare_next(seastar::lowres_clock::time_point)::<lambda(mutation_reader_merger::reader_and_last_fragment_kind)> > at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:768
     (inlined by) mutation_reader_merger::prepare_next(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/mutation_reader.cc:307
    mutation_reader_merger::operator()(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:768
    mutation_fragment_merger<mutation_reader_merger>::fetch(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:768
     (inlined by) mutation_fragment_merger<mutation_reader_merger>::operator()(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/mutation_reader.cc:120
     (inlined by) operator() at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/mutation_reader.cc:489
    repeat<combined_mutation_reader::fill_buffer(seastar::lowres_clock::time_point)::<lambda()> > at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:768
     (inlined by) combined_mutation_reader::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/mutation_reader.cc:500
    flat_mutation_reader::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/flat_mutation_reader.hh:391
     (inlined by) restricting_mutation_reader::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda(flat_mutation_reader&)#1}::operator()(flat_mutation_reader&) const at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/mutation_reader.cc:637
     (inlined by) _ZN27restricting_mutation_reader11with_readerIZNS_11fill_bufferENSt6chrono10time_pointIN7seastar12lowres_clockENS1_8durationIlSt5ratioILl1ELl1000EEEEEEEUlR20flat_mutation_readerE_EEDcT_S9_ at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/mutation_reader.cc:610
     (inlined by) restricting_mutation_reader::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/mutation_reader.cc:641
    flat_mutation_reader::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:768
     (inlined by) mutation_reader_merger::operator()(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/mutation_reader.cc:384
    mutation_fragment_merger<mutation_reader_merger>::fetch(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:768
     (inlined by) mutation_fragment_merger<mutation_reader_merger>::operator()(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/mutation_reader.cc:120
     (inlined by) operator() at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/mutation_reader.cc:489
    repeat<combined_mutation_reader::fill_buffer(seastar::lowres_clock::time_point)::<lambda()> > at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:768
     (inlined by) combined_mutation_reader::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/mutation_reader.cc:500
    flat_mutation_reader::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at /usr/include/c++/8/bits/unique_ptr.h:81
     (inlined by) operator() at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/flat_mutation_reader.cc:681
    apply<flat_multi_range_mutation_reader<Generator>::fill_buffer(seastar::lowres_clock::time_point) [with Generator = make_flat_multi_range_reader(schema_ptr, mutation_source, const partition_range_vector&, const query::partition_slice&, const seastar::io_priority_class&, tracing::trace_state_ptr, mutation_reader::forwarding)::adapter]::<lambda()>&> at /usr/include/c++/8/bits/unique_ptr.h:81
     (inlined by) apply<flat_multi_range_mutation_reader<Generator>::fill_buffer(seastar::lowres_clock::time_point) [with Generator = make_flat_multi_range_reader(schema_ptr, mutation_source, const partition_range_vector&, const query::partition_slice&, const seastar::io_priority_class&, tracing::trace_state_ptr, mutation_reader::forwarding)::adapter]::<lambda()>&> at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:1385
     (inlined by) do_until<flat_multi_range_mutation_reader<Generator>::fill_buffer(seastar::lowres_clock::time_point) [with Generator = make_flat_multi_range_reader(schema_ptr, mutation_source, const partition_range_vector&, const query::partition_slice&, const seastar::io_priority_class&, tracing::trace_state_ptr, mutation_reader::forwarding)::adapter]::<lambda()>, flat_multi_range_mutation_reader<Generator>::fill_buffer(seastar::lowres_clock::time_point) [with Generator = make_flat_multi_range_reader(schema_ptr, mutation_source, const partition_range_vector&, const query::partition_slice&, const seastar::io_priority_class&, tracing::trace_state_ptr, mutation_reader::forwarding)::adapter]::<lambda()> > at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future-util.hh:507
     (inlined by) fill_buffer at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/flat_mutation_reader.cc:682
    _ZN7seastar8internal8repeaterIZZ19fragment_and_freeze20flat_mutation_readerSt8functionIFNS_6futureIJNS_10bool_classINS_18stop_iteration_tagEEEEEE15frozen_mutationbEEmENKUlRT_RT0_E_clIS2_28fragmenting_mutation_freezerEEDaSD_SF_EUlvE_E15run_and_disposeEv at frozen_mutation.cc:?
     (inlined by) flat_mutation_reader::operator()(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/flat_mutation_reader.hh:337
     (inlined by) operator() at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/frozen_mutation.cc:259
     (inlined by) run_and_dispose at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future-util.hh:218
    seastar::reactor::run_tasks(seastar::reactor::task_queue&) at /usr/include/boost/program_options/variables_map.hpp:146
    seastar::reactor::run_some_tasks() at /usr/include/boost/program_options/variables_map.hpp:146
    seastar::reactor::run_some_tasks() at /usr/include/boost/program_options/variables_map.hpp:146
     (inlined by) seastar::reactor::run() at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../src/core/reactor.cc:4243
    seastar::app_template::run_deprecated(int, char**, std::function<void ()>&&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../include/seastar/core/future.hh:768
    seastar::app_template::run(int, char**, std::function<seastar::future<int> ()>&&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../include/seastar/core/future.hh:768
    main at crtstuff.c:?
    ?? ??:0
    _start at ??:?
    
    (CoreDumpEvent Severity.CRITICAL): node=Node longevity-large-partitions-4d-3-1-db-node-49dc20d4-4 [34.245.137.134 | 10.0.178.144] (seed: False)
    corefile_urls=
    https://storage.cloud.google.com/upload.scylladb.com/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.20758.1566948577000000.gz/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.20758.1566948577000000.gz.aa
    backtrace=           PID: 20758 (scylla)
               UID: 996 (scylla)
               GID: 1001 (scylla)
            Signal: 6 (ABRT)
         Timestamp: Tue 2019-08-27 23:29:37 UTC (1min 52s ago)
      Command Line: /usr/bin/scylla --blocked-reactor-notify-ms 500 --abort-on-lsa-bad-alloc 1 --abort-on-seastar-bad-alloc --log-to-syslog 1 --log-to-stdout 0 --default-log-level info --network-stack posix --io-properties-file=/etc/scylla.d/io_properties.yaml --cpuset 0-11
        Executable: /opt/scylladb/libexec/scylla
     Control Group: /
           Boot ID: 9f0393fe20f04dfab829e5bb5cc4bdad
        Machine ID: df877a200226bc47d06f26dae0736ec9
          Hostname: ip-10-0-178-144.eu-west-1.compute.internal
          Coredump: /var/lib/systemd/coredump/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.20758.1566948577000000
           Message: Process 20758 (scylla) of user 996 dumped core.
                    
                    Stack trace of thread 20769:
                    #0  0x00007f179a95053f raise (libc.so.6)
                    #1  0x00007f179a93a95e abort (libc.so.6)
                    #2  0x0000000000469b8e _ZN8logalloc18allocating_section7reserveEv (scylla)
                    #3  0x00000000071c0d93 n/a (n/a)
    
    

    The other node's coredumps:

    (CoreDumpEvent Severity.CRITICAL): node=Node longevity-large-partitions-4d-3-1-db-node-49dc20d4-1 [63.35.248.143 | 10.0.10.203] (seed: True)
    corefile_urls=
    https://storage.cloud.google.com/upload.scylladb.com/core.scylla.996.3f7c927968ca4130a5cfc4b02933017f.5160.1566943409000000.gz/core.scylla.996.3f7c927968ca4130a5cfc4b02933017f.5160.1566943409000000.gz.aa
    backtrace=           PID: 5160 (scylla)
               UID: 996 (scylla)
               GID: 1001 (scylla)
            Signal: 6 (ABRT)
         Timestamp: Tue 2019-08-27 22:03:29 UTC (1min 55s ago)
      Command Line: /usr/bin/scylla --blocked-reactor-notify-ms 500 --abort-on-lsa-bad-alloc 1 --abort-on-seastar-bad-alloc --log-to-syslog 1 --log-to-stdout 0 --default-log-level info --network-stack posix --io-properties-file=/etc/scylla.d/io_properties.yaml --cpuset 0-11
        Executable: /opt/scylladb/libexec/scylla
     Control Group: /
           Boot ID: 3f7c927968ca4130a5cfc4b02933017f
        Machine ID: df877a200226bc47d06f26dae0736ec9
          Hostname: ip-10-0-10-203.eu-west-1.compute.internal
          Coredump: /var/lib/systemd/coredump/core.scylla.996.3f7c927968ca4130a5cfc4b02933017f.5160.1566943409000000
           Message: Process 5160 (scylla) of user 996 dumped core.
                    
                    Stack trace of thread 5170:
                    #0  0x00007f742044b53f raise (libc.so.6)
                    #1  0x00007f742043595e abort (libc.so.6)
                    #2  0x00000000040219ab on_allocation_failure (scylla)
    

    Other download locations:

    https://storage.cloud.google.com/upload.scylladb.com/core.scylla.996.3f7c927968ca4130a5cfc4b02933017f.31878.1566944318000000.gz/core.scylla.996.3f7c927968ca4130a5cfc4b02933017f.31878.1566944318000000.gz.aa
    https://storage.cloud.google.com/upload.scylladb.com/core.scylla.996.3f7c927968ca4130a5cfc4b02933017f.32438.1566945161000000.gz/core.scylla.996.3f7c927968ca4130a5cfc4b02933017f.32438.1566945161000000.gz.aa
    
    

    Relevant journalctl logs of the nodes can be found at scratch.scylladb.com/shlomib/longevity-large-partitions-4d-db-cluster.tar
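
    As an aside, the corefiles above are uploaded as split gzip parts (hence the .gz.aa suffix), and the short systemd-coredump traces contain only raw addresses. The following is a rough sketch, not an exact procedure, of how such a core could be reassembled and a raw frame symbolized offline against the matching build's executable (the part glob below is a placeholder, not the exact file names):

    # fetch every part (only .aa is listed above; later parts follow the same naming), then reassemble
    $ cat core.scylla.996.*.gz.?? > core.scylla.gz && gunzip core.scylla.gz
    # symbolize a raw frame from the systemd-coredump trace, e.g. frame #2 of thread 20769
    $ addr2line -Cfpia -e /opt/scylladb/libexec/scylla 0x0000000000469b8e
    # or load the whole core in gdb
    $ gdb /opt/scylladb/libexec/scylla core.scylla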

    bug 
    opened by ShlomiBalalis 140
  • Significant drop in operations per second during node replace

    Significant drop in operations per second during node replace

    Installation details
    Scylla version (or git commit hash): 4.2.rc4-0.20200914.338196eab with build-id 7670ef1d82ff6b35783e1035d6544c7cc9abd90f
    Cluster size: 5
    OS (RHEL/CentOS/Ubuntu/AWS AMI): ami-0bb0f15782d03eec3 (eu-north-1)
    Instance type: i3.4xlarge

    During job https://jenkins.scylladb.com/view/scylla-4.2/job/scylla-4.2/job/longevity/job/longevity-mv-si-4days-test/5 the TerminateAndReplace nemesis was executed several times. This nemesis terminates the instance of one node (node4) and then adds a new node (node6). While the new node is being added, operations per second drop from 25k ops to 81 ops for each nemesis execution (a rough sketch of the replace flow is included below): Screenshot from 2020-09-17 18-32-33

    During the second execution, node5 was terminated and node8 was added.

    Screenshot from 2020-09-17 18-41-53

    monitoring node available: http://13.49.78.221:3000/d/N0wDzKdGk/scylla-per-server-metrics-nemesis-master?orgId=1&from=1600146424237&to=1600300730028&var-by=instance&var-cluster=&var-dc=All&var-node=All&var-shard=All&var-sct_tags=DatabaseLogEvent&var-sct_tags=DisruptionEvent
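
    For context, a minimal sketch of what the replace-node flow looks like on the Scylla side, assuming the standard replace_address_first_boot mechanism (the address below is a placeholder, and the exact steps depend on the Scylla version and on how SCT drives the nemesis):

    # on the replacement node, before its first start, point it at the dead node's address
    $ echo 'replace_address_first_boot: <dead node IP>' | sudo tee -a /etc/scylla/scylla.yaml
    $ sudo systemctl start scylla-server
    # the new node streams the dead node's data from the surviving replicas; that streaming
    # competes with the foreground read/write workload, which is where a throughput dip shows up
    $ nodetool status   # wait until the replacement node reports UN (Up/Normal)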

    The following c-s commands were used:

    2020-09-15 14:49:50.535: (CassandraStressEvent Severity.NORMAL): type=start node=Node longevity-mv-si-4d-4-2-loader-node-ca850009-1 [13.48.13.140 | 10.0.1.125] (seed: False)
    stress_cmd=cassandra-stress user profile=/tmp/c-s_profile_4mv_5queries.yaml ops'(insert=15,read1=1,read2=1,read3=1,read4=1,read5=1)' cl=QUORUM duration=5760m -port jmx=6868 -mode cql3 native -rate threads=10 -node 10.0.3.235 -errors skip-unsupported-columns
    2020-09-15 14:50:20.510: (CassandraStressEvent Severity.NORMAL): type=start node=Node longevity-mv-si-4d-4-2-loader-node-ca850009-2 [13.53.109.33 | 10.0.2.224] (seed: False)
    stress_cmd=cassandra-stress user profile=/tmp/c-s_profile_4mv_5queries.yaml ops'(insert=15,read1=1,read2=1,read3=1,read4=1,read5=1)' cl=QUORUM duration=5760m -port jmx=6868 -mode cql3 native -rate threads=10 -node 10.0.3.235 -errors skip-unsupported-columns
    2020-09-15 14:50:30.964: (CassandraStressEvent Severity.NORMAL): type=start node=Node longevity-mv-si-4d-4-2-loader-node-ca850009-1 [13.48.13.140 | 10.0.1.125] (seed: False)
    stress_cmd=cassandra-stress user profile=/tmp/c-s_profile_2mv_2queries.yaml ops'(insert=6,mv_p_read1=1,mv_p_read2=1)' cl=QUORUM duration=5760m -port jmx=6868 -mode cql3 native -rate threads=10 -node 10.0.3.235 -errors skip-unsupported-columns
    2020-09-15 14:51:00.837: (CassandraStressEvent Severity.NORMAL): type=start node=Node longevity-mv-si-4d-4-2-loader-node-ca850009-2 [13.53.109.33 | 10.0.2.224] (seed: False)
    stress_cmd=cassandra-stress user profile=/tmp/c-s_profile_2mv_2queries.yaml ops'(insert=6,mv_p_read1=1,mv_p_read2=1)' cl=QUORUM duration=5760m -port jmx=6868 -mode cql3 native -rate threads=10 -node 10.0.3.235 -errors skip-unsupported-columns
    2020-09-15 14:51:11.261: (CassandraStressEvent Severity.NORMAL): type=start node=Node longevity-mv-si-4d-4-2-loader-node-ca850009-1 [13.48.13.140 | 10.0.1.125] (seed: False)
    stress_cmd=cassandra-stress user profile=/tmp/c-s_profile_3si_5queries.yaml ops'(insert=25,si_read1=1,si_read2=1,si_read3=1,si_read4=1,si_read5=1)' cl=QUORUM duration=5760m -port jmx=6868 -mode cql3 native -rate threads=10 -node 10.0.3.235 -errors skip-unsupported-columns
    2020-09-15 14:51:41.228: (CassandraStressEvent Severity.NORMAL): type=start node=Node longevity-mv-si-4d-4-2-loader-node-ca850009-2 [13.53.109.33 | 10.0.2.224] (seed: False)
    stress_cmd=cassandra-stress user profile=/tmp/c-s_profile_3si_5queries.yaml ops'(insert=25,si_read1=1,si_read2=1,si_read3=1,si_read4=1,si_read5=1)' cl=QUORUM duration=5760m -port jmx=6868 -mode cql3 native -rate threads=10 -node 10.0.3.235 -errors skip-unsupported-columns
    2020-09-15 14:51:51.676: (CassandraStressEvent Severity.NORMAL): type=start node=Node longevity-mv-si-4d-4-2-loader-node-ca850009-1 [13.48.13.140 | 10.0.1.125] (seed: False)
    stress_cmd=cassandra-stress user profile=/tmp/c-s_profile_2si_2queries.yaml ops'(insert=10,si_p_read1=1,si_p_read2=1)' cl=QUORUM duration=5760m -port jmx=6868 -mode cql3 native -rate threads=10 -node 10.0.3.235 -errors skip-unsupported-columns
    2020-09-15 14:52:21.540: (CassandraStressEvent Severity.NORMAL): type=start node=Node longevity-mv-si-4d-4-2-loader-node-ca850009-2 [13.53.109.33 | 10.0.2.224] (seed: False)
    stress_cmd=cassandra-stress user profile=/tmp/c-s_profile_2si_2queries.yaml ops'(insert=10,si_p_read1=1,si_p_read2=1)' cl=QUORUM duration=5760m -port jmx=6868 -mode cql3 native -rate threads=10 -node 10.0.3.235 -errors skip-unsupported-columns
    

    The following schema was generated:

    CREATE KEYSPACE mview WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '3'}  AND durable_writes = true;
    
    CREATE TABLE mview.users (
        userid bigint PRIMARY KEY,
        address text,
        email text,
        first_name text,
        initials int,
        last_access timeuuid,
        last_name text,
        password text,
        userdata blob
    ) WITH bloom_filter_fp_chance = 0.01
        AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'}
        AND comment = ''
        AND compaction = {'class': 'SizeTieredCompactionStrategy'}
        AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
        AND crc_check_chance = 1.0
        AND dclocal_read_repair_chance = 0.0
        AND default_time_to_live = 0
        AND gc_grace_seconds = 864000
        AND max_index_interval = 2048
        AND memtable_flush_period_in_ms = 0
        AND min_index_interval = 128
        AND read_repair_chance = 0.0
        AND speculative_retry = '99.0PERCENTILE';
    
    CREATE MATERIALIZED VIEW mview.users_by_first_name AS
        SELECT first_name, userid, email
        FROM mview.users
        WHERE first_name IS NOT null AND userid IS NOT null
        PRIMARY KEY (first_name, userid)
        WITH CLUSTERING ORDER BY (userid ASC)
        AND bloom_filter_fp_chance = 0.01
        AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'}
        AND comment = ''
        AND compaction = {'class': 'SizeTieredCompactionStrategy'}
        AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
        AND crc_check_chance = 1.0
        AND dclocal_read_repair_chance = 0.0
        AND default_time_to_live = 0
        AND gc_grace_seconds = 864000
        AND max_index_interval = 2048
        AND memtable_flush_period_in_ms = 0
        AND min_index_interval = 128
        AND read_repair_chance = 0.0
        AND speculative_retry = '99.0PERCENTILE';
    
    CREATE MATERIALIZED VIEW mview.users_by_initials AS
        SELECT initials, userid
        FROM mview.users
        WHERE initials IS NOT null AND userid IS NOT null
        PRIMARY KEY (initials, userid)
        WITH CLUSTERING ORDER BY (userid ASC)
        AND bloom_filter_fp_chance = 0.01
        AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'}
        AND comment = ''
        AND compaction = {'class': 'SizeTieredCompactionStrategy'}
        AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
        AND crc_check_chance = 1.0
        AND dclocal_read_repair_chance = 0.0
        AND default_time_to_live = 0
        AND gc_grace_seconds = 864000
        AND max_index_interval = 2048
        AND memtable_flush_period_in_ms = 0
        AND min_index_interval = 128
        AND read_repair_chance = 0.0
        AND speculative_retry = '99.0PERCENTILE';
    
    CREATE MATERIALIZED VIEW mview.users_by_email AS
        SELECT email, userid
        FROM mview.users
        WHERE email IS NOT null AND userid IS NOT null
        PRIMARY KEY (email, userid)
        WITH CLUSTERING ORDER BY (userid ASC)
        AND bloom_filter_fp_chance = 0.01
        AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'}
        AND comment = ''
        AND compaction = {'class': 'SizeTieredCompactionStrategy'}
        AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
        AND crc_check_chance = 1.0
        AND dclocal_read_repair_chance = 0.0
        AND default_time_to_live = 0
        AND gc_grace_seconds = 864000
        AND max_index_interval = 2048
        AND memtable_flush_period_in_ms = 0
        AND min_index_interval = 128
        AND read_repair_chance = 0.0
        AND speculative_retry = '99.0PERCENTILE';
    
    CREATE MATERIALIZED VIEW mview.users_by_password AS
        SELECT password, userid
        FROM mview.users
        WHERE password IS NOT null AND userid IS NOT null
        PRIMARY KEY (password, userid)
        WITH CLUSTERING ORDER BY (userid ASC)
        AND bloom_filter_fp_chance = 0.01
        AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'}
        AND comment = ''
        AND compaction = {'class': 'SizeTieredCompactionStrategy'}
        AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
        AND crc_check_chance = 1.0
        AND dclocal_read_repair_chance = 0.0
        AND default_time_to_live = 0
        AND gc_grace_seconds = 864000
        AND max_index_interval = 2048
        AND memtable_flush_period_in_ms = 0
        AND min_index_interval = 128
        AND read_repair_chance = 0.0
        AND speculative_retry = '99.0PERCENTILE';
    
    CREATE MATERIALIZED VIEW mview.users_by_last_name AS
        SELECT last_name, userid, email
        FROM mview.users
        WHERE last_name IS NOT null AND userid IS NOT null
        PRIMARY KEY (last_name, userid)
        WITH CLUSTERING ORDER BY (userid ASC)
        AND bloom_filter_fp_chance = 0.01
        AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'}
        AND comment = ''
        AND compaction = {'class': 'SizeTieredCompactionStrategy'}
        AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
        AND crc_check_chance = 1.0
        AND dclocal_read_repair_chance = 0.0
        AND default_time_to_live = 0
        AND gc_grace_seconds = 864000
        AND max_index_interval = 2048
        AND memtable_flush_period_in_ms = 0
        AND min_index_interval = 128
        AND read_repair_chance = 0.0
        AND speculative_retry = '99.0PERCENTILE';
    
    CREATE MATERIALIZED VIEW mview.users_by_address AS
        SELECT address, userid
        FROM mview.users
        WHERE address IS NOT null AND userid IS NOT null
        PRIMARY KEY (address, userid)
        WITH CLUSTERING ORDER BY (userid ASC)
        AND bloom_filter_fp_chance = 0.01
        AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'}
        AND comment = ''
        AND compaction = {'class': 'SizeTieredCompactionStrategy'}
        AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
        AND crc_check_chance = 1.0
        AND dclocal_read_repair_chance = 0.0
        AND default_time_to_live = 0
        AND gc_grace_seconds = 864000
        AND max_index_interval = 2048
        AND memtable_flush_period_in_ms = 0
        AND min_index_interval = 128
        AND read_repair_chance = 0.0
        AND speculative_retry = '99.0PERCENTILE';
    
    CREATE KEYSPACE keyspace1 WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '5'}  AND durable_writes = true;
    
    CREATE TABLE keyspace1.standard1 (
        key blob PRIMARY KEY,
        "C0" blob,
        "C1" blob,
        "C2" blob,
        "C3" blob,
        "C4" blob,
        aqpq3qgcom list<frozen<set<timestamp>>>,
        b69k9r389z list<frozen<map<frozen<map<frozen<map<bigint, timeuuid>>, frozen<map<bigint, bigint>>>>, frozen<set<tinyint>>>>>,
        bdbs5ixqdq map<frozen<map<frozen<set<inet>>, frozen<set<frozen<list<date>>>>>>, frozen<list<frozen<set<decimal>>>>>,
        f4xwkb2zcm set<frozen<map<frozen<set<timestamp>>, frozen<map<timeuuid, inet>>>>>,
        fywh69a04j set<frozen<map<frozen<set<varint>>, frozen<set<frozen<map<frozen<map<smallint, inet>>, varint>>>>>>>,
        hacdvjo18p set<frozen<list<frozen<map<smallint, bigint>>>>>,
        iopuqysiqf list<frozen<map<frozen<set<frozen<list<text>>>>, frozen<map<frozen<set<ascii>>, ascii>>>>>,
        jxu8tsm8v5 set<frozen<map<frozen<list<blob>>, frozen<map<frozen<list<text>>, frozen<map<int, smallint>>>>>>>,
        ki1u5t67nf set<frozen<set<ascii>>>,
        l8pw46826p list<frozen<map<frozen<list<date>>, frozen<list<frozen<map<ascii, double>>>>>>>,
        oj5epbs4pn list<frozen<set<frozen<map<smallint, int>>>>>,
        ortj1um8mc set<frozen<list<frozen<map<float, double>>>>>,
        p8v0kjmfsr list<frozen<list<varint>>>,
        rulnhv7azy set<frozen<set<frozen<set<float>>>>>,
        si5zsclur2 map<frozen<map<frozen<list<bigint>>, boolean>>, frozen<set<frozen<set<float>>>>>,
        v3p7qqv1vn list<frozen<list<bigint>>>,
        wyhqruomlw map<frozen<map<frozen<set<int>>, frozen<set<smallint>>>>, frozen<set<frozen<list<frozen<map<ascii, int>>>>>>>,
        yskieerio3 set<frozen<list<frozen<map<frozen<list<timeuuid>>, frozen<set<decimal>>>>>>>
    ) WITH bloom_filter_fp_chance = 0.01
        AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'}
        AND comment = ''
        AND compaction = {'class': 'SizeTieredCompactionStrategy'}
        AND compression = {}
        AND crc_check_chance = 1.0
        AND dclocal_read_repair_chance = 0.0
        AND default_time_to_live = 0
        AND gc_grace_seconds = 864000
        AND max_index_interval = 2048
        AND memtable_flush_period_in_ms = 0
        AND min_index_interval = 128
        AND read_repair_chance = 0.0
        AND speculative_retry = '99.0PERCENTILE';
    
    
    CREATE KEYSPACE sec_index WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '3'}  AND durable_writes = true;
    
    CREATE TABLE sec_index.users (
        userid bigint PRIMARY KEY,
        address text,
        email text,
        first_name text,
        initials int,
        last_access timeuuid,
        last_name text,
        password text,
        userdata blob
    ) WITH bloom_filter_fp_chance = 0.01
        AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'}
        AND comment = ''
        AND compaction = {'class': 'SizeTieredCompactionStrategy'}
        AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
        AND crc_check_chance = 1.0
        AND dclocal_read_repair_chance = 0.0
        AND default_time_to_live = 0
        AND gc_grace_seconds = 864000
        AND max_index_interval = 2048
        AND memtable_flush_period_in_ms = 4678
        AND min_index_interval = 128
        AND read_repair_chance = 0.0
        AND speculative_retry = '99.0PERCENTILE';
    CREATE INDEX users_initials_ind ON sec_index.users (initials);
    CREATE INDEX users_last_name_ind ON sec_index.users (last_name);
    CREATE INDEX users_last_access_ind ON sec_index.users (last_access);
    CREATE INDEX users_first_name_ind ON sec_index.users (first_name);
    CREATE INDEX users_address_ind ON sec_index.users (address);
    
    CREATE MATERIALIZED VIEW sec_index.users_address_ind_index AS
        SELECT address, idx_token, userid
        FROM sec_index.users
        WHERE address IS NOT NULL
        PRIMARY KEY (address, idx_token, userid)
        WITH CLUSTERING ORDER BY (idx_token ASC, userid ASC)
        AND bloom_filter_fp_chance = 0.01
        AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'}
        AND comment = ''
        AND compaction = {'class': 'SizeTieredCompactionStrategy'}
        AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
        AND crc_check_chance = 1.0
        AND dclocal_read_repair_chance = 0.0
        AND default_time_to_live = 0
        AND gc_grace_seconds = 864000
        AND max_index_interval = 2048
        AND memtable_flush_period_in_ms = 0
        AND min_index_interval = 128
        AND read_repair_chance = 0.0
        AND speculative_retry = '99.0PERCENTILE';
    
    CREATE MATERIALIZED VIEW sec_index.users_first_name_ind_index AS
        SELECT first_name, idx_token, userid
        FROM sec_index.users
        WHERE first_name IS NOT NULL
        PRIMARY KEY (first_name, idx_token, userid)
        WITH CLUSTERING ORDER BY (idx_token ASC, userid ASC)
        AND bloom_filter_fp_chance = 0.01
        AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'}
        AND comment = ''
        AND compaction = {'class': 'SizeTieredCompactionStrategy'}
        AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
        AND crc_check_chance = 1.0
        AND dclocal_read_repair_chance = 0.0
        AND default_time_to_live = 0
        AND gc_grace_seconds = 864000
        AND max_index_interval = 2048
        AND memtable_flush_period_in_ms = 0
        AND min_index_interval = 128
        AND read_repair_chance = 0.0
        AND speculative_retry = '99.0PERCENTILE';
    
    CREATE MATERIALIZED VIEW sec_index.users_initials_ind_index AS
        SELECT initials, idx_token, userid
        FROM sec_index.users
        WHERE initials IS NOT NULL
        PRIMARY KEY (initials, idx_token, userid)
        WITH CLUSTERING ORDER BY (idx_token ASC, userid ASC)
        AND bloom_filter_fp_chance = 0.01
        AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'}
        AND comment = ''
        AND compaction = {'class': 'SizeTieredCompactionStrategy'}
        AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
        AND crc_check_chance = 1.0
        AND dclocal_read_repair_chance = 0.0
        AND default_time_to_live = 0
        AND gc_grace_seconds = 864000
        AND max_index_interval = 2048
        AND memtable_flush_period_in_ms = 0
        AND min_index_interval = 128
        AND read_repair_chance = 0.0
        AND speculative_retry = '99.0PERCENTILE';
    
    CREATE MATERIALIZED VIEW sec_index.users_last_access_ind_index AS
        SELECT last_access, idx_token, userid
        FROM sec_index.users
        WHERE last_access IS NOT NULL
        PRIMARY KEY (last_access, idx_token, userid)
        WITH CLUSTERING ORDER BY (idx_token ASC, userid ASC)
        AND bloom_filter_fp_chance = 0.01
        AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'}
        AND comment = ''
        AND compaction = {'class': 'SizeTieredCompactionStrategy'}
        AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
        AND crc_check_chance = 1.0
        AND dclocal_read_repair_chance = 0.0
        AND default_time_to_live = 0
        AND gc_grace_seconds = 864000
        AND max_index_interval = 2048
        AND memtable_flush_period_in_ms = 0
        AND min_index_interval = 128
        AND read_repair_chance = 0.0
        AND speculative_retry = '99.0PERCENTILE';
    
    CREATE MATERIALIZED VIEW sec_index.users_last_name_ind_index AS
        SELECT last_name, idx_token, userid
        FROM sec_index.users
        WHERE last_name IS NOT NULL
        PRIMARY KEY (last_name, idx_token, userid)
        WITH CLUSTERING ORDER BY (idx_token ASC, userid ASC)
        AND bloom_filter_fp_chance = 0.01
        AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'}
        AND comment = ''
        AND compaction = {'class': 'SizeTieredCompactionStrategy'}
        AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
        AND crc_check_chance = 1.0
        AND dclocal_read_repair_chance = 0.0
        AND default_time_to_live = 0
        AND gc_grace_seconds = 864000
        AND max_index_interval = 2048
        AND memtable_flush_period_in_ms = 0
        AND min_index_interval = 128
        AND read_repair_chance = 0.0
        AND speculative_retry = '99.0PERCENTILE';
    

    No reactor stalls were detected during this period.

    All db logs: https://cloudius-jenkins-test.s3.amazonaws.com/ca850009-fb1d-4d43-ac60-0fdbce75cc71/20200916_203618/db-cluster-ca850009.zip

    bug high repair-based-operations 
    opened by aleksbykov 132
  • Performance regression of 780% in p99th latency compared to 2.2.0 for 100% read test

    Performance regression of 780% in p99th latency compared to 2.2.0 for 100% read test

    Installation details
    Scylla version (or git commit hash): 2.3.rc0-0.20180722.a77bb1fe3
    Cluster size: 3
    OS (RHEL/CentOS/Ubuntu/AWS AMI): AWS AMI (ami-905252ef)
    Instance type: i3.4xlarge

    test_latency_read results showing 780% regression in p99th latency compared to 2.2.0:

    | Version | Op rate total | Latency mean | Latency 99th percentile |
    | -- | -- | -- | -- |
    | 2.2.0 | 39997.0 [2018-07-19 10:26:37] | 1.4 [2018-07-19 10:26:37] | 3.1 [2018-07-19 10:26:37] |
    | 2.3.0 | 37200.0 (6% Regression) | 8.2 (485% Regression) | 27.3 (780% Regression) |
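
    To make the regression percentages explicit, they are computed relative to the 2.2.0 baseline; a quick check of the two latency figures:

    $ echo 'scale=1; (27.3 - 3.1) * 100 / 3.1' | bc    # p99 latency:  ~780% regression vs 2.2.0
    780.6
    $ echo 'scale=1; (8.2 - 1.4) * 100 / 1.4' | bc     # mean latency: ~485% regression vs 2.2.0
    485.7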

    2.3.0 p99th latency looks abnormal and reaches peaks of ~400ms: screen shot 2018-07-25 at 1 26 42

    The test populates 1TB of data and then starts a c-s read command: cassandra-stress read no-warmup cl=QUORUM duration=50m -schema 'replication(factor=3)' -port jmx=6868 -mode cql3 native -rate 'threads=100 limit=10000/s' -errors ignore -col 'size=FIXED(1024) n=FIXED(1)' -pop 'dist=gauss(1..1000000000,500000000,50000000)' (during the first part of the test we can still see compactions left over from the write population)

    Full screenshot: screencapture-34-230-6-17-3000-dashboard-db-scylla-per-server-metrics-nemesis-master-test-latency-2-3-2018-07-25-01_31_03

    bug performance high Regression 
    opened by roydahan 127
  • Some shards get stuck in tight loop during repair

    Some shards get stuck in tight loop during repair

    This is Scylla's bug tracker, to be used for reporting bugs only. If you have a question about Scylla, and not a bug, please ask it in our mailing-list at [email protected] or in our slack channel.

    • [x] I have read the disclaimer above, and I am reporting a suspected malfunction in Scylla.

    Installation details
    Scylla version (or git commit hash): 5.0.1
    Cluster size: 5
    OS (RHEL/CentOS/Ubuntu/AWS AMI): Ubuntu 20.04

    Hardware details (for performance issues)
    Platform (physical/VM/cloud instance type/docker): Hetzner
    Hardware: sockets=1 cores=4 hyperthreading=8 memory=64G
    Disks: 2x SSD in RAID1

    A few shards on one of my nodes got stuck in a tight loop while running a repair operation. It has been going for a day and is not making any progress. All the while CPU usage on three cores is stuck at 100%: image

    Restarting the node also hangs, until it eventually gets killed by systemd. When the node restarts the same shards get stuck again shortly after initialization.

    I have exported logs for the node that is getting stuck: https://pixeldrain.com/u/rwtFViqp and for the repair master: https://pixeldrain.com/u/PjML3Y3F

    Some of my data has become inconsistent after some downtime and I have no way to repair it now. Please help.

    bug User Request high 
    opened by Fornax96 117
  • test_latency_mixed_with_nemesis - latency during "steady state" gets to 20 ms without heavy stalls

    test_latency_mixed_with_nemesis - latency during "steady state" gets to 20 ms without heavy stalls

    Installation details
    Scylla version (or git commit hash): 4.4.dev-0.20210114.32fd38f34 with build-id 0642bb3b142094f1092b0d276f6efa858081fe96
    Cluster size: 3
    OS (RHEL/CentOS/Ubuntu/AWS AMI): ami-012cafbb2dc4f1e4d (eu-west-1)

    running mixed workload with the command: cassandra-stress mixed no-warmup cl=QUORUM duration=350m -schema 'replication(factor=3)' -port jmx=6868 -mode cql3 native -rate 'threads=50 throttle=3500/s' -col 'size=FIXED(128) n=FIXED(8)' -pop 'dist=gauss(1..250000000,125000000,12500000)'

    during the steady state, the only stalls detected were:

    2021-01-15T06:40:39+00:00  perf-latency-nemesis-perf-v10-db-node-9420ec57-2 !INFO    | scylla: Reactor stalled for 6 ms on shard 5.
    2021-01-15T06:48:16+00:00  perf-latency-nemesis-perf-v10-db-node-9420ec57-3 !INFO    | scylla: Reactor stalled for 6 ms on shard 5.
    2021-01-15T06:51:27+00:00  perf-latency-nemesis-perf-v10-db-node-9420ec57-2 !INFO    | scylla: Reactor stalled for 6 ms on shard 4.
    2021-01-15T06:58:25+00:00  perf-latency-nemesis-perf-v10-db-node-9420ec57-2 !INFO    | scylla: Reactor stalled for 6 ms on shard 6.
    2021-01-15T06:59:50+00:00  perf-latency-nemesis-perf-v10-db-node-9420ec57-2 !INFO    | scylla: Reactor stalled for 6 ms on shard 7.
    2021-01-15T07:07:13+00:00  perf-latency-nemesis-perf-v10-db-node-9420ec57-2 !INFO    | scylla: Reactor stalled for 6 ms on shard 5.
    

    The values for the steady-state latency are:

    | Metric name | Metric value |
    | ----------- | ------------ |
    | c-s P95 | 5.40 |
    | c-s P99 | 19.10 |
    | Scylla P99_read - node-3 | 19.20 |
    | Scylla P99_write - node-1 | 13.76 |
    | Scylla P99_read - node-2 | 23.66 |
    | Scylla P99_write - node-2 | 13.79 |
    | Scylla P99_read - node-1 | 23.55 |
    | Scylla P99_write - node-3 | 1.56 |

    there is a live monitor here

    here is a live snapshot (if monitor dies)

    from the monitor, we can see: c-s latency Screenshot from 2021-01-28 15-30-08 (copy)

    c-s_max

    and Scylla latency: read_99th

    write_99th

    Comparing with the original document where we checked these values, we have, for Scylla 4.1:

    | Metric name | read value | write value |
    | ----------- | ---------- | ----------- |
    | Mean | 0.9 ms | 0.4 ms |
    | P95 | 7.8 ms | 1.4 ms |
    | P99 | 48.2 ms | 2.5 ms |
    | Max | 71 ms | 71 ms |

    and for Scylla 666.development-0.20200910.02ee0483b:

    | Metric name | read value | write value |
    | ----------- | ---------- | ----------- |
    | Mean | 0.7 ms | 0.3 ms |
    | P95 | 3.6 ms | 0.9 ms |
    | P99 | 6 ms | 1.2 ms |
    | Max | 16.8 ms | 16.8 ms |

    all the nodes logs can be downloaded here

    Even the c-s 95th percentile is too high for a steady-state period: c-s_95th

    bug high latency 
    opened by fgelcer 116
  • Permanent read/write fails after "bad_alloc"

    Permanent read/write fails after "bad_alloc"

    This is Scylla's bug tracker, to be used for reporting bugs only. If you have a question about Scylla, and not a bug, please ask it in our mailing-list at [email protected] or in our slack channel.

    • [*] I have read the disclaimer above, and I am reporting a suspected malfunction in Scylla.

    Installation details
    Scylla version (or git commit hash): 3.2.4
    Cluster size: 5+5 (multi DC)
    OS (RHEL/CentOS/Ubuntu/AWS AMI): C7.5

    Platform (physical/VM/cloud instance type/docker): bare metal
    Hardware: sockets=2 x Intel(R) Xeon(R) Gold 6230 CPU @ 2.10GHz, cores=40, hyperthreading=yes, memory=6x32GB DDR4 2666MHz
    Disks: RAID 10 of 10 HDDs 14TB each for data, RAID 1 SSD 1TB for commit logs

    Hi!

    The problem started with errors like "exception during mutation write to 10.161.180.24: std::bad_alloc (std::bad_alloc)" and led to one shard constantly failing a lot of (probably all) write/read operations until scylla-server was manually restarted. I guess this could be due to large partitions, so here is what we have on that (we have 2 CFs; a quick way to double-check the large-partition suspicion is sketched further below):

    becca/events histograms
    Percentile  SSTables     Write Latency      Read Latency    Partition Size        Cell Count
                                  (micros)          (micros)           (bytes)
    50%             2.00             16.00          47988.50               770                29
    75%             2.00             20.00          79061.50              5722               215
    95%             6.00             33.00         185724.05             88148              2759
    98%             8.00             36.00         239365.28            182785              5722
    99%            10.00             46.73         295955.11            263210              8239
    Min             0.00              1.00             20.00                73                 2
    Max            24.00          29492.00        2051039.00         464228842           5839588
    
    becca/events_by_ip histograms
    Percentile  SSTables     Write Latency      Read Latency    Partition Size        Cell Count
                                  (micros)          (micros)           (bytes)
    50%             0.00             16.00              0.00              6866               179
    75%             0.00             19.75              0.00             29521               770
    95%             0.00             33.00              0.00            315852              8239
    98%             0.00             41.00              0.00            785939             20501
    99%             0.00             48.43              0.00           1629722             42510
    Min             0.00              1.00              0.00                73                 0
    Max             0.00          19498.00              0.00         386857368           4866323
    

    In any case, even if some big query arrived and failed, I do not quite understand why all subsequent queries kept failing until the node was restarted.
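
    Since large partitions are the suspicion here, a rough way to double-check it on a node, beyond the histograms above (the system.large_partitions schema differs between Scylla versions, so the column names below are an assumption):

    # per-table partition-size percentiles, as quoted above
    $ nodetool cfhistograms becca events
    # Scylla also records oversized partitions it has written out; column names may vary by version
    $ cqlsh -e "SELECT keyspace_name, table_name, partition_size FROM system.large_partitions LIMIT 10;"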

    Logs: https://cloud.mail.ru/public/C3AZ/RxPZyKUV6

    Dashboard (by shard)

    Screenshot 2020-05-09 at 20:11:38; Screenshot 2020-05-09 at 17:41:49

    bug User Request hinted-handoff bad_alloc 
    opened by gibsn 114
  • Cassandra Stress times out: BusyPoolException: no available connection and timed out after 5000 MILLISECONDS / using shard-aware driver, get the 1tb longevity to overload

    Cassandra Stress times out: BusyPoolException: no available connection and timed out after 5000 MILLISECONDS / using shard-aware driver, get the 1tb longevity to overload

    Installation details
    Scylla version (or git commit hash): 4.3.rc2-0.20201126.bc922a743 with build-id 840fd4b3f6304765c03e886269b1c2550bf23e53
    Cluster size: 4
    OS (RHEL/CentOS/Ubuntu/AWS AMI): ami-09f30667ba6e09e9b (eu-west-1)
    Scenario: 1tb-7days

    Half an hour into the stress run, at 15:15, a consistent BusyPoolException started appearing from three of the four nodes and continued throughout the entire remaining run of the stress:

    15:15:22.497 [Thread-641] DEBUG c.d.driver.core.RequestHandler - [1227134168-0] Error querying 10.0.0.5/10.0.0.5:9042 : com.datastax.driver.core.exceptions.BusyPoolException: [10.0.0.5/10.0.0.5:9042] Pool is busy (no available connection and the queue has reached its max size 256)
    ...
    15:32:59.650 [cluster1-nio-worker-21] DEBUG c.d.driver.core.RequestHandler - [540726118-0] Error querying 10.0.0.5/10.0.0.5:9042 : com.datastax.driver.core.exceptions.BusyPoolException: [10.0.0.5/10.0.0.5:9042] Pool is busy (no available connection and timed out after 5000 MILLISECONDS)
    
    15:25:50.717 [Thread-177] DEBUG c.d.driver.core.RequestHandler - [544250492-0] Error querying 10.0.3.37/10.0.3.37:9042 : com.datastax.driver.core.exceptions.BusyPoolException: [10.0.3.37/10.0.3.37:9042] Pool is busy (no available connection and the queue has reached its max size 256)
    ...
    15:32:59.638 [cluster1-nio-worker-29] DEBUG c.d.driver.core.RequestHandler - [640744570-0] Error querying 10.0.1.149/10.0.1.149:9042 : com.datastax.driver.core.exceptions.BusyPoolException: [10.0.1.149/10.0.1.149:9042] Pool is busy (no available connection and timed out after 5000 MILLISECONDS)
    

    At the same time, the stress experienced consistent WriteTimeoutExceptions, since the writes failed to achieve quorum:

    com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra timeout during SIMPLE write query at consistency QUORUM (2 replica were required but only 1 acknowledged the write)
    com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra timeout during SIMPLE write query at consistency QUORUM (2 replica were required but only 1 acknowledged the write)
    com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra timeout during SIMPLE write query at consistency QUORUM (2 replica were required but only 1 acknowledged the write)
    com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra timeout during SIMPLE write query at consistency QUORUM (2 replica were required but only 1 acknowledged the write)
    com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra timeout during SIMPLE write query at consistency QUORUM (2 replica were required but only 1 acknowledged the write)
    com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra timeout during SIMPLE write query at consistency QUORUM (2 replica were required but only 0 acknowledged the write)
    com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra timeout during SIMPLE write query at consistency QUORUM (2 replica were required but only 0 acknowledged the write)
    com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra timeout during SIMPLE write query at consistency QUORUM (2 replica were required but only 0 acknowledged the write)
    com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra timeout during SIMPLE write query at consistency QUORUM (2 replica were required but only 0 acknowledged the write)
    
    com.datastax.driver.core.exceptions.OperationTimedOutException: [10.0.1.149/10.0.1.149:9042] Timed out waiting for server response
    com.datastax.driver.core.exceptions.OperationTimedOutException: [10.0.0.5/10.0.0.5:9042] Timed out waiting for server response
    com.datastax.driver.core.exceptions.OperationTimedOutException: [10.0.1.149/10.0.1.149:9042] Timed out waiting for server response
    com.datastax.driver.core.exceptions.OperationTimedOutException: [10.0.0.5/10.0.0.5:9042] Timed out waiting for server response
    com.datastax.driver.core.exceptions.OperationTimedOutException: [10.0.0.5/10.0.0.5:9042] Timed out waiting for server response
    com.datastax.driver.core.exceptions.OperationTimedOutException: [10.0.1.149/10.0.1.149:9042] Timed out waiting for server response
    com.datastax.driver.core.exceptions.OperationTimedOutException: [10.0.0.5/10.0.0.5:9042] Timed out waiting for server response
    com.datastax.driver.core.exceptions.OperationTimedOutException: [10.0.3.37/10.0.3.37:9042] Timed out waiting for server response
    com.datastax.driver.core.exceptions.OperationTimedOutException: [10.0.3.37/10.0.3.37:9042] Timed out waiting for server response
    

    At 16:18, the stress started to experience EMPTY RESULT errors:

    16:18:24.362 [cluster1-nio-worker-0] DEBUG com.datastax.driver.core.Connection - Connection[/10.0.3.37:9042-7, inFlight=128, closed=false] Response received on stream 27584 but no handler set anymore (either the request has timed out or it was closed due to another error). Received message is EMPTY RESULT
    16:18:24.362 [cluster1-nio-worker-0] DEBUG com.datastax.driver.core.Connection - Connection[/10.0.3.37:9042-7, inFlight=128, closed=false] Response received on stream 27648 but no handler set anymore (either the request has timed out or it was closed due to another error). Received message is EMPTY RESULT
    16:18:24.362 [cluster1-nio-worker-0] DEBUG com.datastax.driver.core.Connection - Connection[/10.0.3.37:9042-7, inFlight=128, closed=false] Response received on stream 27712 but no handler set anymore (either the request has timed out or it was closed due to another error). Received message is EMPTY RESULT
    
    16:18:24.362 [cluster1-nio-worker-0] DEBUG com.datastax.driver.core.Connection - Connection[/10.0.0.5:9042-11, inFlight=128, closed=false] Response received on stream 32640 but no handler set anymore (either the request has timed out or it was closed due to another error). Received message is EMPTY RESULT
    16:18:24.362 [cluster1-nio-worker-0] DEBUG com.datastax.driver.core.Connection - Connection[/10.0.0.5:9042-11, inFlight=128, closed=false] Response received on stream 32704 but no handler set anymore (either the request has timed out or it was closed due to another error). Received message is EMPTY RESULT
    16:18:24.362 [cluster1-nio-worker-0] DEBUG com.datastax.driver.core.Connection - Connection[/10.0.0.5:9042-11, inFlight=128, closed=false] Response received on stream 0 but no handler set anymore (either the request has timed out or it was closed due to another error). Received message is EMPTY RESULT
    
    16:18:24.362 [cluster1-nio-worker-0] DEBUG com.datastax.driver.core.Connection - Connection[/10.0.1.149:9042-4, inFlight=128, closed=false] Response received on stream 16320 but no handler set anymore (either the request has timed out or it was closed due to another error). Received message is EMPTY RESULT
    16:18:24.362 [cluster1-nio-worker-0] DEBUG com.datastax.driver.core.Connection - Connection[/10.0.1.149:9042-4, inFlight=128, closed=false] Response received on stream 16384 but no handler set anymore (either the request has timed out or it was closed due to another error). Received message is EMPTY RESULT
    16:18:24.362 [cluster1-nio-worker-0] DEBUG com.datastax.driver.core.Connection - Connection[/10.0.1.149:9042-4, inFlight=128, closed=false] Response received on stream 16448 but no handler set anymore (either the request has timed out or it was closed due to another error). Received message is EMPTY RESULT
    

    Weirdly enough, node #4, 10.0.1.77, does not seem to experience any timeouts. In fact, the only messages I see in the stress log in that time period are healthy heartbeat messages:

    14:42:15.661 [cluster1-nio-worker-3] DEBUG com.datastax.driver.core.Connection - Connection[/10.0.1.77:9042-2, inFlight=1, closed=false] Keyspace set to keyspace1
    16:19:32.899 [cluster1-reconnection-0] DEBUG com.datastax.driver.core.Host.STATES - [10.0.1.77/10.0.1.77:9042] preparing to open 1 new connections, total = 15
    16:19:32.901 [cluster1-nio-worker-17] DEBUG com.datastax.driver.core.Connection - Connection[10.0.1.77/10.0.1.77:9042-28, inFlight=0, closed=false] Connection established, initializing transport
    16:19:32.937 [cluster1-nio-worker-17] DEBUG c.d.s.netty.handler.ssl.SslHandler - [id: 0x14eb560e, L:/10.0.1.115:48940 - R:10.0.1.77/10.0.1.77:9042] HANDSHAKEN: TLS_RSA_WITH_AES_128_CBC_SHA
    16:19:41.082 [cluster1-nio-worker-17] DEBUG com.datastax.driver.core.Host.STATES - [10.0.1.77/10.0.1.77:9042] Connection[10.0.1.77/10.0.1.77:9042-28, inFlight=0, closed=false] Transport initialized, connection ready
    16:20:03.838 [cluster1-reconnection-0] DEBUG com.datastax.driver.core.Host.STATES - [Control connection] established to 10.0.1.77/10.0.1.77:9042
    16:20:33.809 [cluster1-nio-worker-17] DEBUG com.datastax.driver.core.Connection - Connection[10.0.1.77/10.0.1.77:9042-28, inFlight=0, closed=false] was inactive for 30 seconds, sending heartbeat
    16:20:41.918 [cluster1-nio-worker-17] DEBUG com.datastax.driver.core.Connection - Connection[10.0.1.77/10.0.1.77:9042-28, inFlight=0, closed=false] heartbeat query succeeded
    16:21:11.926 [cluster1-nio-worker-17] DEBUG com.datastax.driver.core.Connection - Connection[10.0.1.77/10.0.1.77:9042-28, inFlight=0, closed=false] was inactive for 30 seconds, sending heartbeat
    16:21:13.881 [cluster1-nio-worker-17] DEBUG com.datastax.driver.core.Connection - Connection[10.0.1.77/10.0.1.77:9042-28, inFlight=0, closed=false] heartbeat query succeeded
    16:21:43.882 [cluster1-nio-worker-17] DEBUG com.datastax.driver.core.Connection - Connection[10.0.1.77/10.0.1.77:9042-28, inFlight=0, closed=false] was inactive for 30 seconds, sending heartbeat
    16:21:48.369 [cluster1-nio-worker-17] DEBUG com.datastax.driver.core.Connection - Connection[10.0.1.77/10.0.1.77:9042-28, inFlight=0, closed=false] heartbeat query succeeded
    16:22:18.373 [cluster1-nio-worker-17] DEBUG com.datastax.driver.core.Connection - Connection[10.0.1.77/10.0.1.77:9042-28, inFlight=0, closed=false] was inactive for 30 seconds, sending heartbeat
    16:22:22.816 [cluster1-nio-worker-17] DEBUG com.datastax.driver.core.Connection - Connection[10.0.1.77/10.0.1.77:9042-28, inFlight=0, closed=false] heartbeat query succeeded
    

    Screenshot from 2020-12-03 13-44-38

    Screenshot from 2020-12-03 14-04-38

    Judging from the metrics of both foreground and background writes per instance, node #4 indeed receives fewer writes than any other node. It is possible that this caused the in-flight hint messages of the other nodes to fill up, considering that in the errors above nodes 1-3 reported inFlight=128. Perhaps there is an issue with key distribution between the nodes, which caused the other nodes to receive more load than they could handle.

    The failed stress command:

    cassandra-stress write cl=QUORUM n=1100200300 -schema 'replication(factor=3) compaction(strategy=LeveledCompactionStrategy)' -port jmx=6868 -mode cql3 native -rate threads=1000 -col 'size=FIXED(200) n=FIXED(5)' -pop seq=1..1100200300
    

    Other prepare stresses for this run:

    cassandra-stress write cl=QUORUM n=50000000 -schema 'replication(factor=3) compression=LZ4Compressor compaction(strategy=SizeTieredCompactionStrategy)' -port jmx=6868 -mode cql3 native compression=lz4 -rate threads=50 -pop seq=1..50000000 -log interval=5
    cassandra-stress write cl=QUORUM n=50000000 -schema 'replication(factor=3) compression=SnappyCompressor compaction(strategy=SizeTieredCompactionStrategy)' -port jmx=6868 -mode cql3 native compression=snappy -rate threads=50 -pop seq=1..50000000 -log interval=5
    cassandra-stress write cl=QUORUM n=50000000 -schema 'replication(factor=3) compression=DeflateCompressor compaction(strategy=SizeTieredCompactionStrategy)' -port jmx=6868 -mode cql3 native compression=none -rate threads=50 -pop seq=1..50000000 -log interval=5
    cassandra-stress write cl=QUORUM n=50000000 -schema 'replication(factor=3) compression=ZstdCompressor compaction(strategy=SizeTieredCompactionStrategy)' -port jmx=6868 -mode cql3 native compression=none -rate threads=50 -pop seq=1..50000000 -log interval=5
    

    (Each of them runs once, spread across 2 loaders)

    Node list:

    longevity-tls-1tb-7d-4-3-db-node-66a319cd-1 [34.243.3.190 | 10.0.1.149]
    longevity-tls-1tb-7d-4-3-db-node-66a319cd-2 [54.246.50.198 | 10.0.0.5] 
    longevity-tls-1tb-7d-4-3-db-node-66a319cd-3 [54.247.54.152 | 10.0.3.37]
    longevity-tls-1tb-7d-4-3-db-node-66a319cd-4 [52.211.7.163 | 10.0.1.77] 
    

    Logs:

    +---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    |                                                                                            Log links for testrun with test id 66a319cd-223d-450b-8f0f-2bb423d39693                                                                                            |
    +-----------------+-------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | Date            | Log type    | Link                                                                                                                                                                                                                          |
    +-----------------+-------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | 20190101_010101 | prometheus  | https://cloudius-jenkins-test.s3.amazonaws.com/66a319cd-223d-450b-8f0f-2bb423d39693/prometheus_snapshot_20201202_164129.tar.gz                                                                                                |
    | 20201202_163157 | grafana     | https://cloudius-jenkins-test.s3.amazonaws.com/66a319cd-223d-450b-8f0f-2bb423d39693/20201202_163157/grafana-screenshot-overview-20201202_163158-longevity-tls-1tb-7d-4-3-monitor-node-66a319cd-1.png                          |
    | 20201202_163157 | grafana     | https://cloudius-jenkins-test.s3.amazonaws.com/66a319cd-223d-450b-8f0f-2bb423d39693/20201202_163157/grafana-screenshot-scylla-per-server-metrics-nemesis-20201202_163545-longevity-tls-1tb-7d-4-3-monitor-node-66a319cd-1.png |
    | 20201202_164145 | grafana     | https://cloudius-jenkins-test.s3.amazonaws.com/66a319cd-223d-450b-8f0f-2bb423d39693/20201202_164145/grafana-screenshot-overview-20201202_164145-longevity-tls-1tb-7d-4-3-monitor-node-66a319cd-1.png                          |
    | 20201202_164145 | grafana     | https://cloudius-jenkins-test.s3.amazonaws.com/66a319cd-223d-450b-8f0f-2bb423d39693/20201202_164145/grafana-screenshot-scylla-per-server-metrics-nemesis-20201202_164500-longevity-tls-1tb-7d-4-3-monitor-node-66a319cd-1.png |
    | 20201202_165046 | db-cluster  | https://cloudius-jenkins-test.s3.amazonaws.com/66a319cd-223d-450b-8f0f-2bb423d39693/20201202_165046/db-cluster-66a319cd.zip                                                                                                   |
    | 20201202_165046 | loader-set  | https://cloudius-jenkins-test.s3.amazonaws.com/66a319cd-223d-450b-8f0f-2bb423d39693/20201202_165046/loader-set-66a319cd.zip                                                                                                   |
    | 20201202_165046 | monitor-set | https://cloudius-jenkins-test.s3.amazonaws.com/66a319cd-223d-450b-8f0f-2bb423d39693/20201202_165046/monitor-set-66a319cd.zip                                                                                                  |
    | 20201202_165046 | sct-runner  | https://cloudius-jenkins-test.s3.amazonaws.com/66a319cd-223d-450b-8f0f-2bb423d39693/20201202_165046/sct-runner-66a319cd.zip                                                                                                   |
    +-----------------+-------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    

    To start the monitor using hydra:

    hydra investigate show-monitor 66a319cd-223d-450b-8f0f-2bb423d39693
    
    high longevity overload 
    opened by ShlomiBalalis 109
  • some non-prepared statements can leak memory (with set/map/tuple/udt literals)

    some non-prepared statements can leak memory (with set/map/tuple/udt literals)

    This is Scylla's bug tracker, to be used for reporting bugs only. If you have a question about Scylla, and not a bug, please ask it in our mailing-list at [email protected] or in our slack channel.

    • [*] I have read the disclaimer above, and I am reporting a suspected malfunction in Scylla.

    Installation details
    Scylla version: 4.0.4
    Cluster size: 10 nodes, 4 shards per node
    OS: Ubuntu

    After running fine for a few days, nodes consistently start hitting 'bad_alloc' errors, even though we do not have a lot of Data files (~1850) and our data size (400G per node) is not that large compared to the memory available to the node (90G for 4 shards, so about 22G per shard).

    Aug 11 04:31:13 fr-eqx-scylla-04 scylla[10177]: WARN 2020-08-11 04:31:13,007 [shard 0] storage_proxy - Failed to apply mutation from 192.168.96.47#0: std::bad_alloc (std::bad_alloc)

    Our non-LSA memory keeps growing, and once it reaches a certain level the node just starts hitting bad_alloc:

    [Graph: non-LSA memory per shard on scylla-04, 2020-08-11]

    bug User Request bad_alloc 
    opened by withings-sas 104
  • dist: drop deprecated AMI parameters on setup scripts

    dist: drop deprecated AMI parameters on setup scripts

    Since we moved all IaaS code to scylla-machine-image, we no longer need the AMI variable in the sysconfig file or the --ami parameter in the setup scripts, and /etc/scylla/ami_disabled was never used. So let's drop all of them from the Scylla core.

    Related with scylladb/scylla-machine-image#61

    opened by syuu1228 2
  • mutation_fragment_stream_validator: avoid allocation when stream is correct

    mutation_fragment_stream_validator: avoid allocation when stream is correct

    Currently the constructor of said class always allocates: it copies the provided name string and creates a new name via format(). We want to avoid this now that the validator is used on the read path, so defer creating the formatted name to when we actually want to log something, which is either when the log level is debug or when an error is found. We don't care about performance in either case, but we do care about it on the happy path. Further to the above, provide a constructor for string-literal names; when it is used, don't copy the name string, just save a view to it.
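
    A minimal sketch of the idea, with illustrative names only (not the actual patch): keep a cheap, possibly borrowed view of the name, and build the formatted string only on the error/debug path. The real Scylla types differ, so treat this purely as a sketch of the technique.

    #include <string>
    #include <string_view>
    #include <utility>
    #include <variant>

    // Illustrative sketch: defer building the formatted validator name until
    // an error (or debug-level logging) actually needs it.
    class lazy_validator_name {
        // Either a borrowed literal (no allocation) or an owned copy.
        std::variant<std::string_view, std::string> _name;
    public:
        // For string literals: keep a view, no copy, no allocation.
        explicit lazy_validator_name(const char* literal)
            : _name(std::in_place_type<std::string_view>, literal) {}

        // For dynamically-built names: take ownership of the string.
        explicit lazy_validator_name(std::string name)
            : _name(std::in_place_type<std::string>, std::move(name)) {}

        // Called only on the error/debug path, keeping the formatting cost
        // off the happy path.
        std::string formatted() const {
            auto view = std::holds_alternative<std::string_view>(_name)
                            ? std::get<std::string_view>(_name)
                            : std::string_view(std::get<std::string>(_name));
            return "validator for " + std::string(view);
        }
    };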

    Refs: #11174

    opened by denesb 1
  • storage_service: Retry when unknown gossip status node is found

    storage_service: Retry when unknown gossip status node is found

    In 4cefc0151e7375192b91b605c3acaf90a2e346f8 (storage_service: Reject to bootstrap new node when node has unknown gossip status), the bootstrap is rejected immediately when a node with unknown gossip status is found.

    This patch adds retry logic to make it a bit more user-friendly.
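
    A rough sketch of the kind of bounded retry described here (illustrative only; the predicate, retry count, and interval are assumptions, and a real Seastar service would use an asynchronous sleep rather than blocking a thread):

    #include <chrono>
    #include <functional>
    #include <thread>

    // Illustrative only: instead of rejecting the bootstrap as soon as a node
    // with unknown gossip status is seen, re-check a few times to give gossip
    // a chance to converge. The predicate stands in for the real status check.
    bool wait_for_known_gossip_status(const std::function<bool()>& all_statuses_known,
                                      int max_retries = 10,
                                      std::chrono::seconds interval = std::chrono::seconds(5)) {
        for (int attempt = 0; attempt < max_retries; ++attempt) {
            if (all_statuses_known()) {
                return true;                        // safe to continue the bootstrap
            }
            std::this_thread::sleep_for(interval);  // wait and re-check
        }
        return false;                               // still unknown: reject the bootstrap
    }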

    Fixes #11853

    opened by asias 1
  • RFC: gossip: Add instance_id concept

    RFC: gossip: Add instance_id concept

    Consider the following:

    1. A replaced node comes back to the cluster:
    • 3-node cluster: n1, n2, n3
    • n3 is dead
    • n3 is replaced with another node, n4
    • the replaced node n3 later comes back to the cluster
    2. A removed node comes back to the cluster:
    • 3-node cluster: n1, n2, n3
    • n3 is dead
    • n3 is removed with nodetool removenode
    • the removed node n3 later comes back to the cluster

    In commit 12ab2c3d8d6925a14 (storage_service: Prevent removed node to restart and join the cluster), we fixed the restart case. This patch goes one step further to fix more cases.

    Scylla must not allow the replaced or removed node to go back to the cluster, but in cloud environments it is hard to guarantee this never happens.

    In this patch, the instance_id concept is introduced to solve this problem.

    When a scylla node starts, a random and unique instance_id is created. Unlike host_id, the instance_id is unique even if the host_id is the same. For example, a new node is started to replace an old dead node during the replace operation.

    When the node operation like replace or removenode is performed, we mark the instance_id as blocked.

    When the node with the blocked instance_id comes back to the cluster, it won't be able to communicate to other nodes with gossip, as if the node was not back to the cluster.
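
    A toy sketch of this mechanism, with hypothetical types and names (not the actual patch): the receiving side refuses gossip verbs, such as echo, from any instance_id it has marked as blocked.

    #include <stdexcept>
    #include <string>
    #include <unordered_set>

    // Illustrative sketch of the instance_id concept: every node generates a
    // unique id at startup; peers that have blocked that id (because the node
    // was replaced or removed) refuse its gossip messages.
    struct gossip_guard {
        std::unordered_set<std::string> blocked_instance_ids;

        // Called when a replace/removenode operation marks the old node's id.
        void block(const std::string& instance_id) {
            blocked_instance_ids.insert(instance_id);
        }

        // Called on the receiving side of a gossip verb (e.g. echo).
        void check_sender(const std::string& sender_instance_id,
                          const std::string& sender_addr) const {
            if (blocked_instance_ids.count(sender_instance_id)) {
                throw std::runtime_error("The instance_id " + sender_instance_id +
                                         " for node " + sender_addr + " is blocked");
            }
        }
    };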

    For example, when n3 is marked as blocked and comes back:

    n1:

    gossip - The instance_id e76618a2-a2eb-4021-9a38-81a4a1ec3681
    for node 127.0.0.3 is blocked by node 127.0.0.1. Mark node 127.0.0.3 as
    DOWN.
    
    UN  127.0.0.1  544 KB     256          ?  781b9ae2-63a6-4f6f-ade2-8ee6b30e2786  rack1
    UN  127.0.0.2  764 KB     256          ?  295bc7fb-a59a-48de-b5bd-23fd771c0578  rack1
    DN  127.0.0.3  1.62 MB    256          ?  eecba8ab-95f9-4e58-8934-6fd9bf19c773  rack1
    

    n2:

    gossip - The instance_id e76618a2-a2eb-4021-9a38-81a4a1ec3681 for node
    127.0.0.3 is blocked by node 127.0.0.2. Mark node 127.0.0.3 as DOWN
    
    UN  127.0.0.1  544 KB     256          ?  781b9ae2-63a6-4f6f-ade2-8ee6b30e2786  rack1
    UN  127.0.0.2  764 KB     256          ?  295bc7fb-a59a-48de-b5bd-23fd771c0578  rack1
    DN  127.0.0.3  1.62 MB    256          ?  eecba8ab-95f9-4e58-8934-6fd9bf19c773  rack1
    

    n3:

    gossip - failure_detector_loop: Send echo to node 127.0.0.1, status =
    failed: seastar::rpc::remote_verb_error (The instance_id
    e76618a2-a2eb-4021-9a38-81a4a1ec3681 for node 127.0.0.3 is blocked by
    node 127.0.0.1)
    
    gossip - failure_detector_loop: Send echo to node 127.0.0.2, status =
    failed: seastar::rpc::remote_verb_error (The instance_id
    e76618a2-a2eb-4021-9a38-81a4a1ec3681 for node 127.0.0.3 is blocked by
    node 127.0.0.2)
    gossip - failure_detector_loop: Mark node 127.0.0.2 as DOWN
    gossip - failure_detector_loop: Mark node 127.0.0.1 as DOWN
    gossip - InetAddress 127.0.0.1 is now DOWN, status = NORMAL
    gossip - InetAddress 127.0.0.2 is now DOWN, status = NORMAL
    
    DN  127.0.0.1  544 KB     256          ?  781b9ae2-63a6-4f6f-ade2-8ee6b30e2786  rack1
    DN  127.0.0.2  764 KB     256          ?  295bc7fb-a59a-48de-b5bd-23fd771c0578  rack1
    UN  127.0.0.3  1.62 MB    256          ?  eecba8ab-95f9-4e58-8934-6fd9bf19c773  rack1
    

    When n3 restarts:

    [shard 0] init - Startup failed: seastar::rpc::remote_verb_error (The
    instance_id e76618a2-a2eb-4021-9a38-81a4a1ec3681 for node 127.0.0.3 is
    blocked by node 127.0.0.1)
    

    TODO:

    • Persist the blocked instance IDs. It is a small amount of data: even if we blocked 10000 instances over the history of the cluster, that is only 128 bits * 10000 of data.

    • To be safer and prevent any damage to the cluster, we can even make the node stop itself when it knows it is blocked by others.

    Fixes #11217

    opened by asias 7
  • test/alternator: un-xfail a test which passes on modern Python

    test/alternator: un-xfail a test which passes on modern Python

    We had an xfailing test that reproduced a case where Alternator tried to report an error when the request was too long, but the boto library didn't see this error and threw a "Broken Pipe" error instead. It turns out that this wasn't a Scylla bug but rather a bug in urllib3, which overzealously reported a "Broken Pipe" instead of trying to read the server's response. That issue was already fixed in https://github.com/urllib3/urllib3/pull/1524, and now, on modern installations, the test that used to fail passes and reports "XPASS".

    So in this patch we remove the "xfail" tag, and skip the test if running an old version of urllib3.

    Fixes #8195

    opened by nyh 2
  • Mark Alternator TTL feature no longer experimental

    Mark Alternator TTL feature no longer experimental

    According to @tzach in https://github.com/scylladb/scylladb/pull/11997#issuecomment-1321062173, it has been decided that the Alternator TTL feature is no longer experimental.

    Therefore, we should

    1. In db/config.hh, drop the ALTERNATOR_TTL feature.
    2. In db/config.cc, change alternator-ttl to be unused.
    3. In gms/feature_service.cc, drop the if() that disables the ALTERNATOR_TTL feature when the experimental feature is off.
    4. In main.cc, start the expiration feature based on _proxy.data_dictionary().features().alternator_ttl, not on the experimental feature.
    5. Update the documentation to state that this feature is no longer experimental. We already have a pull request for that: https://github.com/scylladb/scylladb/pull/11997 by @annastuchlik

    Note that even after this change, Alternator TTL is still a cluster feature: if it is disabled because some of the nodes are old and either predate Alternator TTL or had it as an un-enabled experimental feature, then the cluster feature will be disabled, UpdateTimeToLive commands will be refused, and the scanning threads will not run. A rough sketch of the resulting gating is shown below.
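
    A minimal sketch of the gating described in steps 3-4, using hypothetical names (the real code lives in feature_service.cc and main.cc):

    // Illustrative only: gate the TTL expiration service on the cluster
    // feature alone, not on the experimental-features switch.
    struct cluster_features {
        bool alternator_ttl = false;   // true only once every node supports it
    };

    // Before this change, both the experimental flag and the cluster feature
    // were required; after it, only the cluster feature matters, so clusters
    // containing old nodes keep the feature disabled automatically.
    bool should_start_ttl_expiration_service(const cluster_features& f) {
        return f.alternator_ttl;
    }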

    We should backport this change to whichever release branches that we decided will have Alternator TTL not marked experimental.

    TTL Alternator 
    opened by nyh 0