How have the realities of databases and data management in general been evolving? And how has ScyllaDB been keeping up? We connected with ScyllaDB co-founder and CEO Dor Laor to discuss the details of the new release, as well as developments in the database world.
Cloud as the center of gravity for databases
We first covered ScyllaDB on ZDNet back in 2017. Its story is one of deep tech, open source, and pivots. Started by Red Hat veterans Dor Laor and Avi Kivity, whose background is in hypervisors and Linux, the company behind the database that positions itself as a faster Apache Cassandra did not set out to build a database at all. Having embarked on that course, however, it has stayed on it.

Laor is a very technically oriented CEO, who prefers to dive head-first into an analysis of what ScyllaDB 5.0 brings to the table on the technical front. However, we thought we’d start with the overall trends driving technical developments, which Laor also acknowledged. Granted, it’s nothing you have not heard before: data is going to the cloud, and real-time data processing is on the rise.

ScyllaDB has been operating its own database as a service, Scylla Cloud, for just a few years, but it’s quickly becoming the center of gravity for the company. Scylla Cloud was introduced in 2019, and grew 200% in 2021, following up on 200% growth in 2020. Laor said the service’s momentum is strong, with growth of 140% predicted for 2022. It will become half of ScyllaDB’s business, Laor went on to add, as people just prefer to consume services:

“It’s hard to find talent to run a distributed database. It’s a challenge and also very expensive. Vendors who maintain their own automation around this will bring [users] better results, because our implementation is the recommended way. Most users who run a database on their own will be too busy to implement backup and restore, for example. That’s not the case with us”, Laor said.

Scylla Cloud was initially made available on AWS and later expanded to cover GCP too. On AWS, users can choose to run ScyllaDB in their own account if they wish. On GCP, ScyllaDB will soon be available in the marketplace. Support for Azure is coming soon, too. Laor said their focus at the moment is on automating and completing various aspects of the service’s user management and security.
As part of its own research, ScyllaDB conducted some benchmarks on AWS. Those benchmarks were shared with the public at Scylla Summit 2022, the company’s recent online event. Benchmarking is hard, something a vendor as invested in benchmarks as ScyllaDB knows well. ScyllaDB staff benchmarked their database at the petabyte level, using features like workload prioritization to control the priorities of transactional (read-write) and analytic (read-only) queries on the same cluster with smooth and predictable performance.

In the process, they also unearthed some insights on different vendor CPUs and AWS instances. At the summit, benchmarks were presented comparing AWS i3 instances, based on Intel’s x86 solution, with instances running on Arm. AWS will also soon make available i4, another instance family based on newer x86 machines, and since ScyllaDB had early access, they included it as well.

All of these families are outstanding, Laor said. ScyllaDB’s tests showed i4s to be twice as fast as i3s. Arm-based instances were generally found to be slower, but if you factor in price performance, then on some workloads they’re cheaper than i3s, Laor said. Overall, however, all of them are recommended: their NVMe storage has improved a lot, and they are far better than network storage, he went on to add.
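The price-performance argument can be made concrete with a little arithmetic. The Python sketch below divides throughput by hourly price to get operations per dollar, which is one common way to express price performance. The instance labels, throughput figures, and prices are placeholders invented for illustration; they are not ScyllaDB’s benchmark results or AWS list prices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceResult:
    name: str
    ops_per_second: float  # measured throughput (hypothetical value)
    usd_per_hour: float    # hourly instance price (hypothetical value)

    def ops_per_dollar(self) -> float:
        """Operations served for each dollar spent on the instance."""
        return self.ops_per_second * 3600 / self.usd_per_hour


def rank_by_price_performance(results):
    """Return results ordered from best to worst operations per dollar."""
    return sorted(results, key=lambda r: r.ops_per_dollar(), reverse=True)


# Placeholder numbers only, chosen to show that a slower instance
# can still come out ahead once price is taken into account.
RESULTS = [
    InstanceResult("x86-older", 100_000, 2.50),
    InstanceResult("x86-newer", 200_000, 3.00),
    InstanceResult("arm", 80_000, 1.50),
]

if __name__ == "__main__":
    for r in rank_by_price_performance(RESULTS):
        print(f"{r.name}: {r.ops_per_dollar():,.0f} ops per dollar")
```

With these made-up inputs, the Arm instance has the lowest raw throughput yet ranks above the older x86 instance per dollar, which is the shape of the trade-off Laor described.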
Data at scale and in real-time
The other trend in data management that ScyllaDB is playing into is the ongoing emphasis on real-time data processing. One notable example from Scylla Summit 2022 was Palo Alto Networks using stream processing with ScyllaDB, without a message queue. The motivation was to reduce operational complexity, and by extension, cost.

Initially, we thought that may have been built on top of ScyllaDB’s Change Data Capture (CDC) feature, which has been in place since version 4.0. CDC allows users to track changes in their data, recording both the original values and the new values of records. Changes are streamed to a standard CQL table that can be indexed or filtered to find critical changes to data.

Apparently, Palo Alto’s use case was a tailor-made one, also involving Kafka. If you know your data pattern, that’s the best way, Laor commented. CDC will usually be implemented for users who don’t know what was written to the database, or whose data does not have a regular pattern.

Regardless, the rise of real-time data processing shows in ScyllaDB’s partnerships, as well as in the program of its recent summit. The summit featured presentations from Confluent, Redpanda, and StreamNative, who all deal with real-time data processing, with the first two being vendors in this space. Laor noted that ScyllaDB has a Kafka connector and other connectors people can work with.

As far as technical achievements go, ScyllaDB 5.0 has made progress on two key fronts: performance and operations. On the performance front, Laor emphasized ScyllaDB’s new I/O scheduler, which has been in the works for about 6 years. It’s built to match new hardware capabilities and works on the shard level. What ScyllaDB’s people realized was that workloads with mixed read/write requests require special management, and this is what they worked on.

Another major performance improvement was in how large partitions are managed. Those are tricky both for the database and for users.
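To make the change data capture idea described above more concrete, here is a toy in-memory model in Python. It is not ScyllaDB’s CDC implementation or API: the `ToyTable` class, the `Change` record, and the `write` and `changes_for` methods are hypothetical names invented for this sketch. It only illustrates the principle that every write appends an entry holding the previous and the new value to a log that can be filtered later.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass(frozen=True)
class Change:
    key: str
    old_value: Optional[Any]  # None when the key did not exist before
    new_value: Any


class ToyTable:
    """A key-value table that records every write in a change log."""

    def __init__(self) -> None:
        self._rows: Dict[str, Any] = {}
        self.change_log: List[Change] = []

    def write(self, key: str, value: Any) -> None:
        # Capture the value before the write, apply the write,
        # then append both values to the log.
        old = self._rows.get(key)
        self._rows[key] = value
        self.change_log.append(Change(key, old, value))

    def changes_for(self, key: str) -> List[Change]:
        # Filter the log, in write order, for a single key.
        return [c for c in self.change_log if c.key == key]


if __name__ == "__main__":
    table = ToyTable()
    table.write("sensor-1", 20)
    table.write("sensor-2", 7)
    table.write("sensor-1", 35)
    for change in table.changes_for("sensor-1"):
        print(change)
```

In a real deployment the log would be a queryable table maintained by the database rather than a Python list, but the consumer-side pattern of filtering recorded changes is the same in spirit.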
ScyllaDB improved the indexing of large partitions and added the ability to cache indexes. Laor referred to this issue as going from “half-solved” in Cassandra and previous ScyllaDB versions to “fully-solved” in ScyllaDB 5.0.

In terms of operational improvements, the major change is the shift from being an eventually consistent database to an immediately consistent database, as Laor put it. The consensus protocol governing transactions has changed, as ScyllaDB switched from Paxos to Raft. Laor elaborated on the journey. When ScyllaDB implemented the Paxos protocol for lightweight transactions, they also started implementing the DynamoDB API for Alternator, and completed the Jepsen tests. That showed the limitations of the Paxos protocol, including scenarios that are not transactional, such as schema changes and topology changes. With Raft, multiple schema changes can be supported in a transactional fashion, while topology changes are a work in progress.

The other major improvement is around repair-based node operations. Node operations refer to adding, removing or replacing nodes in a cluster. In all of those operations, data has to be streamed back and forth from other replicas. That’s a heavyweight operation, followed by a repair phase. The repair-based node operations protocol rolls both into one phase while being stateful. This means a quicker operation that can also be resumed.

Overall, Laor outlined continued technical evolution and projected business growth for ScyllaDB. The customer base has been expanding, from household names such as Amdocs and Instacart to more exotic use cases around blockchain. The database itself is use case agnostic, although high data volumes and time-series applications are where it shines: affordable scale, as Laor put it. Growth so far has been coming mostly from brownfield use cases, i.e. from clients replacing Cassandra or DynamoDB with ScyllaDB; however, the greenfield segment is growing too, Laor mentioned.
ScyllaDB’s plans include the expansion of its cloud offering to Azure, as well as multi-tenancy and serverless features built on its Kubernetes operator. As the world’s digital footprint is expanding, it’s a good time to be in the data business, Laor concluded.