StarRocks Instead of Oracle for Mixed Analytical Workloads: A Practical Test

In this article we share practical experience testing StarRocks, an MPP solution gaining traction for analytical workloads. The Alphyn.ai team regularly publishes load test results on the Alphyn Lakehouse data platform. Today, however, we explore a non-obvious usage scenario — OLTP workloads.

To clarify upfront: by "OLTP" here we do not mean classic transactional load with tens of thousands of TPS typical of operational systems, but rather sub-second response for reads and, importantly, key-based writes under mixed analytical workload conditions. That is exactly what StarRocks declares:

Deliver sub-second latency at scale with an engine purpose-built for modern CPUs and complex SQL.

Background

One of our clients runs a system based on Oracle. The workload involves OLAP from ELT processes, transactional load from analytical applications, and access to prepared data (reports).

Everything you'd expect from Oracle — one universal solution for all use cases. But unfortunately, the existing solution has two major problems:

The system is not scalable;
Import substitution is required.

The seemingly obvious solution is a heterogeneous system — for example, the Greenplum + Postgres + optional ClickHouse combination commonly proposed in the market.

StarRocks achieves 3,036 QPS on reads and sub-second response times under mixed OLTP/OLAP workloads — on a fraction of the hardware Oracle requires.

That approach works, but there are obvious drawbacks:

Drawback	Impact
Data duplication across different DBMSs	Higher total cost of ownership, additional key synchronization solutions, logic duplication, etc.
Data synchronization between DBMSs	Lower SLA due to delivery latency and data replication, increased overhead
Different SQL dialects	Higher team skill requirements, application adaptation for each DBMS
Increased architectural complexity	Multiple administration teams, three types of hardware, and very high total cost of ownership

We and the client wanted to stay within the paradigm of a single universal platform but with scalability capabilities. This is how the story of testing StarRocks for such a challenging and non-trivial task began.

The success criteria were as follows:

Concurrent processing of dozens of Read/Write SQL queries with OLAP profile;
Concurrent processing of hundreds of Read/Write SQL queries with OLTP profile with response under 1 second;
Concurrent processing of queries with mixed load profile (OLAP+OLTP) with response under 1 second.

Concurrent processing in this context means simultaneous read, write, update, and delete operations on the same tables by ETL, applications, and BI (i.e., OLTP/OLAP load profile).

Testing Conditions

Alphyn Lakehouse StaRter Pack build with Alphyn StarRocks 3.5.9 in shared-nothing architecture;
Data model of 25 tables closely matching Oracle DDL:
- Primary Key tables (same PK as in Oracle);
- Replication factor 3;
- Columnar storage format;
- zstd(3) compression;
Data was anonymized, closely approximating production data. Matching:
- Cardinality;
- Partition distribution;
- Referential integrity of the original Oracle model.
SQL queries:
- Several dozen real OLTP/OLAP-profile queries adapted for StarRocks syntax and parameterized with random values;
- Result set cardinality comparable to the original Oracle model;
Test scenarios:
- OLTP load profile: concurrent read/write with OLTP queries only;
- Mixed profile: concurrent read/write with all query types (OLTP+OLAP);
Cache completely disabled both at the database level and at the operating system level;
Optimizations for specific queries prohibited (use of matviews, materializations, pre-applied sorts, etc.);
Cluster configuration, DDL, and SQL configured once and not changed during testing;
Scenarios launched from JMeter and automatically logged;
Results re-verified twice: by the client and an independent auditor.

Cluster Specifications

Infrastructure — cloud environment, virtualization — K8S.

Host	CPU	RAM	Disks	Network
1x FE (front end)	16	32 GB	2 x SSD 838 GB	10 Gbit/s
4x BE (backend)	40	200 GB	4 x SSD 3576 GB	10 Gbit/s

It is important to note that the disk subsystem configuration (primarily the count) is far from optimal for MPP platforms, and a cloud environment with a 10 Gbit/s network always imposes certain constraints. However, jumping ahead — the results are still impressive.

JMeter Scenario Description

A specified number of sessions are started, each randomly selecting one query of the required type (OLTP/OLAP, read/write);
Each query is assigned one or more random parameters (identifier, date, etc.) ensuring work with an arbitrary block of the table;
The query is executed;
The cycle repeats for 120 minutes.

Load Test Results

1. Classic OLTP Profile

First, it was interesting to find out the maximum QPS (queries per second) for queries belonging to the "classic operational OLTP" scenario. In other words, key-access queries of the form:

select * from TABLE where id=<pk>
insert into TABLE values(...)
update TABLE set atr=<> where id=<pk>
delete from TABLE where id=<pk>

Reads and writes separately, 100 sessions each:

Concurrency	Read/Write	Average query execution time, s	QPS
100 sessions	Read	0.025	3,036
100 sessions	Write	0.332	573

Simultaneous read and write to the same table, 100 sessions:

Concurrency	Read/Write	Average query execution time, s	QPS
100 sessions	Read (50 sessions)	0.152	302
100 sessions	Write (50 sessions)	0.587	85

A reminder: "100 sessions" in practice means 100 connections open in StarRocks that continuously launch arbitrary parameterized queries, starting new ones as previous ones complete, for 120 minutes continuously. All of this occurs with caching disabled.

In this experiment we hit the current coordinator configuration limit. The FE (FrontEnd node) maxes out at the available 16 vCPUs.

Important note: the coordinator in StarRocks scales horizontally. This bottleneck can therefore be addressed using standard platform tools.

Prepared Statements

A Prepared Statement is a pre-compiled SQL query stored on the server that can be executed multiple times with different parameters.

When executing a prepared statement, the SQL parsing and CBO plan preparation stages are skipped, resulting in ~30% higher performance compared to the same query with varying parameters.

In StarRocks, Prepared Statements are implemented via the MySQL Binary Protocol. This separates the query lifecycle into three phases:

Prepare — the client sends an SQL query template. The Frontend (FE) parses it, performs basic validation, and stores the structure in an internal cache;
Execute — the client sends only parameter values. The FE substitutes them into the ready plan and sends it for execution;
Close (Deallocate) — removal of the prepared statement from FE memory.

Important: many drivers (especially JDBC) emulate prepared statements client-side by default. To get real benefits in StarRocks, enable Server-side Prepared Statements (e.g., in JDBC via useServerPrepStmts=true).

In the last StarRocks versions starting from version 3, a special optimization for Prepared Statements targeting Point Select (key-based lookup) was introduced. When a query matches the conditions (simple Primary Key lookup from a PK model table), StarRocks uses a "short path" — the query bypasses the standard heavyweight optimizer, and the FE directly communicates with the necessary Backend node (BE) holding the data. This allows achieving tens of thousands of QPS per node, bringing StarRocks close to KV-store performance.

2. OLTP Profile: Real-world Case

The key difference from the previous scenario is that in addition to classic operational OLTP key-access queries, it also includes queries that "look like" OLTP — highly selective OLAP queries.

For example:

select * from TABLE
where region_id = 6
and object_id = 12345
and timestamp('2025-06-30') between date_from and date_to

None of the attributes is a primary key, yet the query returns one row. In a production Oracle system this query runs in tens/hundreds of milliseconds, and even then only thanks to a composite index.

Results after 2 hours of continuous operation:

Concurrent Sessions	Read/Write	Average query execution time, s	QPS
25 sessions	Read (70%) / Write (30%)	0.039	556
50 sessions	Read (70%) / Write (30%)	0.049	903
75 sessions	Read (70%) / Write (30%)	0.064	1,060
100 sessions	Read (70%) / Write (30%)	0.083	1,129

Breakdown by DML type at 100 sessions:

Read/Write	Average query time, s	P95 time, s	QPS
Read	0.061	0.037	1,085
Update	0.636	0.290	6
Insert	0.592	0.250	33
Delete	0.789	0.420	5

In this scenario the load pattern on the cluster changed: the coordinator (FE) is loaded at about one-third capacity, while the BEs are at 100% CPU.

Comment: OLAP queries "resembling" OLTP (not by key) do not use the PK index and run longer, so they hold CPU longer and reduce overall throughput.

How much does key-based query performance differ from non-key OLAP-style queries?

Example	Query Type	Average execution time, ms
Table 1	Read by key	30.15
Table 1	Non-key read	84.16
Table 2	Read by key	29.81
Table 2	Non-key read	138.26
Table 3	Read by key	30.57
Table 3	Non-key read	96.77

Such performance is provided by ZoneMap indexes. In StarRocks, ZoneMap indexes are two-level and very efficient. They dramatically reduce scan volumes, preserving millisecond response times even in such scenarios.

The ZoneMap index stores statistics for each data block (chunk), including Min (minimum value), Max (maximum value), HasNull (presence of null values), and HasNotNull (presence of non-null values). When executing queries, StarRocks can quickly determine which blocks can be skipped.

3. Mixed Profile

This scenario tests how different query groups behave under mixed workload closely resembling reality. Three query groups run simultaneously: ETL, OLAP, OLTP. Will response times hold?

All query types simultaneously write to and read from the same tables. We supplement the test with ETL queries involving massive write/update operations on target tables with analytical Join operations in the select:

insert into TARGET
select ... from T1
join T2
join T3

Resource group settings applied:

OLAP+ETL

ALTER RESOURCE GROUP olap_etl WITH (
    'concurrency_limit' = '24',
    'cpu_weight' = '40',
    'mem_limit' = '0.9',
    'spill_mem_limit_threshold' = '0.9'
);

OLTP

ALTER RESOURCE GROUP oltp WITH (
    'cpu_weight' = '10',
    'exclusive_cpu_cores' = '0',
    'mem_limit' = '0.1'
);

Results at 100 concurrent sessions:

SQL Query Type	Avg time per query, s	QPS
OLTP	0.122	150.92
OLAP	39.520	1.26
ETL	43.800	0.68
TOTAL	0.643	152.87

Resource group configuration can be changed dynamically. After reducing cpu_weight for the olap_etl group from 40 to 20 (redistributing CPU in a 1/3 to 2/3 ratio between OLTP and OLAP+ETL):

SQL Query Type	Avg time per query, s	QPS
OLTP	0.070	291.56
OLAP	44.120	0.91
ETL	47.520	0.69
TOTAL	0.320	293.17

Execution times for all query types are assessed as stable and predictable. Load can be managed dynamically with resource groups, e.g., applying different configurations depending on the load profile at different time windows. Sub-second response for OLTP is maintained; the amount of resources allocated to the OLTP group affects QPS.

4. Impact of Compaction

During prolonged continuous StarRocks operation in our load test, sporadic periodic spikes in OLTP query execution times occur — sometimes up to several seconds when the usual time is less than 100 ms. On OLAP workload such behavior is not noticeable.

Compaction	QPS	OLTP Read, s	OLTP Modifications, s	OLTP P95, s	OLTP Max, s	OLAP, s
Disabled	152	0.093	0.414	0.055	9.110	39.520
Every 5 minutes	219	0.061	0.342	0.049	5.440	47.050
Every 15 minutes	201	0.069	0.364	0.053	7.990	40.810
Default (1 minute)	86	0.140	4.538	0.098	20.220	55.100

Spikes are caused by the Compaction process (StarRocks storage format maintenance), which locks tablet metadata for reads/writes during its operation. Until the maintenance process completes, the query is blocked.

Disabling Compaction on sporadically changing data is bad for performance, as it accumulates many versions affecting query response. Too-frequent Compaction is also bad, causing frequent locks.

Pay attention to default settings — they are completely unsuitable for production use. For production environments, choose an optimal maintenance strategy taking into account available hardware and resources allocated to the maintenance operation.

Conclusion

OLTP query performance — over 1,000 queries/second. Response time under 100 ms on average;
StarRocks handles mixed load profiles while ensuring OLTP execution in under 1 second (122 ms on average);
In a target production configuration, performance is expected to be significantly better.

Technical Aspects

StarRocks is a solution capable of handling mixed workloads with "short queries" for reads and modifications, with OLAP queries typical of ETL transformations and analytical tasks;
For intensive OLTP workloads with minimal response time, Primary Key table type must be used;
The bottleneck under high query concurrency can be the FE coordinator. In production sizing, multiple FEs should be planned not only for fault tolerance but also for load balancing;
Prepared statements increase OLTP query performance by 30% through "soft parse" of identical queries. Note that "soft parsing" works only within a single session;
Timely Compaction must be ensured for planned OLTP workloads. Timeliness is determined by the optimal balance between the allowed percentile deviation from target SLA and resources allocated to the maintenance process;
In mixed-load scenarios, resource group settings must be applied;
StarRocks cluster can be dynamically configured for the required performance with regard to load and data volumes — cluster parameters, resource groups, do not require cluster restart (graceful shutdown).

Migration Methodology

The typical desire for a complex analytical system is to migrate to new technological foundations with minimal changes to application and service logic — preserving expertise, established development approaches, reducing risk of production acceptance, reusing integrations with existing automated test systems and data quality control, and so on.

This is where the main challenge arises. StarRocks performed excellently in our testing on specific practical scenarios, but covering all Oracle-equivalent cases still has work to be done.

The Alphyn.ai team developing the Alphyn Lakehouse data platform has developed additional StarRocks functionality including:

Support for procedural code closely matching Oracle PL/SQL notation;
Materialization to local disk with compute-deduplication of CTE computations;
A mechanism for collecting and storing extended information about completed queries into a unified logging and audit system for working with query history and visualizing graphical plans;
Specialized connectors for fast direct data import and export to extend federation capabilities.

Taking into account these extensions, a successful migration follows this formula:

Migrate the data model as-is with type transformation but without quality loss;
Preliminary conversion on the Oracle side of procedural code related to cursor processing (in an analytical system this is not a good development pattern, and fortunately few such changes will be needed);
Exclusion of trigger-based processing logic;
Migrating Oracle PL/SQL procedural code to LPSQL with minimal changes in procedural logic.

The team is currently conducting research into the effectiveness of new StarRocks functionality for multiple covering b-tree indexes with additional fields included in PK, and prepared statements for insert, update, and delete operations.

StarRocks Instead of Oracle for Mixed Analytical Workloads: A Practical Test

StarRocks Instead of Oracle for Mixed Analytical Workloads: A Practical Test

Background

Testing Conditions

Cluster Specifications

JMeter Scenario Description

Load Test Results

1. Classic OLTP Profile

Prepared Statements

2. OLTP Profile: Real-world Case

3. Mixed Profile

4. Impact of Compaction

Conclusion

Technical Aspects

Migration Methodology

Get the latest posts in your inbox

Continue Reading

The Small Files Problem: S3/MinIO vs HDFS vs Greenplum — Benchmarks

Benchmarking MPP SQL Engines: Impala vs Trino vs GreenPlum

Spark SQL Scripting: Procedural Extension for Apache Spark 4