DB(s) Benchmark¶
For benches we are using the following tool: H2OAI Bench
Prerequisites
Ubuntu:
Run R and then in a R console type:
Exit from R and type in a terminal:
Group By Results¶
Legend: Q1-Q7 = Query 1 through Query 7. Lower values (ms) are better.
| DB | Q1 | Q2 | Q3 | Q4 | Q5 | Q6 | Q7 |
|---|---|---|---|---|---|---|---|
| DuckDB (multithread turned on) | 63 | 153 | 360 | 23 | 322 | 330 | 878 |
| DuckDB (multithread turned off) | 347 | 690 | 601 | 108 | 440 | 528 | 3269 |
| ClickHouse | 51 | 189 | 235 | 47 | 265 | 249 | 1732 |
| ? (4.0) | 59 | 143 | 166 | 99 | 156 | 551 | 4497 |
| Rayforce | 60 | 74 | 118 | 72 | 122 | 104 | 1394 |
| ThePlatform | 213 | 723 | 507 | 285 | 488 | 465 | 15712 |
Group By Benchmark Details
Dataset: G1_1e7_1e2_0_0.csv (10 million rows)
DuckDB (multithread turned on)
.timer "on"
Load CSV: create table t as SELECT * FROM read_csv('./db-benchmark/G1_1e7_1e2_0_0.csv');
Q1 (63ms)
Q2 (153ms)
Q3 (360ms)
Q4 (23ms)
Q5 (322ms)
Q6 (330ms)
Q7 (878ms)
DuckDB (multithread turned off)
.timer "on"
SET threads = 1;
Load CSV: create table t as SELECT * FROM read_csv('./db-benchmark/G1_1e7_1e2_0_0.csv');
Q1 (347ms)
Q2 (690ms)
Q3 (601ms)
Q4 (108ms)
Q5 (440ms)
Q6 (528ms)
Q7 (3269ms)
ClickHouse
Load CSV: CREATE TABLE t (id1 String, id2 String, id3 String, id4 Int64, id5 Int64, id6 Int64, v1 Int64, v2 Int64, v3 Float64) ENGINE = Memory;
Q1 (51ms)
Q2 (189ms)
Q3 (235ms)
Q4 (47ms)
Q5 (265ms)
Q6 (249ms)
Q7 (1732ms)
? (4.0)
Load CSV: t: ("SSSJJJJJF";enlist",") 0: hsym `$":./db-benchmark/G1_1e7_1e2_0_0.csv"
Q1 (59ms)
Q2 (143ms)
Q3 (166ms)
Q4 (99ms)
Q5 (156ms)
Q6 (551ms)
Q7 (4497ms)
Rayforce
Load CSV: (set t (csv [Symbol Symbol Symbol I64 I64 I64 I64 I64 F64] "./db-benchmark/G1_1e7_1e2_0_0.csv"))
Q1 (60ms)
Q2 (74ms)
Q3 (118ms)
Q4 (72ms)
Q5 (122ms)
Q6 (104ms)
Q7 (1394ms)
ThePlatform
Load CSV: t: ("SSSJJJJJF";enlist",") 0: `$":./db-benchmark/G1_1e7_1e2_0_0.csv"
Q1 (213ms)
Q2 (723ms)
Q3 (507ms)
Q4 (285ms)
Q5 (488ms)
Q6 (465ms)
Q7 (15712ms)
Join Results¶
Legend: Q1 = Left Join, Q2 = Inner Join. OOM = Out of Memory. Lower values (ms) are better.
| DB | Q1 | Q2 |
|---|---|---|
| DuckDB (multithread turned on) | OOM | OOM |
| DuckDB (multithread turned off) | OOM | OOM |
| ClickHouse | OOM | OOM |
| ? (4.0) | 3174 | 3098 |
| Rayforce | 3149 | 1610 |
| ThePlatform | 23987 | 34104 |
Join Benchmark Details
Data generation:
DuckDB (multithread turned on)
.timer "on"
Load CSV:
- x:
create table x as SELECT * FROM read_csv('./db-benchmark/J1_1e7_NA_0_0.csv'); - y:
create table y as SELECT * FROM read_csv('./db-benchmark/J1_1e7_1e7_0_0.csv');
Q1 (OOM)
Q2 (OOM)
DuckDB (multithread turned off)
.timer "on"
SET threads = 1;
Q1 (OOM)
Q2 (OOM)
ClickHouse
Load CSV:
CREATE TABLE x (id1 Int64, id2 Int64, id3 Int64, id4 String, id5 String, id6 String, v1 Float64) ENGINE = Memory;CREATE TABLE y (id1 Int64, id2 Int64, id3 Int64, id4 String, id5 String, id6 String, v2 Float64) ENGINE = Memory;
clickhouse-client -q "INSERT INTO default.x FORMAT CSV" < ./db-benchmark/J1_1e7_NA_0_0.csv
clickhouse-client -q "INSERT INTO default.y FORMAT CSV" < ./db-benchmark/J1_1e7_1e7_0_0.csv
Q1 (OOM)
Q2 (OOM)
? (4.0)
Load CSV:
- x:
("JJJSSSF";enlist",") 0: `$":./db-benchmark/J1_1e7_NA_0_0.csv" - y:
("JJJSSSF";enlist",") 0: `$":./db-benchmark/J1_1e7_1e7_0_0.csv"
Q1 (3174ms)
Q2 (3098ms)
Rayforce
Load CSV:
(set x (csv [I64 I64 I64 Symbol Symbol Symbol F64] "./db-benchmark/J1_1e7_NA_0_0.csv"))(set y (csv [I64 I64 I64 Symbol Symbol Symbol F64] "./db-benchmark/J1_1e7_1e7_0_0.csv"))
Q1 (3149ms)
Q2 (1610ms)
ThePlatform
Load CSV:
- x:
("JJJSSSF";enlist",") 0: `$":./db-benchmark/J1_1e7_NA_0_0.csv" - y:
("JJJSSSF";enlist",") 0: `$":./db-benchmark/J1_1e7_1e7_0_0.csv"
Q1 (23987ms)
0N#.(?[lj[(`x;x);(`y;y);((~`x`id1;~`x`id2);(~`y`id1;~`y`id2))]; (); 0b; `a`b`c`d`e`f`g`h!(~`y`id1;~`y`id2;`y`id3;~`y`id4;~`y`id5;~`y`id6;~`x`v1;~`y`v2)])
Q2 (34104ms)
Window Join Results¶
Legend: Q1 = Window Join query. Lower values are better.
| DB | Q1 |
|---|---|
| ? (4.0) | ~33 min |
| Rayforce | 59145.60 ms |
Window Join Benchmark Details
? (4.0)

Rayforce
