Left Join Benchmark¶
| DB | Q1 |
|---|---|
| RayforceDB | 3149 |
| ? (4.0) | 3174 |
| ThePlatform | 23987 |
| ClickHouse | OOM |
| DuckDB (multithread turned on) | OOM |
| DuckDB (multithread turned off) | OOM |
OOM = Out of Memory. Lower values (ms) are better.
Data generation:
RayforceDB excels in left join operations, demonstrating superior performance where other systems fail:
- While DuckDB and ClickHouse encounter Out of Memory (OOM) errors, RayforceDB successfully completes the left join query
- RayforceDB (3149ms) performs nearly identically to ? (4.0) (3174ms), with only a 0.8% difference
- RayforceDB is 7.6x faster than ThePlatform (23987ms)
RayforceDB Queries¶
(set x (csv [I64 I64 I64 Symbol Symbol Symbol F64] "./db-benchmark/J1_1e7_NA_0_0.csv"))
(set y (csv [I64 I64 I64 Symbol Symbol Symbol F64] "./db-benchmark/J1_1e7_1e7_0_0.csv"))
(timeit (lj [id1 id2] x y)) ;; Q1
DuckDB Queries¶
.timer "on"
create table x as SELECT * FROM read_csv('./db-benchmark/J1_1e7_NA_0_0.csv');
create table y as SELECT * FROM read_csv('./db-benchmark/J1_1e7_1e7_0_0.csv');
select * from x left join y on x.id1 = y.id1 and x.id2 = y.id2; # Q1
ClickHouse Queries¶
CREATE TABLE x (id1 Int64, id2 Int64, id3 Int64, id4 String, id5 String, id6 String, v1 Float64) ENGINE = Memory;
CREATE TABLE y (id1 Int64, id2 Int64, id3 Int64, id4 String, id5 String, id6 String, v2 Float64) ENGINE = Memory;
clickhouse-client -q "INSERT INTO default.x FORMAT CSV" < ./db-benchmark/J1_1e7_NA_0_0.csv
clickhouse-client -q "INSERT INTO default.y FORMAT CSV" < ./db-benchmark/J1_1e7_1e7_0_0.csv
? (4.0) Queries¶
x: ("JJJSSSF";enlist",") 0: `$":./db-benchmark/J1_1e7_NA_0_0.csv"
y: ("JJJSSSF";enlist",") 0: `$":./db-benchmark/J1_1e7_1e7_0_0.csv"
x lj 2!y // Q1