pg_roaringbitmap

RoaringBitmap extension for PostgreSQL.

It’s initial based on https://github.com/zeromax007/gpdb-roaringbitmap.

Introduction

Roaring bitmaps are compressed bitmaps which tend to outperform conventional compressed bitmaps such as WAH, EWAH or Concise. In some instances, roaring bitmaps can be hundreds of times faster and they often offer significantly better compression. They can even be faster than uncompressed bitmaps. More information https://github.com/RoaringBitmap/CRoaring.

Build

Requirements

  • PostgreSQL 10+
  • Other Requirements from https://github.com/RoaringBitmap/CRoaring

Note: The regression testing before the version release only covers PostgreSQL 13 and above.

Build

su - postgres
make
sudo make install
psql -c "create extension roaringbitmap"

Note:You can use make -f Makefile_native instead of make to let the compiler use the SIMD instructions if it’s supported by your CPU. In some scenarios, it may double the performance. But if you copy the pg_roaringbitmap binary which builded on machine with SIMD support to other machine without SIMD and run, you could get a SIGILL crash.

Test

make installcheck

Usage

roaringbitmap

about roaringbitmap data type

Logically, you could think of roaringbitmap data type as bit(4294967296), and it should be noted that the integers added to bitmaps is considered to be unsigned. Within bitmaps, numbers are ordered according to uint32. We order the numbers like 0, 1, …, 2147483647, -2147483648, -2147483647,…, -1. But we use bigint to reference the range of these integers, that is [0 4294967296).

input and ouput

Support two kind of input/output syntax ‘array’ and ‘bytea’, The default output format is ‘bytea’.

postgres=# select roaringbitmap('{1,100,10}');
                 roaringbitmap
------------------------------------------------
 \x3a30000001000000000002001000000001000a006400
(1 row)

or

postgres=# select '\x3a30000001000000000002001000000001000a006400'::roaringbitmap;
                 roaringbitmap
------------------------------------------------
 \x3a30000001000000000002001000000001000a006400
(1 row)

output format can changed by roaringbitmap.output_format

postgres=# set roaringbitmap.output_format='bytea';
SET
postgres=# select '{1}'::roaringbitmap;
             roaringbitmap
----------------------------------------
 \x3a3000000100000000000000100000000100
(1 row)

postgres=# set roaringbitmap.output_format='array';
SET
postgres=# select '{1}'::roaringbitmap;
 roaringbitmap
---------------
 {1}
(1 row)

sample of usage

Use bitmap as type of column

CREATE TABLE t1 (id integer, bitmap roaringbitmap);

Build bitmap from integers

INSERT INTO t1 SELECT 1,rb_build(ARRAY[1,2,3,4,5,6,7,8,9,200]);

INSERT INTO t1 SELECT 2,rb_build_agg(e) FROM generate_series(1,100) e;

Bitmap Calculation (OR, AND, XOR, ANDNOT)

SELECT roaringbitmap('{1,2,3}') | roaringbitmap('{3,4,5}');
SELECT roaringbitmap('{1,2,3}') & roaringbitmap('{3,4,5}');
SELECT roaringbitmap('{1,2,3}') # roaringbitmap('{3,4,5}');
SELECT roaringbitmap('{1,2,3}') - roaringbitmap('{3,4,5}');

Bitmap Aggregate (OR, AND, XOR, BUILD)

SELECT rb_or_agg(bitmap) FROM t1;
SELECT rb_and_agg(bitmap) FROM t1;
SELECT rb_xor_agg(bitmap) FROM t1;
SELECT rb_build_agg(e) FROM generate_series(1,100) e;

Calculate cardinality

SELECT rb_cardinality('{1,2,3}');

Convert bitmap to integer array

SELECT rb_to_array(bitmap) FROM t1 WHERE id = 1;

Convert bitmap to SET of integers

SELECT unnest(rb_to_array('{1,2,3}'::roaringbitmap));

or

SELECT rb_iterate('{1,2,3}'::roaringbitmap);

Opperator List

Function List

Aggregation List

roaringbitmap64

about roaringbitmap64 data type

​​roaringbitmap64​​ is a 64-bit Roaring bitmap implementation, and its format definition can be found in https://github.com/RoaringBitmap/RoaringFormatSpec. Logically, you could think of roaringbitmap64 data type as bit(18446744073709551615) just like roaringbitmap, and it should be noted that the bigint data added to bitmaps is considered to be unsigned. Within 64 bit bitmaps, numbers are ordered according to uint64. We order the numbers like 0, 1, …, 9223372036854775807, -9223372036854775808, -9223372036854775807,…, -1.

input and ouput

Support two kind of input/output syntax ‘array’ and ‘bytea’, The default output format is ‘bytea’.

postgres=# select roaringbitmap64('{1,100,10}');
                            roaringbitmap64
------------------------------------------------------------------------
 \x0100000000000000000000003a30000001000000000002001000000001000a006400
(1 row)

or

postgres=# select '\x0100000000000000000000003a30000001000000000002001000000001000a006400'::roaringbitmap64;
                            roaringbitmap64
------------------------------------------------------------------------
 \x0100000000000000000000003a30000001000000000002001000000001000a006400
(1 row)

output format can changed by roaringbitmap.output_format

postgres=# set roaringbitmap.output_format='bytea';
SET
postgres=# select '{1}'::roaringbitmap64;
                        roaringbitmap64
----------------------------------------------------------------
 \x0100000000000000000000003a3000000100000000000000100000000100
(1 row)

postgres=# set roaringbitmap.output_format='array';
SET
postgres=# select '{1}'::roaringbitmap64;
 roaringbitmap64
-----------------
 {1}
(1 row)

sample of usage

Use bitmap as type of column

CREATE TABLE t1 (id integer, bitmap roaringbitmap64);

Build bitmap from set of bigint

INSERT INTO t1 SELECT 1,rb64_build(ARRAY[1,2,3,4,5,6,7,8,9,200]);

INSERT INTO t1 SELECT 2,rb64_build_agg(e) FROM generate_series(1,100) e;

Bitmap Calculation (OR, AND, XOR, ANDNOT)

SELECT roaringbitmap64('{1,2,3}') | roaringbitmap64('{3,4,5}');
SELECT roaringbitmap64('{1,2,3}') & roaringbitmap64('{3,4,5}');
SELECT roaringbitmap64('{1,2,3}') # roaringbitmap64('{3,4,5}');
SELECT roaringbitmap64('{1,2,3}') - roaringbitmap64('{3,4,5}');

Bitmap Aggregate (OR, AND, XOR, BUILD)

SELECT rb64_or_agg(bitmap) FROM t1;
SELECT rb64_and_agg(bitmap) FROM t1;
SELECT rb64_xor_agg(bitmap) FROM t1;
SELECT rb64_build_agg(e) FROM generate_series(1,100) e;

Calculate cardinality

SELECT rb64_cardinality('{1,2,3}');

Convert bitmap to bigint array

SELECT rb64_to_array(bitmap) FROM t1 WHERE id = 1;

Convert bitmap to SET of bigint

SELECT unnest(rb64_to_array('{1,2,3}'::roaringbitmap64));

or

SELECT rb64_iterate('{1,2,3}'::roaringbitmap64);

Opperator List

Function List

Aggregation List

Cloud Vendor Support

pg_roaringbitmap is supported by the following cloud vendors

  • Alibaba Cloud RDS PostgreSQL: https://www.alibabacloud.com/help/doc-detail/142340.htm
  • Huawei Cloud RDS PostgreSQL: https://support.huaweicloud.com/usermanual-rds-pg/rds_09_0045.html
  • Tencent Cloud RDS PostgreSQL: https://cloud.tencent.com/document/product/409/67299
  • Google Cloud SQL: https://docs.cloud.google.com/sql/docs/postgres/extensions

To request support for pg_roaringbitmap from other cloud vendors, please see the following:

  • Amazon RDS: send an email to rds-postgres-extensions-request@amazon.com with the extension name and use case (docs)
  • DigitalOcean Managed Databases: comment on this idea
  • Azure Database for PostgreSQL: comment on this post