1. pg-emoji
emoji is a pure SQL PostgreSQL extension to encode/decode bytea/text to/from emoji.
A lookup table is constructed from the first 1024 emojis in https://unicode.org/Public/emoji/13.1/emoji-test.txt, where each emoji maps to a unique 10-bit sequence.
The input data is split into 10-bit fragments, and each fragment is mapped to its corresponding emoji.
The first emoji in the result is a header: its first bit is 1 if the result was zero-padded, and the remaining 9 bits are a checksum based on the input data.
If the checksum does not match during decoding, NULL is returned.
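The fragmentation step described above can be sketched in plain SQL. This is an illustration only, not the extension's internal code: it assumes bits are taken most-significant first and that zero padding is appended on the right, neither of which is stated above. The CTE and column names are hypothetical, and the checksum is left out because its algorithm is not described here.

```sql
-- Sketch only: splits a bytea value into 10-bit fragments.
-- Assumptions: MSB-first bit order, zero padding appended on the right.
WITH input AS (
  SELECT '\x0123456789abcdef'::bytea AS data
),
bits AS (
  -- Render every byte as an 8-bit string and concatenate in byte order.
  SELECT string_agg(get_byte(data, i)::bit(8)::text, '' ORDER BY i) AS bitstring
  FROM input, generate_series(0, length(data) - 1) AS i
),
padded AS (
  -- Zero-pad to the next multiple of 10 bits and remember whether padding was needed.
  SELECT rpad(bitstring, (ceil(length(bitstring) / 10.0) * 10)::int, '0') AS bitstring,
         length(bitstring) % 10 <> 0 AS was_padded
  FROM bits
)
SELECT n AS fragment_no,
       substr(bitstring, n * 10 + 1, 10) AS fragment_bits,
       -- Each fragment is a number in 0..1023, usable as a lookup-table index.
       substr(bitstring, n * 10 + 1, 10)::bit(10)::int AS lookup_index,
       was_padded
FROM padded, generate_series(0, length(bitstring) / 10 - 1) AS n
ORDER BY n;
```

For the 8-byte input used here, the 64 input bits are padded to 70, giving 7 fragments. Together with the header emoji, that would make 8 emojis in the encoded result.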
2. Dependencies
None.
3. Installation
Install the emoji extension with:
$ git clone https://github.com/truthly/pg-emoji.git
$ cd pg-emoji
$ make
$ sudo make install
$ make installcheck
4. Usage
Use with:
$ psql
# CREATE EXTENSION emoji;
CREATE EXTENSION
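To confirm that the extension is registered in the current database, one option is to query the standard pg_extension system catalog. This is a generic PostgreSQL check rather than something specific to this extension.

```sql
-- pg_extension is a built-in PostgreSQL catalog listing installed extensions.
SELECT extname, extversion
FROM pg_extension
WHERE extname = 'emoji';
```

A single row is returned if the extension was created successfully. An empty result means CREATE EXTENSION has not been run in this database.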
5. API
emoji.encode(bytea)→text
SELECT emoji.encode('\x0123456789abcdef'::bytea);
encode
----------
????????
(1 row)
Making a subtle change to the input data changes not only the corresponding emoji but also the first emoji, which contains a 9-bit checksum of the data.
The first emoji therefore changes with 99.8% probability (511/512).
Notice in the example below what happens if the last f is changed to 7.
SELECT emoji.encode('\x0123456789abcde7'::bytea);
encode
----------
????????
(1 row)
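The 99.8% figure follows from the checksum width. As a rough check, assuming the checksum of a modified input is uniformly distributed over the 2^9 = 512 possible values, the chance that it differs from the original checksum is 511/512:

```sql
-- Probability that a 9-bit checksum differs after a change,
-- under the assumption of a uniformly distributed checksum.
SELECT round(100 * (1 - 1 / power(2, 9))::numeric, 1) AS detection_pct;
```

This returns 99.8. The remaining 1/512 (about 0.2%) is the chance that two different inputs share the same checksum.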
emoji.decode(text)→bytea
SELECT emoji.decode('????????');
decode
--------------------
\x0123456789abcdef
(1 row)
Because the first emoji contains a 9-bit checksum of the data, failing to copy and paste the entire emoji string is detected on decoding with 99.8% probability, in which case NULL is returned.
SELECT emoji.decode('???????');
decode
--------
(1 row)
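Because a failed checksum yields NULL rather than an error, callers may want to test for it explicitly. The following is a minimal sketch that uses only emoji.encode and emoji.decode as documented above. The CTE name, column aliases, and status strings are illustrative.

```sql
-- Encode a sample value, then decode it and report whether decoding succeeded.
WITH sample AS (
  SELECT emoji.encode('\x0123456789abcdef'::bytea) AS encoded
)
SELECT
  emoji.decode(encoded) AS decoded,
  -- A NULL result signals a failed checksum, so test for it explicitly.
  CASE
    WHEN emoji.decode(encoded) IS NULL THEN 'corrupt or truncated input'
    ELSE 'ok'
  END AS status
FROM sample;
```

The same CASE expression can be applied to an emoji string received from elsewhere to separate valid from damaged input.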
emoji.from_text(text)→text
SELECT emoji.from_text('Hello ?!');
from_text
------------
??????????
(1 row)
emoji.to_text(text)→text
SELECT emoji.to_text('??????????');
to_text
----------
Hello ?!
(1 row)
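Since emoji.to_text is the inverse of emoji.from_text, a round trip should return the original string. A small sanity check along those lines, using a hypothetical sample string:

```sql
-- Round trip: text -> emoji -> text should give back the original string.
SELECT emoji.to_text(emoji.from_text('Hello world!')) = 'Hello world!' AS round_trip_ok;
```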