IMap Connector

The IMap connector supports reading and writing to local IMaps. Remote IMaps are not supported yet.

Jet SQL works with IMaps through the concept of external mapping, just like all other resources. It provides an additional convenience: if your IMap already stores some data, Jet can automatically detect the key and value types by sampling an existing entry. This works only when there's at least one entry stored in every cluster node because we sample one random node.

From the SQL semantics perspective, there is an implicit mapping for all the local IMaps in the partitioned schema. To query a local IMap, you can directly execute a statement against it. If you don't name a schema, Jet first looks up the object in the public schema (where the explicitly created mappings are created) and then in the partitioned schema.

Serialization Options

The keyFormat and valueFormat options are mandatory. Currently, if you create the mapping explicitly, we can't resolve these from a sample.

Possible values for keyFormat and valueFormat:

any of the supported SQL data types, except for OBJECT
portable
json
java

The key and value formats can be different.

Primitive serialization

If the format is one of the primitive type names, then the Java class representing that type will be stored in the IMap. By primitive type we mean any supported SQL data type, except for OBJECT.

For example, to create a mapping for IMap<Integer, String>, use:

CREATE MAPPING my_map
TYPE IMap
OPTIONS (
    'keyFormat'='int',
    'valueFormat'='varchar'
)

`Portable` Serialization

For this format, you need to specify additional options:

keyPortableFactoryId, valuePortableFactoryId
keyPortableClassId, valuePortableClassId
keyPortableVersion, valuePortableVersion: optional, default is 0

If you omit a column list from the CREATE MAPPING command, Jet will resolve column names and types by looking at the ClassDefinition found using the given factory ID, class ID, and version.

If the ClassDefinition with the given IDs is not known to the cluster, the column list is mandatory and Jet will create the ClassDefinition based on the column list.

The benefit of this format is that it doesn't deserialize the whole key or value when reading only a subset of fields. Also it doesn't require a custom Java class to be defined on the cluster, so it's usable for non-Java clients.

Example mapping where both key and value are Portable:

CREATE MAPPING my_map
TYPE IMap
OPTIONS (
    'keyFormat' = 'portable',
    'keyPortableFactoryId' = '123',
    'keyPortableClassId' = '456',
    'keyPortableVersion' = '0',  -- optional
    'valueFormat' = 'portable',
    'valuePortableFactoryId' = '123',
    'valuePortableClassId' = '789',
    'valuePortableVersion' = '0'  -- optional
)

For more information on Portable see Hazelcast IMDG Reference Manual.

JSON Serialization

The value will be stored as HazelcastJsonValue. Jet can't automatically determine the column list for this format, you must explicitly specify it:

CREATE MAPPING my_map(
    __key BIGINT,
    ticker VARCHAR,
    amount INT)
TYPE IMap
OPTIONS (
    'keyFormat' = 'bigint',
    'valueFormat' = 'json')

There are no additional options for this format.

JSON's type system doesn't match SQL's exactly. For example, JSON numbers have unlimited precision, but such numbers are typically not portable. We convert SQL integer and floating-point types into JSON numbers. We convert the DECIMAL type, as well as all temporal types, to JSON strings.

We don't support the JSON type from the SQL standard yet. That means you can't use functions like JSON_VALUE or JSON_QUERY. If your JSON documents don't all have the same fields or if they contain nested objects, the usability is limited.

Internally, we store all JSON values in the string form.

Java Serialization

Java serialization is the last-resort serialization option. It uses the Java object exactly as IMap.get() returns it. You can use it for objects serialized using the Java serialization or Hazelcast custom serialization (DataSerializable or IdentifiedDataSerializable).

For this format you must specify the class name using keyJavaClass and valueJavaClass options, for example:

CREATE MAPPING my_map
TYPE IMap
OPTIONS (
    'keyFormat' = 'java',
    'keyJavaClass' = 'java.lang.Long',
    'valueFormat' = 'java',
    'valueJavaClass' = 'com.example.Person')

If the Java class corresponds to one of the basic data types (numbers, dates, strings), that type will directly be used for the key or value and mapped as a column named __key for keys and this for values. In the example above, the key will be mapped with the BIGINT type. In fact, the above keyFormat & keyJavaClass duo is equivalent to 'keyFormat'='bigint'.

If the Java class is not one of the basic types, Hazelcast will analyze the class using reflection and use its properties as column names. It recognizes public fields and JavaBean-style getters. If some property has a non-primitive type, it will be mapped under the OBJECT type.

The class must be available to the cluster. You can either add it to the members class paths by creating a JAR file and adding it to the lib folder, or you can use User Code Deployment. The user code deployment has to be enabled on the members; add the following section to the config/hazelcast.yaml file:

hazelcast:
  user-code-deployment:
    enabled: true

Then use a client to upload the class:

ClientConfig clientConfig = new JetClientConfig();
clientConfig.getUserCodeDeploymentConfig()
            .setEnabled(true)
            .addClass(Trade.class);
JetInstance jet = Jet.newJetClient(clientConfig);

External Column Name

You rarely need to specify the columns in DDL. If you do, you might need to specify the external name for the column.

The entries in a map naturally have key and value elements. Because of this, the format of the external name must be either __key.<name> for a field in the key or this.<name> for a field in the value. If it's just <name>, then this.<name> is assumed.

The external name defaults to this.<columnName>, so normally you only need to specify it for key fields. There are also columns that represent the entire key and value objects, called __key and this.

Implicit Mapping for Existing IMaps

For existing IMaps with data you don't have to create a mapping at all, you can directly query them. The mapping will be resolved based on a random sample read from the map. However, it's not enough if the map has only one entry - we search for a sample on a single random member; if a sample entry isn't found, the query fails. Therefore if your IMap doesn't contain many rows, it's safer to create a mapping explicitly.

The implicit mapping is also problematic if your map contains heterogeneous data, for example instances of multiple classes or JSON objects with varying set of keys. The mapping will be created from a random sample and a different sample could be used each time leading to a different column list each time.

Heterogeneous Maps

For example, let's say you have this in a JSON-valued IMap:

key	value
1	`{"name":"Alice","age":42}`
2	`{"name":"Bob","age":43,"petName":"Zaz"}`

If you map the column petName, it will have the value null for the entry with key=1. This scenario is supported.

Individual entries in the IMap may even use different serialization strategies. We'll try to extract the field by name regardless of the actual serialization format encountered at runtime. However, while this works in the current release, this setup is unsupported - future optimizations we plan to implement might break it. Only maps with homogenous entries should be accessed through SQL.

Remote IMap

IMaps in remote Hazelcast clusters are not yet supported.

4.5.4

Change Data Capture

Serialization Options

Primitive serialization

`Portable` Serialization

JSON Serialization

Java Serialization

External Column Name

Implicit Mapping for Existing IMaps

Heterogeneous Maps

Remote IMap

Serialization Options

Primitive serialization

Portable Serialization

JSON Serialization

Java Serialization

External Column Name

Implicit Mapping for Existing IMaps

Heterogeneous Maps

Remote IMap

`Portable` Serialization