A Kotlin library for serializing and deserializing Kafka messages using Confluent's Schema Registry.
> [!WARNING]
> This library is in early development, so the API may change in future releases.
It can serialize any type natively supported by Avro4k, plus:

- any `java.util.Collection` and `java.util.Map` implementation
- a `Map` as a `RECORD` schema, where keys are field names and values are field values
- all classes implementing:
  - `IndexedRecord`
  - `GenericRecord`
  - `SpecificRecord` and `SpecificRecordBase` (so it finally works with generated classes)
  - `GenericEnumSymbol`
  - `GenericFixed`
  - `GenericArray`
- all the generic types from Apache's Avro library:
  - `GenericData.Record`
  - `GenericData.Array`
  - `GenericData.Fixed`
  - `GenericData.EnumSymbol`
  - `ByteBuffer`
  - `Utf8`
- `NonRecordContainer`, to make explicit the schema to be used for non-generic types (i.e. types not implementing `GenericContainer`) like primitive types, or to specify unions for schema registration
There is no difference between generic serialization and reflection serialization.
First, if the schema declares a logical type, it is resolved against the logical types registered in the `Avro` instance using `Avro { setLogicalTypeSerializer("logical type name", TheSerializer()) }`.

By default, the following logical types are registered:

- `duration` as `com.github.avrokotlin.avro4k.serializer.AvroDuration`
- `uuid` as `java.util.UUID`
- `date` as `java.time.LocalDate`
- `time-millis` as `java.time.LocalTime`
- `time-micros` as `java.time.LocalTime`
- `timestamp-millis` as `java.time.Instant`
- `timestamp-micros` as `java.time.Instant`
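For quick scanning, the defaults above can be summarized as a plain map from logical-type name to decoded type. This map is purely illustrative and not part of the library's API:

```kotlin
// Illustrative summary of the default logical-type registrations above, as a map
// from Avro logical-type name to the decoded Java/Kotlin type. This map is NOT
// part of the library's API; `duration` is omitted here because its target
// (com.github.avrokotlin.avro4k.serializer.AvroDuration) is a library type.
val defaultLogicalTypeTargets: Map<String, Class<*>> = mapOf(
    "uuid" to java.util.UUID::class.java,
    "date" to java.time.LocalDate::class.java,
    "time-millis" to java.time.LocalTime::class.java,
    "time-micros" to java.time.LocalTime::class.java,
    "timestamp-millis" to java.time.Instant::class.java,
    "timestamp-micros" to java.time.Instant::class.java,
)
```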
Then, for generic deserialization, if no logical type is indicated, or the one set in the schema is not registered, the following schema types can be deserialized:

- `BOOLEAN`, `INT`, `LONG`, `FLOAT`, `DOUBLE`, `STRING` are deserialized as their corresponding Kotlin types `Boolean`, `Int`, `Long`, `Float`, `Double`, `String`
- `BYTES` is deserialized as `ByteArray`
- `ARRAY` is deserialized as `ArrayList`
- `MAP` is deserialized as `HashMap`, where keys are always `String`
- `FIXED` is deserialized as `GenericFixed`
- `ENUM` is deserialized as `GenericEnumSymbol`
- `RECORD` is deserialized as `GenericRecord`
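The mapping above can be restated as a small lookup function. This is a plain-Kotlin sketch for reference only, not library code:

```kotlin
// Plain-Kotlin restatement of the generic deserialization targets above.
// This helper is purely illustrative and is not part of the library.
fun genericTargetFor(schemaType: String): String = when (schemaType) {
    "BOOLEAN" -> "Boolean"
    "INT" -> "Int"
    "LONG" -> "Long"
    "FLOAT" -> "Float"
    "DOUBLE" -> "Double"
    "STRING" -> "String"
    "BYTES" -> "ByteArray"
    "ARRAY" -> "ArrayList"
    "MAP" -> "HashMap" // keys are always String
    "FIXED" -> "GenericFixed"
    "ENUM" -> "GenericEnumSymbol"
    "RECORD" -> "GenericRecord"
    else -> error("Unsupported schema type: $schemaType")
}
```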
For reflection deserialization, the lookup is a bit more specific:

- First, if the `java-class` property exists in the schema, it deserializes to the type it names. This step is skipped if the property's value doesn't exist in the classpath.
- Then, for named types (`ENUM`, `RECORD`, `FIXED`), it tries to deserialize to a concrete class, looked up by the schema full name or its aliases.
- Finally, if no concrete class is found through either the `java-class` property or the schema name/aliases, it falls back to generic deserialization.
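The three-step lookup above can be sketched as a small resolution function. This is a hedged illustration in plain Kotlin, using `Class.forName` as a stand-in for the library's actual class lookup; it is not the library's implementation:

```kotlin
// Sketch of the reflect-deserialization class lookup order described above:
// 1. the schema's "java-class" property, 2. the schema full name, 3. each alias.
// Returns null when nothing matches, signalling a fallback to generic deserialization.
// Illustrative only; not the library's actual code.
fun resolveTargetClass(
    javaClassProp: String?,           // value of the "java-class" schema property, if any
    schemaFullName: String,           // e.g. "com.example.TheEnum"
    aliases: List<String> = emptyList(),
): Class<*>? {
    val candidates = listOfNotNull(javaClassProp) + schemaFullName + aliases
    for (name in candidates) {
        try {
            return Class.forName(name)
        } catch (ignored: ClassNotFoundException) {
            // candidate not on the classpath: try the next one
        }
    }
    return null // no concrete class found: fall back to generic deserialization
}
```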
> [!NOTE]
> To deserialize to a type based on its `@AvroAlias`, or to a type with a custom `@SerialName`, you need to register the type in the `Avro` instance's `SerializersModule`.
Add the dependency to your project:

```kotlin
implementation("com.github.avro-kotlin.avro4k:avro4k-confluent-kafka-serializer:<latest-version>")
```

There are 3 types of serialization:
- Generic: for serializing any type and deserializing to generic classes (like `GenericRecord`, `GenericEnumSymbol`, `GenericFixed`, `GenericArray`) and primitive types (like `Int`, `String`, etc.)
- Reflect: for serializing any type but deserializing to specific records or Kotlin/Java classes. It falls back to generic deserialization if no specific class is found
- Specific: for serializing any type and deserializing to a specific type known at compile time
If you need to (de)serialize data in Kafka based on a schema registry, you will mostly use the reflect serdes/serializers/deserializers, which deserialize to concrete Kotlin/Java classes using reflection.

First, create an instance of the serde. Don't forget to call `.configure(props, isKey)` before using it, to configure the schema registry client.
```kotlin
val serde = ReflectAvro4kKafkaSerde()
serde.configure(mapOf("schema.registry.url" to "http://the-url.com"), isKey = false)
```

You can also create a configured instance directly, removing the need to call `.configure`:
```kotlin
val serde = ReflectAvro4kKafkaSerde(
    isKey = false,
    props = mapOf("schema.registry.url" to "http://the-url.com")
)
```

Then, you can get the serializer and deserializer from the serde:
```kotlin
val serializer = serde.serializer()
val deserializer = serde.deserializer()
```

For some use cases, like with `KafkaProducer` or `KafkaConsumer`, you can also create instances of the serializer and deserializer directly.
```kotlin
val serializer = ReflectAvro4kKafkaSerializer()
serializer.configure(mapOf("schema.registry.url" to "http://the-url.com"), isKey = false)

val deserializer = ReflectAvro4kKafkaDeserializer()
deserializer.configure(mapOf("schema.registry.url" to "http://the-url.com"), isKey = false)
```

Or create configured instances directly:
```kotlin
val serializer = ReflectAvro4kKafkaSerializer(
    isKey = false,
    props = mapOf("schema.registry.url" to "http://the-url.com")
)
val deserializer = ReflectAvro4kKafkaDeserializer(
    isKey = false,
    props = mapOf("schema.registry.url" to "http://the-url.com")
)
```

Finally, if you want to use a custom `Avro` instance (for example, to register custom logical types or a serializers module), you can pass it as a parameter:
```kotlin
val avro = Avro {
    // your custom configuration
    setLogicalTypeSerializer("my-logical-type", MyLogicalTypeSerializer())
    implicitNulls = false
}
val serde = ReflectAvro4kKafkaSerde(
    avro = avro,
    isKey = false,
    props = mapOf("schema.registry.url" to "http://the-url.com")
)
```

You will generally use generic (de)serialization when you don't want to instantiate concrete Kotlin/Java classes during deserialization.
All the creation methods and their usage are similar to the reflect ones.
You can create an instance by explicitly passing the type's serializer.

Don't forget to call `.configure(props, isKey)` before using it, to configure the schema registry client:
```kotlin
val serde = SpecificAvro4kKafkaSerde(YourType.serializer())
serde.configure(mapOf("schema.registry.url" to "http://the-url.com"), isKey = false)

val serializer = SpecificAvro4kKafkaSerializer(YourType.serializer())
serializer.configure(mapOf("schema.registry.url" to "http://the-url.com"), isKey = false)

val deserializer = SpecificAvro4kKafkaDeserializer(YourType.serializer())
deserializer.configure(mapOf("schema.registry.url" to "http://the-url.com"), isKey = false)
```

You can also create an instance using these convenient reified methods, which infer the type automatically (at compile time).
Don't forget to call `.configure(props, isKey)` before using it, to configure the schema registry client:
```kotlin
val serde = SpecificAvro4kKafkaSerde<YourType>()
serde.configure(mapOf("schema.registry.url" to "http://the-url.com"), isKey = false)

val serializer = SpecificAvro4kKafkaSerializer<YourType>()
serializer.configure(mapOf("schema.registry.url" to "http://the-url.com"), isKey = false)

val deserializer = SpecificAvro4kKafkaDeserializer<YourType>()
deserializer.configure(mapOf("schema.registry.url" to "http://the-url.com"), isKey = false)
```

Finally, you can directly create configured instances:
```kotlin
val serde = SpecificAvro4kKafkaSerde<YourType>(
    isKey = false,
    props = mapOf("schema.registry.url" to "http://the-url.com")
)
val serializer = SpecificAvro4kKafkaSerializer<YourType>(
    isKey = false,
    props = mapOf("schema.registry.url" to "http://the-url.com")
)
val deserializer = SpecificAvro4kKafkaDeserializer<YourType>(
    isKey = false,
    props = mapOf("schema.registry.url" to "http://the-url.com")
)
```

In any case, if you want to use a custom `Avro` instance (for example, to register custom logical types or a serializers module), you can pass it as a parameter:
```kotlin
val avro = Avro {
    // your custom configuration
}
val serde = SpecificAvro4kKafkaSerde<YourType>(
    avro = avro,
    isKey = false,
    props = mapOf("schema.registry.url" to "http://the-url.com")
)
```

> [!NOTE]
> Reference documentation: Record serialization and deserialization
Then, configure the following classes (in place of Confluent's `KafkaAvroSerializer` and `KafkaAvroDeserializer`) in your Spring configuration:
- for serializing anything and deserializing generic records, enums and fixed types (not using reflection):
  - `com.github.avrokotlin.avro4k.kafka.confluent.GenericAvro4kKafkaSerde`
  - `com.github.avrokotlin.avro4k.kafka.confluent.GenericAvro4kKafkaSerializer`
  - `com.github.avrokotlin.avro4k.kafka.confluent.GenericAvro4kKafkaDeserializer`
- for serializing anything but deserializing to specific records or Kotlin/Java classes:
  - `com.github.avrokotlin.avro4k.kafka.confluent.ReflectAvro4kKafkaSerde`
  - `com.github.avrokotlin.avro4k.kafka.confluent.ReflectAvro4kKafkaSerializer`
  - `com.github.avrokotlin.avro4k.kafka.confluent.ReflectAvro4kKafkaDeserializer`
```yaml
spring.cloud.stream.kafka.streams.binder.configuration.default.key.serde: com.github.avrokotlin.avro4k.kafka.confluent.GenericAvro4kKafkaSerde
spring.cloud.stream.kafka.streams.binder.configuration.default.value.serde: com.github.avrokotlin.avro4k.kafka.confluent.GenericAvro4kKafkaSerde
spring.cloud.stream.kafka.streams.bindings.<binder name>.consumer.keySerde: com.github.avrokotlin.avro4k.kafka.confluent.GenericAvro4kKafkaSerde
spring.cloud.stream.kafka.streams.bindings.<binder name>.consumer.valueSerde: com.github.avrokotlin.avro4k.kafka.confluent.GenericAvro4kKafkaSerde
```

You may need special handling for your own logical types, so that you can deserialize them to specific types instead of the default ones (like `String` for `uuid`).
You can register your own logical type serializers in the Avro instance used by the serde/serializer/deserializer.
This applies to all the serialization types: generic, reflect and specific.
```kotlin
val serde = ReflectAvro4kKafkaSerde(
    avro = Avro {
        setLogicalTypeSerializer("my-logical-type", MyLogicalTypeSerializer())
    },
    isKey = false,
    props = mapOf("schema.registry.url" to "http://the-url.com")
)
```

When using a schema registry, you will probably evolve your schemas and models over time. However, consuming a Kafka event with a more recent schema version may fail if the schema name has changed. Also, some schemas may have been created with a name that doesn't match your Kotlin class name. Finally, you may want to use the same Kotlin class for different schemas.
To solve this, you can use the `@SerialName` and `@AvroAlias` annotations to declare alternative names for your Kotlin classes.

Then, to make the aliased types discoverable, you need to register those classes in the `Avro` instance used by the serde/serializer/deserializer.
```kotlin
package com.example

@Serializable
enum class TheEnum {
    VALUE1, VALUE2
}

val schemaRegistry = MockSchemaRegistry()
val serializer = ReflectAvro4kKafkaSerializer(isKey = false, schemaRegistry = schemaRegistry)
val bytes = serializer.serialize("topic", TheEnum.VALUE1) // registered schema name is "com.example.TheEnum"
```
```kotlin
package my.awesome.refactoring.packaging

@Serializable
enum class RefactoredEnum {
    VALUE1, VALUE2
}

val deserializer = ReflectAvro4kKafkaDeserializer(isKey = false, schemaRegistry = schemaRegistry)
deserializer.deserialize("topic", bytes) // throws an exception: no class found for schema name "com.example.TheEnum"
```
```kotlin
@Serializable
@AvroAlias("ServerAppEnum")
enum class MobileAppEnumWithAlias {
    VALUE1, VALUE2
}

val deserializer = ReflectAvro4kKafkaDeserializer(isKey = false, schemaRegistry = schemaRegistry)
deserializer.deserialize("topic", bytes) // still throws: the Avro instance doesn't know about MobileAppEnumWithAlias's aliases
```
```kotlin
val deserializer = ReflectAvro4kKafkaDeserializer(
    avro = Avro {
        // register the class so it can be found by its name or aliases
        contextual(MobileAppEnumWithAlias.serializer())
    },
    isKey = false,
    schemaRegistry = schemaRegistry
)
deserializer.deserialize("topic", bytes) // returns MobileAppEnumWithAlias.VALUE1
```