This repository has been archived by the owner on Nov 20, 2020. It is now read-only.

LZ4 compression #129

Open
eulerfx opened this issue Apr 5, 2017 · 4 comments

Comments

@eulerfx
Contributor

eulerfx commented Apr 5, 2017

No description provided.

@vchekan

vchekan commented Nov 2, 2017

You might be interested in my implementation of LZ4 in kafka4net for some hints:
https://github.com/vchekan/kafka4net/blob/master/src/Compression/Lz4KafkaStream.cs
Things like the bug in Kafka's checksum implementation can take a lot of time to debug:
https://issues.apache.org/jira/browse/KAFKA-3160
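
To make the checksum pitfall concrete, here is a minimal Python sketch. It assumes, on one reading of KAFKA-3160, that the affected Kafka framing code hashed the LZ4 magic number together with the frame descriptor, whereas the LZ4 frame format hashes the descriptor only. The pure-Python `xxh32` helper, the `header_checksum` function, and the example FLG/BD byte values are illustrative and are not taken from kafka4net or Kafka.

```python
import struct

_P1, _P2, _P3, _P4, _P5 = 0x9E3779B1, 0x85EBCA77, 0xC2B2AE3D, 0x27D4EB2F, 0x165667B1
_MASK = 0xFFFFFFFF


def _rotl(x: int, r: int) -> int:
    """Rotate a 32-bit value left by r bits."""
    return ((x << r) | (x >> (32 - r))) & _MASK


def xxh32(data: bytes, seed: int = 0) -> int:
    """Pure-Python XXH32, written out only to keep the sketch dependency-free."""
    n, i = len(data), 0
    if n >= 16:
        v = [(seed + _P1 + _P2) & _MASK, (seed + _P2) & _MASK,
             seed & _MASK, (seed - _P1) & _MASK]
        while i + 16 <= n:
            # Four little-endian 32-bit lanes per 16-byte stripe.
            for lane, word in enumerate(struct.unpack_from("<4I", data, i)):
                v[lane] = (_rotl((v[lane] + word * _P2) & _MASK, 13) * _P1) & _MASK
            i += 16
        h = (_rotl(v[0], 1) + _rotl(v[1], 7) + _rotl(v[2], 12) + _rotl(v[3], 18)) & _MASK
    else:
        h = (seed + _P5) & _MASK
    h = (h + n) & _MASK
    while i + 4 <= n:
        (word,) = struct.unpack_from("<I", data, i)
        h = (_rotl((h + word * _P3) & _MASK, 17) * _P4) & _MASK
        i += 4
    while i < n:
        h = (_rotl((h + data[i] * _P5) & _MASK, 11) * _P1) & _MASK
        i += 1
    # Final avalanche.
    h ^= h >> 15
    h = (h * _P2) & _MASK
    h ^= h >> 13
    h = (h * _P3) & _MASK
    h ^= h >> 16
    return h


LZ4_FRAME_MAGIC = struct.pack("<I", 0x184D2204)  # LZ4 frame magic, little-endian


def header_checksum(flg: int, bd: int, include_magic: bool) -> int:
    """Second byte of XXH32 over the covered header bytes.

    include_magic=True models the assumed legacy Kafka behaviour;
    include_magic=False follows the LZ4 frame format.
    """
    descriptor = bytes([flg, bd])
    covered = LZ4_FRAME_MAGIC + descriptor if include_magic else descriptor
    return (xxh32(covered) >> 8) & 0xFF


if __name__ == "__main__":
    flg, bd = 0x60, 0x40  # example values: version 01 + independent blocks, 64 KB blocks
    print("spec checksum:   0x%02X" % header_checksum(flg, bd, include_magic=False))
    print("legacy checksum: 0x%02X" % header_checksum(flg, bd, include_magic=True))
```

A decoder that has to read frames from both kinds of producer would presumably need to accept either byte, which is the sort of detail that is hard to discover without a reference implementation to compare against.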

Another piece of advice: you might want to invest in a Java cross-validation framework, like this:
https://github.com/vchekan/kafka4net/blob/master/tools/binary-console/src/main/scala/com/ntent/kafka/main.scala
where I generate Kafka messages using the Java driver and use them as the gold standard, with different compression types and buffer sizes. As an additional bonus, I get confidence that my implementation works with the Java consumer.

@eulerfx
Contributor Author

eulerfx commented Nov 3, 2017

Hey, thanks for the pointer. I wasn't aware of the LZ4 library and was considering implementing the compression from scratch. Has your experience with the library been good?

Regarding the cross-validation framework - good call, I think this would be useful.

@vchekan

vchekan commented Nov 3, 2017

I remember we used compression in production for many years, but I do not recall which one it was, Snappy or LZ4.
I test compression plus my framing implementation here (every buffer size from 1 byte to 256Kb, random content):
https://github.com/vchekan/kafka4net/blob/master/tests/CompressionTests.cs
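
The buffer-size sweep described above can be sketched roughly as follows. This is not the kafka4net test: `zlib` from the Python standard library stands in for LZ4, and the length-prefixed block framing, `compress_framed`, `decompress_framed`, and `first_failing_size` are hypothetical names invented for illustration.

```python
import os
import struct
import zlib


def compress_framed(payload: bytes, block_size: int) -> bytes:
    """Split payload into blocks, compress each, and prefix it with its length."""
    out = bytearray()
    for start in range(0, len(payload), block_size):
        block = zlib.compress(payload[start:start + block_size])
        out += struct.pack("<I", len(block)) + block
    out += struct.pack("<I", 0)  # zero-length block marks the end of the stream
    return bytes(out)


def decompress_framed(framed: bytes) -> bytes:
    """Inverse of compress_framed."""
    out = bytearray()
    pos = 0
    while True:
        (size,) = struct.unpack_from("<I", framed, pos)
        pos += 4
        if size == 0:
            break
        out += zlib.decompress(framed[pos:pos + size])
        pos += size
    return bytes(out)


def first_failing_size(max_size: int, block_size: int = 4096):
    """Round-trip random payloads of every size from 1 to max_size.

    Returns the first size whose round trip does not reproduce the input,
    or None if every size succeeds.
    """
    for size in range(1, max_size + 1):
        payload = os.urandom(size)
        if decompress_framed(compress_framed(payload, block_size)) != payload:
            return size
    return None


if __name__ == "__main__":
    # A small block size forces many block boundaries even for short payloads.
    print(first_failing_size(2000, block_size=64))
```

Sweeping every size, rather than a handful of sample sizes, is what exercises the off-by-one cases around block boundaries.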

Here I run a Java compatibility test for the gzip, lz4, and snappy codecs. The idea is to invoke Java to generate messages with random content. C# creates a text file with the desired message sizes; Java generates messages of those lengths, publishes them to Kafka, and writes a text file with the hash codes of the generated messages. C# then reads Java's hashes, consumes the messages, and compares each message's hash to the one generated by Java.
https://github.com/vchekan/kafka4net/blob/master/tests/RecoveryTest.cs#L1967
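
The file-based handshake described above might look something like the sketch below. Everything here is an assumption made for illustration: an in-memory list stands in for the Kafka topic, `reference_producer` stands in for the Java side, SHA-256 stands in for whatever hash the real test uses, and the file names are made up.

```python
import hashlib
import os
import tempfile
from pathlib import Path


def write_sizes(sizes_path: Path, sizes) -> None:
    """Test driver side: record the desired message sizes, one per line."""
    sizes_path.write_text("\n".join(str(s) for s in sizes) + "\n")


def reference_producer(sizes_path: Path, hashes_path: Path, topic: list) -> None:
    """Stand-in for the reference (Java) producer.

    Generates one random message per requested size, publishes it to the
    in-memory topic, and records its hash.
    """
    hashes = []
    for line in sizes_path.read_text().split():
        message = os.urandom(int(line))
        topic.append(message)
        hashes.append(hashlib.sha256(message).hexdigest())
    hashes_path.write_text("\n".join(hashes) + "\n")


def verify_consumed(hashes_path: Path, consumed: list) -> list:
    """Return the indexes of consumed messages whose hash differs from the reference."""
    expected = hashes_path.read_text().split()
    if len(consumed) != len(expected):
        raise ValueError(f"expected {len(expected)} messages, got {len(consumed)}")
    return [
        i for i, (message, digest) in enumerate(zip(consumed, expected))
        if hashlib.sha256(message).hexdigest() != digest
    ]


def cross_validate(sizes) -> list:
    """Run the whole handshake in a temporary directory and return the mismatches."""
    with tempfile.TemporaryDirectory() as tmp:
        sizes_path, hashes_path = Path(tmp, "sizes.txt"), Path(tmp, "hashes.txt")
        topic = []
        write_sizes(sizes_path, sizes)
        reference_producer(sizes_path, hashes_path, topic)
        # A real test would consume from Kafka through the client under test here.
        return verify_consumed(hashes_path, list(topic))


if __name__ == "__main__":
    print(cross_validate([1, 10, 1000, 65536]))
```

Exchanging hashes rather than payloads keeps the side channel between the two processes small, and any decompression or framing error on the consuming side shows up as a mismatched index.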

@eulerfx
Contributor Author

eulerfx commented Feb 28, 2018

It looks like there is now a lot of interest at Jet in getting LZ4 into Kafunk. We should be prioritizing this work soon.
