change underlying kdb driver to pykx #10
Comments
Calling out to Python sounds like a potentially quite messy integration (I presume different operating systems might require different Python installations, etc.). I wonder why performance would be so different: as far as I could see from the Java code, it is pretty raw socket operations, which I would not expect to be too bad. What kind of data are you pulling where performance is such a challenge?
That looks extremely slow and completely out of line with what I have seen otherwise. Given that analysis and planning time is low, no stats are being run. Can you share how many rows are in the table and what the column types are? Potentially the paging is creating too many small queries, which causes the issue. CPU should be negligible: the CPU seconds stat of the query shows whether Trino uses any substantial CPU. I would also think I/O is unlikely to be the problem unless Trino and KDB are in vastly different networks with very low bandwidth in between.
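To illustrate why many small paged queries could dominate total time, here is a rough back-of-the-envelope sketch. It assumes each page costs one fixed round trip plus transfer time proportional to row count; the function name and all numbers are hypothetical and are not measurements from this connector.

```python
import math


def estimated_fetch_seconds(total_rows, page_size, round_trip_s, rows_per_s):
    """Rough model: one fixed round trip per page, plus transfer time for all rows."""
    pages = math.ceil(total_rows / page_size)
    return pages * round_trip_s + total_rows / rows_per_s


# Hypothetical inputs, chosen only to show how page size changes the total.
for page_size in (1_000, 50_000, 1_000_000):
    seconds = estimated_fetch_seconds(
        total_rows=1_000_000,
        page_size=page_size,
        round_trip_s=0.005,
        rows_per_s=2_000_000,
    )
    print(f"page_size={page_size:>9,}: ~{seconds:.2f}s")
```

Under this model the transfer term is constant, so any large difference between page sizes comes purely from the number of round trips.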
So based on all that, I would think either a) there is a horrible performance bug in JavaKDB 2.0, or b) the underlying KDB instance is periodically blocked and slow to respond. In the latest example even the analysis time is 7 seconds, which suggests some sort of hang in the connection. Is there some underlying virtualization in front of the target KDB, such as DNS load balancers?
I will compare javakdb 1.0 and 2.0 and report my findings here.
I have tested, and the results remain the same when using javakdb 1.0. I tested when using
There is no load balancer that would break the connection to kdb. I have only one kdb instance.
I have conducted some more experiments to find the bottleneck, and they show that the bottleneck is on the trino-client side.
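As a minimal sketch of how such an experiment could be structured, the snippet below times each stage separately so the slowest one stands out. The stage names and the `time.sleep` calls are placeholders (assumptions for illustration); in a real test they would be replaced by the actual kdb query and the client-side fetch.

```python
import time
from contextlib import contextmanager

# Accumulated wall-clock seconds per stage name.
timings = {}


@contextmanager
def timed(stage):
    """Add the elapsed wall-clock time of the enclosed block to timings[stage]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[stage] = timings.get(stage, 0.0) + (time.perf_counter() - start)


# Placeholder work: swap these sleeps for the real operations being compared.
with timed("kdb_query"):
    time.sleep(0.01)
with timed("client_fetch"):
    time.sleep(0.02)

# Report stages from slowest to fastest.
for stage, seconds in sorted(timings.items(), key=lambda item: -item[1]):
    print(f"{stage}: {seconds:.3f}s")
```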
I wonder if
The javakdb project does not seem to be maintained.
However, a project named pykx is on the rise, and in my test it was far faster than javakdb, because kdb sets up a new data gateway to serve pykx requests. (I don't know if my test is correct.)
I found that it is possible to invoke a Python runtime from Java, and I wonder if using pykx could lead to better performance in latency and scale.
Thanks.