
change underlying kdb driver to pykx #10

Open
zeddit opened this issue Aug 10, 2023 · 9 comments

@zeddit

zeddit commented Aug 10, 2023

The javakdb project seems to be unmaintained.
However, a project named pykx is gaining traction, and in my tests it is far faster than javakdb, apparently because kdb sets up a new data gateway to serve pykx requests. (I am not sure whether my test is correct.)

I found that it is possible to invoke a Python runtime from Java, and I wonder whether using pykx could lead to better performance in latency and scale.

Thanks.
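For comparison purposes, this is roughly what a direct pykx IPC query looks like - a minimal sketch under assumptions: pykx is installed, a kdb+ process is listening at the given host/port, and the endpoint and q expression in the example comment are placeholders, not values from this project:

```python
def query_kdb(host: str, port: int, q_expr: str):
    """Run a q expression over kdb+ IPC using pykx and return a pandas DataFrame."""
    import pykx as kx  # deferred import so the sketch loads even without pykx installed

    # SyncQConnection opens a blocking IPC connection to the kdb+ process.
    with kx.SyncQConnection(host=host, port=port) as conn:
        result = conn(q_expr)  # send the q expression, block for the reply
        return result.pd()     # convert the K result object to a pandas DataFrame

# Hypothetical usage (endpoint and table are placeholders):
# df = query_kdb("localhost", 5000, "select from trade where date=.z.d")
```

The point of the sketch is that pykx talks IPC to kdb directly, with no intermediate query engine, which is relevant when comparing its latency to the Trino path.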

@tuor713
Owner

tuor713 commented Aug 11, 2023

That sounds like a potentially quite messy integration, calling out to Python (I presume on different OSes that might require different Python installations, etc.). I wonder why the performance would be so different - as far as I can see from the Java code it's pretty raw socket operations, which I would not expect to be too bad.

What kind of data are you pulling where performance is such a challenge?

@zeddit
Author

zeddit commented Aug 16, 2023

Sorry for the late reply. I have been debugging the performance of javakdb, and some results are listed below; they mainly relate to querying a small dataset.
It can be seen that the actual data-related time (query time, analysis time) is less than 1 second, yet the total elapsed time is quite unacceptable. The dataset is only a few MB.
So am I correct in concluding that the performance issue is not a problem with javakdb, but with Trino itself?
I wonder whether the network or Trino CPU/memory is causing the problem.
As a comparison, pykx has no such delays and is very fast.
Could you recommend any other metrics or debugging tools to find more useful insight? Thanks.
[Screenshot 2023-08-14 19:44:13]
[Screenshot 2023-08-14 19:55:07]

@tuor713
Owner

tuor713 commented Aug 17, 2023

That looks extremely slow and completely out of line with what I have seen otherwise. Given that analysis time & planning are low, no stats are being run.

Can you share how many rows and what column types are in the table? Potentially the paging is creating too many small queries, which creates an issue.

CPU should be negligible - the CPU-seconds stats of the query will show whether Trino uses any substantial CPU. I would also think I/O is unlikely unless Trino and KDB are in vastly different networks with very low bandwidth in between.
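The paging concern can be made concrete with a back-of-the-envelope model (all numbers below are illustrative assumptions, not measurements from this issue): when a result set is split into many small pages, the per-request round trip can dominate total time even if raw transfer is fast.

```python
def total_query_time(rows, page_size, rtt_s, bytes_per_row, bandwidth_bps):
    """Model total wall time as (number of pages * round-trip latency) + pure transfer time."""
    pages = -(-rows // page_size)  # ceiling division: number of round trips
    transfer = rows * bytes_per_row / bandwidth_bps
    return pages * rtt_s + transfer

# 10k rows, 100 bytes/row, 50 ms round trip, 100 MB/s link (all assumed figures):
small_pages = total_query_time(10_000, 50, 0.05, 100, 100e6)      # 200 pages -> ~10 s
one_page    = total_query_time(10_000, 10_000, 0.05, 100, 100e6)  # 1 page  -> ~0.06 s
```

Under these assumptions, latency is almost entirely round trips: 200 small pages cost ~10 s while the actual data transfer takes ~10 ms, which is the kind of gap described in this thread.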

@zeddit
Author

zeddit commented Aug 17, 2023

Thanks for your reply. Some more logs are pasted below.
There are only about 10k rows, and most columns are of float data type.
The date field is surprisingly not a datetime data type. I am using trino-sqlalchemy and it doesn't handle the type correctly. I am not sure whether this type causes the performance issue; I may not have migrated javakdb correctly.

[Screenshots 2023-08-17: 14:39:14, 14:39:22, 14:39:33, 14:39:46, 14:40:11]

@tuor713
Owner

tuor713 commented Aug 19, 2023

So based on all that, I would think either a) there is a horrible performance bug in JavaKDB 2.0 or b) the underlying KDB instance is periodically blocked and slow to respond. In the latest example even analysis time is 7 sec, suggesting some sort of hang in the connection. Is there some underlying virtualization in front of the target KDB, such as DNS load balancers?

@zeddit
Author

zeddit commented Aug 21, 2023

I will make a comparison between javakdb 1.0 and 2.0 and report my findings here.

@zeddit
Author

zeddit commented Aug 21, 2023

I have tested it, and the results remain the same when using javakdb 1.0.
I wonder how Trino processes the data. E.g., does it push the queries down to kdb, so that kdb returns only the matching subset of the table to Trino? Or does Trino retrieve all the data in the table, compute the subset of rows itself, and return that?

I tested that when using pykx to read from kdb directly into a Python environment, it needs only 3 seconds to finish. However, when using Trino, it needs about 2 minutes. That is surprisingly slow.

@zeddit
Author

zeddit commented Aug 21, 2023

There is no load balancer that could break connections to kdb; I have only one kdb instance.
I noticed that a query task's Physical Input Bytes/s can reach 60 MB/s, but Input Bytes/s only gets to 10 MB/s; the latter number is essentially the bottleneck. I am not sure what Input Bytes/s means and what limits it.
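The gap between the two rates can at least be quantified. My reading of the Trino UI (an assumption, not confirmed in this thread) is that Physical Input is raw bytes read from the source while Input is the decoded row data processed by the scan, so the ratio bounds how much of the raw read rate survives decoding:

```python
# Illustrative arithmetic only, using the two figures quoted in the comment above.
physical_mb_s = 60  # Physical Input Bytes/s: raw bytes read from the source (assumed meaning)
input_mb_s = 10     # Input Bytes/s: decoded data processed by the scan operator (assumed meaning)

# Processing keeps up with only ~1/6 of the raw read rate, pointing at
# decode/processing cost rather than the network as the limiting stage.
ratio = physical_mb_s / input_mb_s
```

If those interpretations hold, a 6x gap would suggest profiling the connector's page decoding rather than the socket layer.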

@zeddit
Author

zeddit commented Aug 24, 2023

I have conducted some more experiments to find the bottleneck, and they show the bottleneck is on the trino-client side.

-- test worker-local performance limit: about 1 GB/s Input Bytes
create table memory.default.sf100_lineitem AS select * from tpch.sf100.lineitem;

-- test worker-to-client bottleneck: only about 12-18 MB/s, far below my expectation - why is it so slow?
select * from memory.default.sf1_lineitem;

-- test my backend db to worker: about 90 MB/s, which I think is slow, but for now the client side is the bottleneck
create table memory.default.test AS select * from kdb.default.test;

I wonder whether the 90 MB/s limit for javakdb is correct, because when I use pykx directly I can reach about 200 MB/s against the same kdb instance. Many thanks.
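To put those two rates side by side, a trivial helper (the 1 GB payload size is an arbitrary illustration, not the size of any table in this thread):

```python
def transfer_seconds(mb, mb_per_s):
    """Seconds needed to move `mb` megabytes at a sustained rate of `mb_per_s`."""
    return mb / mb_per_s

# Moving 1000 MB at the two observed rates:
javakdb_s = transfer_seconds(1000, 90)   # ~11.1 s at 90 MB/s (javakdb path)
pykx_s = transfer_seconds(1000, 200)     # 5.0 s at 200 MB/s (direct pykx path)
```

So even at bulk-transfer scale the driver difference is only ~2x, which reinforces that the 3 s vs 2 min gap reported earlier must come mostly from per-query overhead, not raw throughput.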
