
How to continue, as the project seems to be unmaintained? #748

Open
m7913d opened this issue Jul 27, 2024 · 6 comments

Comments

@m7913d

m7913d commented Jul 27, 2024

Sadly, this project seems to be unmaintained by now. The last commit is from 2022, and the number of open issues and PRs continues to grow.

@tfussell has done a great job creating and maintaining this repo for many years. I hope he does well.

To keep the project healthy, it might be a good idea to fork the repo into a new GitHub organisation, so that the community can maintain this library.

Who is interested in joining this effort?

  • @tfussell, we would be delighted if you were part of this effort, but don't feel obligated. I understand if you have other priorities now.
  • @Crzyrndm, you are the biggest contributor (apart from tfussell himself). Are you interested in joining this effort?
  • @ThibaultDECO, you are one of the latest contributors to this repo and have several PRs that have not yet been accepted. Would you like to join this effort?
  • @doomlaur, you have one of the most active forks lately. Do you want to join this effort?
  • @musshorn, you have multiple PRs that have not yet been accepted. Are you interested?

If others are interested in joining this effort or you have any other ideas on how to continue this great library, please let us know.

@doomlaur
Contributor

First of all, a huge thank you @tfussell for creating and maintaining this library for a long time. He has done an amazing job creating this library, which was a huge help for the C++ community.

@m7913d I would love to join this effort, yes. However, I'm not comfortable joining as the main maintainer, because I'm using XLNT in a research project that is going to end in a few months - probably at the end of this year. I can continue reviewing pull requests and testing certain use cases, but being the main maintainer is a responsibility I cannot take on if I'm no longer going to use the library. In other words, while I can take the time to work on XLNT over the next couple of months, I cannot promise to keep the same pace in the future if the software I'm working on is discontinued. I can always test simple use cases by writing a few lines of code independently of my project, but complex formats such as XLSX require a large amount of testing that goes well beyond such simple use cases.

I also want to mention that I talked privately with @flaviu22 about the maintenance issue. He contacted me because I have one of the most active forks. He also mentioned that we could work together on some improvements and new features. He was also interested in implementing support for the older binary XLS format, which was already discussed in the issues #731 and #227. Maybe he would also like to join the effort of maintaining XLNT.

By the way, in addition to issue #644, which discusses the next steps necessary to release XLNT 1.6, I also have a few ideas about what can be improved in the future. I talked with @flaviu22 privately about this, but if you are interested in hearing my opinion on the future of XLNT, please let me know.

@m7913d
Author

m7913d commented Aug 3, 2024

@doomlaur Thank you for your interest in joining the effort. I'm definitely interested in hearing your opinion on the future of this project, but my main focus would be on bug fixes (and performance optimizations) rather than new features.

@doomlaur
Contributor

@m7913d My main focus would be the same 😄 The following list is far from complete, but here are some points that could be improved in the future:

  1. Better conformance to the ECMA-376 specification. There are multiple cases where XLNT does not support all features of the Office Open XML specification (which is understandable - I'm not criticizing). However, in such cases XLNT (or its underlying XML parser, libstudxml) throws an exception instead of ignoring the data - for example, the issues "Exception thrown when loading an XLSX file containing defined names without localSheetId" (#685) and "Multiple exceptions when count does not match the number of elements" (#735) come to mind.
  2. Performance should be improved. I'm working on software that can handle large amounts of data, so we need to use XLNT's streaming_workbook_reader for that purpose. However, when loading an XLSX file containing 95 MB of data, XLNT is roughly 10 times slower than Excel, as the performance profiler also shows. Take this number with a grain of salt, though - I have not yet benchmarked it outside of our software using a simple example, but either way there's a large performance difference. The older issue "Improve read and write performance umbrella issue" (#648) discusses some points that can be improved.
  3. There are major issues with memory consumption when loading large files. Before our software used the streaming_workbook_reader, we used the simpler xlnt::workbook - however, I remember that loading files that contained over 15 MB of data used over 20 GB (!) of memory allocated by XLNT. There are also multiple issues about this, like "EXCEL Take up too much memory" (#370), "Severe Memory leak" (#403) and "my_worbook.load() - Consumes a lot of memory" (#522).
  4. XLNT cannot load files that have been saved in the Strict Open XML Spreadsheet format - a format that is supported by Office 2013 and newer versions. See the issue "Relationship exception" (#515).
  5. While XLNT can load encrypted files that have been protected by a password, doing so with the streaming_workbook_reader is currently not possible (which is made even worse by the memory consumption issues I explained above) - see the issue "Implement streaming xlsx production so that large worksheets can be written without being wholly stored in memory" (#180). In other words, users of this library must currently choose between good memory consumption and the ability to open password-protected documents, but cannot have both at the same time.
  6. XLNT currently cannot open XLS files (the old binary format for Excel), which was the standard before Office 2007 and is still sometimes used nowadays. See the issue 'Implementing XLS consumption/production (originally: "Any excel2003 (.xls) plan?")' (#227), where @tfussell mentions that it probably shouldn't be a huge amount of work due to the way XLNT already works. @flaviu22 wrote to me that he would be interested in implementing this.
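
To make point 1 more concrete, here is a minimal, self-contained sketch of the "warn instead of throw" idea for a mismatched `count` attribute. It is not XLNT's or libstudxml's actual code, and the names `parsed_container`, `load_result` and `load_lenient` are hypothetical, introduced only for illustration. The assumption is that a reader could trust the elements actually present in the file and treat the declared count as advisory:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a parsed container element that declares how many
// children it has, e.g. <someList count="3">...</someList>.
struct parsed_container
{
    std::size_t declared_count;        // value of the "count" attribute
    std::vector<std::string> children; // child elements actually found
};

// Hypothetical result type: the loaded items plus any non-fatal warnings.
struct load_result
{
    std::vector<std::string> items;
    std::vector<std::string> warnings;
};

// Lenient policy (an assumption, not XLNT's current behaviour): keep every
// child that was actually found and record a warning on a count mismatch
// instead of throwing an exception.
load_result load_lenient(const parsed_container &input)
{
    load_result result;
    result.items = input.children;

    if (input.declared_count != input.children.size())
    {
        result.warnings.push_back(
            "count attribute is " + std::to_string(input.declared_count) +
            " but " + std::to_string(input.children.size()) +
            " elements were found");
    }

    return result;
}
```

With this policy, a container that declares `count="5"` but holds two children would still yield both items plus one warning, so the caller can decide whether the mismatch matters.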

@m7913d
Author

m7913d commented Aug 11, 2024

@doomlaur Many good ideas to improve XLNT. However, I don't have experience with the streaming interface. My main XLNT use case is exporting data to XLSX. So, I will probably not take the lead in improving the streaming interface or XLS support, but I'm happy to support you, @flaviu22 or others where I can.

To get started, I think we should first set up the repo we will use for the development:

  • Should we create a new organization (e.g. xlnt-community), to allow multiple maintainers?
  • From which repo will we start the development? Should we start from tfussell's repo or from your (@doomlaur's) clone?
  • Who will be the (initial) maintainers/organisation members? tfussell (just in case he wants to contribute again in the future), doomlaur and me?

Setting up an active xlnt repo would at least avoid further fragmentation of the xlnt development.

@doomlaur
Contributor

@m7913d Since I have exactly the opposite use case (importing data from XLSX, but never exporting), I think we could complement each other very well 😉 By the way, not sure if this is an important use case for you, but since you mention exporting, according to the feature list, writing/exporting files encrypted by a password is not supported at all, as also explained in #151, #231 and #373.

I think a GitHub organization is a good idea, yes - especially because maintainers might become more or less active over time, so having multiple maintainers decreases the probability that the project will die.

Either forking this repository or mine would work. In principle, nowadays, the only difference between my fork and this repository is that I merged some important fixes from #686, #688 and #736. Otherwise, all my previous fixes have already been merged by @tfussell a few years ago. Maybe I would recommend forking this repository and merging the pull requests again, as this repository contains many other issues and pull requests - in other words, just for documentation purposes - but I don't mind either way 😄

@m7913d
Author

m7913d commented Aug 13, 2024

The new GitHub location for community development of XLNT is: https://github.com/xlnt-community/xlnt

Feel free to join our effort to keep the good work of tfussell alive.


2 participants