Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multiple exceptions when count does not match the number of elements #735

Open
doomlaur opened this issue May 22, 2024 · 6 comments · Fixed by doomlaur/xlnt#3
Open

Multiple exceptions when count does not match the number of elements #735

doomlaur opened this issue May 22, 2024 · 6 comments · Fixed by doomlaur/xlnt#3

Comments

@doomlaur
Copy link
Contributor

I noticed that xlsx_consumer.cpp throws exceptions in multiple cases when count does not match the number of elements. This causes issues because:

  1. According to ECMA-376, in each of these cases, the "count" attribute is optional:
<xsd:attribute name="count" type="xsd:unsignedInt" use="optional"/>
  1. I noticed issues when trying to open files exported by SAP software, where the exporter sets count="1" but the actual count is 2. Of course this is wrong and is actually an issue of the SAP software - however, Microsoft Excel is still able to open the file correctly. For users of our software, it looks like our software is broken because it cannot open the file (while Excel can), since users do not know that the SAP software is actually the broken one 😅 On the other hand, while it is not nice to accept obviously broken files, there is no actual disadvantage for XLNT to simply ignore the fact that the count does not match.

My proposed solution:

  1. Check whether the "count" attribute exists - if it does, read it.
  2. If the count is known, call std::vector::reserve to allocate enough memory right away (small performance improvement, because why not?).
  3. Ignore exceptions due to invalid count by default. However, if THROW_ON_INVALID_XML is defined and the count is known, throw an exception, just like before. In other words, if a very strict XML parser is desired, exceptions can still be enabled.

I will send a pull request containing the proposed changes.

@doomlaur
Copy link
Contributor Author

Pull Request #736 fixes the issues I had.

@doomlaur doomlaur reopened this May 22, 2024
@mankaixin
Copy link

a new question in this restoration,When you insert data, if it's a number, no problem, if it's a string then there's a problem, it's going to be a different character than what you inserted. I found that the tag number corresponding to sheeting.xml is incorrect.

@mankaixin
Copy link

if excel is size don't match ,i write some word to excel ,but excel word is error .i found error is here

//default allow_duplicates false
std::size_t workbook::add_shared_string(const rich_text &shared, bool allow_duplicates)
{
    register_workbook_part(relationship_type::shared_string_table);
   
    if (!allow_duplicates)
    {
        auto it = d_->shared_strings_ids_.find(shared);

        if (it != d_->shared_strings_ids_.end())
        {
            return it->second;
        }
    }

    auto sz = d_->shared_strings_ids_.size();
    
    d_->shared_strings_ids_[shared] = sz;
    d_->shared_strings_values_.push_back(shared);
    //modify by 
    return d_->shared_strings_values_.size() - 1;
    //return sz;
}

in size don't match we don't return sz isn't

@doomlaur
Copy link
Contributor Author

doomlaur commented Jun 7, 2024

@mankaixin Could you please share an XLSX file where I can test your issue?

@mankaixin
Copy link

@mankaixin Could you please share an XLSX file where I can test your issue?

test2.xlsx
I write some data to sheet PadName in cells c2 through c4

I wanted to write the effect as shown below
image

, but the result is as shown below
image

@doomlaur
Copy link
Contributor Author

@mankaixin Thanks for the example. The issue I described above is about reading data, while the issue you described is about writing data. Since your issue is a different one than mine, so please open a new GitHub issue. Furthermore, you could also create a Pull Request containing your fix.

I quickly looked at the code and I agree - if allow_duplicates is true and duplicates are added, the unordered_map shared_strings_ids_ and the vector shared_strings_values_ will get out of sync and the returned index will be wrong, so your proposed fix should fix the issue - although I must admit that I did not check whether there are any other similar issues related to duplicate strings. But this is something we can discuss in a new GitHub issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants