Fixed Brief Description Popover extraction Issue #1278

Open
wants to merge 4 commits into
base: main
7 changes: 6 additions & 1 deletion www/js/lib/popovers.js
@@ -111,7 +111,12 @@ function cleanUpLedeContent (node) {
// The reason we prefer innerText is that it strips out hidden text and unnecessary whitespace, which is not the case with textContent
const innerText = para.innerText ? para.innerText : para.textContent;
const text = innerText.trim();
return !/^\s*$/.test(text) && text.length >= 50;

// removing the para with less than 50 characters
// regex to check the paragraph if its too short or a brief description
const briefDescriptionRegex = /^.{1,100}$/;
Member:

I don't think this is a safe way to filter out these short descriptions. We don't know the amount of text in advance, so a limit of 100 characters seems arbitrary and will likely have unintended consequences elsewhere, for example filtering out legitimate paragraphs of 100 characters or fewer.
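
To make the concern concrete, here is a minimal sketch that compares the original predicate with the proposed one on plain strings. The helper names `keepOriginal` and `keepProposed` are hypothetical and only mirror the two `return` statements in this diff; the sample sentence is made up for illustration.

```javascript
// Hypothetical helpers mirroring the two return statements in the diff.
const briefDescriptionRegex = /^.{1,100}$/;

// Original check: not whitespace-only and at least 50 characters.
function keepOriginal (text) {
    return !/^\s*$/.test(text) && text.length >= 50;
}

// Proposed check: additionally rejects any single-line text of 1 to 100 characters.
function keepProposed (text) {
    return !briefDescriptionRegex.test(text) && text.length >= 50;
}

// A made-up, legitimate lede sentence whose length falls between 50 and 100 characters.
const shortLede = 'The aardvark is a medium-sized, burrowing, nocturnal mammal native to Africa.';

console.log(keepOriginal(shortLede)); // kept by the original check
console.log(keepProposed(shortLede)); // dropped by the proposed check
```

Because `.` does not match line breaks without the `s` flag, the proposed regex also behaves differently for text that contains a newline, which is another reason the 100-character cut-off is hard to reason about.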

Member:

We require at least 50 characters because there are lots of small paragraphs that are empty, or that contain only things like <span>&nbsp;</span>, which can't be filtered out any other way. Through testing, we determined that a 50-character minimum caught nearly all of these while never removing any nodes with meaningful text.
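
As a rough illustration of the existing filter, the sketch below applies the same logic to plain objects that stand in for DOM paragraph nodes. The object shapes, the `hasContent` name, and the sample texts are assumptions for the example, not code from the repository.

```javascript
// Plain objects stand in for DOM <p> nodes here (an assumption for illustration).
const paras = [
    { textContent: '\u00a0' },                  // e.g. <p><span>&nbsp;</span></p>
    { innerText: '', textContent: '   \n  ' },  // whitespace-only paragraph
    { innerText: 'See also' },                  // short stub, not a real lede
    { innerText: 'The aardvark is a medium-sized, burrowing, nocturnal mammal native to Africa.' }
];

// Same logic as the existing filter: prefer innerText, fall back to textContent,
// then require non-blank text of at least 50 characters.
function hasContent (para) {
    const innerText = para.innerText ? para.innerText : para.textContent;
    const text = innerText.trim();
    return !/^\s*$/.test(text) && text.length >= 50;
}

const parasWithContent = paras.filter(hasContent);
console.log(parasWithContent.length);
```

Only the last entry survives: the first two trim down to an empty string, and the third falls short of the 50-character minimum.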


return !briefDescriptionRegex.test(text) && text.length >= 50;
});
return parasWithContent;
}