Fixed Brief Description Popover extraction Issue #1278

Open · wants to merge 4 commits into base: main

@THEBOSS0369 THEBOSS0369 commented Oct 1, 2024

This PR Fixes #1276

Hello Everyone!

Description

In this PR I have fixed the inappropriate description extraction in popovers by adding regex logic so that only the required information is shown.
NOTE: This PR fixes almost every popover. However, there are still 3 to 5 popovers that I haven't been able to fix, for example Caucasus in Europe, Greenland, and 3 others from the USA.

Test

I have run all the necessary tests:

  1. Checked in both "Restricted" and "ServiceWorker" modes. Everything is working fine (see the screenshots below).
  2. Unit tests (npm test): no issues.
  3. End-to-end (e2e) tests (tests-e2e-iemode): there was an e2e.runner.js error, which I have commented on in this PR's issue thread. I tried an alternative http-server and there wasn't any issue.
  4. Extension versions tested with production code.
  5. Browser tests: Microsoft Edge, Chrome, Firefox and Brave.

Screenshots

Before

rail
mount
safari

After (Test - ServiceWorker / Non-restricted)

rail
mount
safari

After (Test - Restricted)

high speed rail
asia
greece

Thanks

Jaifroid (Member) commented Oct 3, 2024

Your issue is probably due to the lack of a standard Firefox installation locally. I'm running tests on GH Actions, so let's see.

Jaifroid (Member) commented Oct 6, 2024

Hi, all tests passed, and I've also now had a chance to test this on the Wikivoyage that was showing it. It's working fine, well done! I just need to check for regressions with any other Wikimedia archives, which I'll do now.

Jaifroid (Member) left a comment

Please see comment on why we can't use length as a selection criterion. However, I saw while testing that these title descriptions do in fact have an ID which can be used to filter them out. See screenshots:

image

image

All you need to do, then, is refine the querySelectorAll function on line 107 so that it selects all paragraphs EXCEPT those with the id #pcs-edit-section-title-description.
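A minimal sketch of the selector change suggested above, assuming the existing call selects plain `p` elements. The helper name `selectCandidateParagraphs` and its `root` parameter are hypothetical, and the actual call on line 107 may well take a different form:

```javascript
// ID carried by the title-description paragraphs (as named in the comment above)
const TITLE_DESCRIPTION_ID = 'pcs-edit-section-title-description';

// The CSS :not() pseudo-class excludes the paragraph carrying that ID
const PARAGRAPH_SELECTOR = 'p:not(#' + TITLE_DESCRIPTION_ID + ')';

// Hypothetical helper: `root` is any Document or Element exposing querySelectorAll
function selectCandidateParagraphs (root) {
    return Array.from(root.querySelectorAll(PARAGRAPH_SELECTOR));
}
```

Filtering in the selector keeps the exclusion independent of how long the description text happens to be.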


// removing the para with less than 50 characters
// regex to check the paragraph if its too short or a brief description
const briefDescriptionRegex = /^.{1,100}$/;

Member

I don't think this is a safe way to filter out these short descriptions. We don't know in advance the amount of text, so setting a limit of 100 characters seems arbitrary and will likely have unintended consequences elsewhere, for example filtering out legitimate paragraphs with 100 characters or fewer in them.
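To illustrate the concern, here is a small sketch showing that the regex from the diff matches any single-line string of 1 to 100 characters, whatever its content. The sample strings are hypothetical and not taken from any real archive:

```javascript
// Regex under review, copied from the diff above
const briefDescriptionRegex = /^.{1,100}$/;

// Hypothetical sample strings for illustration only
const titleDescription = 'Region of North America';
const shortRealParagraph = 'The park is open all year, but most roads close in winter.';
const longParagraph = 'x'.repeat(101);

const results = {
    titleDescription: briefDescriptionRegex.test(titleDescription), // true
    shortRealParagraph: briefDescriptionRegex.test(shortRealParagraph), // true
    longParagraph: briefDescriptionRegex.test(longParagraph) // false
};

console.log(results);
```

The short but meaningful paragraph is indistinguishable from the title description under this test, which is why length alone cannot serve as the criterion.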

Member

The reason we do have a filter requiring at least 50 characters is that there are lots of small empty paragraphs or empty paragraphs with things like <span>&nbsp;</span> which can't be filtered out any other way. Through testing, we determined that 50 characters caught nearly all these while never removing any nodes with meaningful text.
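A rough sketch of the kind of minimum-length check described here, assuming it is applied to a paragraph's text content. The function name `hasMeaningfulText` and the constant are hypothetical, and the project's actual filter may be implemented differently:

```javascript
// Threshold mentioned in the comment above
const MIN_TEXT_LENGTH = 50;

// Hypothetical helper: `text` is assumed to be the text content of a paragraph.
// String.prototype.trim() also strips non-breaking spaces (U+00A0), so a paragraph
// containing only <span>&nbsp;</span> reduces to an empty string.
function hasMeaningfulText (text) {
    return text.trim().length >= MIN_TEXT_LENGTH;
}

console.log(hasMeaningfulText('\u00a0')); // false
console.log(hasMeaningfulText('A paragraph with enough real text to pass the fifty-character minimum.')); // true
```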

Jaifroid (Member) commented Oct 6, 2024

It would be a good idea to update the branch too (see the "Update branch" button below). Once you've done that, be sure to pull the change into your local copy of the repo.
