Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Support for Mark to Base Attachment Positioning (incl. GPOS 4, 9) #557

Open
wants to merge 19 commits into
base: master
Choose a base branch
from

Conversation

rafallyczkowskiadylic
Copy link
Contributor

@rafallyczkowskiadylic rafallyczkowskiadylic commented Feb 6, 2023

Description

Added support for Mark to Base Attachment Positioning (for the DFLT script). Feature is enabled by default because it is a part of a proper glyphs positioning. Includes capabilities implementation for the corresponding parsers (GPOS lookup 4 & 9)

Followed the current solution's convention to introduce new features within the font & layout implementation.

  • added GPOS LookupType 4 parser for MarkBasePosFormat1 support
  • added GPOS LookupType 9 parser for Extension Positioning SubtableFormat1 support
  • added GSUB support for multiSubstitutionFormat1
  • introduced the union lookup feature (getFeaturesLookups) for the layout to meet the OT specs
  • fixed GSUB substitutions algorithm
  • tests coverage for kerning and positioning features
  • tests coverage for all supported and not supported lookups of GPOS

TODO: Add support for non DFLT scripts for MARK (src/font.js:344)
TODO: Add a support device offsets for anchor table format 3 (src/parse.js:515)
TODO: Add MarkToMark Attachment positioning (lookup type 6) functionality.

Motivation and Context

Positioning changes

Fonts like Mandarin or Thai to successfully render text with all glyphs (sometimes 3+) representing characters like ปั or ริ่
requiring the LookupType 4 for mark to base attachment positioning to do it properly.

Extension subtables support is required to support the subtables that exceeds the 16-bit limits of the various other offsets in the GPOS table - that genuinely are being used eg. for the Apples SF Pro fonts and many others.

Substitution changes

There was an issue with the current algorithm where it was collecting all substitutions from all lookup tables and after that each substitution was conducted. This is invalid. Because it should apply substitution for each lookup table before moving to the next lookup table. In other words: substitution should be applied before moving to the next lookup table

Specification:

During text processing, a client applies a lookup to each glyph in the string before moving to the next lookup. A lookup is finished for a glyph after the client locates the target glyph or glyph context and performs a substitution, if specified. To move to the “next” glyph, the client will typically skip all the glyphs that participated in the lookup operation: glyphs that were substituted as well as any other glyphs that formed a context for the operation.

Ref: https://learn.microsoft.com/en-us/typography/opentype/spec/gsub#table-organization

The union lookup feature

The union lookup feature (getFeaturesLookups) was required actually to meet the OpenType specification which guarantees that all enabled features will be processed in a valid order both for positioning and substitutions.

the list of lookups is the union of lookups referenced by all of those features, and these are all processed in their lookup list order.

How Has This Been Tested?

Added tests coverage to the test/parser.js and test/tables/gpos.js including new binary data and testing:

These lookups uses existing structures like coverage tables, and were tested agains different formats. Extension positioning lookup was tested agains all currently supported and not supported subtable parsers.

Added tests coverage to the test/layout.js to test the method for union lookups (getFeaturesLookups).
Added tests coverage for glyph positioning tests in the test/font.js including:

  • Kerning feature support with kerning table and lookup table support
  • options.kerning and options.features.kern testing
  • Mark To Base Attachment feature support to work with kern features
  • options.features.mark testing
    Added tests coverage to the test/bidi.js to test multiSubstitutionFormat1 glyphs decomposition

Moreover there were many test conducted for the rendered content using latn and arab fonts.

Screenshots (if appropriate):

Positioning issues

BEFORE (INVALID glyphs pos):
image

AFTER (FIXED):
image

Substitution issues

BEFORE (INVALID) - very end glyph is a single glyph that should be decomposed into two.
image

AFTER (FIXED) - decomposed into two glyph:
image

image

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • I did npm run test and all tests passed green (including code styling checks).
  • I have added tests to cover my changes.
  • My change requires a change to the documentation.
  • I have updated the README accordingly.
  • I have read the CONTRIBUTING document.

- added GPOS LookupType 4 parser for MarkBasePosFormat1 support
- added GPOS LookupType 9 parser for Extension Positioning SubtableFormat1 support
- tests coverage for all supported and not supported lookups of GPOS

TODO: Add a support device offsets for anchor (Parser.anchor) table format 3.
- introduced an union, ordered features lookup list
- refactored the Font options features update method
- supported kerning table
- supported kern lookup table
- supported mark to base attachments
- fix for kern sequence
- tests coverage
@Connum Connum added the Spec Related to the implementation of the Opentype specification label Feb 6, 2023
@Connum Connum added this to the Release 2.0.0 milestone Feb 6, 2023
@Connum
Copy link
Contributor

Connum commented Feb 6, 2023

Does this make #491 redundant, or do we have to consolidate the two? Could you please have a look at each other's code @rafallyczkowskiadylic & @AlexandreRivet?

@ILOVEPIE
Copy link
Contributor

ILOVEPIE commented Feb 6, 2023

@rafallyczkowskiadylic I noticed that rlig is missing an implementation for latin scripts and liga is missing an implementation for arabic scripts, is that also fixed in this PR?

@rafallyczkowskiadylic
Copy link
Contributor Author

rafallyczkowskiadylic commented Feb 7, 2023

@Connum #491 is related and probably redundant. But initially I was reviewing it and it felt like not fully covered and bounding with kern feature only. This MR covers positioning feature in first place, and GPOS 9 was required to be supported along with the GPOS 4.

This MR fully covers what was proposed in https://github.com/opentypejs/opentype.js/pull/491/files

My proposition for modus operandi:
Let's focus on #535 in first place. I will commit and cover the missing rlig over there. By the time I will be working on that please review this MR and drop questions or any concerns so we can discuss.

@ILOVEPIE These two MRs are considered as separate chunks of features. I will fix whatever will be needed starting with the smaller one #535

@ILOVEPIE ILOVEPIE self-requested a review February 7, 2023 19:45
Copy link
Contributor

@ILOVEPIE ILOVEPIE left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could only find one issue with the code besides some naming choices for constants. I'm going to get @Connum to review this as well as it's a larger PR and I want to make sure that it's ok.

@@ -279,7 +282,8 @@ Font.prototype.defaultRenderOptions = {
* and shouldn't be turned off when rendering arabic text.
*/
{ script: 'arab', tags: ['init', 'medi', 'fina', 'rlig'] },
{ script: 'latn', tags: ['liga', 'rlig'] }
{ script: 'latn', tags: ['liga', 'rlig'] },
{ script: 'DFLT', tags: ['mark'] },
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to microsoft documentation dflt is lowercase.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ILOVEPIE I could not find this. I believe you might have referred to the langSysTable (tag) and this is a script tag? Moreover:

image

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://learn.microsoft.com/en-us/typography/opentype/spec/scripttags it's lowercase for all other tags, but DFLT is uppercase in the docs

src/parse.js Outdated
xCoordinate: this.parseShort(),
yCoordinate: this.parseShort(),
xDevice: 0x00,
yDevice: 0x00,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when parsing these values, we have to call this.parseShort() to advance the parser past the xDevice and yDevice or subsequent reads will be at the wrong location.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! I will fix shortly

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Pointer is moving now

* https://learn.microsoft.com/en-us/typography/opentype/otspec191alpha/chapter2#lookup-list-table
*
* @param {string[]} requestedFeatures
* @param {string} [script='DFLT']
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dflt should be lower case.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ILOVEPIE ILOVEPIE requested a review from Connum February 7, 2023 20:04
@ILOVEPIE
Copy link
Contributor

ILOVEPIE commented Feb 7, 2023

Also, @rafallyczkowskiadylic does this PR fix #501, because if you're redoing this code anyways, it might be worth fixing it.

Copy link
Contributor

@Connum Connum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have to admit that all that byte-level-parsing stuff is a bit out of my comfort zone (yes, despite me having implemented a few parsing features before ;)), but other than @ILOVEPIE's comments and those I added, there's nothing that caught my eye.

src/substitution.js Outdated Show resolved Hide resolved
src/layout.js Outdated
return [];
}

// Fitler out only supported by layout table features
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just y small typo, Fitler => Filter

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

src/parse.js Outdated Show resolved Hide resolved
@Connum
Copy link
Contributor

Connum commented Mar 2, 2023

@rafallyczkowskiadylic the feature order is quite a critical thing to add before we start working on implementing more features like calt etc. It would be great if we could merge this soon!

@rafallyczkowskiadylic
Copy link
Contributor Author

@Connum once we have the other MR merged (that covers rlig that is missing here as well) I will resolve and address all concerns around this mr shortly.

@Connum
Copy link
Contributor

Connum commented Mar 3, 2023

Sure, that's fine! We don't have to support everything at once when implementing a new feature or format. We should just make sure that users are notified about any missing/unsupported functionality.

I just wanted to avoid that anything was misusing that was supposed to be in this PR already.

@rafallyczkowskiadylic
Copy link
Contributor Author

@Connum Perfect! Great you are on top. So, let's address what is pending now. We stay in touch

@rafallyczkowskiadylic
Copy link
Contributor Author

I will check #501

@rafallyczkowskiadylic
Copy link
Contributor Author

@Connum @ILOVEPIE added few more commits with the substitution algorithm fixes and a new substitution format support. I haven't refactored the other working code (just the required bits). We can find few spots where we see "similar" functionalities. For example:

  1. src/features/featureQuery.js FeatureQuery.prototype.lookupFeature collects substitutions - too much responsibility here. I marked this as deprecated and pointed the fixed algorithm
  2. src/substitution.js Substitution.prototype.getFeature - substitution processing
  3. src/bidi.js - substitution applications

I believe we could move towards:

  • Layout - to keep what it has, consume GPOS/GSUB definitions, lookups. Keep important getFeaturesLookups method that build a union lookups list for ordered processing.
  • Position and Substitution - shares base functionality to provide features lookups through the Layout
  • FeatureQuery - should be promoted to use this to find and process features (not only in the bidi)
  • substitution features per script i.e src/features/latn/latinLigatures.js src/features/arab/arabicRequiredLigatures.js should not be required since mostly they are simply the same. We basically need word ranges and context, but these are the same methods. These can be removed in favor of applySubstitutions.call(this, 'thai', ['liga', 'rlig', 'ccmp']);

Not sure if bidi is really required, maybe for the directional substitutions but I think bidi could be deprecated and it's responsibility could be dropped into the Font where bidi is being called and where positioning logic is happening anyways. But basically I think we could use a single processor both for substitutions and positioning.

Well this can open a discussion, but wanted to share some thoughts.

@Connum
Copy link
Contributor

Connum commented Oct 19, 2023

I'm totally for unifying duplicate functionality. The way that feature processing is currently structured is something that has deterred me from working on new font features because it frankly seems a bit messy and not very intuitive to me. However, I would see that as a separate step from this PR. Let's get this support in first and consolidate feature processing in the future. Otherwise there might be too much code to review at once, and we have enough open and partly almost-finished PRs lying around as it is.

That's my opinion and not the final say, of course.

edit: after a quick glance at the code changes, it looks to me like you have already done most of the work described in your previous comment? Of course let's keep the work you already put in there and just move any further consolidation to somewhere down the line.

@ILOVEPIE
Copy link
Contributor

This is blocking quite a few PRs, we should probably get this merged sometime soon.

@rafallyczkowskiadylic
Copy link
Contributor Author

Thank you guys for the comments. Yes, most of suggestions/work has been done. I have consolidated functionality as described introducing more generic functions to handle/fix specifications. I will resolve conflicts soon.

@Connum
Copy link
Contributor

Connum commented Nov 1, 2023

Great, looking forward to it! Thanks for your work!

@Connum
Copy link
Contributor

Connum commented Nov 25, 2023

Fixes #501 according to #501 (comment)

@rafallyczkowskiadylic it's been a couple of weeks - just a friendly nudge. :)

* https://learn.microsoft.com/en-us/typography/opentype/otspec191alpha/chapter2#lookup-list-table
*
* @param {string[]} requestedFeatures
* @param {string} [script='DFLT']
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -279,7 +282,8 @@ Font.prototype.defaultRenderOptions = {
* and shouldn't be turned off when rendering arabic text.
*/
{ script: 'arab', tags: ['init', 'medi', 'fina', 'rlig'] },
{ script: 'latn', tags: ['liga', 'rlig'] }
{ script: 'latn', tags: ['liga', 'rlig'] },
{ script: 'DFLT', tags: ['mark'] },
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://learn.microsoft.com/en-us/typography/opentype/spec/scripttags it's lowercase for all other tags, but DFLT is uppercase in the docs

@Connum
Copy link
Contributor

Connum commented Nov 29, 2023

I was finally able to test this more thoroughly... I merged it into my experimental fork after resolving some conflicts.

The chars in the GPOS–3 test of the unicode test suite still rendered wrong, and they also didn't work in our demo page.
image
image

So I had to make a few changes to make it work, see
Connum@95688c6

image
image
Hey presto!

However, these calculations are quite expensive, even before my additions, because they are done for each single glyph. I don't think that's necessary and we should implement ways to at least cache this, or best only call the feature update when something has changed.

@Connum
Copy link
Contributor

Connum commented Nov 29, 2023

Ok, now the tests for disabling the mark feature are failing of course, because I implemented it so that default features cannot be disabled. We'll have to talk about how we want to handle this.

@Connum
Copy link
Contributor

Connum commented Nov 30, 2023

The more I look at this, the more confused I get... 😅

I think we still need some cleaning up before we can finally merge this, but I'm lost here without you @rafallyczkowskiadylic. I tried to fix https://rawgit.com/unicode-org/text-rendering-tests/master/reports/OpenType.js.html#GSUB-1, but I got caught up in spaghetti code for several hours trying to follow the path of the substitution mechanism and got confused between old and new functionality. e.g. we should really get rid of this:

  • substitution features per script i.e src/features/latn/latinLigatures.js src/features/arab/arabicRequiredLigatures.js should not be required since mostly they are simply the same. We basically need word ranges and context, but these are the same methods. These can be removed in favor of applySubstitutions.call(this, 'thai', ['liga', 'rlig', 'ccmp']);

Regarding this substitution case, I started wondering why we split into words in the first place? The space glyph being part of a substitution is perfectly fine, there's no rule against it to my knowledge.

I know that's the crux of open source, but I wouldn't be such a pest if it wasn't blocking the progress so much. A clear structure for substitution and positioning as well as supporting all the different formats is really crucial for implementing further features.

@Connum
Copy link
Contributor

Connum commented Nov 30, 2023

By the way, trying to get rid of the latn-specific applyLatinLigatures, I replaced it with

function applyLatinFeatures() {
    checkGlyphIndexStatus.call(this);
    applySubstitutions.call(this, 'latn', ['liga', 'rlig', 'ccmp', 'calt']);
}

in analogy to applyThaiFeatures(), and called it in Bidi.prototype.applyFeaturesToContexts:

    if (this.checkContextReady('latinWord')) {
        applyLatinFeatures.call(this);
    }

But then all ligatures stopped working...

@rafallyczkowskiadylic
Copy link
Contributor Author

Hey @Connum I am on it. Let me resolve conflicts and see what is on the code. I will revert back by the end of the weekend

@rafallyczkowskiadylic
Copy link
Contributor Author

@Connum Conflicts resolved.

Pending items:

  1. Could you share this TestShapeEthi.ttf file with me so I can verify the outcomes?
  2. I understand. I had similar issue, with some redundancy to get into the logic, that is why I didn't wanted to deprecate old code but rather leverage a new core lookup features. The most important functions is: Layout.getFeaturesLookups that implements common logic for fetching features, for substitutions and for positioning.

I started wondering why we split into words in the first place?

It was implemented like that, and the reason for this is that within single sentence there may be characters from a different language context. Example: 我的iPhone是最好的 where "iPhone" is for the latin context word.

  1. For the applyLatinLigatures issue, could you send me specific case you are testing? Including exact string, expected result, and source font file. I will check what may be happening because I see no issues with the latin fonts I am testing currently.

Moreover, please keep in mind supported substitution formats.

/**
 * Supported substitutions
 */
const SUBSTITUTIONS = {
    11: singleSubstitutionFormat1,
    12: singleSubstitutionFormat2,
    63: chainingSubstitutionFormat3,
    41: ligatureSubstitutionFormat1, 
    21: multiSubstitutionFormat1
};

Different fonts' source files may have these definitions in a different formats that eventually may not be supported yet.


@Connum if you could share all your use cases, with results you expect to see and source font files - I can review all of them and eventually address.


For the clean up I think we could just mark functions as deprecated for now. I'd refactor only bits covered with reliable tests so we can actually rely on outcomes before/after we change the code (this was also the reason I'd rather applied the code and "promoted" than removing or refactoring old functioning pieces).

@rafallyczkowskiadylic
Copy link
Contributor Author

Is there easy way to tests svg using demo page? I wrapped up simple compiler so I am using this a

npm run font:preview -- --font="Roboto-Black.ttf" --content='Lorem ipsum dolor sit amet. ąłćżółźżćśćłóęą '

that generates preview.html that compares HTML (browser) rendered content and SVG path.

image

@Connum
Copy link
Contributor

Connum commented Dec 2, 2023

Thanks for your work!

You can check out the unicode test suite: https://github.com/unicode-org/text-rendering-tests

It does exactly what you proposed, compare SVG output with expected results. You can replace the opentype.js files in the node_modules folder with the files compiled from your dev branch (also the files in /bin/) to see if everything related to GSUB passes.

The repo also provides the test files. If you're able to fix this with this PR, you should copy the file to our tests folder (it's licensed under OFL) and add our own test. But it's OK if we just do the refactoring and mark to base first if it holds us up too much. But I'll definitely need help with that space thing as well. :)

@rafallyczkowskiadylic
Copy link
Contributor Author

rafallyczkowskiadylic commented Dec 3, 2023

Ok. @Connum currently I do not have pending items, unless you share your failing test cases (font names, content tested, etc) with me according to this so I can review and address within this MR

@Connum
Copy link
Contributor

Connum commented Dec 3, 2023

Ok. @Connum currently I do not have pending items, unless you share your failing test cases (font names, content tested, etc) with me according to this so I can review and address within this MR

See my reply! 😉 But it's OK if we stick to GPOS mark-to-base with this PR and tackle the rest later. I'll review the code tomorrow.

@Connum
Copy link
Contributor

Connum commented Dec 5, 2023

I don't really like the idea of having both approaches in the code base in parallel. The arab and latn features should work with the new approach as well, so we should move everything to work with it. I understand if this is more than you're willing to do. Could you mark the PR as allowing maintainers to make changes? Then me or someone else could look into fixing this for all the scripts.

@Connum
Copy link
Contributor

Connum commented Dec 5, 2023

Shouldn't I be able to change applyLatinLigatures() to

function applyLatinLigatures() {
    applySubstitutions.call(this, 'latn', ['liga']);
}

and it should just work? But the ligature tests then fail.

- script: latn, was not matching latinWord
@rafallyczkowskiadylic
Copy link
Contributor Author

Shouldn't I be able to change applyLatinLigatures() to

function applyLatinLigatures() {
    applySubstitutions.call(this, 'latn', ['liga']);
}

and it should just work? But the ligature tests then fail.

Actually I checked and found few tests failing for your cases, for latn they are related to manipulating font's definitions (Substitution.prototype.addLigature). These implementations are still using old lookupTables search, and manipulating local references (

In other word, addLigature is using Layout.getLookupTables, manipulates this method's output references (expecting the test will also use .getLookupTables method). Thing is my solution does not use .getLookupTables at all because of getFeaturesLookups

Substitution.prototype.addLigature = function(feature, ligature, script, language) {
    const lookupTable = this.getLookupTables(script, language, feature, 4, true)[0];
    let subtable = lookupTable.subtables[0];
    ...
    subtable.ligatureSets = .... // <- manipulation
});

yet Layout.getLookupTables creates own copy of subtables refs (that is why we can't see them within new feature)

image

If .addLigrature would instead manipulate definition on the this.font.tables[this.tableName].lookups level (not local references) everything will work as expected. This is how I would approach this, to deprecate any use of getLookupTables and if there are definitions updated I would try to make them as "high" as possible (ideally around specific table def source, this.font.tables[this.tableName]....)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Spec Related to the implementation of the Opentype specification
Projects
None yet
Development

Successfully merging this pull request may close these issues.

OpenType.js can't read Google fonts' kerning, eg. EB Garamond
3 participants