Split into `unicode-data-core` and `unicode-data` #82

wismill · 2022-09-14T08:42:20Z

Following discussion in PR #75:

To avoid creating too many packages, we could have just two: a lightweight package called unicode-data-core, and an all-inclusive unicode-data package that bundles everything, including unicode-data-core.

Currently there are 4 packages depending on unicode-data.

We should define criteria for deciding which package each API belongs in. As of now, names & scripts are not considered “core”. What about blocks?
This will require a major version bump.
What about the existing package unicode-names? If we do not create unicode-data-blocks-scripts, maybe we can deprecate it in favor of the new batteries-included unicode-data.

I would propose the following plan:

Merge Add support for block and scripts #75.
Publish unicode-data-0.3.1 with all changes so far.
Update to Unicode 15.0.
Publish unicode-data-0.4.0 and names.
Rename unicode-data to unicode-data-core.
Re-create unicode-data as a package that re-exports all unicode-data-* packages.
Publish unicode-data-core-1.0, unicode-data-names-1.0, unicode-data-scripts-1.0, unicode-data-security-0.1 and unicode-data-1.0.

@harendra-kumar @adithyaov @Bodigrim

adithyaov · 2022-09-24T11:04:01Z

Sounds good to me.

wismill · 2022-09-27T05:55:02Z

When we do the split, I propose the following new version scheme: U.B.M, where:

U is the supported Unicode major version; i.e. 15 for Unicode 15.0.0.
B is used to mark breaking changes: minor Unicode update or any change requiring a version bump according to PVP. It starts at 0 with every new major Unicode version.
M is used for non-breaking changes, such as additions to the API (see PVP).
All the PVP rules apply.
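
To make the bump rules concrete, here is a minimal Haskell sketch of the proposed U.B.M scheme. The names `PkgVersion`, `Change`, `bump` and `render` are hypothetical and are not part of any unicode-data package. Resetting M to 0 whenever U or B changes is an assumption, consistent with the usual PVP practice but not stated above.

```haskell
import Data.List (intercalate)

-- | Hypothetical representation of a U.B.M package version.
data PkgVersion = PkgVersion
  { unicodeMajor :: Int -- ^ U: supported Unicode major version (15 for Unicode 15.0.0)
  , breaking     :: Int -- ^ B: breaking changes; starts at 0 for each new U
  , nonBreaking  :: Int -- ^ M: non-breaking changes, such as API additions
  } deriving (Eq, Ord, Show)

-- | Kinds of change that trigger a new release (illustrative only).
data Change
  = NewUnicodeMajor Int -- ^ update to a new major Unicode version
  | BreakingChange      -- ^ minor Unicode update or any PVP-breaking change
  | NonBreakingChange   -- ^ non-breaking change, e.g. an API addition

-- | Compute the next version for a given kind of change.
-- Assumption: M resets to 0 whenever U or B changes.
bump :: Change -> PkgVersion -> PkgVersion
bump (NewUnicodeMajor u) _ = PkgVersion u 0 0
bump BreakingChange      v = v { breaking = breaking v + 1, nonBreaking = 0 }
bump NonBreakingChange   v = v { nonBreaking = nonBreaking v + 1 }

-- | Render a version in dotted form.
render :: PkgVersion -> String
render (PkgVersion u b m) = intercalate "." (map show [u, b, m])
```

Under these assumptions, `render (bump (NewUnicodeMajor 15) (PkgVersion 0 4 0))` gives `"15.0.0"`, and a later API addition moves the version to `15.0.1`. Since U.B plays the role of the PVP major version, a dependent package could bound on `>= 15.0 && < 15.1` as usual.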

PROS:

It is easier to identify what Unicode version is supported.
Unicode version scheme is very stable. Minor updates are uncommon.
Promote the packages by marking them as mature (0. prefix usually means beta).

CONS:

Increasing the major version from 0 usually indicates the software is very stable and production-ready; not all unicode-data-* packages have reached this stage yet. Perhaps this should apply only to packages we judge production-ready, thus keeping the usual 0.X.Y scheme for beta packages. I would say at least unicode-data, unicode-data-core and unicode-data-names are candidates for a version 15.0.0.
It makes our version scheme depend on a third party’s scheme. But that scheme is very stable and we already bump the version for Unicode updates.
It does not reflect the exact Unicode version. But the minor updates of Unicode are uncommon.
Too big, looks like a browser version: I do not mind versions greater than 10 and we can expect at most one major version bump a year.
Some packages may see no change with a new Unicode version. This is unlikely as a new Unicode version usually includes new characters, which will modify the bitmaps. It may happen if the characters have default values. This is not the case for Unicode 15.0.0.

Change in the plan:

If accepted: skip version 1.0 and publish version 15.0.0 instead.

harendra-kumar · 2022-10-06T21:25:26Z

Your pros and cons seem pretty thorough. The cons do not look significant. We can go with this scheme. I am wondering if there is anything to learn from the ICU versioning scheme here: https://icu.unicode.org/processes .

harendra-kumar · 2022-10-06T21:26:49Z

We should probably send an email to @Bodigrim for his opinion, in case we are missing something.

Bodigrim · 2022-10-07T17:57:16Z

Sounds good to me.

(Sorry, I have extremely limited bandwidth at the moment and this is unlikely to improve soon, so feel free to act without waiting for me)

This was referenced Sep 14, 2022

Add support for block and scripts #75

Merged

Update to Unicode 15.0.0 #85

Closed

wismill linked a pull request Sep 27, 2022 that will close this issue

Draft: All-in-one unicode-data #93

Draft

4 tasks

adithyaov mentioned this issue Oct 10, 2022

Bump the version on master post-release #95

Open
