Request to Decrease the number of Jupyter Orgs. #12

Open
Carreau opened this issue Oct 2, 2023 · 29 comments

Comments

@Carreau

Carreau commented Oct 2, 2023

Hi,

As of this morning I had to audit "all" the Jupyter organizations for access and, where needed, update each organisation/team/repo accordingly.
I'm going to keep the what/why private for good reason, but I'm happy to share the details in a private channel.

I was able to audit 9 organisations, though I believe we have more (my memory tells me 13).

It is extremely painful to carry out these audits, and to be sure I did them correctly.

I am once again going to ask for a consolidation of organisations. More than a couple of organizations is not manageable from a security perspective.

@Carreau
Author

Carreau commented Oct 3, 2023

2 (3?) organisations that would be easy to remove:

  1. Jupyter-attic: GitHub now has the "archiving" feature. I suggest moving all repos from jupyter-attic back to Jupyter and using the archiving feature.

  2. Jupyter-incubator has only 4 repositories. I think the benefit of having an org for that is limited.

  3. (?) https://github.com/jupyter-standard has a single repository. I don't see why it can't be a team under the Jupyter org.

I'm guessing that simply having a list of these orgs would already help.

As for the other issue, I understand that it may take time to reply, but please at least acknowledge receipt.

@blink1073
Contributor

blink1073 commented Oct 3, 2023

Hi @Carreau, I agree that there are too many orgs (speaking for myself not the EC). To give a concrete proposal to kick things off:

  • jupyter is related to backend, standards and meta-content
    • subsumes parts of jupyter-server org
  • jupyter-frontend includes frontend applications
    • subsumes jupyterlab, jupyter-widgets, voila-dashboards, jupyterlab-lsp org and parts of jupyter-server and jupyter
  • jupyter-kernels
    • subsumes xeus-kernels and ipykernel
  • jupyterhub - stays as-is
  • ipython - stays as-is

Note that the above would require a change in governance, and clear standards of onboarding and offboarding repos onto those orgs, as well as org ownership.

@Carreau
Author

Carreau commented Oct 3, 2023

Thanks @blink1073. One question I have is: do we actually need to segregate things by org? Aren't teams sufficient? Why can't the kernels be under a Jupyter Kernel team, for example?

> Note that the above would require a change in governance, and clear standards of onboarding and offboarding repos onto those orgs, as well as org ownership.

As long as a repo is under a given team it should not be a problem. For example, the rust-lang org has ~180 repos, and uses projects (which are public and org-wide).

@blink1073
Contributor

I personally don't think teams are sufficient, for two reasons:

  • Discoverability: Being able to find repositories, especially pinned ones
  • Separation of concerns: trying to fit too many activities in one org that have different needs

@Carreau
Author

Carreau commented Nov 7, 2023

Trying to restart this with a more limited proposal: to go step by step, what about just doing step 1 for now?

> Jupyter-attic: GitHub now has the "archiving" feature. I suggest moving all repos from jupyter-attic back to Jupyter and using the archiving feature.

No more, no less. That already takes us from 9 to 8, which is a bit more than a 10% reduction in the number of orgs.
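
As a quick check of that figure, here is a minimal sketch of the arithmetic (the function name is purely illustrative):

```python
def reduction_percent(before: int, after: int) -> float:
    """Relative decrease from `before` to `after`, expressed in percent."""
    return (before - after) / before * 100


# Going from 9 orgs to 8 removes one ninth of them, i.e. roughly 11%.
print(f"{reduction_percent(9, 8):.1f}%")
```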

@Carreau
Author

Carreau commented Nov 7, 2023

(Side note: technically IPython also relies on https://github.com/pickleshare, of which we are also the only maintainers. It's another discussion, but I think we should fold that repo back into this at some point.)

@blink1073
Contributor

I agree we can get rid of jupyter-attic. For pickleshare, why not bring it up to the Jupyter Foundations and Standards council to move the repo to ipython?

@Carreau
Author

Carreau commented Nov 8, 2023

> For pickleshare, why not bring it up to the Jupyter Foundations and Standards council to move the repo to ipython?

That is a good idea, I'll reach out to the rest of the Jupyter Foundation and Standards.

For Jupyter-attic, I'd love a few more +1s before moving it, unless you agree that this is a low-traffic enough org that we don't need EC approval.

@blink1073
Contributor

jupyter-attic isn't called out in our governance docs, I view this as housekeeping.

@Carreau
Author

Carreau commented Nov 8, 2023

OK. Then when I have some time I'll move all the repos (which are already archived) to jupyter, unless someone objects in the meantime.

There are "only" 36 repos, so I'm likely to do that by hand rather than with a script.

@Carreau
Author

Carreau commented Nov 10, 2023

I did not even realize that ec-team-compas is in its own organisation called jupyter-governance.

Does this really have to be its own org?

@Carreau
Author

Carreau commented Nov 12, 2023

> jupyter-attic isn't called out in our governance docs, I view this as housekeeping.

All repos have been migrated back to Jupyter and marked as archived; the Jupyter-attic org itself has been archived.

@Carreau
Author

Carreau commented Jan 16, 2024

#25 is one more symptom.

I also discovered https://github.com/jupyter-native – which is empty.

@krassowski

Oh, repos from attic are back in the jupyter/ org? I was a huge fan of the attic solution and was trying to advocate for it more. Many novices google, e.g., "jupyterlab-debugger", which brings up https://github.com/jupyterlab/debugger, and then they break their environment by following the severely outdated instructions there (of course this is despite a clear banner saying it is archived); moving things like that to attic would have added another layer of "do not use me" disclaimer. Of course, the problem that I am highlighting can (and should) be solved by other means (either us or GitHub improving messaging in the archived repositories, for example by modifying their READMEs).

Anyways, attic is dead, long live a single org to rule them all!

@jtpio

jtpio commented Jan 16, 2024

> I also discovered https://github.com/jupyter-native – which is empty.

This one can likely be deleted. If I remember correctly, it was an alternative to https://github.com/jupyter-xeus at the time the xeus proposal was created.

cc @SylvainCorlay @JohanMabille

@Carreau
Author

Carreau commented Jan 16, 2024

> This one can likely be deleted. If I remember correctly, it was an alternative to @jupyter-xeus at the time the xeus proposal was created.

I'm happy if you want to keep it; maybe just remove the Jupyter logo.

@Carreau
Author

Carreau commented Jan 16, 2024

> Oh, repos from attic are back in the jupyter/ org? I was a huge fan of the attic solution and was trying to advocate for it more. Many novices google, e.g., "jupyterlab-debugger", which brings up jupyterlab/debugger, and then they break their environment by following the severely outdated instructions there (of course this is despite a clear banner saying it is archived); moving things like that to attic would have added another layer of "do not use me" disclaimer. Of course, the problem that I am highlighting can (and should) be solved by other means (either us or GitHub improving messaging in the archived repositories, for example by modifying their READMEs).

Oh, we can still rename the repository and push a single commit that deletes all files but the README on more repos. Is there any particular repo where we need to do that?

@krassowski

After some confusion, I take it that @Carreau's comment on the other issue (#25 (comment)) was encouraging me to move my comment from there to here, so here it is:

Just to spell it out, it looks like we are stuck in a bad place here (inconveniencing either maintainers or the security team) because we are using a free product, whereas the platform also offers a paid version which does not have the limitation that the security team is facing:

> One of the main differences between GitHub Enterprise Cloud and other plans for GitHub.com is access to an enterprise account. Enterprise accounts provide administrators with a single point of visibility and management across multiple organizations. For more information, see "About enterprise accounts."

Link: https://docs.github.com/en/enterprise-cloud@latest/admin/overview/about-github-enterprise-cloud

@krassowski

So apparently Jupyter is now using Enterprise, as per the comment from @fperez in jupyter/enhancement-proposals#122 (comment). This should resolve the issues which made the security team pursue a reduction in the number of orgs 🎉

@jasongrout
Contributor

See jupyter/governance#219 for more info about the enterprise org.

@jasongrout
Contributor

Now that we have adopted the policy of all Jupyter GitHub orgs being under the Jupyter Enterprise org, can we close this issue as resolved?

@fperez

fperez commented Jul 9, 2024

I think so - @Carreau any objections?

@Carreau
Author

Carreau commented Jul 10, 2024

> Now that we have adopted the policy of all Jupyter GitHub orgs being under the Jupyter Enterprise org, can we close this issue as resolved?

I would prefer not to. The high number of orgs is still cumbersome, even with Jupyter Enterprise: you still have to do some steps 20 times, and that's only when it's possible to do them at all.

For example, we still have to synchronise 20 security teams, or modify most settings in 20 places.
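
To illustrate what keeping that many security teams in sync involves, here is a minimal sketch of a drift check. The org names, member names and the `team_drift` helper are all hypothetical; it assumes the membership of each org's security team has already been collected by some other means, and it only compares plain Python sets.

```python
from typing import Dict, Set


def team_drift(
    teams: Dict[str, Set[str]], reference_org: str
) -> Dict[str, Dict[str, Set[str]]]:
    """Compare each org's team against the team of `reference_org`.

    Returns, for every org whose team differs, the members that are
    missing from it and the members it has in excess of the reference.
    """
    reference = teams[reference_org]
    drift = {}
    for org, members in teams.items():
        if org == reference_org:
            continue
        missing = reference - members
        extra = members - reference
        if missing or extra:
            drift[org] = {"missing": missing, "extra": extra}
    return drift


# Hypothetical data: one entry per org, holding its security team members.
security_teams = {
    "org-a": {"alice", "bob"},
    "org-b": {"alice"},
    "org-c": {"alice", "bob", "carol"},
}

for org, diff in team_drift(security_teams, reference_org="org-a").items():
    print(org, diff)
```

With a single org there would be nothing to compare; with 20, every membership change has to be repeated and then verified 20 times.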

Having the enterprise only helps a bit, for a few things: it lets us list, and sometimes get access to, all the orgs, and enforce some settings across organisations (2FA). But it's really only providing the bare minimum.

For example I still can't see the teams of https://github.com/binderhub-ci-repos because I'm not an owner. Now because I am part of Jupyter enterprise I could make myself an owner.

Here is a video of me just trying to open the settings page of each of the 20 orgs we have. See how I can't even reach half of them, and how long it took.

20.orgs.is.painful.mov

And this is only the level 1 page, imagine when you need to go to the 3rd or 4th level down, or verify that some teams are the same.

Does that feel reasonable to you?

There are also a bunch of things that you can't do across orgs (e.g. migrate issues from one repo to another). The same goes for integration with third parties, for example code scanning, where you need to give access to 20 orgs one by one.

If I can make a metaphor, I'm saying that having 20 keys/locks to open my door is annoying, and you are giving me a keychain and telling me I only need to carry one keychain and not 20 keys.

@jasongrout
Contributor

Point taken, and quite an analogy. Yes, I also found the enterprise tools quite minimal. Sure, let's keep this issue open to still have a bit of pressure, especially for those teams that function across Jupyter like the security team.

@fperez

fperez commented Jul 10, 2024

Sounds good Matthias, good points!

@krassowski

This announcement from GitHub just dropped yesterday: https://github.blog/changelog/2024-07-10-pre-defined-organization-roles-that-grant-access-to-all-repositories/
It does not mention the enterprise use case specifically but is labelled as "enterprise". It might be worth checking whether new options are now available in the Enterprise panel and, if not, checking with GitHub whether they plan to add them at the Enterprise level any time soon.

@Carreau
Author

Carreau commented Jul 11, 2024

I completely see the point of enterprise, and I think it makes a lot of sense: it gives you a master key and audit tools in case something goes wrong. And it is great. For a company it makes sense to have HR get access for when employees leave, or to have audits. But I don't think enterprise has a goal that is aligned with large-scale code management as we want to do it in open source.

I know that for security we don't need access to all teams that often, but there are a bunch of things that we might not do because we have 20 orgs. For example, I just remembered that .github repos are a thing, and we'd need to sync them across 20 orgs. Or list all the repos that are forks and can be deleted.

I believe that whether some of these orgs should exist could be determined solely by the number of repos they contain:

(the highest number is jupyter with 146, followed by Hub with 75 and lab with 72, which together represent more than half of the 507 repos across all orgs).

I think that the cost of having an org for so few repos is too high, and the main case where an org should be created is when a group of repos want to split because an already existing org is too large. Not the opposite of creating an org by default.

Even the IPython org, with 30 repos, is IMHO on the edge, and in general https://github.com/orgs/$ORG/repositories lists 30 repos per page, so even Hub and lab would "only" be 3 pages each.
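
The numbers above can be checked with a small sketch. The repo counts, the total of 507 and the page size of 30 are the ones quoted in this comment; the variable and function names are illustrative only.

```python
import math

PER_PAGE = 30  # repos listed per page, as quoted above

# Repo counts quoted above for the three largest orgs.
repo_counts = {"jupyter": 146, "Hub": 75, "lab": 72}
total_repos = 507  # total across all orgs, as quoted above


def pages_needed(repo_count: int, per_page: int = PER_PAGE) -> int:
    """Number of listing pages needed to show `repo_count` repos."""
    return math.ceil(repo_count / per_page)


pages = {org: pages_needed(count) for org, count in repo_counts.items()}
top_three_share = sum(repo_counts.values()) / total_repos

print(pages)
print(f"Share of all repos held by these three orgs: {top_three_share:.0%}")
```

Under these figures the three largest orgs hold 293 of the 507 repos, and Hub and lab each fit on 3 pages.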

@Carreau
Author

Carreau commented Sep 12, 2024

In addition, I recently realized that having multiple orgs means that we have to manage the following independently for each org:

  • GitHub sponsors,
  • Blocking/Banning users.

@Carreau
Author

Carreau commented Oct 1, 2024

One more thing: with 20 orgs, we need to join waiting lists for beta issues 20 times.
For example:
