Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't assume that resources entries are relative #1182

Merged
merged 4 commits into from
Jan 19, 2023

Conversation

ojarjur
Copy link
Contributor

@ojarjur ojarjur commented Jan 18, 2023

Starting with the 2.0.2 release of jupyter_server, every entry in the resources field of a kernelspec reported by the kernel gateway is assumed to be a path relative to the kernel gateway URL starting with /kernelspecs).

However, this assumption is not documented anywhere in the REST API doc (as far as I can tell), and I have a kernel gateway server that uses the resources object to report additional URLs that are not relative to /kernelspecs/.

With that combination, any attempts to use this kernel gateway with a version newer than 2.0.2 results in the following failure trace:

   Traceback (most recent call last):                                                                                                                                            
      File "/home/ojarjur/.local/lib/python3.9/site-packages/tornado/web.py", line 1713, in _execute                                                                              
        result = await result                                                                                                                                                     
      File "/home/ojarjur/.local/lib/python3.9/site-packages/jupyter_server/services/kernelspecs/handlers.py", line 70, in get                                                    
        kspecs = await ensure_async(ksm.get_all_specs())                                                                                                                          
      File "/home/ojarjur/.local/lib/python3.9/site-packages/jupyter_core/utils/__init__.py", line 184, in ensure_async                                                           
        result = await obj                                                                                                                                                        
      File "/home/ojarjur/.local/lib/python3.9/site-packages/jupyter_server/gateway/managers.py", line 243, in get_all_specs                                                      
        fetched_kspecs = await self.list_kernel_specs()                                                                                                                           
      File "/home/ojarjur/.local/lib/python3.9/site-packages/jupyter_server/gateway/managers.py", line 267, in list_kernel_specs                                                  
        kernel_specs = self._replace_path_kernelspec_resources(kernel_specs)                                                                                                      
      File "/home/ojarjur/.local/lib/python3.9/site-packages/jupyter_server/gateway/managers.py", line 221, in _replace_path_kernelspec_resources                                 
        new_path = url_path_join(self.parent.base_url, "kernelspecs", split_eg_base_url[1])                                                                                       
    IndexError: list index out of range 

This change tries to remove that assumption by checking whether or not the string split actually resulted in a list of more than one element.

Starting with the 2.0.2 release of jupyter_server, every entry in the `resources` field of a kernelspec reported by the kernel gateway is assumed to be a path relative to the kernel gateway URL starting with `/kernelspecs`).

However, this assumption is not documented anywhere in the REST API doc, and I have a kernel gateway server that uses the resources object to report additional URLs that are not relative to `/kernelspecs/`.

With that combination, any attempts to use this kernel gateway with a version newer than 2.0.2 results in the following failure trace:

```
   Traceback (most recent call last):                                                                                                                                            
      File "/home/ojarjur/.local/lib/python3.9/site-packages/tornado/web.py", line 1713, in _execute                                                                              
        result = await result                                                                                                                                                     
      File "/home/ojarjur/.local/lib/python3.9/site-packages/jupyter_server/services/kernelspecs/handlers.py", line 70, in get                                                    
        kspecs = await ensure_async(ksm.get_all_specs())                                                                                                                          
      File "/home/ojarjur/.local/lib/python3.9/site-packages/jupyter_core/utils/__init__.py", line 184, in ensure_async                                                           
        result = await obj                                                                                                                                                        
      File "/home/ojarjur/.local/lib/python3.9/site-packages/jupyter_server/gateway/managers.py", line 243, in get_all_specs                                                      
        fetched_kspecs = await self.list_kernel_specs()                                                                                                                           
      File "/home/ojarjur/.local/lib/python3.9/site-packages/jupyter_server/gateway/managers.py", line 267, in list_kernel_specs                                                  
        kernel_specs = self._replace_path_kernelspec_resources(kernel_specs)                                                                                                      
      File "/home/ojarjur/.local/lib/python3.9/site-packages/jupyter_server/gateway/managers.py", line 221, in _replace_path_kernelspec_resources                                 
        new_path = url_path_join(self.parent.base_url, "kernelspecs", split_eg_base_url[1])                                                                                       
    IndexError: list index out of range 
```

This change tries to remove that assumption by checking whether or not the string split actually resulted in a list of more than one element.
This change extends the mock kernelspec used for testing the gateway
client logic so that it includes one entry in the `resources` object
that is not a local file path relative to the `/kernelspecs/`
directory.

Adding that change is sufficient to make the tests fail with
the previous version of the code and to demonstrate that the change
in this PR fixes them.
@welcome
Copy link

welcome bot commented Jan 18, 2023

Thanks for submitting your first pull request! You are awesome! 🤗

If you haven't done so already, check out Jupyter's Code of Conduct. Also, please make sure you followed the pull request template, as this will help us review your contribution more quickly.
welcome
You can meet the other Jovyans by joining our Discourse forum. There is also a intro thread there where you can stop by and say Hi! 👋

Welcome to the Jupyter community! 🎉

@codecov
Copy link

codecov bot commented Jan 18, 2023

Codecov Report

Base: 80.13% // Head: 80.14% // Increases project coverage by +0.01% 🎉

Coverage data is based on head (98ba523) compared to base (3d96cf8).
Patch coverage: 80.00% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1182      +/-   ##
==========================================
+ Coverage   80.13%   80.14%   +0.01%     
==========================================
  Files          68       68              
  Lines        8149     8150       +1     
  Branches     1604     1605       +1     
==========================================
+ Hits         6530     6532       +2     
  Misses       1191     1191              
+ Partials      428      427       -1     
Impacted Files Coverage Δ
jupyter_server/gateway/managers.py 83.17% <80.00%> (+0.03%) ⬆️
jupyter_server/serverapp.py 79.92% <0.00%> (+0.08%) ⬆️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

@kevin-bates
Copy link
Member

kevin-bates commented Jan 18, 2023

Starting with the 2.0.2 release of jupyter_server, every entry in the resources field of a kernelspec reported by the kernel gateway is assumed to be a path relative to the kernel gateway URL starting with /kernelspecs).

Starting with #1124 (released in 2.0.2) an issue was resolved whereby if the notebook server (client to gateway) is started with a base-url configured, the kernelspec resource files could not be loaded by the client application. As a result, we needed to identify the existing URLs (sent by the gateway server) and replace those with URLs that would match the notebook server (knowing that particular handler will forward the request to the gateway server, but the issue was that the handler was not getting called). So, the only real change, that, unfortunately, breaks your customization, is that we now split on each resource entry's value.

Since I've been working on Jupyter (6 years now), every entry in the resources stanza, of which there can be multiple, is a key/value pair with the key being the resource name and the value being the url path to the resource so the appropriate handler is called.

However, this assumption is not documented anywhere in the REST API doc (as far as I can tell),

You're right, the API doc doesn't get down to this level of detail.[*]

and I have a kernel gateway server that uses the resources object to report additional URLs that are not relative to /kernelspecs/.

Do these kinds of resources entries interfere with front-end applications like Jupyter Lab?

I think the better location for this particular entry is in the more free-form metadata stanza (and properly namespaced). That said, adding this extra layer of protection to identify resources handled by the resource handler, whether that be KernelSpecResourceHandler or GatewayResourceHandler, is probably a good move in the name of robustness.

[*] Please note that should we decide to better document the REST API, we will may need to make the statement that resources in this stanza are handled by the "resource handler", so I would advise adjusting the location to the metadata stanza under another stanza specific to your customization/application.

cc: @bloomsa

Copy link
Member

@kevin-bates kevin-bates left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me @ojarjur - thank you. We can talk about documenting the API and such another time (perhaps this week's server meeting?).

@kevin-bates
Copy link
Member

Hi @ojarjur. I'd like to apologize for my tone in yesterday's response as it may have come across as defensive and a bit agitated. I'm sorry.

Regarding the customization, sans official documentation, I think precedent says a lot and I don't think there have really been changes to the kernelspec model returned from servers (notebook or gateway), at least in the last 6 years. I can't guarantee that applications that need to interpret that content won't break when new items are encountered - thus the emphasis to use the metadata stanza for these kinds of customizations.

@ojarjur
Copy link
Contributor Author

ojarjur commented Jan 18, 2023

@kevin-bates Thank you for your rapid and thoughtful feedback!

We can talk about documenting the API and such another time (perhaps this week's server meeting?).

I definitely want to join that meeting, but I have a personal conflict during it, so I don't know if/when I'll be able to join. Thanks for making sure that I am aware of it, though.

I'd like to apologize for my tone in yesterday's response as it may have come across as defensive and a bit agitated. I'm sorry.

I did not pick up on anything like that, but I really appreciate how proactive you are about it.

Starting with #1124 (released in 2.0.2) an issue was resolved whereby if the notebook server (client to gateway) is started with a base-url configured, the kernelspec resource files could not be loaded by the client application. As a result, we needed to identify the existing URLs (sent by the gateway server) and replace those with URLs that would match the notebook server

That makes perfect sense.

Since I've been working on Jupyter (6 years now), every entry in the resources stanza, of which there can be multiple, is a key/value pair with the key being the resource name and the value being the url path to the resource so the appropriate handler is called.

I appreciate that context. I'm also happy to get feedback on better ways for me to use the API, so thank you for that.

Do these kinds of resources entries interfere with front-end applications like Jupyter Lab?

I haven't seen any issues with Jupyter Lab arising from this. To be clear, though; I've only tried this for new entries in the resources object; not existing ones.

So, for example, I have not tried it to see what would happen if I set one of the logo entries so that the logo was loaded from a CDN instead of the Jupyter server. That seems like something someone might want to do, but if it hasn't come up in the past 6 years then maybe I'm wrong about that.

I think the better location for this particular entry is in the more free-form metadata stanza (and properly namespaced)

Thanks! I really appreciate any guidance you are willing to share here.

To clarify, you are not referencing an existing stanza, right (I couldn't find a reference to that in the API docs)?

I.E. is the suggestion to have my server return an additional metadata entry that is outside of what is covered by the spec?

so I would advise adjusting the location to the metadata stanza under another stanza specific to your customization/application.

By this, you mean something like add an object to the top level of the KernelSpec object whose name is something like my domain name, and then have a metadata field inside of it?

... so, an example kernelspec might look like:

{
  "name": "example",
  "KernelSpecFile": {
    ...
  },
  "resources" : {
     ...
  },
  "example.com": {
    "metadata": ...
  }
}

Do we have any tests that exercise additional fields being passed through the server? If not, I might be able to add such in a future PR, but would need guidance on the right place to put them.

@kevin-bates
Copy link
Member

Oh, shoot. It slipped my mind that resources is a sibling stanza to spec (which you list as "KernelSpecFile" above) and it's in the spec stanza in which the metadata stanza to which I'm referring resides. Things like the process_proxy or kernel_provisioner sub-stanza are in metadata and I figured you could add your own sub-stanza to that.

Is this information something that is relatively static and could be persisted in the kernel.json file? Or more dynamic in nature?

I hesitate to recommend injecting into the spec.metadata stanza in the handler, but I suspect kernel parameterization (whenever we get to that) will want to store the parameter schema in that same stanza, and will be something kernel spec managers will do anyway.

I haven't seen any issues with Jupyter Lab arising from this. To be clear, though; I've only tried this for new entries in the resources object; not existing ones.

I haven't dug into how Lab uses the entries in the resources stanza. They may only look for logo-* (regex), kernel.js, and kernel.css and ignore everything else. I suspect they're only looking for logo-* however, so perhaps leaving things the way they are is not so bad.

I will try to bring this topic up in tomorrow's meeting and let you know.

@ojarjur
Copy link
Contributor Author

ojarjur commented Jan 19, 2023

Oh, shoot. It slipped my mind that resources is a sibling stanza to spec (which you list as "KernelSpecFile" above) and it's in the spec stanza in which the metadata stanza to which I'm referring resides. Things like the process_proxy or kernel_provisioner sub-stanza are in metadata and I figured you could add your own sub-stanza to that.

@kevin-bates Thanks again for all of your insights here. Can you share a link to what you are referring to?

I've been looking at the Swagger API spec here, but it seems that is missing the things you are referring to.

Have I been looking in the wrong place?

Regardless, this is a bit of a tangent from the original PR. What do we (I?) need to do next to close that out? Does this require discussion at the server meeting tomorrow or review from someone else?

Thanks again for all of your feedback and discussion here.

@kevin-bates
Copy link
Member

I've been looking at the Swagger API spec here, but it seems that is missing the things you are referring to.

Yeah, the swagger doc is not very detailed.

The location of where the metadata stanza is added is in the KernelSpecManager located in jupyter_client where it builds the spec portion of the model.

The KernelSpecManager is responsible for looking in the appropriate directories for sub-directories containing a kernel.json file. The resource_dir attribute of the spec is then converted by the kernelspec handlers (in the respective server applications) into a resources stanza that provides the base Url necessary to fetch the respective resource.

The Jupyter Client docs have a section on kernelspecs in which the metadata field is listed.

Since you're using EG, this is where the process_proxy definition is located and, with provisioners, where the kernel_provisioner definition is located.

@kevin-bates
Copy link
Member

Hi @ojarjur. We discussed this in today's server meeting and feel the best/safest course of action is to use the metadata stanza for application customizations and keep resources as-is. Please note that this does not discount your pull request and we appreciate your contribution.

@kevin-bates
Copy link
Member

@blink1073 - I think this is ready to merge but I wanted to get your opinion on the build failure as I suspect it is unrelated to the PR. Appears to be something amiss with hatch.

@blink1073
Copy link
Contributor

I've been seeing that failure intermittently on other PRs.

@blink1073 blink1073 merged commit bc735a1 into jupyter-server:main Jan 19, 2023
@welcome
Copy link

welcome bot commented Jan 19, 2023

Congrats on your first merged pull request in this project! 🎉
congrats
Thank you for contributing, we are very proud of you! ❤️

@ojarjur ojarjur deleted the patch-1 branch January 20, 2023 19:30
@ojarjur
Copy link
Contributor Author

ojarjur commented Jan 21, 2023

Thank you @kevin-bates and @blink1073 thanks for all of your feedback and help here!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants