[Feature]: DNDocs #10055

Open
NeuroXiq opened this issue Jul 3, 2024 · 8 comments
Labels
feature-request Customer feature request

Comments

@NeuroXiq

NeuroXiq commented Jul 3, 2024

Related Problem

#6946
#9776

The Elevator Pitch

Short description

While browsing github.com repositories and nuget.org packages, I noticed that many of them do not
have documentation online. If I want to see classes, methods, etc., I need to open the source code
and read the C# to find out what is implemented. This is cumbersome.

I created DNDocs to host an API Explorer for exactly this problem:
https://dndocs.com/i/nuget/AutoMapper/13.0.1
https://dndocs.com/i/nuget/Microsoft.EntityFrameworkCore/8.0.3

Long description

DNDocs is a simple API Explorer (documentation hosting platform).
DNDocs lets you generate and host API docs online using DocFX.
After generating a project, the server does not delete it: generated docs are stored on the server disk
(cached) and are available immediately afterwards. Generation therefore happens only once;
it can take 1-5 minutes, and after that the docs are accessible immediately.

To explore a NuGet package, replace the values in the URL and open the page:
https://dndocs.com/i/nuget/{NugetPackageName}/{NugetPackageVersion}
Wait 1-5 minutes (if the server is generating the docs for the first time), then click the green button
on the right.
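
As a small illustration of the URL pattern above, here is a minimal sketch that fills in the two placeholders. The function name `dndocs_url` is hypothetical and not part of DNDocs, and percent-encoding the segments is an assumption made for safety, not something the service is known to require.

```python
from urllib.parse import quote

# Base path taken from the URL pattern quoted in the issue.
DNDOCS_BASE = "https://dndocs.com/i/nuget"


def dndocs_url(package_id: str, version: str) -> str:
    """Build the explorer URL for one package version (hypothetical helper)."""
    # Encoding each segment is an assumption; plain ids and versions pass through unchanged.
    return f"{DNDOCS_BASE}/{quote(package_id, safe='')}/{quote(version, safe='')}"


print(dndocs_url("AutoMapper", "13.0.1"))
```

For the two packages linked earlier, this reproduces the same addresses that were posted by hand.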

Additional information

For example, people want to have an API Explorer, and I think DocFX fits well here:

#6946
#9776

Some people on GitHub also host their docs on DNDocs:
https://github.com/madelson/DistributedLock
https://github.com/betalgo/openai
https://github.com/adamhathcock/sharpcompress

Summary

That's all I want to say in this request.
I would like to hear your opinion about this feature and the DNDocs project.
I'm open to any and all ideas.

Best Regards
Marek Węglarz / NeuroXiq
https://github.com/NeuroXiq/DNDocs

Additional Context and Details

Screenshots:

[7 screenshots]

@NeuroXiq NeuroXiq added the feature-request Customer feature request label Jul 3, 2024
@erdembayar
Contributor

If you have a screenshot, please add it here. We need to triage first based on the information you provided to us.

@erdembayar
Contributor

@JonDouglas
Could you please take a look?

@MangelMaxime

Coincidentally, we (F# community) are working on a similar project for F# APIs.

I wanted to point it out, because it looks like DNDocs assumes that all packages published to NuGet are C# packages.

[Screenshot: CleanShot 2024-07-14 at 20 03 40]

versus

type [<AllowNullLiteral>] ArrayConstructor =
    [<Emit "new $0($1...)">]
    abstract Create: size: int -> 'T[]

I am mentioning this because I think it is important to note that multiple services may need to be supported.

@joelverhagen
Member

This tool looks really cool. The feature has been on the back of my mind for years. As you mentioned, there are existing feature requests from 2019.

The need was filled for a while by FuGetGallery. However, FuGetGallery has had availability issues and we disabled the link (#9783).

I have some questions, @NeuroXiq.

  1. Where is the source code for the project? I found https://github.com/NeuroXiq/DNDocs but this seems to just be an issue tracker. Is the project open source? In particular, I'd like to look at the way you fetch package content from our APIs. I also see some typos. The community may be able to help improve DNDocs in these cases if it is open source.
  2. How are deleted packages handled? When a package is removed from NuGet.org, will it be cached indefinitely by DNDocs? From time to time there are malicious and DMCA takedowns on NuGet.org. IMO it would make sense to periodically remove deleted package content from DNDocs. This could be detected by a catalog reader (guide, API) looking for PackageDelete events or via a cache timeout.
  3. What are your scalability targets? Said another way, do you expect it to be able to handle a lot of load? If we enabled this from NuGet.org, I expect the background job queue to grow somewhat as folks browse packages seeking API docs.
  4. What package types are handled? I tried a normal dependency package and it worked well, but a CLI tool redirected me to a login page for some reason. When I queued by package name I got an error. There are other package types DotnetTool, Template, DotnetPlatform, MSBuildSdk, AzureSiteExtension, DotnetCliTool, etc.
  5. Does the link to DNDocs make sense when the package already has API documentation linked elsewhere? I feel like it could be confusing to the user that they may not know which is the "official" API documentation site.
  6. Akin to @MangelMaxime's question, how are packages with different languages handled? I see F# doesn't work as expected. I wonder how native dependencies or VB.NET IL are handled. What about packages containing JS or .exe (legacy scenarios)?
  7. Can a package author request a takedown? If a package author does not want their content hosted on DNDocs, what kind of controls do they have?
  8. Suppose a type is public but the package author does not want to document it (perhaps it is a legacy API) or have it appear in their DNDocs site. How would they control the final shape of the API docs?
  9. What security considerations or mitigation do you have? I wonder what tools are run on the package and if the possibility of XSS exists.
  10. When is a GitHub login required? I saw it when I tried loading a .NET tool package and I am not sure when login is required. Ideally, the link from NuGet.org would not immediately land on a sign in page. It would break the flow of anonymous NuGet.org navigation.
  11. Could you preempt the "job waiting" delay and/or the "Open API explorer" extra link click by building API docs for all packages on NuGet.org? I am guessing this is expensive compute-wise for you, so the work is done on demand. But you could discover packages as they are published via the catalog API (as mentioned above for deletes) and pre-cache the API docs in all applicable cases.
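
To make the catalog-reader idea in question 2 more concrete, here is a minimal sketch of how delete events on one catalog page could be used to evict cached docs. The field names (`@type`, `nuget:id`, `nuget:version`) and the `nuget:PackageDelete` type follow the NuGet catalog resource as I understand it and should be checked against the official API docs. The cache layout and the function names are hypothetical and are not how DNDocs works today.

```python
def deleted_packages(catalog_page: dict) -> list[tuple[str, str]]:
    """Return (id, version) for every delete event listed on one catalog page."""
    deleted = []
    for item in catalog_page.get("items", []):
        # Assumption: delete leaves are tagged with the type "nuget:PackageDelete".
        if item.get("@type") == "nuget:PackageDelete":
            deleted.append((item["nuget:id"], item["nuget:version"]))
    return deleted


def purge_deleted(cache: dict, catalog_page: dict) -> int:
    """Evict cached docs for deleted packages; returns how many entries were removed."""
    removed = 0
    for package_id, version in deleted_packages(catalog_page):
        # Hypothetical cache key: lower-cased package id plus the version string.
        if cache.pop((package_id.lower(), version), None) is not None:
            removed += 1
    return removed


# Hand-written sample data, not a real catalog page.
sample_page = {
    "items": [
        {"@type": "nuget:PackageDetails", "nuget:id": "Example.Lib", "nuget:version": "1.0.0"},
        {"@type": "nuget:PackageDelete", "nuget:id": "Example.Bad", "nuget:version": "2.1.0"},
    ]
}
sample_cache = {
    ("example.lib", "1.0.0"): "/docs/example.lib/1.0.0",
    ("example.bad", "2.1.0"): "/docs/example.bad/2.1.0",
}
print(purge_deleted(sample_cache, sample_page), sorted(sample_cache))
```

A real reader would also need to fetch the pages, track a cursor on the commit timestamp, and normalize version strings; the sketch leaves all of that out.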

Overall, this is an amazing project and I applaud your contribution to the .NET ecosystem. This may fill a need that FuGetGallery once filled.

@NeuroXiq
Author

Hi!
Thank you very much for the response. I'm sorry for the delay,
but nothing has changed since I wrote, so I hope we can continue the discussion.

[Short answers]

Where is the source code for the project: Temporary repository: temp-dndocs-src
How are deleted packages handled: I need to implement this
What are your scalability targets: I am working on scaling this as much as I can; I want to handle a lot of load
What package types are handled: Works: AzureSiteExtension, standard .NET libraries with '*.dll' and '.xml'. Does not work: DotnetTool, Template, DotnetPlatform
Does the link to DNDocs make sense when the package already has API documentation linked elsewhere: Probably yes
How are packages with different languages handled: I lack the knowledge to say whether the generated docs look valid, but it is possible to generate something for other languages
Can a package author request a takedown: Not sure how to answer: 1. Not implemented. 2. If somebody asks me to delete something, I will delete it manually without problems or discussion
How would they control the final shape of the API docs: They will not have any control, and this is not planned. I considered this and there would be lots of problems
What security considerations or mitigation do you have: As far as I know, XSS is not possible with the current implementation for nuget.org
When is a GitHub login required: This was a bug; login will never be required when coming from nuget.org
Could you preempt the "job waiting" delay and/or the "Open API explorer: This is implemented and running online. There is now a separate service, docs.dndocs, that only hosts API docs

[Questions]

  1. Are there any estimates of the traffic from NuGet to external sites? Are there any statistics on how many distinct packages people clicked to open NuGet Trends or FuGet? I need to estimate how big the job queue can get. I'm interested only in distinct packages per day/hour/minute. The problem is the generation step; once the docs are generated, I think everything will work fine.

  2. I will need to download packages to generate docs, which means lots of random package downloads from nuget.org from a single IP. Is that OK?

Can a package author request a takedown
Do you think it will be common for somebody to want a takedown?
If I state in the README.md/FAQ that an author can request a takedown by creating an issue on GitHub, and I then delete the docs manually, will that be enough?
If not, I will need to implement login, some kind of package ownership validation with nuget.org, and DB tables, which means more work.

How are packages with different languages handled
Something is generated and hosted. Maybe leave it as it is? If the docs are invalid, I will remove support for other
languages and keep it only for C# packages.

How are deleted packages handled? From time to time there are malicious and DMCA takedowns on NuGet.org.
Thank you very much for this information; I will need to implement this.

[Long Answers]
Generally I want to keep the coding work as small as possible, because everything will
require maintenance later. But it is no problem to add some features if they are needed.

Where is the source code for the project
I didn't prepare the code for publication. I can't just share everything right now: I have secrets in my appsettings, server configs, etc. that cannot be made public. I removed those and created a temporary repository for reference:
https://github.com/NeuroXiq/temp-dndocs-src
This is 95% of the code that's running; I removed appsettings, *.csproj files, and other items that contain my sensitive data.
I still need to clean that up, write instructions on how to run it, prepare some init scripts, etc., but right now I am focusing on
implementing core features. I will make this project open source later, 100% sure.

Very short summary of how it works:

  1. The request goes to the controller
  2. The handler is invoked
  3. A background job is started
  4. The project is built in the background job
  5. The project is published in DNDocs.Docs
  6. The project can be explored from DNDocs.Docs
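
The flow summarized above, together with the "generate once, then serve from cache" behaviour described in the original post, can be sketched roughly as follows. This is only an illustration under assumptions: `DocsService`, its methods, and the in-memory queue are hypothetical names, and the real DNDocs code is .NET and structured differently.

```python
from collections import deque


class DocsService:
    """Toy model of the flow: request -> queue a background job -> build -> publish."""

    def __init__(self, build):
        self.build = build        # callable(package_id, version) -> location of the built docs
        self.queue = deque()      # background jobs waiting to run
        self.pending = set()      # keys already queued, so a package is never queued twice
        self.published = {}       # finished docs, kept so generation happens only once

    def request(self, package_id: str, version: str) -> str:
        key = (package_id.lower(), version)
        if key in self.published:
            return "ready"        # already generated: served immediately
        if key not in self.pending:
            self.pending.add(key)
            self.queue.append(key)
        return "queued"           # the visitor waits while the job runs

    def run_next_job(self) -> bool:
        """Run one queued build; returns False when the queue is empty."""
        if not self.queue:
            return False
        key = self.queue.popleft()
        self.published[key] = self.build(*key)
        self.pending.discard(key)
        return True
```

Deduplicating on the package id and version is what keeps the job queue bounded by the number of distinct packages requested, which is the quantity asked about in the questions earlier in this comment.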

How I fetch NuGet packages: see the 'GetDllAndXmlFromPackageAsync' method

What are your scalability targets
I must scale as much as I can. I implemented DNDocs.Docs.web.
I am working on moving BuildProjectHandler into DNDocs.Job.

What package types are handled
Generating docs for other types of packages, e.g. CLI tools, makes sense, but for now I will not support this.
[image]
There are many unrelated DLLs, and this would require filtering only the .dll and .xml files of the project. This
will be hard to do. Also, the .xml must be provided. Maybe in the future I will find some solution for this,
but I think it will not be supported.
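
To illustrate the filtering problem, here is a minimal sketch that opens a .nupkg (which is a zip archive) and keeps only the assemblies under `lib/` that ship with a matching XML documentation file. This is not the logic of 'GetDllAndXmlFromPackageAsync'; the function name and the "same path, .xml extension" pairing rule are assumptions made for the example.

```python
import io
import zipfile
from pathlib import PurePosixPath


def documented_assemblies(nupkg_bytes: bytes) -> list[tuple[str, str]]:
    """Return (dll, xml) entry pairs found under lib/ in a .nupkg archive."""
    with zipfile.ZipFile(io.BytesIO(nupkg_bytes)) as archive:
        names = archive.namelist()

    # Case-insensitive lookup so 'Foo.XML' still pairs with 'Foo.dll'.
    by_lower = {name.lower(): name for name in names}

    pairs = []
    for name in names:
        path = PurePosixPath(name)
        # Assumption: only assemblies in the conventional lib/ folder are of interest.
        if path.parts[:1] != ("lib",) or path.suffix.lower() != ".dll":
            continue
        xml_name = by_lower.get(str(path.with_suffix(".xml")).lower())
        if xml_name is not None:
            pairs.append((name, xml_name))
    return sorted(pairs)
```

A tool package usually keeps its binaries under `tools/` next to many dependency DLLs, so a rule this simple would return nothing for it, which matches the difficulty described above.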

Does the link to DNDocs make sense when the package already has API documentation linked elsewhere
Hard to answer. As a nuget.org user, I open a package and always see "Open in NuGet Trends" on the right side. I know that if I open another package I will
see this link again. I think an always-visible DNDocs URL will give a consistent experience for the user: I know that
if I open another package I will see the DNDocs URL there too. A link that is randomly visible or invisible could be confusing.
But maybe hiding it makes sense; I really don't know.

How would they control the final shape of the API docs
I considered this and I really don't want to do it. There will be a lot of problems with custom configuration.
If you mean that somebody could edit the docfx.json file used for generation, the first problem is that XSS is 100% possible in that case and very hard to avoid.
It is much better, safer, simpler, and more consistent not to allow this. I know it would be nice,
but this is something that I really don't want.

What security considerations or mitigation do you have
I investigated this.

  1. For nuget.org, I believe XSS is not possible.
  2. With the login feature, some docs are downloaded from a GitHub repository. This was considered, and the build process sanitizes every file other than .dll and .xml (see the sanitizing in the method "FetchMdDocs2" in BuildHandler)

When is a GitHub login required
I am working on some small additional features for DNDocs that will require login.
These features are:

  1. Generating API docs for multiple NuGet packages
  2. Taking documentation from GitHub and, in addition to the API docs folder, creating an 'Articles' folder with the GitHub docs
  3. Simple versioning
    This is not related to nuget.org

Even with my best intentions, I will probably go bankrupt if I try to scale this in line with the traffic, and I think
that in the future DNDocs will require a little monetization to pay for the hosting servers and N job servers. I really don't like the idea of cluttering DNDocs with annoying ads such as popups or blinking images. On readthedocs I saw very nice, non-annoying ads, but I have no idea whether they would agree. I will need to send a request to ethicalads to ask for their opinion. Anyway, this will be a later problem. I just want to mention that I will probably need to monetize this a little. Ruining everything with ads is absolutely not my intention, and I don't want to be considered a scammer or anything like that. I really hope this is clear and OK.

Summary:
I must do this and must scale it to handle whatever traffic is needed, and I am spending 100% of my free time on it.
I'm 90% done; the remaining things are: implement a memory cache in docs.dndocs, implement the bgjob.web server, and deploy to better servers.
I would really like to know the general opinion about this.

@JonDouglas
Contributor

Hey @NeuroXiq,

What a cool project! It definitely fills a gap in NuGet today regarding static API documentation for each package.

Our team has plans to implement static API documentation on NuGet.org similar to other ecosystems like https://docs.rs/syn/2.0.72/syn/, https://jsr.io/@astral/astral/doc, and https://pub.dev/documentation/bloc/latest/bloc/bloc-library.html

Here is what people would like to see on NuGet.org next based on a June 2024 survey:

image

You can see how we generate API docs today via https://learn.microsoft.com/en-us/dotnet/api/microsoft.extensions.hosting?view=net-8.0 or https://github.com/dotnet/dotnet-api-docs which would be a similar experience we hope to bring to NuGet.org.

Now that brings us to this issue and DNDocs. We believe there is a time period to potentially add this experience to NuGet.org, with the understanding that we will likely bring a first-party experience relatively soon for every NuGet package overriding the need for DNDocs long-term. We don't know when we will do this work, so that could be months or even years from now.

We would highly recommend first proposing this experience in NuGet/home and we can amplify the proposal to the .NET community for comments about the current experience of DNDocs and how it could help fill this gap temporarily.

https://github.com/NuGet/Home/tree/dev/meta#how-do-i-create-a-proposal

I took a quick look at the current experience and the challenge I personally see is that it takes ~40s to generate API docs for a package. Ideally, I'd be able to see the API docs instantly for the top X/1000 packages given they have 80% or so of the total downloads.

To include this on NuGetGallery, I believe we will need a community proposal and to streamline the experience if possible for consumer traffic. Let's start with a community proposal first and that will help bring up questions, feedback, and excitement for this project.

@NeuroXiq
Author

Thanks for the response!

As an example, I generated docs for 29,000 package versions:
https://docs.dndocs.com/system/projects/29
https://docs.dndocs.com/system/projects/28
https://docs.dndocs.com/system/projects/27
https://docs.dndocs.com/system/projects/26
etc...

@JonDouglas :
Yes, thanks for that information. I created a proposal as a temporary solution.

https://github.com/NuGet/Home/pull/13702/files?short_path=fe85d7c#diff-fe85d7cdecffe87d3c00407a4468bb6d371ab50a6b7eba406e2ee8e6c9fafdfd

If the NuGet team implements documentation for every package, similar to learn.microsoft,
I think this will be very useful for other people. From time to time I want to have a
high-level view of a project (e.g. configs), but this is not possible right now. I really appreciate
the plans to introduce that feature and hope it will bring a better experience for all.

I proposed DNDocs because I was not aware of this, and currently there is no way
to display anything related to a NuGet package. Depending on how long it takes to introduce NuGet docs, DNDocs
can be considered as a temporary solution.

@joelverhagen
I created repository: https://github.com/NeuroXiq/src-dndocs

@JonDouglas
Contributor

Thank you kindly @NeuroXiq.

If you need help expanding your proposal, I'd gladly help expand the language before we evangelize it to the community.

I'll comment in that PR and get some team members to review.

5 participants