1t Community Diligence Review of Storify-Data-Pathway Allocator #178

Open
shizhigu opened this issue Sep 29, 2024 · 6 comments

Comments

@shizhigu

Latest report: https://compliance.allocator.tech/report/f03012494/1726531323/report.md

Since the allocator began operating, a total of 2 clients have applied: one for public datasets and one for enterprise public data. For a first-time allocator operator, the operation is running well overall, with data decentralization and Spark retrieval continuing to improve.

When a client does not operate well, we intervene and give the client room to explain. If there is no reasonable explanation, we shut the application down directly.

The allocator's operating model will be improved further in the future:

  1. more types of datasets will be encouraged to join the Filecoin network;
  2. KYC processes and information will be improved;
  3. Spark retrieval will improve gradually.

Please feel free to review, and let us know if there are any questions. Thanks!

@filecoin-watchdog

Allocator Application
Compliance Report

Example 1
There is no information about data preparation. The client did not provide it, and the allocator did not ask additional questions to clarify this issue.
Did the allocator perform KYC check?
The dataset this client wants to store on Filecoin has already been stored several times. Did the allocator ask what the purpose of storing the same data again is? (GitHub search results)
There is inconsistent information about the total amount of DC requested versus the expected dataset size (5 PiB vs 16.4 PiB).

Only 2 SPs (out of 4 provided) were used for making deals. The retrieval rate is 0%.
The allocator did react to the poor report for this client but did not get an answer.

Example 2
The client requested 15 PiB and declared creating 10 replicas of the 950 TiB dataset. 9.5-10 PiB would be enough, so why did the client ask for 15 PiB?
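The arithmetic behind this question can be checked with a short sketch. It assumes binary units (1 PiB = 1024 TiB) and ignores any packing overhead; the function name `replica_footprint_pib` is illustrative and not part of any Filecoin tooling.

```python
TIB_PER_PIB = 1024  # assumption: binary units, 1 PiB = 1024 TiB


def replica_footprint_pib(dataset_tib: float, replicas: int) -> float:
    """Raw size of all replicas in PiB, ignoring any packing overhead."""
    return dataset_tib * replicas / TIB_PER_PIB


# Figures taken from the review: 950 TiB dataset, 10 replicas, 15 PiB requested.
needed = replica_footprint_pib(dataset_tib=950, replicas=10)
requested = 15.0

print(f"needed:      {needed:.2f} PiB")
print(f"requested:   {requested:.2f} PiB")
print(f"unexplained: {requested - needed:.2f} PiB")
```

With 1 PiB = 1024 TiB the raw footprint comes to about 9.28 PiB; the 9.5 PiB figure corresponds to treating 1 PiB as 1000 TiB. Either way, more than 5 PiB of the request is not explained by replica count alone.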

For this client, the allocator did better: additional questions on data preparation and SP locations were asked. KYB was performed.

5 out of 23 SPs don’t have a retrieval rate, 2 have <5%, and the rest (16) range between 26% and 100%. Overall, retrieval is good.
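As a quick sanity check on those counts, the breakdown can be tallied as below. This is only a sketch using the three bucket counts quoted above; the bucket labels are paraphrased, and per-SP retrieval rates are not available in this thread.

```python
# SP counts per retrieval bucket, as quoted in the review above.
buckets = {
    "no retrieval rate": 5,
    "below 5%": 2,
    "26% to 100%": 16,
}

total = sum(buckets.values())

# Share of SPs in each bucket, as a fraction of all SPs.
shares = {label: count / total for label, count in buckets.items()}

for label, count in buckets.items():
    print(f"{label}: {count}/{total} ({shares[label]:.0%})")
```

The three buckets add up to the 23 SPs reported, and roughly seven in ten SPs fall in the 26% to 100% range.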

The allocator regularly checks client reports and intervenes in case of poor results.

@shizhigu
Author

shizhigu commented Oct 8, 2024

Many thanks to @filecoin-watchdog for the mixed review.

Example 1

On balance, this application was indeed not used very well, but we intervened effectively throughout the process. On 26 June, after the first round of 512 TiB had been used, the bot report showed that everything was fine, and logically 1 PiB should have been issued in the second round. However, our review found that there were only two SPs, so we made a limited intervention and kept the second round at 512 TiB.
image
image
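The decision described above can be sketched as a simple rule. This is a simplification under stated assumptions, not the allocator's actual tooling: it assumes the next round would normally double the previous one (512 TiB to 1 PiB, as in the text) and that the round is held flat when fewer SPs are used than the client declared. The function name `next_tranche_tib` is hypothetical.

```python
def next_tranche_tib(previous_tib: int, sps_used: int, sps_declared: int) -> int:
    """Size of the next DataCap round in TiB.

    Assumed rule: double the previous round, unless the client used fewer
    SPs than declared, in which case hold the round flat.
    """
    if sps_used < sps_declared:
        return previous_tib  # limited intervention: no increase
    return previous_tib * 2


# Case from this thread: 512 TiB first round, 2 of 4 declared SPs used.
second_round = next_tranche_tib(previous_tib=512, sps_used=2, sps_declared=4)
print(f"second round: {second_round} TiB")

# What the round would have been had all declared SPs been used.
compliant_round = next_tranche_tib(previous_tib=512, sps_used=4, sps_declared=4)
print(f"if all SPs were used: {compliant_round} TiB")
```

Holding the round flat rather than refusing it outright matches the "limited intervention" described: the client keeps working, but does not gain a larger allocation until distribution improves.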

Having extended trust in line with Filecoin's principle of maximising trust, we were shocked to find that the second round of 512 TiB did not show any significant improvement. We are no longer proceeding with new approvals and have asked the client for a reasonable explanation, but have not received any response so far. We plan to stop approving the application request and close it manually by the end of the month.
image
image

Regarding the KYC check, we adopt a different KYC strategy for each application. This is our first top-up application, so the KYC was relatively simple in June, but we will improve it going forward.
image

Regarding the fact that this dataset has been stored several times, we believe that public datasets relevant to human development deserve to be stored repeatedly, and this is the first time these two SPs have stored it.

@shizhigu
Author

shizhigu commented Oct 8, 2024

Example 2
As @watchdog replied, this application is well used overall and has many worthwhile aspects, such as a more developed KYC process covering data preparation, SP disclosure, business licence, etc.
image
image

On balance, the Spark retrieval rate was also good. Where the Spark retrieval rate was low, we asked about the reason.
image
image

We pay timely attention to how the application is used and intervene promptly when poor use is detected.
image

Regarding the DataCap application quota, we consulted many SPs for professional advice. They told us that there is a conversion rate between the original data size and the DataCap consumed, so we follow this approach and inform clients of it. Please check the following link for more details:
filecoin-project/filecoin-plus-large-datasets#1592
image
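To make the conversion-rate argument concrete, the sketch below computes the ratio that would be needed if the whole 15 PiB request were explained by conversion alone. The actual rate quoted by the SPs is not stated in this thread, so this shows only the implied figure; it assumes 1 PiB = 1024 TiB, and `implied_conversion_factor` is an illustrative name.

```python
TIB_PER_PIB = 1024  # assumption: binary units, 1 PiB = 1024 TiB


def implied_conversion_factor(requested_pib: float, dataset_tib: float, replicas: int) -> float:
    """Ratio of requested DataCap to the raw size of all replicas."""
    raw_tib = dataset_tib * replicas
    return requested_pib * TIB_PER_PIB / raw_tib


# Figures from Example 2: 15 PiB requested, 950 TiB dataset, 10 replicas.
factor = implied_conversion_factor(requested_pib=15, dataset_tib=950, replicas=10)
print(f"implied DataCap per unit of raw data: {factor:.2f}x")
```

Comparing this implied ratio with the rate described in the linked discussion is one way to judge whether the 15 PiB request is proportionate.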

@shizhigu
Author

shizhigu commented Oct 8, 2024

In summary, this is our own overall assessment of our allocator's operations, with good aspects and areas that still need improvement. For the allocator's subsequent operations, we will continue to improve in the following respects:

  1. encourage more types of datasets to join the Filecoin network, aiming for 3-5;
  2. make the KYC process more detailed;
  3. gradually improve Spark retrieval;
  4. where a client operates improperly, give the client an opportunity to explain; if there is no improvement, limit approvals as an intervention or manually close the application directly.

I hope to get 10 PiB of support, and I hope the official team will clearly see the progress of our allocator's operation in the next review. Of course, if anything is inappropriate, we are willing to keep adjusting to comply with the official rules.

Thank you.


@Kevin-FF-USA Kevin-FF-USA added Diligence Audit in Process and removed Awaiting Response from Allocator Further information is requested labels Oct 8, 2024
@galen-mcandrew
Collaborator

Overall we see evidence of mixed compliance, with the allocator investigating and intervening with clients, but most of this allocator's history is with a sole client. Checking this updated compliance report, and their largest client, we are seeing some issues of duplicate data with the SPs. Hopefully the allocator can investigate and provide some additional details, support, or intervention with their client.

In addition to that, we agree with the areas pointed out by this allocator:

  1. encourage more types of datasets to join the Filecoin network, aiming for 3-5;
  2. make the KYC process more detailed;
  3. gradually improve Spark retrieval;
  4. where a client operates improperly, give the client an opportunity to explain; if there is no improvement, limit approvals as an intervention or manually close the application directly.

Going forwards, we hope that the allocator continues to intervene quickly and build trust over time with clients. We will request an additional 10PiB of DataCap from RKH, to allow this allocator to show increased diligence and alignment.

@shizhigu
Author

@galen-mcandrew Thank you for your patience in reviewing and for the pertinent comments. We will improve the allocator's operation based on these suggestions and look forward to showing our progress next time. Thanks again.
