docs: privacy strategy document #343

Closed
wants to merge 11 commits into from

Contributor

@EazyAl EazyAl commented Aug 17, 2023

No description provided.

@EazyAl EazyAl changed the title Privacy ali Privacy Strategy Document Aug 17, 2023
@EazyAl EazyAl changed the title Privacy Strategy Document doc: Privacy Strategy Document Aug 18, 2023
@EazyAl EazyAl changed the title doc: Privacy Strategy Document doc: privacy strategy document Aug 28, 2023
Signed-off-by: EazyAl <[email protected]>
Signed-off-by: EazyAl <[email protected]>
Signed-off-by: EazyAl <[email protected]>
@SdgJlbl
Contributor

SdgJlbl commented Aug 29, 2023

I think there was a problem during the rebase: some commits are now duplicated between main and your branch.

@@ -128,3 +128,4 @@ Some quick links:
additional/release.rst
additional/faq.rst
additional/glossary.rst
additional/privacy-strategy.rst
Contributor

I think I would personally rather put it under the "What is Substra" section, rather than hidden at the bottom. But happy to be challenged.

Contributor Author

I think this is quite specific and not key to what Substra is. Not everyone who uses Substra would be keen to read this.

Signed-off-by: EazyAl <[email protected]>
Signed-off-by: EazyAl <[email protected]>
Signed-off-by: EazyAl <[email protected]>
Signed-off-by: EazyAl <[email protected]>
Signed-off-by: EazyAl <[email protected]>
Signed-off-by: EazyAl <[email protected]>
Contributor

@SdgJlbl SdgJlbl left a comment

Thanks a lot for the PR.

The URL and some list items are not displaying correctly, so I've made some suggestions.
You can check the rendering here https://owkin-substra-documentation--343.com.readthedocs.build/en/343/additional/privacy-strategy.html

Please don't forget to add a line in the changelog :)

docs/source/additional/privacy-strategy.rst (outdated)

Here we elaborate on what PETs are and provide a quick summary of the most popular ones. We then look at the types of attacks that can result in data leakage or data theft, and propose best practices for project governance and security to mitigate those risks. We also elaborate on the collaboration required between different personas to ensure data integrity and safe handling of data.

As pointed out by Katharine Jarmul in her (primer on Privacy Enhancing Technologies)[https://martinfowler.com/articles/intro-pet.html], “Privacy is a technical, legal, political, social and individual concept.” As such, technical solutions are an important part of the answer, but they must be used in conjunction with good data governance and reliable security measures.
Contributor

Suggested change
As pointed out by Katharine Jarmul in her (primer on Privacy Enhancing Technologies)[https://martinfowler.com/articles/intro-pet.html], “Privacy is a technical, legal, political, social and individual concept.” As such, technical solutions are an important part of the answer, but they must be used in conjunction with good data governance and reliable security measures.
As pointed out by Katharine Jarmul in her [primer on Privacy Enhancing Technologies](https://martinfowler.com/articles/intro-pet.html), “Privacy is a technical, legal, political, social and individual concept.” As such, technical solutions are an important part of the answer, but they must be used in conjunction with good data governance and reliable security measures.

Contributor

@SdgJlbl SdgJlbl Aug 31, 2023

Well, of course, it's RST 🤦
So we need the stupid syntax for external URLs:

`primer on Privacy Enhancing Technologies <https://martinfowler.com/articles/intro-pet.html>`_

Contributor

(yes, backticks + a trailing underscore)

@EazyAl EazyAl changed the title doc: privacy strategy document docs: privacy strategy document Aug 30, 2023
Signed-off-by: EazyAl <[email protected]>
@EazyAl EazyAl marked this pull request as ready for review August 30, 2023 13:09
@EazyAl
Contributor Author

EazyAl commented Aug 30, 2023

I have asked for review - @SdgJlbl , can we still add co-contributors?

@SdgJlbl
Contributor

SdgJlbl commented Aug 31, 2023

> I have asked for review - @SdgJlbl , can we still add co-contributors?

Yes, I think we will add the co-authors to the squash commit during the merge.

Comment on lines +10 to 11
- New page added on Substra Privacy Strategy based on research by Privacy Task Force at Owkin.

Contributor

Suggested change
- New page added on Substra Privacy Strategy based on research by Privacy Task Force at Owkin.
- New page added on Substra Privacy Strategy based on research by Privacy Task Force at Owkin([#343](https://github.com/Substra/substra-documentation/pull/343))

Contributor

I think you're missing a space between Owkin and the opening bracket :x

We touch on a **few** of the main technologies that are making collaborative data sharing possible today in ways that can be considered more secure.

**Federated Learning**:
Federated Learning allows Machine Learning models to be sent to servers where they can train and test on data without having to ever move the data from its original location. This idea however is not restricted to machine learning and is then referred to as Federated Analytics. The Substra software enables both Federated Learning and Federated Analytics.
Contributor

Suggested change
Federated Learning allows Machine Learning models to be sent to servers where they can train and test on data without having to ever move the data from its original location. This idea however is not restricted to machine learning and is then referred to as Federated Analytics. The Substra software enables both Federated Learning and Federated Analytics.
Federated Learning allows Machine Learning models to be sent to servers where they can train and test on data without having to ever move the data from its original location. This idea however is not restricted to machine learning and is then referred to as Federated Analytics. Substra enables both Federated Learning and Federated Analytics.

#. 3. The external world (outside of the network) contains malicious actors. We make no assumptions about any external communication and we aim at limiting as much as possible our exposure to the outside world.
#. 4. Models are accessible by data scientists in the network (with the right permissions). The data scientist is responsible for making sure that the trained model exported does not contain sensitive information enabling, for example, membership attacks. (explained below)
#. 5. Every organization in the network is a responsible actor. Every organization hosts its own node of the Substra network, and is responsible for ensuring minimal securitization of their infrastructure. Regular security audits and / or certifications are recommended.
#. 6. In this document the focus is on protecting data rather than models — thus we do not cover Byzantine attacks [Fang, M., Cao, X., Jia, J., & Gong, N. (2020). Local model poisoning attacks to {Byzantine-Robust} federated learning] and backdoor attacks [Bagdasaryan, E., Veit, A., Hua, Y., Estrin, D., & Shmatikov, V. (2020, June). How to backdoor federated learning.].- which are in a category of attacks that affect the quality of the generated model as opposed to compromising the data.
Contributor

Formatting of the bibliographic references

@ThibaultFy ThibaultFy closed this Sep 4, 2023