UPDATE/SPLIT E.4. into Unintended use and Adversarial use #82

Open
Cedric-Garcia opened this issue Nov 7, 2018 · 0 comments
Overview

I suggest splitting E.4. into two components. It seems to me that the unintended use element combines two related but distinct concepts. As a result, users of the checklist may interpret it as one and miss the other, instead of considering both. I have defined the two below, given some hypothetical examples to make the distinction clearer, and listed suggested changes at the bottom.

  • Adversarial use: People with nefarious intent tricking the algorithm into malfunctioning.
  • Unintended use: People using the algorithm in a different context, with negative consequences.

The Tay Chatbot example seems to be an example of adversarial use to me, while the Deepfakes example is more of an unintended use according to the above definitions.

Hypothetical Examples

A company (say Tesla) develops an algorithm for a semi-autonomous vehicle, which is designed to be used on the highway, with users maintaining their hands on the wheel.

  • Adversarial use: An individual paints confusing patterns on the road, tricking the algorithm into making a bad turn and causing an accident.
  • Unintended use: A user uses the self-driving mode without keeping their hands on the wheel, and gets into an accident.

A company develops a 'resume review' algorithm designed to rank application resumes for programmers wanting to apply to the firm, in order to focus HR resources on top candidates.

  • Adversarial use: An applicant gets a copy of the algorithm and identifies a way to trick it, using cleverly placed keywords to ensure their resume is ranked near the top of the pile.
  • Unintended use: The HR department in the company finds the algorithm successful, and uses it to screen for positions other than the one it was trained on.
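To make the adversarial resume example concrete, here is a minimal hypothetical sketch (the scoring function and keyword list are invented for illustration, not taken from any real system) of how a naive keyword-matching ranker can be gamed by keyword stuffing:

```python
# Hypothetical naive resume scorer: counts matches against a fixed keyword list.
KEYWORDS = {"python", "distributed", "kubernetes", "machine learning"}

def score_resume(text: str) -> int:
    """Score a resume by how many target keywords appear in it."""
    lowered = text.lower()
    return sum(1 for kw in KEYWORDS if kw in lowered)

honest = "Five years of Python experience building distributed systems."

# Adversarial use: appending every keyword (e.g. as white-on-white text)
# pushes the resume toward the top regardless of actual experience.
stuffed = honest + " " + " ".join(KEYWORDS)

print(score_resume(honest))   # 2
print(score_resume(stuffed))  # 4
```

A scorer this simple is exactly the kind of model E.5 (as proposed below in this issue's suggested changes) would ask teams to probe before deployment.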

Suggested changes

Here is the suggested modification (happy to set up a pull request after we have some discussion here):

E.4. Unintended Use: Have we taken steps to prevent end-users from using our model in unintended ways, and do we have a plan to monitor for such uses once the model is deployed?

E.5. Adversarial Use: Have we explored which actions third parties could take to cause our algorithm to malfunction, and have we taken steps to limit this possibility and/or the consequences of such actions?
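One lightweight way to act on the proposed E.4 wording is a deployment-time guard that refuses (or flags) requests outside the context the model was validated for. This is only a hypothetical sketch; the role check and names are invented for illustration:

```python
# Hypothetical guard for E.4: the resume model was trained on one job
# family, so flag requests for any other role instead of silently scoring.
TRAINED_ROLE = "programmer"  # assumed role the model was validated on

def screen(resume, role, model):
    """Score a resume only when the request matches the intended context."""
    if role != TRAINED_ROLE:
        # Unintended use: log for monitoring rather than returning a score.
        print(f"warning: model not validated for role {role!r}")
        return None
    return model(resume)

toy_model = lambda text: 0.5  # stand-in for the real scorer

print(screen("...", "programmer", toy_model))  # 0.5
print(screen("...", "accountant", toy_model))  # None (flagged)
```

In practice the warning would go to a monitoring system rather than stdout, which gives the team the deployment-time signal that E.4 asks for.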
