Machine Learning Gone Wrong

```{admonition} But just because you can, doesn't mean you should.
:class: warning
The classic citation for this argument is from Jurassic Park.
```


**There are many examples of ML applied badly, and the practitioners I talk to spend a lot of time keeping their data science teams from replicating some notable breakdowns:**

- [Google Flu Trends](https://gking.harvard.edu/files/gking/files/0314policyforumff.pdf) consistently overpredicted flu prevalence
- IBM's Watson tried to predict cancer. How'd it go? According to internal documents: "This product is a piece of sh–."
- Amazon's engineers used ML to evaluate applicants but taught the model [that males were automatically better](https://www.theguardian.com/technology/2018/oct/10/amazon-hiring-ai-gender-bias-recruiting-engine)
- Chatbots have had many struggles. Here's Microsoft's [attempt at speaking like the youths](https://medium.com/asecuritysite-when-bob-met-alice/machine-learning-gone-bad-990e132024ea): <br>
    <img src=https://www.lexalytics.com/lexablog/wp-content/uploads/2018/05/TayChatbotFail-300x260.jpeg width="400">
- ML/AI methods replicate patterns in the data by design: [If you give it data with human biases, then the AI can easily become biased.](https://arxiv.org/pdf/1608.07187.pdf) This has led to debates about how to use ML for:
    - Criminal sentencing, where ["risk predictions"](https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing) overweight race
    - [Online advertising](https://arxiv.org/abs/1301.6822), where Google is more likely to serve up arrest records in searches for names assigned "primarily to black babies"
- Google will intelligently stitch together photos, but I guess Google's AI thought the guy was built like a mountain (congrats, I guess?): <br>
    <img src=https://i.imgur.com/60fTgCg.jpg width="400">
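
To make the bias bullet above concrete, here is a minimal sketch of how a model picks up bias from its training labels. Everything in it is an assumption made for illustration: the data is synthetic, the groups "A" and "B", the hiring bars (0.5 and 0.7), and the helper names (`make_applicant`, `fit`, `predict`) are hypothetical, and the plain-Python logistic regression is a stand-in, not the method used in any of the cases listed above.

```python
import math
import random

random.seed(0)

# Synthetic, made-up "historical hiring" records (not a real dataset).
# Skill is independent of group, but past decisions held group B to a
# higher bar (0.7) than group A (0.5). That is the human bias in the labels.
def make_applicant():
    skill = random.random()
    in_group_b = 1.0 if random.random() < 0.5 else 0.0
    bar = 0.7 if in_group_b else 0.5
    hired = 1.0 if skill > bar else 0.0
    return skill, in_group_b, hired

data = [make_applicant() for _ in range(500)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, skill, in_group_b):
    """Predicted probability of being hired."""
    w_skill, w_group, intercept = weights
    return sigmoid(w_skill * skill + w_group * in_group_b + intercept)

def fit(records, steps=500, lr=1.0):
    """Logistic regression fit by plain batch gradient descent."""
    weights = [0.0, 0.0, 0.0]
    n = len(records)
    for _ in range(steps):
        grad = [0.0, 0.0, 0.0]
        for skill, group, label in records:
            err = predict(weights, skill, group) - label
            grad[0] += err * skill
            grad[1] += err * group
            grad[2] += err
        weights = [w - lr * g / n for w, g in zip(weights, grad)]
    return weights

weights = fit(data)

# Two applicants with identical skill who differ only in group membership.
score_a = predict(weights, 0.6, 0.0)
score_b = predict(weights, 0.6, 1.0)
print(f"weight on group B membership: {weights[1]:.2f}")
print(f"P(hire | skill=0.6, group A) = {score_a:.2f}")
print(f"P(hire | skill=0.6, group B) = {score_b:.2f}")
```

The model is never told to discriminate. It is only asked to reproduce past decisions, so it learns a negative weight on group B membership and scores the group B applicant lower than an equally skilled group A applicant.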

**Common problems with analysis (ML or otherwise)**

```{image} img/data_fallacies_to_avoid.jpg
:alt: flowchart
:width: 800px
```

```{note}
The good news is that these problems can be avoided. Understanding how is something we will defer until we have a better understanding of the methods and processes we will follow in an ML project.
```