The 4 movies every data scientist should watch and how to use our skills responsibly.

Image from Pixabay.

Did you come here expecting to find Moneyball? This is going to be a bit different. In this article, I highlight 4 movies that expose the tragic effects that data science and data-related technologies are having today on our lives and our societies when they are not used responsibly.

Seriously? Data science has damaging effects on society? Yes, it does. Just a few weeks ago, the Wall Street Journal published a series of articles based on internal Facebook documents, which reveal a number of problems related to data misuse at Facebook. According to the reporting, human trafficking networks operate on Facebook’s platform; its ranking algorithm promotes misinformation; and Instagram, owned by Facebook, can be harmful to teenage girls (1, 2).

These problems are created by Facebook’s data collection practices, the way Facebook designs its ranking algorithms and the engagement metrics it uses, together with a collection of (poor) decisions that put profit above people. If this is not data science, then what is it?

Sadly, “The Facebook Files” is not the first report on data misuse and its damaging effects on our societies. Over the past few years, a number of books (3, 4), movies (5, 6, 7, 8), and organizations (9, 10) have raised awareness of these issues, highlighting how the misuse of data, combined with persuasive technologies, poor decision making and the lack of regulation, is threatening our health, our livelihoods and our societies.

Why does this matter for data scientists? We, data scientists, could be using our skills to help people and societies. There is certainly a lot to be gained from the use of data-related technologies. We could be helping teams, like in Moneyball. Or we could inflict real damage through the data products that we create, if we don’t use our skills responsibly. After all, companies are run by people, decisions are made by people, and algorithms and data products are designed by (some of) us, data scientists.

The movies

The following 4 movies are documentaries, dramas or a blend of both. They expose the damaging effects of the misuse of data and artificial intelligence on politics, health and different sectors of society. They reveal why the products inflict damage, and how companies ended up creating those products. And in some cases, they also offer suggestions on how to move forward.

The Social Dilemma

The Social Dilemma is perhaps the most famous of these documentaries, with 7 Emmy nominations and 2 wins this year. It reveals how social media platforms create addiction and manipulate people’s opinions and behaviors while spreading conspiracy theories, fake news and disinformation. These issues arise from the misuse of persuasive technologies designed to capture users’ attention and keep them on the platform for as long as possible. Algorithms are designed to maximize the time users spend on the platform and/or the amount of content they consume, regardless of the quality or accuracy of that content.
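To make the point about optimization targets concrete, here is a minimal sketch in Python. It is not how any real platform ranks content: the `Post` fields, the scores and the `min_accuracy` threshold are all hypothetical, made up purely for illustration. It only shows that the same set of posts can come out in a very different order depending on whether the ranking looks at predicted engagement alone or also takes content quality into account.

```python
from dataclasses import dataclass


@dataclass
class Post:
    title: str
    predicted_watch_minutes: float  # hypothetical engagement prediction
    accuracy_score: float           # hypothetical quality rating between 0 and 1


def rank_by_engagement(posts):
    """Order posts by predicted engagement only, ignoring quality."""
    return sorted(posts, key=lambda p: p.predicted_watch_minutes, reverse=True)


def rank_with_quality(posts, min_accuracy=0.5):
    """Drop low-accuracy posts, then order by engagement weighted by accuracy."""
    eligible = [p for p in posts if p.accuracy_score >= min_accuracy]
    return sorted(
        eligible,
        key=lambda p: p.predicted_watch_minutes * p.accuracy_score,
        reverse=True,
    )


# Made-up toy data, for illustration only.
posts = [
    Post("Miracle cure doctors hide", 9.0, 0.1),
    Post("Celebrity feud rumour", 6.0, 0.4),
    Post("Science documentary clip", 5.0, 0.8),
    Post("Local election explainer", 4.0, 0.9),
]

engagement_order = [p.title for p in rank_by_engagement(posts)]
quality_order = [p.title for p in rank_with_quality(posts)]

print("Engagement only:", engagement_order)
print("Quality-aware:  ", quality_order)
```

In this toy example the engagement-only ranking puts the least accurate post at the top, while the quality-aware ranking removes it altogether. The choice of what to optimize is a design decision, and it is one that we, data scientists, get to influence.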

Social media platforms compete for users’ attention, according to the movie, because this way they can generate profit through the paid advertising shown to users.

The film features interviews with many former employees and executives of big tech companies, like Google, Facebook and Twitter, who offer a first-hand look at what goes into algorithm design. You can watch the trailer here.

Coded Bias

The film Coded Bias is a multi-award-winning documentary that follows the journey of MIT Media Lab researcher Joy Buolamwini, who discovered that facial recognition algorithms do not properly recognize the faces of people of color and women. This finding shows that the artificial intelligence tools we use, which are thought to predict real life as closely as possible, do not actually do such a good job. Or at least not for everybody. In other words, many of the artificial intelligence tools that we use today discriminate: they are racist or sexist. And this has dramatic consequences for various sectors of our societies.
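One simple habit that helps surface this kind of problem is to report a model’s accuracy per group instead of a single overall number. The sketch below is a minimal illustration in plain Python, not the method used in the film or in Joy Buolamwini’s research. The function name `accuracy_by_group`, the group labels and the records are all made up for the example.

```python
from collections import defaultdict


def accuracy_by_group(records):
    """Compute accuracy separately for each group.

    records: iterable of (group, true_label, predicted_label) tuples.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, y_true, y_pred in records:
        total[group] += 1
        correct[group] += int(y_true == y_pred)
    return {group: correct[group] / total[group] for group in total}


# Made-up toy predictions, for illustration only.
records = [
    ("group_a", 1, 1), ("group_a", 0, 0), ("group_a", 1, 1), ("group_a", 0, 0),
    ("group_b", 1, 0), ("group_b", 0, 0), ("group_b", 1, 0), ("group_b", 1, 1),
]

overall = sum(y_true == y_pred for _, y_true, y_pred in records) / len(records)
per_group = accuracy_by_group(records)
gap = max(per_group.values()) - min(per_group.values())

print("Overall accuracy:", overall)
print("Accuracy per group:", per_group)
print("Largest gap between groups:", gap)
```

With this toy data the overall accuracy looks acceptable, yet the model is right every time for one group and only half the time for the other. A single aggregate metric can hide exactly the kind of disparity the film describes.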

The film explores the reasons behind the bias and its consequences across different sectors of the population, highlighting that vulnerable sectors of society are hit the hardest. You can watch the trailer here.

The Great Hack

The Great Hack is a documentary film about the Facebook-Cambridge Analytica data scandal, perhaps the biggest scandal before “The Facebook Files” that we mentioned earlier. The film shows how Cambridge Analytica’s misuse of data in targeted advertising campaigns disrupted politics in various countries, including the Brexit referendum in the UK and the 2016 elections in the US. Perhaps unsurprisingly, the film exposes the relationship between Cambridge Analytica and the social media giant Facebook, and how they shared users’ data without users’ awareness.

The film features a journalist from The Guardian who broke the story and a former employee from Cambridge Analytica who turned whistleblower. You can watch the trailer here.

Brexit: The Uncivil War

Brexit: The Uncivil War is a drama film based on the true story of the strategy behind the “Vote Leave” campaign ahead of the Brexit referendum that prompted the UK to leave the European Union. The film shows how the (mis)use of social media and the internet as a marketing tool, through micro-targeted advertising, led to the separation of the UK from the European bloc. The film features an outstanding performance by Benedict Cumberbatch as the mastermind behind the campaign strategy. You can watch the trailer here.

Can we prevent the misuse of data technologies?

The movies highlight the damaging impact that the misuse of data and data-related technologies is having on our livelihoods and our societies today. These problems stem from the lack of regulation around the use of data, which allows tech companies to do pretty much what they want. So the solution seems simple: we need a whole lot more regulation to prevent the misuse of these technologies, and to incentivize their use towards the well-being of people and communities, instead of, well… for profit.

Unfortunately, due to the enormous amount of money spent by big tech on lobbying (11, 12, 13, 14, 15), the implementation of new regulation is going to be a slow and painful process. In fact, Google, Facebook and Microsoft are the three biggest lobbying spenders in the European Union (11, 13), and Facebook and Amazon are now the two biggest corporate lobbying spenders in the US (15).

So it seems that, for the time being, it is down to us to try and do something to prevent the misuse of technology. After all, we, data scientists, are the ones who create these products. Aren’t we?

So, what can we do?

Most of the movies discussed here feature former employees of big tech companies who either raised their voices and were not heard, or decided to leave their company because they realized that the technologies they were creating were not serving people’s best interest. If we happen to be working towards creating damaging technologies, we could do the same. Let’s raise our voices, raise our concerns and try to steer product design towards a useful purpose.

Some former tech employees went on to found NGOs that fight for a humane use of technology, like the Center for Humane Technology, or for the accountable and equitable use of artificial intelligence, like the Algorithmic Justice League. Jump on to their websites and see how you can help.

In fact, the Center for Humane Technology has a lot of resources for people working on the design of data products, to help us build humane technology and steer the discussion within our companies. And speaking of steering discussions…

Let’s steer the discussion about data science and data related technologies.

You, like me, have probably read tons of articles on the internet saying that “Data Science is the sexiest job of the 21st century”, and that data science, machine learning and software engineering pay some of the highest salaries in the market. And I am not sure about you, but when I talk to (most of) my colleagues, it all seems to revolve around money, around using the latest technologies and around working for some of these “prestigious” (big tech) companies. Those seem to be the hallmarks of success. And very rarely do I hear discussions or reflections on whether what we do is actually meaningful or even useful at all.

So let’s steer the conversation. Let’s change the way we talk about data science. Instead of attaching success to the size of the paycheck or the company we work for, let’s think for a moment about whether what we are going to build with our skills, in that company and for that money, is actually useful. Are the data products improving society’s well-being in any way? Are we helping prevent crime? Improving health? Supporting healthy interactions? Are our algorithms fair?

And if we are not happy with the answers, why not kindly decline that job offer and move on?

I think it is important for us, data scientists, those who actually have the skills to create products that consume data and make decisions based on data, to understand that not every product is a good product, and that some can indeed be very harmful. We do have a say, and are therefore responsible for what we create. And we have this unique opportunity to create products that serve people and communities, and to abstain from creating products that are damaging or ethically irresponsible.

Regulation on how data and algorithms can and can’t be used will take a while to emerge, given the amount of money big tech companies spend on lobbying against it. Until then, it is down to us, the data scientists and machine learning engineers who actually have the skills to create those products, to use our skills responsibly, to say “no, I don’t want to be part of this”, and to put our skills to better use.

Are you on board?

References

1. The Facebook Files, articles.

2. The Facebook Files, a podcast series, podcast.

3. Don’t Be Evil, book.

4. Weapons of Math Destruction, book.

5. The Social Dilemma, movie.

6. Coded Bias, movie.

7. The Great Hack, movie.

8. Brexit: The Uncivil War, movie.

9. Center for Humane Technology, organization.

10. The Algorithmic Justice League, organization.

11. Big Tech Lobbying, article.

12. The 17 tech companies that lobby the government the most, article.

13. Big tech tops EU lobby spending, article.

14. Tech companies lobby for changes in IT rules, article.

15. Big tech, big cash, Washington’s new power players, article.

Lead Data Scientist, author of “Python Feature Engineering Cookbook”, instructor of machine learning at www.trainindata.com and developer of Python open-source.

Sole from Train in Data