First Do No Harm
New EU data regulations limit automated algorithmic decision making.
In 2018, new data regulations in the European Union will limit algorithmic decision making at the level of the individual user. The laws will also give individuals the right to demand an explanation of algorithmic decisions made about them.
The General Data Protection Regulation (GDPR) provides comprehensive guidelines for the collection, storage and use of personal information of EU citizens. These laws come after a number of landmark court cases involving data privacy that highlighted data-related grey areas, including the right to be forgotten, rights to restrict processing, and the collection and transfer of personal data by foreign companies. While these regulations don’t yet apply in Africa, they raise significant questions that businesses and data scientists around the world should be considering.
We’ll explore three major subjects arising from these laws in this blog.
Automated Decision Making
Within data use, the GDPR specifies guidelines for “automated individual decision-making, including profiling”, under which individuals have a right not to be subject to a decision when the decision is based solely on automated processing and has a significant effect on the individual. Exclusions to this right apply where a decision is necessary for entering into a contract, is authorized by law, or is based on explicit consent. In all cases, exclusions included, individuals have the right to obtain human intervention in the decision and to contest the decision.
Many widely used data science methods fall under the definition of automated individual decision-making. These include recommendation engines, customer segmentation, credit assessments and online targeted advertising. The GDPR is likely to have practical implications for the use and deployment of automated decision methods because companies will have to implement processes for human intervention and hire data controllers for these new roles.
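To make the rule in this section concrete, here is a minimal sketch of how a company might route decisions between an automated pipeline and a human reviewer. It is our simplified reading of the conditions described above, not a legal test. The `Decision` fields, the `route` function and the two route names are all hypothetical, and treating contract necessity, legal authorization and explicit consent as three separate exclusions is an assumption of the sketch.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    """Hypothetical description of one decision about one individual."""
    solely_automated: bool            # no human is involved in the decision
    significant_effect: bool          # e.g. a loan refusal
    necessary_for_contract: bool = False
    authorized_by_law: bool = False
    explicit_consent: bool = False
    human_review_requested: bool = False


def route(decision: Decision) -> str:
    """Return "automated" or "human_review" for a decision (illustrative only)."""
    # The individual can always ask for human intervention.
    if decision.human_review_requested:
        return "human_review"

    # The right only applies to automated decisions with a significant effect.
    right_applies = decision.solely_automated and decision.significant_effect
    if not right_applies:
        return "automated"

    # Assumed exclusions: contract necessity, legal authorization, consent.
    excluded = (
        decision.necessary_for_contract
        or decision.authorized_by_law
        or decision.explicit_consent
    )
    return "automated" if excluded else "human_review"


if __name__ == "__main__":
    loan_refusal = Decision(solely_automated=True, significant_effect=True)
    print(route(loan_refusal))
```

Even in this toy version, every branch that ends in "human_review" implies a person and a process on the other side, which is where the practical cost of compliance is likely to sit.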
Profiling and Discrimination
The GDPR describes profiling as any type of automated processing intended to evaluate individuals based on their personal aspects. More specifically, it includes analyzing or predicting an individual’s economic circumstances, political positions, ethnic origins, movements, location, performance at work, religious beliefs, health or personal preferences. The regulations seek to promote fair, transparent processing using appropriate statistical methods, and companies will need to ensure that any inaccuracies are corrected and that their profiling processes prevent discrimination.
Discrimination in some form is inherent in most profiling exercises. Where inequality exists, it will be embedded in the data and reflected in profiling outcomes. Simply removing sensitive variables such as race from a profiling model does not ensure that the model outcomes are non-discriminatory. Other variables, such as product history, could correlate with the sensitive variables, so the performance of the profiling process is not significantly altered when the sensitive variables are excluded. As profiling datasets grow more complex, excluding sensitive variables becomes possible, but simple correlations become less clear and the complex interactions between non-sensitive variables could play a greater role in determining evaluation outcomes. However, more complex models lead to other issues under the GDPR, as we discuss below.
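The proxy effect is easy to demonstrate with a small simulation. The sketch below uses entirely synthetic data: a hypothetical binary sensitive attribute called `group` and a hypothetical `product_history` score whose average differs between the two groups. The decision rule looks only at `product_history`, yet approval rates still differ sharply by group. The group sizes, the shift of 2.0 and the threshold of 1.0 are assumptions chosen for illustration.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs) ** 0.5
    spread_y = sum((y - mean_y) ** 2 for y in ys) ** 0.5
    return cov / (spread_x * spread_y)


def make_synthetic_customers(n=2000, seed=0):
    """Generate (group, product_history) pairs; all values are made up."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        group = rng.randint(0, 1)  # hypothetical sensitive attribute
        # The proxy: its average is shifted by 2.0 for group 1.
        product_history = rng.gauss(2.0 * group, 1.0)
        rows.append((group, product_history))
    return rows


def approval_rates(rows, threshold=1.0):
    """Approve on product_history alone; group never enters the rule."""
    rates = {}
    for g in (0, 1):
        scores = [history for grp, history in rows if grp == g]
        rates[g] = sum(s > threshold for s in scores) / len(scores)
    return rates


rows = make_synthetic_customers()
correlation = pearson([g for g, _ in rows], [h for _, h in rows])
rates = approval_rates(rows)

print(f"correlation between group and product_history: {correlation:.2f}")
print(f"approval rate, group 0: {rates[0]:.2f}")
print(f"approval rate, group 1: {rates[1]:.2f}")
```

The sensitive variable has been "removed" from the model in the sense that the rule never reads it, but the outcome is still strongly tied to it through the correlated variable.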
The Right to Know
Under the GDPR, individuals will be notified about data collected on them and have the right to access this information. The GDPR also states that when profiling takes place individuals have the right to “meaningful information about the logic involved”.
Machine learning methods assess correlations, associations and patterns in data that result in a certain outcome. They do not, however, explain why these patterns exist. Explaining complex models with hundreds of predictor variables can be extremely challenging. A potential positive consequence of transparent algorithmic decision making is that it empowers individuals to understand why certain decisions were made about them, so they can change their behaviour for a better future outcome.
An example of this is credit scoring in South Africa, where a single number, the credit rating, affects almost every financial decision made about South Africans. A poor credit rating makes it very difficult, if not impossible, to obtain a bank loan or any form of credit, including a home loan. According to a Lamna blog posted in January 2016, close to 50% of South Africa’s 19 million active consumers are three months or more in arrears on their loan repayments. This raises the question: “How many consumers are made aware of and understand what makes up the credit score?” Perhaps, if they were better informed, they could look at improving their credit score.
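As a rough illustration of what “meaningful information about the logic involved” could look like for a credit score, the sketch below uses a simple additive scorecard and reports which factors pulled an applicant's score down the most. Everything in it is hypothetical: the feature names, weights, baselines and base score are invented for the example and do not reflect how any real credit bureau or lender calculates a score.

```python
# Hypothetical additive scorecard. All names and numbers are made up.
BASE_SCORE = 650.0
WEIGHTS = {
    "months_in_arrears": -40.0,    # points per month behind on repayments
    "credit_utilisation": -100.0,  # points per unit of limit used (0 to 1)
    "years_of_history": 8.0,       # points per year of credit history
}
BASELINE = {
    "months_in_arrears": 0.0,
    "credit_utilisation": 0.3,
    "years_of_history": 5.0,
}


def score_with_reasons(applicant):
    """Return the score, per-feature contributions, and negative factors.

    Each contribution is weight * (value - baseline), so it shows how far
    a feature moved the score away from the base score.
    """
    contributions = {
        name: WEIGHTS[name] * (applicant[name] - BASELINE[name])
        for name in WEIGHTS
    }
    score = BASE_SCORE + sum(contributions.values())
    # Factors that lowered the score, most damaging first.
    reasons = sorted(
        (name for name, value in contributions.items() if value < 0),
        key=lambda name: contributions[name],
    )
    return score, contributions, reasons


applicant = {
    "months_in_arrears": 3,
    "credit_utilisation": 0.8,
    "years_of_history": 7,
}
score, contributions, reasons = score_with_reasons(applicant)

print(f"score: {score:.0f}")
for name in reasons:
    print(f"{name} lowered the score by {-contributions[name]:.0f} points")
```

An explanation of this kind is only straightforward because the model is additive. For the complex models discussed above, producing an equally clear breakdown is much harder.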
Obstacles to transparency include the fear that people may find ways to cheat the system, the reluctance of companies to share their methods (which would become accessible to everyone, including their competitors) and the difficulty of explaining the methods to nontechnical people.
Winners in the data economy
The data economy is likely to shift under these new regulations, and data practitioners who can adapt and find opportunities in this new landscape will thrive. While the new regulations do not yet apply in most African countries, this could change. Astute data scientists and companies will factor global concerns around discriminatory profiling into their automated algorithmic decision making.