How do you start your day? You might read the daily news on your smartphone as suggested by Google Now. You listen to music or podcasts through applications like Spotify, Jio Saavn, or Google Podcasts, where you see recommendations for the weekly top playlists. You check your social media apps like Facebook or Twitter, and surprisingly, ads for products that you were chatting about with a friend just last night appear on your screen. You might be planning a trip, but within a few hours or a day, you see a price change.
This trip planning doesn’t end here: you start getting notifications about weather, price deals, top places to visit in the area, and so on. Doesn’t this all sound so cool? Yes. You just start something, and technology keeps you hooked. What is happening behind the scenes? That is an important question to ask. All these applications are using Artificial Intelligence (AI) and Machine Learning (ML) algorithms to train/teach their models (imagine a model as a robot, for simplicity) based on data that we provide through our searches, preferences, likes and dislikes, and much more that we are not aware of. As a result, this robot works under the hood and shows us everything that it thinks we might be interested in.
An algorithm is a set of instructions designed to perform a specific task. You might be wondering why we are talking about the bias or fairness of algorithms when bias is common among humans, not computers. Right? How can a computer (an algorithm, specifically) learn to take a side or discriminate between majority and minority groups, when it is all based on math and logic? Shouldn’t we consider algorithms neutral? The answer is a bit complicated. To understand this further, we will discuss some examples that highlight the problem and then propose some solutions to fix the bias. We’ll start with a technical example and then walk you through some business ones.
Do some of you remember the time when Facebook didn’t allow you to create an account because of an issue with your username and asked for proof of ID? Facebook used an automatic algorithm to classify usernames as ‘real’ or ‘fake’. This ended up blocking many real accounts too, and the episode was termed nymwars. Identifying a white American name is easy in comparison to identifying a unique name that belongs to an underrepresented ethnic group or culture. This means that the identification logic that works for the majority might be invalid for the minority group. Can we now conclude that there was some bias in the algorithm? Yes! The real problem is in the data used to train the algorithm (robot) that categorizes the usernames. These algorithms can’t be precise in their outcomes if they are trained with a smaller dataset. Thus, the minorities that don’t make it into big data will always be victims of unfairness.
Sadly, this is what happens most of the time. This problem of the skewed dataset and the corresponding algorithms can be fixed by the programmers (the nerdy folks who write the algorithms) when they implement an algorithm such as a username classifier (a tool to sort usernames). Instead of focusing only on the bulk of the data when testing and validating the algorithm, developers must write algorithms that handle all possible cases, including the minority ones.
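One way to make this kind of skew visible during testing is to report accuracy separately for each group instead of a single overall number. The sketch below is only an illustration under stated assumptions, not the method Facebook used: the records, the group labels, and the function name `accuracy_by_group` are all hypothetical.

```python
from collections import defaultdict

def accuracy_by_group(records):
    """Return {group: accuracy} for (group, true_label, predicted_label) records."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, truth, predicted in records:
        total[group] += 1
        if truth == predicted:
            correct[group] += 1
    return {group: correct[group] / total[group] for group in total}

# Hypothetical test results from a username classifier. Every name here is
# real, but the classifier wrongly flags some of them as fake.
records = (
    [("majority", "real", "real")] * 95
    + [("majority", "real", "fake")] * 5
    + [("minority", "real", "real")] * 6
    + [("minority", "real", "fake")] * 4
)

# A single overall score hides what happens to the smaller group.
overall = sum(truth == predicted for _, truth, predicted in records) / len(records)
per_group = accuracy_by_group(records)
```

With these made-up numbers, the overall accuracy is about 92%, which looks fine, while the accuracy on the minority group is only 60%. A validation step that checks `per_group` rather than `overall` would catch the problem.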
Even when a developer tries to generalize the code so that it is fair to everyone, it still misses a lot of cases. We suggest that programmers focus on special cases and try to reduce unfairness as much as possible until it is all gone. Fighting unfairness, rather than increasing fairness, is the key here!
Now let’s take you straight to the business world - ads and recommendation systems.
Ad Algorithms: Can you believe that women were less likely to see ads for a career coaching service for “$200k+” executive positions in India? Further, Google ads have a race angle too. These startling findings might have left you flabbergasted. Let’s continue a bit more to see how ads varied between white Americans and African Americans. Researchers have found that ads for background check services were more likely to appear alongside a search for names traditionally associated with African Americans (e.g. DeShawn) than for names associated with white Americans (e.g. Geoffrey). This disparity can be attributed to ad algorithms, which rely on past search data: they concluded that someone searching for “DeShawn” is more likely than someone searching for “Geoffrey” to click on an arrest-related ad. (This is completely in sync with the algorithms’ ultimate target, which is to generate revenue for Google.) Here, advertisement algorithms are reflecting the human biases in the datasets that brands collect, and the algorithms continuously learn from that data.
Let’s imagine, for example, that a brand wants to predict whether it should target someone with an advertisement. Let’s also imagine that in the past, this brand hasn’t targeted ads to the LGBTQ community. This gap will be present in the brand’s dataset, and therefore AI algorithms are likely to conclude that people from the LGBTQ community are less likely to purchase the brand’s products and should not be targeted with those ads. Though this has not been empirically tested, ad algorithms may similarly be discriminating among Indian audiences based on caste (with individuals in the upper echelons of the caste system being targeted with a different set of ads than their lower-caste counterparts). Sounds scary?
For fair advertising, algorithms can remove the human biases present in historical training data, but only by adding noise to the data. In layman’s terms, noise here means including diverse opinions and a broad representation in the training data, i.e. the data on which the algorithms are trained. Noise is the key here!
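The article does not say how this broader representation should be achieved. One simple technique, swapped in here purely for illustration, is random oversampling: rows from underrepresented groups are repeated until every group is equally represented in the training data. The data, the field names, and the function `oversample_to_balance` below are hypothetical.

```python
import random
from collections import Counter, defaultdict

def oversample_to_balance(rows, group_key, seed=0):
    """Repeat randomly chosen rows from smaller groups until every group
    has as many rows as the largest one. Original rows are always kept."""
    rng = random.Random(seed)
    by_group = defaultdict(list)
    for row in rows:
        by_group[row[group_key]].append(row)
    target = max(len(members) for members in by_group.values())
    balanced = []
    for members in by_group.values():
        balanced.extend(members)
        # Draw the missing rows (with replacement) from the same group.
        balanced.extend(rng.choices(members, k=target - len(members)))
    return balanced

# Hypothetical ad history: group "B" was rarely targeted in the past.
history = [{"group": "A", "clicked": i % 2} for i in range(90)] + [
    {"group": "B", "clicked": i % 2} for i in range(10)
]

balanced = oversample_to_balance(history, "group")
counts = Counter(row["group"] for row in balanced)
```

After rebalancing, both groups contribute 90 rows each. Repeating rows adds no new information about the smaller group, so this is at best a stopgap; collecting genuinely more diverse data is closer to what the paragraph above describes.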
Recommendation algorithm: Let’s move to the recommendation algorithms. Researchers have suggested that recommender systems are biased and deviate from users’ preferences over time. The recommendations tended to be either more diverse than what users were interested in, or too narrow and focused on a few items. Further, the algorithms caused female user profiles to edge closer to the male-dominated population, resulting in recommendations that deviated from female users’ preferences (suggesting to them services that male consumers prefer).
As a parallel example, a relatively small group of superstar artists on music platforms (like Gaana, Spotify, and Saavn) is expected to receive a large portion of attention and digital space, while the majority of artists in the long tail receive very little attention and space. This can give an undue and unfair advantage to the minority of superstar artists over the less popular majority.
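This concentration of attention can be measured directly. The sketch below computes the share of all plays that goes to the most-played artists; the play counts are invented and the function name `top_share` is hypothetical, so treat it as an illustration rather than how any of these platforms measures popularity.

```python
def top_share(play_counts, top_fraction=0.1):
    """Fraction of all plays that goes to the most-played `top_fraction` of artists."""
    ranked = sorted(play_counts.values(), reverse=True)
    # Always count at least one artist in the "top" group.
    top_n = max(1, round(len(ranked) * top_fraction))
    return sum(ranked[:top_n]) / sum(ranked)

# Hypothetical play counts: 2 superstars and 18 long-tail artists.
plays = {f"star_{i}": 4000 for i in range(2)}
plays.update({f"tail_{i}": 100 for i in range(18)})

share = top_share(plays, top_fraction=0.1)
```

In this made-up catalogue, the top 10% of artists (2 out of 20) collect about 82% of all plays. Tracking a number like this over time is one way to see whether recommendations are widening or narrowing the gap.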
Similarly, Zomato recommending a paneer tikka masala over a tofu and whole-grain sandwich is optimizing the firm’s profits at the cost of the consumer’s health (assuming the consumer’s objective is to not give in to cravings and to look after her health). In all these examples, recommendation algorithms are optimizing the recommendation decisions for a single, private objective, i.e. to increase the applications’ profit, engagement, and clicks. But one important point is that these decisions have serious side effects: repeatedly recommending popular products that maximize interactions and clicks may serve the firm’s objectives, but can also harm individuals or society (by gradually forcing women to adapt to men’s preferences and by not giving a fair chance to less popular artists).
This article is an attempt by the authors to bring awareness to the impact that algorithms have on our lives. Building on scientific research, the authors propose a few solutions that firms can incorporate to reduce the unfairness associated with algorithms.
About the authors: Jubi Taneja is a PhD Student at the University of Utah, USA and Anuj Kapoor is Assistant Professor (Marketing) at IIM Ahmedabad, India. Anuj can be reached at firstname.lastname@example.org