bias in AI – Dataconomy

Why we need to open up AI black boxes right now

Kim Deen — Thu, 23 May 2019 13:49:45 +0000

AI has a PR problem. Too often it has introduced itself as a misogynistic, racist and sinister robot. Remember Tay, the Microsoft Twitter chatbot that was learning to mimic online conversations but then started to blurt out highly offensive tweets?

Think of tech companies creating elaborate AI hiring tools, only to realise that the technology, trained on data from a male-dominated industry, had learned to favour men's resumes over women's. As much as this seems like a facepalm situation, it happens a lot and is not easy to solve in an imperfect world, where even the most intelligent people have biases.

“Data scientists are capable of creating any sort of powerful AI weapons”,
Romeo Kienzler, head of IoT at IBM and frequent speaker at Data Natives.

“And no, I’m not talking about autonomous drones shooting innocent people. I’m talking about things like a credit risk score assessment system not giving a young family a loan for building a home because of their skin color.”

These ethical questions have set off alarm bells at government institutions. The UK government set up a Centre for Data Ethics and Innovation, and last month the Algorithmic Accountability Act was proposed in Washington. Last year the European Union created an expert group on artificial intelligence to draw up Ethics Guidelines for Trustworthy Artificial Intelligence.

IBM had a role in creating these guidelines, which are crucial according to Matthias Biniok, lead Watson Architect DACH at IBM, who designed CIMON, the smiling robot assisting astronauts in space. “Only by embedding ethical principles into AI applications and processes can we build systems that people can trust,” he says.

“A study by IBM’s Institute of Business Value found that 82% of enterprises are now at least considering AI adoption, but 55% have security and privacy concerns about the use of data.”
Matthias Biniok, lead Watson Architect DACH at IBM

AI can tilt us to the next level – but only if we tilt it first.

“Artificial intelligence is a great trigger to discuss the bias that we have as humans, but also to analyse the bias that was already introduced into machines,” Biniok says. “Loans are a good example: today it is often not clear to a customer why a bank loan is granted or not. Even the bank employee might not know why an existing system recommended against granting a loan.”

It is essential for the future of AI to open up the black boxes and get insight into the models.

“The issue of transparency in AI occurs because of the fact that even if a model has great accuracy, it does not guarantee that it will continue to work well in production”
Thomas Schaeck, IBM’s Data and AI distinguished engineer, a trusted portal architect and leader in portal integration standards.

An explainable AI model should give insight into the features on which decision making is based, to be able to address the problem.
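One common way to get that kind of insight is permutation importance: shuffle one feature at a time and see how much the model's accuracy drops. The sketch below is a minimal, library-free illustration under stated assumptions; the loan model, the feature names (`income`, `postcode`) and the data are hypothetical and are not taken from any IBM product.

```python
import random


def accuracy(model, rows, labels):
    """Fraction of rows for which the model's prediction matches the label."""
    hits = sum(1 for row, y in zip(rows, labels) if model(row) == y)
    return hits / len(rows)


def permutation_importance(model, rows, labels, n_features, repeats=10, seed=0):
    """Mean drop in accuracy when each feature column is shuffled on its own."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    importances = []
    for j in range(n_features):
        drops = []
        for _ in range(repeats):
            # Shuffle only column j, leaving every other feature untouched.
            column = [row[j] for row in rows]
            rng.shuffle(column)
            shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(rows, column)]
            drops.append(baseline - accuracy(model, shuffled, labels))
        importances.append(sum(drops) / repeats)
    return importances


# Hypothetical toy loan model: decides on income alone and ignores postcode.
def toy_model(row):
    income, postcode = row
    return 1 if income >= 50 else 0


rows = [[20, 1], [35, 2], [48, 1], [52, 2], [60, 1], [75, 2], [90, 1], [30, 2]]
labels = [toy_model(r) for r in rows]  # labels the toy model fits perfectly

importances = permutation_importance(toy_model, rows, labels, n_features=2)
print(dict(zip(["income", "postcode"], importances)))
```

A feature the model never reads scores exactly zero here. In a real audit, a surprisingly high score for a feature that acts as a proxy for a protected attribute would be the signal worth investigating.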

IBM Research has therefore proposed AI factsheets to better document how an AI system was created, trained, tested, deployed and evaluated. These factsheets should be audited throughout the system's lifecycle and would also include suggestions on how the system should be operated and used. “Standardizing and publishing this information is key to building trust in AI,” says Schaeck.

Schaeck advises business owners looking to invest in AI to take a holistic view of the data science and machine learning life cycle. His advice is to choose your platform wisely: pick one that allows teams to gain insights and move a significant number of models into tightly controlled, scalable production environments. “A platform, in which model outputs and inputs are recorded and can be continuously monitored and analysed for aspects like performance, fairness, etc,” he says.
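As a rough illustration of what monitoring logged outputs for fairness can look like, the sketch below computes a disparate impact ratio from a list of prediction records. The record layout (`group`, `approved`), the group names and the data are hypothetical; the 0.8 alert threshold follows the widely used "four-fifths" rule of thumb and is an assumption here, not something the platform described above prescribes.

```python
from collections import defaultdict


def favorable_rates(records, group_key="group", outcome_key="approved"):
    """Share of favorable outcomes per group in the logged predictions."""
    totals = defaultdict(int)
    favorable = defaultdict(int)
    for record in records:
        totals[record[group_key]] += 1
        if record[outcome_key]:
            favorable[record[group_key]] += 1
    return {group: favorable[group] / totals[group] for group in totals}


def disparate_impact(records, unprivileged, privileged):
    """Ratio of favorable-outcome rates: unprivileged group over privileged group."""
    rates = favorable_rates(records)
    if rates[privileged] == 0:
        raise ValueError("privileged group has no favorable outcomes")
    return rates[unprivileged] / rates[privileged]


# Hypothetical log of model decisions: 4 of 5 approved in group A, 2 of 5 in group B.
log = (
    [{"group": "A", "approved": True}] * 4
    + [{"group": "A", "approved": False}] * 1
    + [{"group": "B", "approved": True}] * 2
    + [{"group": "B", "approved": False}] * 3
)

ratio = disparate_impact(log, unprivileged="B", privileged="A")
needs_review = ratio < 0.8  # assumed threshold (four-fifths rule of thumb)
print(f"disparate impact: {ratio:.2f}, needs review: {needs_review}")
```

Running a check like this on a schedule over fresh logs, rather than once before launch, is what turns a one-off audit into continuous monitoring.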

IBM’s AI Fairness 360 toolkit, Watson Studio, Watson Machine Learning and Watson OpenScale can help with this. The open-source AI Fairness 360 toolkit can be applied to an AI model before it goes into production; it bundles a range of state-of-the-art bias detection and mitigation algorithms. Watson Studio allows teams to visualize and understand data and to create, train and evaluate models. In Watson Machine Learning, these models can be managed, recorded and analyzed. And because it is essential to keep monitoring AI throughout its lifecycle, Watson OpenScale connects to Watson Machine Learning and the resulting input and output log data in order to continuously monitor and analyze in-production models.

Yes, it can all be frightening. As a business owner, you don’t want to end up wasting a lot of time and resources creating a Frankenstein AI.

But it is good to keep in mind that just as our human biases are responsible for creating unfair AI, we also have the power to create AI which mitigates, or even transcends human biases. After all, tech is what we make of it.

If you would like to know more about the latest breakthroughs in AI, cloud and quantum computing, and get hands-on with blockchain, Kubernetes, Istio, serverless architecture or cognitive application development in an environment supported by IBM experts, then join the Data & Developers Experience event taking place on June 11-12 at Bikini Berlin. Register here, it’s free.

Not Accounting for Bias in AI Is Reckless

Alyssa Simpson Rochwerger — Sat, 11 May 2019 16:48:29 +0000

I’ll never forget my “aha” moment with bias in AI. I was working at IBM as the product owner for Watson Visual Recognition. We knew that the API wasn’t the best in class at returning “accurate” tags for images, and we needed to improve it.

I was nervous about the possibility of bias creeping into our models. Bias in Machine Learning (ML) models is the exact sort of problem the ML community has seen time and again, from poor facial recognition of diverse individuals to an AI beauty pageant gone awry and countless other instances. We looked long and hard at the data labels we used for our project and, at first blush, everything seemed fine.

Just prior to launch, a researcher on our team brought something to my attention. One of the image classifications that had trained our model was called “loser.” And a lot of those images depicted people with disabilities.

I was horrified. We started wondering, “what else have we overlooked?” Who knows what seemingly innocuous label might train our model to exhibit inherent or latent bias? We gathered everyone we could — from engineers to data scientists to marketers — to comb through the tens of thousands of labels and millions of associated images and pull out everything we found objectionable according to IBM’s code of conduct. We pulled out more than a handful of other classes that didn’t reflect our values.

My “aha” moment helped avert a crisis. But I also realize that we had some advantages in doing so. We had a diverse team (different ages, races, ethnicities, geographies, experience, etc.) and a shared understanding of what was and wasn’t objectionable. We also had the time, support, and the resources to look for objectionable labels and fix them.

Not everyone who is building an ML-enabled product has the resources of the IBM team. For teams without the advantages we had, and even for organizations that do, the prospect of unwanted bias looms. Here are a few best practices for teams of any size as they embark upon their ML journey. Hopefully they help avoid unintended negative consequences like those we almost experienced.

1. Define and narrow the business problem you’re solving

Trying to solve for too many scenarios often means you’ll need a ton of labels across an unmanageable number of classes. Narrowly defining the problem at the start will help you make sure your model performs well at the exact task you built it for.

For example, if you’re creating a computer vision model that answers a fairly straightforward question, like “Is this a human?”, you need to define what you mean by “human.” Do cartoons count? What if the person is partially occluded? Should a torso count as “human” for your model? This all matters. You need clarity on what “human” means for this model. If you’re unsure, ask people the same question about your data. You might be surprised by the ambiguities present and the assumptions you made going in.

One way to help define your scope is by considering the information you use for your model. Even academic datasets like ImageNet can have classes and labels that introduce unintended bias into your algorithms. The more of your data you understand and own and can map back to the business problem you’re solving, the less likely you are to be surprised by objectionable labels.

2. Gather a diverse team that asks diverse questions

We all bring different experiences and ideas to the workplace. People from diverse backgrounds–not just race and gender, but age, experience, etc.–will inherently ask different questions and interact with your model in different ways. That can help you catch problems before your model is in production.

Building a diverse team also means gathering data in a way that allows for different opinions. There are often multiple valid opinions or labels for a single datapoint. Gathering those opinions and accounting for legitimate, often subjective, disagreements will make your model more flexible.
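One simple way to account for that disagreement is to keep every annotator's vote and turn the votes into a label distribution instead of forcing a single answer. The sketch below is a minimal illustration; the image IDs, the label names and the 0.75 agreement threshold are hypothetical choices made for the example.

```python
from collections import Counter


def soft_label(annotations):
    """Turn a list of annotator votes into a label -> share-of-votes mapping."""
    counts = Counter(annotations)
    total = len(annotations)
    return {label: n / total for label, n in counts.items()}


def agreement(annotations):
    """Share of annotators who chose the most common label."""
    return max(soft_label(annotations).values())


# Hypothetical votes from four annotators on the question "Is this a human?"
votes = {
    "img_001": ["human", "human", "human", "human"],
    "img_002": ["human", "not_human", "human", "not_human"],  # e.g. a cartoon
}

# Items below the assumed agreement threshold go to a person for review.
contested = [item for item, v in votes.items() if agreement(v) < 0.75]

for item, v in votes.items():
    print(item, soft_label(v))
print("needs review:", contested)
```

Contested items are often the ones that expose an unclear definition or an assumption the team made going in, so they are worth reading rather than discarding.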

3. Think about all of your end users

Likewise, understand that your end users won’t simply be like you or your team. Be empathetic. Anticipate how people who aren’t like you will interact with your technology and what problems might arise in their doing so.

With this in mind, it’s important to remember that models rarely remain static. One of the worst mistakes you can make is deploying your model without a way for end users to give you feedback on how the model is performing in the real world.

You’ll want to keep humans as part of your process to react to changes, edge cases, instances of bias you might’ve missed, and more. You want to get feedback from your model and give it feedback of your own to improve its performance, iterating constantly towards higher accuracy.

4. Annotate with diversity

When you use humans to annotate your data, it’s best to draw from a diverse pool. Don’t use students from a single college or even labelers from one country. The larger the pool, the more diverse your viewpoints. That can really help reduce bias.

After all, this is where bias is often hidden. A few years back, researchers at the University of Washington and the University of Maryland found that doing an image search for certain jobs revealed serious underrepresentation and bias in results. Search “nurse,” for example, and you’d see only women. Search “CEO” and it was all men.

Having people of diverse backgrounds annotate data will help ensure your team asks different questions, thinks about different end users, and, hopefully, creates a technology with some empathy in mind.

Accounting for Bias Is Paramount for Good AI

Knowing what I know now, I’d argue it’s both negligent and reckless to launch an AI system into production without accounting for bias with these basic best practices. Remember: it’s not impossible to reduce unwanted bias in your models. It takes some grit and hard work, sure, but it comes down to being empathetic, iterating throughout the model building and tuning processes, and taking great care with your data.