When you consider all the implications of a world where AI rules supreme, are ethics even possible?
When you hear the term ‘artificial intelligence’ (AI) in popular culture, what do you think of? Some will think of ‘skin job’ rebellions as portrayed in Blade Runner and self-aware machines rising against their human creators, as depicted in the Terminator or Battlestar Galactica franchises. All excellent viewing fare, it must be said, but some of the early stages of AI development in the real world clearly violate Asimov’s Laws, reeking of surveillance capitalism, and violation of data privacy.
Things must be terrible when the likes of Google and Facebook take exception to Clearview’s scraping of the public internet for images to satisfy their law enforcement and government clients. What will they make of that photo of me defacing that statue of Kermit the Frog, a great humanitarian and role model? Is it really so difficult to use and develop AI in a way that does not ‘injure’ us by ignoring our privacy expectations?
The dream in AI or machine learning is to achieve the artificial equivalent of a human brain (like the positronic brain of Data in Star Trek: The Next Generation), but the technology needed is still a long way off despite the use of mainframes and swarm intelligence techniques. Of course, technology companies had to start somewhere, and so we entered the age of ‘recognition,’ whether it’s voice or speech, image or facial, and of course, pattern.
With these technologies came an associated proliferation of services and devices aimed at capitalizing on improvements in these areas. Whether it’s smart speakers, autonomous driving, or predictive analytics, all employ AI elements that rely on datasets for improvements. Moreover, most violate user privacy when building and refining these datasets with algorithms.
Companies could have paid volunteers to build data sets but didn’t, preferring instead to utilize their customer data to improve future products (and of course to monetize the data itself to generate marketing revenue). However, they will pay contractors to analyze the data they collect, further compromising data privacy. It’s no coincidence that Big Tech and penalties for breaching data privacy laws go hand in hand.
As for the victims of breaches, regulations (with few exceptions) do not compensate them, and class actions rarely yield amounts of more than double figures, hardly worth claiming. From a business perspective, it’s cheaper to violate privacy than to pay users to opt into datasets. I mean, billions in revenue forces them to reduce costs in this manner, doesn’t it?
Do all companies involved in AI development behave in this manner? Is it expected?
How AI Should Work
As a journalist, I have no requirement for big data, advanced analytics, image recognition, or automation. However, considering I frequently type (even now, in fact), I use a predictive typing tool to enhance productivity, i.e., save some time. Given that AI is involved, its relevance to this post is clear. I’ve chosen a traditional interview format (with my questions in bold and answers in italics), in the interest of completeness.
I spoke with Guy Katabi, founder & CEO at Lightkey Sources LTD., a global provider of predictive typing solutions with headquarters in Israel, to see how they developed AI. Initially, he pointed out his passion for finding ways to use technology to make a positive impact on people’s lives, drawing on more than 12 years’ experience in software engineering, user experience (UX), and machine learning.
Bearing in mind that I’m not a software developer, can you briefly outline your software development process, with examples where security and privacy are incorporated and where AI is necessary?
The first version of Lightkey took about 24 hours to build. It was a simple window with very primitive text prediction capabilities. Turning this simple window to a viable product that supports 85 languages using multiple algorithmic layers was an exciting and challenging journey that can be broken into three main areas:
1. User Experience
2. Privacy and Security
Putting aside the usual privacy and security concerns, such as using data encryption, secure transport, etc., one of the major decisions we’ve made right from the start was NOT to collect our users’ content. Instead, we challenged ourselves to find creative ways to sort through the enormous amount of publicly available data on the web to create our datasets. Moreover, we wanted to build Lightkey to be an offline solution so that it could match the strictest cybersecurity requirements.
3. Prediction Technology
Imagine that a group of personal typing assistants are working for you. Each assistant has a unique perspective on your daily work. One of them is an expert on your typing history, the other is an expert on the topic you’re currently writing about, and the last one is an expert on the English language. Now, with every keystroke, each assistant states its point of view about what might be your next character, word, phrase, or sentence. Then, they negotiate as a group to find the best available answer, and if they succeed, a suggestion will appear on your screen. Otherwise, they will not offer any suggestion and will keep on negotiating and learning from your next input.
Behind the scenes, these assistants represent text prediction algorithms that are built using multiple disciplines, including Artificial Intelligence (AI), Natural Language Processing (NLP), and Information Retrieval Methods (IR).
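To make the ‘negotiating assistants’ idea above a little more concrete, here is a minimal Python sketch. It is not Lightkey’s implementation, which the interview does not describe in detail; the `BigramPredictor` class, the `suggest` function, the averaging rule, and the 0.4 threshold are all assumptions chosen purely for illustration. Each hypothetical assistant offers probabilities for the next word, the opinions are averaged, and a suggestion is shown only if the group is confident enough.

```python
from collections import Counter, defaultdict


class BigramPredictor:
    """Hypothetical 'assistant': guesses the next word from bigram counts."""

    def __init__(self, corpus: str):
        self.counts = defaultdict(Counter)
        words = corpus.lower().split()
        for prev, nxt in zip(words, words[1:]):
            self.counts[prev][nxt] += 1

    def predict(self, prev_word: str) -> dict[str, float]:
        """Return a probability per candidate next word (empty if unknown)."""
        seen = self.counts.get(prev_word.lower())
        if not seen:
            return {}
        total = sum(seen.values())
        return {word: n / total for word, n in seen.items()}


def suggest(predictors, prev_word, threshold=0.4):
    """Average all assistants' opinions; stay silent below the threshold."""
    combined = Counter()
    for predictor in predictors:
        for word, prob in predictor.predict(prev_word).items():
            combined[word] += prob / len(predictors)
    if not combined:
        return None  # nobody has an opinion
    word, score = combined.most_common(1)[0]
    return word if score >= threshold else None


# Three assistants: typing history, current topic, general language.
history = BigramPredictor("best regards guy best regards team")
topic = BigramPredictor("best practices for privacy best practices for security")
language = BigramPredictor("best regards and best wishes")
assistants = [history, topic, language]

print(suggest(assistants, "best"))  # enough agreement to suggest a word
print(suggest(assistants, "for"))   # split opinions, so no suggestion
```

Averaging is only one possible way to ‘negotiate’; a real product might weight each assistant differently or learn those weights over time.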
Is it fair to say that any AI project requires data sets to facilitate future improvement?
Having high-quality datasets is critical to delivering relevant text predictions. When faced with the decision whether to build these datasets from our users’ data or to build them while sorting through petabytes of publicly available information, we decided on the latter. We strongly believe in a privacy-first approach that can be accessible to any kind of user, ranging from individual home users to the most regulated organizations that follow the strictest cybersecurity standards.
Do you have any comments on the failings of Big Tech (Google, Facebook, Amazon, Clearview, Apple, and Microsoft) when gathering data for image and speech recognition? All have invaded user privacy by outsourcing to contractors or snooping.
According to a recent study, more than 99% of the “terms and conditions” that users are accepting when using the services of large tech companies are unreadable for the average person. I think that creating data regulations such as GDPR and sounding the alarm when tech companies abuse their data collection practices are important, but it is still, to some extent, a reactive approach.
Providing proactive solutions is far from trivial in the current dynamics, and I personally think that there are two interesting problems to solve here:
The first is to decode the “terms and conditions” during the user’s onboarding into more accessible, human-readable language, giving the user a clear understanding of what information they are sharing with service providers. As a result, legal loopholes or obscure legal language in a data collection policy will be negatively communicated to the end users, scaring them away from the service and in turn forcing the service providers to be more explicit and measured when collecting or sharing their users’ data. In order for this to work, regulators must make this tool a mandatory part of the onboarding process.
The second is possibly creating an independent platform that can rank every service provider based on the clarity of terms and conditions, their cybersecurity standards, and their ability to follow these standards based on previously known breaches they’ve had. Adding this rank as a key part of the decision-making process will help users get a better sense of whether or not to trust the provider with their data and hopefully push the service providers to improve their data collection practices.
Note: DuckDuckGo’s Privacy Essentials is worth checking out to control personal data online.
To summarize, we have a big gap in terms of decoding terms and conditions and trusting that they will be actually respected by the service providers. If we could tackle these issues with proactive solutions that will add clarity and transparency, I think that we will experience a substantial improvement in the aspects of privacy and security.
How are user privacy and data protected when using Lightkey?
First and foremost, the users’ content is not submitted to the cloud, as Lightkey is a local solution. Second, the locally learned typing patterns and user settings are all stored locally in an encrypted format. Third, we follow industry best practices and standards.
Any observations on Lightkey’s future plans and on the future of ethical AI in the industry?
Our journey is about mitigating a deeply rooted pain. We strive to enable people to express their thoughts at the native speed in which they’re formed. Therefore, our future plans include machine-learning related developments that will push the boundaries of predictive typing without taking away creativity and privacy. Similarly, we’re constantly looking at new ways in which organizations can harness their collective strength to increase their value and maximize productivity.
Being able to generalize from data using AI’s deep learning capabilities is a great thing, as long as the data that is used to teach the machine is authentic, clean from biases, and overall balanced. The problem is that it is hardly the case, and unfortunately, we find out the problems in machine-generated models only after the damage has already been done. Creating clean and unbiased datasets is a very interesting field to research going forwards, and I think there is a lot of room for innovation here.
Lightkey’s approach to AI is a non-intrusive example to others who claim they are seeking to provide solutions that benefit humanity. Big Tech campaigns for AI self-regulation are laughable, as I’ve just demonstrated how easy it is to develop AI in an ethical manner. As for the Catholic Church and their input on AI ethics… what do they have to do with technical innovation or ethics? I’m all in favor of priests wearing ankle monitors and body cams when interacting with our children, and I guess AI/predictive analytics could be used to predict where priests will be moved when the next scandal occurs.
It is governments that need to regulate AI (without weaponizing it) to pad their coffers with the resulting penalties, and users need to control and be more conscious of what they put online, making every effort to protect their data and rejecting products and services with biased terms and conditions. Use end-to-end encryption where possible. If Lightkey can use AI ethically, it’s possible for Big Tech companies with deep pockets to do the same. What do you think? Are you happy with how AI is progressing?