Many years ago, as a grad student, I studied and wrote some AI (coded, if you will). The data sets back then were pretty logical and binary. However, they could be ambiguous. That is to say, inferences were made based on hard logic, and of course you as the programmer decided how you were going to weigh, or favor, data such as values that were out of range. The goal, of course, was to drive a correct conclusion. This stuff eventually became the fault detection algorithms used in a plethora of applications: the fault diagnostic tools used in cars to isolate a bad sensor, for example. Just to add a little color here, these AI systems have been around for 30 years and can still get it wrong. Rarely, but it still happens. Roll the hands of time forward a few decades and we have some influencers pointing out a very old axiom: Garbage In = Garbage Out. GIGO, if you're old enough to remember that acronym. Today we are applying the same approaches to far more complex and data-rich environments. Rich in volume, but not necessarily accurate or even of known accuracy. Read on.
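For the curious, here is a rough sketch of what "weighing" out-of-range values can look like in a simple fault detection routine. It is not the code I wrote back then; the sensor names, the ranges and the 0.1 weight are all made up for illustration. The point is that a person picks those numbers.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    sensor: str   # hypothetical sensor name
    value: float  # what the sensor reported
    low: float    # lowest plausible value (chosen by the programmer)
    high: float   # highest plausible value (chosen by the programmer)


def weight(reading, out_of_range_weight=0.1):
    """Full trust for in-range values, reduced trust otherwise.

    The 0.1 default is an arbitrary illustrative choice.
    """
    in_range = reading.low <= reading.value <= reading.high
    return 1.0 if in_range else out_of_range_weight


def weighted_estimate(readings):
    """Weighted average of the readings, discounting out-of-range ones."""
    total = sum(weight(r) for r in readings)
    return sum(weight(r) * r.value for r in readings) / total


def suspect_sensors(readings):
    """Names of the sensors whose values fell outside their plausible range."""
    return [r.sensor for r in readings if weight(r) < 1.0]


readings = [
    Reading("temp_a", 90.0, 0.0, 150.0),
    Reading("temp_b", 92.0, 0.0, 150.0),
    Reading("temp_c", 400.0, 0.0, 150.0),  # out of range: likely a bad sensor
]

print(suspect_sensors(readings))
print(weighted_estimate(readings))
```

Notice that the bad reading still pulls the estimate upward, because I chose to discount it rather than throw it out. Change the weight or the range and you change the conclusion, which is exactly the kind of programmer judgment that is baked into the result.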
If you data mine bad or misleading inputs, you will come up with misleading results. AI is not some prescient wonder worker. On the contrary, it has become a sophisticated aggregator with rules written by programmers that steer the algorithms to favor certain results. Test it for yourself: take politically dubious statements and see what kind of a chat you come up with. There's your first data point. Filtered (some would say censored) results are common practice, and even if liberal boundary conditions are applied, bad inputs (even scholarly articles that were bent towards funding and not truth) yield crazy bad results.
Kind of an obvious conclusion for those in the trenches, but a revelation for the mystified general public and, I suppose, for the equally mystified oligarchy that was hoping to replace people with software...
That play has already been staged, with disastrous results, in various places. It's just that the folks in the trenches don't get a lot of air time to make that clear. It's good that people with a microphone are finally listening and talking. Taking any code (AI or otherwise) and putting complete trust in its veracity is for the naive or ignorant. Applying it to real problems without rigorous testing - same. Actually, reckless.
In order for AI to give consistently meaningful results, it needs truthful, accurate and correctly represented data. LOL, today's world is filled with a lot of garbage. I had to laugh when I watched a TV "news program" give us a speech on the concerns about misinformation as an AI result. Now let's see: if the truth is counter to the narrative of the day... that is often labeled misinformation. If the current state of the art is wrong and AI simply echoes that - it really is misinformation. And if you are promulgating propaganda, AI will be a very wild animal to tame! You probably need your own data set of whacked-out versions of facts for that. I'm sure somebody is working on that right now.
All of the terabytes of mouse clicks and e-mail responses... all of the URL traffic and those downloaded papers from those sketchy, corporately funded "scientific journals"... none of this has a truth detector or a label that holds anybody accountable for fabricating results. Unlike the German beer purity law! LOL, good luck with the whole AI thing. I'm sure it will be useful at times, absolutely wrong on occasion and simply worthless at times as well. In a gross sense: untrustworthy. Let's not forget AI is written by people.
Let's see publication transparency: programmer names, funding sources and descriptions of what data was used and what programmed weighting factors and filters were applied. Were there barred names and/or words used in the algorithms? Were filters that assign a weighting factor applied? All of that should appear on a warning label to be read and agreed to at the entry point to the website, or printed on the front page of any results that are printed off. Maybe then you or your institution can be a judicious consumer of the conclusion a certain AI program comes up with. Then if it's harmful or wrong, you have recourse to a legal remedy. Hmmmm, there's an idea! How about a disclaimer on the evening news as to whether any Photoshopped materials were used? How about digitally enhanced video or images? LOL, yup, lots to be fixed to get this right. We want to feed the AI good stuff, don't we? How much unvetted, unvalidated stuff are we consuming without an AI generation disclosure today? LOL... ah, the mind wanders.
AI is a software tool. It is written by people, it data mines what people write and publish and what people click on, and it makes assertions based on that pile of inputs. It reads sensors that have ranges and various accuracies, and all of it can go bad in various ways. Its developers attempt to incorporate additional functions like facial recognition, speech spectral nuances... and tons of other stuff from the IoT. How much of this is really well understood as it pertains to a distilled result? Methinks AI is still in its infancy, with lots of vulnerabilities, but I'm sure it will progress as a tool. It will get "cleaned up"; perhaps it will give consistently correct results for narrow applications. But what release of the software will you be experiencing? How will you know? Maybe the legal terms and conditions can slip a low confidence level into the EULA for the cheap release, with a giant jump in user fees for the "pro" good stuff? LOL, marketing gold.
What if the memory the program and data set reside in gets corrupted? Hacked into, or simply electronically damaged? There are a lot of DLLs that can wander into RAM and cause mayhem. Would that be like a digital stroke, LOL? Or maybe a nefarious data set mod that is indeed a well-targeted act of harm... how would that be safeguarded against? Keep those jump drives away! Could you ever cede human decision making in high-stakes applications to software with complete trust, considering all of the ways software and hardware can be attacked?
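One partial answer to the tampering question is an integrity check: record a cryptographic digest of the data set while you still trust it, then compare before each use. Below is a minimal sketch using Python's standard hashlib. The data set contents are made up for illustration, and I am assuming the trusted digest is stored somewhere an attacker can't also rewrite.

```python
import hashlib
import hmac


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_untampered(data: bytes, expected_digest: str) -> bool:
    """True if the data still matches the digest recorded when it was trusted."""
    return hmac.compare_digest(sha256_of(data), expected_digest)


# Hypothetical data set: sensor ranges stored as CSV text.
original = b"sensor_id,low,high\ntemp_a,0,150\n"
trusted_digest = sha256_of(original)  # recorded while the data was known good

# A nefarious mod: someone quietly widens the allowed range.
tampered = original.replace(b"150", b"950")

print(is_untampered(original, trusted_digest))
print(is_untampered(tampered, trusted_digest))
```

This only tells you the data changed since you recorded the digest. It says nothing about whether the data was garbage to begin with, and it does nothing for corruption that happens in RAM after the check.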
Perhaps a simple financial transaction and the role of "order taker" at a fast food place is an easy and overdue target for an expert system (weak AI with speech synthesis), but moving much beyond that carries substantial liability risks that informed owners should be careful about accepting. Alpha, Beta and Gamma test the heck out of the tool before you adopt it! Then realize that, like any software, it has a useful life dependent on upgrades, operating systems, hackers... not an Easy Button when you look at the whole picture. Looks like an AI insurance policy is just what the actuaries need to get going on. Could be a new revenue line...