Study Hall

Necessary Clarity: Artificial Intelligence In Professional Audio & The World Around Us

A conversation with Dr. Kelvin King, an assistant professor of Digital Misinformation at Syracuse University’s School of Information Studies, that lends clarity to several aspects of AI.

During a recent editorial meeting, LSI/PSW editor Keith Clark asked me whether I’d given any thought to the role AI (Artificial Intelligence) plays in the editorial process, and floated the idea that it might be fun for an AI to match its (artificial) wits with a very human technical editor. Up to that point, my experience with AI had been limited to infuriatingly circular arguments with tech support chatbots.

I began my relationship with ChatGPT by asking it about three common audio misconceptions (Do line arrays create cylindrical waves? Are underpowered amplifiers more dangerous to high-frequency drivers? Does phantom power blow up ribbon microphones?), all of which have recently been explained in the pages of this publication. I wasn’t particularly surprised to learn that in all three cases, the chatbot gave the commonly repeated – yet factually inaccurate – answer.

Although perhaps entertaining, there’s little editorial value in this exercise. It does, however, raise bigger, more important issues: Where do AIs get their information from? What is the use of an AI that gives incorrect answers and repeats misinformation? How do we as professionals navigate an industry and society where AI is increasingly prevalent?

Dr. Kelvin King of Syracuse University.

In considering these issues, I realized that I had no idea how any of this works, and needed to turn to an expert. I reached out to Dr. Kelvin King, who is an assistant professor of Digital Misinformation at Syracuse University’s School of Information Studies and has published extensive research on the diffusion of information on social media. He graciously agreed to talk with me about these issues.

The LSI/PSW editorial team would like to thank Dr. King for lending his time and expertise to this interview. Readers may contact him and learn more about his work at https://ischool.syr.edu/kelvin-king/. Note: this interview has been edited for clarity and condensed for length.

Michael Lawrence: So maybe we can start with how this AI actually works. To a layperson, it seems like it’s literally just reading the internet and repeating back a bunch of stuff that it’s seeing – is that what it’s doing?

Kelvin King: Close, but not entirely. So let’s talk a bit about what AI is, right? It’s not there to replace human intelligence. It’s there to augment human intelligence. Right now, a lot of people use it wrong. They assume that whatever the AI says is right. If we look at what’s happening with the real-world development of these tools – in most organizations, about eighty percent of everything related to AI development gets stalled. Because of one thing – trust. People don’t trust the AI.

And that’s because when we deal with machine learning, we have a trade-off. We find a lot of bias, errors, and variance in the models. The way people think about these Large Language Models (LLMs) is that they’re databases, or search engines, but they’re not. We’re talking about machine learning models that look at large amounts of text to predict the next output, based on the data they have and based on prompts – prompts from you or me. And there can be a lot of errors based on the type of data they collected.

For example, some of these LLMs are trained on Wikipedia. And a lot of the information on Wikipedia may not be entirely accurate. I have a video where I show that a huge number of edits on Wikipedia pages are made by bots. So think about it – an AI can edit Wikipedia, and then feed itself the same information. So you get even more error.

Thinking of it as a “database” isn’t exactly right – I’ll use another example: think about compressing an image file into a .JPEG on your phone. You lose some information as part of that process. The LLM has to fill in those blanks – and it’s really not a great word for it, but the accepted term is “hallucinate.” And it might do this by producing sentences that are non-factual, or nonsensical, or contradictory…
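Editor’s note: To make this point concrete, here is a minimal Python sketch of what “predicting the next output” means. It uses a toy two-sentence corpus and simple word-pair counts; a real LLM works at vastly larger scale, but the principle is the same: stored statistical patterns, not stored facts.

```python
# A toy "language model": it stores statistical patterns (word-pair counts),
# not facts, and it predicts the next word from those patterns alone.
import random
from collections import defaultdict, Counter

corpus = ("line arrays do not create cylindrical waves "
          "line arrays are built from many loudspeakers").split()

# Learn which words tend to follow which (the pattern, not a database of truths).
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def predict_next(word: str) -> str:
    counts = following[word]
    if not counts:
        return "<unknown>"
    words, weights = zip(*counts.items())
    return random.choices(words, weights=weights)[0]  # sample in proportion to frequency

print(predict_next("line"))    # always "arrays": the only word ever seen after "line"
print(predict_next("arrays"))  # "do" or "are", chosen by frequency, not by correctness
```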

ML: I’ve done a lot of technical writing work, like writing user manuals and documentation for products in the audio market. And something that has come up more and more lately is manufacturers who might feel that they don’t need to hire a technical writer to create the user manual for their product, because they’ll say “well we asked ChatGPT to write a user guide for it, and it gave us something good.” And I say, “well, yeah, because it’s reading the user guide that I already wrote.” So if there’s no information already out there on how this product works, it’s not going to figure out how to use your product, right? You still need someone to write the user guide first, so it understands what this device is, is that correct?

KK: That is correct. I’m an associate editor for one of our journals, and recently they sent me a survey on using AI for editing. And we’re seeing the information systems field trying to move away from that, because there’s a lot of misinformation, and that’s not really good, right? One thing you should know about LLMs is that it’s very difficult to tell what information the model remembers, or where it gets that information from when it generates text. It can’t tell you if it’s accurate. So people who really understand what these models do don’t rely on them for very important things.

We have to remember they’re not databases, they’re not search engines – they’re statistical models, storing mathematical patterns of the relationships in the data they look at. Because we’re talking about petabytes of data – where do you store all that? The model compresses it down to mathematical patterns and makes some sense of it, and then it tries to extract knowledge from that by sampling, and by running queries and prompts.

Just as it’s very difficult to get the original back when you compress a file, it’s the same way here – it’s next to impossible to extract the entire data set. So all the parts that go missing are where the AI tries to fill in the gaps.

Another issue is that the data can be outdated. If you feed it that the sky is blue – well, here in Syracuse it might be snowing in the next 20 minutes. So that data is outdated now, but it’s still going to try to feed you the same or similar information based on what it already knows.

So there’s a technique called Retrieval Augmented Generation (RAG). It helps the LLM verify sources, look at other external sources, and then update its information to improve the veracity. So there are some ways we can mitigate this hallucination.
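Editor’s note: In practice, Retrieval Augmented Generation means fetching relevant reference text first and handing it to the model along with the question. The Python sketch below illustrates the pattern only; the naive keyword retriever and the generate() placeholder stand in for a real search index and a real LLM call.

```python
# Sketch of the RAG pattern: retrieve trusted text first, then generate from it.
# retrieve() and generate() are illustrative placeholders, not production components.

documents = [
    "Phantom power does not harm a healthy, properly wired ribbon microphone.",
    "Air absorption attenuates high frequencies more over long throw distances.",
]

def retrieve(question: str, docs: list[str], k: int = 1) -> list[str]:
    """Rank documents by naive keyword overlap with the question."""
    q_words = set(question.lower().split())
    ranked = sorted(docs, key=lambda d: len(q_words & set(d.lower().split())), reverse=True)
    return ranked[:k]

def generate(prompt: str) -> str:
    """Placeholder for an LLM call; here it simply echoes the augmented prompt."""
    return prompt

question = "does phantom power blow up ribbon microphones?"
sources = "\n".join(retrieve(question, documents))
print(generate(f"Answer using only these sources:\n{sources}\n\nQuestion: {question}"))
```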

There’s also this idea called “temperature.” It has to do with randomness. Several years ago, when LLMs came into play, we were all impressed that they could write an essay on their own. And so creating new text or content – that’s randomness. You need the randomness to create new things, and that’s what you’re seeing: you prompt it for something and it creates something interesting.

So if we set the temperature high – the randomness high – it can create nonsensical things, and even give you citations on some of those things – citations that do not exist.
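Editor’s note: Temperature is literally a number that divides the model’s raw scores before they are converted into probabilities and sampled. The Python sketch below shows the mechanics; the candidate words and scores are made up for illustration and are not from any real model.

```python
# Temperature divides the raw scores ("logits") before softmax sampling:
# low temperature sharpens the distribution, high temperature flattens it.
import math
import random

def sample(logits: dict[str, float], temperature: float) -> str:
    scaled = {tok: score / temperature for tok, score in logits.items()}
    peak = max(scaled.values())
    weights = {tok: math.exp(s - peak) for tok, s in scaled.items()}  # softmax, unnormalized
    tokens, probs = zip(*weights.items())
    return random.choices(tokens, weights=probs)[0]

# Hypothetical scores for the next word after "Line arrays create ___"
logits = {"curved": 2.0, "cylindrical": 1.5, "coverage": 1.0, "citation[37]": -1.0}

print([sample(logits, 0.2) for _ in range(5)])  # low temperature: almost always "curved"
print([sample(logits, 2.0) for _ in range(5)])  # high temperature: much more random output
```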

ML: I remember seeing a news article about a lawyer who used it to research a case, and it cited cases that didn’t exist. And I think there’s a temptation to blame the tool there, but maybe that’s not a good approach for a lawyer to be taking?

KK: So as an editor, you have two things to do here. You educate your viewers, your readers – like you’re doing, which is very good – on how to actually use LLMs and how to cite them for accountability. You have to say that you used one, and how you used it.

And next, you have to teach how to properly prompt the LLM. It’s about statistical patterns, so context is very important. For example, you might ask it “Can you tell me how many banks there are?” And you might be referring to financial institutions – credit unions, and so forth – but it might be referring to banks by the riverside, natural banks, so that’s another thing – being able to generate the correct prompt to get the accurate result.
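Editor’s note: That ambiguity translates directly into how prompts get written. A trivial Python illustration of adding disambiguating context to a prompt follows; the wording is hypothetical and not tied to any particular tool.

```python
# The context supplied with the question is what steers the model toward the intended "bank".
def build_prompt(question: str, context: str = "") -> str:
    return f"Context: {context}\nQuestion: {question}" if context else question

print(build_prompt("How many banks are there?"))  # could mean riverbanks or lenders
print(build_prompt("How many banks are there?",
                   context="We are discussing financial institutions in the United States."))
```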

ML: If we think about how the current generation is learning things – how they’re getting their information – they’re going on YouTube, they’re going on Reddit, and we keep seeing this idea of “voting on truth.” The things that come up at the top of the search results are the things that are popular – the most likes, the most upvotes, or what have you.

And just because it’s popular doesn’t mean it’s correct, right? So the top search result might be very popular, but the correct answer might be way down here at the bottom somewhere. So as a society, how do we navigate this when our information models seem to be based on engagement, not veracity?

KK: I always advocate, first and foremost, for education. I believe that if we try to restrict the information that people get, it will cause a lot more problems than if we try to educate the public on best practices – how to verify information, how to identify false information.

And that’s actually the same method used with RAG – you ask the model to break it down, give you its step-by-step thinking, break it down into bits so you can verify the authenticity of each bit. Remember, it’s a model that is not as smart as we are – because it is not.
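Editor’s note: That “break it into bits” advice can be built into the prompt itself so every claim comes back in a checkable form. The wording below is one hypothetical way to phrase such a prompt, not a prescribed formula.

```python
# Ask the model to decompose its answer into individually verifiable pieces.
claim = "Underpowered amplifiers are more dangerous to high-frequency drivers."

verification_prompt = (
    f"Evaluate this claim: '{claim}'\n"
    "1. List each assumption the claim depends on, one per line.\n"
    "2. For each assumption, say how a reader could verify it and cite a checkable source.\n"
    "3. Only then give an overall assessment, noting any remaining uncertainty."
)
print(verification_prompt)
```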

And I think some of it comes down to laziness. It’s called the Elaboration Likelihood Model – basically lazy thinking. I’ll give a general example: let’s say you don’t know Mark Zuckerberg and you have no idea about him but he’s going to come here and he’s going to give you a million dollars. So he shows up and he’s wearing his usual hoodie and jeans, and another man comes in a Rolls Royce and a nice suit. The brain uses some lazy thinking to process this information and say “it must be that guy in the suit who’s the rich guy.” Which is so lazy.

ML: So basically a bias, an assumption.

KK: Right. The bias, instead of evaluating the whole thing. So it’s about educating people on how to avoid lazy thinking when they evaluate information, and educating them on how to properly use these tools. Educate them on the issues and the risks – transparency, privacy, all the ethical ramifications of using AI, and there are a lot of them – so a lot of the conversation today is around transparency and accountability in AI. And we need to learn to control for the gaps in the data.

One example – a tech company using AI to hire people for technical roles. They found that a resume the AI identified as belonging to a woman – because it might list a women’s chess club or something along those lines – would get rejected more often, simply because the model was trained by looking at who’s already working in a lot of these positions, which is statistically a white male. It’s rejecting the resume because it’s different, because it doesn’t fit the pattern, not because this person might or might not be successful in the role.

So yes, what you’re referring to is bias in the data. And that’s an example of something we try to manage, but it’s always going to be in the statistical model. There’s always going to be some error in every model.

It’s our job as data scientists to make sure the data improves, and that the way we model and interpret the data improves. Predictive models have that disadvantage – if you give the machine learning model information about every hair on your head, it will find a pattern, even if there isn’t one.

That’s the center of this idea – it doesn’t know what’s right or wrong.
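Editor’s note: The “every hair on your head” point is easy to demonstrate. In the short Python sketch below, an ordinary least-squares model is handed more random features than data points and fits pure noise essentially perfectly; the numbers are random, so the “pattern” it finds is meaningless.

```python
# A model will find a "pattern" in pure noise if you give it enough features.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 50))   # 20 samples described by 50 meaningless random features
y = rng.normal(size=20)         # the target is also pure noise

# With more features than samples, least squares can match the noise exactly.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
max_error = np.abs(y - X @ coef).max()
print(f"largest training error: {max_error:.2e}")  # effectively zero: a perfect, meaningless fit
```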

ML: I saw a news article recently and it didn’t seem right to me, and it wasn’t from a very reputable website, so you kind of have to think, “well, how reliable is this source?” and consider where the information is coming from. And we all know “I saw it online” isn’t a good reason.

KK: It’s actually worse than using a search engine – the search engine ranks the results based on an algorithm, and you can make a choice among all of the information. ChatGPT doesn’t typically do that; it just gives you an answer – or what it thinks is the answer. And you don’t know where it got it from.

ML: So this idea that – and I’ll use the low-hanging fruit example here – in the future, publications won’t need editors because the AI can do it all – it sounds like that’s not necessarily a programming issue, it’s an inherently flawed concept.

KK: Yes. In the future, no, we’re not going to solely depend on AI. Humans are always going to be responsible for this – think about self-driving cars, for example. We’re so very far from getting that right.

ML: And I think about, within the audio industry, something our readers would be familiar with – there are all sorts of algorithms that are designed to help assist you in the process of doing something like designing a sound system. Something basic like we need to figure out the best angles for the loudspeakers, or how much each speaker should be compensated for air absorption, things like that. Those exist, and in general they tend to work. They’re never matching what a really experienced human can do, but they can get you 80 percent of the way there more quickly. So it’s there to help you, but you’re the person responsible for the work, and if something goes wrong, I can’t really blame the software – I got hired to do the job, so it’s my fault.
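Editor’s note: As a deliberately simplified picture of what one of those assistive calculations might look like: high-frequency loss from air absorption is roughly a per-metre attenuation figure multiplied by throw distance, and a tool might suggest a capped equalization boost to compensate. The coefficient and cap in the Python sketch below are placeholders, not measured values; real tools use standards-based models (such as ISO 9613-1) that account for frequency, temperature, and humidity.

```python
# Rough sketch: suggested HF compensation = absorption (dB per metre) x throw distance, capped.
# COEFF_DB_PER_M and MAX_BOOST_DB are illustrative placeholders, not measured data.

COEFF_DB_PER_M = 0.15   # hypothetical air absorption at one high frequency, in dB per metre
MAX_BOOST_DB = 6.0      # cap the correction so the drivers aren't pushed into distortion

def suggested_hf_boost_db(throw_distance_m: float) -> float:
    loss_db = COEFF_DB_PER_M * throw_distance_m
    return min(loss_db, MAX_BOOST_DB)

for distance in (10.0, 30.0, 60.0):
    print(f"{distance:>4.0f} m throw: suggest about {suggested_hf_boost_db(distance):.1f} dB of HF boost")
```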

KK: Exactly. From an ethical perspective, there are a lot of decisions an AI can’t make. In the health industry, for example – looking at benign cells versus cancerous cells, or the efficacy of medications, which is something I’m working on now – a person always has to go through it. You can never rely on AI alone to make those decisions. We need transparency from start to finish on how the decisions are being made, and if your machine or model makes a mistake, it’s your fault. You’re holding the ball.

ML: There seems to be a lot of push, in the way we interact with these tools, toward people needing less and less input from a human, and I’m not sure that’s a sustainable trend. I think at some point you need to teach people what’s going on.

KK: Definitely. And unfortunately many people aren’t in that mindset. Maybe there’s a community environment where people believe certain things, or they weren’t taught critical thinking as rigorously – and if I had been brought up a certain way or in a certain neighborhood, maybe I’d be doing the same thing.

Then we can look at why falsehoods get shared, and confirmation biases, and things like that. Someone might be looking for information to justify a position they already hold, and now you’re asking a leading question of ChatGPT, and how do you think it’s going to respond to that? So it’s up to society to teach the truth to help people.
