The realms of music and artificial intelligence (AI) are converging in fascinating ways, presenting unique opportunities for creative expression and collaboration. Both rely on tools, musical or statistical, to shape and manifest ideas.
In this interview, Andrew Paley discusses the relationship between music and AI, exploring how they influence each other and the potential impact they can have on artistic endeavors.
Why are you interested in both music and Artificial Intelligence and how do these fields influence each other?
Andrew Paley: They’re both deeply creative fields in their own ways — they both involve use of an instrument, musical or statistical, that helps shape and in some sense collaborates on the resulting embodiment of an idea. And in both cases there can be some sense of capturing lightning in a bottle — some spontaneity that gets tied down in a medium.
How do they influence each other? Well, it could be direct — leveraging AI to create sounds or audio patterns — or indirect, which would be like any other art in setting a mood or inspiring an idea. On the other hand, there’s the opportunity to blend — I’ve been fascinated by piping audio signals into generative models to see what sorts of imagery (and otherwise) can be coaxed out. And I think, as time goes on, there’s the possibility of more dynamic, immediate, fluid collaboration between the two spaces in all sorts of forms.
Can you tell us more about your experience and contribution to using AI in content creation, especially at Narrative Science and Storyline? How do you see AI evolving in the media industry and what opportunities and challenges does it present?
Andrew Paley: At both Narrative Science and Storyline, I’ve been in pursuit of leveraging AI to reimagine the ways in which human beings access information and come to understand the world around them. At Narrative Science, we were fully focused on language and document generation, and I spent my time there as an engineer and designer, innovating on our platform to make it more capable and accessible. After a bit of time doing PhD research, I’ve cofounded Storyline to explore something larger in scope.
At root, the journalist in me cares most about whatever medium is most effective as a human-information interface, and there’s an enormous set of opportunities to improve our relationship to knowledge in meaningful ways given the developments in artificial intelligence over the past decade. We’re keeping that intersection at the core of what we’re doing at Storyline with an eye towards clarity and trust, and I’m excited about what we’re building.
That said, AI certainly presents challenges for media and information. One significant issue is the possibility, or inevitability, of rendering online spaces void of trusted information. If any sort of media can be generated convincingly — text, images, videos, sounds — there will be a constant battle in digital spaces between an ever-growing glut of nonsense and misinformation and whatever sliver the truth can carve out for itself. This of course gets even worse given feedback loops. And that’s to only mention in passing these new tools’ potential for amplifying intentional disinformation and for proactively robbing us of attention and agency on behalf of other humans, which are significant threats in and of themselves.
Can you tell us more about your creative approach to using AI to create music videos and how this might impact music production and creativity?
Andrew Paley: I view these new models as simply a new set of collaborators and tools with which to explore ideas. In some sense, understanding how to work with them now is like learning any other instrument — no one was good at playing the guitar before the guitar was invented. My music video experiments have been one way of testing the boundaries of what’s possible — playing around with these new sets of building blocks to see what the machine and I can dream up together.
Sometimes it’s been a bit more machine-led — the Pixie app I built in 2020 would generate imagery that I could select from and help sequence, and then it would generate animations from that.
Other times it’s been more human-led — the “Sequels” video took an enormous amount of manual editing, with models just providing the lip re-syncing on the clips and a means of upscaling the results. In more recent experiments, I’m pushing song lyrics through various models that I can download and orchestrate (like Stable Diffusion) to see what sorts of imagery they evoke, and beginning to explore how I might create more cohesive, machine-generated animation from the results.
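As a toy illustration of the lyrics-to-imagery workflow described above, one might split a lyric sheet into per-line prompts and hand each to a locally downloaded model such as Stable Diffusion via Hugging Face’s `diffusers` library. This is a minimal sketch, not Paley’s actual pipeline: the function name `build_prompts`, the style suffix, and the lyric lines are invented for illustration, and the generation call is guarded behind a flag since it requires downloaded model weights.

```python
# Sketch: turn song lyrics into per-line prompts for a text-to-image model.
# Prompt construction runs anywhere; the diffusion call is guarded behind
# RUN_DIFFUSION because it needs model weights (and ideally a GPU).

RUN_DIFFUSION = False  # flip on once weights are available

def build_prompts(lyrics: str, style: str = "cinematic, moody lighting") -> list[str]:
    """One prompt per non-empty lyric line, with a shared style suffix."""
    lines = [line.strip() for line in lyrics.splitlines() if line.strip()]
    return [f"{line}, {style}" for line in lines]

lyrics = """Messages in a bottle
Drift across a silent sea"""

prompts = build_prompts(lyrics)
for p in prompts:
    print(p)

if RUN_DIFFUSION:
    # Assumes `pip install diffusers torch` and downloaded weights.
    from diffusers import StableDiffusionPipeline
    pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
    for i, prompt in enumerate(prompts):
        pipe(prompt).images[0].save(f"frame_{i:03d}.png")
```

Sequencing the saved frames into a video (and interpolating between them for smoother motion) would be a separate downstream step.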
And that’s all downstream of how music production and creativity are being and will be impacted — from songwriting to sample generation to instrumentation to effects to mixing and mastering — the whole workflow by 2025 is going to be unrecognizable to someone working in 2015. It’s an incredibly exciting time to be riding the line between creativity and computation, and I think we’re just getting started in exploring what the future of art might look like.
How are you using AI to expand artistic expression and push the boundaries of traditional music and video?
Andrew Paley: I think mainly by just trying to lean into the tools to see if I can shake anything out of them that I find meaningful, which is basically the same way I got into music in the first place (creating soundscapes with my dad’s Casio keyboard during grade school was my gateway drug). In part, I’m just taking it as a chance to wander around and explore – it’s a wild ride even when you come back with nothing to show for it – and then sometimes I end up following a thread for long enough or deep enough that I come across something worth hanging on to.
And, as I said, the current incarnations are just a new type of instrument in some sense – and whenever you come across a new instrument, you push your own boundaries just by sitting down to play.
How have your research interests and work in AI influenced your artistic output and vice versa? How do you see the future of music in relation to AI and technological advancements? How do you see the role of AI in the music industry and how might it change the way music is created, distributed, and consumed?
Andrew Paley: One significant concern is the continued devaluing of the audio medium. Where this all ends up is hard to say, but if these new tools are about empowering artists to explore ideas, then I’m all for it. However, if we’re heading towards an era of infinitely generated, constantly personalized playlists of music that sounds like music you like, that not only threatens artists working in the space, it also sounds like some version of hell.
To me, the whole point of music is that songs are messages in a bottle – there’s something important about the ideas and the struggles and the joys and the connective tissue and the intent wrapped up in there – and if music were ever to be fully taken over and convincingly reduced to an algorithmic process of averaging across previous works with a bit of variability in there for good measure, right after I was done being impressed by the technology, I would consider it something of a tragedy.
And then there’s the social element – cultures are struggling with social cohesion already enough as it is, but the communities that rise up around styles and scenes and clubs and bands are too valuable to lose. They’re certainly something I’ve relied on most of my life.
How do you see AI’s potential for creating music, and how might musicians use AI to expand their creative work, as described in your presentations on “Generative Music with Artificial Intelligence”? How do you use AI in music production, and how does this influence your artistic vision, especially in relation to your releases on Spotify?
Andrew Paley: Well, it’s obvious that with more tools, more work is possible. And here I have to differentiate between the things I build or cobble together and the things I use off the shelf. For my part, I tend to keep the core songwriting process free of direct AI involvement, at least to date.
Downstream of that though, I leverage a variety of plugins that incorporate machine learning to expand and accelerate the process. NeuralDSP plugins are a great example – the idea that I can get an amazing amp sound with a signal chain of effects and cabinet/room tone from literally anywhere with my laptop has been game changing. The difference between the guitar sims of even five years ago and now is unreal.
Looking a little further down the road, there are all sorts of places to push boundaries, from AI collaborators during the songwriting and ideation process that can riff off an idea with you, to assistants that aid in patch selection and effects layering during arrangement, to exploring autoencoder models aimed at altering voices in pursuit of realistic backing choirs orchestrated by a single vocalist (I was actually trying to toy around with something in this realm the other day).
But again, what’s important to me is that I care about the human intent in the process – if there’s not a human being at the center of this array of oncoming tools and machine capabilities, then I’m unlikely to be all that interested.
And then there’s the visual component, of course. This is the place I’ve most enjoyed machine collaborations to date – from thematic ideation to the generation of the aforementioned music videos.
How do you address ethical and social issues related to the use of AI in the arts, and what considerations are important to you when it comes to using AI in your creative work?
Andrew Paley: If more capabilities are in reach and more ideas can be realized by more people, I take that as an undeniably good thing. That said, we have not yet begun to adequately reckon with how we’re going to handle these new capabilities when taken to their logical and technological extremes.
There’s of course the issue of what happens to creators when the models can do most or all of the creating. The writer’s strike going on in Hollywood as I type this is in part about this debate — what happens when studios can leverage language models to go from hiring teams of writers to a single editor who can clean up the machine’s output? And what if audiences don’t even notice the difference?
And that’s to say nothing of what these models actually are trained on — human ideas, human creations, human stuff. There’s not a really good means of tracing how individual human contributions get incorporated into hybrid outputs, but there seems to be a growing chorus that says we should maybe reduce the mythologizing of AI and see it more as a new kind of mashup of humanity in infinitely synthesizable forms — it’s our data in there, after all.
Maybe downstream we can reimagine how individual creators might get credit for helping inform the big sky-brain (some form of the “data dignity” that Jaron Lanier and others espouse), but I’m not holding my breath on that, nor am I sure how it could scale much beyond where we already find ourselves (if we were even able to ever implement it, which itself is a sizeable lift).
On the other end, I think there’s a real danger that — especially as these technologies progress — the infinity of personally-tailored possibilities via on demand experiences threatens to cheapen the meaning of art as human communication and make very real the notion that we just might be at risk of amusing ourselves to death.
How do you see the future of music in relation to AI and technological advancements? What developments and innovations do you anticipate and how might they change the musical landscape?
Andrew Paley: I think there are two main threads here (with some admitted overlap between them) – mimicking music and building collaborative tools for musicians and producers.
On the mimicry front, there’s already incredibly impressive work being done, and the sky is the limit there – I think we’ll continue to see increasingly convincing “deep fakes” of musicians across a variety of genres.
On the collaborative tool front, the potential applications are endless — from accompaniments to effects to mixing assistants that might be able to comp together starter takes. Even just the possibility of language models providing natural-language frontends to speed up workflows is exciting — describing a signal flow in a sentence and having a channel strip populated with various plugins tuned to match the request would be game changing in terms of experimenting with sounds and ideas.
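The “sentence to channel strip” idea above can be sketched even without a language model: a simple keyword lookup that maps words in a plain-English request to an ordered plugin chain. A real system would put an LLM in front; the keyword table, slot ordering, and plugin names here are all invented for illustration.

```python
# Toy sketch of a text frontend for signal-flow setup: map keywords in a
# plain-English request to an ordered chain of (invented) plugin names.
# A production version would use a language model; this just shows the shape.

PLUGIN_TABLE = {
    "crunchy": ("distortion", "TubeDrive"),
    "warm": ("eq", "VintageEQ"),
    "spacious": ("reverb", "BigHall"),
    "slapback": ("delay", "TapeEcho"),
    "tight": ("compressor", "FastComp"),
}

# Conventional insert order for a channel strip.
SLOT_ORDER = ["compressor", "eq", "distortion", "delay", "reverb"]

def channel_strip(request: str) -> list[str]:
    """Return plugin names in signal-flow order for a text request."""
    found = {}
    for word, (slot, plugin) in PLUGIN_TABLE.items():
        if word in request.lower():
            found[slot] = plugin
    return [found[slot] for slot in SLOT_ORDER if slot in found]

print(channel_strip("a tight, crunchy rhythm tone with a spacious tail"))
# → ['FastComp', 'TubeDrive', 'BigHall']
```

Swapping the keyword table for an LLM call that emits the same slot/plugin structure is what would make this robust to free-form descriptions.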
But again, what I care most about are the human beings at the center of all these new capabilities. We find ourselves at something of a crossroads as to how future generations will conceive of art and expression, and commerce could easily get the better of us if we’re not careful.
Thank you, Andrew Paley, for the interview.
Statements by the author and the interviewee do not necessarily reflect the opinions of the editors or the publisher.