In 2016, YACHT began an experiment: working with artificial intelligence to help compose their next album. A documentary about the process, The Computer Accent, was recently released to streaming and showcases both the benefits and pitfalls of using machine learning in the exploration of what would become their 2019 album Chain Tripping. Our conversation with vocalist Claire L. Evans delves into the motivations, challenges, and revelations of integrating AI into the band’s music-making process and offers a look at a possible future of collaboration.


ALLMUSIC: What made you want to work with AI in the first place?

CLAIRE L. EVANS: It’s funny thinking about it now—why did we even want to do this? I have to keep reminding myself that 2016 was a very different time, that we weren’t influenced by everything in the air now. It was on the horizon, and we wanted to learn about it. We had a feeling that it would become important, even if we didn’t anticipate what it would become. I read a lot of science fiction, and I had an idea that, at least in the cultural consciousness, people would be unpacking this for a long time. And we needed a challenge. We needed something to do that was new, that was interesting, that would be creatively engaging for us and would force us to make something different than we’d ever made before.

It would also require a lot of collaboration, which was exciting. It seemed fun to do something that would involve asking for a lot of help. We had no idea what we were getting into. We thought that you could “train an algorithm” on your own music and generate more of that music. We had both a too optimistic and too pessimistic view of AI. We thought it would be a system that would generate sounds on our behalf. Then we realized it wasn’t even possible to make a bad song with the technology available at the time. So then the project became about well, how can we make something good? Which is much harder.

ALLMUSIC: When did this all start coming together, since the album came out in 2019?

CLAIRE L. EVANS: The documentary started filming in 2017. We started on the record in 2016, trying to figure out how to do it, assessing the tools available at the time, which were limited. There were open source models in the computer science world that you could tinker with, if you knew how to code. There were nascent music startups that said they were using AI, but it was unclear what was really going on under the hood. Then there were artists and technologists building their own tools, building their own models. That was the landscape. So we had to figure out what we wanted to do, who we could collaborate with, and what tools we could use that would allow us some measure of control over what we were doing, despite the fact that we were not coders, which took a long time.

ALLMUSIC: You started thinking about this in 2016. Now it’s 2024. That’s a million years in the tech world. Have you used any of the new tools? If you were to go through this process now, would it be easier?

CLAIRE L. EVANS: It would be a lot easier, but it wouldn’t be as interesting. The fact that it was cumbersome was what made it compelling to us. The tools available were bad at doing what they were ostensibly meant to do. A text-generating model would generate absolute nonsense; a sound-generating model would generate something either absolutely haunted or with such a low sample rate that it sounded too lo-fi even for us. The fun was taking all that chaotic material and seeing if anything interesting had accidentally happened, and trying to restructure all the chaos into something meaningful. Our job was to give it structure.

Now—the tools are so good. There’s not a lot of friction there. AI doesn’t make interesting mistakes. What we were drawn to originally was the oddness of the material these models generated. The film is called The Computer Accent, and we don’t even talk about this in the movie, but we called it that because everything AI-generated at that time had a kind of an accent. You could just tell it wasn’t human—it all had an identifiable weirdness, a wonkiness. That was charming. That’s kind of lost now. I’m not at all attracted to any of these new tools. They don’t provide any interesting texture. It’s just like, more human mediocrity.

ALLMUSIC: For a while I was using DALL-E to try and recreate famous album covers with prompts and just seeing what would happen. Iconic things like, “a yellow banana on a white background,” just to see what it would do. It’s way more fun when it’s goofy and you don’t really know what you’re going to get. But when it starts getting better, it loses that weird/creepy factor.

CLAIRE L. EVANS: Yeah, totally. I think we’re gonna miss that. One of our theories early on was that the “neural aesthetic” of early generative AI would become the new analog. That we’ll have nostalgia for the wonkiness and fucked-up weirdness of these early models—and I think we already do.

ALLMUSIC: One of my overarching questions is what did you give up and what did you gain in this process? Did you feel like you gave anything up by letting the computer spew out a bunch of sounds that are supposed to sound like you?

CLAIRE L. EVANS: It definitely wasn’t mirroring our process. It was interfacing with our process on an incredibly surface level. All the AI “knew” about us was the MIDI data we fed it, which was stripped of context. We couldn’t put an MP3 into these models. We had to draw out two-bar chunks from multi-tracked songs, which might have dozens of different patterns and melodies.
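As an aside for the technically curious, here is a minimal sketch of that kind of data prep: chopping a MIDI stem into two-bar chunks. It assumes the open-source note_seq library, a 4/4 time signature, and placeholder filenames; the interview doesn’t specify the band’s actual tooling.

```python
# Minimal sketch: split a MIDI stem into two-bar chunks for model training.
# Assumes the open-source note_seq library, a 4/4 song, and placeholder
# filenames; not a record of YACHT's actual pipeline.
import note_seq

song = note_seq.midi_file_to_note_sequence('stem_bass.mid')  # hypothetical stem

# Length of two bars of 4/4 at the song's tempo, in seconds.
bpm = song.tempos[0].qpm if song.tempos else 120.0
two_bars = (60.0 / bpm) * 4 * 2  # 4 beats per bar, 2 bars

# Slice the sequence on a fixed two-bar grid.
chunks = note_seq.sequences_lib.split_note_sequence(
    song, hop_size_seconds=two_bars)

for i, chunk in enumerate(chunks):
    note_seq.note_sequence_to_midi_file(chunk, f'chunk_{i:03d}.mid')
```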

The Computer Accent (2022)

The AI was operating on such a minute piece of us, and anything it put out would necessarily only represent a fraction of who we are.

Normally we’d go into the studio with a notebook full of lyric ideas, a few melodies swimming in our heads, big concepts, or boneheaded riffs—that would be the source material. We’d still have to arrange and put it together, making all these choices that are dependent on context. That process remained exactly the same. It’s just that the source material had been run through the blender in this high-dimensional, conceptual way. The real difference was that there was much more of it. With these models, even at the rudimentary scale at which they were operating back then, you’d just hit “go” and get 40,000 words. So we had to make aggressive decisions about what stayed and what went. That was maybe the most alienating and overwhelming part of it, actually: letting go of the feeling that one thing might be better than the other.

We could give the same corpus of text and notes to 100 other bands, and they’d all make really different records. The source material is just the beginning of the process. I think that’s a good attitude to take towards generative art-making: you shouldn’t ever take the output as the end of the process. The output is just part of the process. You can’t just slap your name on it and call it done. I mean, that’s no fun. I’m not even talking about the ethics of it. It’s just not interesting, or fun, to do that. I don’t think whatever time I’m saving is worth it. What would I be doing with my time if I wasn’t making music, or writing? Why not just make the music? So yeah, I don’t think we gave up too much, other than flexibility, by virtue of the fact that the project was so structured. Everything had to be really pure conceptually.

ALLMUSIC: So what were the rules then?

CLAIRE L. EVANS: Everything had to be generated, in some way, from our own back catalog. Obviously, these models require more data than even our 20 years as a band can provide, but the prompts had to be from our own history. That was the first thing. The second was that we couldn’t improvise, jam, harmonize, or generate anything on our own. If we wanted something specific for a song, we had to figure out how to make the machine make it. Sometimes that would mean, “shit, we want a bassline that sounds like this—so let’s take this melody from this song from 10 years ago, and this melody from this other song of ours from five years ago, and run it through a latent space interpolation model, and hopefully, the model will split the difference and give us what we need.” That was ridiculously complicated, but we treated the whole album as a science experiment. None of the models we used could generate chords at the time. So there are almost no chords on the album.
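For the curious, the latent-space trick she describes can be sketched with Magenta’s MusicVAE, a publicly available model from that era that works on two-bar melodies. The config, checkpoint, and filenames below are illustrative assumptions, not a record of the band’s actual setup.

```python
# Sketch of latent-space interpolation between two melodies, in the spirit
# of the process described above. Assumes Magenta's MusicVAE; the config,
# checkpoint path, and MIDI filenames are placeholders.
import note_seq
from magenta.models.music_vae import configs
from magenta.models.music_vae.trained_model import TrainedModel

config = configs.CONFIG_MAP['cat-mel_2bar_big']  # pretrained 2-bar melody VAE
model = TrainedModel(config, batch_size=4,
                     checkpoint_dir_or_path='cat-mel_2bar_big.ckpt')

# Two riffs from different points in a back catalog, as NoteSequences.
riff_old = note_seq.midi_file_to_note_sequence('riff_2009.mid')
riff_new = note_seq.midi_file_to_note_sequence('riff_2014.mid')

# Encode both, walk the latent space between them, and decode 5 evenly
# spaced points (length=32 sixteenth-note steps, i.e. two bars). The middle
# outputs are the model "splitting the difference."
blends = model.interpolate(riff_old, riff_new, num_steps=5, length=32)
for i, seq in enumerate(blends):
    note_seq.note_sequence_to_midi_file(seq, f'blend_{i}.mid')
```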

What else? We could transpose things to a different key. And we could hack things up as much as we wanted. We did a lot of collaging, taking little MIDI riffs from different outputs and arranging them together. Especially with the lyrics, that was really interesting, because we were running these models at different temperatures. The high temperature output would be much wackier and more verbose and full of neologisms. The low-temperature stuff would be repetitive, punk rock style. Mixing the different temperatures within a single song was an interesting way of approaching verses and choruses. We gave ourselves leeway with arrangement, production, and performance. Everything’s played live. But in terms of the source material, we were really rigid: it had to come from the system, and it had to come from our own history.
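A quick gloss on that temperature knob, since it does a lot of work here: sampling temperature divides a model’s raw scores before the softmax, so low values concentrate probability on the likeliest next token (repetitive output) and high values flatten the distribution (wackier output). A toy illustration with invented numbers:

```python
# Toy demonstration of sampling temperature. The logits are invented; a real
# language model would produce thousands of them, one per vocabulary token.
import numpy as np

rng = np.random.default_rng(0)

def sample_token(logits, temperature):
    """Sample a token index from softmax(logits / temperature)."""
    scaled = np.asarray(logits, dtype=float) / temperature
    probs = np.exp(scaled - scaled.max())  # subtract max for stability
    probs /= probs.sum()
    return rng.choice(len(probs), p=probs)

logits = [3.0, 1.5, 0.2, -1.0]  # raw scores for four candidate tokens

for T in (0.2, 1.0, 2.0):
    picks = [int(sample_token(logits, T)) for _ in range(20)]
    print(f'T={T}: {picks}')
# T=0.2 almost always picks token 0 (repetitive); T=2.0 spreads the picks out.
```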

ALLMUSIC: So the process of establishing what you were going to pick was mostly intuition? Listening to things, figuring it out, finding what you liked, arranging it. You used the word collage.

CLAIRE L. EVANS: There’s a rich tradition in avant-garde art and music of doing cut-ups, doing collages. Mixing sources. David Bowie used cut-ups for his entire career to write songs; he even had a computer system he made in the 90s that would cut up newspaper articles and recombine them. It was called the Verbasizer. Something really interesting happens when you use an electronic or generative process to surface material recombined from a different source. You’re forced to interpret it, to impose meaning on it.
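The Verbasizer’s internals aren’t described here, but the cut-up idea itself is simple enough to sketch as a toy, with invented source text:

```python
# Toy cut-up generator in the spirit of the technique: chop several sources
# into fragments and recombine them at random. The source lines are invented.
import random

sources = [
    "the future belongs to crowds",
    "a signal flickers over the sleeping city",
    "we sold the machine our oldest songs",
]

# Cut each line into word fragments, pool them, and shuffle.
fragments = [word for line in sources for word in line.split()]
random.shuffle(fragments)

# Recombine into five-word lines for a human to interpret.
for i in range(0, len(fragments), 5):
    print(" ".join(fragments[i:i + 5]))
```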

The process of making meaning is what art is all about, but that meaning-making can exist at different times, and from different positions. You could be issuing it from yourself, or you could be interpreting and projecting meaning onto something else. In music, it often goes both ways. You could write a song about something really specific, but then everyone that listens to it thinks it’s about something else. I could sing the same song 1,000 times live, and it could mean something different to me every time, depending on where I am in my life. That’s always changing. I think it’s just really fun to play with that in a more explicit way. To draw from these avant-garde histories and find ways to project meaning onto chaos, essentially.

ALLMUSIC: I’ve been making field recordings, things like my friends talking at their record store, or a conversation happening next to me while I’m getting coffee, and incorporating a weird little quote into a song. Can that even be replicated by the tools that are out there? Pop music has a structure. Producers have formulas for how to make a hit, and that’s why they have multiple hits. But maybe it’s bland because it appeals to the lowest common denominator. Working with cut-up samples, that’s something interesting to me. Can machine learning replicate that? What would that sound like?

CLAIRE L. EVANS: A computer can certainly replicate that on a formal level. If you train a machine learning model on a bunch of field recordings and collage-based experimental music, it’ll generate more stuff that sounds like that. But it will be issued from nothing, totally divorced from context. It can never replicate the moment when you recorded, in the world, your friends talking in the record store.

ALLMUSIC: A lot of innovation comes from the limitations of the process. Even the record you made, you were limited by the technology that was available at the time, and if you were to do it today, it probably wouldn’t be the same. A band like Beat Happening wouldn’t sound the same in a fancy studio. Early hip-hop is also based on this limitation of process. I wonder if these things could be replicated with machine learning?

Claire highlighting pages of AI-generated lyrics – The Computer Accent (2022)

CLAIRE L. EVANS: 100% agree with that. Again, the machine learning systems could continue to iterate on existing forms, which had emerged from limitations and constraints, but they could not create new forms from those same constraints. All good art comes from limitations, and is made by people who are questing beyond their means. That moment of striving beyond your capacities or your tools, trying to do something that transcends your situation, or what you have access to. That’s the most beautiful human gesture. When you have everything, though, you have nothing.

ALLMUSIC: It’s mentioned in the documentary that Chain Tripping didn’t get a lot of press. Did people not understand what you were doing with the record as far as the concept? Do you feel like people weren’t fully ready to respond to it, and if it came out today, maybe the culture would be ready?

CLAIRE L. EVANS: It’s always hard to know. When the press ignores something it’s like, is it me? Or is it the idea? We did go through a real shitstorm in the press that probably got us blacklisted from coverage on some music websites. I don’t think people were then, or are now, super keen to see us as conceptual artists, even though we’ve been doing weird projects for 20 years.


Our residual public image is as an indie-pop, indie sleaze band. I guess artists are always trying to escape their personas. It is what it is. I’m totally happy making the art I like to make. The people that like it like it. That’s all I really care about.

It was probably too soon, even though we felt like it was too late at the time. We thought we’d missed the window. It’s not like people weren’t talking about AI. Artists were engaging with the subject, and they had much clearer narratives around it than we did. We chose a process that was interesting to us, but it was not a super compelling process narratively. And the capacity that the music press has to communicate this subject is limited. Because the closer you get to AI, the more it’s just complicated, expensive, boring math. That’s not as fun as saying that an AI is your new bandmate, your clone, or coming for your job.

I don’t know when a good time is. The film is out now, and I’ve been struggling to figure out how to talk about it, because I still feel like it’s too soon, in a way. The film is now a time capsule of a moment that wasn’t long ago, but is a million years back in AI years. The old DFA Records motto was “too old to be new, too new to be classic.” I think that’s kind of where we are now. I’m interested to see how people respond to it. I will say that far more people are interested in talking about it with me now. I get inquiries all the time, and I feel like a veteran of some old AI guard.

ALLMUSIC: How do you feel about the record now?

CLAIRE L. EVANS: I love the record. I think it’s the best music we’ve ever made. Partially because it liberated us from our own persona. I think we had this idea that we had to be a certain kind of band, before that record. We had to make certain kinds of songs for certain contexts, to maintain a reputation as a poppy fun electropop party band. Dropping a weird burner album with a lot of slow, formally experimental songs—and I think it’s quite restrained, too, more subtle than anything we’ve ever made before—was so freeing.

ALLMUSIC: The fact that you have been a band for so long is a testament to something for sure.

CLAIRE L. EVANS: People in the AI music world, back when we were making Chain Tripping, would say stuff to us like, “with AI, you’ll have a fourth bandmate that never drinks, never parties, never gets into trouble, and always agrees with what you say.” That’s not at all what I found interesting. What I found interesting was that we had this interface that could generate ideas, and we felt perfectly fine dismissing those ideas, because there was no ego involved.

Yacht Live – The Computer Accent (2022)

And because we were all working towards making an album within this conceptual framework, we were only interested in finding the ideas that would most suit the song. Normally when you work with a group of people, as I’m sure you know, people get attached to whatever they brought to the table, and they can have a hard time letting go of things, even when they don’t serve the end. But we had this model issuing ideas and we could just be like, “nope, nope, nope, nope, nope—okay, that one’s good.” All without feeling like we had to coddle anyone. It actually made us better collaborators because we were all united in working towards a common goal, if that makes sense.

ALLMUSIC: Right: you can’t hurt the computer’s feelings. One thing that drives me crazy is the oversimplification of complex ideas. My feeling on the current state of AI is that “AI” is being used as a blanket term for everything now. Before it was sometimes “the cloud,” or “the algorithm.”

CLAIRE L. EVANS: Right. There’s not just one AI. There are a million different models, with different training data, different scopes, different powers wielding them to different ends. What are we actually talking about?

ALLMUSIC: I think about the loss of humanity in all of this. You still have to make your own decisions. It comes down to intent, process, and decision-making. What is lost when you let a machine do it all for you, versus using it as a tool, like a calculator, a drum machine, or whatever?

CLAIRE L. EVANS: I think we’re in a proof-of-concept stage in the culture right now. People are doing AI projects just to show that they can be done. Someone could use AI, say, to generate a comic book. But ultimately, that won’t be as interesting as an artist using AI to generate a comic, then taking a panel from that comic and painting it, or using that as a prompt to do something else, say, write a novel, or a screenplay. What I mean is that there needs to be something else in the daisy chain. I think we’ll get to that point, where artists are integrating AI tools into a larger process, and they won’t think of it as anything different from using a synthesizer or using Photoshop. It won’t all be about the tool. That’s my hope. But that kind of reasoned, thoughtful integration of a new tool into a larger creative vision is slow work, and this technology is moving very quickly. I’m not sure we’ll make it in time. Every time I open the internet it’s full of generated dreck, generated language and imagery, thrown into the works by hustlers and hucksters. We’re all competing with so much volume, trying to pierce the signal.


