Field Dispatch
Spotify has been hacked | Matt Connarton Unleashed
Speaker 1: This is something that popped up from Digital Music News dot com.
Speaker 1: Hackers scrape Spotify's entire library. I thought of this when
Speaker 1: I saw you know, we were talking earlier a little
Speaker 1: bit about Spotify and the importance of being on Spotify
Speaker 1: so that your music is discoverable, the term that we
Speaker 1: like to use in the industry. But hackers have scraped
Speaker 1: Spotify's entire library, obtaining three hundred terabytes worth of audio.
Speaker 1: Spotify says it has identified and disabled the nefarious user accounts.
Speaker 1: I hope they do a lot more than that. Just
Speaker 1: identifying and disabling those accounts is not going to fix
Speaker 1: the problem long term. I mean, if they did it once,
Speaker 1: they can do it again in theory. But it says
Speaker 1: here hackers say they've scraped Spotify's entire music library, compiling
Speaker 1: the metadata behind two hundred and fifty six million tracks,
Speaker 1: two hundred and fifty six million tracks, tied to over
Speaker 1: fifteen point four million artist profiles, and intend to make
Speaker 1: a massive amount of music available to torrent. Meanwhile, Spotify
Speaker 1: has acknowledged the breach and confirmed that the culprits
Speaker 1: accessed "some of the platform's audio files." "Some of the
Speaker 1: platform's audio files," in quotes. So Spotify acknowledging, I mean,
Speaker 1: you know, fifteen point, actually, two hundred and fifty six million,
Speaker 1: that's "some of the platform's audio files." I wonder how
Speaker 1: many actual audio files... Do you ever wonder about that?
Speaker 1: Like how much actual music is on Spotify? How many
Speaker 1: audio files are on Spotify? Because we were talking about
Speaker 1: I remember a couple of years ago on the show
Speaker 1: now might have been two or three years ago, we
Speaker 1: were talking about how Spotify had announced that they were
Speaker 1: starting to remove some music from its platform because people
Speaker 1: were uploading, you know, demos and really lo-fi. I
Speaker 1: don't mean lo-fi as in the style of recording.
Speaker 1: You know, you can record lo-fi music
Speaker 1: or music that sounds like it's supposed to be, you know,
Speaker 1: kind of like from another time even, right? But I
Speaker 1: mean actual, like, low quality, like just poor recording quality music.
Speaker 1: People were just uploading, you know, stuff that they just
Speaker 1: just cheap demos that they'd made on their tape recorder
Speaker 1: or whatever, just uploading all of it to Spotify. And
Speaker 1: it got to a point where Spotify decided to go
Speaker 1: through and start and I don't know how they do it.
Speaker 1: I mean, I'm sure they have I'm sure they have
Speaker 1: the technology to detect these things, but actually went through
Speaker 1: and started removing some of that stuff. Because you know,
Speaker 1: if you think about how much music is being uploaded
Speaker 1: to these platforms like Spotify every single day, every single day,
Speaker 1: all over the world, people are uploading music, and it's
Speaker 1: not like you know, I think we tend to think
Speaker 1: of it as sort of this this infinite thing, like
Speaker 1: there's just this, you know, because it's because it's audio.
Speaker 1: There's just this infinite amount of space to put all
Speaker 1: this because it's not they're not physical objects. It's audio files.
Speaker 1: So you can just put as many audio files up
Speaker 1: as you want and it doesn't matter how much room
Speaker 1: it takes up. But it actually does matter how much
Speaker 1: room it takes up because you have to keep building, you
Speaker 1: have to keep building more storage for all this music.
Speaker 1: It still takes up space digitally, so you can't just
Speaker 1: let it be a free for all. Part of what
Speaker 1: made me remember that is because I remember at the
Speaker 1: time when we talked about it on the show, some
Speaker 1: people were upset and it was another reason to be
Speaker 1: mad at Spotify. I guess people were upset saying, well,
Speaker 1: how can they just go through and remove stuff. Well, yeah,
Speaker 1: it's a little bit. First of all, a little bit
Speaker 1: of quality control is not a bad thing. You know,
Speaker 1: if people are just you know, getting the band together
Speaker 1: and sticking a tape recorder in the middle of the
Speaker 1: room and then taking that recording and uploading it to Spotify,
Speaker 1: you don't necessarily want that on your platform, right,
Speaker 1: something like that, you know. And also there's abuses
Speaker 1: that go on, and you know, people
Speaker 1: stealing other people's music and uploading it, and there's all
Speaker 1: kinds of things going on. So you know, I'm not
Speaker 1: saying that you should never be mad at Spotify about anything,
Speaker 1: but you know, but they have a right to police
Speaker 1: their own platform too a little bit and make those decisions. Anyway,
Speaker 1: that was a little bit of a side street, but
Speaker 1: the point is apparently two hundred and fifty six million tracks,
Speaker 1: that's only part of how much music is actually on Spotify.
Speaker 1: So let's see. So there's an update here. Again, this
Speaker 1: is from Digital Music News dot com. An update as of the
Speaker 1: twenty second. It says, after this piece was published... Well, no,
Speaker 1: let's go back. We'll go back to the update after.
Speaker 1: Let's look at the original article first. Okay, so, the
Speaker 1: allegedly responsible hackers, part of a self-described nonprofit project
Speaker 1: called Anna's Archive, themselves disclosed the data heist in a
Speaker 1: blog post. That lengthy post, drawing from the metadata,
Speaker 1: covers hard stats concerning duration, stream volume, popularity, genre,
Speaker 1: release date, and more. Regarding straight audio, Anna's Archive indicated
Speaker 1: that it had, quote, archived around eighty six million music files,
Speaker 1: representing around ninety nine point six percent of listens and
Speaker 1: clocking in at a little under three hundred terabytes in
Speaker 1: total size. A while ago, we discovered a way to
Speaker 1: scrape Spotify at scale. For now, this is a torrents
Speaker 1: only archive aimed at preservation, but if there is enough interest,
Speaker 1: we could add downloading of individual files to Anna's Archive, unquote,
Speaker 1: the hackers communicated. That's interesting because when they say that
Speaker 1: it's only, again, this is from the hackers, when they
Speaker 1: say that it's aimed at preservation, it sounds like
Speaker 1: they're trying to present this as an altruistic endeavor. We
Speaker 1: want to make sure that nothing ever happens to this music.
Speaker 1: We want to make sure... Again, I'm not, please, I'm
Speaker 1: not justifying what they did. Don't misunderstand me. I'm just
Speaker 1: saying that this sounds like this is what they're presenting.
Speaker 1: You know, we want to make sure nothing happens to
Speaker 1: this music, you know, because in theory Spotify can remove
Speaker 1: anything anytime it wants, or an artist can remove anything
Speaker 1: anytime it wants, and then where does it go? What
Speaker 1: if they just delete it and then that's it. It's
Speaker 1: just gone. It's not available anywhere else, right, So they're
Speaker 1: trying to preserve all this music. That's what
Speaker 1: the hackers have communicated. It says here, unsurprisingly, Spotify
Speaker 1: and especially rights holders, have plenty to say about those plans,
Speaker 1: as noted by Third Chair head Yoav Zimmerman. Yoav Zimmerman. However,
Speaker 1: whatever takedowns and legal actions follow, the damage is already done. Technically,
Speaker 1: Anna's Archive claims that it doesn't host any copyrighted materials,
Speaker 1: instead purportedly indexing metadata that is already publicly available. Direct
Speaker 1: hosting or not, some of the project's supporters are lamenting
Speaker 1: the Spotify circumvention and the possibility that it will, quote, ruin
Speaker 1: the actual important literary archive, unquote, by encouraging aggressive litigation.
Speaker 1: Zimmerman wrote, quote, the data is circulating on P two
Speaker 1: P networks, and there is no putting the, putting
Speaker 1: this back in Pandora's box. Anyone can now, in theory,
Speaker 1: create their own personal free version of Spotify, all music
Speaker 1: up to twenty twenty five, with enough storage and a
Speaker 1: personal media streaming server like Plex. The only real barriers
Speaker 1: are copyright law and fear of enforcement, unquote. You know,
Speaker 1: and I'm sure I'm not the only one who this
Speaker 1: has occurred to, to whom this has occurred. You know
Speaker 1: what this reminds me of? We're going back. We're going back. Jeez,
Speaker 1: how many years? Thirty years, maybe twenty five years? Napster.
Speaker 1: This reminds me of Napster when Napster first became a thing,
Speaker 1: file sharing, peer-to-peer file sharing online, and the
Speaker 1: whole music industry freaked out because all of a sudden,
Speaker 1: everything was free online through Napster. Now, obviously a lot's
Speaker 1: changed since then, and the music industry had to figure
Speaker 1: out ways to adjust to the new reality and the
Speaker 1: Internet and all of it. But when I see this,
Speaker 1: anyone can now, in theory, create their own personal free
Speaker 1: version of Spotify, all music up to twenty twenty five,
Speaker 1: with enough storage and a personal media streaming server like Plex.
Speaker 1: The only real barriers are copyright law and fear of enforcement.
Speaker 1: Let's see, there's a little... Do we have time? Oh yeah,
Speaker 1: we have time. There's a little bit more here. Oh sorry, okay,
Speaker 1: here we go. Perhaps more pressingly, in the AI age,
Speaker 1: the massive collection of audio could theoretically be used to
Speaker 1: train generative models and fuel additional unauthorized sound-alike outputs,
Speaker 1: a particularly significant issue if the involved platforms are based
Speaker 1: in countries with inadequate IP protections. I'm telling you, in
Speaker 1: some ways, this is Napster all over again. This is
Speaker 1: Napster in the AI age. It says here, one section
Speaker 1: of the Anna's Archive site says, quote, it is well
Speaker 1: understood that LLMs, which is large language models, like, you know, ChatGPT
Speaker 1: and others that suck up all this information. Okay,
Speaker 1: it is well understood that LLMs thrive on high quality data.
Speaker 1: We have the largest collection of books, papers, magazines, et
Speaker 1: cetera in the world, which are some of the highest
Speaker 1: quality text sources, unquote. According to the same site, Anna's
Speaker 1: Archive promptly put out the metadata and, with three hundred
Speaker 1: terabytes worth of audio files, is releasing in order of popularity.
Speaker 1: In other words, the full extent of the episode's fallout
Speaker 1: remains to be seen. And as initially mentioned, Spotify
Speaker 1: confirmed the unauthorized access, but not where things go from
Speaker 1: here, in a detail-light statement. In a detail-light statement, yeah,
Speaker 1: they didn't really say much. The Spotify spokesperson said, quote,
Speaker 1: an investigation into unauthorized access identified that a third party
Speaker 1: scraped public metadata and used illicit tactics to circumvent DRM
Speaker 1: to access some of the platform's audio files. We are
Speaker 1: actively investigating the incident, unquote. And again, as I pointed
Speaker 1: out earlier, what is missing from that statement? Anything about
Speaker 1: how we will prevent this from happening again.