Podcast: Inventory. Librarians to the Rescue

Starting today, podcasts distributed under the random:stream section have a new home. This short tale — with some technical details and a surprise ending — explains why and how.

For nearly 7 years, I’ve been using the popular service SoundCloud to distribute audio files for podcasts hosted on the Polish-language version of this site. But after yesterday’s conversation with an AI agent, I let myself be persuaded to switch to a new model.

Before diving into the specifics, I’d like to share some technical knowledge about how this whole podcasting business works.

What’s the deal with podcasts?

When we want to publish audio content online, we need a platform to which we can upload episodes and which will take care of storing, presenting and distributing them.

From a technical point of view, such a platform is a cheerful bunch of computers permanently connected to the Internet. On their disks we will find a server application that generates the pages people visit, as well as the multimedia material that the application presents to visitors.

Storing is simply putting sound files on the disks of computers permanently connected to the global network. It is from there that the successive episodes of our work will be downloaded. In addition, these services often transcode the files uploaded by users, e.g. converting them to a more common format. When we post a high-quality 500 MB WAV file from a digital voice recorder, the service can turn it into a 25 MB MP3 version that listeners will be able to download much more quickly.
Presenting means that the episodes we post appear on the website of the chosen platform, along with their descriptions, graphics and a player, so anyone can listen to them. The platform generates a web page on the fly where the published podcasts are shown. The player refers indirectly (through the file-serving service) to the multimedia repository described above.
Distribution involves publishing on the Internet an up-to-date list of episodes with links to the files (usually in MP3 format) and additional links to the pages responsible for presentation. This is done using RSS feeds, which are web resources that comply with the standard called Really Simple Syndication. Such a feed is identified by a URL that looks like a “web page address”, but instead of leading to content intended for human reading, it leads to a file in XML format intended to be read by applications installed on computers or smartphones.

If it were not for the last step, our podcasts would be associated with only one “broadcaster”, and if we wanted to use several distributors, we would have to manually post each episode to their platforms. The RSS standard lets the source platform list new material, and other services can create podcast pages with embedded players — ready to play.

Redistribution

The RSS feed file contains the titles, descriptions and cover graphics of successive episodes, but most importantly URL references to publicly accessible audio files that are “lying” on our source platform.

There is no magic in it. This is what a snippet describing one episode looks like, along with a link to the MP3 file:

<item>
  <title>random:self – #002 – Agile, Lean, Organic startup</title>
  <pubDate>Sun, 16 Jul 2023 17:00:00 +0200</pubDate>
  <enclosure
    type="audio/mpeg"
    url="https://randomseed.pl/pod/randomself-002.mp3" length="0" />
 […]
</item>
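
To show how an application might read such an entry, here is a minimal sketch in Python using only the standard library. It parses the item above (with the elided part left out) and pulls out the fields a podcast client typically cares about. The function name `read_item` and the dictionary keys are my own illustrative choices, not part of any standard.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# The item from the snippet above; the elided part is simply omitted.
ITEM_XML = """
<item>
  <title>random:self – #002 – Agile, Lean, Organic startup</title>
  <pubDate>Sun, 16 Jul 2023 17:00:00 +0200</pubDate>
  <enclosure
    type="audio/mpeg"
    url="https://randomseed.pl/pod/randomself-002.mp3" length="0" />
</item>
"""


def read_item(xml_text):
    """Return the title, date and audio link of a single <item> element."""
    item = ET.fromstring(xml_text.strip())
    enclosure = item.find("enclosure")
    return {
        "title": item.findtext("title"),
        # pubDate uses the e-mail style date format (RFC 822 family).
        "published": parsedate_to_datetime(item.findtext("pubDate")),
        "audio_url": enclosure.get("url"),
        "mime_type": enclosure.get("type"),
    }


episode = read_item(ITEM_XML)
print(episode["title"])
print(episode["audio_url"])
```

The `enclosure` element is the important part: its `url` attribute is what players and other platforms actually download.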



Usually the platform of our choosing, responsible for presenting our podcasts, also creates a descriptive RSS feed of all episodes and makes it available under some URL. That way, we don’t have to worry about technical details. YouTube, Spotify, SoundCloud and many other similar services do this. Once we have the feed URL, we can register it with other platforms.

For example, by publishing in SoundCloud, we are able to quickly create distribution channels on Apple Podcasts, Spotify or YouTube. Paste the channel URL, and those platforms will generate their own versions of the podcast page and periodically check our source RSS for new episodes. If a new episode has appeared, the intermediary platform can inform the subscribers, trigger a notification, and so on.
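
The periodic check described above can be sketched roughly as follows. This is only an assumption about how such a checker might work, not a description of what any particular platform does: it compares the enclosure URLs found in the feed with the ones it already knows. The feed content and the `example.com` addresses are hypothetical, and a real checker would fetch the feed over HTTP instead of using a string.

```python
import xml.etree.ElementTree as ET


def new_enclosures(feed_xml, seen_urls):
    """Return enclosure URLs from the feed that are not in seen_urls yet."""
    channel = ET.fromstring(feed_xml.strip()).find("channel")
    fresh = []
    for item in channel.findall("item"):
        enclosure = item.find("enclosure")
        if enclosure is None:
            continue  # an item without an audio file is skipped
        url = enclosure.get("url")
        if url and url not in seen_urls:
            fresh.append(url)
    return fresh


# Hypothetical two-episode feed used only for illustration.
FEED = """
<rss version="2.0">
  <channel>
    <title>Example podcast</title>
    <item>
      <title>Episode 1</title>
      <enclosure type="audio/mpeg" url="https://example.com/ep1.mp3" length="0" />
    </item>
    <item>
      <title>Episode 2</title>
      <enclosure type="audio/mpeg" url="https://example.com/ep2.mp3" length="0" />
    </item>
  </channel>
</rss>
"""

seen = {"https://example.com/ep1.mp3"}  # what was known after the last check
fresh = new_enclosures(FEED, seen)
seen.update(fresh)                      # remember them for the next check
print(fresh)
```

Anything returned by `new_enclosures` is a candidate for a notification; running the check again with the updated `seen` set returns nothing until the feed changes.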

Usually the big platforms are both source providers and distributors. It is up to us which one we choose as the base for presenting the podcast, and which services will use our RSS to download the content hosted there and present it on their own pages.

Larger intermediary platforms that distribute podcasts will mostly download the files indicated in our RSS feed and make copies of them, so that listener satisfaction doesn’t depend on an external data source. Smaller ones will display a web-based media player that directly streams episodes from the URLs listed in the RSS.

Another interesting case is podcast listening apps, available for both computers and mobile devices. They bypass content presentation services and download podcast episodes from the addresses found in the RSS feeds.

Storage costs

Let us return to the case of random:stream, the podcasts of this site. Years ago, I chose SoundCloud as a distribution platform because it gives you a lot of control over the quality of your uploads, has a good player, and generates an RSS feed for you.

I would put each new episode — such as the random:press — there, as a high-quality audio file, describe it, and the service would make the MP3 version and metadata appear in the RSS feed, the URL of which was posted on the site. In addition, I inserted a SoundCloud player on the episode pages of my site.

After a while, I wished I had more control over the shape of the XML file that contains the RSS data, so I built a suitable template and started distributing RSS on my own, treating SoundCloud as a “storehouse” for audio files. Occasionally I would get so-called subs on SoundCloud itself, but my main audience was visitors to the site and listeners from other services (e.g. Spotify).

In practice it looked like this: I published each episode to SoundCloud and then manually added to my website a reference to the MP3 version stored on that service. The RSS generator (running within my site) collected this information and placed it in the channel file. From there, other services could use it and make nice playlists.
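
A generator of this kind does not have to be complicated. The sketch below is not the template used on this site, only a minimal illustration of the idea in Python: it takes a list of episode records and produces an RSS 2.0 document with one `item` and one `enclosure` per episode. The function name `build_feed`, the record keys and the channel description are assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def build_feed(title, link, description, episodes):
    """Build a minimal RSS 2.0 document from a list of episode dicts."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description
    for ep in episodes:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = ep["title"]
        # format_datetime() emits the date format that pubDate expects.
        ET.SubElement(item, "pubDate").text = format_datetime(ep["published"])
        ET.SubElement(
            item,
            "enclosure",
            type="audio/mpeg",
            url=ep["audio_url"],
            length=str(ep.get("size_bytes", 0)),
        )
    return ET.tostring(rss, encoding="unicode")


episodes = [
    {
        "title": "random:self – #002 – Agile, Lean, Organic startup",
        "published": datetime(2023, 7, 16, 17, 0, tzinfo=timezone(timedelta(hours=2))),
        "audio_url": "https://randomseed.pl/pod/randomself-002.mp3",
    }
]

feed_xml = build_feed(
    "random:stream",
    "https://randomseed.pl/pod/",
    "Illustrative channel description",  # placeholder text, not the real one
    episodes,
)
print(feed_xml)
```

Because the audio address is just a field in the episode record, moving the files to another host later means changing data, not the generator.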

Everything worked smoothly, until one day I exceeded my total broadcast minutes on SoundCloud, and had to choose a paid plan. Somehow I ended up on the most fully loaded plan — and later on, an even more loaded one. At that point, the price started to feel a bit much.

The problem was not that I should get more for the price, but that I was getting features I didn’t need. SoundCloud is an artist community and a music distribution platform. In my case, I had taken over the distribution role years ago, while SoundCloud served me as MP3 storage and as a media player embedded in episode pages. I also didn’t use the mastering option added later, as my productions are mainly voice-over.

It turned out I wasn’t the ideal customer, i.e. one willing to pay for mechanisms to distribute content and build a community around published music. I asked the AI agent about possible alternatives and then realized that I had actually put most of the effort into self-distribution a few years back.

As for presentation, meaning the media player, that part solved itself. A few months ago I had configured my YouTube channel to read the RSS feed published on randomseed.pl/pod/ and build a playlist with copies of all published episodes. So I already had a player; all I had to do was embed it on the episode pages.

All that remained was a place to store files. And this is where they came to the rescue…

…The American Librarians!

Dear Reader, are you familiar with a service called Internet Archive? It’s an online treasure trove of content, initiated in 1996 by Brewster Kahle. For many years now, the service has been maintained by a non-profit organization dedicated to archiving Internet resources. Yes, those guys are basically making a backup of the whole Web.

What’s more, in addition to meticulously taking snapshots of web pages, there is also a campaign to archive the content of books, sound recordings and videos.

It turns out that by registering an account with Internet Archive, we gain access to a panel that allows us to archive digital content we have the rights to. For example, to our own podcasts!

Long-time netizens might feel a hint of nostalgia when using this digital library — parts of the interface feel straight out of the early 2000s. Just don’t expect blazing speeds: the Internet Archive isn’t built for high-performance transfers, so delays and downtime may happen.

After converting the podcasts to MP3, I uploaded them to the Archive and replaced the links to them across all podcast pages. Previously these URLs led to SoundCloud; now they lead to the Internet Archive. In addition, I removed (actually disabled) SoundCloud’s existing embedded player, replacing it with a YouTube frame. This was possible because YouTube is one of the platforms that helps distribute my podcasts via RSS.
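
The link replacement itself can be sketched in a few lines of Python. The sketch assumes that uploaded files are reachable under the Internet Archive's `https://archive.org/download/<identifier>/<filename>` address pattern. The item identifier, the episode slug and the old host name below are hypothetical placeholders, not the real values used on this site.

```python
from urllib.parse import quote

# Assumed address pattern for files stored in an Internet Archive item.
ARCHIVE_DOWNLOAD = "https://archive.org/download/{identifier}/{filename}"


def archive_url(identifier, filename):
    """Build the download address of one file inside an Archive item."""
    return ARCHIVE_DOWNLOAD.format(
        identifier=quote(identifier), filename=quote(filename)
    )


def relink(episodes, identifiers):
    """Return copies of the episode records pointing at the Archive.

    Episodes without a known identifier keep their old audio_url.
    """
    updated = []
    for ep in episodes:
        new_ep = dict(ep)  # do not modify the original record
        identifier = identifiers.get(ep["slug"])
        if identifier is not None:
            new_ep["audio_url"] = archive_url(identifier, ep["slug"] + ".mp3")
        updated.append(new_ep)
    return updated


# Hypothetical data: one episode and the Archive item it was uploaded to.
episodes = [
    {"slug": "randomself-002", "audio_url": "https://old-host.example/randomself-002.mp3"}
]
identifiers = {"randomself-002": "randomself-002-example"}

relinked = relink(episodes, identifiers)
print(relinked[0]["audio_url"])
```

Since the RSS feed and the episode pages are generated from the same records, updating the `audio_url` field is enough for both to point at the new location.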

Epilogue

This small change made me realize how my perception of Internet services has changed over the years. I have turned from a technical user and enthusiast of the Internet into a consumer of digital goodness, with all the associated pros and cons.

Switching to another model was easy in this case, but the convenience of having someone handle the space for my sound files won out of sheer necessity.

I forgot to mention why I replaced the player with a more popular one. One could argue that since the content of the site is entirely in my hands, I should rather use a locally hosted player, for example one written in JavaScript, and enjoy full control.

The problem with the above is that I would then have to take care of the speed of transfers and guarantee user comfort.

A service meant for archiving was never designed to guarantee continuous availability and high download speeds. YouTube, on the other hand, is a massive platform that, first, isn’t likely to shut down anytime soon (which could happen to some “free hosting” or “best free media player” service, for example); second, keeps its player maintained; and third, makes and hosts copies of all podcast episodes from my channel.

Noteworthy here is the issue of identifying the digital content that belongs to us. RSS gives us a way to point to our work online, so we don’t have to hand this right over to whoever operates the file hosting or provides the player. Content no longer depends on location, but its availability still depends on the owners of network services. Maybe only the rise of Web3 and truly decentralized services will change that someday, but we at least have the opportunity to react and change platforms.

This is a kind of surprising twist of fate, because RSS technology had time to spread before podcasting became fashionable and profitable. If early podcasters and their listeners hadn’t embraced the open spirit of those early “Internet radio stations”, we would probably have a situation where creators are locked into the platforms they initially chose, and the content is technically under the control of the chosen service provider.

This small change (as in the case described above: a few new URLs, a different player, a different file hosting service) does not require a revolution, only the awareness that it can technically be done. In this way, we regain control over the form and availability of content. At this point, we are no longer just a user of the platform; we become the alternative.
