science


A community to post scientific articles, news, and civil discussion.

rule #1: be kind

founded 2 years ago


Mapping the Podcast Ecosystem with the Structured Podcast Research Corpus (arxiv.org)

submitted 1 day ago by FactChecker@lemmy.world to c/science@lemmy.world


Podcasts provide highly diverse content to a massive listener base through a unique on-demand modality. However, limited data has prevented large-scale computational analysis of the podcast ecosystem. To fill this gap, we introduce a massive dataset of over 1.1M podcast transcripts that is largely comprehensive of all English language podcasts available through public RSS feeds from May and June of 2020. This data is not limited to text, but includes metadata, inferred speaker roles, and audio features and speaker turns for a subset of 370K episodes. Using this data, we conduct a foundational investigation into the content, structure, and responsiveness of this ecosystem. Together, our data and analyses open the door to continued computational research of this popular and impactful medium.
