TIL: manually getting YT music recap playlist through code and google takeout

i listen to a decent amount of music. in 2025 i listened to a lot of music in the first half because I had my ca final exams in may 2025 and i absolutely cannot self study without sountrack music. i listen to hans zimmer, ludwig, daft punk to name a few.

each year, i wait for the yt music recap feature because it gives me a playlist of the top 100 most listened songs of the year. a musical memory of sorts if i ever want to go down the nostalgia road in the future.

however this year - 2025 - i got the yt music recap slides but not the playlists. i waited, hoping the playlist would appear, but to no avail.

i remembered from earlier that there was this google website that lets you export all of your data - its called google takeout. i decided to use llms to manually make a playlist basis that data.

so i went to takeout and downloaded my data. i got 2 formats - json and html. json is the better language for machines and code but for some reason that i don’t know why the json file only went back to april 2025 history. i asked gemini 3 pro to give me the code, copy pasted it to a text doc and saved it as .py, i opened powershell, ran the code, expected it to take some time but there it was - the top 100 songs - the code ran within less than a second.

it felt truly magical. awe like.

but this wasn’t the whole truth (didn’t intend to but i realised i just wrote down the name of the currently in good demand health brand).

so i asked gemini to write me a code for the html file because it had the full 2025 year data. it used beautifulsoup library in the first try but the code took >5 minutes. so then i stopped it and told it and it changed the code to regex.

the regex code failed several times but all i had to do was paste the error/output back to gemini. it iterated. and there i had it finally. my yt music recap playlist of 2025.

the top 10 songs extract from the output - — Top 100 Songs (2025-01-01 to 2025-12-31) — Found 20434 valid plays in this date range.

Open Hearts - The Weeknd - Topic (60 plays)
FOILS - Ludwig Göransson - Topic (46 plays)
Fission - Ludwig Göransson - Topic (45 plays)
Groves - Ludwig Göransson - Topic (43 plays)
Kiss the Ring - Hans Zimmer - Topic (43 plays)
747 - Ludwig Göransson - Topic (43 plays)
Meeting Kitty - Ludwig Göransson - Topic (40 plays)
Lose My Mind (feat. Doja Cat) (From F1® The Movie) - Don Toliver - Topic (39 plays)
TRUCKS IN PLACE - Ludwig Göransson - Topic (39 plays)
THE ALGORITHM - Ludwig Göransson - Topic (39 plays)

whats amazing is how this got me into flow state. i feel contempt. not pleasure or happiness of sort but so relieved and contempt and motivated and awelike.

so much so that i decided it was time to add a TIL section to my personal website. the brain is weird.

edit - since the code worked, it was natural to play along. althought the data attributes i have aren’t a lot - i only have the song name and when it was played - there’s still a lot i can do.

first i also created a playlist without the soundtracks - fairly easy by asking the llm to remove songs by hans zimmer, ludwig and other instrumental artists. then the llm suggested a script to get -

top songs by each month - very interesting to see it. the brain is a victim of availability and recency bias and so actually looking at the data that defined you during that period of time is interesting and kind of cognitive dissonance like.
hourwise distribution of songs listened during. got a u-shaped distribution (peaks at the sides and zero at the middle. (basically im a night owl or atleast was during the study phase)
top 10 artists - top 4 matched with yt music recap slides, 5th didn’t. most probable explanation is that the data i have doesn’t take into account the amount of time i listened to a song while yt’s code must have that as well. which is ironic about how takeout is still contrained by google’s choice. maybe its for the better. i have no clue.

time to sleep.