Wednesday, June 01, 2011

Backup Grooveshark Playlist #2

I just realised that a URL copied from the browser address bar on Grooveshark contains a #. If you download this full URL, the HTML you get back does not contain the playlist.
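A likely explanation, assuming wget treats the URL the way HTTP clients normally do, is that everything after the # is a fragment that is never sent to the server, so only the front page is requested. The short sketch below just splits the example URL at the # to show which part would be requested; the variable names are made up for illustration.

```shell
#!/bin/sh
# Example URL as copied from the address bar (taken from the usage example in this post).
url='http://grooveshark.com/#/playlist/Adam+Curtis/54597002?src=5'

# Everything before the first "#": the part an HTTP client actually requests.
requested=${url%%#*}

# Everything after the first "#": the fragment, which stays on the client side.
fragment=${url#*#}

echo "requested: $requested"
echo "fragment:  $fragment"
```

The playlist path ends up entirely in the fragment, which would explain why the downloaded page has no playlist in it.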

To get around this I have written a second script that cleans the URL so that it is identical to a normal Grooveshark link (i.e. one that someone sends to you or posts online).

New script:

# Clean the URL first by removing the "#/" that the address bar adds

cleanURL=`echo "$1" | sed "s/#\///"`

# Pull the name of the playlist from the clean URL (works with or without the # in the input)

playlistname=`echo "$cleanURL" | sed "s/\// /g" | awk '{print "Grooveshark_"$4}'`

# Grab the HTML from the clean URL

wget -O "$playlistname.html" "$cleanURL"

# Now parse the HTML and extract the songs, then strip the HTML tags

awk '/Songs on Playlist/, $NF ~ /noscript/' "$playlistname.html" | sed -e :a -e 's/<[^>]*>//g;/</N;//ba' > "$playlistname"

The script is run in the same way as the first one, with the playlist URL as its only argument.
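The two string steps can be tried on their own without touching the network. This is only a sketch: the function names clean_url and playlist_name are made up here, and the name is taken from the cleaned URL by splitting on "/" with awk -F/, which is a slightly different route from the sed and awk pipeline in the script.

```shell
#!/bin/sh
# Hypothetical helper names, used only for this illustration.

# Remove the first "#/" so the URL looks like a normal shared link.
clean_url() {
    printf '%s\n' "$1" | sed 's|#/||'
}

# Split a cleaned URL on "/": "http:", "", host, "playlist", name, id...
# The playlist name is therefore the fifth field.
playlist_name() {
    printf '%s\n' "$1" | awk -F/ '{print "Grooveshark_" $5}'
}

url='http://grooveshark.com/#/playlist/Adam+Curtis/54597002?src=5'
clean=$(clean_url "$url")

echo "$clean"
playlist_name "$clean"
```

Because clean_url leaves a URL without "#/" untouched, the same two calls should also work on a link that was already clean.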

Back up Grooveshark playlist

I've made a script that backs up any Grooveshark playlist and makes both a plain text and an HTML copy (the latter can be opened in a browser, where it will be loaded straight away by Grooveshark).

The commands are:

# Pull the name of the playlist from the URL

playlistname=`echo "$1" | sed "s/\// /g" | awk '{print $4}'`


# Grab the HTML from the URL


wget -O "$playlistname.html" "$1"


# Now parse the HTML and extract the songs, then strip the HTML tags


awk '/Songs on Playlist/, $NF ~ /noscript/' "$playlistname.html" | sed -e :a -e 's/<[^>]*>//g;/</N;//ba' > "$playlistname"

cat "$playlistname"


Put these commands inside a file (called, for example, "gsbackup.sh") and make it executable:

chmod u+x gsbackup.sh

Then run it with the URL of the Grooveshark playlist:

./gsbackup.sh 'http://grooveshark.com/playlist/Adam+Curtis/54597002?src=5'

This should produce the files Adam+Curtis (plain text) and Adam+Curtis.html.

To convert playlists from iTunes, Spotify and Last.fm into Grooveshark, I recommend the Groovylists service.
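The parsing step can be tested on its own with a small made-up page. The sample markup below is an assumption and the real Grooveshark HTML may look different; extract_songs is a hypothetical name for the awk and sed pipeline, here using the well-known sed one-liner for stripping tags.

```shell
#!/bin/sh
# Made-up sample page; the real Grooveshark markup may differ.
sample='<html><body>
<h2>Songs on Playlist</h2>
<ul>
<li><a href="/s/1">First Song</a> by <a
 href="/artist/1">Some Artist</a></li>
<li><a href="/s/2">Second Song</a></li>
</ul>
</noscript>
<p>footer</p>
</body></html>'

# awk keeps the lines from "Songs on Playlist" up to the first line whose
# last field contains "noscript". sed then deletes every tag, pulling in the
# next line whenever a tag is split across a line break.
extract_songs() {
    awk '/Songs on Playlist/, $NF ~ /noscript/' |
        sed -e :a -e 's/<[^>]*>//g;/</N;//ba'
}

printf '%s\n' "$sample" | extract_songs
```

Lines that held only tags come out empty, so the text file may contain blank lines between the songs.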