Quick Perl script: Scrape your photos from Twitpic

I've been following Noah Everett, the creator of Twitpic, on Twitter. Although he provides a great service, it does appear to be a one-man show. There is every chance he could get fed up and walk away, and I doubt the terms of service give users any way of getting their photos back if that happened.

Most people use Twitpic for throwaway snapshots posted from their handheld devices, but I use a Twitter account with protected updates to create a timeline of my daughter growing up. I photograph her day to day on my iPhone, and because my Twitter client doesn't offer any other option, the photos end up on Twitpic. These photos are memories I am not prepared to lose.

Below is the script I run once a week to ensure my photos are backed up.


use strict;
use warnings;
use utf8;

use LWP::Simple;    # get()
use XML::Simple;    # XMLin()
use Date::Parse;    # str2time()
use Date::Format;   # time2str()

# RSS feed listing the account's photos.
my $feed_url = "http://twitpic.com/photos/darrenferguson/feed.rss";
my $content = get($feed_url);
die "Couldn't fetch $feed_url\n" unless defined $content;

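# ForceArray keeps {item} an array ref even when the feed has only one item.
my $xml = XMLin($content, ForceArray => ['item']);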

foreach my $item (@{$xml->{channel}->{item}}) {
    
    my $link = $item->{link};
    my $date = $item->{pubDate};
    
    $date = str2time($date);
    $date = time2str('%Y-%m-%d_%H-%M-%S_', $date);
    
    # The photo ID is the last path segment of the link.
    my ($id) = $link =~ m|.*/(.*$)|;
    next unless defined $id && length $id;
    
    
    my $html = get($link);
    
    
    if(defined $html) {
        while($html =~ s|(<img.*?>)||ism) {
            my $tag = $1;
            if($tag =~ m|class\=\"photo-large\"|) {
                # Skip the tag if it has no src attribute.
                my ($src) = $tag =~ m|src\=\"(.*?)\"|;
                next unless defined $src;
                my $img_data = get($src);
                
                if(defined $img_data) {           
                    my $fn = "$date-$id.jpg";
                    
                    # Only write photos that have not been saved already.
                    if(!(-f $fn)) {
                        open my $fh, '>', $fn or die "Couldn't write $fn: $!";
                        binmode $fh;
                        print $fh $img_data;
                        close $fh;
                    }
                }
            }
        }
    }
}
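
The part of the script most likely to break is the step that finds the full-size image on each photo page, so it may help to see it in isolation. The following is only a sketch: extract_photo_src is a hypothetical helper name, the sample markup is made up, and the class="photo-large" convention is assumed from the script above rather than checked against the live site. Unlike the script, it walks the tags with a /g match instead of deleting them from the page as it goes.

```perl
use strict;
use warnings;

# Hypothetical helper: return the src of the first <img> tag whose
# class attribute is "photo-large", or undef if there is none.
sub extract_photo_src {
    my ($html) = @_;

    # /g in scalar context visits one <img> tag per iteration
    # and leaves $html unmodified.
    while ($html =~ m|(<img.*?>)|gis) {
        my $tag = $1;
        next unless $tag =~ m|class="photo-large"|;

        my ($src) = $tag =~ m|src="(.*?)"|;
        return $src if defined $src;
    }
    return;
}

# Made-up sample markup, for illustration only.
my $sample = <<'HTML';
<div><img class="logo" src="http://example.com/logo.png">
<img class="photo-large" src="http://example.com/photos/abc123.jpg" alt="photo"></div>
HTML

print extract_photo_src($sample), "\n";
```

Keeping this step in its own function means that if the page layout changes, only the regular expressions in one place need updating.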