Perl Jobs Auto-Emailer
August 3, 2007
Here's an interesting Perl mini application. A friend of mine, who's somewhat jealous that I'm working for Discovery remotely from home, wants to find a new Perl job. There's a great web site for that, http://jobs.perl.org, but the secret is out, and a lot of people check that site all day. He doesn't want to have to check it all day himself just to get ahead of the blizzard of resumes sent in for each job. I figure this is a job for Perl.
What we want is a small application that runs every hour by cron, checks an RSS feed from the Perl jobs web site, looks for any new jobs posted, and emails a summary.
First, the usual (and mandatory) intro block followed by four useful CPAN modules:
#!/usr/bin/perl
use strict;
use warnings;
use LWP::Simple 'get';
use XML::RSS;
use List::MoreUtils 'any';
use Mail::Mailer;
I need some way to keep track of previously seen job postings so I don't send duplicate emails. Since the link to each job post is unique, I'm going to keep a text file of the links I've already seen. When the application starts up, it loads all the previously seen links into an array.
use constant LINKS => '/some/directory/links.log';

open( my $seen_links, '<', LINKS )
    or die "Can't read from log $!";
my @links = map { chomp; $_; } ( <$seen_links> );
close($seen_links);
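One wrinkle with the snippet above: the open dies if the log file doesn't exist yet, which will be the case on the very first run. Here's a minimal sketch of one possible way around that, assuming it's acceptable to treat a missing log as "nothing seen yet." The load_seen_links helper is a name invented for this illustration, not part of the original program, and it returns a hash reference rather than an array so that later lookups are a single hash access.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not in the original program): returns a hash ref
# whose keys are the links already seen. A missing log file is treated
# as "nothing seen yet" instead of a fatal error.
sub load_seen_links {
    my ($path) = @_;
    my %seen;
    return \%seen unless -e $path;    # first run: no log yet

    open( my $fh, '<', $path )
        or die "Can't read from log $path: $!";
    while ( my $line = <$fh> ) {
        chomp $line;
        next unless length $line;     # skip blank lines
        $seen{$line} = 1;
    }
    close($fh);
    return \%seen;
}

# The path is a placeholder, as in the post.
my $seen = load_seen_links('/some/directory/links.log');
printf "%d link(s) already seen\n", scalar keys %$seen;
```

With this shape, checking whether a link has been seen becomes $seen->{$link} instead of a scan over an array.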
Now it's time to go fetch the RSS XML, parse it, and find all the jobs that are new. What I'm doing here is taking all the "items" in the RSS feed and checking their links against the links I've already seen. If I haven't seen a link yet, I store that item in an array.
my $rss = XML::RSS->new();
$rss->parse( get('http://jobs.perl.org/rss/standard.rss') );
my @new_jobs = grep {
    my $link = $_->{'link'};
    not any { $link eq $_ } @links;
} @{ $rss->{'items'} };
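The grep above rescans the whole list of seen links for every item in the feed. That's fine for a feed this small, but as a hedged alternative, here's a sketch that does the same filtering against a hash of seen links. The find_new_jobs helper and the sample items are made up for illustration; only the 'link' key mirrors what the program reads from $rss->{'items'}.

```perl
use strict;
use warnings;

# Hypothetical helper: keep only the items whose link is not a key in %$seen.
sub find_new_jobs {
    my ( $items, $seen ) = @_;
    return grep { !$seen->{ $_->{'link'} } } @$items;
}

# Made-up sample data shaped like the entries in $rss->{'items'}.
my @items = (
    { 'title' => 'Perl Developer',  'link' => 'http://example.com/job/1' },
    { 'title' => 'Perl Programmer', 'link' => 'http://example.com/job/2' },
);
my %seen = ( 'http://example.com/job/1' => 1 );

my @new_jobs = find_new_jobs( \@items, \%seen );
print "$_->{'title'}\n" for @new_jobs;
```

This also drops the dependency on List::MoreUtils for this step, at the cost of building the hash up front.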
Finally, the fun part. Once I have a list of new job postings (assuming there are new postings), I'm going to create an email and send it off with all the new job postings summarized.
if (@new_jobs) {
    my $mailer = Mail::Mailer->new();
    $mailer->open({
        'From'    => 'somebody@example.com',
        'To'      => 'user@example.com',
        'Subject' => 'Job' . ( ( @new_jobs > 1 ) ? 's' : '' ),
    }) or die "Can't open mail $!\n";

    open( my $new_links, '>>', LINKS )
        or die "Can't write to log $!";

    foreach my $job (@new_jobs) {
        my $attrib = sub {
            return
                $job->{'http://jobs.perl.org/rss/'}{ $_[0] }
                || 'Undefined';
        };

        print {$new_links} $job->{'link'}, "\n";
        print {$mailer}
            $job->{'title'}, "\n",
            $job->{'link'}, "\n\n",
            ' Company: ', $attrib->('company_name'), "\n",
            'Location: ', $attrib->('location'), "\n",
            '   Hours: ', $attrib->('hours'), "\n",
            '   Terms: ', $attrib->('employment_terms'), "\n",
            '  Posted: ', $attrib->('posted_date'), "\n\n\n";
    }

    close($new_links);
    $mailer->close();
}
You'll notice that I use a closure just inside the foreach loop. I did that because I wanted to avoid writing the nested data structure lookup and the "or 'Undefined'" fallback five times. I'm kind of OCD like that. I guess it would have been fine to write all of that out inline in the last print statement.
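To see the closure in isolation, here's a small sketch using a made-up job hash shaped like what the program reads from the feed. The company name and field values are invented for illustration; the namespace key and the fallback string are the ones used in the program.

```perl
use strict;
use warnings;

# Made-up sample item shaped like one entry of $rss->{'items'}.
my $job = {
    'title'                     => 'Perl Developer',
    'http://jobs.perl.org/rss/' => {
        'company_name' => 'Example Corp',
        'location'     => '',
    },
};

# The closure captures $job, so each call only needs the field name.
my $attrib = sub {
    my ($field) = @_;
    return $job->{'http://jobs.perl.org/rss/'}{$field} || 'Undefined';
};

print ' Company: ', $attrib->('company_name'), "\n";
print 'Location: ', $attrib->('location'),     "\n";    # empty string falls back
print '   Hours: ', $attrib->('hours'),        "\n";    # missing key falls back
```

One caveat with this pattern: || also replaces a legitimate value of 0 or an empty string. On Perl 5.10 or later, the defined-or operator // falls back only when the value is undefined.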
So that's it. Fewer than 50 lines of code. Of course, it needs comments, POD, and better error checking and reporting, but it'll do for something quick and dirty.
Comments (2)
Colin Meyer wrote: Alternatively, you can subscribe to the email list:
jobs-subscribe@perl.org
For a list of many, many Perl related email lists, check out lists.perl.org.
Good point; the list subscription will do roughly the same thing. However, switch the URL in my program to this...
http://jobs.perl.org/rss/telecommute.rss
...and you only get telecommute positions, which is what my program should have been doing from the beginning.
Of course, you can achieve the same thing by subscribing to the list and filtering with procmail or similar.
TMTOWTDI.
