<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
    <title>The Party Line</title>
    <link rel="alternate" type="text/html" href="http://blog.tobez.org/" />
    <link rel="self" type="application/atom+xml" href="http://blog.tobez.org/atom.xml" />
    <id>tag:blog.tobez.org,2010-01-06://3</id>
    <updated>2009-08-26T20:55:41Z</updated>
    <subtitle>tobez&apos;s personal weblog</subtitle>
    <generator uri="http://www.sixapart.com/movabletype/">Movable Type 5.01</generator>

<entry>
    <title>Scraping ASP.NET sites with Perl</title>
    <link rel="alternate" type="text/html" href="http://blog.tobez.org/2009/08/scraping-aspnet-sites-with-perl.html" />
    <id>tag:blog.tobez.org,2009://3.109</id>

    <published>2009-08-26T19:37:37Z</published>
    <updated>2009-08-26T20:55:41Z</updated>

    <summary>Today at work I needed to locate and extract, automatically, some information from a website. There was no direct URL to the information I needed, some fields had to be filled and some POST forms had to be submitted. Normally...</summary>
    <author>
        <name>tobez</name>
        
    </author>
    
        <category term="Perl" scheme="http://www.sixapart.com/ns/types#category" />
    
    <category term="aspnet" label="ASP.NET" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="perlmodules" label="Perl modules" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="perl" label="perl" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="proxy" label="proxy" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en-us" xml:base="http://blog.tobez.org/">
        <![CDATA[<p>Today at work I needed to locate and extract, automatically, some
information from a website.</p>

<p>There was no direct URL to the information I needed,
some fields had to be filled and some POST forms had to be
submitted.</p>

<p>Normally I would use
<a href="http://search.cpan.org/dist/WWW-Mechanize/">WWW::Mechanize</a>
for such a task,
but in this particular instance the situation was made
somewhat less manageable because the site in question
was implemented with ASP.NET.</p>

<p>The problem with this is that every link
has an associated JavaScript event handler
which does some housekeeping,
assigns things to funnily named
hidden input fields like <code>__EVENTTARGET</code> and <code>__EVENTARGUMENT</code>
and then POSTs a form.</p>

<p>My first thought was to try and find a CPAN module which
handles those complications.
Not surprisingly, there is one, aptly named
<a href="http://search.cpan.org/dist/HTML-TreeBuilderX-ASP_NET/">HTML::TreeBuilderX::ASP_NET</a>.</p>

<p>According to its documentation, the module works in combination with
the standard
<a href="http://search.cpan.org/dist/libwww-perl/lib/LWP/UserAgent.pm">LWP::UserAgent</a>
and <a href="http://search.cpan.org/dist/HTML-Tree/">HTML::TreeBuilder</a>, and
converts ASP.NET JavaScript posting redirects into an
<a href="http://search.cpan.org/dist/libwww-perl/lib/HTTP/Request.pm">HTTP::Request</a>
object which can be fed to LWP::UserAgent&#8217;s request() method.
Just what the doctor ordered.</p>

<p>However, it turned out that my joy was a bit premature:</p>

<ul>
<li>it requires Perl 5.10, which we do not yet have on our production
systems;</li>
<li>documentation is incomplete and inaccurate at times - it insists on
calling its httpRequest() method httpResponse();</li>
<li>it fails its own tests, not only on the two machines where I tried to run
them, but also on a lot of other systems according to CPAN Testers.</li>
</ul>

<p>After a bit of pondering I decided that spending time
on fixing the HTML::TreeBuilderX::ASP_NET module
would be counter-productive - I needed working code soon.</p>

<p>So what to do?</p>

<p>One thing we should keep in mind is that those JavaScript postbacks
do not do anything fancy.
The hidden fields that are filled in depend on what was clicked
on the page, nothing else.
After they are filled, a normal POST occurs.</p>

<p>So if we know what to POST, we could just use WWW::Mechanize
and get the job done easily and quickly.</p>
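
<p>To make this concrete, here is a minimal sketch of such a postback
built by hand with <code>HTTP::Request::Common</code> from libwww-perl.
The URL, the control name <code>ctl00$lnkNext</code> and the
<code>__VIEWSTATE</code> value are made-up placeholders, not taken from
any real site; in practice they would be copied from the page being scraped.</p>

<pre><code>#! /usr/bin/perl
use strict;
use warnings;
use HTTP::Request::Common qw(POST);

# All values below are hypothetical placeholders.
my $req = POST 'http://www.example.com/search.aspx', [
    __EVENTTARGET   =&gt; 'ctl00$lnkNext',     # the control that was "clicked"
    __EVENTARGUMENT =&gt; '',                  # optional argument for that control
    __VIEWSTATE     =&gt; 'dDwtMTIzNDU2Nzg5',  # opaque blob copied from the form
];

# Nothing is sent here; this only shows what would go over the wire.
print $req-&gt;as_string;
</code></pre>

<p>Nothing in this request is specific to JavaScript: it is an ordinary
form-encoded POST, which is why a plain HTTP client is enough.</p>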

<p>So the solution naturally splits into two parts - finding out
what fields to set, and automating the process.</p>

<p>The first part is to launch a browser, do clicking and entering
by hand, and capture what gets POSTed at each step.
This capturing could be done by a variety of methods:</p>

<ul>
<li>tcpdump/<a href="http://www.wireshark.org/">wireshark</a> - listen to &#8216;em on the wire!</li>
<li>having a proxy which outputs the POSTed parameters;</li>
<li>using a browser extension that shows POSTed parameters.</li>
</ul>

<p>I chose the second option,
since I already had a script similar to what I needed,
and since it makes it easy to filter out any parameters which
I did not want to see, like __VIEWSTATE,
which can easily be several kilobytes long.</p>

<p>Enter spyproxy.pl:</p>

<pre><code><span class="hi_comment">#! /usr/bin/perl</span>
<span class="hi_key1">use</span> strict;
<span class="hi_key1">use</span> warnings;
<span class="hi_key1">use</span> HTTP::Proxy;
<span class="hi_key1">use</span> CGI;

<span class="hi_key1">my</span> $proxy = HTTP::Proxy-&gt;<span class="hi_key2">new</span>(host =&gt; <span class="hi_string">"localhost"</span>);
$proxy-&gt;logmask(<span class="hi_number">32</span>); <span class="hi_comment"># 32 - FILTERS</span>
$proxy-&gt;push_filter(
        request =&gt; Spy::BodyFilter-&gt;<span class="hi_key2">new</span>(),
);
$proxy-&gt;start;

<span class="hi_key1">package</span> Spy::BodyFilter;
<span class="hi_key1">use</span> base <span class="hi_key3">qw</span>(HTTP::Proxy::BodyFilter);

<span class="hi_key1">sub</span> will_modify { <span class="hi_number">0</span> }

<span class="hi_key1">sub</span> filter
{
    <span class="hi_key1">my</span> ($me, <span class="hi_key2">undef</span>, $req) = @_;
    <span class="hi_key2">print</span> $req-&gt;method, <span class="hi_string">" "</span>, $req-&gt;uri, <span class="hi_string">"\n"</span>;
    <span class="hi_key1">return</span> <span class="hi_key1">unless</span> $req-&gt;method <span class="hi_key3">eq</span> <span class="hi_string">"POST"</span>;
    <span class="hi_key1">my</span> $body = $req-&gt;content;
    <span class="hi_key1">my</span> $<span class="hi_key3">q</span> = <span class="hi_key2">new</span> CGI($body);
    <span class="hi_key1">for</span> <span class="hi_key1">my</span> $p ($<span class="hi_key3">q</span>-&gt;param) {
        <span class="hi_key1">next</span> <span class="hi_key1">if</span> $p <span class="hi_key3">eq</span> <span class="hi_string">"__VIEWSTATE"</span>;
        <span class="hi_key2">print</span> <span class="hi_string">"$p\n\t"</span>, $<span class="hi_key3">q</span>-&gt;param($p), <span class="hi_string">"\n"</span>;
    }
}
</code></pre>

<p>Launch it locally in a terminal, set your browser&#8217;s
proxy settings to <code>localhost:8080</code>, and watch the output
in the terminal.</p>

<p>The second part of the puzzle is to use the wonderful
<a href="http://search.cpan.org/dist/WWW-Mechanize-Shell/">WWW::Mechanize::Shell</a>.
It provides an interactive shell, in which we
can issue GET requests, see the content of the responses,
view links, forms, and form fields with their values,
follow the links, set the value of the fields, click
on buttons and submit the forms.
Best of all, after getting what we are after
we can issue a <code>script</code> command and get a
piece of Perl code that will perform all
the tasks we&#8217;ve just done.</p>

<p>So the final solution looks like this:</p>

<ol>
<li>Load the start page in your browser (through the <code>spyproxy</code>).</li>
<li>Load the same page in WWW::Mechanize::Shell.</li>
<li>In the browser, fill in any fields that need filling, and click where you want.</li>
<li>Observe the spyproxy output, note any fields that need setting.
In a typical <code>ASP.NET</code> application, you will want to ignore
the vast majority of the fields at any given moment.  Don&#8217;t
worry, humans are good at this sort of pattern recognition.  :-)
Pay special attention to <code>__EVENTTARGET</code> and <code>__EVENTARGUMENT</code>
fields.</li>
<li>Set the same fields to the same values in the shell
(use <code>value fieldname fieldvalue</code>).</li>
<li>If <code>__EVENTTARGET</code> was set, type <code>submit</code> in the shell;
otherwise, find the name of the button that was pressed
(see step 4), and type <code>click buttonname</code> in the shell.</li>
<li>Examine the content of the response (<code>content</code> in the shell)
to make sure that what you&#8217;ve got in the shell makes sense.</li>
<li>If more clicking and entering is to be done, go to step 3.</li>
<li>Type <code>script script-name.pl</code> in the shell.</li>
<li>Go edit script-name.pl - remove any prints you do not
need, and replace the constants you entered in the fields with
variables where needed.</li>
<li>Your custom scraping script is ready to use.</li>
<li>&#8230;</li>
<li>Profit!</li>
</ol>

<p>I hope this trick will be of use to somebody.
Enjoy!</p>
]]>
        

    </content>
</entry>

<entry>
    <title>port-tags on github</title>
    <link rel="alternate" type="text/html" href="http://blog.tobez.org/2009/03/port-tags-on-github.html" />
    <id>tag:blog.tobez.org,2009://3.82</id>

    <published>2009-03-15T19:26:49Z</published>
    <updated>2009-03-15T19:29:51Z</updated>

    <summary><![CDATA[Some years ago I made a little web application which allowed one to browse the FreeBSD ports collection by tags, &agrave; la delicious. The tags were not created by users but were instead generated from a couple of fields taken from...]]></summary>
    <author>
        <name>tobez</name>
        
    </author>
    
        <category term="FreeBSD" scheme="http://www.sixapart.com/ns/types#category" />
    
        <category term="Perl" scheme="http://www.sixapart.com/ns/types#category" />
    
    <category term="freebsd" label="freebsd" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="freebsdports" label="freebsd ports" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="git" label="git" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="github" label="github" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="perl" label="perl" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="porttags" label="port-tags" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="ports" label="ports" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="woblomo" label="woblomo" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en-us" xml:base="http://blog.tobez.org/">
        <![CDATA[<p>Some years ago I made <a href="http://www.tobez.org/port-tags/">a little
web application</a> which <a href="http://www.tobez.org/about-port-tags.html">allowed one
to browse</a> the <a href="http://www.freebsd.org/">FreeBSD</a>
<a href="http://www.freebsd.org/ports/">ports collection</a>
by tags, &agrave; la <a href="http://delicious.com/">delicious</a>.</p>

<p>The tags were not created by users but were
instead generated from a couple of fields
taken from every
port&#8217;s <code>Makefile</code>,
so it was not exactly &#8220;social&#8221; software.</p>

<p>There was some <a href="http://www.freebsd.org/cgi/getmsg.cgi?fetch=487425+0+/usr/local/www/db/text/2005/freebsd-ports/20051113.freebsd-ports">limited</a> <a href="http://www.freebsd.org/cgi/getmsg.cgi?fetch=834919+0+/usr/local/www/db/text/2005/freebsd-ports/20051120.freebsd-ports">amount</a>
of discussion on FreeBSD mailing lists, and a publicly
accessible readonly SVN repository
was created by my friend <a href="http://blog.droso.org/">Erwin</a>,
but the overall interest was rather low.</p>

<p>Over time I moved on and basically stopped
working on the project,
but recently I had an idea - not exactly to
resurrect it, but to make it
easier for people who are interested to
contribute.</p>

<p>Enter <a href="http://github.com/tobez/port-tags/tree/master"><code>port-tags</code> at github</a>.
<a href="https://github.com/">Github</a> is a tool
to host <a href="http://git-scm.com/">git</a> repositories
of your open-source projects.
Anybody can easily clone your repository,
fork it completely, or submit their
changes back to you.
I only started using it today,
so I cannot say much about its features
and how convenient they are,
but from what I&#8217;ve heard,
it is very very nice.</p>

<p>So, if you are interested,
and have got <a href="http://en.wiktionary.org/wiki/round_tuit">round tuits</a> to spare,
please hack on <code>port-tags</code> - maybe
some good will eventually come out of it.</p>
]]>
        

    </content>
</entry>

<entry>
    <title>Unknown CPAN III: File::SortedSeek</title>
    <link rel="alternate" type="text/html" href="http://blog.tobez.org/2009/03/unknown-cpan-iii-filesortedseek.html" />
    <id>tag:blog.tobez.org,2009://3.81</id>

    <published>2009-03-13T22:34:05Z</published>
    <updated>2009-03-13T22:40:26Z</updated>

    <summary>Let&#8217;s suppose that you have a huge logfile and would like to quickly extract lines from it that relate to a given small time interval. How would you do it? Since the lines are ordered by time specification, the fastest...</summary>
    <author>
        <name>tobez</name>
        
    </author>
    
        <category term="Hints" scheme="http://www.sixapart.com/ns/types#category" />
    
        <category term="Perl" scheme="http://www.sixapart.com/ns/types#category" />
    
    <category term="cpan" label="CPAN" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="perlmodules" label="Perl modules" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="unknowncpan" label="Unknown CPAN" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="logfiles" label="logfiles" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="perl" label="perl" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="woblomo" label="woblomo" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en-us" xml:base="http://blog.tobez.org/">
        <![CDATA[<p>Let&#8217;s suppose that you have a huge
logfile and would like to quickly
extract lines from it that relate
to a given small time interval.
How would you do it?</p>

<p>Since the lines are ordered by time specification,
the fastest way (provided you do not
keep indexes of any sort) is to do
the good old binary search,
doing all necessary housekeeping to
account for line boundaries
and converting the <a href="http://search.cpan.org/~muir/Time-modules/lib/Time/ParseDate.pm">timestamp from whatever
format it is in the logfile</a>
to epoch seconds for comparison with the
target interval boundaries.</p>
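
<p>Hand-rolled, that search might look roughly like the sketch below.
It is only an illustration of the idea, not the algorithm of any particular module:
the names <code>line_start</code> and <code>seek_to_time</code> are made up,
and the demo assumes each line starts with epoch seconds so that no date parsing is needed.</p>

<pre><code>#! /usr/bin/perl
use strict;
use warnings;

# Position $fh at the first line that starts at or after byte $pos
# and return the offset of that line.
sub line_start {
    my ($fh, $pos) = @_;
    if ($pos == 0) {
        seek($fh, 0, 0) or die "seek: $!";
        return 0;
    }
    seek($fh, $pos - 1, 0) or die "seek: $!";
    my $skipped = &lt;$fh&gt;;    # finish the line containing byte $pos - 1
    return tell($fh);
}

# Binary search over byte offsets.  Returns the offset of the first line
# whose timestamp is &gt;= $target (or $size if there is none) and leaves
# $fh positioned there.  $ts_of turns a line into epoch seconds.
sub seek_to_time {
    my ($fh, $size, $target, $ts_of) = @_;
    my ($lo, $hi) = (0, $size);
    while ($lo &lt; $hi) {
        my $mid = int(($lo + $hi) / 2);
        line_start($fh, $mid);
        my $line = &lt;$fh&gt;;
        if (!defined $line || $ts_of-&gt;($line) &gt;= $target) {
            $hi = $mid;
        } else {
            $lo = $mid + 1;
        }
    }
    return line_start($fh, $lo);
}

# Demo on an in-memory "logfile".
my $log = "100 a\n200 b\n300 c\n400 d\n";
open my $fh, '&lt;', \$log or die "cannot open in-memory file: $!";
my $first_field = sub { (split ' ', $_[0])[0] };

seek_to_time($fh, length($log), 250, $first_field);
print while &lt;$fh&gt;;    # prints the "300 c" and "400 d" lines
</code></pre>

<p>Most of the code is exactly the line-boundary housekeeping mentioned above;
the search itself is only a few lines.</p>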

<p>Since you are dealing with <a href="http://www.perl.org/">Perl</a> here,
it would be natural to first look
on <a href="http://www.cpan.org/">CPAN</a> for a module which somebody
else has already written to do just this.</p>

<p>And of course somebody has.
Enter <a href="http://search.cpan.org/~jfreeman/File-SortedSeek/lib/File/SortedSeek.pm">File::SortedSeek</a> by Dr. James Freeman.
The module interface is a bit weird,
so it pays off to read the documentation
carefully.</p>

<p>At any rate, here is a complete program that
handles the task, assuming that the timestamp
(in pretty much any format)
is at the beginning of each line of the logfile:</p>

<pre><code>
<span class="hi_comment">#! /usr/bin/perl</span>
<span class="hi_key1">use</span> strict;
<span class="hi_key1">use</span> warnings;
<span class="hi_key1">use</span> Getopt::Long;
<span class="hi_key1">use</span> File::SortedSeek;
<span class="hi_key1">use</span> Time::ParseDate;

<span class="hi_key1">my</span> ($from, $to);
usage() <span class="hi_key1">unless</span> GetOptions(<span class="hi_string">"from=s"</span> =&gt; \$from, <span class="hi_string">"to=s"</span> =&gt; \$to);
usage() <span class="hi_key1">unless</span> @<span class="hi_key1">ARGV</span> == <span class="hi_number">1</span>;
$from = parsedate($from) <span class="hi_key1">if</span> $from;
$to   = parsedate($to)   <span class="hi_key1">if</span> $to;

<span class="hi_key1">my</span> $filename = <span class="hi_key2">shift</span>;

<span class="hi_key2">open</span> L, <span class="hi_string">"&lt; $filename"</span> <span class="hi_key3">or</span> <span class="hi_key2">die</span> <span class="hi_string">"unable to open $filename: $!\n"</span>;
File::SortedSeek::set_silent(<span class="hi_number">1</span>);

<span class="hi_comment"># declare first: "my $x = ... if ..." has undefined behaviour</span>
<span class="hi_key1">my</span> ($beg, $end);
$end = File::SortedSeek::numeric(*L, $to, \&amp;time2sec)   <span class="hi_key1">if</span> $to;
$beg = File::SortedSeek::numeric(*L, $from, \&amp;time2sec) <span class="hi_key1">if</span> $from;
$end ||= <span class="hi_number">0</span>;  $beg ||= <span class="hi_number">0</span>;
<span class="hi_key1">while</span> (&lt;L&gt;) {
    <span class="hi_key2">print</span>;
    $beg += <span class="hi_key2">length</span>($_);
    <span class="hi_key1">last</span> <span class="hi_key1">if</span> $end &amp;&amp; $beg &gt; $end;
}

<span class="hi_key1">sub</span> usage
{
    <span class="hi_key2">print</span> <span class="hi_key1">STDERR</span> &lt;&lt;EOF;
usage:
\t$<span class="hi_number">0</span> --from date-<span class="hi_key2">time</span> [--to date-<span class="hi_key2">time</span>] filename
\t$<span class="hi_number">0</span> -f date-<span class="hi_key2">time</span> [-t date-<span class="hi_key2">time</span>] filename
EOF
    <span class="hi_key2">exit</span> <span class="hi_number">1</span>;
}

<span class="hi_key1">sub</span> time2sec
{
    <span class="hi_key1">my</span> $line  = <span class="hi_key2">shift</span>;
    <span class="hi_key1">return</span> <span class="hi_key2">undef</span> <span class="hi_key1">unless</span> <span class="hi_key2">defined</span> $line;
    <span class="hi_key1">my</span> $r = parsedate($line, FUZZY =&gt; <span class="hi_number">1</span>);
    $r;
}
</code></pre>

<p>Nifty, eh?</p>
]]>
        

    </content>
</entry>

<entry>
    <title>Unknown CPAN II: Time::ParseDate</title>
    <link rel="alternate" type="text/html" href="http://blog.tobez.org/2009/03/unknown-cpan-ii-timeparsedate.html" />
    <id>tag:blog.tobez.org,2009://3.77</id>

    <published>2009-03-09T16:39:09Z</published>
    <updated>2009-03-10T11:14:22Z</updated>

    <summary>There are a number of very good, but not very well known Perl modules on CPAN. Sometimes I&#8217;ll be writing short posts about such modules which I use and appreciate. When you are dealing with date and time in Perl,...</summary>
    <author>
        <name>tobez</name>
        
    </author>
    
        <category term="Hints" scheme="http://www.sixapart.com/ns/types#category" />
    
        <category term="Perl" scheme="http://www.sixapart.com/ns/types#category" />
    
    <category term="cpan" label="CPAN" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="perlmodules" label="Perl modules" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="unknowncpan" label="Unknown CPAN" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="date" label="date" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="perl" label="perl" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="time" label="time" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="woblomo" label="woblomo" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en-us" xml:base="http://blog.tobez.org/">
        <![CDATA[<p><em>There are a number of very good, but not very well
known <a href="http://www.perl.org/">Perl</a> modules on <a href="http://www.cpan.org/">CPAN</a>.</em></p>

<p><em>Sometimes I&#8217;ll be writing short posts about
such modules which I use and appreciate.</em></p>

<p>When you are dealing with date and time in Perl,
inevitably you will reach a point
when you need to do more than is immediately
available through <a href="http://perldoc.perl.org/perlfunc.html">Perl builtins</a> and
the <a href="http://perldoc.perl.org/POSIX.html"><code>POSIX</code></a> module.</p>

<p>Then you try to find a module for what you
want on CPAN, and you drown in literally
hundreds of modules dealing with dates and times.</p>

<p>Luckily, there is a clear winner in this &#8220;modules war&#8221; -
everybody (or at least everybody sane) recommends
using the <a href="http://search.cpan.org/~drolsky/DateTime/lib/DateTime.pm"><code>DateTime</code></a> module,
and for the things that it cannot do,
various other modules from the <a href="http://search.cpan.org/search?query=datetime">same namespace</a>.</p>

<p>So life is bright for a perl programmer on the date/time front,
until you have a need to parse a date represented
in one of a multitude of &#8220;human-readable&#8221; formats,
and you don&#8217;t know in advance which one it is going to be.</p>

<p><code>DateTime</code> itself cleverly refuses to deal with
this task <em>at all</em>, and instead recommends
using one of the <a href="http://search.cpan.org/search?query=datetime%3A%3Aformat&amp;mode=all"><code>DateTime::Format::</code></a>
modules.</p>

<p>You will be relieved to know that you can easily
and quickly create parsers for your own date formats -
that is, if you are able to remember that you should
use the module aptly named
<a href="http://search.cpan.org/~drolsky/DateTime-Format-Builder/lib/DateTime/Format/Builder/Parser/Regex.pm"><code>DateTime::Format::Builder::Parser::Regex</code></a>.</p>

<p>The documentation for <a href="http://search.cpan.org/~jhoblitt/DateTime-Format-Bork/lib/DateTime/Format/Bork.pod"><code>DateTime::Format::Bork</code></a>
is also very enlightening.</p>

<p>Aaaaanyway. </p>

<p>I prefer to go against the flow here,
and use a module somewhat unfortunately
named <a href="http://search.cpan.org/~muir/Time-modules/lib/Time/ParseDate.pm"><code>Time::ParseDate</code></a>.
I mean, it could just as easily be <code>Date::ParseTime</code> or something,
right?
Worse, for years I had trouble remembering what <em>distribution</em>
this module comes from
(it, very obviously for everyone but me,
can be found in the <a href="http://search.cpan.org/~muir/Time-modules/"><code>Time-modules</code></a> distribution).</p>

<p>At any rate,
if we forget for a second about the funny names,
this module is truly a wonder:</p>

<pre><code>$ perl -MTime::ParseDate -le 'print parsedate("Sat Feb 14 00:31:30 2009")'
1234567890
$ perl -MTime::ParseDate -le 'print parsedate("2 days ago")'
1236443283
$ perl -MTime::ParseDate -le 'print parsedate("18:30")'
1236619800
</code></pre>

<p>It exports a single function, which takes a single parameter
(unless you want to specify some options which are rarely needed in practice),
and you get your epoch seconds back in return.
Very simple, very elegant, gets the job done.
I wish there were more &#8220;straight to the point&#8221; modules like this one.</p>
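
<p>For the rare cases where the options are needed, here is a small sketch
using two of them, <code>GMT</code> and <code>NOW</code>.
The input strings are arbitrary examples, and the results of relative dates
depend on the reference time you pass in.</p>

<pre><code>#! /usr/bin/perl
use strict;
use warnings;
use Time::ParseDate;

# GMT =&gt; 1: treat the input as UTC rather than local time.
my $utc = parsedate("Sat Feb 14 00:31:30 2009", GMT =&gt; 1);

# NOW =&gt; $epoch: reference point for relative dates such as "2 days ago".
my $rel = parsedate("2 days ago", NOW =&gt; 1234567890);

# In scalar context an unparsable string yields undef.
my $bad = parsedate("xyzzy");

print "utc: $utc\n";
print "rel: $rel\n";
print "bad: ", (defined $bad ? $bad : "could not parse"), "\n";
</code></pre>

<p>Fixing <code>NOW</code> like this is mostly useful in tests, where
&#8220;2 days ago&#8221; should not change from run to run.</p>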
]]>
        

    </content>
</entry>

<entry>
    <title>Unknown CPAN I: Sys::RunAlone</title>
    <link rel="alternate" type="text/html" href="http://blog.tobez.org/2009/03/unknown-cpan-i-sysrunalone.html" />
    <id>tag:blog.tobez.org,2009://3.76</id>

    <published>2009-03-07T13:50:59Z</published>
    <updated>2009-03-07T13:55:52Z</updated>

    <summary>There are a number of very good, but not very well known Perl modules on CPAN. Sometimes I&#8217;ll be writing short posts about such modules which I use and appreciate. There is a common task of executing a script from...</summary>
    <author>
        <name>tobez</name>
        
    </author>
    
        <category term="Hints" scheme="http://www.sixapart.com/ns/types#category" />
    
        <category term="Perl" scheme="http://www.sixapart.com/ns/types#category" />
    
    <category term="cpan" label="CPAN" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="perlmodules" label="Perl modules" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="unknowncpan" label="Unknown CPAN" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="cron" label="cron" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="locking" label="locking" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="perl" label="perl" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="woblomo" label="woblomo" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en-us" xml:base="http://blog.tobez.org/">
        <![CDATA[<p><em>There are a number of very good, but not very well
known <a href="http://www.perl.org/">Perl</a> modules on <a href="http://www.cpan.org/">CPAN</a>.</em></p>

<p><em>Sometimes I&#8217;ll be writing short posts about
such modules which I use and appreciate.</em></p>

<p>There is a common task of executing a script
from <a href="http://www.freebsd.org/cgi/man.cgi?cron">cron</a> periodically,
subject to the following conditions:</p>

<ul>
<li>a script can occasionally run a relatively long
time (longer than the interval at which it
is launched by cron);</li>
<li>such long runs do not happen often;</li>
<li>running two or more instances of the
script at the same time will lead to
all sorts of strange things happening
and must be avoided;</li>
<li>skipping a single run is no big deal.</li>
</ul>

<p>The usual method to prevent strange things
from happening
is to use a lockfile, as in this example:</p>

<pre><code>use strict;
use warnings;

use Fcntl qw(:DEFAULT :flock);

sysopen(L, "/var/run/myprocess.lock", O_WRONLY | O_CREAT)
    or die "cannot open lockfile: $!";
flock(L, LOCK_EX | LOCK_NB)
    or die "cannot obtain lock: $!";

# ... do work here ...

unlink "/var/run/myprocess.lock"; # optional
close(L);
</code></pre>

<p>(You might want to silently exit when flock fails
in order to not spam yourself with useless cron
mails).</p>
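
<p>A minimal sketch of that quiet variant follows. The helper name
<code>try_lock</code> and the lockfile path are made up for the example;
the only real change from the code above is that a failed
<code>flock</code> leads to a silent <code>exit 0</code> instead of a <code>die</code>.</p>

<pre><code>#! /usr/bin/perl
use strict;
use warnings;
use Fcntl qw(:DEFAULT :flock);

# Returns the locked filehandle, or undef if somebody else holds the lock.
sub try_lock {
    my ($path) = @_;
    sysopen(my $fh, $path, O_WRONLY | O_CREAT)
        or die "cannot open lockfile $path: $!";
    return flock($fh, LOCK_EX | LOCK_NB) ? $fh : undef;
}

# Hypothetical path.  Keep $lock in scope for as long as the work runs,
# since closing the handle releases the lock.
my $lock = try_lock("/tmp/myprocess-example.lock")
    or exit 0;    # another instance is running: leave quietly

# ... do work here ...
</code></pre>

<p>A failure to open the lockfile still dies loudly, since that usually
means a real problem (permissions, missing directory) worth a cron mail.</p>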

<p>Nothing fancy, really.
But over time it becomes boring to write all this housekeeping
code in every little cronjob,
especially since in some cases the &#8220;do work here&#8221; part
can be comparable in size to the locking part.</p>

<p>The solution, of course, is to use the CPAN magic
and to find a module which sweeps all this complexity under
the rug, leaving us with a clean and simple interface,
so that we can concentrate on getting the job done.</p>

<p>As usual with CPAN, there is not one, but <a href="http://search.cpan.org/search?query=lockfile">several modules</a>
which were written to perform this task.
Most of them are rather powerful,
which is unfortunate, since we want simplicity of use above
all else.
The bells and whistles provided by those modules
might be needed in certain situations,
but for the purpose described above they just get in the way.</p>

<p>There is, however, a wonderfully simple (and rather clever)
module by <a href="http://search.cpan.org/~elizabeth/">Elizabeth Mattijsen</a>,
<a href="http://search.cpan.org/~elizabeth/Sys-RunAlone/lib/Sys/RunAlone.pm"><code>Sys::RunAlone</code></a>.
To get the same functionality as the code above,
all you need to do is this:</p>

<pre><code>use strict;
use warnings;
use Sys::RunAlone;

# ... do work here ...

__END__
</code></pre>

<p>That&#8217;s it.  Nothing else to write.</p>

<p>There are only two minor things to remember about this module,
if you want to avoid problems.</p>

<p>First, it uses the script&#8217;s <code>DATA</code> handle to do the locking
(that is, it actually uses the script&#8217;s file itself).
So if you have several symlinks pointing to the same
script, you cannot run them at the same time for
it is still one physical file and one <code>DATA</code> handle.</p>

<p>Second, and for the same reason,
if you modify the script while it
is running and then launch it again,
it will fail to
detect that another instance is already running,
since the <code>DATA</code> handle will be different.</p>

<p>Just keep this in mind when you use it.</p>
]]>
        

    </content>
</entry>

<entry>
    <title>&quot;Idiots can vote too&quot;</title>
    <link rel="alternate" type="text/html" href="http://blog.tobez.org/2009/02/idiots-can-vote-too.html" />
    <id>tag:blog.tobez.org,2009://3.70</id>

    <published>2009-02-16T09:49:51Z</published>
    <updated>2009-03-01T12:07:39Z</updated>

    <summary>My blood pressure was quickly raised by this: http://cpanratings.perl.org/user/dandv. The gist of his so-called reviews: &#8220;I did not use this module, but Catalyst has switched from it to something else, hence I rate it with 1 star out of...</summary>
    <author>
        <name>tobez</name>
        
    </author>
    
        <category term="Perl" scheme="http://www.sixapart.com/ns/types#category" />
    
        <category term="Rants" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en-us" xml:base="http://blog.tobez.org/">
        <![CDATA[<p>My blood pressure was quickly raised by this: <a href="http://cpanratings.perl.org/user/dandv">http://cpanratings.perl.org/user/dandv</a>.</p>

<p>The gist of his so-called reviews: &#8220;I did not use this module, but Catalyst has switched from it to something else, hence I rate it with 1 star out of 5.  Avoid.  Use Moose&#8221;.</p>

<p>WTF??  Whatever has happened to TIMTOWTDI?  Who <strong>is</strong> this guy?</p>

<p><font size="-2"><sup>1</sup> The title of this post was shamelessly taken from kaare&#8217;s remark on #cph.pm</font></p>
]]>
        

    </content>
</entry>

<entry>
    <title>Stupid code examples in documentation</title>
    <link rel="alternate" type="text/html" href="http://blog.tobez.org/2008/06/stupid-code-examples-in-documentation.html" />
    <id>tag:blog.tobez.org,2008://3.69</id>

    <published>2008-06-25T10:21:17Z</published>
    <updated>2009-03-01T12:07:39Z</updated>

    <summary><![CDATA[Dear maintainer of Spreadsheet::ParseExcel! Please remove the $sheet-&gt;{MaxCol} ||= $sheet-&gt;{MinCol}; statement from the loop over spreadsheet rows in the example at the top of the module documentation. People just cut and paste this into their code, which is pointless. This...]]></summary>
    <author>
        <name>tobez</name>
        
    </author>
    
        <category term="Perl" scheme="http://www.sixapart.com/ns/types#category" />
    
        <category term="Rants" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en-us" xml:base="http://blog.tobez.org/">
        <![CDATA[<p>Dear <a href="http://www.szabgab.com/">maintainer</a> of <a href="http://search.cpan.org/dist/Spreadsheet-ParseExcel/">Spreadsheet::ParseExcel</a>!</p>

<p>Please remove the</p>

<pre><code>$sheet-&gt;{MaxCol} ||= $sheet-&gt;{MinCol};
</code></pre>

<p>statement from the loop over spreadsheet rows in the example at the top
of the module documentation.  People just cut and paste this into their
code, which is pointless.</p>

<p>This madness goes far - I even saw this code in a presentation at
<a href="http://cph.pm.org/">the local Perl Mongers group</a> technical meeting.</p>

<p>Take pity on poor wasted electrons, <a href="http://www.urbandictionary.com/define.php?term=k+thx">K THX</a>.</p>
]]>
        

    </content>
</entry>

<entry>
    <title>Perl, maps, and geocaching</title>
    <link rel="alternate" type="text/html" href="http://blog.tobez.org/2008/05/perl-maps-and-geocaching.html" />
    <id>tag:blog.tobez.org,2008://3.68</id>

    <published>2008-05-29T10:05:18Z</published>
    <updated>2009-03-01T12:07:39Z</updated>

    <summary>Inspired by Edmund von der Burg&#8217;s talk at the recent Nordic Perl Workshop in Stockholm, Henrik, Lars, and I started to play with Open Street Maps. Some work-related goodness will probably come out of it. Meanwhile, we played with...</summary>
    <author>
        <name>tobez</name>
        
    </author>
    
        <category term="Fun" scheme="http://www.sixapart.com/ns/types#category" />
    
        <category term="Perl" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en-us" xml:base="http://blog.tobez.org/">
        <![CDATA[<p>Being inspired by <a href="http://conferences.yapceurope.org/npw2008/talk/1219">Edmund von der Burg&#8217;s talk</a> at
the recent <a href="http://conferences.yapceurope.org/npw2008/index.html">Nordic Perl Workshop in Stockholm</a>,
<a href="http://www.hamster.dk/">Henrik</a>, <a href="http://www.thegler.dk/">Lars</a>, and I started to play
with <a href="http://wiki.openstreetmap.org/index.php/Main_Page">Open Street Maps</a>.</p>

<p>Some <a href="http://www.telia.dk/">work-related</a> goodness will probably
come out of it.  Meanwhile, we played with mapping
the <a href="http://www.geocaching.com/">caches</a> we&#8217;ve <a href="http://www.tobez.org/images/stockholm-caches.png">found in Stockholm</a> during the
workshop.</p>

<p>From this map I can deduce three things:</p>

<ul>
<li>we need a life;</li>
<li>the Open Street Maps community in Stockholm has not reached a critical mass yet;</li>
<li>central Stockholm needs more regular geocaches.</li>
</ul>

<p>And now - back to the scheduled silence.</p>
]]>
        

    </content>
</entry>

<entry>
    <title>Version-independent location of a CPAN distribution&apos;s Changes file</title>
    <link rel="alternate" type="text/html" href="http://blog.tobez.org/2007/12/version-independent-location-of-a-cpan-distributions-changes-file.html" />
    <id>tag:blog.tobez.org,2007://3.65</id>

    <published>2007-12-20T11:43:39Z</published>
    <updated>2009-03-01T12:07:36Z</updated>

    <summary>Some time ago several people (most notably skv@) ranted about including a list of changes or a link to such list in the commit message for a port update. I thought it was a great idea and started including a...</summary>
    <author>
        <name>tobez</name>
        
    </author>
    
        <category term="FreeBSD" scheme="http://www.sixapart.com/ns/types#category" />
    
        <category term="Hints" scheme="http://www.sixapart.com/ns/types#category" />
    
        <category term="Perl" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en-us" xml:base="http://blog.tobez.org/">
        <![CDATA[<p>Some time ago several people (most notably skv@) ranted about including a 
list of changes or a link to such list in the commit message for a port
update.</p>

<p>I thought it was a great idea and started including a link to a <a href="http://www.cpan.org/">CPAN</a>
distribution&#8217;s Changes file in my commits some time ago.</p>

<p>What I did not like was that those links looked like this:</p>

<p><a href="http://search.cpan.org/src/JESSE/Template-Declare-0.27/Changes">http://search.cpan.org/src/JESSE/Template-Declare-0.27/Changes</a></p>

<p>FreeBSD&#8217;s commit messages are preserved in our repository and mail archives
forever, for a suitable definition of &#8220;forever&#8221;.  On the other hand, CPAN
authors are encouraged to clean up old and obsolete versions promptly.</p>

<p>Thus there is a mismatch between the expected lifetime of the link in the
commit message and the lifetime of the file it points to.</p>

<p>While older CPAN distributions can still be found on 
<a href="http://backpan.cpan.org/">BackPAN</a>,
it only provides links to tarballs and not
individual files like Changes.</p>

<p>Luckily, it turns out that version-less links like</p>

<p><a href="http://search.cpan.org/dist/Template-Declare/Changes">http://search.cpan.org/dist/Template-Declare/Changes</a></p>

<p>work just fine, redirecting to the most recent version of the file.  This is 
acceptable, since Changes is expected to be a prepend-only file, so the
information the commit message was trying to link to can (almost) always be
found there.</p>
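
<p>As a minimal sketch, such a link can be built from a name alone.  The
<code>changes_url</code> helper below is hypothetical, and it assumes that the
distribution name is simply the module name with <code>::</code> replaced by
<code>-</code>, which holds for most distributions but not for all of them.</p>

<pre><code>use strict;
use warnings;

# Hypothetical helper: build the version-independent Changes URL
# for a CPAN distribution (or main module) name.
sub changes_url {
    my ($name) = @_;
    # Assumption: distribution name == module name with "::" turned into "-".
    (my $dist = $name) =~ s/::/-/g;
    return "http://search.cpan.org/dist/$dist/Changes";
}

print changes_url("Template::Declare"), "\n";
</code></pre>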
]]>
        

    </content>
</entry>

<entry>
    <title>Backing up Google Reader subscriptions as OPML, periodically and automatically</title>
    <link rel="alternate" type="text/html" href="http://blog.tobez.org/2007/10/backing-up-google-reader-subscriptions-as-opml-periodically-and-automatically.html" />
    <id>tag:blog.tobez.org,2007://3.64</id>

    <published>2007-10-31T21:07:42Z</published>
    <updated>2009-03-01T12:07:36Z</updated>

    <summary>A fellow former Bloglines user has asked me whether I found a way to backup Google Reader subscriptions into an OPML file from cron, as we used to do with our Bloglines accounts. A quick search turned up this, which,...</summary>
    <author>
        <name>tobez</name>
        
    </author>
    
        <category term="FreeBSD" scheme="http://www.sixapart.com/ns/types#category" />
    
        <category term="Hints" scheme="http://www.sixapart.com/ns/types#category" />
    
        <category term="Perl" scheme="http://www.sixapart.com/ns/types#category" />
    
        <category term="Web" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en-us" xml:base="http://blog.tobez.org/">
        <![CDATA[<p>A fellow former <a href="http://www.bloglines.com/">Bloglines</a> user has asked me whether
I found a way to back up <a href="http://reader.google.com/">Google Reader</a> subscriptions
into an <a href="http://en.wikipedia.org/wiki/OPML">OPML</a> file from <a href="http://www.freebsd.org/cgi/man.cgi?query=cron&amp;apropos=0&amp;sektion=8&amp;format=html">cron</a>, as we used to
do with our Bloglines accounts.</p>

<p>A quick search turned up <a href="http://googlesystem.blogspot.com/2007/08/blogroll-powered-by-google-reader.html">this</a>,
which,
from the look of it,
only works if every feed is
explicitly marked with a tag that
is set up as public.</p>

<p>This by itself is rather cumbersome,
and you have to remember to do that for every new
feed you subscribe to,
otherwise you&#8217;ll defeat the purpose of making
periodic backups in the first place.</p>

<p>Luckily, there is a better solution.
There is a nice little module on <a href="http://www.cpan.org/">CPAN</a>,
<a href="http://search.cpan.org/~gray/WebService-Google-Reader/">WebService::Google::Reader</a> by <a href="http://search.cpan.org/~gray/">gray</a>,
which uses an unofficial <a href="http://code.google.com/p/pyrfeed/wiki/GoogleReaderAPI">Google Reader API</a>
to do various nifty things with your Google Reader
subscription,
<em>including</em> OPML export.</p>

<p>This means that after installing the module
you can simply put the following command
into your crontab (only the command itself is shown,
see <a href="http://www.freebsd.org/cgi/man.cgi?query=crontab&amp;apropos=0&amp;sektion=5&amp;format=html">crontab(5) manual page</a> to find
out what else you will want to put in there):</p>

<pre><code>env GOOGLE_USERNAME=your-username-typically@gmail.com \
  GOOGLE_PASSWORD=your-user-password \
  perl -MWebService::Google::Reader -e \
  'print WebService::Google::Reader-&gt;new(
     username =&gt; $ENV{GOOGLE_USERNAME},
     password =&gt; $ENV{GOOGLE_PASSWORD})-&gt;opml' \
  &gt; /where/to/put/greader.opml
</code></pre>

<p>You will have to turn the above into one long line to satisfy
crontab syntax, and of course remember to use a real username,
password, and the path to the resulting OPML file.</p>
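
<p>If you would rather not join the lines by hand, a few lines of Perl can do
it.  This is only a sketch: <code>crontab_line</code> is a hypothetical helper,
the sample command is made up, and it assumes that the command contains no
newlines that must be preserved (for instance inside a quoted string).</p>

<pre><code>use strict;
use warnings;

# Hypothetical helper: collapse a multi-line shell command into one line.
sub crontab_line {
    my ($cmd) = @_;
    $cmd =~ s/\\\n/\n/g;       # drop backslash line continuations
    $cmd =~ s/\s*\n\s*/ /g;    # join the remaining lines with single spaces
    $cmd =~ s/^\s+|\s+$//g;    # trim leading and trailing whitespace
    return $cmd;
}

# Made-up sample command, laid out the same way as the one above.
my $multi = &lt;&lt;'END';
env GOOGLE_USERNAME=someone@example.com \
  perl -e \
  'print(
     "hello")' \
  &gt; /tmp/out.txt
END

print crontab_line($multi), "\n";
</code></pre>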

<p>Unfortunately, the most recent version
of the module (which is 0.03 at the time of this writing)
has a minor bug which prevents
the <code>opml()</code> method from working correctly.
So you will need to do a little patching.</p>

<p>Before installing the module,
edit the source file <code>lib/WebService/Google/Reader/Constants.pm</code>,
look for a string <code>subscribtions</code>,
and fix the spelling
(finding correct spelling is left as an exercise for the reader).
Then proceed to install the module as usual.</p>

<p>Hopefully, this step won&#8217;t be necessary in a couple of days&#8217; time
when a new version of the module is released.</p>

<p>If you are a <a href="http://www.freebsd.org/">FreeBSD</a> user like myself,
you may choose instead to
<a href="http://www.tobez.org/download/p5-WebService-Google-Reader-0.03.tgz">fetch a skeleton of the port of the module</a>.
Unpack it in <code>/usr/ports/www/</code>
and install it as you would any other port.</p>

<p>I intend to add the port to the ports collection as soon
as our <a href="http://www.freebsd.org/cgi/mid.cgi?id=20071030120901.GM45185%40droso.net&amp;db=mid">current ports freeze</a> is over.</p>

<p>Enjoy!</p>
]]>
        

    </content>
</entry>

<entry>
    <title>YAPC::EU 2007 DBIx::Perlish slides</title>
    <link rel="alternate" type="text/html" href="http://blog.tobez.org/2007/09/yapceu-2007-dbixperlish-slides.html" />
    <id>tag:blog.tobez.org,2007://3.63</id>

    <published>2007-09-16T21:07:11Z</published>
    <updated>2009-03-01T12:07:34Z</updated>

    <summary>I finally uploaded slides of my talk about DBIx::Perlish I gave at YAPC::EU 2007. Well, better late than never. Enjoy....</summary>
    <author>
        <name>tobez</name>
        
    </author>
    
        <category term="Perl" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en-us" xml:base="http://blog.tobez.org/">
        <![CDATA[<p>I finally uploaded <a href="http://www.tobez.org/presentations/dbix-perlish-yapc-eu-2007/index.html">slides of my talk</a>
about <a href="http://dbix-perlish.tobez.org/">DBIx::Perlish</a> I gave at <a href="http://vienna.yapceurope.org/ye2007/">YAPC::EU 2007</a>.</p>

<p>Well, better late than never.  Enjoy.</p>
]]>
        

    </content>
</entry>

<entry>
    <title>DBIx::Perlish - Updates, deletes, inserts</title>
    <link rel="alternate" type="text/html" href="http://blog.tobez.org/2007/08/dbixperlish-updates-deletes-inserts.html" />
    <id>tag:blog.tobez.org,2007://3.62</id>

    <published>2007-08-30T07:01:03Z</published>
    <updated>2009-03-01T12:07:33Z</updated>

    <summary>This is a post in a series about DBIx::Perlish Perl module. Previous posts in the series are linked at the bottom. Not all queries are made to request the data; you will also want to be able to update, delete,...</summary>
    <author>
        <name>tobez</name>
        
    </author>
    
        <category term="Perl" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en-us" xml:base="http://blog.tobez.org/">
        <![CDATA[<p><em>This is a post in a <a href="http://blog.tobez.org/?p=44">series</a>
about the <a href="http://dbix-perlish.tobez.org/">DBIx::Perlish</a> Perl module.
Previous posts in the series are linked at the bottom.</em></p>

<p>Not all queries are made to request the data;
you will also want to be able to <em>update</em>, <em>delete</em>,
and <em>insert</em> data.</p>

<p>So the query language that DBIx::Perlish implements
also provides <code>db_update</code>, <code>db_delete</code>, and <code>db_insert</code>
functions.</p>

<p>The <code>db_update</code> and <code>db_delete</code> functions are in fact very similar
to <code>db_fetch</code>, with which you are already familiar.</p>

<p>Let&#8217;s start with <code>db_update</code>:</p>

<blockquote>
<pre><code>db_update {
   my $u : users;
   $u-&gt;id == 42;  # filter

   $u-&gt;first_name = "Ford";
   $u-&gt;last_name  = "Prefect";
};
</code></pre>
</blockquote>

<p>You can see that a query sub that <code>db_update</code>
takes has all the same elements as a query sub
taken by <code>db_fetch</code>, with an addition of
an assignment to update column values.
Quite natural, eh?</p>

<p>One thing to remember is that you cannot use a <code>return</code> statement
with <code>db_update</code> - it has no meaning!</p>

<p>It can become quite tedious to write all the individual
assignments if the number of columns to be updated is not
small.  Luckily, there is a shortcut:</p>

<blockquote>
<pre><code>db_update {
   my $u : users;
   $u-&gt;id == 42;  # filter

   $u = {
      first_name =&gt; "Ford",
      last_name  =&gt; "Prefect",
      age        =&gt; $u-&gt;age + 1,
   };
};
</code></pre>
</blockquote>

<p>That is, you can just use hash reference syntax.</p>

<p>You might have noticed that in the last example an
expression involving the <code>age</code> column was used to update
it.  That works as you would expect.</p>

<p>Using <code>db_delete</code> is equally simple.</p>

<blockquote>
<pre><code>db_delete {
   users-&gt;id &lt; 20;
};
</code></pre>
</blockquote>

<p>Using <code>return</code> did not make sense with <code>db_update</code>.
This is also the case with <code>db_delete</code>.</p>

<p>Moreover, the assignments you were using with <code>db_update</code>
make no sense with either <code>db_delete</code> or <code>db_fetch</code>.</p>

<p>Another thing you would want to do, inserting the data into
a table, is done differently.  Unlike updates and deletes,
inserts do not require a query language since they are so
simple.</p>

<p>So if you want to insert a row into a table, you do it with a simple hashref:</p>

<blockquote>
<pre><code>db_insert 'users', {
   id         =&gt; 42,
   first_name =&gt; "Ford",
   last_name  =&gt; "Prefect",
};
</code></pre>
</blockquote>

<p>Nothing fancy here.</p>
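
<p>For comparison, here is roughly what the three operations look like when
written by hand with plain DBI.  This is only a sketch: the SQL is a
hand-written equivalent and not necessarily what DBIx::Perlish generates,
and the in-memory SQLite database (which assumes DBD::SQLite is installed),
the table layout, and the sample rows are made up for illustration.</p>

<blockquote>
<pre><code>use strict;
use warnings;
use DBI;

# Assumption: DBD::SQLite is installed; the table layout is made up.
my $dbh = DBI-&gt;connect("dbi:SQLite:dbname=:memory:", "", "",
                       { RaiseError =&gt; 1 });
$dbh-&gt;do("CREATE TABLE users (id INTEGER PRIMARY KEY,
          first_name TEXT, last_name TEXT, age INTEGER)");

# Like db_insert 'users', { ... }
my $ins = $dbh-&gt;prepare(
    "INSERT INTO users (id, first_name, last_name, age) VALUES (?, ?, ?, ?)");
$ins-&gt;execute(42, "Arthur", "Dent", 30);
$ins-&gt;execute(7, "Zaphod", "Beeblebrox", 200);

# Like db_update with the filter id == 42
$dbh-&gt;do("UPDATE users SET first_name = ?, last_name = ?, age = age + 1
          WHERE id = ?", undef, "Ford", "Prefect", 42);

# Like db_delete with the filter id &lt; 20
$dbh-&gt;do("DELETE FROM users WHERE id &lt; ?", undef, 20);

# Show what is left in the table.
my $rows = $dbh-&gt;selectall_arrayref(
    "SELECT * FROM users ORDER BY id", { Slice =&gt; {} });
print "$_-&gt;{id}: $_-&gt;{first_name} $_-&gt;{last_name}, $_-&gt;{age}\n" for @$rows;
</code></pre>
</blockquote>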

<p><em>To be continued.</em></p>

<p><a href="http://blog.tobez.org/?p=44">First post, <em>Getting rid of SQL from Perl code</em></a> <br />
<a href="http://blog.tobez.org/?p=45">Second post, <em>A walk through an example</em></a> <br />
<a href="http://blog.tobez.org/?p=46">Third post, <em>Many happy returns</em></a>  </p>
]]>
        

    </content>
</entry>

<entry>
    <title>DBIx::Perlish - many happy returns</title>
    <link rel="alternate" type="text/html" href="http://blog.tobez.org/2007/08/dbixperlish-many-happy-returns.html" />
    <id>tag:blog.tobez.org,2007://3.61</id>

    <published>2007-08-28T14:27:36Z</published>
    <updated>2009-03-01T12:07:33Z</updated>

    <summary>This is a post in a series about DBIx::Perlish Perl module. Previous posts in the series are linked at the bottom. Coming back to the original example, my $uid = 42; my @r = db_fetch { my $u : users;...</summary>
    <author>
        <name>tobez</name>
        
    </author>
    
        <category term="Perl" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en-us" xml:base="http://blog.tobez.org/">
        <![CDATA[<p><em>This is a post in a <a href="http://blog.tobez.org/?p=44">series</a>
about the <a href="http://dbix-perlish.tobez.org/">DBIx::Perlish</a> Perl module.
Previous posts in the series are linked at the bottom.</em></p>

<p>Coming back to the original example,</p>

<blockquote>
<pre><code>my $uid = 42;
my @r = db_fetch {
    my $u : users;

    $u-&gt;groups_id == groups-&gt;id;
    $u-&gt;id == $uid;
};
</code></pre>
</blockquote>

<p>One more thing to note about this, before moving on:
when you specify more than one filtering expression,
there is an implicit <code>and</code> between them - so that
only those rows which satisfy <em>all</em> expressions
will match.</p>

<p>What else does the language do to be useful?</p>

<p>Needless to say, all the normal binary operators are supported,
so your expectation that arithmetic works is not destroyed:</p>

<blockquote>
<pre><code>my $uid = 40;
my @r = db_fetch {
    users-&gt;id == $uid + 2;
}
</code></pre>
</blockquote>

<p>String concatenation also works, including interpolation:</p>

<blockquote>
<pre><code>my @r = db_fetch {
    my $u : users;
    return "$u-&gt;first_name $u-&gt;last_name";
}
</code></pre>
</blockquote>

<p>This last example shows that DBIx::Perlish
interpolates columns represented as method calls
in a string.  This would not work in &#8220;normal&#8221;
Perl.</p>

<p>If you are paying attention,
you can see that there is another new thing in
the last example, namely, the presence of the <code>return</code>
statement.</p>

<p>If there is no <code>return</code> statement in the query sub,
all columns will be returned;
it is completely analogous to SQL&#8217;s <code>SELECT *</code> statement.</p>

<p>So you use the <code>return</code> statement when you want to control
what is returned by the query.
In this case it will be the concatenation of the column <code>first_name</code>,
a space character, and the column <code>last_name</code> from the table <code>users</code>.</p>

<p>How is the query result returned?</p>

<p>There are four possibilities;
which one is chosen depends on what the <code>return</code> statement,
if any, specifies, and what context (list or scalar) the <code>db_fetch</code>
is being used in.</p>

<p>If you call <code>db_fetch</code> in a scalar context,
you indicate that you are only interested in <em>one</em> row
of the resulting data set.  Using list context you
indicate that you want all rows back.</p>

<p>So the context controls how many <em>rows</em> you want back.</p>

<p>On the other hand, if you specify a <code>return</code> statement
with precisely one value, you indicate that you are only
interested in one particular column of the resulting data
set.  The absence of the <code>return</code> statement or a <code>return</code>
statement with a list of values indicates that you want
several columns back.</p>

<p>So the return statement controls how many <em>columns</em> within
a row you want back.</p>

<p>Combining two possibilities for rows with two possibilities
for columns, you get four possible combinations.</p>

<p>Let&#8217;s illustrate these possibilities with examples.</p>

<ol>
<li>Scalar context, single-value return:</li>
</ol>

<blockquote>
<pre><code>my $name = db_fetch { return user-&gt;name };
print "The name of some user is $name\n";
</code></pre>
</blockquote>

<p>You get one row with one column back - a single
scalar value.</p>

<ol start="2">
<li>List context, single-value return:</li>
</ol>

<blockquote>
<pre><code>my @names = db_fetch { return user-&gt;name };
print "The names of all users are @names\n";
</code></pre>
</blockquote>

<p>You get an array with column values back.</p>

<ol start="3">
<li>Scalar context, more than one column returned:</li>
</ol>

<blockquote>
<pre><code>my $u = db_fetch { return user-&gt;id, user-&gt;name };
print "The name of a user with id $u-&gt;{id} is $u-&gt;{name}\n";
</code></pre>
</blockquote>

<p>Please note that you get a <em>hash reference</em> back with the keys
being the names of the columns returned.</p>

<ol start="4">
<li>Finally, the most common case of the list context with
more than one column returned:</li>
</ol>

<blockquote>
<pre><code>my @u = db_fetch { return user-&gt;id, user-&gt;name };
for my $u (@u) {
    print "$u-&gt;{id}:\t$u-&gt;{name}\n";
}
</code></pre>
</blockquote>

<p>Observe that you get an <em>array of hash references</em> back.</p>
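
<p>If you are used to plain DBI, the four result shapes have close analogues
there.  The sketch below is only an analogy, not a description of how
DBIx::Perlish works internally; the in-memory SQLite database (which assumes
DBD::SQLite is installed), the <code>users</code> table, and its two rows are
made up for illustration.</p>

<blockquote>
<pre><code>use strict;
use warnings;
use DBI;

# Assumption: DBD::SQLite is installed; table and data are made up.
my $dbh = DBI-&gt;connect("dbi:SQLite:dbname=:memory:", "", "",
                       { RaiseError =&gt; 1 });
$dbh-&gt;do("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)");
$dbh-&gt;do("INSERT INTO users (id, name) VALUES (?, ?)", undef, @$_)
    for [1, "Ford"], [2, "Arthur"];

# 1. One row, one column: a single scalar value.
my ($name) = $dbh-&gt;selectrow_array("SELECT name FROM users ORDER BY id");

# 2. All rows, one column: a list of column values.
my @names = @{ $dbh-&gt;selectcol_arrayref(
    "SELECT name FROM users ORDER BY id") };

# 3. One row, several columns: a hash reference.
my $u = $dbh-&gt;selectrow_hashref("SELECT id, name FROM users ORDER BY id");

# 4. All rows, several columns: a list of hash references.
my @u = @{ $dbh-&gt;selectall_arrayref(
    "SELECT id, name FROM users ORDER BY id", { Slice =&gt; {} }) };

print "1: $name\n";
print "2: @names\n";
print "3: $u-&gt;{id} $u-&gt;{name}\n";
print "4: $_-&gt;{id} $_-&gt;{name}\n" for @u;
</code></pre>
</blockquote>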

<p>Now, you might ask, what will be the names of the columns in a case
like this:</p>

<blockquote>
<pre><code>my @r = db_fetch {
    my $u : users;
    return $u-&gt;id, "$u-&gt;first_name $u-&gt;last_name";
}
</code></pre>
</blockquote>

<p>One of the columns is obviously &#8220;id&#8221;, but what about the other one?
How do you refer to it in the result you&#8217;ve got?</p>

<p>The answer is - it is database-dependent and can be pretty much
anything.  So you ask the next question: &#8220;can I control that?&#8221;.</p>

<p>Yes, you can, by prepending the offending nameless column
with the name you want for it in the return statement:</p>

<blockquote>
<pre><code>my @r = db_fetch {
    my $u : users;
    return $u-&gt;id, full_name =&gt; "$u-&gt;first_name $u-&gt;last_name";
}
</code></pre>
</blockquote>
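
<p>In SQL terms, naming a column like this is what a column alias does.
The sketch below shows the idea with plain DBI; it is a hand-written
analogy and not necessarily the SQL that DBIx::Perlish produces, and the
in-memory SQLite database (which assumes DBD::SQLite is installed) and the
sample row are made up.</p>

<blockquote>
<pre><code>use strict;
use warnings;
use DBI;

# Assumption: DBD::SQLite is installed; table and data are made up.
my $dbh = DBI-&gt;connect("dbi:SQLite:dbname=:memory:", "", "",
                       { RaiseError =&gt; 1 });
$dbh-&gt;do("CREATE TABLE users (id INTEGER PRIMARY KEY,
          first_name TEXT, last_name TEXT)");
$dbh-&gt;do("INSERT INTO users (id, first_name, last_name) VALUES (?, ?, ?)",
         undef, 42, "Ford", "Prefect");

# The alias gives the computed column a predictable hash key.
my $r = $dbh-&gt;selectall_arrayref(
    "SELECT id, first_name || ' ' || last_name AS full_name FROM users",
    { Slice =&gt; {} });
print "$r-&gt;[0]{id}: $r-&gt;[0]{full_name}\n";
</code></pre>
</blockquote>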

<p><em>More in the next post.</em></p>

<p><a href="http://blog.tobez.org/?p=44">First post, <em>Getting rid of SQL from Perl code</em></a> <br />
<a href="http://blog.tobez.org/?p=45">Second post, <em>A walk through an example</em></a></p>
]]>
        

    </content>
</entry>

<entry>
    <title>DBIx::Perlish - a walk through an example</title>
    <link rel="alternate" type="text/html" href="http://blog.tobez.org/2007/08/dbixperlish-a-walk-through-an-example.html" />
    <id>tag:blog.tobez.org,2007://3.60</id>

    <published>2007-08-26T12:36:14Z</published>
    <updated>2009-03-01T12:07:33Z</updated>

    <summary>This is a second post of a series about DBIx::Perlish Perl module. Let us go through the example of DBIx::Perlish&#8217;s non-SQL language which I concluded the last post with: my $uid = 42; my @r = db_fetch { my $u...</summary>
    <author>
        <name>tobez</name>
        
    </author>
    
        <category term="Perl" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en-us" xml:base="http://blog.tobez.org/">
        <![CDATA[<p><em>This is the second post in a <a href="http://blog.tobez.org/?p=44">series</a>
about the <a href="http://dbix-perlish.tobez.org/">DBIx::Perlish</a> Perl module.</em></p>

<p>Let us go through the example of DBIx::Perlish&#8217;s non-SQL language
which I concluded the last post with:</p>

<blockquote>
<pre><code>my $uid = 42;
my @r = db_fetch {
    my $u : users;

    $u-&gt;groups_id == groups-&gt;id;
    $u-&gt;id == $uid;
};
</code></pre>
</blockquote>

<p>I hope it is obvious to you what that snippet does,
but let&#8217;s go through it anyway.</p>

<p>The first thing to note is that it deals with two database tables,
<code>users</code> and <code>groups</code>.
The snippet also illustrates that there are two ways to refer
to a table - explicitly by name, as is done with
<code>groups</code>, and via aliasing a table to a local variable
using Perl attribute syntax.
This latter method is preferable when you have to refer
to a table more than once or twice.
This method is also
the only one with which you can make self-joins,
but I&#8217;ll talk about it later.</p>

<p>So the above snippet could have been written as</p>

<blockquote>
<pre><code>my $uid = 42;
my @r = db_fetch {
    users-&gt;groups_id == groups-&gt;id;
    users-&gt;id == $uid;
};
</code></pre>
</blockquote>

<p>as well as</p>

<blockquote>
<pre><code>my $uid = 42;
my @r = db_fetch {
    my $u : users;
    my $g : groups;

    $u-&gt;groups_id == $g-&gt;id;
    $u-&gt;id == $uid;
};
</code></pre>
</blockquote>

<p>There are more things to pay attention to in the example.</p>

<p>You refer to the individual columns using Perl method call
syntax.  Since it is common to use accessor methods with
Perl objects in order to get or set their properties,
it does not require a stretch of imagination to see
what <code>$u-&gt;groups_id</code> means.</p>

<p>The <em>declarative</em> semantics of the language is apparent
in how you specify what results you want back from the query:
individual comparison expressions are used as <em>filters</em> to
limit an unconstrained result.</p>

<p>Let&#8217;s look at a simpler example:</p>

<blockquote>
<pre><code>my @r = db_fetch {
   my $u : users;
}
</code></pre>
</blockquote>

<p>This will get you <em>all</em> rows from the <code>users</code> table.
If you only want users with an <code>id</code> over a hundred,
you would write:</p>

<blockquote>
<pre><code>my @r = db_fetch {
   users-&gt;id &gt; 100;
}
</code></pre>
</blockquote>

<p>If you were operating with a normal Perl array instead of a DB table,
you would probably do the same filtering in one of two
ways, a more familiar imperative one:</p>

<blockquote>
<pre><code>my @r;
for my $u (@users) {
   push @r, $u if $u-&gt;{id} &gt; 100;
}
</code></pre>
</blockquote>

<p>or a shorter declarative one, using <code>grep</code>:</p>

<blockquote>
<pre><code>my @r = grep { $_-&gt;{id} &gt; 100 } @users;
</code></pre>
</blockquote>

<p>So the way the query language is used resembles declarative
<code>grep</code> more than imperative <code>for</code>.
But this is not news to you - after all,
SQL does precisely the same!</p>

<p><em>More to follow.</em></p>
]]>
        

    </content>
</entry>

<entry>
    <title>DBIx::Perlish - getting rid of SQL from Perl code</title>
    <link rel="alternate" type="text/html" href="http://blog.tobez.org/2007/08/dbixperlish-getting-rid-of-sql-from-perl-code.html" />
    <id>tag:blog.tobez.org,2007://3.59</id>

    <published>2007-08-23T19:03:14Z</published>
    <updated>2009-03-01T12:07:31Z</updated>

    <summary>This is a first installment of several posts about DBIx::Perlish Perl module. Database handling in your program consists of queries and &#8220;everything else&#8221;. DBI makes handling of the &#8220;everything else&#8221; part easy in Perl. It presents a unified interface to...</summary>
    <author>
        <name>tobez</name>
        
    </author>
    
        <category term="Perl" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en-us" xml:base="http://blog.tobez.org/">
        <![CDATA[<p><em>This is the first installment of several posts
about the <a href="http://dbix-perlish.tobez.org/">DBIx::Perlish</a> Perl module.</em></p>

<p>Database handling in your program consists of queries and
&#8220;everything else&#8221;.</p>

<p><a href="http://dbi.perl.org/">DBI</a> makes handling of the &#8220;everything else&#8221; part easy in <a href="http://www.perl.org/">Perl</a>.
It presents a unified interface to deal with
many (although not all) idiosyncrasies <a href="http://www.oracle.com/database/index.html">of</a>
<a href="http://www.postgresql.org/">various</a> <a href="http://www.sqlite.org/">numerous</a> <a href="http://www.mysql.com/">RDBMSes</a>.</p>

<p>But mostly, when doing DB programming,
you deal with queries.
Which DBI does not help you much with.</p>

<p>Which are done using <a href="http://en.wikipedia.org/wiki/SQL">SQL</a>.</p>

<blockquote>
<pre><code>SELECT * FROM users,groups
  WHERE
      users.groups_id = groups.id AND
      users.id = 42;
</code></pre>
</blockquote>

<p>Which is a <a href="http://en.wikipedia.org/wiki/Domain-specific_programming_language">domain-specific language</a>, from your point of
view as a Perl programmer.</p>

<p>DSLs are all the rage nowadays, and I won&#8217;t dispute their
advantages and usefulness.</p>

<p>But SQL is a very <em>large</em> DSL, as they go.</p>

<p>You are a smart programmer, so you learned SQL.
You might even appreciate the fact that it is a declarative
language, not unlike <a href="http://en.wikipedia.org/wiki/Prolog">Prolog</a>,
which definitely adds a bit of excitement
to the mundane task of writing your queries.</p>

<p>But every time you are writing Perl code that works
with databases you need to constantly switch mental gears
between Perl, the language you like (otherwise you
would not be reading this), and SQL,
the language very different from Perl.
This is taxing.  It costs you in lost productivity.</p>

<p>Even if you clearly separate in your code (and you should)
those parts that deal with DB handling from those that do not,
you still need to write those queries,
and to switch languages back and forth.</p>

<p>At best, SQL in your Perl code severely disrupts the code
flow by virtue of being a different language and thus
looking very different from Perl.</p>

<blockquote>
<pre><code>my $r = $dbh-&gt;selectall_arrayref(
   "SELECT * FROM users, groups
    WHERE
      users.groups_id = groups.id AND
      users.id = ?",
   {Slice=&gt;{}}, 42);
</code></pre>
</blockquote>

<p>So you clearly have a problem that needs a solution.</p>

<p>One of the possible solutions is to use an object-relational
mapper module such as
<a href="http://wiki.class-dbi.com/wiki/Main_Page">Class::DBI</a>,
or
<a href="http://search.cpan.org/dist/DBIx-Class/lib/DBIx/Class.pm">DBIx::Class</a>,
or
<a href="http://search.cpan.org/dist/Jifty-DBI/lib/Jifty/DBI.pm">Jifty::DBI</a>.</p>

<p>Such solutions have their merits.
There are many arguments in favour of them.
There are also <a href="http://blogs.tedneward.com/2006/06/26/The+Vietnam+Of+Computer+Science.aspx">arguments against them</a>.
I am not going to discuss them any further here,
I would only like to point out that
even if you use an object-relational mapper,
you still cannot completely avoid writing SQL:
sometimes because you need to construct a query
that your mapper of choice does not support
through its abstraction,
and sometimes for efficiency reasons;
the majority of the existing mappers do not
deal well with collections, and provide you
with shims to add your custom SQL code in
strategic places.</p>

<p>Another solution, the one I am trying to sell you,
is to get away
from using SQL as a domain-specific language in your
Perl code altogether.</p>

<p>This involves creating <em>yet another</em> domain-specific language
specifically for doing database queries with a (nice for Perl
programmers) distinction that it actually looks very much like
Perl.  In fact, syntactically it is <em>valid</em> Perl,
although semantically it is still a declarative language
suitable for making relational database queries.</p>

<blockquote>
<pre><code>my $uid = 42;
my @r = db_fetch {
    my $u : users;

    $u-&gt;groups_id == groups-&gt;id;
    $u-&gt;id == $uid;
};
</code></pre>
</blockquote>

<p>More on this in a future post.</p>
]]>
        

    </content>
</entry>

</feed>
