Squirt: Removed Nofollow From Comments

For various reasons, I have removed nofollow from comments. Some of the reasons:

  1. nofollow goes against the principle of the web; I don’t want this site to be a black hole
  2. I want to encourage discussions by “rewarding” those who comment
  3. Akismet rocks

Pulling Mail Out of Gmail And Retaining Your Labels

If you are fed up with Gmail and want to pull all your mail, here is how you do it. This technique was used on over 30 mail accounts, so I’m sure it will work for you.

Exporting your mail from Gmail is not trivial. Plenty has been written about it: the lead QA for Opera Mail at Opera Software posted on “Gmail’s Buggy IMAP Implementation”, Matt Cutts wrote “How to back up your Gmail on Linux in four easy steps”, LifeHacker covered “Back up Gmail on Linux with Getmail”, and Wired’s wiki has an entry called “Make a Local Backup Of Your Gmail Account”. Even so, there seems to be no single definitive source on how to pull your mail and retain your labels.

So here is what I’ve done to solve this problem:

  1. Use getmail – this has been the best archiver I’ve run across. There are other applications – isync, OfflineIMAP, Fetchmail, etc. – that probably do a decent job, but getmail is still the best in my view. There are other hacks – use Mail.app to synch the Gmail IMAP directory, then convert emlx to maildir; same for Thunderbird and mbox; etc – but we wanted something a little more straightforward – Occam’s razor, right?
  2. Install getmail – On my dev machine, I used MacPorts (port install python25; port install getmail) to install the latest getmail, which depends on Python 2.5. After this was done, I set up the getmailrc config file and fired off an attempt using SimpleIMAPSSLRetriever… which failed due to a lack of SSL in the newly installed Python. I had to go back and install Readline (port install py25-readline), then install SSL for Python (port install py25-socket-ssl).
  3. Patch Python – There is a malloc bug in imaplib when fetching large documents using SSL. So open up imaplib.py from your Python lib dir (in my case /opt/local/lib/python2.5/) and replace:
    data = self.sslobj.read(size-read)

    with

    data = self.sslobj.read(min(size-read, 16384))

    so that each SSL read asks for at most 16384 bytes (16 KB) instead of the whole remainder of the message at once.

  4. Configure getmail – Now that most of the fun is taken care of, we need to set up a configuration file for getmail (~/.getmail/getmailrc) and create the proper local destination. First the getmailrc file:
    [retriever]
    type = SimpleIMAPSSLRetriever
    server = imap.gmail.com
    mailboxes = ("[Gmail]/Starred",)
    username = username@yourdomain.com
    password = xxx
    
    [destination]
    type = Maildir
    path = ~/Maildir/
    
    [options]
    verbose = 2
    message_log = ~/.getmail/gmail.log

    First of all, we are using IMAP to retrieve mail as POP has a limit of 99 documents per access and that would take forever.

    Second, we are using the Maildir format for the destination so we need to make sure the target directories have been created (~/Maildir/cur, ~/Maildir/new, ~/Maildir/tmp).

    Third, we need to specify a mailbox or mailboxes to download or the INBOX will be the default.

    Fourth, we need a trailing comma after a single mailbox. This looks like a parsing error in getmail, but the mailboxes option has to be a tuple, and in Python syntax ("[Gmail]/Starred") without the comma is just a parenthesized string, not a one-item tuple.

    Fifth, we need to know the syntax of Gmail’s internal IMAP structure to pull down discrete folders. Non-label folders (Starred, Sent Mail, Drafts, etc.) are accessed with “[Gmail]/Starred” (as in the above config) and labels are accessed directly. For example, the label “Important Project” would have this in the config:

    mailboxes = ("Important Project",)
  5. Download your Gmail – For every folder/label I had within Gmail, I downloaded to a separate folder so I could import into dovecot IMAP without hassle. This entailed changing the mailboxes option in getmailrc, running getmail, renaming Maildir to label/directory name, rinsing, repeating.
  6. Retain Times – Because maildir uses the modification time of every file to determine the sent date, all emails pulled by the above method will basically lose their sense of time. The below PHP script will restore the modification times:
<?php
/* VARS ***********************************************************/
// SITE_DIR is assumed to be defined elsewhere as the base path (with trailing slash)
$box = '';
$stem = SITE_DIR.'Maildir/'.$box.'/new/';
/******************************************************************/
 
$dir_contents = scandir($stem);
foreach($dir_contents as $item) {
  if(!ListFind('.,..,.DS_Store',$item)) {
    $file = $stem.$item;
    $content = file_get_contents($file);
    // prepend a newline so a Date header on the very first line also matches
    $date = extractText("\n".$content,"\nDate: ","\n");
    $utime = ($date===false) ? false : strtotime($date);
    if($utime===false) { continue; } // no usable Date header, leave the file alone
    $converted = date('YmdHi.s',$utime);
    shell_exec('touch -mt '.$converted.' '.escapeshellarg($file));
  }
}
 
// return the text between the first $start and the next $end after it
function extractText($content,$start,$end) {
  if(stripos($content,$start)===false) { return false; }
  $startpoint = stripos($content,$start)+strlen($start);
  $endpoint = stripos($content,$end,$startpoint);
  if($endpoint===false) { $endpoint = strlen($content); }
  $length = $endpoint - $startpoint;
  return trim(substr($content,$startpoint,$length));
}
 
// 1-based position of $inValue in the delimited list, or 0 if it is absent
// (the loop above calls ListFind, which the original listing did not include)
function ListFind($inList, $inValue, $inDelim = ',') {
  $aryList = _listFuncs_PrepListAsArray($inList, $inDelim);
  $pos = array_search($inValue, $aryList, true);
  return ($pos === false) ? 0 : $pos + 1;
}
 
function _listFuncs_PrepListAsArray($inList, $inDelim) {
  $inList = trim($inList);
  $inList = preg_replace('/^' . preg_quote($inDelim, '/') . '+/', '', $inList);
  $inList = preg_replace('/' . preg_quote($inDelim, '/') . '+$/', '', $inList);
  $outArray = preg_split('/' . preg_quote($inDelim, '/') . '+/', $inList);
  if(sizeof($outArray) == 1 && $outArray[0] == '') {
    $outArray = array();
  }
  return $outArray;
}
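
For anyone who would rather stay in Python (getmail already requires it), the same idea can be sketched with the standard library. This is a minimal sketch, not the script used above: `restore_mtimes` is a made-up name, it reads only the headers of each file, and it assumes every regular, non-hidden file in the directory is a single message.

```python
import os
from email.parser import BytesHeaderParser
from email.utils import parsedate_to_datetime


def restore_mtimes(maildir_new):
    """Set each message file's mtime to the time in its Date header.

    Returns the number of files that were updated.
    """
    parser = BytesHeaderParser()
    updated = 0
    for name in sorted(os.listdir(maildir_new)):
        path = os.path.join(maildir_new, name)
        if name.startswith(".") or not os.path.isfile(path):
            continue  # skip .DS_Store and anything that is not a message file
        with open(path, "rb") as fh:
            date_header = parser.parse(fh).get("Date")
        if not date_header:
            continue  # no Date header at all
        try:
            sent = parsedate_to_datetime(str(date_header)).timestamp()
        except (TypeError, ValueError):
            continue  # unparsable date, leave the file untouched
        atime = os.stat(path).st_atime
        os.utime(path, (atime, sent))  # keep atime, rewrite mtime only
        updated += 1
    return updated
```

Calling `restore_mtimes(os.path.expanduser("~/Maildir/new"))` would walk the folder from step 4. Files without a parsable Date header are left as they are.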

photo: chris ivarson
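
A note on the imaplib patch in step 3: the point of the change is that no single SSL read asks for more than 16384 bytes, while the surrounding loop keeps reading until the whole message has arrived. The sketch below shows that pattern on its own. `read_exact` is a hypothetical name rather than imaplib's code, and any object with a `read(n)` method stands in for the SSL object.

```python
CHUNK = 16384  # same cap as in the patched line


def read_exact(sslobj, size, chunk=CHUNK):
    """Read `size` bytes (or until EOF) in pieces of at most `chunk` bytes."""
    parts = []
    read = 0
    while read < size:
        # never request more than `chunk` bytes in one call
        data = sslobj.read(min(size - read, chunk))
        if not data:
            break  # stream ended early
        parts.append(data)
        read += len(data)
    return b"".join(parts)
```

With `io.BytesIO(b"x" * 40000)` standing in for the connection, `read_exact(stream, 40000)` returns all 40000 bytes while never requesting more than 16384 at once.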

This is a reprint of a post I originally made at http://www.propertymaps.com/blog. I felt it was relevant to the current Gmail posts so am reprinting with slight modifications.
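
The change-config, run, rename cycle in step 5 can also be scripted. The sketch below is a variation rather than the exact procedure above: it writes one rc file and one Maildir per label, so nothing has to be renamed afterwards. `prepare_label`, `mailboxes_value`, and the `getmailrc-<label>` file naming are made up for illustration; the config keys are the ones from step 4.

```python
import os

# Same keys as the getmailrc shown in step 4; only the mailboxes line,
# the credentials, and the destination path vary per label.
TEMPLATE = """[retriever]
type = SimpleIMAPSSLRetriever
server = imap.gmail.com
mailboxes = {mailboxes}
username = {username}
password = {password}

[destination]
type = Maildir
path = {path}

[options]
verbose = 2
message_log = ~/.getmail/gmail.log
"""


def mailboxes_value(labels):
    """Format labels as a tuple literal that always ends with a comma."""
    quoted = ", ".join('"%s"' % label for label in labels)
    return "(%s,)" % quoted


def prepare_label(label, username, password, base_dir, rc_dir):
    """Create the Maildir for one label and write its rc file; return the rc path."""
    # Hypothetical naming scheme: make the label safe as a directory name.
    safe = "".join(c if c.isalnum() else "_" for c in label).strip("_")
    maildir = os.path.join(base_dir, safe)
    for sub in ("cur", "new", "tmp"):
        os.makedirs(os.path.join(maildir, sub), exist_ok=True)
    rc_path = os.path.join(rc_dir, "getmailrc-" + safe)
    with open(rc_path, "w") as fh:
        fh.write(TEMPLATE.format(
            mailboxes=mailboxes_value([label]),
            username=username,
            password=password,
            path=os.path.join(maildir, ""),  # trailing slash, as in step 4
        ))
    os.chmod(rc_path, 0o600)  # the file contains a password
    return rc_path
```

Each generated file should then be usable with getmail's `--rcfile` option, one run per label. The trailing comma matters for the same reason as in step 4: in Python, `("a",)` is a tuple and `("a")` is just a string.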

Safari 4 Public Beta Annoyances

First off, I’m going to start posting in a new format – a “squirt” (screw you zune). I have thoughts to convey that are too long for a tweet and too short for a full entry. I’ll put all these squirts into a single category.

Here are some things about Safari that are chapping my hide:

  • With the newest version of WebKit, it is actually slower in the SunSpider JavaScript benchmark. Before, I was running a custom build of Safari 4 with WebKit and was doing over 100 ms better.
  • It breaks Mail.app if you are using the GrowlMail Bundle.
  • Nitro is the name of the new nebulous engine. How much of this is Squirrelfish and how much is optimization to WebKit?
  • 1Password doesn’t work. This is an InputManager so doesn’t really count as it’s an unsupported hack.
  • The blue loading bar is gone and the stop/reload within the URL bar is not intuitive. It’s also harder to see which pages are loading at a glance when clicking through tabs.
  • The bookmark button is cemented onto the URL bar. I don’t use bookmarks. Go away!
  • The trying-to-be-awesome-bar isn’t. It only matches the beginning of what you are typing. For example, in Firefox, I can type “alpha” and it will find the page I want. In Safari, I have to type “secure.xxxxxxx.com/alp” before it finds it.
  • The search bar is jarring in how it plops down like a ton of bricks. This animation is so not Apple.
  • I have a hunch Safari is renicing itself somehow. Opera is not nearly as responsive when Safari 4 is running. Is this just me?
  • Reboot to install a browser? Grrr.

Don’t get me wrong, there is plenty of nice eye candy, especially when you load up the browser for the first time. I’ve seen a lot of the “150 New Features” before as I’ve been compiling this version for a while now. Of those 150 features, there are probably 20 that are new and most I don’t really care about.

Anyway, WebKit has been and will continue to be my default browser (as I write this in Shiretoko).

image: apple

Gmail Account Lockdown

While everyone is up in arms about the recent Gmail outage, I was up in arms about my Account Lockdown (pictured above). Mail.app has been acting up on me again lately (I have a post about Mail.app), so I decided I would clear out my junk in Gmail. I had about 150,000 messages in there and wanted to get rid of about 90% of it so I started mass deleting via IMAP.

Before long, I started noticing that nothing was actually happening. Mail.app was moving the messages to Trash, but when I popped back over to my “All Mail” folder, the same number of messages were there. WTF? So I decided to log in to the web interface to check things out (and to check out the new button styles) and lo and behold, my account was locked down.

The lockdown was over “suspicious” behavior and I guess in my case that behavior was “deleting large amounts of email”. But I was doing it via IMAP, not POP. So off to the troubleshooting page which was of no help whatsoever, just telling me the lockdown would be in effect for 24 hours. Great.

This, the recent outage, and disappearing email have me slightly worried.

Using the Bit.ly and Twitter API

I did a little experiment and created a tweet bot that would announce every article I had just finished reading. There were two problems with this – 1) I was putting out too much noise in my twit stream, 2) I may not necessarily want everyone to know when I’m reading and not working, heh. So I canned the idea.

Poking around these APIs (Twitter and Bit.ly) really opened my eyes as to how powerful these simple services can be once the internals are exposed. Bit.ly ran an API contest and some of the entrant applications will blow your mind. A simple URL shortener can produce so much power. I wish I had known about the contest a little earlier.

What I’m going to show here is almost the same as the 140it.com application, but my version is much simpler. I also want to test a syntax highlighting plugin for PHP and many other languages which uses GeSHi.

So, I’ll just jump to the code and not go into the application logic. Here is the first part to shorten a URL:

// get shortUrl
$ch = curl_init();
$url = 'http://api.bit.ly/shorten';
$post_string = 'longUrl='.urlencode($data['url']).'&version=2.0.1';
$options = array(
    CURLOPT_URL => $url,
    CURLOPT_USERPWD => '[username]:[password]',
    CURLOPT_POSTFIELDS => $post_string,
    CURLOPT_RETURNTRANSFER => true
);
curl_setopt_array($ch, $options);
$response = curl_exec($ch);
curl_close($ch);
 
// json_decode() returns null when the request failed or the body is not JSON
$response = json_decode($response);
if(!$response || $response->statusCode!='OK') {
    return false;
}
// results is keyed by the long URL; the braces keep that lookup intact on PHP 7+
$shortUrl = $response->results->{$data['url']}->shortUrl;
if(trim($shortUrl)=='') {
    return false;
}

And the next part to tweet about it:

// tweet that
$ch = curl_init();
$url = 'http://twitter.com/statuses/update.json';
$post_string = 'status='.urlencode('Just Read: '.$data['name'].' - '.$shortUrl);
$options = array(
    CURLOPT_URL => $url,
    CURLOPT_USERPWD => '[username]:[password]',
    CURLOPT_POSTFIELDS => $post_string,
    CURLOPT_RETURNTRANSFER => true
);
curl_setopt_array($ch, $options);
$response = curl_exec($ch);
curl_close($ch);
 
// treat a failed request (null from json_decode) the same as an API error
$response = json_decode($response);
if(!$response || isset($response->error)) {
    return false;
}

If you decide to use this code, make sure your tweet is below the character limit. The above code does not reflect that.
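
One way to handle that is to trim the article title before building the status, so the prefix and the short URL always survive. Below is a minimal sketch in Python rather than PHP. `build_status` is a hypothetical helper, and the default limit of 140 is an assumption (the post does not state a number), so pass whatever limit applies. It counts plain Python characters, which may not match exactly how the service counts.

```python
def build_status(title, short_url, limit=140, prefix="Just Read: "):
    """Build 'Just Read: <title> - <url>', trimming the title to fit `limit`."""
    suffix = " - " + short_url
    room = limit - len(prefix) - len(suffix)  # characters left for the title
    if room < 0:
        raise ValueError("limit is too small for the prefix and the URL")
    if len(title) > room:
        if room > 3:
            # cut the title and mark the cut with three dots
            title = title[:room - 3].rstrip() + "..."
        else:
            title = title[:room]
    return prefix + title + suffix
```

A short title passes through unchanged; a long one is cut so that the whole status never exceeds the limit.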

One last thing, the real-time link stats at Bit.ly are the bomb. These guys are really pushing the boundaries here and I’m a huge fan. To access the real time stats, take an example URL like http://bit.ly/mQcHJ and add “info/” in the middle – http://bit.ly/info/mQcHJ.
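
The same rewrite can be done in code. This is a small sketch using only the URL pattern described above; `info_url` is a hypothetical helper name.

```python
from urllib.parse import urlsplit, urlunsplit


def info_url(short_url):
    """Insert 'info/' in front of the hash part of a short URL."""
    parts = urlsplit(short_url)
    path = "/info/" + parts.path.lstrip("/")
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


print(info_url("http://bit.ly/mQcHJ"))  # http://bit.ly/info/mQcHJ
```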

Another last thing, the GeSHi plugin is also pretty cool. When copying the code, you don’t end up copying the line numbers.

image: bit.ly