Sunday, March 27, 2011

Format of pipermail/Mailman database/ files?

I want to write a tool that reads some pipermail archives and builds a Message-ID to URL table.

If I run `strings /var/lib/mailman/archives/public/discuss/database/2011-March-article` I can see that the information I want is in these database files. Judging by /usr/lib/mailman/Mailman/Archiver/pipermail.py, this looks like a bsddb-formatted file.

When I try to read these files from a Python script, all I get is:

bsddb._db.DBInvalidArgError: (22, 'Invalid argument -- /var/lib/mailman/archives/public/discuss/database/2011-March-article: unexpected file type or format')


Anyone have any ideas? I'm not a Python programmer, so it is quite possible that I'm just not understanding how to do things in Python. Any suggestions greatly appreciated.

Note: This is python-2.4.3-27.el5_5.3 on a CentOS 5.5 machine. I'm running the following script on the same machine that writes the files, so there shouldn't be a version mismatch.

#!/usr/bin/python

import bsddb

path = "/var/lib/mailman/archives/public/discuss/database/2011-March-article"

# Open the archive database as a Berkeley DB btree; this call raises DBInvalidArgError.
db = bsddb.btopen(path)
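
Since the error complains about an "unexpected file type or format", one thing worth checking is whether the file carries a Berkeley DB magic number at all. Below is a minimal sketch of that check, not a fix. It assumes the commonly documented magic values (0x053162 for btree, 0x061561 for hash), stored at byte offset 12 in current formats and at offset 0 in the old DB 1.85 format. The function name `sniff_bdb_type` is made up for illustration, and the code avoids newer syntax so that it should also run on an old Python such as 2.4.

```python
import struct

# Berkeley DB magic numbers (an assumption taken from Berkeley DB itself,
# not from anything in Mailman).
BDB_MAGICS = {0x00053162: "btree", 0x00061561: "hash"}


def sniff_bdb_type(path):
    """Return 'btree', 'hash', or None if no known magic number is found."""
    fp = open(path, "rb")
    try:
        header = fp.read(16)
    finally:
        fp.close()
    # Current formats keep the magic at offset 12; DB 1.85 keeps it at offset 0.
    for offset in (12, 0):
        chunk = header[offset:offset + 4]
        if len(chunk) < 4:
            continue
        # The file may have been written in either byte order.
        for fmt in ("<I", ">I"):
            magic = struct.unpack(fmt, chunk)[0]
            if magic in BDB_MAGICS:
                return BDB_MAGICS[magic]
    return None
```

If `sniff_bdb_type(path)` returns None for the 2011-March-article file, the file is probably not a Berkeley DB file in the first place, which would be consistent with the error above.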