One thing bugging me was that the script exec'ed rpm for each package. Even though UNIX process creation is relatively inexpensive, the programs being exec'ed take time to initialise themselves: they have to open files, read configs, create internal structures, etc. The cumulative initialisation costs can be substantial. For example, the old makewhatis script that used to ship with many Linux distros exec'ed gawk for every manual page, which took 30 minutes on a 486DX66. It was so annoying that I rewrote it to exec gawk less often, and the run time dropped to 1.5 minutes. The improved version is still included in man-1.6g. Given how many machines were once running this script, the reduction in carbon emissions may have been significant ;-)
By taking advantage of rpm's --queryformat option I've changed the rpmChangelogs script to exec rpm with up to 100 rpm arguments at a time. This is about 10 times faster for large runs. For example, when I generated a summary dating back to my upgrade from OpenSUSE 11.4 to 12.1, the run time dropped from about 50 seconds to 5 seconds.
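The batching idea can be tried on its own, without running rpm. What follows is a minimal Python 3 sketch (the script itself targets Python 2). The helper names `chunked` and `build_rpm_command` are made up for illustration and do not appear in the script, and the package names are placeholders.

```python
def chunked(items, size):
    """Split items into consecutive lists of at most size elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def build_rpm_command(packages, show_desc=False):
    """Build one rpm argument list that queries every package in the batch."""
    query_format = '+Package: %{INSTALLTIME} %{NAME}-%{VERSION}-%{RELEASE}\n'
    if show_desc:
        query_format += '%{DESCRIPTION}\n\n+Changelog:\n'
    return ['rpm', '-q', '--queryformat=' + query_format, '--changelog'] + packages


if __name__ == '__main__':
    # Hypothetical package names, used only to show the batch sizes.
    names = ['pkg%d' % n for n in range(250)]
    for batch in chunked(names, 100):
        command = build_rpm_command(batch)
        print(len(batch), 'packages in one rpm invocation')
```

With 250 names and a batch size of 100, rpm would be exec'ed three times instead of 250 times, which is where the saving comes from.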
I've added an option to include the description of the package. I've also added an option to accept the rpm names from the command line instead of just reporting on the most recently installed ones.
Here is the syntax summary for the new version:
python rpmChangelogs.py -h
Usage: rpmChangelogs.py [options] [rpm...]

Report change log entries for recently installed (-i) rpm's or for the rpm's
specified on the command line.

Options:
  -h, --help            show this help message and exit
  -i INSTALLDAYS, --installed-since=INSTALLDAYS
                        Include anything installed up to INSTALLDAYS days ago.
  -c CHANGEDAYS, --changed-since=CHANGEDAYS
                        Report change log entries from up to CHANGEDAYS days
                        ago.
  -d, --description     Include each rpm's description in the output.
Except for the optional addition of the description, the output is the same as that of the previous OpenSUSE-only script.
My Python is a little rusty - I just spent months doing Java - so I've also gone back over the script and tried to tidy up the code.
The code
#!/usr/bin/env python
#
# rpmChangelogs.py
#
# Copyright (C) 2011: Michael Hamilton
# The code is GPL 3.0(GNU General Public License) ( http://www.gnu.org/copyleft/gpl.html )
#
# Updated 2013/03/18: now uses seconds from 1970 to avoid localisation issues with the dates output by rpm.
#
import subprocess
from datetime import datetime, timedelta
from optparse import OptionParser

maxArgsPerCommand=100

optParser = OptionParser(
    usage='Usage: %prog [options] [rpm...] ',
    description="Report change log entries for recently installed (-i) rpm's or for the rpm's specified on the command line.")
optParser.add_option('-i', '--installed-since', dest='INSTALLDAYS', type='int', default=1, help='Include anything installed up to INSTALLDAYS days ago.')
optParser.add_option('-c', '--changed-since', dest='CHANGEDAYS', type='int', default=60, help='Report change log entries from up to CHANGEDAYS days ago.')
optParser.add_option('-d', '--description', dest='DESC', action='store_true', default=False, help="Include each rpm's description in the output.")
(options, args) = optParser.parse_args()
installedSince = datetime.now() - timedelta(days=options.INSTALLDAYS)
changedSince = datetime.now() - timedelta(days=options.CHANGEDAYS)
showDesc = options.DESC

if len(args) > 0:
    recentPackages = args
else:
    queryProcess = subprocess.Popen(['rpm', '-q', '-a', '--last'], shell=False, stdin=None, stdout=subprocess.PIPE, stderr=subprocess.PIPE, close_fds=True)
    recentPackages = []
    for queryLine in queryProcess.stdout:
        (name, dateStr) = queryLine.split(' ', 1)
        installDatetime = datetime.strptime(dateStr.strip(), '%a %d %b %Y %H:%M:%S %Z')
        if installDatetime < installedSince:
            break
        recentPackages.append(name)
    queryProcess.stdout.close()
    queryProcess.wait()
    if queryProcess.returncode != 0:
        print '*** ERROR (return code was ', queryProcess.returncode, ')'
        for line in queryProcess.stderr:
            print line,

# Use one rpm exec to query multiple packages - 10x faster than an exec for each one
marker = '+Package: '
markerLen = len(marker)
for subset in [recentPackages[i:i+maxArgsPerCommand] for i in range(0, len(recentPackages), maxArgsPerCommand)]:
    format = marker + '%{INSTALLTIME} %{NAME}-%{VERSION}-%{RELEASE}\n' + ('%{DESCRIPTION}\n\n+Changelog:\n' if showDesc else '')
    rpmProcess = subprocess.Popen(['rpm', '-q', '--queryformat=' + format, '--changelog'] + subset, shell=False, stdin=None, stdout=subprocess.PIPE, stderr=subprocess.PIPE, close_fds=True)
    tooOld = False
    for line in rpmProcess.stdout:
        if line.startswith(marker):
            installedDate = datetime.fromtimestamp(float(line[markerLen:line.rfind(' ')]))
            name = line.rsplit(' ', 1)[1]
            print '=================================================='
            print marker, installedDate, name,
            print '------------------------------'
            tooOld = False
        else:
            if line.startswith('* ') and len(line) > 17:
                try:
                    changeDate = datetime.strptime(line[:line.rfind(' ')], '* %a %b %d %Y')
                    tooOld = changeDate < changedSince
                except ValueError:
                    pass # not a date - move on
            if not tooOld:
                print line,
    rpmProcess.stdout.close()
    rpmProcess.wait()
    if rpmProcess.returncode != 0:
        print '*** ERROR (return code was ', rpmProcess.returncode, ')'
        for line in rpmProcess.stderr:
            print line,
    rpmProcess.stderr.close()
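The date filtering in the output loop can be exercised without rpm installed. Here is a hedged Python 3 sketch of the same logic (the script above is Python 2). `filter_changelog` is a hypothetical helper that is not part of the script, and the sample lines are invented to resemble rpm output rather than copied from a real package.

```python
from datetime import datetime

MARKER = '+Package: '


def filter_changelog(lines, changed_since):
    """Return the lines to print, dropping changelog entries older than changed_since."""
    kept = []
    too_old = False
    for line in lines:
        line = line.rstrip('\n')
        if line.startswith(MARKER):
            # Marker line: "+Package: <seconds since 1970> <name-version-release>"
            seconds, name = line[len(MARKER):].rsplit(' ', 1)
            installed = datetime.fromtimestamp(float(seconds))
            kept.append('%s%s %s' % (MARKER, installed, name))
            too_old = False  # every package starts with a fresh state
            continue
        if line.startswith('* ') and len(line) > 17:
            try:
                # As in the script: everything before the last space is taken as the date.
                change_date = datetime.strptime(line[:line.rfind(' ')], '* %a %b %d %Y')
                too_old = change_date < changed_since
            except ValueError:
                pass  # not a date header - keep the current state
        if not too_old:
            kept.append(line)
    return kept


if __name__ == '__main__':
    # Invented sample input, for illustration only.
    sample = [
        '+Package: 1363564800 example-1.0-1',
        '* Mon Mar 18 2013 someone@example.com',
        '- recent fix',
        '* Sat Jan 01 2011 someone@example.com',
        '- old change',
    ]
    for out in filter_changelog(sample, datetime(2013, 1, 1)):
        print(out)
```

Because the install time arrives as seconds since 1970, the marker line parses the same way in any locale. The changelog headers still rely on English day and month names.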