Skip to content

monads, menus, and mercurial

Recently I switched from the tiling window manager ion3 to a new window manager: xmonad. xmonad is written in Haskell, has a tiny and malleable code base (500 lines), and is easy to extend (if you know Haskell!).

One of the reasons that xmonad is so small is that it leaves some window manager features to external programs. For example, it doesn’t have any sort of menu. Most xmonad users use dmenu to provide a program launcher. (I also use it to run internal xmonad commands with my XMonadContrib.Commands module.)

dmenu is originally by Anselm R. Garbe, author of the window managers dwm and wmii. He follows strict anti-bloat principles and endeavors to keep his programs free of extraneous features. This is a respectable goal. On the other hand, there are several patches that multiple users have found useful that are not included in dmenu: Xinerama support, a history feature, and so on.

dmenu is maintained in the Mercurial distributed version control system. This makes it trivial for me to maintain my own version of dmenu, synced with upstream changes, that includes these extra features. It’s even easy for me to share this with the world. (I’d like to think of this as an “extra features edition” rather than as a “fork”.)

You can check out my version with Mercurial:

hg clone http://www.davidglasser.net/hg/dmenu/

or just grab a tarball of the latest version.

On a related note, I’ve been very happy with Mercurial. Its user interface is very nice, the underlying principles are sound, and the work-in-progress book by Bryan O’Sullivan is well-written and helpful. xmonad uses the distributed version control system darcs, and I tried to do similar tasks (keeping my local config in my personal repository while syncing with upstream) in darcs. While darcs has many interesting ideas, I ran into the serious performance problems almost immediately; and because darcs doesn’t really have any idea of a “changeset identifier” other than the log message, it’s pretty hard to use the command line to find out about various patches. Mercurial, on the other hand, was a breeze. There are certainly things I prefer about Subversion (documented internals with a stable API being the most obvious), but I should definitely experiment more with Mercurial (if for no other reason than to learn lessons that can be applied to Subversion and svk).

What are you really trying to do?

I’ve found that 99% of the time, when somebody asks a detailed technical question, the right first answer is “What are you really trying to do?” It’s often frustrating for the person who hears it; they think they understand their problem very well and that they’ve reduced it to a single technical point, but in reality approaching the issue from a completely different direction is the way to go. I ran into this with myself this morning when working on my thesis project, amock.

amock starts with a dynamic analysis of a Java program; it is implemented (for simplicity) as an offline analysis. The program is run in a special instrumented mode which dumps out a trace of every method called to a file, which is later processed by the actual analysis algorithm. In the trace generated by my instrumentation, every non-primitive object is assigned a unique integer ID, so that the processor can tell which object is which. It does this by using a weak identity hash map: a chimera of the standard Java classes WeakHashMap and IdentityHashMap.

The problem is, this doesn’t work for creating trace entries about constructors. When you’re calling a constructor, the semi-initialized object is this weird weird thing that you basically can’t examine at all… trying to do basically anything to it other than call ‘dup’ or call a constructor on it makes the verifier cry. Thus my instrumentation has to go to extra efforts to substitute a dummy ConstructorReceiver object into the calls to my tracer implementation, because you can’t actually pass a semi-initialized object to methods. This means that my trace entries for entering-constructor all have no idea which object they’re constructing, and there’s no good way to tell from the trace if two different entering-constructor entries refer to the same object (calling superclass constructors on it, say) or not. (The exiting-constructor entries do have an identity for the object, though, because the object is initialized by then).

So this morning I spent about an hour poring over the JVM spec and trying to come up with complicated instrumentations that would extract some information from this annoying uninitialized object. I had several different plans; for example, I started trying to work out how to instrument every single constructor in the system to add an extra object-id argument. I went to one of my labmates and tried to get advice on how to instrument the Java bytecode in order to somehow get this to work.

“What are you really trying to do?” he asked.

So I started explaining from the beginning. And suddenly it hit me: I didn’t need a fancier instrumentation. All I needed to do was write a braindead two-pass algorithm that reads in a trace file, matches entering-constructor and exiting-constructor entries, and spits out a trace with the additional information in it. It took me ten minutes and worked the first time.

You’ve got your microformat in my JavaScript!

A couple of years ago, when microformats were first created, I wrote a JavaScript-based renderer for hCalendar which I called JSCalendar. The idea is that wherever an event was mentioned on a web page, you would include hCalendar attributes in the description, so that the HTML was both human- and machine-readable; you’d then just include my JavaScript file an an empty <div> and voila, an attractive monthly calendar would appear on your site, fully stylable with CSS. I never ended up actually using JSCalendar for anything myself, but other folks apparently found it useful, and I get bug reports and suggestions every so often. It’s not exactly the best piece of software I ever created (and in fact was the first JavaScript I wrote), but hey, if it’s useful, might as well make it more easily available and make it easier for others to contribute patches to it.

So I’ve moved it to Google Code Hosting. I’ve also changed its name to js-hcalendar, since jscalendar was already the SourceForge name of a very nice embedded JavaScript calendar.

It’s alive!

It’s horribly hacky. The newest file is full of TODO comments and special cases. (The rest of the infrastructure is slightly less so.)

But!

amock can run a system test and automatically generate a passing unit test from it!

(The key word here is a system test: the one it is tested on. But still!)

You, too, can run amock

In the interests of openness, and for that matter making me embarrassed if I’m not making steady progress, I’ve moved the Subversion repository for my thesis project amock to Google Code Hosting.

Output

I wrote three papers at the end of last semester. The first was the proposal for my master’s thesis. My project is inspired by David Saff’s test factoring project. It can be relatively easy to write a few top-level system tests for an application, whereas it can be slow and tedious to write a complete suite of unit tests which exercise the entire program. The concept behind test factoring is that by dynamically observing the execution of a system test, we can automatically generate a large number of unit tests which drive individual objects in the same way that the entire test suite did.

The main focus of David Saff’s project was efficiency: a very slow system test can be run much quicker if the code in only a small number of classes is executed, and the rest of the world is replaced by “mocks”. His project executed the program in a specialized runtime environment which read from a transcript of the system test and simulated method calls which reached outside of the tested classes by just returning whatever they did during the system test.

That project is quite interesting. However, my focus in test generation is less on efficiency of running tests and more in ease of creating and modifying tests. Often, you may find yourself needing to change a test you wrote to be more lenient: to allow a return value to take on more values, or to allow two events to occur in a different order, for example. The ability to do this is even more important for automatically generated tests, which never had the benefit of a human writer and may be very brittle. So, my project (which I am calling @amock@) has the same general idea as test factoring, but its output is standard Java code. The generated tests must be legible and comprehensible by developers; developers much be able to relax and otherwise tweak the generated tests. In order to implement “mocking out the world”, I am using jMock, a Java mock-object library. (Yes, I do realize that the mock objects folks prefer to think of mock objects as a design aide and not a post facto testing tool; perhaps this will lead to the generated tests not being very good for most programs. We will see.)

My goal is to implement @amock@, write my thesis, take two classes, and TA another one this semester; I’ll move to San Francisco and start my new job after that. Since this is a pretty heavy load, there’s a reasonable chance it’ll take me the summer to finish up the thesis and I’ll have to wait on the job; let’s hope that doesn’t happen.

My other two papers were group projects for my classes last term. I’d describe them both as interesting projects which didn’t pan out as conclusively as we’d hoped; however, I’m still pretty happy with both of them. For my databases class, I looked into the best database structures for implementing oh-so-trendy-Web-2.0 tagging queries, working with Richard Tibbetts and Chris Laas. For advanced algorithms, Natan Cliffer and I explored what we called packing dual problems of machine scheduling problems: problems that generalize the NP-complete bin packing problem in the same way that machine scheduling problems are generalized. I think our classification is an interesting idea, but unfortunately we didn’t prove any non-trivial results about specific subproblems.

Decisions, decisions

Over the past couple of months, I’ve been lucky to meet dozens and dozens of brilliant engineers from some of the best companies in the modern software industry. This is an incredible time to be looking for a first job out of school in software. I could easily have been very happy at six or seven of the companies that I interviewed with (StreamBase, Amazon, Vanu, ITA Software, Akamai, OKCupid, and more).

After lots of thought, though, I have decided that after graduating with my MEng next year, I will be moving to San Francisco and working for Google.

Now, back to actually being a grad student!

A Topological Lesson

My friend Lissa made an interesting topological discovery: these handcuffs can be slipped off of their bar. She made a video demonstrating it, and I am finally joining the 21st century by YouTubeing it for her.

It turns out that technically, the way she described what she had shown was not entirely accurate. I got a little carried away and posted a page-long comment, starting with her handcuffs and ending with the Poincaré Conjecture. Enjoy!

Subversion Developer’s Summit: Day 1!

Here I am at the Subversion Developer’s Summit at Google in Mountain View. Looks like clkao and I are giving a talk on SVK today or tomorrow.

Subversion prides it on being a respectful community — Ben Collins-Sussman mentioned in his opening spiel that it was one of the big factors in Subversion’s success. I’m happy to report that these are not lies, and that these are pretty great folks.

Yesterday, we dropped by Google to hack for a while. My first plan was to work on my Advanced Algorithms homework. That plan quickly changed to hacking on ZVM, a Python Z-Machine implementation. Turned out there were at least two of us not already on board with ZVM who had implemented Z-Machines before!

We got distracted away from that pretty quickly by noticing that there were a whole lot of changes nominated in the Subversion 1.4.1 STATUS file that really needed to be reviewed and backported in order to get the release out. So we took a couple hours and got the 1.4.x branch to an almost release-worthy state. Maybe tonight!

In a long discussion about distributed version control last night at Karl Fogel’s, most folks realized that none of us had really spent serious time working with a distributed version control system. (Well, other than SVK, which honestly already has most of the features we were talking about…) To get more experience, we’re going to play around with Mercurial during the Summit.

There’s going to be a shared developer’s blog for reporting on the summit, if interested.

Announcing Hiveminder

While my main work recently has been doing research towards my MEng in the Program Analysis Group at MIT’s CSAIL, I’ve been paying attention to the “secret project” from my internship at Best Practical last summer. Most of what I did there was way out in the open, in the public repositories of the request tracker RT, the object-relational database mapper Jifty::DBI, the web framework Jifty, and the distributed version control system SVK (which wasn’t actually part of Best Practical at the time). But we’ve been keeping this project under wraps and in “private beta” until this weekend, when we released…

Hiveminder! Hiveminder is a collaborative task organizer. Sure, there are many other to-do trackers out there. Hiveminder’s biggest strengths are in its intuitive and simple way of letting you set up “… but first” and “… and then” style relationship between tasks and its support for sharing tasks between people, one-on-one and in groups.

Today, when I look at my main “to-do” page, I see the next steps I need to to for my research, the SVK bugs and features I’d like to work on, a reminder to renew the hosting for this website (with a due date of… today… ooh, should deal with that right now… done), all in one spot.

Technology-wise, the Jifty framework we wrote has some very nice advantages. (Disclaimer, or bragging, or something: Jifty was designed mostly by Jesse Vincent (”the boss”), but I did most of the implementation of its first draft. Of course, since then I think every line I wrote has been replaced.) For example, all code that actually changes anything in the database goes through specific “action” objects; while we haven’t finalized a public API for Hiveminder, this means that eventually everything you can do through the site will be achievable via an API, without any special custom coding.

Another nice technical feature in Jifty is the concept of individually-addressable “page regions” inside web pages, which can be updated via AJAX in a single click. This feature is nothing new for AJAXy webpages… until you try Hiveminder again without Javascript and find out that everything still works. Sure, there are a few extra page reloads, and some fancy UI features like drop-down menus can’t be used, but other than that the site works exactly like it does with Javascript… and without Hiveminder’s code having to have any special cases at all. Hell, I just added a comment to a task about an SVK bug using lynx… just because I can.

Of course, I’m being distracted from work now, so I’ll click on the “pag” tag to show only my research-related tasks and get back to editing this paper…