Saturday 18 June 2011

Hacking on usb-modeswitch, part 1

Lately I've been spending time porting the usb_modeswitch_dispatcher Tcl script from the usb-modeswitch package to C.

This has been a great exercise for both my knowledge of Tcl (almost non-existent) and my knowledge of C. It has also been very interesting so far to see how USB devices get "switched" from storage mode into modem mode.

One of the problems I'm hitting now is balancing performance against disk space usage. In an attempt to cut down on installed size, the usb_modeswitch data has all been compressed into one tarball comprising 162 small text files. Each file holds the vendor and product IDs expected before and after switching, plus the magic message that performs the actual switch (see Debian bug 578024 for the rationale for compressing the files). A compressed tarball is great for saving space, but it could cause delays at boot time when the file needs to be uncompressed, perhaps multiple times. On the other hand, separate files take more space, which is especially a problem for those who don't need usb-modeswitch on their systems.

I'm now working on quantifying the performance difference between the compressed and uncompressed layouts, as well as trying to figure out the actual size impact of both options on the Live CD. Theoretically, the compressed tarball should make little difference there, or even take more space (because you can't really compress something that is already compressed). I'll find out and post results here. As for performance, looking through the compressed tarball and extracting one file from it may not cost much, but every little bit of gain can help.
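To give an idea of how such a comparison could be set up, here is a minimal sketch in Python. It is not the actual usb-modeswitch data or code: the file names (`dev_XXXX`), the file contents and the helper names are all made up for illustration, and it only times a lookup of one entry from a gzip tarball against reading the same entry as a plain file.

```python
import tarfile
import tempfile
import time
from pathlib import Path


def build_sample(root, count=162):
    """Create `count` small files (hypothetical names) and a gzip tarball of them."""
    data_dir = root / "data"
    data_dir.mkdir()
    for i in range(count):
        # Placeholder content; the real files hold IDs and the switch message.
        (data_dir / f"dev_{i:04x}").write_bytes(f"TargetVendor=0x{i:04x}\n".encode())
    tarball = root / "data.tar.gz"
    with tarfile.open(tarball, "w:gz") as tar:
        for path in sorted(data_dir.iterdir()):
            tar.add(path, arcname=path.name)
    return tarball


def read_from_tarball(tarball, name):
    """Open the compressed tarball and read one member out of it."""
    with tarfile.open(tarball, "r:gz") as tar:
        return tar.extractfile(name).read()


def read_from_dir(data_dir, name):
    """Read the same entry as a plain, uncompressed file."""
    return (data_dir / name).read_bytes()


def time_call(func, *args, repeat=50):
    """Average wall-clock seconds per call over `repeat` runs."""
    start = time.perf_counter()
    for _ in range(repeat):
        func(*args)
    return (time.perf_counter() - start) / repeat


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        tarball = build_sample(root)
        name = "dev_00a1"  # last entry, so the tarball is read to the end
        print("tarball:", time_call(read_from_tarball, tarball, name))
        print("plain:  ", time_call(read_from_dir, root / "data", name))
```

The numbers from a warm page cache on a desktop will not say much about a cold boot, so this is only a starting point for the measurements.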

3 comments:

doctormo said...

You should keep the tarball of the files, but use a caching directory in /var/cache/ to hold the file you need. It can stay there for years if need be.

That way the performance delay will only happen on the first boot.
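A rough sketch of that caching idea, in Python for brevity, might look like the following. The function name, the cache layout and the rule of refreshing the cache when the tarball is newer are assumptions made for illustration, not anything usb-modeswitch actually does.

```python
import tarfile
from pathlib import Path


def cached_config(tarball, cache_dir, name):
    """Return one config entry, extracting it from the tarball only on a cache miss."""
    tarball, cache_dir = Path(tarball), Path(cache_dir)
    cached = cache_dir / name
    # Assumed invalidation rule: the cached copy is valid if it is not older
    # than the tarball it came from.
    if cached.exists() and cached.stat().st_mtime >= tarball.stat().st_mtime:
        return cached.read_bytes()
    with tarfile.open(tarball, "r:gz") as tar:
        member = tar.extractfile(name)  # raises KeyError if the entry is missing
        if member is None:
            raise KeyError(name)
        data = member.read()
    cache_dir.mkdir(parents=True, exist_ok=True)
    cached.write_bytes(data)
    return data
```

With a cache directory under /var/cache/ as suggested, only the first lookup for a given device would pay for decompression.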

Anonymous said...

(I'm the Debian maintainer for usb-modeswitch{,-data} ).

Honestly, I don't think the unpacking performance has any measurable impact: it happens _once_ per device, per switch. And it's not an operation that blocks the boot process, as the udev script forks.

As for the /var/cache idea, it could be done, but again I'm not sure that it would help, performance-wise: you still have to check for overrides in /etc/, purge the cache on database upgrades, etc.

But I agree the tcl-script C rewrite is a good thing as it will reduce the dependencies footprint of usb-modeswitch.

Btw, feel free to contact me or upstream to discuss things further, I would be happy to avoid a diff between the Debian and Ubuntu usb-modeswitch packages.

Oh wat mooi said...

Compressed tar is not a good format for an archive that needs random access as there is no directory or index. To find a file, the whole archive needs to be scanned.

Zip is a lot better in that regard; that is why it is used by things like Java and OpenDocument Format.

This will allow the best of both worlds...
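To illustrate the difference, here is a small Python sketch using only the standard library. The entry names and contents are invented. `zipfile` finds an entry through the archive's central directory, whereas `tarfile` on a gzip stream has to decompress and walk the members until it reaches the one requested.

```python
import io
import tarfile
import zipfile


def make_archives(files):
    """Build a .tar.gz and a .zip in memory from a {name: bytes} mapping."""
    tar_buf, zip_buf = io.BytesIO(), io.BytesIO()
    with tarfile.open(fileobj=tar_buf, mode="w:gz") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    with zipfile.ZipFile(zip_buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return tar_buf.getvalue(), zip_buf.getvalue()


def read_from_zip(zip_bytes, name):
    """Look the entry up in the zip central directory and read just that entry."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        return zf.read(name)


def read_from_targz(tar_bytes, name):
    """Scan the decompressed tar stream until the named member is found."""
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:gz") as tar:
        return tar.extractfile(name).read()


if __name__ == "__main__":
    # Hypothetical entries standing in for the per-device data files.
    files = {f"dev_{i:04x}": f"TargetVendor=0x{i:04x}\n".encode() for i in range(162)}
    tar_bytes, zip_bytes = make_archives(files)
    print(read_from_zip(zip_bytes, "dev_00a1") == read_from_targz(tar_bytes, "dev_00a1"))
    print("tar.gz bytes:", len(tar_bytes), "zip bytes:", len(zip_bytes))
```

One trade-off to keep in mind: zip compresses each entry separately and stores per-entry headers, so for many tiny files the zip may well come out larger than the tar.gz.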