This winter, while home visiting family, I took the opportunity to gather up all of my old hard disks and archive them. This amounted to the majority of my digital footprint for the first 18 years of my life. I'd been putting the task off for a few years, but the chance to explore the data sitting on these old drives (and the cherished computers they came from!) helped motivate this project.
When I was a teenager, whenever a hard disk needed replacement, I'd pull the old drive and shove it in my closet. There they sat, some for over a decade, until I turned them back on last month. Amazingly, across ~350GB of data, only about 500KB proved unrecoverable. Quite fortunate, considering that some of these drives were 15 years old, and one was removed with failing health checks.
In the process of recovering this data, I resolved to preserve it for the rest of my lifetime. Why go to all this trouble? Well, in my explorations of these old drives, I discovered far more meaningful memories than I expected. Old music and 3D renders, chat logs, emails, screenshots, and tons of old photos. The disks from the early 2000s were a reminder of the days when computer use gravitated around "My Documents" folders. Then I learned about Linux and always-on internet access arrived. I took a peek at my first homedir and found all of the little Python scripts I wrote as I learned to work on the command line.
By today's standards, the breadth and fidelity of these scraps is rather... quaint. A kid growing up today should have a fine pixel-perfect view of most of their digital trail as they grow up. That was another reason this project proved interesting: it was not just a record of how computers changed; it revealed how the ways I used computers and what they meant to me changed over time.
Here's a brief rundown of the tools and backup process I used, both because they will be useful to refer back to decades from now, and because they may perhaps be useful to others in their own backup tasks:
Archival process
IDE HDD -> USB -> Laptop -> External USB HDD
I used a Sabrent USB 2.0 to IDE/SATA Hard Drive Converter (USB-DSC5) to connect the drives to my laptop. I've found this to be a really handy (and cheap!) swiss-army knife for recovering old hard drives, especially since it works on both 3.5" and 2.5" drives. To store the recovered data, I used a 2TB WD My Passport USB Hard Drive (WDBBKD0020BBK-NESN). I've had good experiences with these drives in the past, and they have a great form factor. I ordered both items from Amazon and received them a couple days into my trip.
Reading data from the drives
To recover data from the drives, I used ddrescue. This is an imaging tool like dd that will record read errors and exhaustively retry the surrounding areas. Recovering a drive looked like this:
Copy data from /dev/sdc to disk.img (outputting a log of errors to ./disk-log):
$ ddrescue -d -n /dev/sdc ./disk.img ./disk-log
One of my favorite features of ddrescue is that you can re-run it at any point to resume where it left off or try to recover more data. In the initial run, I passed -n to skip the slow exhaustive scraping process, in hopes of getting as much data off the drives as possible in case they stopped working after running for a while. Thankfully, no issues cropped up. If there were read errors during the initial sweep, I re-ran the process with a retry count:
$ ddrescue -d -r 3 /dev/sdc ./disk.img ./disk-log
In addition, I saved the partition table and S.M.A.R.T. data separately:
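Roughly, that amounted to a couple of one-liners along these lines (the device name is illustrative; the fdisk and smart output files are the ones bundled into the archives below):

Dump the partition table and the S.M.A.R.T. report for /dev/sdc:

$ fdisk -l /dev/sdc > ./fdisk
$ smartctl -a /dev/sdc > ./smart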
With the holidays over and all disks archived, I flew back home with the external HDD in my carry-on bag.
Cold storage in the cloud
Thanks to the advent of cheap cloud cold storage options like Amazon Glacier, Google Nearline, and Backblaze B2, it's now very affordable to dump a bunch of full disk images in the cloud. I chose Google Nearline for this task. Amazon Glacier is a bit cheaper (Glacier: $.007 / GB, Nearline: $.010 / GB), but retrievals are complicated to execute and price. Backblaze B2 is even cheaper, but only uses a single datacenter.
Before uploading my backups, I was able to shave off ~100GB (almost 30%!) by compressing with lrzip, which is specialized for large files with repeated segments. I also experimented with compressing one of the disk images with xz, but (as predicted by lrzip's benchmarks) xz took 22% longer to produce a file 10% larger.
After compressing the images, I encrypted them with AES256 using gpg. While I've typically used the default CAST5 cipher in the past, for this project I chose AES256 based on this guide. I considered generating a keypair for these backups: my plan was to create copies of the private key encrypted with a couple different passwords given to family members, etc. I decided to defer this because I didn't fully understand the crypto details and wanted to get uploading, so I ended up symmetrically encrypting the files. I may revisit this later and re-upload with a more granular key system.
Putting it all together, I assembled everything into a pipeline and ran it overnight:
# For each disk's directory: compress the image, bundle it with the ddrescue
# log and the fdisk/smart dumps, encrypt, and stream the result to Nearline.
for n in $(ls); do
    pushd $n
    lrzip -vv $n.img
    tar cv $n.img.lrz $n-log fdisk smart | pv | gpg --passphrase="$PASSPHRASE" --no-use-agent --symmetric --cipher-algo AES256 | gsutil cp - gs://$BUCKET/$n.tar.gpg
    popd
done
Waking up to ~250GB of memories neatly packed up and filed away was a lovely sight. I've been sleeping better since!
At my friend davean's suggestion, since lrzip is a less common program, I also uploaded a copy of its git tree to my Nearline bucket.
I also encrypted the files on my local HDD: while I used the out-of-box NTFS filesystem on the My Passport drive for the disk images in transit, once I had a copy of the files in Nearline, I reformatted the drive to ext4 with dm-crypt.
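For reference, that kind of setup is only a few commands (the device and mapping names here are illustrative):

# cryptsetup luksFormat /dev/sdX
# cryptsetup luksOpen /dev/sdX passport
# mkfs.ext4 /dev/mapper/passport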
Update: an important final step (thanks to khc on HN for mentioning this): it's critical to test a full restore flow of your backups before leaving them to rest. In my case, I tested downloading from Nearline, decrypting, un-lrzipping, and reading. Similarly, for my local HDD copy, I tested mounting the encrypted filesystem and reading the images.
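In rough terms, the cloud-side check just reverses the upload pipeline above (bucket, name, and passphrase variables as before):

$ gsutil cp gs://$BUCKET/$n.tar.gpg - | gpg --batch --passphrase="$PASSPHRASE" --decrypt | tar xvf -
$ lrzip -d $n.img.lrz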
Finally, a word of advice when handling disk drives (and other objects you would not like to fall): objects that are already on the floor cannot fall further. Treat any object that is elevated from the floor like it will fall. You can increase the odds of this happening massively by haphazardly arranging your backup drives on swivel chairs and assorted hardware. ;)
Tonight I noticed that in a React 0.12 codebase of mine, entities were rendering as "Â " in Mobile Safari. After a quick search I came across this StackOverflow answer which identifies the "Â " output as a UTF-8 formatted non-breaking space character being interpreted as ISO-8859-1.
To resolve this problem, putting...
<meta charset="utf-8">
...immediately after my opening <head> tag did the trick. While explicitly marking your webpages as UTF-8 encoded has been a best practice for a while now, I learned the hard way today that it's a requirement when working with React.
Interestingly, this problem was apparent in Mobile Safari on OSX but not Chrome on Linux, which meant it surfaced much later, in QA. Another good reason not to leave the choice of encoding up to the browser!
Myo is a wireless armband that uses electromyography to detect and recognize hand gestures. When paired with a computer or smartphone, gestures can be used to trigger various application-specific functions.
When their marketing video made the rounds in 2013, I remember one specific demo made my jaw drop: touch-free video control. The video shows a man watching a cooking instructional video while cutting some raw meat. Being able to pause and rewind the video simply by raising his hand was a solution to an interaction problem I've had countless times, such as listening to podcasts while doing chores, or watching videos while eating a sandwich.
I ordered a Myo back in March 2013 and deferred shipment until their consumer design was ready. It was a nice surprise to return home from holiday travels to find a Myo waiting for me. :)
Unfortunately there is no official Linux support yet (though there's a proof of concept from a hackathon). On Windows and OSX, there's a pretty elegant Lua scripting environment in the SDK which is used to write "connector" integrations. Lua scripts are selected based on the currently active app to trigger mouse/keyboard actions from gestures. This is a neat approach. It enables developers and tinkerers to do a bunch of the legwork writing and designing integrations, while wrapping the complex parts (gesture recognition / mouse control / keyboard automation) in a cross-platform manner.
I was happy to see some web browser integration already built, but upon further inspection there were a few behaviors I wanted to work differently. I was delighted to discover that I could simply open up the web browser connector and hack the high-level Lua code into the controls I wanted. I added a gesture to take control of the mouse, as well as some special cases for controlling video playback.
While the gesture recognition doesn't always work perfectly (probably a matter of training both myself and the armband better), when everything works properly, the results are pretty sublime:
I'll be posting my scripts and future tinkerings in a myo-scripts repo on GitHub.
In September, Amazon finally released an official Instant Video app for Android. Unfortunately, this app is not downloadable on tablets via their app store – only phones. Yet, Amazon Instant video has been supported for years on Amazon's own "Fire" Android-based tablets. It really bums me out that Amazon Instant Video continues to snub their competitors' products, but if you're like me and have a lot of purchases locked up in their store, there is a way:
I'd read in a couple places that if you install the APK manually, it works fine on tablets – as long as the phone version of Amazon's store app is installed as well.
I used adb (apt-get install android-tools-adb) to transfer both apps from my Nexus 4 to my Nexus 7:
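The gist of the transfer, with the phone attached first (the package name and APK path below are illustrative; pm path prints the real ones):

$ adb shell pm list packages | grep amazon
$ adb shell pm path com.amazon.venezia
$ adb pull /data/app/com.amazon.venezia-1.apk

Then, with the tablet attached, install both APKs:

$ adb install com.amazon.venezia-1.apk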
Now you should be able to launch Amazon Instant Video on your tablet, watch some Star Trek TNG, and dream of a world where tech companies don't play silly games about which kind of PADD you use.
Here's a few anecdotes from the development of Pixels and a quick explanation of how it works. I hadn't worked with some of the graphics programming patterns (coordinate systems!) for a while, so I ended up making some classic mistakes – hopefully you can avoid repeating them. :)
background
Pixels is an infinitely zoomable black-and-white comic. As you zoom in, the pixels that make up the image resolve into smaller square comic panels – dark ones for black pixels, light ones for white. Depending on which panel you are looking at, the set of pixel panels will be different. These comics can be further zoomed into, revealing new pixels, ad infinitum.
Our schedule for the project was pretty absurd: we had 3 days from the first discussions on Saturday to going live Tuesday night. I was traveling between 3 different states during those days. With this crazy schedule, there was very little margin for error or engineering setbacks. I ended up sleeping for a total of 1 hour between Monday and Tuesday, finishing the zooming code on a plane flight home to SF via Gogo Inflight Internet. My plane landed in SFO a half hour before our rough goal of launching midnight EST. While my cab was hurtling me home at 80mph, the folks at xkcd were putting the final touches on the art and server code. We cobbled it all together over IRC and went live at around 1:30am EST.
On the art side, we decided on a panel size of 600x600, with pixels only pure black or pure white. Without grays for antialiasing, the images look a little crunchy to the eye, but this makes the zoomed pixels faithfully match the original image (we also thought the crunchiness would be a nice cue that this comic was different from the usual). On the tech side, HTML5 Canvas was the obvious choice to do the drawing with decent performance (I also tested plain <img> tags and found they were significantly slower). Unfortunately, browsers were still too slow to draw all 360,000 individual pixel panels via Canvas, so we had to compromise by fading the individual pixels in only once they were zoomed to 500%-1000%.
For this project, I chose to use vanilla JavaScript with no external dependencies or libraries. In general, we strive to keep the dependency count and build process minimal for these projects, since keeping the complexity level low gives them a better chance of aging well in the archives. Browsers recent enough to support Canvas (IE9+, Firefox, Chrome, Safari) are consistent enough with the spec that I didn't have to spend too much time fixing compatibility issues.
to infinity and beyond
One of the core challenges to Pixels was representing the infinitely deep structure and your position within it. As you zoom in, the fractal pixel space is generated lazily on the fly and stored persistently, so that when you zoom back out, everything is where you first saw it.
Each panel has a 2d array mapping: a pixel stores the type of panel it expands to, and possibly a reference to a deeper 2d array of its own constituent pixels. Appropriately, this data structure is named a Turtle, a nod to the comic and "Turtles all the way down".
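As a rough sketch of that structure (the names and fields here are illustrative, not the actual source):

// Illustrative only: each Turtle wraps a 600x600 pixel grid, where each cell
// records which panel its pixel expands into and lazily holds a child Turtle
// once you've zoomed into it.
function Turtle(panelId) {
  this.id = panelId
  this.pixels = []  // pixels[y][x] = {panel: <panel id>, turtle: <child Turtle or null>}
}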
To represent your position within the comic, it would be convenient to store the zoom level as a simple scale multiplier, but you'll eventually hit the ceiling of JavaScript's Number type (as an IEEE 754 Double, that gives you log(Number.MAX_VALUE, 600) ≈ 111 zoom nestings). I wanted to avoid dealing with the intricacies of floating point precision, so I decided to represent the position in two parts:
pos: a stack of the exact panel pixels you've descended into
offset: a floating point position (x, y, and scale) relative to pos
Here's a couple examples:
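(The numbers are made up, but the shape is as described above.) At the top level, before zooming in:

pos = []
offset = {x: 0, y: 0, scale: 1}

Two panels deep, panned a bit and zoomed to 240% of the reference panel:

pos = [[305, 222], [234, 674]]
offset = {x: -0.2, y: 0.4, scale: 2.4}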
To render the comic, we locate the Turtle that corresponds to pos and draw the pixels described in its 2d array, offset spatially by the values of offset. As you scroll deeper, pos becomes a long array of the nested pixels you've entered, like [[305, 222], [234, 674], [133, 733], ...]. When you zoom far enough to view the pixels of a panel for the first time, the image data is read using a hidden Canvas element, and a new Turtle object is generated with the panel ids for each pixel.
One complexity of relying on pos for positioning is that it needs to update when you zoom into a panel, zoom out of one, or pan to a different panel. When a pixel panel is zoomed in past 100% size, its location is added to the pos stack, and offset is recalculated with the panel as the new point of reference. Some of the hairiest code in this project came down to calculating the new values of offset.
There's a small trick I used to simplify the handling of the various cases in which pos needs to be updated. If the current reference panel is detected to be offscreen or below the 100% size threshold, pos and offset are recalculated so that the point of reference is the containing panel, as if you were zoomed in really far. This then triggers the check for "zoomed in past 100% size", which causes a new reference point to be chosen using the same logic as if you'd zoomed to it.
corner cases
Working out the browser and math kinks to simply position and draw a single Turtle took way longer than expected. It took me deep into the second night of coding to finish the scaffolding to render a single panel panned and zoomed correctly. Then, I needed to tackle nesting. Because the sub-panels are pixel-sized, you can't see the individual pixel panels until you zoom above 100% scale. Since a panel at 100% scale takes up the whole viewing area, I initially thought this implied I'd only need to worry about drawing the pixels for a single panel at a time.
I then realized a problem with that thinking: if you zoom into a corner, you can go past 100% scale with up to 4 different panels onscreen:
This thought led me to make the worst design decision of the project.
TurtlesDown.prototype.render = function() {
// there is no elegance here. only sleep deprivation and regret.
Brain-drained at 3am, and armed with the knowledge that there could be up to 4 panels onscreen at any time, I began to write code from the perspective of where those 4 panels would be. I decided to let pos reference the top-left-most panel onscreen, and then draw the other 3 panels shifted over by one where appropriate.
For me, programming late at night is dangerously similar to being a rat in a maze. I can follow along the path step-by-step well enough, but can't see far enough down the line to tell whether I've made the right turn until it's too late. With proper rest and a step back to think, it's clear to see what's going on, but in the moment when things are broken it's tempting to plow through. Once I got 4 panels drawing properly, I realized the real corner case:
Consider that the top left panel has position [[0, 0], [599, 599]]. How do we determine what the other 3 panels will be? We can't just add one to the coordinates. We have to step back up to the parent panel, shift over one, and wrap around. In essence, we have to carry the addition to the parent. And, if necessary, its parent, and so on...
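Here's a sketch of that carry, stripped down to just the x coordinate (this is not the production code, which also has to handle y and all the rendering bookkeeping):

// Illustrative: step the pos stack one pixel to the right, carrying into
// parent panels whenever a coordinate wraps past the 600-pixel panel edge.
function shiftRight(pos) {
  for (var i = pos.length - 1; i >= 0; i--) {
    if (pos[i][0] < 599) {
      pos[i][0]++
      return pos
    }
    pos[i][0] = 0  // wrapped around; carry into the parent panel
  }
  return pos  // ran off the edge of the outermost panel
}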
At this point, I needed to get something working and was too far down the path to reform the positioning logic into what it should have been: a descending recursive algorithm. Walking the pos stack and doing this carry operation iteratively ballooned into 40 lines of tough-to-reason-about code. It's necessary to carry the x and y coordinates separately – this is something I forgot to account for in an early release, causing some fun flicker bugs at certain corner intersections.
I'm not proud of the render() function or how it turned out – but I'm really happy and somewhat bemused that when all is said and done, that nightmare beast seems to work properly.
"I'm not sure how this works, but the algebra is right"
Like many computer graphics coordinate systems, Canvas places (0,0) in the top-left. Early on I decided to translate this so that the origin was in the center, in order to simplify zooming from the center of the viewport (centerOffset = size / 2). A while later, I discovered that the simple ctx.translate(centerOffset, centerOffset) call I was using didn't apply to ctx.putImageData(), the main function used to draw pixel panel images. I considered two options: either I could figure out the geometry to change my zooming code to handle an origin of (0,0), or manually add centerOffset to all of my putImageData() calls and calculations. I did the latter because it was quick. That was a mistake.
What I didn't foresee was how much splattering centerOffset everywhere would increase the complexity of the equations. The complexity arises when centerOffset is multiplied by offset.scale or needs to be removed from a value for comparison. For example (from _zoom(), which ended up needing to know how to origin shift anyway!):
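To give a flavor of the kind of expression this produces (a made-up illustration of the pattern, not a line from the actual code):

// Every screen-space value has to have centerOffset stripped back out before
// the scale math can happen.
function screenToPanelX(screenX, offset, centerOffset) {
  return (screenX - centerOffset) / (600 * offset.scale) - offset.x
}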
In general, it's a good idea to do your translate operation as late in the chain as possible. Eliminating or externalizing as many operations as possible helps keep things understandable. I knew better, but I didn't see the true cost until it was too late...
Eventually, the complexity of some of my positioning code reached the point where I could no longer think about it intuitively. This led to some very frustrating middle-of-the-night flailing in which I sorta understood what I needed to express mathematically, but the resulting code wouldn't work properly. The approach that finally cracked those problems was going back to base assumptions and doing the algebra by hand. It's really hard to mess up algebra, even at 5am. However, this had the amusing consequence of me no longer grokking how some of my own code worked. I still don't understand some of the math intuitively, sadly.
TV audiences are mobile
All told, our launch went quite well. While there were some timing and performance issues we noticed the following day, it seemed to work for most people – a huge relief after the last 2 days. I spent Wednesday working through my backlog of minor fixes and improvements in preparation for the Colbert Report bump. Two tasks I had triaged out of the initial release were proper IE support and mobile navigation. For IE, due to the lack of support for cross-origin canvas image reading, we needed to do some iframe silliness to get IE to work properly. Regarding mobile, I didn't think that phone browsers would perform very well on the image scaling, so I deferred it to focus on the best experience.
We anticipated a higher proportion of the Colbert Report referrals would be using IE, so we sprinted on that, getting it working just in time before the xkcd folks needed to leave for the interview. However, rushed nerds as we were (and ones who usually do not watch much TV), we didn't consider a very important aspect:
People watching TV don't browse on computers. They use their phones.
As I watched the traffic wave arrive via realtime analytics, my heart sank. The visitors were largely mobile! A much, much larger proportion than those using IE. Our mobile experience wasn't completely broken, but if we'd been considering the mobile traffic from the start, we'd have focused on it a good deal more.
berzerking works (sometimes)
When working fast, you have to resign yourself to making mistakes. Trivial mistakes. Obvious ones. Some of those mistakes can be slogged through, while some will bring a project down in flames. There's a delicate calculus to deciding whether a design mistake is worth rewriting or hacking around. Having a hard 2-day deadline amplifies those decisions significantly.
I, like most developers I know, hate the feeling of a project running away from me: hacking blindly, not fully understanding the consequences. That's how insidious bugs get created, and how codebases end up needing to be scrapped and rewritten from scratch. I prefer to take the time to recognize the nature and patterns of my problems and realize solutions with elegance.
Yet – sometimes, for projects small enough to fit fully in the head, and rushed enough to not weather a major time setback, brute force crushing works. For fleeting art projects that are primarily to be experienced, not maintained, that is perhaps enough.
When we work on interactive comics at xkcd, we take pride in experimenting with the medium. It's a privilege to combine forces with their masterful backend engineer + Randall Munroe's witty and charming creativity. Each collab is an experiment in what kinds of new experiences we can create with the combined resources at our disposal.
One of the fun aspects of working with comics is you don't expect them to think, to react to your behavior, to explore you as you explore them. That lack of expectation allows us to create magic. Every now and then, it's good to shake things up and push the near-infinite creative possibilities we have on the modern web. The health and sanity costs on these projects are high, but for me personally, novelty is the impetus to take part in these crazy code and art dives.
Overall, my favorite part of working on these projects is the moment after Randall's art & creative comes in: when I get to experience the project for the first time. Even though I know the general mechanic of the comic, when the backend, frontend, art, and humor all click into place simultaneously, it becomes something new. That first moment of discovery is as much a joy and surprise to me as it is to you.
The aptitude why command is awesome. At a glance, it tells you the whole dependency tree that led to a package being installed, starting with the user's action:
chromakode@ardent:~$ aptitude why liboxideqtcore0
i   unity-webapps-service Depends webapp-container
i A webapp-container      Depends webbrowser-app (= 0.23+14.04.20140428-0ubuntu1)
i A webbrowser-app        Depends liboxideqt-qmlplugin (>= 1.0.0~bzr490)
i A liboxideqt-qmlplugin  Depends liboxideqtcore0 (= 1.0.4-0ubuntu0.14.04.1)
Every package manager should be able to answer that question so simply.
There's a lesson I seem to repeatedly forget, applicable to both life and design, in the nature of leaps forward. By "leap" here, I refer to an evolutionary change in the quality of an experience or approach which solves many problems at once. The big "ah-has" that transform entire problem spaces and change the ways we think about possibilities.
A leap typically takes things we knew would be good and opens them up in ways we couldn't imagine before. For example, the potential of smartphones was long known before they became widely available. Authors and researchers imagined the possibilities of portable computers and ubiquitous connectivity decades before they became everyday utilities. Sci-Fi predicted things like the internet, Google, widespread social networking, and cryptocurrency. However, what it didn't predict is what comes after: the results of the leap.
As designers and engineers, we are constantly asked to quickly sort through possibility spaces, finding the good or elegant options. When searching a possibility space, we often use heuristics, basing our perception of options on experience or existing data that models expectation. Leaps hide in the options that these models fail to capture.
Sometimes you can't measure an experience after a leap based on what has come before it.
Why did social networking take hold when it did? Why did YouTube and Netflix start to work after so many video streaming services failed? Why are peer-to-peer sharing economies like Uber or AirBnB working now, rather than 10 years ago? Many people felt certain all of these things would eventually work, but why now? It seems easier to look back and remember the reasons these things wouldn't work, rather than notice the changes which allowed them to.
Perhaps the reason leaps are so difficult to pin down is that their effectiveness comes from a lot of little changes, rather than a few big ones that humans find easier to reason about. Many small reductions in friction that touch our lives and the lives of others in little ways throughout our day.
I often make the mistake of forgetting leaps when I judge a new idea. While it is necessary to be able to rapidly weed out bad ideas when searching, I also miss good ideas because I don't see the leap. And there's the crux of the problem: many of those little reductions in friction can't be noticed until you try.
After configuring deja-dup to back up to S3, I hit a snag: the process seemed to hang during the upload phase.
To obtain more information, I found that you can enable verbose output via an environment variable (why it isn't a verbose command-line parameter is a mystery to me):
DEJA_DUP_DEBUG=1 deja-dup --backup
The first S3 upload would start and hang, eventually printing the error:
It turns out that this is a transient error for new S3 buckets while the DNS changes propagate through AWS (reference). Indeed, the full error contents of curling the bucket described a temporary redirect, which was probably not being handled properly by deja-dup/duplicity/python-boto. After waiting about an hour, the problem was resolved and my backup process went smoothly.
As a side note, after tinkering with the IAM profile a bit, this is the minimal set of permissions I could find for the duplicity account:
A hacky workaround for empty contents_pillar in salt-ssh
I've been really enjoying learning to use SaltStack to configure my servers and VMs. The relatively new salt-ssh transport is incredibly convenient for managing a small number of project cloud servers. However, there is one limitation I've discovered when handling certificates and private keys: file.managed's contents_pillar parameter outputs blank files.
It seems that pillar data is not sent to the minion environment when using salt-ssh. The contents_pillar pillar lookup then falls back to an empty default value (side note: a good example of why strict KeyErrors are helpful!). However, since the state data structure is rendered on the master server, there is a hacky workaround relying on templating directives. For example:
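Something along these lines, with a placeholder path and pillar key:

/etc/ssl/private/example.key:
  file.managed:
    - user: root
    - mode: 600
    - contents: |
        {{ pillar['example_key'] | indent(8) }}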
(It is necessary to use the indent() jinja2 filter so that the inlined contents form valid YAML.)
Hopefully a future version of salt-ssh will support contents_pillar, making this unnecessary. In the meantime, this was the least gross hack I could find.
Recently, before rebuilding samurai, I wanted to download the old drive image as a final resort in case I'd forgotten to snag any files. I was a bit disappointed that there was no way to simply download the raw image using the web interface, but there are dark arts of dd that can fill this gap. Linode provides their own instructions for the procedure, but I discovered a few tricks worth saving:
Running a separate rescue distro per Linode's suggestion and then opening it up to root ssh seemed a bit ugly to me. However, imaging the system live and in-place could lead to corrupt images if the disk is written to. Remounting the live root filesystem read-only with mount was not possible, but there is an old Linux SysRq trick you can use to remount all mounted filesystems read-only:
# echo u > /proc/sysrq-trigger
While it's still a good idea to stop any nonessential services, this allowed me to proceed with the system online and using my existing ssh keys. I also lived dangerously and kept my soon-to-be-doomed nginx process running during the imaging. >:)
Since most of my drive was empty zeroes, passing the data through gzip was a massive savings in both transfer and disk usage at the destination:
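The imaging can be driven entirely from the destination machine over ssh, along these lines (the hostname, device, and filenames here are illustrative):

$ ssh root@samurai "dd if=/dev/xvda bs=1M | gzip -c" > ./samurai.img.gz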
Un-gzipping the image file the straightforward way leads to a problem: gunzip does not appear to support sparse files. In my case, I didn't have 50GB to spare for an image holding only about 1.5GB of actual data, so I needed a way to decompress to a sparse file. There's a nice trick using cp and /proc's file descriptor mapping to pull this off:
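Concretely, that looks something like this (filenames as above):

$ gunzip -c ./samurai.img.gz | cp --sparse=always /proc/self/fd/0 ./samurai.img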
So there you have it! Gzipped online disk imaging with a sparse output file. Note that the image itself won't be bootable, as Linodes boot a kernel specified outside of their drive storage. You can read the data by mounting the image loopback via the instructions Linode provides.
I'm sure this is all common knowledge around sysadmin circles, but I wasn't able to find all of the pieces in one place during my googling. Hopefully this recipe may prove useful. :)