Keeping Documents Synced and Organized

I've spent a lot of time thinking and rethinking my current solution to document synchronization. I've included the word "organized" in the article title because when I say sync, I really mean organize. When using multiple computers I want to maintain an organized view of all my work, at any time.

Here's a bit of history on how I arrived at using Subversion to keep myself synchronized and organized.

One day, long long ago, I sat down and took a look at every important personal file on my computers. I don't consider downloaded application installers or most executable files important. I don't count music either, since my "phone" keeps it synchronized and I couldn't really care less if I lost it all. Pictures are tough for me as I a) don't take a lot of pictures and b) mostly upload them to other websites.

I determined that I have an overwhelming amount of school-related files, a ton of miscellaneous programming files, and a whole lot of other junk. So I created three folders: one for programming, one for school work, and one for files. (I do realize that 'files' implies school work and programs, but whatever.) Within the nexus that is the files folder I have sub-folders for financial documents, backups, binaries, professional documentation, etc. Then I took some time and evaluated popular options for keeping those three folders organized:

Dropbox: (http://www.getdropbox.com/)


Dropbox has a ton of great reviews, from both my friends and the internet peoples. They provide you with a user account with a limited amount of storage space (up to 2GB as of now). Dropbox installs as an agent on every computer you'd like synchronized, whether it's running OS X, Linux, or Windows. You select a folder or folders to act as your "dropbox". Every time you update a synchronized file on your computer, the agent uploads the change to the Dropbox server. They perform some bit of magic which aids in backup and storage, then push the update out to all other listening clients. If you need to sync more than 2GB you have the option of upgrading to a premium account. (http://www.getdropbox.com/terms#pricingterms)

Dropbox gets a little tricky when considering the types of access they provide. You have three types of "boxes": public (all Dropbox users), shared (you and your friends), and personal (you and yourself). This irks me a bit, since I'm only concerned with personal files. Note: Dropbox has a strict intellectual property stance (which is awesome), so syncing music is not a good idea.

I didn't choose Dropbox for two main reasons: first, it's centralized on someone else's servers; second, they limit free accounts to 2GB. My files are personal and I'd rather manage the service syncing them myself. I also have a bit more than 2GB worth of files, not too much more but enough to need premium access.

Take a look at (https://www.getdropbox.com/features) for a comprehensive list of features.

SugarSync: (https://www.sugarsync.com/)




SugarSync provides a similar service to Dropbox. Unfortunately they provide fewer features in their free option. You get 2GB (the same as Dropbox) but you're limited to only two computers. They put a ton of advertising effort into their mobile sync ability. Unfortunately for them, Dropbox also has an iPhone/iPod Touch app. Fortunately for them, they support a bit more than just Apple mobile devices.

I couldn't find anything about file restoration for SugarSync, only backup features. Dropbox does a good job here by including an "undo" feature. While this may dip into your 2GB worth of storage, it's invaluable when all you're syncing is documents.

DIY


I wasn't very happy with either option, and after reading their privacy policies I decided I'd find some software to manage my own synchronization. The only apparent downsides would be a limited set of features and no awesome amount of server RAID with someone to blame if my files went missing. The upside is that I do own a server with RAID, and I can blame them if any data goes missing. One cool tool I found was Unison. Unison is simple, open source and free. The sad part is that it only synchronizes two replicas (computers or collections) at a time, which made it feel more like a backup utility than the multi-computer synchronization I was after.

Eventually I decided to go with Subversion (SVN). I've used SVN extensively throughout school for group projects and for grabbing nightly builds of some software I test. Thanks to Tortoise SVN I can easily manage my updates and downloads. I'd say the decision to organize and synchronize my files using SVN is based largely on the features Tortoise SVN provides.

Tortoise SVN: (http://tortoisesvn.tigris.org/)


I'm not going into much detail about Subversion as my fingers are getting tired. It's a simple way of synchronizing files using a centralized repository. Tortoise is a nice Windows GUI for Subversion (which is primarily used via the command line). Using SVN allows me to host the centralized repository myself, so at no time do my personal files leave my hands. I choose the type of encryption and I'm not constrained by an artificial storage limit.
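For anyone who prefers the command line to Tortoise, here's a minimal sketch of the basic cycle (checkout once, then update and commit). The repository URL and folder name are made up for illustration, and the helper only builds the `svn` command unless you turn off `dry_run`, so treat it as a sketch rather than my exact setup.

```python
import subprocess

def svn(*args, cwd=None, dry_run=True):
    """Build an svn command line; run it only when dry_run is False."""
    cmd = ["svn", *args]
    if not dry_run:
        subprocess.run(cmd, cwd=cwd, check=True)
    return cmd

# Hypothetical repository URL and local folder, for illustration only.
REPO_URL = "https://svn.example.com/repos/documents"
WORKING_COPY = "documents"

# First time on a new computer: create a local working copy.
checkout = svn("checkout", REPO_URL, WORKING_COPY)
# Before starting work: pull in changes committed from other computers.
update = svn("update", cwd=WORKING_COPY)
# When finished: send local changes to the repository.
commit = svn("commit", "-m", "Sync school notes", cwd=WORKING_COPY)

for cmd in (checkout, update, commit):
    print(" ".join(cmd))
```

Passing `dry_run=False` would actually run each command, which assumes the `svn` client is installed and the repository exists.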

Another feature I enjoy about SVN is the ability to transfer files via SSH. That means I don't have to open any more holes in my repository's firewall or worry about a complex system of authentication. AND: if I'd like a more complex system of authentication, I can have one with SSH keys. Tortoise SVN will load PuTTY configuration files, so storing my SSH configuration and keys is transparent (as I already have them configured in PuTTY).
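Tunnelling over SSH mostly comes down to using an `svn+ssh://` repository URL. Here's a small sketch of how such a URL is put together. The host, user and repository path are hypothetical, and it assumes the path after the host is the repository's location on the server's file system.

```python
def svn_ssh_url(host, repo_path, user=None):
    """Return an svn+ssh:// URL for a repository path on a remote host."""
    login = f"{user}@{host}" if user else host
    # Normalize so there is exactly one slash between host and path.
    return f"svn+ssh://{login}/{repo_path.lstrip('/')}"

# Hypothetical server, account and repository location.
url = svn_ssh_url("example.com", "/home/me/repos/documents", user="me")
print(url)
```

The resulting URL is what you'd hand to a checkout, whether from the command line or from Tortoise's checkout dialog.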

Here's a bit of imagery to sweeten the post (Fig A.):

Tortoise SVN commit dialog (Fig A.)

Uploading (called committing) has different levels of granularity. By this I mean I can choose which parts of my folder structure to commit. Perhaps I am working on a homework assignment. I want to commit it so I can access it from a few other computers, but I've also worked on other assignments I'm not ready to commit. With SVN I can choose to commit only that current assignment.

This is helpful because I can choose when I'm ready to be organized. I can sync those files I feel are properly sorted. This way, no needless binary or debug files get shipped off.
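On the command line, the same selective commit is just a matter of naming the paths you want. The sketch below builds that command, plus an optional `svn:ignore` property for keeping build junk out entirely. The folder names and ignore patterns are hypothetical, and the ignore step is an extra I'm suggesting, not something Tortoise requires.

```python
def commit_command(paths, message):
    """Build an `svn commit` command limited to the given files or folders."""
    return ["svn", "commit", "-m", message, *paths]

def ignore_command(folder, patterns):
    """Build an `svn propset svn:ignore` command; patterns are newline-separated."""
    return ["svn", "propset", "svn:ignore", "\n".join(patterns), folder]

# Hypothetical paths and patterns, for illustration only.
only_this_assignment = commit_command(["school/cs101/homework3"], "Finish homework 3")
skip_build_junk = ignore_command("programming/project", ["*.o", "*.obj", "Debug"])

print(only_this_assignment)
print(skip_build_junk)
```

Anything not listed in the commit stays local until I'm ready for it.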

Now there are some gotchas (or so it appears). Let's say I work on the same file from two different computers, or drop in a file with the same name at the same place from two computers. The first commit goes through, but SVN rejects the second as out of date, and updating that working copy flags a conflict. Conflicts require manual review. Thankfully Tortoise SVN provides a nice diffing window to easily review and choose which changes should be uploaded. When finished with the review, simply right click and choose 'Conflict Resolved'.
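Outside of Tortoise, conflicted files show up in `svn status` with a `C` in the first column. Here's a rough sketch that picks them out of that output and builds the matching `svn resolve` command. It assumes the status layout of recent Subversion clients (path starting at the ninth character), and the sample output and file names are invented.

```python
def conflicted_paths(status_output):
    """Return paths whose first status column is 'C' (text conflict)."""
    paths = []
    for line in status_output.splitlines():
        if line.startswith("C"):
            # Assumption: eight status columns, then the path.
            paths.append(line[8:].strip())
    return paths

def resolve_command(path):
    """Build the command that marks a reviewed file as resolved, keeping the working copy version."""
    return ["svn", "resolve", "--accept", "working", path]

# Invented sample of `svn status` output after an update.
pad = " " * 7
sample = "\n".join([
    "M" + pad + "school/essay.txt",
    "C" + pad + "school/homework3.txt",
    "?" + pad + "school/homework3.txt.mine",
])

conflicts = conflicted_paths(sample)
commands = [resolve_command(p) for p in conflicts]
print(conflicts)
print(commands)
```

The resolve step should only come after you've actually reviewed and merged the file, same as clicking through the diff window in Tortoise.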

Tortoise's Addition to Right Click (Fig B.)

Figure B shows the changes Tortoise makes to the right-click context menu. If a folder is not yet SVN enabled, another option called Checkout will appear. A checkout creates a local working copy of a Subversion repository. In the case of Figure B, I am right clicking on a folder that is already checked out, so Tortoise offers to either Update from the repository or Commit (upload) to the repository.

The only issue I've encountered using SVN is keeping some encrypted files synchronized. The problem is that every time I modify and re-encrypt the files, Tortoise commits the entire file set again. (It sees each encrypted file as a binary file whose entire contents change every time it's modified.) This is not a huge problem as I have plenty of storage space and the files are not very big. :)
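To see why a tiny edit turns into a full re-upload, here's a toy demonstration. It uses a deliberately simple, insecure stand-in cipher (a SHA-256 keystream with a fresh random nonce), not whatever tool you actually encrypt with, and it assumes your tool also uses a fresh random salt or IV on each run, as most do. The passphrase and file contents are made up.

```python
import hashlib
import os

def toy_encrypt(plaintext, key):
    """Toy stream cipher (NOT secure): XOR with a SHA-256 keystream seeded by a fresh random nonce."""
    nonce = os.urandom(16)
    stream = bytearray()
    counter = 0
    while len(stream) < len(plaintext):
        stream += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    body = bytes(p ^ s for p, s in zip(plaintext, stream))
    return nonce + body

def changed_fraction(a, b):
    """Fraction of byte positions that differ, measured against the longer input."""
    longest = max(len(a), len(b))
    same = sum(x == y for x, y in zip(a, b))
    return 1 - same / longest

key = b"hypothetical passphrase"
old_plain = b"line of notes\n" * 300
new_plain = old_plain + b"one more line\n"  # a one-line edit

plain_change = changed_fraction(old_plain, new_plain)
cipher_change = changed_fraction(toy_encrypt(old_plain, key), toy_encrypt(new_plain, key))

print(f"plaintext bytes changed:  {plain_change:.1%}")
print(f"ciphertext bytes changed: {cipher_change:.1%}")
```

The plaintext barely changes, while nearly every byte of the encrypted version does, so there's no small difference left for SVN to send.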