Building a blog atop a version control platform
Most blogging platforms are database-backed: they store all your posts in a relational database, and query that database whenever they need to present a post on a web page. You create posts by making INSERT commands, and edit with UPDATEs.
This model has some great aspects: it’s pretty fast (though not the fastest), it allows many users to run blogs on the one server (by adding a foreign key named blog_id to the posts table, for example), and it’s easy to query and search, just by using SQL (the web app you’re using to run your blog generally takes care of that for you). You can blog from anywhere, as long as you have a web browser.
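To make that model concrete, here is a minimal sketch using Python's built-in sqlite3 module and an in-memory database. The table layout, the column names other than blog_id, and the helper function names are all assumptions for illustration; no particular blogging platform is being described.

```python
import sqlite3


def open_blog_db():
    """Create an in-memory database with a hypothetical posts table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE posts (
            id      INTEGER PRIMARY KEY,
            blog_id INTEGER NOT NULL,  -- which blog on this server owns the post
            title   TEXT NOT NULL,
            body    TEXT NOT NULL
        )
        """
    )
    return conn


def create_post(conn, blog_id, title, body):
    """Creating a post is an INSERT."""
    cur = conn.execute(
        "INSERT INTO posts (blog_id, title, body) VALUES (?, ?, ?)",
        (blog_id, title, body),
    )
    return cur.lastrowid


def edit_post(conn, post_id, new_body):
    """Editing a post is an UPDATE; the old body is overwritten."""
    conn.execute("UPDATE posts SET body = ? WHERE id = ?", (new_body, post_id))


def posts_for_blog(conn, blog_id):
    """Presenting one blog is a SELECT filtered by blog_id."""
    rows = conn.execute(
        "SELECT title, body FROM posts WHERE blog_id = ? ORDER BY id", (blog_id,)
    )
    return rows.fetchall()
```

Note that edit_post leaves no trace of the earlier text, which is exactly the change-tracking weakness that comes up next.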
But there are some weaknesses to this setup as well. For example, it’s hard to track changes (resulting in paragraphs at the end of posts starting with UPDATE: or EDIT:). And unless you work directly on the server (which is risky), you’ll have to use that web app to write your posts. Not a huge burden, unless you happen to be on dialup, or spotty 3G. It’s also hard to back up.
So how do we get around these problems, without getting rid of too many of the advantages that a web app-based blog gives us?
Of course, I wouldn’t be writing this if I didn’t think I had an answer.
A Version Control System (VCS) is a tool often used by programmers to keep track of the changes they make to their code. If I introduce a bug to my code (it happens), then I can use my VCS to revert to the version of the code that didn’t contain the bug, and give it another shot. Or if somebody else introduces a bug, I can use the VCS we both use to find out who they are, and … umm… provide them with constructive feedback.
Although VCSs are used mainly for software development, that’s not their only use.
If we make a blog that uses a VCS for storage instead of a relational DB, and treat each post as a document, we immediately get some awesome benefits:
- You can track your edits. Writing an UPDATE: paragraph is no longer the only way to indicate that you’ve made a change since your original post. You might still choose to do it if your readers are in the habit of reading your blog that way, but it’s not the only way. For example, you could point to a list of edits on a site like GitHub.
- You can serve up posts blazingly fast. Because most VCSs track files living in folders on your hard drive, the most convenient way to use a VCS to track your posts is to store your posts as files. It just so happens that the fastest way to serve information over the internet is from files stored in exactly that form. And what that means for you is that you can use a less powerful, less expensive server to host your blog.
- Uploading images and videos is taken care of automatically. When you save your files to a VCS (a step known in the software business as “committing your changes”), everything you’ve saved in the VCS gets uploaded automatically. All you have to do is save it into your version-controlled folder, then link to it from your post. Simple.
- You don’t need specialised publishing software. If you’re hardcore, you can create your posts with just a text editor and a VCS tool. If you’re less hardcore, then I’m planning to make the software that will do this for you. It’s a “someday” goal at the moment though, so no promises.
Now, this hypothetical software isn’t going to work miracles. There are some things you might miss. For example, the “post from anywhere” concept becomes a bit more work for your friendly software developer, so that might not be there from the word “go”. I’m envisioning a desktop app as the first editor for this guy, with your VCS of choice running in the background. I really haven’t thought yet about the whole “theme” thing either. It’s a work in progress, people.
*lightbulb*
Although I guess you could do something like a build process, the step in programming where code is compiled and tested, ready for deployment. I guess that’s the step where you apply your layouts, assemble your RSS document, and whatever else you need to do to make your blog functional and amazing.
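For the curious, such a build step might look something like the sketch below. It is only a sketch under stated assumptions: posts are plain .txt files whose first line is the title, the layout is a single hard-coded template, and the function names, the feed title and the base URL are all invented for illustration. It also skips HTML escaping and post dates, which a real build would need.

```python
from pathlib import Path
from string import Template
import xml.etree.ElementTree as ET

# Hypothetical layout; a real theme would be far richer.
LAYOUT = Template(
    "<html><head><title>$title</title></head>"
    "<body><h1>$title</h1>$body</body></html>"
)


def load_posts(posts_dir):
    """Read each .txt file: first line is the title, the rest is the body."""
    posts = []
    for path in sorted(Path(posts_dir).glob("*.txt")):
        title, _, body = path.read_text(encoding="utf-8").partition("\n")
        posts.append({"slug": path.stem, "title": title.strip(), "body": body.strip()})
    return posts


def build_site(posts_dir, out_dir, base_url):
    """Apply the layout to every post and assemble a bare-bones RSS 2.0 feed."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    posts = load_posts(posts_dir)

    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = "My blog"
    ET.SubElement(channel, "link").text = base_url
    ET.SubElement(channel, "description").text = "Posts built from files"

    for post in posts:
        # Step 1: apply the layout and write the finished page.
        html = LAYOUT.substitute(title=post["title"], body=f"<p>{post['body']}</p>")
        (out / f"{post['slug']}.html").write_text(html, encoding="utf-8")
        # Step 2: add the post to the feed.
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = post["title"]
        ET.SubElement(item, "link").text = f"{base_url}/{post['slug']}.html"

    ET.ElementTree(rss).write(str(out / "feed.xml"), encoding="utf-8", xml_declaration=True)
    return [post["slug"] for post in posts]
```

The output folder is nothing but static files, so it could be served as-is, and the posts folder is the only thing that needs to live in the VCS.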
Mobile accessibility might also be a bit tricky – as far as I know, nobody’s found a good enough reason to write a mobile VCS client yet (please let me know if I’m wrong!), and that’s kind of a necessary component. Again, there’d be ways of getting around that limitation, but they’re not simple, and they’re not necessarily easy.
Anyway, that’s the idea. Feel free to take it and run with it – or point out who has done it already.
Art, sharing and copyright
There are some things that Western society usually encourages. Not that other societies don’t, but the West is the one I know. Art is held up as a sign of civilised life. Sharing is considered one of the most basic social skills. The rule of law is something we have set up enormous governing entities to try to preserve. Yet we are at a point in our development as a culture where a sharp conflict exists between these three principles: create art and cultural works, share what you have and obey the law.
The conflict arises out of a set of laws that were intended to give artists protection against other people passing off the artist’s work as their own. I am referring, of course, to copyright law. Copyright law allows the creator of a work to control who copies it, and under what conditions. The law says that we are forbidden from copying a work unless the creator gives us permission.
This rule gave artists an easy way to make money: I, the artist, make many copies of my work, and sell them to you. That works as long as I have control over the creation and distribution of copies. Which, in turn, works as long as copies are hard to make.
The problem that has arisen lately is that many works of art and cultural works (e.g. music, movies and TV shows) have been getting easier and easier to copy, now that we use computers so much. Copyright law has become unenforceable, because copying is so deeply embedded in the way we get data from one place to another. Copying had to become cheap, because we rely on it so much that we had to make it cheap. And now that it’s cheap, anybody can do it.
That means that the creator of any work that can be converted to digital form is no longer in control of its distribution. Well, legally speaking they are, but in practical terms that control is so easily broken that it’s easier to say it doesn’t exist.
Not only that, but as the number of copies of a work explodes, the value of each copy collapses. When record labels only distributed copies of songs on tapes, making illegal copies was a lot harder. The only people who did a lot of it had special equipment that cost a lot of money. Those people generally considered that cost an investment; they were selling their illegal copies for a profit. We bought them because we couldn’t afford to make them ourselves. Now that we can, the “bootleg” recording trade is dead and gone. We swap songs with each other for free, and because everyone can do it, there’s no profit to be made from illegal copies.
So is there any profit to be made from authentic copies?
As with so many seemingly straightforward questions, the answer is "well, it depends." If authentic copies of your work are easily distinguished from unauthorised copies, then yes. You still have something scarce enough that you might be able to make money by selling copies. Paintings, sculptures or films on celluloid might qualify here. But if your work is easily copied with little to no loss (say, a movie on DVD), then you’re going to have to find a different way to make money from it. Public performances (like at the cinema) and merchandise are popular options, and convenience is a good one too (think of the iTunes store). But as long as you’re selling something that can be reduced to bits, you’d better be selling something other than just the bits themselves.
Some copyright owners choose to use the legal system as a way to make money from their works. Mostly these copyright owners are not the creators of works, but instead have made an agreement with the creator that allows them to make copies of the work, sell them, and sue others who do the same without permission. The most common practice is to use a court order to get the contact details of someone who you suspect has been making illegal copies of your work, then get in contact with that person and threaten to sue them, unless they pay you an amount of money that is just a bit less than the cost of going to court. Because going to court would cost you more money, and there’s a chance you might lose.
What this means is that you’re using the court to get money out of someone, and not to start a court case. And the legality of such a move is … blurry. (I am not a lawyer. This is not legal advice.) It’s pretty hard to prove that someone is threatening to litigate without intending to go to court. But the strategy has been compared to extortion; it certainly has a “pay up or else” ring to it.
Anyway, the practical upshot of everything I’ve been saying is that we have some strong community values that are in conflict, and the balance between them won’t be restored easily. In the meantime, if you are creating works that can be made digital (and the number of things that can’t is going down steadily), you might have to be a bit more clever in order to make money. Not because you don’t have the right to make money from your work. You do. It’s just not going to be as easy as it was. You’ll need a model that’s more than just “copies for cash.” Because that’s going away. Piracy, sadly, is not.
A word on website hosting
I’ve had a few people and groups ask me about setting up their own website. After having this conversation a few times, I thought perhaps I should distill the common elements here. So here goes.
When choosing website hosting, most of us have 3 basic options:
A static website (i.e. no content management tools) is a simple way to host unchanging information. This solution costs the least money. The trade-off is that the content takes some effort and savvy (e.g. knowledge of HTML, CSS and FTP) to set up and to change. Problems such as “dead links” (links that go nowhere) can be a real pain point for readers and for the site maintainer, especially if the structure of the site changes during its life. You won’t need to worry about the inner workings of this website, because there really won’t be any.
A more complex and powerful option is a hosted content management system (CMS), e.g. Tumblr, Blogger or WordPress. The most important difference from the first option is that this solution allows you to change things easily. These solutions usually have a “freemium” model, in which the basic offering has no cost, but extra options (such as using your own template or your own domain name) attract a modest fee. Such solutions are aimed to meet the needs of those whose website is a tool to raise awareness of their offering, rather than being the offering itself. Hosted systems usually include basic support to help you along the way, meaning that while you need a little bit of tech savvy, you don’t need to be a guru to manage it. The basics are easily learnt by anyone who can use the web, and the inner workings are taken care of by the people you’re paying.
A self-hosted system (e.g. a virtual private server (VPS) or dedicated server) is best for those with complex needs, or those who want to host a web app that they have developed themselves. This solution will give you all the complexity and all the power. But be aware that when things break, there’s nobody to help you. These solutions also have a higher financial cost than the other solutions mentioned here, as well as needing more time to monitor, maintain and support. If you’re going with this, you definitely want to be a geek, or at the very least have a geek on hand who owes you several large favours.
I hope that this is useful to you. If questions arise, please comment.
URL Shortening Sucks When Bandwidth Is Scarce
There’s plenty of people on various blogs telling us that URL shorteners like bit.ly or tinyurl break the web, because they create an arbitrary dependency on themselves. For example, if you hit a URL on the bit.ly domain and bit.ly is down, you’re out of luck. But not only do URL shorteners break the web when they go down, they make it slower even when they are working as designed.
Here’s why: when you click on a link from bit.ly (or ow.ly, or ur1.ca, or arseh.at) your browser has to go and make a request to the server that runs the URL shortener before it can do anything else – before it can even think about loading what you actually want to see. So every short link you click is slower than the unshortened link would be.
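The extra round trip can be modelled without touching the network. The sketch below is a toy: the redirect table stands in for a shortener's database, and the short.example domain, the URLs and the function name are all made up. It only counts requests; in practice each extra hop to a new host may also cost a DNS lookup and a fresh connection.

```python
# Hypothetical redirect table standing in for a shortener's database.
REDIRECTS = {
    "http://short.example/abc": "http://example.com/a-long-and-descriptive-url",
}


def requests_needed(url, redirects, max_hops=10):
    """Count the HTTP requests a browser makes before it reaches real content."""
    count = 1  # the request for the URL that was clicked
    while url in redirects:
        if count > max_hops:
            raise RuntimeError("too many redirects")
        url = redirects[url]  # the shortener answers with a redirect...
        count += 1            # ...so the browser has to make another request
    return count, url
```

With this table, the short link needs two requests to reach the page, while the long link needs one.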
And that’s just for starters. URL shorteners also mess up most of the fancy things that browsers – especially mobile browsers – do to make pages load faster. I’m talking about techniques like DNS prefetching and HTML prefetching and… well, mostly different kinds of prefetching. But they rely on knowing which URL you’re likely to look at next, and if someone has put a mask over the URL you’re heading to (say, by using a URL shortening service), those optimisations can’t do you any good.
“Sure, that’s fine,” you might well be thinking, “but why mention it now?” Well, sir or madam or neuter, I’m glad you asked. The reason I’ve embarked on this particular rant at this particular point in time is that mobile web usage is only going up. That goes double if you include the use of apps that link, like Twitter, Facebook, Google+ and all the rest. The mobile web is becoming so prevalent that it’s almost difficult to get a new mobile phone that doesn’t have a built-in browser. So ubiquitous, in fact, that I am starting to use it rather a lot. And, as a consequence, get mightily annoyed at how bloody slow it is.
We don’t need more things that slow the web down. Please keep those links long. Because let’s face it: it’s not like we’re doing anyone but Twitter any favours.
Where baby HTTP 404s come from
DISCLAIMER: I maintain the code that I refer to in this post, but I didn’t write the code.
I’ve been hunting HTTP 404 errors (that’s the “File Not Found” variety), and today I came across a bit of a puzzler. The non-existent image files were being referred to by a stylesheet as part of a CSS background: instruction, but the rule that contained the instruction was never invoked. That is, the rule was for a class name that was never used.
How the hell was I getting these image requests if the code that started the requests was never being used?
Then I noticed that the errors were coming from just a few different user agents. And they all had something about them…
- LG-GW305/V100 Obigo/WAP2.0 Profile/MIDP-2.1 Configuration/CLDC-1.1 UNTRUSTED/1.0 lg-gw305
- Nokia2730c-1/2.0 (10.47) Profile/MIDP-2.1 Configuration/CLDC-1.1 nokia2730c-1/UC Browser7.7.1.88/70/352
- Nokia5130c-2/2.0 (07.95) Profile/MIDP-2.1 Configuration/CLDC-1.1 nokia5130c-2/UC Browser7.6.1.82/70/352 UNTRUSTED/1.0
- Nokia5330-1d/5.0 (06.85) Profile/MIDP-2.1 Configuration/CLDC-1.1 nokia5330-1d
- Nokia5530/UC Browser7.6.1.82/50/352
- Nokia5800 XpressMusic/UC Browser7.7.1.88/50/352
- NokiaC3-00/5.0 (04.45) Profile/MIDP-2.1 Configuration/CLDC-1.1 nokiac3-00/UC Browser7.7.1.88/69/352 UNTRUSTED/1.0
- NokiaC3-00/5.0 (04.45) Profile/MIDP-2.1 Configuration/CLDC-1.1 Opera/9.60 (J2ME/MIDP;Opera Mini/4.2.13337Mod.by.CHIZZY/503; U; en)Presto/2.2.0 UNTRUSTED/1.0
- NokiaC3-00/5.0 (04.60) Profile/MIDP-2.1 Configuration/CLDC-1.1 nokiac3-00 UNTRUSTED/1.0
- NokiaC3-00/5.0 (07.20) Profile/MIDP-2.1 Configuration/CLDC-1.1 nokiac3-00 UNTRUSTED/1.0
See it? That’s right, they all come from mobile browsers. This puzzled me a bit, until I realised that it might be something you’d do to speed up the user experience if you knew they’d be on a slow connection (and you didn’t care if they downloaded heaps of data they didn’t need).
So here’s the piece of knowledge I want to add to the internet (he said, arrogantly assuming it wasn’t there already): Mobile browsers prefetch images from stylesheets as an optimisation in the face of low speed connections.
You’re welcome.
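One way to spot a pattern like this is to tally 404 responses by user agent. The sketch below assumes the server writes the common "combined" access log format, where each line ends with the request, status, size, referer and user agent; the function name is invented for illustration, and other log formats would need a different pattern.

```python
import re
from collections import Counter

# Matches the tail of a combined-format access log line:
# "REQUEST" STATUS SIZE "REFERER" "USER-AGENT"
LINE_RE = re.compile(
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def count_404s_by_agent(lines):
    """Tally 404 responses per user agent string."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line.strip())
        if match and match.group("status") == "404":
            counts[match.group("agent")] += 1
    return counts
```

Passing an open log file to count_404s_by_agent and calling most_common(10) on the result lists the worst offenders first.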
Passion vs detachment
I’m confused about something, namely: how you can be both passionate and detached.
Example: I was working on a problem the other day, something about grade point averages for a graduate employment campaign. This problem and I have a history. I was determined to solve it, if not once and for all, then as close as practical. Then I ran into a roadblock – something or other to do with writing results back to a database. The details aren’t important. I asked my 1up for help, and he obliged by telling me that I should change my approach and avoid the roadblock completely.
I hated that idea. Admittedly I was already a bit cranky because I was hungry (cf. my blood sugar issues), but I was also really attached to my way of solving the problem. The idea that we should take a few extra steps for caution’s sake (like not writing to the DB before we check the results) was repugnant. Surely we should just get it done, right?
The problem here is that I was attached to my solution, not to solving the problem. So how do we avoid that? How can you be both passionate about solving a problem, and detached from your solution?
Protip: Keyboard-activated, cross-browser bookmarks
I’m a web developer. That means that I use at least 3 different browsers every day. And maintaining a set of useful bookmarks across all those browsers is a pain in the arse. Also, I tend to prefer typing over mousing. It’s generally quicker. Wouldn’t it be nice to have a set of bookmarks that I can use in any browser, and access from the keyboard?
Here’s what I did to get this working. I installed Texter (made by Adam Pash of Lifehacker). Texter is a little background app that does text substitution. You type foo and hit Tab, and it gets replaced with bar. Or whatever you like.
After you’ve installed Texter, you set up your bookmarks. You make a hotstring (Texter’s word for a thing to replace) for each of the URLs you want to bookmark. You might, for example, have one for your production environment, one for dev, and one for staging. And while you’re at it, one for Gmail, one for your blog and one for Twitter. Then you just type in your shortcut and hit Tab, and the URL is there.
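The idea behind a hotstring can be modelled in a few lines. This is not Texter's code or its configuration format, just a toy model of "type a shortcut, hit Tab, get the URL"; the shortcuts, the example.com URLs and the function name are placeholders.

```python
# Hypothetical hotstrings; the shortcuts and URLs are placeholders.
HOTSTRINGS = {
    "prod": "https://www.example.com/",
    "dev": "https://dev.example.com/",
    "stage": "https://staging.example.com/",
}


def expand_on_tab(buffer, hotstrings):
    """Replace the last word of the typed text if it is a known hotstring."""
    head, sep, last = buffer.rpartition(" ")
    return head + sep + hotstrings.get(last, last)
```

Because the substitution happens in the text you type rather than inside any one browser, the same shortcuts work in every browser's address bar.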
Nice work. Well done. I’m proud of you.
There’s a lot more that you can do with Texter. This is just one example. Explore and have fun.
Libraries are gonna have to change
Libraries have traditionally been repositories of knowledge – a place that you go to if you want to do some research, find something interesting to read, or just borrow a trashy novel to stave off boredom. I’m thinking of your local lending library here – the kind that lets you borrow a book, read it, and bring it back in a few weeks. They’re often funded by local government (at least where I live), and hence are not a profit seeking enterprise, and they usually provide services that are of huge benefit to their local community. I think libraries are fantastic.
I also think that their traditional lending-based service model is doomed.
My apologies. That was overly dramatic. It’s probably more appropriate to say that the lending model currently used by libraries is going to take a back seat to some of the less matter-bound services that libraries offer. More about that in a moment. First, let me explain why lending will decline.
The lending model is fundamentally tied to the ideas of scarcity and the high marginal cost of production – that is, it costs a lot to make another of something. This idea is certainly true of books; dead tree books cost a packet, particularly reference books. So if you want access to a lot of them, you’d better be prepared to pay – or use a lending library. But as more and more content – and more and more books – become accessible by digital means, that idea goes out the window. The cost of making a digital copy of a work that exists in digital form (like, say, an ebook) is practically zero. That means that it actually costs more to run a digital lending system, where you have to keep track of the copies you’ve made, and ensure that they are deleted when they should be, than it does to just give people digital copies of the works they want access to.
Let me say that again: it costs more to lend digital works than it does to give them away. That turns the traditional lending library model on its head, because the fundamental assumption that lending libraries operate on is that the world works the other way around. Which it has for centuries, and will continue to in some cases. The basic distinction is that it’s true for atoms, but not for bits.
So if demand for libraries’ primary purpose – lending – is going to decrease in the near future, what will they offer to the community when that happens? Well, I’m glad you asked. It just so happens that librarians (and the other people you find hanging around in libraries) know a whole lot more than simply where the books are. They also tend to have a lot of skills that the rest of us just need every so often. Skills like research, like referencing, like actually finding facts, figures, quotations and evidence that can’t necessarily be indexed by Google (yet). They have a really big role in encouraging literacy and education, and in local history research and archiving. Think local papers, minutes of local council meetings and putting names to faces in old photos.
As far as content goes, I believe that libraries will still have something to offer, even if they don’t lend so much. One of the biggest issues with the free reproduction of works that is made possible by digital tools comes in the form of copyright. While copyright protection organisations like AFACT, BREIN, the MPAA and the RIAA are often portrayed as villains (and sometimes rightly so) by publications like TorrentFreak, they do seem to have the law on their side at least some of the time. The law may be outdated and unfair, but crossing it can still get you into serious trouble, and serious debt if you’re singled out for prosecution. I think libraries may be able to offer something in that kind of world. They can offer certainty of provenance – a way to be sure that you’re respecting the rights of those who made what you’re enjoying. I envision repositories of links to works that are freely distributable, either as public domain, or licensed under Creative Commons or open source licences, or some similar arrangement, which we can use, enjoy, and be sure that we’re allowed to do so. Difficult, perhaps. Certainly impossible to do in a way that is exhaustive, and keeps track of every free work out there. But something is better than nothing, and I can’t think of anyone better than a librarian to help me find that kind of thing.
There is another characteristic of libraries that makes them valuable, quite apart from the services provided by the staff. I’m talking about the space that libraries provide, purely by virtue of being in a building, for people to work on stuff while they are around other people. Coworking is a pattern that is taking off in the small business/entrepreneurial area. The idea is that a bunch of people share a work space, each working on their own project or idea, and they get the benefits of working with others: some social interaction, people to bounce ideas off and discuss problems with, as well as answers to the “how do I get my damn computer to work?” questions. I see that as a really easy transition for libraries to make. It was once the domain of universities, and to some extent it still is, as long as you have a degree in something or other. Coworking is making headway on this idea in the professional world. Libraries have the opportunity to open up that kind of experience to a much broader group of people, many of whom could benefit from it, such as amateur researchers and secondary and tertiary students.
So while I think lending as a service is set for a decline, I think there’s still a lot of value to be found in libraries, particularly in the expertise of those who inhabit them, and the atmosphere they get when people seeking knowledge find themselves in the same place. I look forward to seeing what really happens.
The Offlineable Personal Wiki
Some of us are mobile, and have smartphones, and want to take notes of things. Some of us use Evernote, but are a little bit dissatisfied because it’s a bit on the slow side, and a little nervous about giving EvernoteCorp all our data. Some of us like the idea of a wiki, but want to be able to use it on our smartphones when we’re out of range.
Enter the Offlineable Personal Wiki Which Doesn’t Have A Cool Name Yet (OPWWDHACNY). Markdown editing, text file storage, easy mobile app for editing and searching, uses Dropbox or something similar to sync with a webserver for access to other devices. Dropbox allows syncing of any filetype including photos, audio recordings etc, so there are no technical text-only limitations.
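Plain text files make the searching part nearly free. The sketch below assumes notes are stored as .md files somewhere under one folder (for example a synced Dropbox folder); the function name and that layout are assumptions for illustration, not part of any existing app.

```python
from pathlib import Path


def search_notes(notes_dir, query):
    """Return (filename, line number, line) for every case-insensitive match."""
    needle = query.lower()
    hits = []
    for path in sorted(Path(notes_dir).rglob("*.md")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if needle in line.lower():
                hits.append((path.name, number, line.strip()))
    return hits
```

Since this only reads local files, it works the same whether or not the device is in range; syncing is a separate concern.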
“What about editing collisions,” I hear you cry. Well, that’s the limitation. This is a single-user affair, folks, so editing collisions aren’t a problem.
That’s what some of us want, and I suspect there’s already a few candidates on the way out there. What’s missing is the Offlineable bit.
Go forth. Develop. Profit from my brainwave.
EDIT: Added link to Evernote.