Data Portability

Social network portability is one of several user-interface ideas and suggestions in the area of data portability. As users, our identity, photos, videos and other forms of personal data should be discoverable by, and shared between, our chosen (and trusted) tools or vendors. When you join a new site, you should be able to import, or preferably subscribe to, your profile information and your social network from any existing profile of yours. We need a DHCP for identity and a distributed file system for data. The technologies already exist; we simply need a complete reference design to put the pieces together. The problem can be solved with a number of existing technologies and initiatives: Microformats, OpenID, OAuth, RDF, RSS, OPML and APML.

Data Portability Technologies

Data Portability’s mission is to put all existing technologies and initiatives in context to create a reference design for end-to-end data portability, and to promote that design to the developer, vendor and end-user community.

This post serves as a brief primer on each of these technologies.

Microformats

Designed for humans first and machines second, microformats are a set of simple, open data formats built upon existing and widely adopted standards. Instead of throwing away what works today, microformats intend to solve simpler problems first by adapting to current behaviours and usage patterns (e.g. XHTML, blogging).

Examples include:

People and Organizations: hCard
Calendars and Events: hCalendar
Opinions, Ratings and Reviews: VoteLinks, hReview
Social Networks: XFN
Licenses: rel-license
Tags, Keywords, Categories: rel-tag
Lists and Outlines: XOXO

If you use Flickr, Technorati, Upcoming, Last.fm, Twitter, Cork’d or any number of other services, you can conceivably share data between the different service providers automatically.
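To make the idea concrete, here is a minimal sketch in Python of how one service might read an hCard out of an ordinary web page published by another. The markup, the person and the HCardParser class are all made up for illustration; only the class names (vcard, fn, url, org) come from the hCard microformat. A real consumer would use a complete microformats parser rather than this handful of lines.

```python
from html.parser import HTMLParser

# Hypothetical profile snippet marked up as an hCard.
PROFILE_HTML = """
<div class="vcard">
  <a class="fn url" href="http://example.org/jane">Jane Example</a>
  <span class="org">Example Widgets Ltd</span>
</div>
"""


class HCardParser(HTMLParser):
    """Collects a few hCard properties (fn, url, org) from HTML.

    Simplification: assumes every opened element is also closed,
    so unclosed void tags such as <br> are not handled.
    """

    TEXT_PROPERTIES = {"fn", "org"}

    def __init__(self):
        super().__init__()
        self.card = {}
        self._open = []  # one list of property names per open element

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        # A "url" property takes its value from the href attribute.
        if "url" in classes and attrs.get("href"):
            self.card["url"] = attrs["href"]
        self._open.append([c for c in classes if c in self.TEXT_PROPERTIES])

    def handle_endtag(self, tag):
        if self._open:
            self._open.pop()

    def handle_data(self, data):
        text = data.strip()
        if not text or not self._open:
            return
        # Text belongs to the properties named on the innermost open element.
        for prop in self._open[-1]:
            self.card[prop] = text


def parse_hcard(html):
    parser = HCardParser()
    parser.feed(html)
    return parser.card


card = parse_hcard(PROFILE_HTML)
print(card)
```

The point of the sketch is that the same HTML a person reads in the browser doubles as the data source, which is what "humans first, machines second" means in practice.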

More details can be found on the microformats website.

OpenID

OpenID is an open, decentralized framework for user-centric digital identity. OpenID takes advantage of already existing internet technology (URI, HTTP, SSL, Diffie-Hellman) and realizes that people are already creating identities for themselves whether it be at their blog, photostream, profile page, etc. With OpenID you can easily transform one of these existing URIs into an account which can be used at sites which support OpenID logins.

In other words, OpenID allows users to log in using shared credentials across different services. It also allows users to decide what information to share between services. For example, you can allow the use of your address on one service, but not another. You can think of OpenID as an extension of the single sign-on used by Google or Yahoo! to access their various services.
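As a rough illustration of the login step, the sketch below builds the URL a site would redirect your browser to so that your OpenID provider can ask you to approve the login. It assumes the parameter names of OpenID Authentication 2.0, and it skips discovery (finding the provider from your URI) and verification of the response entirely. The provider endpoint, the identity URI, the site addresses and the build_auth_request function are all hypothetical.

```python
from urllib.parse import urlencode

# Namespace URI that marks a message as OpenID Authentication 2.0.
OPENID2_NS = "http://specs.openid.net/auth/2.0"


def build_auth_request(provider_endpoint, claimed_id, return_to, realm):
    """Return the redirect URL that asks the provider to authenticate the user."""
    params = {
        "openid.ns": OPENID2_NS,
        "openid.mode": "checkid_setup",   # provider may interact with the user
        "openid.claimed_id": claimed_id,  # the URI the user typed in
        "openid.identity": claimed_id,    # assumed identical, for simplicity
        "openid.return_to": return_to,    # where the provider sends the answer
        "openid.realm": realm,            # the site the user is asked to trust
    }
    separator = "&" if "?" in provider_endpoint else "?"
    return provider_endpoint + separator + urlencode(params)


# All addresses below are invented for the example.
redirect_url = build_auth_request(
    provider_endpoint="https://openid.example.net/server",
    claimed_id="http://jane.example.org/",
    return_to="https://photos.example.com/openid/return",
    realm="https://photos.example.com/",
)
print(redirect_url)
```

Note that the site never sees a password: it only learns, from the provider's signed response, whether you control the URI you claimed.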

More details can be found on the OpenID website.

OAuth

The OAuth protocol is less about authentication, which is the realm of OpenID, than about authorisation. OAuth is an open protocol to allow secure API authorisation in a simple and standard method from desktop and web applications. For Consumer developers, OAuth is a method to publish and interact with protected data. For Service Provider developers, OAuth gives users access to their data while protecting their account credentials.

A number of services have already implemented OAuth. These include Fire Eagle, Open Social, Pownce, Get Satisfaction and Magnolia.
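The way OAuth protects account credentials can be sketched in a few lines: a Consumer signs each API request with secrets it was issued, rather than sending the user's password. The Python below is a minimal sketch of HMAC-SHA1 request signing as I understand it from OAuth Core 1.0. The keys, secrets, token, URL and the extra "size" parameter are invented, and the timestamp and nonce are fixed only so the example is repeatable; a real client would generate fresh ones for every request.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def pct(value):
    """Percent-encode for OAuth: only letters, digits and - . _ ~ stay unescaped."""
    return quote(str(value), safe="~")


def signature_base_string(method, url, params):
    # Parameters are encoded, sorted, then joined as key=value pairs.
    pairs = sorted((pct(k), pct(v)) for k, v in params.items())
    normalized = "&".join(f"{k}={v}" for k, v in pairs)
    return "&".join([method.upper(), pct(url), pct(normalized)])


def sign_hmac_sha1(base_string, consumer_secret, token_secret=""):
    # The signing key is the two secrets joined by "&".
    key = f"{pct(consumer_secret)}&{pct(token_secret)}"
    digest = hmac.new(key.encode("ascii"), base_string.encode("ascii"),
                      hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")


# Hypothetical request parameters.
params = {
    "oauth_consumer_key": "example-consumer-key",
    "oauth_token": "example-access-token",
    "oauth_signature_method": "HMAC-SHA1",
    "oauth_timestamp": "1200000000",
    "oauth_nonce": "abc123",
    "oauth_version": "1.0",
    "size": "original",
}
base = signature_base_string("GET", "http://photos.example.net/photos", params)
signature = sign_hmac_sha1(base, "example-consumer-secret", "example-token-secret")
print(base)
print(signature)
```

The Service Provider repeats the same calculation with its copy of the secrets; if the signatures match, the request is authorised, and the user can revoke the token at any time without changing a password.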

More details can be found on the OAuth website.

Resource Description Framework (RDF)

RDF is a family of World Wide Web Consortium (W3C) specifications originally designed as a metadata model but which has come to be used as a general method of modeling information, through a variety of syntax formats.

The RDF metadata model is based upon the idea of making statements about resources in the form of subject-predicate-object expressions, called triples in RDF terminology. The subject denotes the resource, and the predicate denotes traits or aspects of the resource and expresses a relationship between the subject and the object. For example, one way to represent the notion “The sky has the color blue” in RDF is as the triple: a subject denoting “the sky”, a predicate denoting “has the color”, and an object denoting “blue”. RDF is an abstract model with several serialization formats (i.e. file formats), and so the particular way in which a resource or triple is encoded varies from format to format.
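The sky example can be written down directly. The following Python sketch models a triple as a small tuple and serializes it in N-Triples, one of the simpler RDF serialization formats. The example.org URIs and the hasColor term are placeholders rather than a real vocabulary, and the to_ntriples helper only handles plain literal objects.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str    # URI of the resource being described
    predicate: str  # URI naming the trait or relationship
    obj: str        # a plain literal here; in general it could also be a URI


def to_ntriples(triple):
    """Serialize one triple with a literal object as an N-Triples line."""
    literal = triple.obj.replace("\\", "\\\\").replace('"', '\\"')
    return f'<{triple.subject}> <{triple.predicate}> "{literal}" .'


# "The sky has the color blue", using made-up URIs.
sky_is_blue = Triple(
    subject="http://example.org/things#sky",
    predicate="http://example.org/terms#hasColor",
    obj="blue",
)
print(to_ntriples(sky_is_blue))
```

The same Triple could just as well be written out as RDF/XML or another syntax; only the serializer would change, not the statement.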

This mechanism for describing resources is a major component in what is proposed by the W3C’s Semantic Web activity: an evolutionary stage of the World Wide Web in which automated software can store, exchange, and use machine-readable information distributed throughout the web, in turn enabling users to deal with the information with greater efficiency and certainty. RDF’s simple data model and ability to model disparate, abstract concepts has also led to its increasing use in knowledge management applications unrelated to Semantic Web activity.

More details can be found on the W3C website.

Really Simple Syndication (RSS)

RSS is a family of Web feed formats used to publish frequently updated content including, but not limited to, blog entries, news headlines, and podcasts. An RSS document, which is called a “feed” or “web feed” or “channel”, contains either a summary of content from an associated web site or the full text. RSS makes it possible for people to keep up with web sites in an automated manner that can be piped into special programs or filtered displays.

RSS content can be read using software called an “RSS reader”, “feed reader” or an “aggregator”. The user subscribes to a feed by entering the feed’s link into the reader or by clicking an RSS icon in a browser that initiates the subscription process. The reader checks the user’s subscribed feeds regularly for new content, downloading any updates that it finds.
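The checking step a reader performs can be sketched as follows. This Python example parses a tiny, made-up RSS 2.0 feed and reports only the items it has not seen before. The feed is held in a string so the example is self-contained (a real reader would download it from the feed's link on a schedule), and the new_items function is a hypothetical name.

```python
import xml.etree.ElementTree as ET

# A made-up RSS 2.0 feed, newest item first.
FEED_XML = """<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <link>http://blog.example.org/</link>
    <description>A made-up feed for illustration</description>
    <item>
      <title>Second post</title>
      <link>http://blog.example.org/2</link>
      <guid>http://blog.example.org/2</guid>
    </item>
    <item>
      <title>First post</title>
      <link>http://blog.example.org/1</link>
      <guid>http://blog.example.org/1</guid>
    </item>
  </channel>
</rss>
"""


def new_items(feed_xml, seen_guids):
    """Return (title, link) for items not seen before, and remember them."""
    root = ET.fromstring(feed_xml)
    fresh = []
    for item in root.iterfind("./channel/item"):
        # Fall back to the link when an item carries no guid.
        guid = item.findtext("guid") or item.findtext("link")
        if guid and guid not in seen_guids:
            seen_guids.add(guid)
            fresh.append((item.findtext("title"), item.findtext("link")))
    return fresh


seen = {"http://blog.example.org/1"}      # the reader already has the first post
first_check = new_items(FEED_XML, seen)   # finds only the second post
second_check = new_items(FEED_XML, seen)  # nothing new the next time round
print(first_check)
print(second_check)
```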

More details can be found on the RSS Board website.

Outline Processor Mark-up Language (OPML)

OPML is an XML format for outlines. Originally developed by UserLand as the native file format for the outliner in its Radio UserLand product, it has since been adopted for other uses, the most common being to exchange lists of web feeds between web feed aggregators.

The OPML specification defines an outline as a hierarchical, ordered list of arbitrary elements. The specification is fairly open, which makes it suitable for many types of list data.
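A subscription list shows the hierarchy well. The Python sketch below walks a small, made-up OPML document in order, printing the nested outline and collecting the feed addresses an aggregator would import. The text, type and xmlUrl attributes are the ones commonly used for feed lists; the feeds themselves and the function names are invented.

```python
import xml.etree.ElementTree as ET

# A made-up subscription list with two folders.
OPML_XML = """<opml version="2.0">
  <head><title>Example subscriptions</title></head>
  <body>
    <outline text="News">
      <outline text="Example News" type="rss" xmlUrl="http://news.example.org/rss.xml"/>
    </outline>
    <outline text="Friends">
      <outline text="Jane's blog" type="rss" xmlUrl="http://jane.example.org/feed"/>
      <outline text="Sam's photos" type="rss" xmlUrl="http://sam.example.org/photos.rss"/>
    </outline>
  </body>
</opml>
"""


def walk(outline, depth=0):
    """Yield (depth, text, feed URL or None) for an outline and its children."""
    yield depth, outline.get("text"), outline.get("xmlUrl")
    for child in outline.findall("outline"):
        yield from walk(child, depth + 1)


def read_outline(opml_xml):
    body = ET.fromstring(opml_xml).find("body")
    rows = []
    for top in body.findall("outline"):
        rows.extend(walk(top))
    return rows


rows = read_outline(OPML_XML)
for depth, text, url in rows:
    print("  " * depth + text + (f" -> {url}" if url else ""))

# The flat list of feeds a reader would subscribe to.
feeds = [url for _, _, url in rows if url]
```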

More details can be found on the OPML website.

Attention Profiling Mark-up Language (APML)

APML allows you to share your own personal Attention Profile in much the same way that OPML allows the exchange of reading lists between News Readers. The idea is to compress all forms of Attention Data into a portable file format containing a description of your ranked interests.
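To give a feel for what "ranked interests" might look like, here is a small Python sketch that reads a simplified attention profile and sorts its concepts by weight. The element and attribute names (Profile, ImplicitData, Concept, key, value) follow my reading of the APML 0.6 draft examples and should be checked against the specification; the XML namespace, the head section and most attributes are left out, and the interests and scores are invented.

```python
import xml.etree.ElementTree as ET

# Simplified, made-up profile; the structure is an assumption based on APML 0.6 drafts.
APML_XML = """<APML version="0.6">
  <Body defaultprofile="Home">
    <Profile name="Home">
      <ImplicitData>
        <Concepts>
          <Concept key="photography" value="0.42"/>
          <Concept key="data portability" value="0.91"/>
          <Concept key="cooking" value="0.15"/>
        </Concepts>
      </ImplicitData>
    </Profile>
  </Body>
</APML>
"""


def ranked_interests(apml_xml, profile_name):
    """Return (concept, weight) pairs for one profile, strongest interest first."""
    root = ET.fromstring(apml_xml)
    for profile in root.iter("Profile"):
        if profile.get("name") == profile_name:
            concepts = [(c.get("key"), float(c.get("value")))
                        for c in profile.iter("Concept")]
            return sorted(concepts, key=lambda pair: pair[1], reverse=True)
    return []  # no profile with that name


interests = ranked_interests(APML_XML, "Home")
for concept, weight in interests:
    print(f"{weight:.2f}  {concept}")
```

A service receiving such a file could use the top few concepts to tailor what it shows you, without you having to retrain it from scratch.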

Services that have adopted APML include Bloglines, Cluztr, Dandelife, Engagd, Idiomag, OpenLink Data Spaces and Particls.

More details can be found on the APML website.

Securely transferring personal data around the web has become an increasingly important concept, not only to users of the web but also to service providers. Both Plaxo and Six Apart have been working on a system to allow the transferral of data. However, since Google announced Open Social and the Open Social API, the mantle has been handed over and there is now a strong commitment to realising data portability.

On the Web, a walled garden is an environment that controls the user’s access to Web content and services. In effect, the walled garden directs the user’s navigation within particular areas, to allow access to a selection of material, or prevent access to other material.

Recent history suggests that open standards will again better the “walled gardens” of the Web.

In 1994, when the Internet, a previously obscure computer network that grew out of American Department of Defence research, first became known to the general public through the “World Wide Web”, or simply the Web, many people first connected to it via AOL and CompuServe. These subscription-based service providers offered not only access to the Internet, but other services such as email, chatrooms, discussion boards and more. It was access to the Web via the Internet that would lead to the undermining of these services, and the opening up of the Web as a platform for individual and creative expression, revenue generation and social interactivity.

Whilst it took some time for the closed communities to venture out into the wilds of the Web, the move brought about the standardisation of the services that made up the early Web. For instance, POP and SMTP standardised email, and as a result it has become the ubiquitous tool of business. Today, of the early pioneers of the Web, only AOL survives, but as an entirely different entity: a web portal supported by advertising.

History appears to be repeating itself. The biggest online phenomena of the past couple of years, the social-networking websites Facebook and MySpace, are acting very much like the AOL of the mid-1990s. They are closed systems based upon proprietary standards. You cannot easily move information from one system to another if you so choose. This ties users into one system, or forces them to create profiles on both. A similar comparison can be drawn with the virtual worlds of Second Life and Entropia Universe.

The Web is better when it’s social.

Part of the reason these websites are popular is that they are closed communities, where users can interact with friends and find new friends with whom to interact. This community feel has been tested in recent times, with sites such as Facebook being criticised for using their users’ personal data to target advertising. It is inevitable, however, that these systems are proprietary; it is only once such systems emerge and become popular that standards can be developed and implemented.

Open Social API

Just as the Web’s open standards, embodied in the Netscape browser, displaced the online service providers, so the paradigm of open standards awaits social networking and virtual worlds. Back in the 1990s it was Netscape, but in the 21st century it falls to Google to defend the open standards of the Web with the Open Social API. Some say there is a large amount of self-interest in this move, since Facebook and MySpace have huge communities about which they know far more than Google does, and from which they can hence generate billions of dollars of revenue.

The Web is more interesting when you can build applications that easily interact with your friends and colleagues. But with the trend towards more social applications also comes a growing list of site-specific APIs that developers must learn. Open Social is an attempt not only to open up the closed communities and allow developers to interact with the different networks, but also to allow developers to learn only one API. MySpace has signed up to this initiative and, more reluctantly, so has Facebook. A curiosity is AOL’s recent acquisition of Bebo, another online community popular in Europe. Is AOL simply jumping on the bandwagon? Has it learnt the lessons of its past, or is it using knowledge of its past as a guiding principle? Whatever the answer, Bebo’s inclusion in Open Social will help it continue to compete with other social networking websites.