Synchronization Vs. Direct access, the case of contacts

March 22, 2009

Maintaining accurate and useful contact information is not an easy task, especially when that information is spread over several places. The solution usually proposed for this problem is referred to as synchronization. Synchronization, also known as data replication, is a mechanism that copies data between sources so that every source reflects the same state.

Synchronization brings several problems. It tends to cast every information pattern into a single predefined one. It flattens the information and loses the “origin” dimension, which may carry important semantics about how and where to use it. It usually breaks the link with the source, so newly available information from an updated source won’t be distributed until the next synchronization. It implies merging choices, which result in inconsistencies, loss, and duplication of information.
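To make the flattening problem concrete, here is a minimal sketch of a naive "last source wins" merge. The records, field names, and the `naive_sync_merge` helper are all made up for illustration; real synchronization tools use more elaborate rules, but they face the same kind of choice.

```python
# Hypothetical records for the same person, as two sources might expose them.
linkedin = {"name": "Jane A. Doe", "email": "jane.doe@work.example"}
lastfm = {"name": "jd_music", "email": "jane@home.example"}


def naive_sync_merge(*records):
    """Flatten several records into one; later sources overwrite earlier ones."""
    merged = {}
    for record in records:
        merged.update(record)
    return merged


merged = naive_sync_merge(linkedin, lastfm)
print(merged)
```

After the merge, the full name from the first source is gone, and nothing in the result says which source each remaining value came from.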

The worst thing about synchronization is that it actually solves nothing. It may be a sufficient solution when considering only local sources like computer or cellphone address books, but in the era of Web 2.0, most contact information is to be found on the Internet. Contact sources there are any services that manipulate people, contact, or friend objects. This obviously includes social networks. The diversity of those sources makes synchronization impossible to consider: their purposes differ, hence the content they provide shouldn’t be merged but used differently, depending on the context. And why would you actually want to synchronize? Since all the information is always out there, you only need to access it.

This is why, for People, we chose not to copy and merge contact information from the web into a flattened local version of it, but instead to provide a unified way of accessing it. With direct access, the only effort required is to make explicit which contact information from different sources belongs to a single person entity. Direct access ensures the integrity and reliability of the data (it is not manipulated between access and use). It keeps the source known, so the relevance of the information can be weighed (the full name of a contact is more likely to be accurate on LinkedIn than on Last.fm). And the information is updated by its actual owners, your contacts. Combining this approach with a cache mechanism to avoid network round trips, and maybe an offline mode, should serve users better than forcing them to understand and suffer the murk of broken synchronization scenarios.
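The idea can be sketched roughly as follows. This is not the People code base: the `Person` and `DirectAccess` classes, the fetcher functions, and the time-to-live value are assumptions invented for this example, and the two fetchers are stubs standing in for real network calls.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

# A fetcher takes an account id and returns the fields that source exposes.
Fetcher = Callable[[str], Dict[str, str]]


@dataclass
class Person:
    """Groups the accounts of one person; holds references, not copied data."""
    label: str
    accounts: List[Tuple[str, str]] = field(default_factory=list)  # (source, account id)


class DirectAccess:
    """Reads attributes from the sources on demand, with a small in-memory cache."""

    def __init__(self, fetchers: Dict[str, Fetcher], ttl_seconds: float = 300.0):
        self.fetchers = fetchers
        self.ttl = ttl_seconds
        self._cache: Dict[Tuple[str, str], Tuple[float, Dict[str, str]]] = {}

    def _fetch(self, source: str, account_id: str) -> Dict[str, str]:
        key = (source, account_id)
        now = time.monotonic()
        hit = self._cache.get(key)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]  # fresh enough: skip the network round trip
        data = self.fetchers[source](account_id)
        self._cache[key] = (now, data)
        return data

    def values(self, person: Person, attribute: str) -> List[Tuple[str, str]]:
        """Return every value found for the attribute, tagged with its source."""
        results = []
        for source, account_id in person.accounts:
            data = self._fetch(source, account_id)
            if attribute in data:
                results.append((source, data[attribute]))
        return results

    def best(self, person: Person, attribute: str,
             preferred_sources: List[str]) -> Optional[str]:
        """Pick the value from the first preferred source that provides one."""
        found = dict(self.values(person, attribute))
        for source in preferred_sources:
            if source in found:
                return found[source]
        return None


# Stub fetchers standing in for real web services; they record each call.
calls = []


def fake_linkedin(account_id: str) -> Dict[str, str]:
    calls.append(("linkedin", account_id))
    return {"name": "Jane A. Doe"}


def fake_lastfm(account_id: str) -> Dict[str, str]:
    calls.append(("lastfm", account_id))
    return {"name": "jd_music"}


access = DirectAccess({"linkedin": fake_linkedin, "lastfm": fake_lastfm})
jane = Person("Jane", [("lastfm", "jd_music"), ("linkedin", "jane-doe")])

names = access.values(jane, "name")
full_name = access.best(jane, "name", ["linkedin", "lastfm"])
print(names, full_name, len(calls))
```

Every value stays attached to its source, so the caller decides which one is relevant in a given context, and the second lookup is served from the cache instead of contacting the sources again.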


One Response to “Synchronization Vs. Direct access, the case of contacts”


  1. If you have a caching mechanism and an offline mode, then you have one-way synchronization from various sources to a synthetic repository fusing the disparate data.

    Local availability of that data is the People project’s goal, but you can be sure that users will think about exporting their nice single repository of contacts to remote applications, maybe as vCard or hCard data obtained through the D-Bus interface.

    Or maybe the next step will be an OpenSync connector backed by the Soylent library. Is that something that lies in the long-term roadmap, or is it expressly excluded from the feature scope?

