London is a data lab for tomorrow

François-Xavier Fringant, 24 February 2011 at 23:51

Rule Britannia, Britannia rule the clouds

Those who are familiar with Saint Pancras station know it only too well: London is changing rapidly. From the Conservatives regaining power in 2008 to the upcoming Olympics, the city’s buildings are undergoing some kind of overpriced, permanent makeover.

Perhaps in a more ear-friendly fashion, its digital landscape seems to be evolving even more dramatically. Over the last four months, I have been able to witness this ongoing, silent revolution through a series of insightful talks and conferences. Amid the growing data fever, London seemed like the perfect place to check the temperature.

Under construction: the road to Open Data

Who: Carol Tullo, Director of Information Policy and Services, The National Archives
When: 26/10/10
Where: UCL lecture, Gower Street

After weeks of intense negotiation over what was seen as a shotgun wedding between the Tories and the Lib Dems, the time came to draft a sustainable programme. The Crown-copyrighted Coalition Agreement, published in May, notably highlights the need for enhanced Government transparency.

“The Government believes that we need to throw open the doors of public bodies, to enable the public to hold politicians and public bodies to account. […] We will create a new ‘right to data’ so that government-held datasets can be requested and used by the public, and then published on a regular basis.”

(Article 16 of the Coalition Agreement)

The simple phrase “right to data” measures the distance travelled over the last few years in the disclosure of Public Sector Information (PSI). After sustained campaigning, successes such as the release of the COINS database last year clearly showed the way forward. Openness is also listed as one of the five priorities of Britain’s Big Society, the flagship policy idea endorsed by David Cameron and his supporters.

As a matter of fact, the UK government has long recognised the importance of the re-use of its data and has put it on its agenda. The European Commission has been pushing in the same direction through a 2003 EU directive that encourages PSI sharing in order to foster innovation. Interestingly, the British government has now formally established the function of Knowledge and Information Management (KIM), on the same footing as the IT and Communications functions. The GKIM Network is managed by the National Archives, which supports and promotes Knowledge and Information Management across government, ensuring that the right information is maintained and survives for tomorrow. However, as I write these lines, the full potential of public sector information has arguably not been unlocked. Europe-wide initiatives such as the ePSI platform are working relentlessly to establish a common code of good practice.

Refining raw data

Who: Rufus Pollock, Open Knowledge Foundation
When: 21/10/10
Where: Dataconomy, Media 140, King’s Cross

However satisfying, openness is not an end in itself. The need to help the open data ecosystem grow has become apparent, for example by adapting the tools and methodologies of the free/open source software community. Among other projects, Rufus Pollock helped initiate the Open Knowledge Foundation, a not-for-profit organization founded in 2004 and dedicated to promoting open knowledge in all its forms. The Foundation is home to various open knowledge projects, communities and resources, including the Comprehensive Knowledge Archive Network (CKAN), an open source registry for datasets. The initiative revolves around the idea that not all open data is necessarily machine-readable. Hence the need to “clean” the data before uploading it onto a platform, through a process described as “componentization”.

Componentization: towards cleaner, ready-to-use data?

The growing number of large datasets being released has made it necessary to rethink the way data repositories are built. Large banks of data can only be processed through collaborative production and distribution. They require systematic analysis and processing, in small slices, across many sets of users. Doing this allows one to ‘divide and conquer’ the organizational and conceptual problems of highly complex systems. As packaging spreads, broken-down resources can easily be recombined into packages, which individuals can then use. There is a lot of interesting work going on both to extend CKAN and to improve associated tools such as datapkg, which enable data developers to automate their work with datasets.
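
To make the ‘divide and conquer’ idea a little more concrete, here is a minimal Python sketch of what componentization could look like. It is not how CKAN or datapkg work internally: the function names (componentize, build_package), the component naming scheme and the sample records are all assumptions made up for illustration, and only the standard library is used.

```python
import hashlib
import json


def componentize(records, key, max_size):
    """Split records into small named components, grouped by `key`.

    Hypothetical helper: each component holds at most `max_size` rows.
    """
    groups = {}
    for record in records:
        groups.setdefault(record[key], []).append(record)

    components = {}
    for value, rows in sorted(groups.items()):
        for start in range(0, len(rows), max_size):
            # Name each slice after its group and its position in the group.
            name = f"{value}-{start // max_size}"
            components[name] = rows[start:start + max_size]
    return components


def build_package(components, wanted):
    """Recombine the chosen components into one package with a manifest."""
    data = [row for name in wanted for row in components[name]]
    payload = json.dumps(data, sort_keys=True).encode("utf-8")
    return {
        "manifest": {
            "components": list(wanted),
            "rows": len(data),
            # A checksum lets users verify the recombined data.
            "sha256": hashlib.sha256(payload).hexdigest(),
        },
        "data": data,
    }


# Made-up sample records, for illustration only.
records = [
    {"department": "health", "year": 2009, "amount": 10},
    {"department": "health", "year": 2010, "amount": 12},
    {"department": "health", "year": 2011, "amount": 9},
    {"department": "transport", "year": 2010, "amount": 7},
]

components = componentize(records, key="department", max_size=2)
package = build_package(components, ["health-0", "transport-0"])
```

The point of the sketch is simply that small, named slices can be maintained by different people and then recombined on demand into a package that suits a given user, with a manifest recording which pieces went into it.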

Tips for good data visualisation

Who: David McCandless, Information is Beautiful
When: 09/11/10
Where: Protein Forum #7, Waterloo

The award-winning British journalist agreed to disclose some behind-the-scene tips on how to create an eye-catching, engaging visualisation:

“What doesn’t work: just putting things out there (what he calls ‘spaghetti’, a confusing gathering of information). You have to sort things out.”

There are three stages to engaging, insightful data visualisation:

1. The concept, the idea to start with

2. Sketching comes next, via brainstorming and mind-mapping, mostly on a simple sheet of paper, with one concern constantly in mind: “what questions have yet to be answered?”

3. Visualisation is the very last stage. You have to play around with the designs and see what works best. An understanding of infographics is preferable but not necessary.

Since a picture is worth a thousand words, feel free to watch David McCandless’ TED talk, entitled The Beauty of Data Visualization. The talk was not held in London, I grant you that, yet the one he gave here was fairly similar.

The Beauty of Data Visualization

Another visualisation caught my eye while I was sitting through those talks: Everyone Ever in the World, by London-based The Luxury of Protest, which portrays all the known wars and conflicts in the history of mankind and their (heavy) death toll.

Not so fictional Death Star: a billion killed since the dawn of time

Don’t take yourself too seriously

Over the last few months, I have also noticed a global effort from data visualisers to expand their playground and their audience. Such initiatives keep data from coming across as an overly serious, overexposed topic, by relating the visualisations to simple, down-to-earth situations.

Who: Andrew Shoben, Greyworld
When: 09/11/10
Where: Protein Forum #7, Waterloo

Interestingly enough, two completely opposite approaches best illustrate this point. First, one of Greyworld’s visualisations, designed for the London Stock Exchange, aims to give physical form to essentially fleeting stock market variations. A physical, animated structure in the middle of the marketplace brings the abstractions of the FTSE 100 to life.

£99 billion Luftballons

Conversely, data visualisation can provide a new, digital experience of feelings. Another breathtaking work consisted of visualizing the flavours experienced when eating an ice cream. The work generates mini universes with different customizable characteristics.

The ethics of data journalism

Who: Mark Stephens, Julian Assange’s lawyer
When: 09/02/11
Where: Data Journalism Talk, The Book Club, Leonard Street

Data-centred whistleblowing initiatives, such as Wikileaks, or the investigations that unveiled the 2009 MPs’ expenses scandal, attract praise and criticism in equal measure. Although both exposed scandals through data hacking, the two cases differ significantly in how far the identity of the leaker was revealed. Rather than playing the traditional role of ‘digger’, The Daily Telegraph paid an anonymous individual in 2009 to provide the digital files disclosing MPs’ expenses. The journalists were essentially acting as curators of the data. As a result, two years later, it is fair to assume that the man on the street hardly remembers that the Daily Telegraph, let alone an unnamed individual, had anything to do with the disclosure of the scandal.

Fame and blame: data journalism at the crossroads

By contrast, Julian Assange has arguably chosen to become the public face of Wikileaks. According to some, this visibility is likely to shift the spotlight away from the legitimacy of his cause. In the meantime, other platforms such as OpenLeaks aim to operate in a more middle-of-the-road, less polemical manner than Wikileaks.

In a wider perspective, this debate raises a valid point which data journalists will have to tackle in the near future. Great honours bring about great burdens. In an increasingly globalized information world, excessively publicized initiatives can very well backfire and prove harmful to the very cause they serve. According to this latest London talk, successful data journalists may thus, to an extent, have to keep their profile in the shade. Shame that the sun never sets on the British Empire.

