Tag Archives: open data

Open Data: Looking Beyond the Apps

The open data movement has surely gathered momentum across the world. Taking a cue from the United States, which launched its data.gov open data sharing site in May 2009, many national governments have taken similar initiatives to create their own open data sites. These include developed countries such as Australia, Canada, France, Germany, Italy, the Netherlands and the UK as well as emerging nations such as Brazil, India, Indonesia and Russia. Most of these sites were launched between 2010 and 2012.

India launched its open data site, data.gov.in, in September 2012. After a slow start, it has now picked up momentum and today offers more than 2,500 datasets. A dataset is a table of data on a particular area. It could be as broad as all the crop production in the country, crop-wise and district-wise, for the last 30 years, or as narrow as exports of a particular item to different global regions in a single year.

The US opened up government data as part of President Obama’s open governance promise, while the first Federal CIO Vivek Kundra, the person behind implementing the initiative, called upon individuals, groups and commercial companies to make use of open data to build innovative apps that would solve citizens’ problems. Kundra consistently championed building apps and even prophesied that in the coming years, there would be an “explosion of apps” based on open data.

Since then, these two attributes—transparency and citizen apps—have become the de facto objectives of government open data initiatives across the world. While the developed world has taken to both these objectives, the emerging countries have focused more on the citizen app side, for obvious reasons. Transparency is a very lofty objective to achieve in these countries just by releasing some datasets, when other governance frameworks are not ready.

While both these are worthy expectations to have from government open data initiatives, what is a little worrying is that these objectives have come to define open data priorities and policies in many countries.

Take the apps expectation, for example. Globally, the role of apps creation from open data has been so overemphasized that many governments try to measure the effectiveness of their open data programs by the number of apps developed on the data made available. That is a completely misplaced expectation for two reasons. One, data can help better citizens’ lives in many ways beyond apps. Two, it is difficult for governments to track all the apps created. Look at the US data.gov site. Though there are more than 75,000 datasets, there are only around 350 citizen-developed apps shared on the site.

Apart from misplaced expectations (and disappointments because of not meeting those expectations), the apps expectation has also resulted in misplaced priorities and policies governing open data.

Here are some of the skewed policies governments have followed because of the overemphasis on the apps part of open data.

Not measuring the efficiency accrued to the economy. Open data initiatives put important government information into the public domain, easily accessible to all. Very often, similar information is separately collected by various others (academic researchers, commercial organizations, other government bodies and agencies) for their own requirements, thus duplicating the effort. In other words, it is an inefficient use of time and resources.

Open data, by eliminating (or at least minimizing) the need to duplicate that effort, makes the whole economy far more efficient. This is difficult to measure in the short run but can be measured over a period of time. I have never heard any open data evangelist talking about this anywhere.

Further, if governments realize this, they could cooperate with the other stakeholders, and data collection and processing could be optimized to meet the requirements of more stakeholders. In the future, the cost could even be shared. This could lead to far more efficient collection and processing of basic information and even enhance data quality.

Limited outreach. The overemphasis on the apps aspect has created a misplaced priority in terms of outreach. The outreach programs of governments in most countries are directed at the tech/app builder community, with some tech-savvy NGOs/advocacy groups joining in. The entire open data discussion is restricted to these three communities: government, developers, and NGOs/advocacy groups. Many major stakeholders such as media, market researchers and academic researchers, who could play an important role in showing the latent value that lies in open data, are today left out. Even if they do show an interest, they often get scared away by the technical lingo that dominates these discussions. That is a loss for the cause of open data.

In an online conversation hosted by The World Bank on Open Data for Poverty Alleviation, I raised this point. Tim Davies of Practical Participation agreed and had this to say:

I think there is often a failure in open data capacity building to think about the consultants, analysts, researchers and so-on who might be engaged as users of data, and who will provide bespoke value added services on top of it (hopefully realizing social as well as economic value).

Restrictive data formats. Many government agencies implementing open data in their countries focus all their attention on obtaining or creating datasets in machine-readable format, a direct result of working backwards from apps. While a lot of time and energy is wasted in conversion and cleaning, a lot of good, structured datasets that are not in machine-readable format never make it to their list of published datasets. That is a big loss.

True, machine-readable formats do make life easier for everyone, but ignoring human-readable formats is the other extreme. Open data is not defined by any format. The implementers of data portals should perhaps take a middle path, one that encourages machine-readable formats but does not leave out human-readable formats such as PDF completely.

Too much emphasis on datasets in consumer interest areas. The overemphasis on citizen apps puts undue pressure on the managers of data portals to obtain more and more datasets that are directly of interest to end consumers and hence good data to build apps on. So, while a hospital list or a crime info dataset is cheered, a crop production or exports dataset is often dismissed as “useless information dumped by government.” While it’s true that data of consumer interest can be used to instantly create apps, data on agriculture and meteorology, when analyzed by experts using the right tools, can have a far broader and longer-term impact on the lives of millions of citizens. These analyses could help in maximizing agricultural production, avoiding big disasters, imparting the right skills to unemployed youth and so on, even if they are not packaged as sleek apps.

Slowly but surely, the constraints of associating open data too closely with apps and pre-designed visualizations are being realized. Mike Gurstein, a leading voice on open data, argued this on his blog.

But why shouldn’t we think of “open data” as a “service” where the open data rather than being characterized by its “thingness” or its unchangeable quality as a “product”, can be understood as an on-going interactive and iterative process of co-creation between the data supplier and the end-user; where the outcome is as much determined by the needs and interests of the user as by the resources and pre-existing expectations of the data provider?

Though Gurstein’s explicit question is about the rationality of deciding outcomes by the pre-existing expectations of the data provider, the logic can be extended to ask why outcomes should be based on the pre-existing expectations of the apps providers. In most cases, the apps providers do not have much extra insight into the end users’ needs.

In the end, it must be pointed out that open data is about making information work for the betterment of society: making citizens’ lives more convenient, creating the basis for decisions at a macro-economic level, making the economy and business ecosystem more efficient, and yes, minimizing risk. It is not about technology; technology is a very handy tool, though.

Filed under Open Data, Policy & Regulation, Technology & Society

Officially Open: India Launches Open Data Beta Site

India has finally launched the beta version of its open data site, data.gov.in. The site is part of the country’s plan to provide open and transparent access to data collected by various government departments and agencies, as outlined in the recently formulated National Data Sharing and Accessibility Policy – 2012 (NDSAP-12).

The stated advantages, as envisioned by the policy, include maximization of use of data, avoidance/minimization of duplication of efforts on collection, facilitating integration by leading to common standards, providing ownership information, faster and better decision making, and of course, equitable access to information by all citizens.

Ever since the then Federal CIO of the United States launched data.gov in May 2009, many countries have launched similar sites. India is the latest country to join the bandwagon.

The data.gov.in site has debuted with 13 raw datasets provided by seven departments and four apps provided by four departments. As part of the plan, data management offices are being created in each of the departments, each to be headed by a senior official called a data controller. Five ministries/departments have already identified their data controllers, whose names are available on the site. These are the Department of Public Enterprises under the Ministry of Heavy Industries & Public Enterprises, the Department of Disinvestment under the Ministry of Finance, the Department of Fertilizers under the Ministry of Chemicals & Fertilizers, the Ministry of Food Processing Industries, and the Ministry of Micro, Small and Medium Enterprises. The departments will be responsible for uploading the datasets directly, for which NIC is helping them by providing training and technical help. It is conducting a workshop on 5th October focused on this.

India, however, is not a part of the Open Government Partnership, a consortium of more than 50 countries. The initiative was started by nine countries, including India, but India withdrew just before launch. India was reportedly “concerned about the Independent Review Mechanism” which opens participating countries to reviews by outsiders.

However, India has been supporting the open government community by helping create what is called an open government platform, an easy-to-use toolbox that allows smaller countries to set up similar portals without worrying about technical challenges. The platform was launched last March.

The launch of data.gov.in marks a new chapter in governance. It is a pity that it has got almost no mention in the media, especially when corruption and the role of public institutions are being debated so intensely.

In the US, the opening up of government data to the public has seen innovative applications being created by third-party organizations using the data (maximizing use).

Many say a more laudable goal in India would be the avoidance of duplication of efforts and resources in collecting data. This, however, is a lofty expectation to have from a transparency initiative like this, because the reasons for this duplication are not lack of availability or knowledge but personal ego battles and/or lack of coordination between departments. A recent example is the tussle between the Home Ministry and the UIDAI on collecting data for the National Population Register and Aadhaar.

But the private sector, academic researchers, and NGOs/advocacy groups can surely benefit from getting easy and timely access to government-collected data. With analytics and data visualization becoming the hottest areas in technology, an initiative like this could not have been more timely.

Filed under New Governance, Policy & Regulation

Of Numbers, Business Journalism and the Emerging World of Data Journalism

Yesterday was my last day at CyberMedia India Ltd—an organization where I served in various capacities for the last 18 years. This also happened to be—going by my current plans for the future—my last day as a journalist; business journalist, as I never forget to emphasize.

I thought this was an apt time to share what I believe are the most essential requirements for a business journalist. It would not be new to people who have worked closely with me: my juniors for sure, but many of my peers and seniors too. I have often preached two simple mantras to freshers. Many may and do disagree with me on this, and that is fine with me, but I dare anyone to point out even one instance when I have been unfaithful to these mantras!

Those mantras would sound astoundingly simple to you. In fact, I believe they really are. And here they go…

  1. Never underestimate the value of numbers
  2. Never overestimate the value of numbers

That is common sense, na?

Yes, it is.

But unfortunately, I have encountered so many youngsters who believe they can “stay away” from numbers and still be successful business journalists. They believe I am some sort of a fundamentalist to insist on something so mundane. I take the criticism with all humility but would still stand by my assertion. And the fact that I chose to highlight this as the most important requirement as I say goodbye to the field just shows how much importance I attach to these two mantras.

But before that, I must make a clarification. These are not sufficient conditions for becoming a good business journalist. A person who is on top of numbers but is not good at finding stories is fit to be a statistician, not a business journalist. A business journalist should be a journalist first. And a good business journalist should be a good journalist first. A good journalist, I am repeating for the sake of completeness, should have an eye for a story. A good business journalist may often find a story in a set of numbers itself, though that is just one aspect of it.

So, what do I mean? When I say never underestimate the value of numbers, it simply means you must be comfortable dealing with numbers if you want to be a business journalist. You do not need to be an economics or engineering or mathematics graduate, but you must not fear numbers. You need to learn how to read them and they must not repel you. I think there is no other way. In many ways, Mantra 1 is a necessary condition to be a business journalist. It is the beginning of the journey.

Mantra 2, on the other hand, is what would help you transform yourself from an average business journalist into a good one. At first, it sounds as if it contradicts the first mantra, but in essence, it does not. The first mantra just emphasizes the importance of being comfortable with numbers. The second suggests you should not get obsessed with numbers. Practically, it could mean one of two things. One, tell yourself that numbers could often give you a good story, or an idea to pursue, but there are other important sources too. Two, and this is more important, just because you have discovered something by doing some number crunching does not mean the reader is interested in all those numbers. The fact is that most readers do not like a copy that is full of numbers; it must tell a story. But very often, you have got to the story by doing some heavy number crunching in the background. Resist throwing all those numbers into the story. Tell the reader the story, maybe supported by a couple of big numbers. But don’t subject them to all that you have worked out. That is what I mean when I say never overestimate the value of numbers.

If mantra 1 is about starting the journey, mantra 2 is about reaching virtuosity; knowing when to exercise restraint. Sometimes, numbers are just for the input; not for the output.

The reason I chose to highlight this topic is not just my love for numbers. It is the increasing relevance of this skill on the part of a journalist (and not just business journalists) in a world that is going through a data revolution, driven primarily by a movement towards transparency. Releasing of data by governments, such as the US government’s data.gov and similar initiatives the world over, is becoming mainstream. Apart from governments, international organizations and business organizations too are releasing huge raw datasets into the public domain. These datasets are treasure troves as far as spotting trends is concerned. And that is what good journalists have done traditionally: be the first out with a trend. These datasets provide a great opportunity for journalists to analyze them and come out with interesting stories, so much so that a new term, data journalism, is now coming into vogue.

Wikipedia calls it data-driven journalism and defines it thus:

Data-driven journalism is a journalistic process based on analyzing and filtering large data sets for the purpose of creating a new story. Data-driven journalism deals with open data that is freely available online and analyzed with open source tools. Data-driven journalism strives to reach new levels of service for the public, helping consumers, managers, politicians to understand patterns and make decisions based on the findings. As such, data driven journalism might help to put journalists into a role relevant for society in a new way.

I, however, find The Guardian’s explanation to be far more relevant and simple.

My major disagreement with the Wikipedia definition is this: while I do believe that open data will revolutionize the way data journalism is handled, I would not like to include it in the definition of data journalism. Journalism should not concern itself with the nature of the source of that data. Even if it is not open data, it should still be called data journalism. Having said that, open data, because of its sheer volume and openness (the fact that it is available to all), will make a huge impact on how data journalism evolves.

I would also recommend this piece in The Guardian, which is an extract from the Data Journalism Handbook. While you go through that yourself, I would like to reproduce extracts from its first tip and last tip.

The best tip for handling data is to enjoy yourself. Data can appear forbidding. But allow it to intimidate you and you’ll get nowhere. Treat it as something to play with and explore and it will often yield secrets and stories with surprising ease. So handle it simply as you’d handle other evidence, without fear or favour.

…….

The best questions are the old ones: is that really a big number? Where did it come from? Are you sure it counts what you think it counts? These are generally just prompts to think around the data, the stuff at the edges that got squeezed by looking at a single number, the real-life complications, the wide range of other potential comparisons over time, group or geography; in short, context.

As it is, data journalism is not a new concept. All business journalists (and other journalists) would have done it in some way or other, in the earnings season, for example.

As far as I am concerned, I have done many big stories based purely on analysis of data. That is why some of the international multilateral organizations as well as bodies like RBI are my regular stops. They often release data that reveal exciting stories if you look for them. In fact, I have even managed to earn a name for such stories from many who term them armchair stories. Honestly, I did not know the term data journalism while doing those.

For those who call them armchair stories, I have just one more piece of news. I have gone a step further. In recent days, I have focused on what one could well term lazy man’s data journalism. Many of my tweets are actually based on data “hunted” from various sources (a minister’s answer to a question in parliament, an RBI Governor’s speech or the GITR report by WEF), often without any analysis on my part. But what lets me select a few from so much that is around is that I know what is a little surprising, counterintuitive, or plain interesting. That does not come from my comfort with numbers. That comes from my familiarity with the area of ICT/public services. I cannot do the same in, say, biotechnology.

I have found those tweets to be the most retweeted. So, there must be something interesting in them.

Filed under Media, Technology & Society

Open Government Platform: Beginning of A Great Journey

In the next few hours, the Union Minister for Communications & IT, Kapil Sibal, is expected to announce the launch of the Open Government Platform, in the presence of some representatives from the US government. This will be the first major announcement after the cabinet approved the National Data Sharing and Accessibility Policy (NDSAP) 2012 last month.

The idea of open governance, spearheaded by the US under the then Federal CIO Vivek Kundra, has been gaining popularity the world over. The Open Government Partnership is a multilateral initiative that aims to “secure concrete commitments from governments to promote transparency, empower citizens, fight corruption, and harness new technologies to strengthen governance.”

The Open Government Partnership as a global partnership is not too old; it started just about six months back. Formally launched on 20 September 2011 with an initial declaration by eight countries (Brazil, Indonesia, Mexico, Norway, Philippines, South Africa, United Kingdom, United States), the partnership now has 53 member countries, including the original eight.

With its time-honored policy of under-commitment, India is yet to formally join the partnership but is working with the US government on open access to data. To become a member of OGP, participating countries must embrace a high-level Open Government Declaration; deliver a country action plan “developed with public consultation”; and commit to independent reporting on their progress going forward.

It may be noted here that publishing data collected by government is just one—though, at present, arguably the most important—aspect of the move towards this openness.

The Platform

While the move towards open government actually began with President Obama signing the Memorandum on Transparency and Open Government on Day One of assuming office, it was with the appointment of Vivek Kundra as the Federal CIO that the real momentum started. In May 2009, barely two months after his appointment in March, Kundra launched the Data.gov platform for providing public access to raw datasets generated by the Executive Branch of the Federal Government in order to enable public participation and private sector innovation. It drew from the DC Data Catalog launched by Kundra when he was CTO of Washington, D.C., where he published vast amounts of datasets for public use.

Though open government is a broader objective and is not just about releasing raw government data, this was nevertheless considered a major step, as the public availability of these datasets would not only help in transparency and openness but also allow anyone who wishes to do so (companies, individuals, NGOs) to create innovative applications using these data. And it actually did.

But when Kundra announced his resignation in June last year, there was a lot of apprehension about whether the open government movement would lose its momentum. Many believed Kundra’s resignation was because of a drastic cut in funding for the e-government initiatives that he had undertaken. In a column titled The Death of Open Government in The Washington Post, a renowned technologist, academician and commentator was drastic in his observation:

But, with Kundra gone, I am not optimistic about the program. Whenever a program loses its key evangelist, it normally dies. The Open Government Initiative is likely to suffer a slow, inevitable death.

But nevertheless the progress continued.

And when there is something around IT, can India be kept out of it? When the US government started to look at open sourcing the data.gov platform, India, the land of techies, was of course the first stop. And this began around August, even before the Open Government Partnership was announced. India was not to be a member of that; it still isn’t. But when it comes to tech work, the world’s most business-savvy nation surely knew where to turn.

In December, it was publicly announced that India and the US were working together to create a platform called data.gov-in-a-box, an open source platform that would help governments globally to produce their own version of data.gov. This is what the data.gov site said at that time.

Among the actions in the U.S. National Action Plan announced by President Obama is an effort under the U.S.-India Strategic Dialogue to produce “Data.gov-in-a-Box,” an open source version of the United States’ Data.gov data portal and India’s India.gov.in document portal. The U.S. and India are working together to produce an open source version available for implementation by countries globally, encouraging governments around the world to stand up open data sites that promote transparency, improve citizen engagement, and engage application developers in continuously improving these efforts. Technical teams from the governments of the U.S. and India have been working together since August of this year, with a planned launch of a complete open source product (which is now called the Open Government Platform (OGPL) to reflect its broad scope) in early 2012.

Today is the day when the formal announcement about the platform is likely to be made by the Indian IT minister.

All the best for the journey together of two great nations, which are not just the most influential democracies in the world but are also the most competent when it comes to IT. And nothing marries democracy and technology like this initiative does. It hands governments around the world the tool to be transparent on a platter.

Additional Note: This should also convince critics of outsourcing to India (many within the Obama administration itself) that companies that seek Indian help in IT do not do that just because it is low cost.

Filed under Digital Economy, New Governance, Policy & Regulation, Technology & Society