Tag Archives: data.gov.in

Open Data: Looking Beyond the Apps

The open data movement has surely gathered momentum across the world. Taking a cue from the United States, which launched its data.gov open data sharing site in May 2009, many national governments have taken similar initiatives to create their own open data sites. These include developed countries such as Australia, Canada, France, Germany, Italy, the Netherlands and the UK, as well as emerging nations such as Brazil, India, Indonesia and Russia. Most of these sites were launched between 2010 and 2012.

India launched its open data site, data.gov.in, in September 2012. After a slow start, it has now picked up momentum and today offers more than 2,500 datasets. A dataset is a table of data on a particular area. It could be as large as all the crop production in the country, crop-wise and district-wise, for the last 30 years, or as narrow as exports of a particular item to different global regions in a single year.
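To make the idea of a dataset concrete, here is a minimal sketch in Python of what a machine-readable table of this kind might look like and how it could be summarized. The column names, districts and production figures are all made up for illustration; they are not taken from data.gov.in, and a real dataset on the portal may be laid out quite differently.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows: the districts, crops and figures are invented
# for illustration and do not come from any real government dataset.
SAMPLE_CSV = """district,crop,year,production_tonnes
District A,Rice,2010,1200
District A,Wheat,2010,800
District B,Rice,2010,950
District B,Rice,2011,1010
"""


def total_production_by_crop(csv_text):
    """Sum the production_tonnes column for each crop in the CSV text."""
    totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["crop"]] += int(row["production_tonnes"])
    return dict(totals)


totals = total_production_by_crop(SAMPLE_CSV)
print(totals)
```

The same few lines would work on a table with thousands of rows, which is the practical appeal of publishing data as a plain table rather than as a finished report.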

The US opened up government data as part of President Obama’s open governance promise, while the first Federal CIO, Vivek Kundra, the person behind implementing the initiative, called upon individuals, groups and commercial companies to make use of open data to build innovative apps that would solve citizens’ problems. Kundra consistently championed building apps and even prophesied that in the coming years, there would be an “explosion of apps” based on open data.

Since then, these two attributes—transparency and citizen apps—have become the de facto objectives of government open data initiatives across the world. While the developed world has taken to both these objectives, the emerging countries have focused more on the citizen app side, for obvious reasons. Transparency is a very lofty objective to achieve in these countries just by releasing some datasets, when other governance frameworks are not ready.

While both these are worthy expectations to have from government open data initiatives, what is a little worrying is that these objectives have come to define open data priorities and policies in many countries.

Take the apps expectation, for example. Globally, the role of creating apps from open data has been so overemphasized that many governments try to measure the effectiveness of their open data programs by the number of apps developed on the data made available. That is a completely misplaced expectation, for two reasons. One, data can help better citizens’ lives in many ways beyond apps. Two, it is difficult for governments to track all the apps created. Look at the US data.gov site. Though there are more than 75,000 datasets, there are only around 350 citizen-developed apps shared on the site.

Apart from misplaced expectations (and disappointments because of not meeting those expectations), the apps expectation has also resulted in misplaced priorities and policies governing open data.

Here are some of the skewed policies governments have followed because of the overemphasis on the apps part of open data.

Not measuring the efficiency accrued to the economy. Open data initiatives put important government information in the public domain, easily accessible to all. Very often, similar information is collected separately by various others (academic researchers, commercial organizations, other government bodies and agencies) for their own requirements, thus duplicating the effort. In other words, it is an inefficient use of time and resources.

By eliminating, or at least minimizing, the need to duplicate that effort, open data makes the whole economy far more efficient. This gain is difficult to measure in the short run but can be measured over a longer period. I have never heard any open data evangelist talk about this anywhere.

Further, if governments realize this, they could cooperate with the other stakeholders so that data collection and processing are optimized to meet the requirements of more of them. In future, the cost could even be shared. This can lead to far more efficient collection and processing of basic information and even enhance data quality.

Limited outreach. The overemphasis on the apps aspect has created a misplaced priority in terms of outreach. The outreach programs of governments in most countries are directed at the tech/app builder community, with some tech-savvy NGOs/advocacy groups joining in. The entire open data discussion is restricted to these three communities: government, developers, and NGOs/advocacy groups. Many major stakeholders, such as media, market researchers and academic researchers, who could play an important role in showing the latent value that lies in open data, are today left out. Even if they do show an interest, they often get scared away by the technical lingo that dominates these discussions. That is a loss for the cause of open data.

In an online conversation hosted by The World Bank on Open Data for Poverty Alleviation, I raised this point. Tim Davies of Practical Participation agreed and had this to say:

I think there is often a failure in open data capacity building to think about the consultants, analysts, researchers and so-on who might be engaged as users of data, and who will provide bespoke value added services on top of it (hopefully realizing social as well as economic value).

Restrictive data formats. Many government agencies implementing open data in their countries focus all their attention on obtaining or creating datasets in machine readable format, a direct result of working backwards from apps. While a lot of time and energy is spent on conversion and cleaning, many good, structured datasets that are not in machine readable format never make it to the list of published datasets. That is a big loss.

True, machine readable formats do make life easier for everyone, but ignoring human readable formats is the other extreme. Open data is not defined by any format. The implementers of data portals should perhaps take a middle path, one that encourages machine readable formats but does not completely leave out human readable formats such as PDF.

Too much emphasis on datasets in consumer interest areas. The overemphasis on citizen apps puts undue pressure on the managers of data portals to obtain more and more datasets that are directly of interest to end consumers and hence good data to build apps on. So, while a hospital list or a crime info dataset is cheered, a crop production or exports dataset is often dismissed as “useless information dumped by government.” While it’s true that data of consumer interest can be used to create apps instantly, data on agriculture and meteorology, when analyzed by experts using the right tools, can have a far broader and longer-term impact on the lives of millions of citizens. These analyses could help in maximizing agricultural production, avoiding big disasters, imparting the right skills to unemployed youth and so on, even if they are not packaged as sleek apps.

Slowly but surely, the constraints of associating open data too closely with apps and pre-designed visualizations are being realized. Mike Gurstein, a leading voice on open data, argued this on his blog:

But why shouldn’t we think of “open data” as a “service” where the open data rather than being characterized by its “thingness” or its unchangeable quality as a “product”, can be understood as an on-going interactive and iterative process of co-creation between the data supplier and the end-user; where the outcome is as much determined by the needs and interests of the user as by the resources and pre-existing expectations of the data provider?

Though Gurstein’s explicit question is about the rationality of deciding outcomes by the pre-existing expectations of the data provider, the logic can be extended to ask why outcomes should be based on the pre-existing expectations of the apps providers. In most cases, the apps providers do not have much extra insight into the end users’ needs.

In the end, it must be pointed out that open data is about making information work for the betterment of society: making citizens’ lives more convenient, creating the basis for decisions at a macro-economic level, making the economy and business ecosystem more efficient, and yes, minimizing risk. It is not about technology; technology is a very handy tool, though.


Filed under Open Data, Policy & Regulation, Technology & Society

Officially Open: India Launches Open Data Beta Site

India has finally launched the beta version of its open data site, data.gov.in. The site is part of the country’s plan to provide open and transparent access to data collected by various government departments and agencies, as outlined in the recently formulated National Data Sharing and Accessibility Policy – 2012 (NDSAP-12).

The stated advantages, as envisioned by the policy, include maximization of use of data, avoidance/minimization of duplication of efforts on collection, facilitating integration by leading to common standards, providing ownership information, faster and better decision making, and of course, equitable access to information by all citizens.

Ever since the then Federal CIO of the United States started data.gov in May 2009, many countries have launched similar sites. India is the latest country to join the bandwagon.

The data.gov.in site has debuted with 13 raw datasets provided by seven departments and four apps provided by four departments. As part of the plan, data management offices are being created in each of the departments, each to be headed by a senior official called a data controller. Five ministries/departments have already identified their data controllers, whose names are available on the site. These are the Department of Public Enterprises under the Ministry of Heavy Industries & Public Enterprises, the Department of Disinvestment under the Ministry of Finance, the Department of Fertilizers under the Ministry of Chemicals & Fertilizers, the Ministry of Food Processing Industries, and the Ministry of Micro, Small and Medium Enterprises. The departments will be responsible for uploading the datasets directly, for which NIC is helping them by providing training and technical help. It is conducting a workshop on 5th October focused on this.

India, however, is not a part of the Open Government Partnership, a consortium of more than 50 countries. The initiative was started by nine countries, including India, but India withdrew just before launch. India was reportedly “concerned about the Independent Review Mechanism”, which opens participating countries to reviews by outsiders.

However, India has been supporting the open government community by helping create what is called an open government platform: an easy-to-use toolbox that allows smaller countries to set up similar portals without worrying about technical challenges. The platform was launched last March.

The launch of data.gov.in marks a new chapter in governance. It is a pity that it has received almost no mention in the media, especially when corruption and the role of public institutions are being debated so intensely.

In the US, the opening up of government data to the public has seen innovative applications being created by third party organizations using the data (maximizing use).

Many say a more laudable goal in India would be the avoidance of duplication of efforts and resources in collecting data. This, however, is a lofty expectation to have from a transparency initiative like this, because the duplication stems not from lack of availability or knowledge but from personal ego battles and/or lack of coordination between departments. A recent example is the tussle between the Home Ministry and the UIDAI over collecting data for the National Population Register and Aadhaar.

But the private sector, academic researchers, and NGOs/advocacy groups can surely benefit from getting easy and timely access to government-collected data. With analytics and data visualization becoming the hottest areas in technology, an initiative like this could not have been more timely.





Filed under New Governance, Policy & Regulation