Of Numbers, Business Journalism and the Emerging World of Data Journalism

Yesterday was my last day at CyberMedia India Ltd—an organization where I served in various capacities for the last 18 years. This also happened to be—going by my current plans for the future—my last day as a journalist; business journalist, as I never forget to emphasize.

I thought this was an apt time to share what I believe are the most essential requirements for a business journalist. It will not be new to people who have worked closely with me: my juniors for sure, but many of my peers and seniors too. I have often preached two simple mantras to freshers. Many may and do disagree with me on this, and that is fine with me, but I dare anyone to point out even one instance when I have been unfaithful to these mantras!

Those mantras would sound astoundingly simple to you. In fact, I believe they really are. And here they go…

  1. Never underestimate the value of numbers
  2. Never overestimate the value of numbers

That is common sense, na?

Yes, it is.

But unfortunately, I have encountered so many youngsters who believe they can “stay away” from numbers and still be successful business journalists. They believe I am some sort of a fundamentalist to insist on something so mundane. I take the criticism with all humility but still stand by my assertion. And the fact that I chose to highlight this as the most important requirement as I say goodbye to the field just shows how much importance I attach to these two mantras.

But before that, I must make a clarification. These are not sufficient conditions for becoming good business journalists. A person who is on top of numbers but is not good at finding stories is good to be a statistician, not a business journalist. A business journalist should be a journalist first. And a good business journalist should be a good journalist first. A good journalist, I am repeating for the sake of completeness, should have an eye for a story. A good business journalist, often, may find a story in a set of numbers itself, though that is just one aspect of it.

So, what do I mean? When I say never underestimate the value of numbers, it simply means you must be comfortable dealing with numbers if you want to be a business journalist. You do not need to be an economics or engineering or mathematics graduate, but you must not fear numbers. You need to learn how to read them and they must not repel you. I think there is no other way. In many ways, Mantra 1 is a necessary condition to be a business journalist. It is the beginning of the journey.

Mantra 2, on the other hand, is what would help you transform yourself from an average business journalist into a good one. At first, it sounds as if it contradicts the first mantra, but in essence, it does not. The first mantra just emphasizes the importance of being comfortable with numbers. The second suggests you should not get obsessed with numbers. Practically, it could mean one of two things. One, tell yourself that numbers can often give you a good story, or an idea to pursue, but there are other important sources too. Two (and this is more important), just because you have discovered something by doing some number crunching does not mean the reader is interested in all those numbers. The fact is that most readers do not like copy that is full of numbers; it must tell a story. But very often, you have got to the story by doing some heavy number crunching in the background. Resist throwing all those numbers into the story. Tell the reader the story, maybe supported by a couple of big numbers. But don’t subject the reader to all that you have worked out. That is what I mean when I say never overestimate the value of numbers.

If mantra 1 is about starting the journey, mantra 2 is about reaching virtuosity; knowing when to exercise restraint. Sometimes, numbers are just for the input; not for the output.

The reason I chose to highlight this topic is not just my love for numbers. It is the increasing relevance of this skill for a journalist (and not just business journalists) in a world that is going through a data revolution, driven primarily by a movement towards transparency. Releasing of data by governments, such as the US government’s and similar initiatives the world over, is becoming mainstream. Apart from governments, international organizations and business organizations too are releasing huge raw datasets to the public domain. These datasets are invaluable treasures as far as spotting trends is concerned. And that is what good journalists have traditionally done: be the first out with a trend. These datasets give journalists a great opportunity to analyze and come out with interesting stories. So much so that a new term, data journalism, is now coming into vogue.

Wikipedia calls it data-driven journalism and defines it thus:

Data-driven journalism is a journalistic process based on analyzing and filtering large data sets for the purpose of creating a new story. Data-driven journalism deals with open data that is freely available online and analyzed with open source tools. Data-driven journalism strives to reach new levels of service for the public, helping consumers, managers, politicians to understand patterns and make decisions based on the findings. As such, data driven journalism might help to put journalists into a role relevant for society in a new way.

I, however, find the Guardian’s explanation to be far simpler and more relevant.

My major disagreement with the Wikipedia definition is this: while I do believe that open data will revolutionize the way data journalism is handled, I would not like to include it in the definition of data journalism. Journalism should not concern itself with the nature of the source of that data. Even if it is not open data, it should still be called data journalism. Having said that, open data, because of its sheer volume and openness (the fact that it is available to all), will make a huge impact on how data journalism evolves.

I would also recommend this piece by the Guardian, which is an extract from its Data Journalism Handbook. While you go through that yourself, I would like to reproduce extracts from its first and last tips.

The best tip for handling data is to enjoy yourself. Data can appear forbidding. But allow it to intimidate you and you’ll get nowhere. Treat it as something to play with and explore and it will often yield secrets and stories with surprising ease. So handle it simply as you’d handle other evidence, without fear or favour.



The best questions are the old ones: is that really a big number? Where did it come from? Are you sure it counts what you think it counts? These are generally just prompts to think around the data, the stuff at the edges that got squeezed by looking at a single number, the real-life complications, the wide range of other potential comparisons over time, group or geography; in short, context.

As it is, data journalism is not a new concept. All business journalists (and other journalists) would have done it in some way or other, in the earnings season, for example.

As far as I am concerned, I have done many big stories based purely on analysis of data. That is why some of the international multilateral organizations as well as bodies like RBI are my regular stops. They often release data that reveal exciting stories if you look for them. In fact, I have even managed to earn a name for such stories from many who term them armchair stories. Honestly, I did not know the term data journalism while doing those.

For those who call them armchair stories, I have just one more piece of news. I have gone a step ahead. In recent days, I have focused on what one could well term lazy man’s data journalism. Many of my tweets are actually based on those “hunted” data from various sources (minister’s answer to a question in parliament, RBI Governor’s speech or GITR report by WEF) often without any analysis on my part. But what makes me select a few from so much that is around is that I know what is a little surprising, counter intuitive, or plain interesting. That does not come from my comfort with numbers. That comes from my familiarity with the area of ICT/public services. I cannot do the same in say, biotechnology.

I have found those tweets to be the most retweeted. So, there must be something interesting in them.

Crackdown on Illegal Music Sites: The Solution is Not So Simple

In a well-coordinated move, the Indian Music Industry (IMI), a consortium of more than 100 music companies, recently managed to get an order from the Calcutta High Court directing ISPs in India to block 104 music sites on charges of piracy. Some of these sites are extremely popular destinations for music lovers. Medianama, one of the top websites focusing on business related to digital media and entertainment, said that the IMI had made a case against each website, quoting Apurv Nagpal, CEO of Saregama, one of India’s largest and oldest music companies. Medianama further said that the court orders were obtained on different dates and the first order was against

That first order was widely reported in the media and we had even discussed it in our editorial meeting at Dataquest. But I came to know about the blocking of the other sites only in the third week of March when, while searching for the lyricist of a 60s Hindi film song, I clicked on a Google link and found a message that the site had been blocked on orders from DoT. It was only when I ran a couple more queries that I saw a few write-ups (none in the traditional media) about the sites being blocked because of the orders from the Calcutta High Court. Medianama even gave a list of all the sites. I found that many of the sites that I often visited to find or confirm information about songs (especially the year of a film, the lyricist, etc.) were on the list. Most of them are music streaming sites.

According to IMI, these are illegal sites, while there are a few sites that have legally obtained a licence to stream music. The average user of the sites, however, has no way of knowing which one is legal and which one is not. Most of the people I know who use these sites are heavy purchasers of legal music. When I asked a few of them, most said they choose these sites because of ease of navigation and look and feel. I agree with that but have one more parameter: accuracy of information about songs. That is because there is little to choose between one site and another when it comes to the quality of sound or speed. The Saregama site scores heavily on the accuracy-of-information front, while it is poor when it comes to presentation and quite often does not work. Flipkart’s Flyte, though much better in terms of presentation and navigation, has quite a few mistakes in its information on songs, one common and frequent error being the combining of films of the same name (one released in the 40s and another in the 90s, for example) into a single album.

So, while feeling good about the success of the anti-piracy moves, I was a little sad that these sites, to which I often turned for a quick check of information, would not be accessible any more. But as feared by many analysts and legal experts, they resurfaced under different names. So, while the music industry may have won a battle (and that too only partially, given the resurfacing of some of the sites), the war is still far from over.

But what is this war all about? On the face of it, it is about piracy and loss of revenue to the music industry. From a moral and legal point of view, the IMI action looks justified. But when you look at it practically, it is bound to fail for two reasons. One, well discussed by many bloggers, is technical: it is virtually impossible to completely ban sites. In any case, restricting through ISPs would work only in India.

But the other reason, and I think it is far more important, is that the music industry is not yet prepared to embrace the change that would actually give it back the power. We have come to a situation like this because the music industry has been lax in moving with the times. People’s unwillingness to pay is only part of the reason for piracy. An equally strong reason is access to music. In my school and college days, for example, there was virtually no way to “get a song” without “recording it (read piracy)” till Gulshan Kumar exploited a loophole in the law to re-record many of yesteryear’s hits in newer singers’ voices and offer an alternative. And even though these songs stood nowhere in comparison to the originals, people lapped them up because they were affordable and, more importantly, widely available. In fact, many people in my generation might have first listened to a song in Babla Mehta’s voice before listening to the Mukesh original! Kumar created a few star singers such as Kumar Sanu and Sonu Nigam in the process, and brought about the first big change in the industry.

While Kumar’s method and today’s illegal websites’ methods vary in terms of their legal status, their basic raison d’être is the same. T-Series under Gulshan Kumar and many sites of today were created to make music reach people in a music-hungry nation in an easier, friendlier and cheaper manner.

Today, the users of those sites, if asked to pay some money, could actually end up paying, provided the pricing is right and paying is trouble-free. After all, they have been paying for things like caller tunes, amounts that are often 20-30% of their monthly spend on mobile!

My argument is not meant to justify illegal streaming, but to point out that the music industry is as much responsible for the problem as anyone else. And it cannot fight the disease by trying to cure the symptom.

A look at the table here tells the story. The data is from Google AdPlanner and may not be 100% accurate. But even if you allow for a 30% error margin, you get the point. Why should an obscure site get millions of pageviews while India’s best known music brand, which also has a vast collection available on its site for download, can muster only a couple of hundred thousand? Yes, the fact that they are free could be a big reason; but you will be fooling yourself if you argue that it is the only reason.

And yes, these traffic figures are for March 2012, which, for the blocked sites, are a mere fraction of what they used to get before the ban. As one can see, the loss of these sites has translated to gain for some legal streaming sites and not for

Traffic: Music Sites in India

SITE  TYPE             UV (India)  PV (India)  COMMENT
      Illegal/blocked  5.6M        23M         Dropped by almost 2/3rd between Jan-Mar
      Illegal/blocked  830K        8.3M        Dropped significantly between Jan-Mar
      Illegal/blocked  570K        2.2M        Dropped significantly between Jan-Mar
      Illegal/blocked  320K        3.8M        Dropped by almost 3/4th between Jan-Mar
      Legal            680K        2.6M        No major gain between Jan-Mar
      Legal            2.9M        16M         Significantly moved up between Jan-Mar
      Legal            2.2M        9.8M        No major gain between Jan-Mar
      Legal            1.1M        70M         Significantly moved up between Jan-Mar
      Legal            1.6M        7.5M        Actually dropped between Jan-Mar
      Music Label      130K        230K        No major gain between Jan-Mar

Source: DoubleClick AdPlanner by Google. All figures for March 2012 and for India traffic. K stands for thousands and M for millions. UV: Unique visitors. PV: pageviews
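For readers who want to check the "even with a 30% error margin" argument themselves, here is a minimal sketch in Python. It assumes the worst case for the comparison: the busiest blocked site's pageviews are overstated by 30% and the music label's are understated by 30%. The two figures are taken from the first and last rows of the table; the variable and function names are only illustrative.

```python
# Pageview figures copied from the table (India traffic, March 2012).
# The names are descriptive placeholders, not actual site names.
TOP_BLOCKED_SITE_PV = "23M"   # first "Illegal/blocked" row
MUSIC_LABEL_PV = "230K"       # "Music Label" row
ERROR_MARGIN = 0.30           # the 30% margin mentioned in the text


def parse_count(text):
    """Convert a figure such as '5.6M' or '830K' into a whole number."""
    multipliers = {"K": 1_000, "M": 1_000_000}
    return round(float(text[:-1]) * multipliers[text[-1]])


def worst_case_ratio(big, small, margin):
    """Ratio of big to small if big is overstated and small understated by margin."""
    return (big * (1 - margin)) / (small * (1 + margin))


blocked_pv = parse_count(TOP_BLOCKED_SITE_PV)
label_pv = parse_count(MUSIC_LABEL_PV)

nominal_ratio = blocked_pv / label_pv
pessimistic_ratio = worst_case_ratio(blocked_pv, label_pv, ERROR_MARGIN)

print(f"Ratio as reported: about {nominal_ratio:.0f} to 1")
print(f"Ratio in the worst case: about {pessimistic_ratio:.0f} to 1")
```

Even under this pessimistic reading, the blocked site still draws more than 50 times the pageviews of the label's own site, so the gap cannot be explained away by measurement error.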

Most music companies believe that they can continue to do what they have been doing so far—recording the music, owning the copyright, and revenue should come to them automatically, even from newer channels. Legally, it is a valid stance. Practically, it is not.

So, what is the solution? It surely is not rocket science. Most of them know the answer; it lies in mainstreaming these sites and not excluding them. Medianama has carried an interview with the Saregama CEO, who admitted as much.

We don’t want these sites to be shut down, we want them to pay a license fee and flourish as a business. There are legitimate businesses in operation too. The scope is there, and we want these sites to be legal.

But they must act. What is needed is the right approach: right and transparent pricing. In another story, Medianama said that IMI was unwilling to share pricing. While sharing exact pricing may be tough, it should reach out with a rough idea, because many of these sites are run by young kids in their 20s. They will not come running to get into sophisticated discussions.

It is not really lack of intention that is the problem with the industry. It is the discomfort with disruptive changes. Take Saregama, for example. It takes one step at a time. As a buyer of legal music all through, I have tried everything and can say with some authority how it has evolved. First came hamaracd, but not with mp3. So, you could get around 10-12 tracks for Rs 300 or so. It would not work half the time. Then came their current website, with a provision to download mp3s for a price. Then came a set of MP3 CDs (really beautiful compilations of old Indian light music: film songs, bhajans, ghazals) priced at Rs 75 for 40 songs. Almost all of them are gems. But try to look for them in any big store, be it Landmark, MusicWorld or Planet M, and you will never find many of those titles. The company site is silent about this series. Then came Flipkart’s Flyte, which made downloads far easier and friendlier. Yet, unlike books, music is a mass-market product, and e-commerce with credit card or online banking is still pretty much out of reach for many. Not surprisingly, cash on delivery has been the preferred mode for most e-commerce buyers in India. That is not an option for downloads.

With always-connected devices, the future is clearly streaming. My own experience says that I do not listen to 80% of the music that I buy more than 2-3 times. So, I will not mind if I can pay a very small price per listen. That requires a completely different kind of pricing. So, any song that costs Rs 6 at Flipkart Flyte should probably cost no more than 30 paise for listening once. This is not a suggestion based on any calculations, but just an illustration. The actual calculations may show even more dramatic pricing. What I want to point out is that it requires disruptive thinking.
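To make the illustration concrete, here is a small sketch in Python of the arithmetic behind it. It uses only the two illustrative prices mentioned above (Rs 6 per download, 30 paise per listen); the function names are made up for this example, and the prices are not a proposal.

```python
# Illustrative prices from the text, kept in paise to avoid rounding issues.
DOWNLOAD_PRICE_PAISE = 600    # Rs 6 per downloaded song
PER_LISTEN_PRICE_PAISE = 30   # 30 paise per single listen


def streaming_cost_paise(listens):
    """Total cost of listening to one song the given number of times."""
    return listens * PER_LISTEN_PRICE_PAISE


def break_even_listens():
    """Number of listens at which streaming costs as much as one download."""
    return DOWNLOAD_PRICE_PAISE // PER_LISTEN_PRICE_PAISE


cost_for_three = streaming_cost_paise(3)
print(f"Three listens cost {cost_for_three} paise")
print(f"Streaming matches the download price after {break_even_listens()} listens")
```

At these prices, a song heard two or three times costs the listener under a rupee, and streaming becomes as expensive as the download only at the twentieth listen. That is why pay-per-listen suits someone who rarely returns to most of what he buys.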

But I must reiterate the point I made earlier. The bigger issue is ease of paying and not pricing. Even if it is 30 paise, a user with no credit card or online banking can do precious little. If, on the other hand, the payment is through, say, a mobile, it is absolutely possible to target a much bigger base of users. It can be really simple. An SMS goes out with a code. Once the user enters the code, he can stream or download the music and the amount gets debited from his mobile balance. Yes, it requires talking to a couple of players (an operator, an independent payment gateway, etc.) but it is not impossible. And I am not stupid enough to believe that these ideas are original to me and have not occurred to the bright guys who run the music business. Or, for that matter, that this is the only way it can be done!

The problem is not lack of ideas; it is not even lack of intention. It is just the lack of a strong will to disrupt a model that has been in place for so long. If the music industry does not do it, someone else will. Apple has already done it to a great extent, creating value for itself but making the music companies a little richer. But as Apple without Jobs is beginning to face the possibility of an anti-trust trial in the case of e-books, the closed model is being threatened.

As of now, the illegal websites may be getting a few ads, which help them sustain the business. But if they have to stay in this business, they will have to charge the consumers or get targeted ads. These sites have to be convinced that they have to walk halfway. The music industry must walk the other half. But as the big boys, the onus is on the music industry to drive the change. Else, change will just happen to them.


The Naive and the Sentimental Journalist

For a long time, I have been thinking about writing this piece for my other blog, Nothing to Declare, where I write mostly on music, books, and culture. Though in terms of content it belongs more to that blog, I did not put it there because of a promise that I had made to myself when I started that blog: that I would not write pure thoughts and reflections there but would highlight less discussed aspects of the above-mentioned areas, with information and facts. Something that would help the readers to research further and add to our collective knowledge. This piece, as you will agree, does not exactly satisfy that requirement.

What do I mean by The Naïve and the Sentimental Journalist? Many of you would recognize that the beautiful headline is lifted and paraphrased slightly from the title of the Nobel Prize winning writer Orhan Pamuk’s book, The Naïve and the Sentimental Novelist. Pamuk himself borrows the words from a famous 18th century German essay by Friedrich Schiller, called Über naive und sentimentalische Dichtung (On Naïve and Sentimental Poetry). “The word Sentimentalisch in German used by Schiller to describe,” Pamuk explains, “the thoughtful, troubled modern poet who has lost his childlike character and naïveté, is somewhat different in meaning from the word sentimental, its counterpart in English.” Adds Pamuk, “Schiller uses the word Sentimentalisch to describe the state of mind which has strayed from nature’s simplicity and power and has become too caught up in its own emotions and thoughts.” The sentimental poet, according to Schiller, is exceedingly aware of the poem he writes, the methods and techniques he uses and the artifice involved in his endeavor.

The underlining is mine. Actually, it is this definition of the so-called sentimental—now that we have understood what it really means—poet that I apply to what I would like to call the sentimental journalist. If the English language meaning of that word bothers you too much, you can call him by any name, as long as you appreciate what I am saying: the journalist who is extremely concerned about how his work would be received and is exceedingly aware of the tools and techniques that he uses to influence that.

A journalist, of course, is no poet. And unlike the poet of nature, he cannot be naïve and innocent. But here, I would urge the readers to equate, in their minds, the fundamental beliefs and values in journalism to nature. And the journalists I refer to are the ones who are straying, or are expected to stray, from these beliefs and values.

There are very few universal beliefs in journalism, though how they are defined may vary. They are complete loyalty to the reader, complete loyalty to the facts rather than to a journalist’s own thoughts of what is right, and simple and uncomplicated writing that does not conceal the facts. Every journalist knows these.

Over the years, there have been tools and techniques to beautify journalistic pieces to make them more appealing, to attract more readers and so on. But all along, the reader always remained supreme, even as what should be done with him—entertain, inform, educate, sensitize…—kept getting debated.

But of late, with some fundamental changes in the way information is reported and consumed, that basic belief is getting threatened. The world, by and large, is celebrating the freedom for common people and the diminishing impact of filters such as traditional media establishments brought about by the advent of the Internet, and there are some very good reasons for that. But this has also led to a dilution of certain basic assumptions about what to expect from media. Credibility is one example of that.

The raison d’être of this piece, however unfashionable or conservative it seems, is to sensitize readers to that credibility crisis and not at all to argue for the old order. The change the Internet has brought about is, by and large, positive, but it is not wise to overlook the issues that are or may be there.

The idea of this occurred to me when I heard a “new media” expert urging the young journalists in a training session to “forget the reader and write for the search engine”. Call me by any name, but I am unable to digest that, even after a few months of hearing it, and what with all the new developments happening in Internet and social media all around.

And his entire session was devoted to how exactly to do that. To be fair to him, he just put it crudely enough to “shock” me. But increasingly, that is what many media organizations are trying to do when they are trying to “reinvent themselves”: teaching young journalists to forget the concept of a reader. I remember one of the editors in my early days of journalism telling us to think about a real person while conceptualizing a feature and to think how and why it would help that person.

We have drifted 180 degrees away when we say you do NOT need to worry about the reader. In fact, you need not bother who he is. What you should worry about is keywords, headlines, how many links you are putting, how you should have your headline—does not matter if it does not reflect what you are saying in the text below.

Pamuk’s (actually Schiller’s) Sentimentalisch novelist/poet is not doing it with an explicit objective that is anything other than appealing to his reader; but the new journalist is doing it to achieve some other objective—to compete, as many would proudly describe it.

