by Sabine Gerdon
Remember the time you …
… had a great idea you could use for a research project
… or thought about an interesting question that if you could answer would revolutionize economics
… or developed an amazing model which you would like to test with real-world data
but you did not have any data
Big Data could be a game changer and impact economic research and policy making as much as it has affected the functioning of modern businesses. It is still more a vision for the future than a project for class, but tools such as web scraping and the increase in openly available data will soon make it a reality. This is about more than just up-to-date data sets with a big sample size. Governments, international organizations and researchers are already discovering the value of Big Data and at the same time arguing about obstacles and downsides. Sabine Gerdon gives us an introduction to Big Data and its use in economics research.
In recent years, we have observed an explosion of data. The rise of information and communication technologies, in particular the growing importance of the internet, and the digitisation of many areas of our everyday life have generated tremendous amounts of data. This exponential growth of data is not likely to slow down anytime soon. This oft-called data revolution will also impact economic research in various ways. Consider the stream of data you create every day: your metro card records what time you caught the train, and you leave digital footprints through your online activities in social networks that tell stories about your interests and friendships. After you visit your doctor, your medical data is stored in a digital format, and more data about your fitness is generated using, for example, a Fitbit or an app that tracks you while you are running. All of these activities feed into what is called the generation of Big Data and offer many opportunities for economic research.
So what exactly do we mean by Big Data and how is it different from traditional data? As the name already suggests, the large volume (number of observations and gathered parameters) of data sets is an important characteristic of Big Data. Due to this volume, traditional processing techniques reach their capacity limits. Apart from volume, Big Data is often associated with other V’s such as velocity, value and variety. Velocity means that, compared to established and costly large data sources such as census data, Big Data is able to provide a richer source of information that is kept constantly updated at nearly zero marginal cost. Further potential of Big Data comes from the fact that data sources and data sets can be increasingly linked and connected. This multivariate matrix of information lays the ground for the high economic and scientific value of the data. Another important aspect that makes Big Data a big deal, and differentiates it from data that has been around for years, is that modern data sets often have a bigger variety of structures than the traditional cross-sectional, time-series or panel data. The structure of the data sets can be very complex. Some data sets do not even have a set structure and thus demand innovation in data cleaning, data processing and data analytics.
Big Data has already changed the way many companies function, and it has fueled the development of a new generation of companies. The tech industry is a pioneer in Big Data analytics. Leading companies in the digital sphere have been collecting and analysing Big Data for several years. Amazon, Facebook, Google… Any firm that takes business seriously is tapping its data to better understand its customers and optimize processes, while others build their business models on exploiting information from data traffic. Jobs for data scientists in firms are mushrooming and analytical skills are in very high demand. Hal Varian, the chief economist at Google, is known to have said, “The sexy job in the next 10 years will be statisticians”. To deal with the enormous amount of data, such firms develop algorithms that automatically filter and interconnect the desired information. A famous example of the power of Big Data analysis is the story of the big American supermarket chain Target predicting the pregnancy of a girl before her father had the slightest knowledge of it. Using data analytics, the company identified products that, when bought together, increased the likelihood that a consumer was pregnant. This allowed Target to assign a “pregnancy prediction” score to each of its shoppers. The score also estimated the due date within a small window, so Target could send coupons timed to very specific stages of the pregnancy. One unsuspecting and angry father then stormed into a Target store to yell at the staff for sending his daughter coupons for baby clothes and cribs, accusing Target of having a bad influence on her. It turned out that somebody had already influenced the daughter, because she was actually pregnant and just hadn’t told her father yet. How awkward. The father apologized to Target.
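To make the idea of a basket-based score concrete, here is a minimal sketch of how purchases could be turned into a probability-like number with a logistic model. It is not Target's actual method, which is not described here: the product names, the weights and the intercept are all hypothetical values chosen for illustration.

```python
import math

# Hypothetical log-odds contributions per product; NOT Target's actual model.
WEIGHTS = {
    "product_a": 1.2,
    "product_b": 0.9,
    "product_c": 0.6,
    "product_d": -1.5,  # a product that makes the event less likely
}
INTERCEPT = -3.0  # hypothetical baseline log-odds for an empty basket


def basket_score(basket):
    """Return a score in (0, 1) from a logistic model over distinct items."""
    log_odds = INTERCEPT + sum(WEIGHTS.get(item, 0.0) for item in set(basket))
    return 1.0 / (1.0 + math.exp(-log_odds))


# Products that are individually weak signals add up when bought together.
low = basket_score(["product_d", "bread"])
high = basket_score(["product_a", "product_b", "product_c"])
print(f"low-signal basket:  {low:.3f}")
print(f"high-signal basket: {high:.3f}")
```

In practice the weights would be estimated from historical purchase data rather than set by hand; the point of the sketch is only that the score depends on the combination of items, not on any single one.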
So why should we as economic researchers care? Big Data can not only increase the sales of your company but also change the way economic research is conducted. Economists have been sophisticated data users for a long time. However, technological change provides opportunities to do economics in a way that our predecessors could only have dreamed of. At the same time, the increase in computing power has made it relatively easy and cheap to analyse large amounts of data. A few decades back, empirical analysis took a room full of research assistants punching cards for weeks; now the same can be done in a matter of seconds by any economist on their laptop. The biggest opportunities for economic research that go hand in hand with the Big Data revolution are most likely the following two:
- Economists will be able to conduct even more innovative and influential research. Today’s cutting-edge research can be based on crunching newly available data from the vast administrative databases of schools and tax collection systems as well as the private sector, making economic research even more relevant and influential. Many new sorts of questions can be tackled, and Big Data also enables novel research designs that can inform us about the consequences of different economic policies and events. The shift from a reliance on more-or-less small-sample government surveys to administrative data with universal or near-universal population coverage allows researchers to rigorously examine variation in, for example, household income, health, productivity and education. The use of Big Data also increases the possibilities of testing existing models and theories that had previously been difficult to assess.
- Learning from and liaising with methods used in data science will benefit economists. Data science, which incorporates the fields of statistics and computer science, often makes use of predictive modelling tools that are useful in economics. In contrast to traditional econometric methods, which mostly study the relation between a particular treatment (e.g. being in a smaller class) and an outcome variable (e.g. adult earnings), predictive modelling approaches are inherently multivariate. Their focus is not on how a single variable affects a given outcome measure, but on how the outcome varies with a large number of potential predictors, and the analyst may or may not use prior theory as to which predictors are relevant. A potential use of this is the introduction of heterogeneity into econometric models and analyses. Einav and Levin, for example, used “off-the-shelf” credit and health risk scores to account for the default propensities or likely health expenditures of individual consumers (see Einav and Levin, 2014, The Data Revolution and Economic Analysis, for further explanation).
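The contrast between the two approaches in the second point can be illustrated with a small simulation. This is only a sketch under stated assumptions: the data are simulated, the variable names and coefficients are hypothetical, and a k-nearest-neighbour average stands in for the much richer predictive tools used in practice.

```python
import random
import statistics

random.seed(0)

# Simulated data; every name and coefficient below is hypothetical.
rows = []
for _ in range(500):
    small_class = random.randint(0, 1)      # randomly assigned treatment
    parent_income = random.gauss(0, 1)      # standardised predictor
    test_score = random.gauss(0, 1)         # standardised predictor
    earnings = (20 + 3 * small_class + 4 * parent_income
                + 5 * test_score + random.gauss(0, 2))
    rows.append((small_class, parent_income, test_score, earnings))

# Econometric question: how does ONE treatment shift the outcome?
# With random assignment, a difference in means estimates that effect.
treated = [r[3] for r in rows if r[0] == 1]
control = [r[3] for r in rows if r[0] == 0]
effect = statistics.mean(treated) - statistics.mean(control)


# Predictive question: given ALL predictors, what outcome do we expect?
def knn_predict(train, x, k=10):
    """Average the outcome of the k training rows closest to x."""
    def dist(row):
        return sum((a - b) ** 2 for a, b in zip(row[:3], x))
    nearest = sorted(train, key=dist)[:k]
    return statistics.mean(r[3] for r in nearest)


train, test = rows[:400], rows[400:]
errors = [abs(knn_predict(train, r[:3]) - r[3]) for r in test]
mean_abs_error = statistics.mean(errors)

print(f"estimated treatment effect: {effect:.2f}")
print(f"mean absolute prediction error: {mean_abs_error:.2f}")
```

The first number answers a causal question about a single variable; the second measures how well the outcome can be forecast from everything observed, without any claim about which predictor causes what.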
Although Big Data has great potential to benefit economic research, it does not rival economic theory. When Big Data and economics meet, it is important to note that although the role of economic theory is changing, theory does not become less important. Instead of filling in for missing data, theory has to serve as the basis of all the work to be done, guiding research and making sense of the vast, sprawling and unstructured terabytes on our hard drives.
Besides all the great opportunities that go hand in hand with the triumphant success of Big Data, there are many challenges related to accessing and making use of this data. Obtaining access to government and private sector data, as well as the necessary computing resources, is often very difficult, since the data is frequently stored in data silos or not openly accessible due to data protection issues. Moreover, when the data is accessible, its quality can vary greatly, and accurate analysis thus depends on the veracity of the source data. Furthermore, one major advantage of Big Data is at the same time a big disadvantage. Combining different data sets and data sources to generate new insights can unlock great value, but it also opens the door to potential abuses, such as threats to privacy or malfeasance arising from the possibility of using detailed data on citizens or consumers in undesirable ways. It is crucial that methods are developed for researchers to access and explore data in ways that respect privacy and confidentiality concerns. An equally important point is the need to train economists to work with large data sets and with the various programming and statistical tools that are commonly required for it. Economic researchers need to be curious when it comes to new techniques for data analysis, but at the same time critical and perfectionist, to make way for the improvement of techniques and to avoid hazy conclusions. These challenges notwithstanding, the next few decades are likely to be a very exciting time for economic research.
The use of Big Data offers great potential for economic research. The low-hanging fruits are mainly rich, up-to-date data sources, which can boost the relevance of empirical economic research in various fields, as well as the opportunity to expand the economic tool kit by learning from, for example, computer science. There are still many obstacles to overcome before Big Data can be used in an efficient, effective and responsible manner for economic research. However, these challenges should not keep economists from being innovators but spur their ambition to keep up with technological change and influence the future of research. As a concluding remark, it is interesting to note the point Justin Wolfers, Professor of Economics and Public Policy at the University of Michigan, made in a Freakonomics blog post. He stated that perhaps the most important insight drawn from the data explosion is the understanding of how economic reasoning suffuses almost every aspect of our lives. Economics has become a much broader social science; the field has become more connected to reality. The economic lens now helps in parsing strategic interactions, the causes of discrimination, patterns of marriage and divorce, how our political machinery operates, and much more.
The Economics Revolution will be Televised
Economics in the Age of Big Data
Liran Einav and Jonathan Levin
Science, 346(6210), November 2014.
The Data Revolution and Economic Analysis
Liran Einav and Jonathan Levin
Innovation Policy and the Economy, 14, May 2014.
Edited by J. Lerner and S. Stern.
Big Data: New Tricks for Econometrics
Hal R. Varian
Journal of Economic Perspectives, 28(2), Spring 2014.
A review of some tools for the manipulation and analysis of big data, along with some speculations about how they can be used in econometrics.
Here is a really nice cartoon representation of the Taylor rule (pointer from John Taylor himself). Homework: what does each part mean?
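For readers attempting the homework, a minimal sketch of the rule in its standard textbook form (Taylor, 1993) may help. The cartoon itself is not reproduced here, so its notation may differ. The 2% neutral real rate, the 2% inflation target and the two weights of 0.5 are Taylor's original calibration; the function and parameter names are illustrative choices.

```python
def taylor_rate(inflation, output_gap, neutral_real_rate=2.0,
                inflation_target=2.0, weight_inflation=0.5,
                weight_output=0.5):
    """Policy interest rate (percent) implied by the Taylor (1993) rule.

    inflation  -- current inflation rate, percent
    output_gap -- percent deviation of real GDP from potential
    """
    return (neutral_real_rate + inflation
            + weight_inflation * (inflation - inflation_target)
            + weight_output * output_gap)


# Inflation on target and no output gap: the rate equals the neutral
# real rate plus inflation.
on_target = taylor_rate(inflation=2.0, output_gap=0.0)

# Inflation two points above target with a positive output gap.
overheating = taylor_rate(inflation=4.0, output_gap=1.0)

print(on_target, overheating)
```

Note that a one-point rise in inflation raises the prescribed rate by 1.5 points, so the real interest rate rises when inflation does.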
On this subject, and economics Twitter, former and current members of the UK Monetary Policy Committee are publicly attacking each other – see here:
Sentance is a hawk, Blanchflower (from Britain but a professor at Dartmouth) very much a dove, and a critic of the current British government.
Journalist Mike Bird summarises their exchange: control of monetary policy is a vast amount of political power – do we really have to give it to these people? It is a joke, but the role of fixed rules in monetary policy is a serious topic in macroeconomics. Chris Dillow gives a brief and amusing recap of the arguments in the discretion/rules debate.
Suddenly lots of stuff is happening in China, at least based on media coverage. Here are some interesting articles that have come out recently, and quick summaries:
From Tyler Cowen. We are seeing a phase of centralisation. Some of the freedom of expression enjoyed by Chinese academics is being restricted.
Again HT Tyler Cowen. Roderick MacFarquhar argues that Xi Jinping is the most powerful Chinese leader (at least in terms of fewest powerful internal rivals) since Mao, more so than Deng.
(If this is true, then it makes Deng look even better, that or the situation of China in the late 1970s even worse.)
Administrators in the British Raj would often boast how, in a government ruling 250-300 million people, there were only 1000 British civil servants and c.15,000 British military officers. Of course, for every British administrator there would be dozens of Indian subordinates – the boast is double-edged: ‘look at how many collaborators we have’/’look at how many collaborators we need’.
Excluding State-Owned Enterprises (which are huge): “China has only 31 government and party employees per thousand residents. The number of civil servants per thousand residents in France is 95, in the United States, 75, and in Germany 53.” And, of course, the CCP itself has chapters in every company (every Walmart store), comprising up to 10% of the population.
How do you launder money in China? Anne Stevenson-Yang has your answers. They include: insurance payouts, over-generous buyouts from overseas holding companies, assets without a clear value such as art, Macao casinos or just bribing the customs officials. Click through for more.
Chinese local governments depend on land sales for a great deal of their revenue. Turns out China’s property market is peaking, and local governments are running out of land to sell. Say Deutsche Bank analysts: “Our baseline case assumes land sale revenue to drop by 20% yoy in 2015.” In response, the central government will have to ease monetary policy. Deutsche: “We forecast two interest rate cuts and two RRR cuts in 2015. The risk is that there may be more rate and RRR cuts than we forecast.” Indeed we already have one of these a month later.
Mark Dow notes, however, that Chinese monetary authorities target credit growth, and that reserve requirements are just one tool (they are happy to use ordinary Open Market Operations, and to sterilise, or not, residual balance-of-payments inflows).
Here is more context on the monetary policy of China. How do you go about sinking your own currency? If you subsidise exports through a low yuan, exporters receive more yuan for their goods than those goods are worth. China needs very high reserve requirements to sterilise these yuan and control inflation. (Pay attention, Eurozone.)
Again hat tip to the indispensable FTAlphaville blog. On China and the prostitution industry: in a land where it is unwise to depend on the legal system, other mechanisms of creating trust between business partners and local authorities must be found. Apparently the solution is group trips to brothels: “And when it comes to building up mutual trust, the photos often taken during these miniature orgies provide a rich source of mutual blackmail material that can prove explosive if exposed.”
From 2013, an article on how de Tocqueville’s L’ancien regime et la revolution – key quote: “the most dangerous time for a weak government is when it sets about reforming itself” – is a bestseller in China, with Li Keqiang personally recommending it to his colleagues.
Linden, Kraemer and Dedrick (2007) find that China adds $3.70 to the value of an iPod, compared with Apple’s profit margin of $80 on an iPod at that time. I imagine China’s share has increased a bit in the eight years since. High technology manufacturing is a top strategic priority for the government. But here is an example of a more general phenomenon: an environment where skills, capital and demand coincide to create a productivity leader.
Michael Pettis teaches at Beida (Peking University), and he can go on a bit. The key part is that in China, if you lower interest rates, Aggregate Supply may rise faster than Aggregate Demand, causing deflation/disinflation. Read that sentence again.
The mechanism is that most lending goes to businesses, so lower interest rates give them cheaper credit with which to build more capacity. Meanwhile, interest on savings is quite important for consumers (what with China’s high savings rate). If savers receive less, they will spend less.
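The mechanism described above can be put into a deliberately one-sided toy calculation. This is a sketch, not Pettis's model: the function name, the linear responses and all the numbers are hypothetical, and it ignores other channels (for instance, that investment spending is itself a source of demand).

```python
def supply_demand_shift(rate_cut_pp, deposits, capacity_gain_per_pp,
                        spend_share):
    """Return (change in supply, change in demand) after a rate cut.

    rate_cut_pp          -- size of the cut, in percentage points
    deposits             -- household savings earning interest
    capacity_gain_per_pp -- extra capacity firms build per point of cut
    spend_share          -- share of interest income households spend
    All parameters are hypothetical illustration values.
    """
    lost_interest = deposits * rate_cut_pp / 100    # income savers lose
    demand_change = -spend_share * lost_interest    # savers spend less
    supply_change = capacity_gain_per_pp * rate_cut_pp  # firms add capacity
    return supply_change, demand_change


supply, demand = supply_demand_shift(rate_cut_pp=1.0, deposits=1000.0,
                                     capacity_gain_per_pp=5.0,
                                     spend_share=0.5)
excess_supply = supply - demand
print(supply, demand, excess_supply)
```

With these made-up numbers the cut raises supply and lowers demand, so excess supply grows, which is the disinflationary outcome the argument points to. Whether it holds in reality depends on the relative size of the two responses.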
This is worth worrying about. If the Fed or the ECB eases monetary policy, it forces China to respond to protect its exporters. But if you think that the root cause of the Great Recession/Stagnation is that global aggregate supply has been rising faster than global aggregate demand for the past fifteen years (and everything that has happened is just swapping these two between different countries), this means monetary easing will not help.
And finally (from Pseudoerasmus). The Tang empire’s borders extend too far: it never controlled that much of Vietnam and Manchuria at the same time (and Manchuria only through tributaries). But understand and reflect: we have very good census data for China going back two millennia.
The Teaching Awards are on! Participate and vote until Sunday next week for your favourite Teacher and TA. The well-deserved winners will be revealed at the TSE Gala on the 27th of February. Lastly, don’t hesitate to leave a comment, even if it is just a simple thank you. We want to know everything! (Needless to say, the survey is anonymous.)
Click on this link http://goo.gl/forms/yM6g6VJgMd or visit our website http://www.tseconomist.com to find out more!
This is not something we ever wanted to write. Wednesday saw the worst terrorist attack in France for half a century. But the issues of absolute freedom of expression and the extent of violence of which lone extremists are capable are crucial to understanding society in our era. Here are some of the most considered responses:
“Reproducing the imagery created by the murdered artists tends to sacralize them as embodiments of some abstract ideal of free speech. But many of the publications that today honor the dead as martyrs would yesterday have rejected their work as tasteless and obscene, as indeed it often was. The whole point of Charlie’s satire was to be tasteless and obscene, to respect no proprieties, to make its point by being untameable and incorrigible and therefore unpublishable anywhere else. The speech it exemplified was not free to express itself anywhere but in its pages. Its spirit was insurrectionist and anti-idealist, and its creators would be dumbfounded to find themselves memorialized as exemplars of a freedom that they always insisted was perpetually in danger and in need of a defense that only offensiveness could provide. To transform the shock of Charlie’s obscenities into veneration of its martyrdom is to turn the magazine into the kind of icon against which its irrepressible iconoclasm was directed. …
“In mourning the tragedy, let us not forget that Charlie Hebdo was shocking, obscene and offensive because the world is — as today’s shocking, obscene and offensive tragedy makes clear.”
“There is no duty to blaspheme, a society’s liberty is not proportional to the quantity of blasphemy it produces, and under many circumstances the choice to give offense (religious and otherwise) can be reasonably criticized as pointlessly antagonizing, needlessly cruel, or simply stupid.
“But we are not in a vacuum. … the kind of blasphemy that Charlie Hebdo engaged in had deadly consequences, as everyone knew it could … and that kind of blasphemy is precisely the kind that needs to be defended, because it’s the kind that clearly serves a free society’s greater good. If a large enough group of someones is willing to kill you for saying something, then it’s something that almost certainly needs to be said, because otherwise the violent have veto power over liberal civilization, and when that scenario obtains it isn’t really a liberal civilization any more. Again, liberalism doesn’t depend on everyone offending everyone else all the time, and it’s okay to prefer a society where offense for its own sake is limited rather than pervasive. But when offenses are policed by murder, that’s when we need more of them, not less, because the murderers cannot be allowed for a single moment to think that their strategy can succeed.”
If some of you found yourselves running confused through boutiques, Christmas markets and malls to find a decent Christmas present for your neglected relatives, this is what a true economist (or better what ) would have adhered to. Meanwhile, ask yourself whether Christmas would have happened if Emperor Augustus had adopted a truly economic statistician’s perspective on the national census. Bonnes fêtes!
Are you curious about teachers’, students’ and most of all friends’ reflections on @JeanTirole winning the @Nobelprize? Would you like to hear more about the most prominent guest, @JosephEStiglitz, that #TSE hosted during its Tiger Conference #TIGERForum this summer?
Then take a look at the latest edition of the Toulouse School of Economics Student Magazine, The TSEconomist here!
The TSEconomist is the student magazine of the Toulouse School of Economics (TSE). We aim to act as a platform for interaction between all members of the TSE community: students, professors, researchers and staff. We want to enrich the student experience at TSE, so we encourage all students to participate and engage with the magazine!