Elon University

Elon University/Pew surveys world technologists on the issue of “Big Data”

Elon University's Imagining the Internet Center examines the big possibilities and potential problems of working with gigantic stores of information.


The growing technological ability to collect and analyze massive sets of information, known as Big Data, could lead to revolutionary changes in business, political and social enterprises, according to a new survey of internet experts and stakeholders.

But while leading technologists and researchers around the world look forward to the positive impact of Big Data, many also worry about potential drawbacks.

A new Pew Internet/Elon University survey of 1,021 Internet experts, observers and stakeholders measured current opinions about the potential impact of human and machine analysis of newly emerging large data sets in the years ahead. The survey was an opt-in, online canvassing. Some 53% of those surveyed predicted that the rise of Big Data is likely to be “a huge positive for society in nearly all respects” by the year 2020. Some 39% of survey participants said it is likely to be “a big negative.”

“The analysts who expect we will see a mostly positive future say collection and analysis of Big Data will improve our understanding of ourselves and the world,” said researcher Lee Rainie, director of the Pew Research Center’s Internet & American Life Project. “They predict that the continuing development of real-time data analysis and enhanced pattern recognition could bring revolutionary change to personal life, to the business world and to government.”

Survey respondent Hal Varian, chief economist at Google, said, “This is likely to lead to a better informed, more pro-active fiscal and monetary policy.” Bryan Trogdan, a consultant and entrepreneur, said, “Big Data is the new oil.” And David Weinberger of Harvard University’s Berkman Center observed, “We are just beginning to understand the range of problems Big Data can solve, even though it means acknowledging that we’re less unpredictable, free, madcap creatures than we’d like to think. It also raises the prospect that some of our most important knowledge will consist of truths we can’t understand because our pathetic human brains are just too small.”

As with all technological evolution, the experts also anticipate some negative outcomes. “The experts responding to this survey noted that the people controlling the resources to collect, manage and sort large data sets are generally governments or corporations with their own agendas to meet,” said Janna Anderson, director of Elon University’s Imagining the Internet Center and a co-author of the study. “They also say there’s a glut of data and a shortage of human curators with the tools to sort it well, there are too many variables to be considered, the data can be manipulated or misread, and much of it is proprietary and unlikely to be shared.”

Survey participant John Pike, director of GlobalSecurity.org, said, “The world is too complicated to be usefully encompassed in such an undifferentiated Big Idea. Whose ‘Big Data’ are we talking about? Wall Street, Google, the NSA? I am small, so generally I do not like Big.”

Survey respondent danah boyd, a Microsoft research scientist and expert on the societal impacts of the Internet, observed, “The Internet magnifies the good, bad and ugly of everyday life. Of course these things will be used for good. And of course they’ll be used for bad and ugly. Science fiction gives us plenty of templates for imagining where that will go. What will be interesting is how social dynamics, economic exchange and information access are inflected in new ways that open up possibilities that we cannot yet imagine. This will mean a loss of some aspects of society that we appreciate but also usher in new possibilities.”

This is the seventh report generated out of an analysis of the results of a Web-based survey fielded in fall 2011 to gather opinions on eight Internet issues from a select group of experts and the highly engaged Internet public. (Details can be found here: http://www.elon.edu/e-web/predictions/expertsurveys/)

URLs for deeper details about this report online:

The main Imagining the Internet/Pew Internet report –
A large selection of hundreds of anonymous responses –
A large selection of credited responses to the survey on future of Big Data –

Following is a small sample of respondents’ remarks:

“I’m a big believer in nowcasting. Nearly every large company has a real-time data warehouse and has more timely data on the economy than our government agencies. In the next decade we will see a public/private partnership that allows the government to take advantage of some of these private sector data stores. This is likely to lead to a better informed, more pro-active fiscal and monetary policy.” - Hal Varian, chief economist at Google

“Big Data allows us to see patterns we have never seen before. This will clearly show us interdependence and connections that will lead to a new way of looking at everything. It will let us see the ‘real-time’ cause and effect of our actions. What we buy, eat, donate, and throw away will be visual in a real-time map to see the ripple effect of our actions. That could only lead to mores-conscious behavior.” - Tiffany Shlain, director and producer of the film ‘Connected’ and founder of The Webby Awards

“Global climate change will make it imperative that we proceed in this direction of nowcasting to make our societies more nimble and adaptive to both human-caused environmental events and extreme weather events or decadal scale changes. Coupled with the data, though, we must have a much better understanding of decision making, which means extending knowledge about cognitive biases, about boundary work (scientists, citizens, and policymakers working together to weigh options on the basis not only of empirical evidence but also of values).” - Gina Maranto, co-director for ecosystem science and coordinator, graduate program in environmental science at the University of Miami

“Media and regulators are demonizing Big Data and its supposed threat to privacy. Such moral panics have occurred often thanks to changes in technology…But the moral of the story remains: there is value to be found in this data, value in our newfound publicness. Google’s founders have urged government regulators not to require them to quickly delete searches because, in their patterns and anomalies, they have found the ability to track the outbreak of the flu before health officials could and they believe that by similarly tracking a pandemic, millions of lives could be saved. Demonizing data, big or small, is demonizing knowledge, and that is never wise.” - Jeff Jarvis, professor, pundit and blogger

“Large, publicly available data sets, easier tools, wider distribution of analytics skills, and early stage artificial intelligence software will lead to a burst of economic activity and increased productivity comparable to that of the Internet and PC revolutions of the mid to late 1990s. Social movements will arise to free up access to large data repositories, to restrict the development and use of AIs, and to ‘liberate’ AIs.” - Sean Mead, director of analytics at Mead, Mead & Clark, Interbrand

“The world is too complicated to be usefully encompassed in such an undifferentiated Big Idea. Whose ‘Big Data’ are we talking about? Wall Street, Google, the NSA? I am small, so generally I do not like Big.” - John Pike, director of GlobalSecurity.org

“We can now make catastrophic miscalculations in nanoseconds and broadcast them universally. We have lost the balance inherent in ‘lag time.’” - Marcia Richards Suelzer, senior analyst at Wolters Kluwer

“Better information is seldom the solution to any real-world social problems. It may be the solution to lots of business problems, but it’s unlikely that the benefits will accrue to the public. We’re more likely to lose privacy and freedom from the rise of Big Data.” - Barry Parr, owner and analyst for MediaSavvy

“Big Data will not be so big. Most data will remain proprietary, or reside in incompatible formats and inaccessible databases where it cannot be used in ‘real time.’ The gap between what is theoretically possible and what is done (in terms of using real-time data to understand and forecast cultural, economic and social phenomena) will continue to grow.” - Jeff Eisenach, managing director, Navigant Economics LLC, a consulting business; formerly a senior policy expert with the US Federal Trade Commission

“Never underestimate the stupidity and basic sinfulness of humanity.” - Tom Rule, educator, technology consultant, and musician based in Macon, Georgia

“More information will be beneficial in all sorts of ways we can’t even fathom right now. Namely because we don’t have the data.” - John Capone, freelance writer and journalist; former editor of MediaPost Communications publications

“The huge prospects for the ‘Internet of Things’ tip me to checking the first choice. I tend to think of the Internet of Things as multiplying points of interactivity - sensors and/or actuators - throughout the social landscape. As the cost of connectivity goes down the number of these points will go up, diffusing intelligence everywhere.” - Fred Hapgood, technology author and consultant; moderator of the Nanosystems Interest Group at MIT in the 1990s

“Data that is much more available in quantity, cost, and quality will be a marked feature of the coming decade, but much of that will be ‘Little Data,’ which is useful mostly or entirely only locally (for practical or privacy concerns). I will want data possibly related to my health kept as private as possible. My house should enable control for light, heat, sound, image, etc. that enhances my experiences and convenience, and saves resources. For example, lighting will increasingly respond to occupancy or ‘presence’ (not just that someone is present, but who they are, how many they are, and what activity engaged in), and so provide better lighting services, automatically, and at less net energy than before. However, who outside the building should care about the details? No one. Big Data will be a net plus, but a sizeable amount of problems will be created by it as well, particularly around security and privacy.” - Bruce Nordman, research scientist at Lawrence Berkeley National Laboratory

“Big Data should be developed within a context of openness and improved understandings of dynamic, complex whole ecosystems. There are difficult matters that must be addressed, which will take time and support, including: public and private sector entities agreeing to share data; providing frequently updated meta-data; openness and transparency; cost recovery; and technical standards.” - Richard Lowenberg, director, broadband planner 1st-Mile Institute; network activist since early 1970s

“The real power of ‘Big Data’ will come depending largely on the degree to which it is held in private hands or openly available. Openly available data, and widespread tools for manipulating it, will create new ways of understanding and governing ourselves as individuals and as societies.” - Alex Halavais, associate professor at Quinnipiac University; vice president of the Association of Internet Researchers; author of Search Engine Society

“In order for Big Data to have a positive impact on society overall, it has to be transparent. Ordinary citizens would have to be able to query the data set and discover real answers, regardless of the light that shows on individuals or corporations or governments. There is too much at stake for these parties to allow open, transparent access to this data. As long as some data sets or parts of data sets are hidden, there is room for misuse and manipulation. I think this manipulation is sure to take place. Unless Big Data is democratized on a massive scale, it will overall have a negative impact on society. Right now, I don’t see much hope for such a democratization.” - Nathan Swartzendruber, technology education at SWON Libraries Consortium

Respondents were allowed to keep their remarks anonymous if they chose to do so. Following are predictive statements selected from the hundreds of anonymous comments from survey participants:

“If Big Data is not also Wide Data (that is, dispersed among as many players and citizens as possible) then it will be a negative overall.”

“The few people who will understand the dangers of ‘Big Data’ will have high cognitive abilities and training. The general population will continue to rely on crappy results because they know no better.”

“Collection is likely to be imperceptible to most, unless law and regulation make it overt and provide the individual choice. Analysis likely will suffer from a divorce in knowledge and context between the orderers and the providers of the analysis. No example currently is better than that between the avaricious ignorance of bank executives and the technologists’ naiveté about the realities of collateralized debt obligations. Reliance will lead to increasingly unstable processes where only those able to use Big Data will be able to protect themselves, with the individual increasingly at risk. Rapid program stock trading is a current, pernicious example.”

“We will become more addicted to what the databases tell us. It might impair risk-taking for the good. We’ll depend more on models than instincts.”

“Big Data is not well matched to tiny minds. The data sets now exceed the capabilities of most businesspeople to know what to do with, about, and for the data. This will lead to huge abuse and misapplication.”

“We still haven’t figured out the implications of chaos theory, and if ‘Big Data’ and futurecasting aren’t perfect examples of chaos-based information, then I don’t know what is. Generically, we’re not prepared for this great a lack of privacy; we’re even less prepared for data of this magnitude available only to the powerful, rich, or connected.”

“Legal protections for the citizenry (in those jurisdictions which are not decidedly autocratic) are lacking, and will be essential to prevent corporate or governmental abuse of the insights available about people through widely aggregated data, as well as through new surveillance techniques.”

“The old lesson that correlation is not causation seems never to be learned. The control over data means that inaccurate data is hard to identify and correct. I see that the problems will only increase with the size of the datasets. Most emphasis seems to be given to doing clever things with data rather than ensuring its validity or giving the right people control over it.”

“The fact that most data is unstructured is a huge issue, and I doubt that we will solve the problems associated with getting meaning from that morass.”

Another anonymous survey participant wrote, “Certainly in 2020 Big Data will be more risky than trustworthy. We just won’t have enough experience - the equivalent of the 100-year flood in forecasting terms - and so our systems will ‘look good’ on some basic problems but prove to make whoppers of mistakes.”

The findings reflect the reactions in an online, opt-in survey of a diverse set of 1,021 technology stakeholders and critics who were asked to choose one of two provided scenarios and explain their choice. While 53 percent selected the statement that the rise of Big Data is “a huge positive for society in nearly all respects,” a significant number of the survey participants who selected that scenario said the true outcome will be a little bit of both scenarios, and many said that while they chose the first scenario as a “vote” for what they hope will happen, they actually expect the outcome will be closer to the second scenario.

53% agreed with the statement:

Thanks to many changes, including the building of “the Internet of Things,” human and machine analysis of large data sets will improve social, political, and economic intelligence by 2020. The rise of what is known as “Big Data” will facilitate things like “nowcasting” (real-time “forecasting” of events); the development of “inferential software” that assesses data patterns to project outcomes; and the creation of algorithms for advanced correlations that enable new understanding of the world. Overall, the rise of Big Data is a huge positive for society in nearly all respects.

39% agreed with the alternate statement, which posited:

Thanks to many changes, including the building of “the Internet of Things,” human and machine analysis of Big Data will cause more problems than it solves by 2020. The existence of huge data sets for analysis will engender false confidence in our predictive powers and will lead many to make significant and hurtful mistakes. Moreover, analysis of Big Data will be misused by powerful people and institutions with selfish agendas who manipulate findings to make the case for what they want. And the advent of Big Data has a harmful impact because it serves the majority (at times inaccurately) while diminishing the minority and ignoring important outliers. Overall, the rise of Big Data is a big negative for society in nearly all respects.

Note: A total of 8% did not respond. The survey results are based on a non-random online sample of 1,021 Internet experts and other Internet users, recruited via email invitation, conference invitation, or link shared on Twitter, Google Plus or Facebook. Since the data are based on a non-random sample, a margin of error cannot be computed, and the results are not projectable to any population other than the people participating in this sample. The “predictive” scenarios used in this tension pair were created to elicit thoughtful responses to commonly found speculative futures thinking on this topic in 2011; this is not a formal forecast. Many respondents remarked that both scenarios will happen to a certain degree.

The Imagining the Internet Center is an initiative of Elon University’s School of Communications. The center’s research holds a mirror to humanity’s use of communications technologies, informs policy development, exposes potential futures and provides a historic record. Imagining the Internet is directed by Janna Quitney Anderson, an associate professor of communications.

The Pew Research Center’s Internet & American Life Project, directed by Lee Rainie, is a nonprofit, non-partisan “fact tank” that provides information on the issues, attitudes and trends shaping America and the world. It produces reports exploring the impact of the Internet on families, communities, work and home, daily life, education, health care and civic and political life.