TechHui

Hawaiʻi's Technology Community

Global Data Inflation and Sexy Statisticians

During an insanely bumpy plane flight today I read an excellent article in The Economist on data inflation and its implications. Apparently, the amount of data generated globally is "...growing at a terrific rate (a compound annual 60%)." IDC completed a study last year that estimated the world generated 1.2 zettabytes of data in 2008. A zettabyte is a trillion gigabytes.
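To get a feel for what a compound annual 60% means, here is a rough back-of-the-envelope sketch in Python. It assumes, purely for illustration, that the 60% rate applies to the 1.2 zettabyte figure quoted above and stays constant from year to year; neither source necessarily claims that. The function and constant names are made up for this example.

```python
import math

# Figures quoted in the post; treating them as one consistent series is an assumption.
ANNUAL_GROWTH = 0.60      # compound annual growth rate
BASE_YEAR = 2008
BASE_ZETTABYTES = 1.2     # estimate cited for BASE_YEAR

# Decimal (SI) units: 1 zettabyte = 10**21 bytes, 1 gigabyte = 10**9 bytes.
GIGABYTES_PER_ZETTABYTE = 10**21 // 10**9   # a trillion


def projected_zettabytes(year, rate=ANNUAL_GROWTH):
    """Project the yearly data volume, assuming the growth rate never changes."""
    return BASE_ZETTABYTES * (1 + rate) ** (year - BASE_YEAR)


def doubling_time_years(rate=ANNUAL_GROWTH):
    """Years needed for the volume to double at a fixed compound rate."""
    return math.log(2) / math.log(1 + rate)


if __name__ == "__main__":
    print(f"Gigabytes per zettabyte: {GIGABYTES_PER_ZETTABYTE:,}")
    print(f"Doubling time: {doubling_time_years():.2f} years")
    for year in range(BASE_YEAR, BASE_YEAR + 6):
        print(f"{year}: {projected_zettabytes(year):.1f} ZB")
```

At 60% a year the volume doubles roughly every year and a half, which is why even a few years of extrapolation produce very large numbers. Treat the projection as an illustration of compounding, not a forecast.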


Some interesting data points:

  • Experiments at the Large Hadron Collider generate 40 terabytes of data per second
  • By 2013 the amount of traffic flowing over the internet annually will reach 667 exabytes
  • Facebook hosts over 40 billion photos
  • Wal-Mart's transaction database has reached 2.5 petabytes (it records about 16,700 transactions per minute, or roughly 1 million per hour)
  • Data management and analytics is a $100 billion industry growing by 10% a year

No wonder Google's chief economist predicts the job of statistician will soon become the "sexiest" around. :-)

Comment by Konstantin A Lukin on March 10, 2010 at 5:37am
"Money and an economy based on greed rather than people is a cancer literally killing this planet."
I agree with you, Gus. Unfortunately there aren't too many of us willing to discuss 'the other side' of industrialization and possible moral/environmental implications... :(
Comment by Viil on March 9, 2010 at 4:05pm
The increasing amount of data and information (how you define each depends on which perspective you take) is a huge challenge for information storage, organization, retrieval, and presentation. How do we take advantage of this increasing load of potential knowledge? How do we make sure that people can get to the information they need when they need it, and that it is presented in a way they understand?
Comment by Ken Berkun on March 9, 2010 at 12:00pm
I am lucky to be married to a sexy statistician. Eat your heart out, Gus.
Comment by Daniel Leuck on March 6, 2010 at 8:29pm
Hi Brian - That is a good question. They reference other sources, some of which tried to distinguish between data and information or to identify potentially useful data for the purpose of their estimate (i.e., by removing duplicate data and other noise). Given the scope, there is obviously a lot of guesswork, and in many cases you don't know what data is relevant until you perform heavy analysis, especially in the realm of scientific research. How do the physicists collecting sensor data at the Large Hadron Collider decide what sensor data to toss? This is an extreme case, as not many people collect 40 terabytes of data per second, but I think it illustrates the problem of big data nicely.

The Economist special report, which comprised nine articles, discussed the questions of what should be stored, how it should be organized, and how people will be able to find relevant data in an ocean of zettabytes of information. Google, a company that states its mission as being "to organize the world's information and make it universally accessible and useful", is clearly at the forefront for publicly accessible data.

Gus & Kostya - Sorry for the bait and switch title. :-)
Comment by Gus Higuera on March 6, 2010 at 2:25pm
I have to completely agree with you, Kostya. Money and an economy based on greed rather than people is a cancer literally killing this planet.
Comment by Konstantin A Lukin on March 6, 2010 at 4:14am
Here is another article, published a bit earlier on the same subject: Methane Bubbling From Arctic Lakes, Now And At End Of Last Ice Age. Logically speaking, how important is economic growth if the planet has no more ice left? Is it time to revise economic principles into something that works in present environmental conditions?

Comment by Konstantin A Lukin on March 5, 2010 at 11:51pm
Not to be a party pooper, but generated data is not the only thing that's growing at 'sexy' rates. According to the findings of an international research team led by University of Alaska Fairbanks scientists Natalia Shakhova and Igor Semiletov, Methane Releases from Arctic Shelf May Be Much Larger and Faster Th.... It is interesting how the Universe has a way of keeping a perfect balance...

Quote:
Shakhova's research results show that the East Siberian Arctic Shelf is already a significant methane source: 7 teragrams yearly, which is equal to the amount of methane emitted from the rest of the ocean. A teragram is equal to about 1.1 million tons. "Our concern is that the subsea permafrost has been showing signs of destabilization already," she said. "If it further destabilizes, the methane emissions may not be teragrams, it would be significantly larger."


"Doode, where are the pictures of the sexy statisticians?!"
Yea, those would be very much appreciated in exciting times like these :)
Comment by Gus Higuera on March 5, 2010 at 11:05pm
Doode, where are the pictures of the sexy statisticians?!


© 2024   Created by Daniel Leuck.   Powered by
