The big data discussed in this post is not a conventional database, nor the data marts and data warehouses people often talk about.

Source: 美国老土 2014-07-22 08:35:07 (965 bytes)
Conventional databases, and the data marts and data warehouses people often talk about, require data to be stored in a relational database with fixed columns and constraints such as primary keys and foreign keys.
Of course, a data mart or data warehouse can still support many kinds of data analysis and produce many results.

Here, so-called big data means a much larger data volume than a regular data mart or data warehouse, and the data structure is arbitrary: there are no fixed columns.
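To make the contrast concrete, here is a minimal sketch in Python. The table names, field names, and sample records are all made up for illustration, and `sqlite3` only stands in for a relational database. The relational side rejects rows that do not fit its declared columns and constraints, while the raw records on the other side each carry whatever fields they happen to have.

```python
import json
import sqlite3

# Relational side: columns and constraints are declared up front.
# (Hypothetical tables, using an in-memory SQLite database.)
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE purchase ("
    " id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customer(id),"
    " amount REAL NOT NULL)"
)

def try_insert(sql, params=()):
    """Run one INSERT; return True if the database accepted it, False if it refused."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.Error:
        return False

ok_row = try_insert("INSERT INTO customer (id, name) VALUES (?, ?)", (1, "alice"))
# A column that was never declared is refused.
extra_column = try_insert("INSERT INTO customer (id, name, mood) VALUES (2, 'bob', 'happy')")
# A purchase pointing at a customer that does not exist violates the foreign key.
orphan_row = try_insert("INSERT INTO purchase (customer_id, amount) VALUES (?, ?)", (99, 1.0))

# "Big data" side: no fixed columns, every record has its own shape.
raw_lines = [
    '{"device": "sensor-7", "temp_c": 21.4}',
    '{"user": "alice", "text": "hello", "tags": ["a", "b"]}',
    '{"user": "bob", "clicked": true}',
]
records = [json.loads(line) for line in raw_lines]
field_sets = [sorted(record.keys()) for record in records]

print(ok_row, extra_column, orphan_row)
print(field_sets)
```

The structure of the raw records is only discovered when they are read, which is why tools for this kind of data are usually described as applying a schema at read time rather than at load time.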

As more and more data is generated by all sorts of devices and social media networks, there is growing demand for this kind of data analysis.

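Apache Hadoop is one such big data system; it grew out of the open-source Nutch project, and much of its early development was done at Yahoo.
MapReduce, a programming model introduced by Google, is the model used to process this kind of data.
The Hadoop ecosystem includes Pig, whose scripting language (Pig Latin) is used for data analysis, and Hive, which offers a SQL-like query language (HiveQL).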
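The MapReduce model mentioned above can be sketched in a few lines of plain Python. This is only a single-process toy to show the map, shuffle, and reduce steps on a word count; it is not Hadoop's API, and the function names and sample input are invented for the example. A real cluster would run the map and reduce steps in parallel across many machines.

```python
from collections import defaultdict

def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1

def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)

def word_count(lines):
    """Run map, shuffle, and reduce in sequence over the input lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())

counts = word_count(["big data is big", "data is everywhere"])
print(counts)
```

Note that the input lines need no fixed columns: the map step decides what to pull out of each record, which is what makes the model a fit for arbitrarily structured data.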


All replies:

Redeveloped, following Google white papers -数据分析- (233 bytes) 07/22/2014 08:40:13

The additions from you two are very educational! -多哥- (0 bytes) 07/22/2014 08:42:22

Not at all, I'm just rambling. 多哥 is the one with real insight. Enjoy the day! -美国老土- (0 bytes) 07/22/2014 08:44:41

Not at all, just some casual remarks for everyone to critique and improve on. -多哥- (0 bytes) 07/22/2014 13:50:20

We all eat from the same stove, so please bear with me! :) -数据分析- (42 bytes) 07/22/2014 08:50:00

It looks to me like you few are cooking up some grand conspiracy. -怪哉- (3 bytes) 07/22/2014 08:56:13
