The big data discussed in this article is not a conventional database, nor the data marts and data warehouses that people often talk about.
Conventional databases, along with data marts and data warehouses, require data to be stored in a relational database with fixed columns and constraints such as primary keys and foreign keys.
Of course, a data mart or data warehouse can support many kinds of data analysis and produce many useful results.
Here, big data means a data volume much larger than that of a regular data mart or data warehouse, with an arbitrary data structure: there are no fixed columns.
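To make "no fixed columns" concrete, here is a minimal Python sketch. The records, field names, and the helper `field_frequencies` are all hypothetical, invented only for illustration. It shows records from different sources carrying different fields, and an analysis step that has to discover the fields at read time instead of relying on a table schema.

```python
from collections import Counter

# Hypothetical records: each one carries whatever fields its source produced.
# There is no shared table definition, unlike rows in a relational database.
records = [
    {"source": "sensor", "device_id": "t-17", "temperature_c": 21.5},
    {"source": "social", "user": "alice", "text": "hello", "tags": ["greeting"]},
    {"source": "sensor", "device_id": "t-18", "humidity": 0.4},
]

def field_frequencies(rows):
    """Count how many records contain each field name."""
    counts = Counter()
    for row in rows:
        counts.update(row.keys())
    return counts

freq = field_frequencies(records)
for name, count in freq.most_common():
    print(name, count)
```

Only the `source` field appears in every record here; everything else varies, which is the situation a fixed-column design handles poorly.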
As more and more data is generated by all sorts of devices and social media networks, demand for this kind of data analysis is growing.
Apache Hadoop is one such big data system; much of its early development took place at Yahoo.
MapReduce, a programming model introduced by Google, is the model used to process this kind of data.
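The MapReduce idea can be sketched in a few lines of plain Python. This is a single-process illustration of the model only, not Hadoop's API; the function names `map_phase`, `shuffle`, `reduce_phase`, and `word_count`, and the sample input, are assumptions made for this example. A real system distributes each phase across many machines.

```python
from collections import defaultdict

def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1

def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)

def word_count(lines):
    """Run map, shuffle, and reduce in sequence over the input lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())

# Hypothetical input: two lines of text standing in for a large data set.
lines = ["big data is big", "data is everywhere"]
counts = word_count(lines)
print(counts)
```

Because each call to `map_phase` looks at one line and each call to `reduce_phase` looks at one key, both phases can run in parallel on separate pieces of the data, which is what makes the model suit very large inputs.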
The Apache Hadoop ecosystem includes Pig, a data analysis scripting platform whose language is called Pig Latin, and Hive, which provides an SQL-like query language.