Definition
The big data paradigm "consists of the distribution of data systems across horizontally coupled, independent resources to achieve the scalability needed for the efficient processing of extensive datasets."[1]
Overview
"This new paradigm leads to a number of conceptual definitions that suggest Big Data exists when the scale of the data causes the management of the data to be a significant driver in the design of the system architecture. This definition does not explicitly refer to the horizontal scaling in the Big Data paradigm."[2]
"fundamentally, the Big Data paradigm is a shift in data system architectures from monolithic systems with vertical scaling (i.e., adding more power, such as faster processors or disks, to existing machines) into a parallelized, 'horizontally scaled', system (i.e., adding more machines to the available collection) that uses a loosely coupled set of resources in parallel. This type of parallelization shift began over 20 years ago for compute-intensive applications in simulation communities, when scientific simulations began using massively parallel processing (MPP) systems."[3]
References
1. NIST Big Data Interoperability Framework, Vol. 1, at 5.
2. Id.
3. Id.