The IT Law Wiki

Definition

The big data paradigm consists of the distribution of data systems across horizontally coupled, independent resources to achieve the scalability needed for the efficient processing of extensive datasets.[1]

Overview

"This new paradigm leads to a number of conceptual definitions that suggest Big Data exists when the scale of the data causes the management of the data to be a significant driver in the design of the system architecture. This definition does not explicitly refer to the horizontal scaling in the Big Data paradigm."[2]

"Fundamentally, the Big Data paradigm is a shift in data system architectures from monolithic systems with vertical scaling (i.e., adding more power, such as faster processors or disks, to existing machines) into a parallelized, 'horizontally scaled', system (i.e., adding more machines to the available collection) that uses a loosely coupled set of resources in parallel. This type of parallelization shift began over 20 years ago for compute-intensive applications in simulation communities, when scientific simulations began using massively parallel processing (MPP) systems."[3]
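
The horizontal scaling described in this quotation can be illustrated with a minimal sketch. It is not drawn from the cited source: the function names (`partition`, `process_shard`, `horizontally_scaled_sum`) are hypothetical, and worker threads on one machine stand in for the independent machines of a real cluster. The data is split into shards, each worker processes its own shard without coordinating with the others, and the partial results are then combined.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(records, num_workers):
    """Split records into num_workers independent shards (round-robin)."""
    return [records[i::num_workers] for i in range(num_workers)]


def process_shard(shard):
    """Work done by one worker on its own shard; here, a partial sum."""
    return sum(shard)


def horizontally_scaled_sum(records, num_workers):
    """Sum records by fanning shards out to workers, then combining."""
    shards = partition(records, num_workers)
    # Assumption: threads stand in for separate, loosely coupled machines.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partials = list(pool.map(process_shard, shards))
    # Combine the independent partial results into the final answer.
    return sum(partials)


if __name__ == "__main__":
    data = list(range(1_000))
    # "Adding more machines" changes only num_workers, not the logic.
    print(horizontally_scaled_sum(data, 2))
    print(horizontally_scaled_sum(data, 8))
```

In this sketch, capacity grows by raising `num_workers` rather than by making any single worker more powerful, which is the contrast with vertical scaling that the quotation draws.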

References