big data

GC: n

CT: Big Data is an all-encompassing term for any collection of data sets so large or complex that it becomes difficult to process using traditional data processing applications.
The challenges include analysis, capture, curation, search, sharing, storage, transfer, visualization, and privacy violations. The trend toward larger data sets is driven by the additional information derivable from analysis of a single large set of related data, as compared to separate smaller sets with the same total amount of data, allowing correlations to be found to “spot business trends, prevent diseases, combat crime and so on.” Scientists, practitioners of media and advertising, and governments alike regularly encounter limitations due to large data sets in many areas. The limitations affect Internet search, finance and business informatics.
Scientists, for example, encounter limitations in areas including meteorology, genomics, connectomics, complex physics simulations, biological and environmental research, and e-Science in general.
Data sets grow in size in part because they are increasingly being gathered by ubiquitous information-sensing mobile devices, aerial sensory technologies (remote sensing), software logs, cameras, microphones, radio-frequency identification (RFID) readers, and wireless sensor networks. The world’s technological per-capita capacity to store information has roughly doubled every 40 months since the 1980s; as of 2012, 2.5 exabytes (2.5×10^18 bytes) of data were created every day. The challenge for large enterprises is determining who should own big data initiatives that straddle the entire organization.
Big data is difficult to work with using most relational database management systems and desktop statistics and visualization packages, requiring instead “massively parallel software running on tens, hundreds, or even thousands of servers”. What is considered “big data” varies depending on the capabilities of the organization managing the set, and on the capabilities of the applications that are traditionally used to process and analyze the data set in its domain. Big Data is a moving target; what is considered to be “Big” today will not be so years ahead. “For some organizations, facing hundreds of gigabytes of data for the first time may trigger a need to reconsider data management options. For others, it may take tens or hundreds of terabytes before data size becomes a significant consideration.”
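
The doubling figure quoted in this context implies a simple growth rate. As a rough sketch, assuming capacity doubles exactly every 40 months (the function name `growth_factor` and the sample periods are illustrative and do not come from the source), the growth over any period follows from 2^(months/40):

```python
def growth_factor(months: float, doubling_months: float = 40.0) -> float:
    """Factor by which capacity grows over `months`, assuming a steady doubling period."""
    return 2.0 ** (months / doubling_months)


# Growth implied over one year by a 40-month doubling period
annual = growth_factor(12)
# Growth over one decade (120 months, i.e. three doublings)
decade = growth_factor(120)

print(f"per year: x{annual:.3f}")    # per year: x1.231
print(f"per decade: x{decade:.1f}")  # per decade: x8.0
```

Under that assumption, a 40-month doubling period corresponds to roughly 23% growth per year, or an eightfold increase per decade.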

S: (last access: 11 May 2017)

N: 1. Computing (also with capital initials): data of a very large size, typically to the extent that its manipulation and management present significant logistical challenges; (also) the branch of computing involving such data.
Big data is a new addition to our language, but exactly how new is not an easy matter to determine. A 1980 paper by Charles Tilly provides an early documented use of big data, but Tilly wasn’t using the word in the exact same way we use it today; rather, he used the phrase “big-data people” to refer to historians engaged in data-rich fields such as cliometrics. Today, big data can refer to large data sets or to systems and solutions developed to manage such large accumulations of data, as well as to the branch of computing devoted to this development. Francis X. Diebold, a University of Pennsylvania economist, who has written a paper exploring the origin of big data as a term, a phenomenon, and a field of study, believes the term “probably originated in lunch-table conversations at Silicon Graphics Inc. (SGI) in the mid 1990s….”
2. Big data is a term that describes the large volume of data – both structured and unstructured – that inundates a business on a day-to-day basis. But it’s not the amount of data that’s important. It’s what organizations do with the data that matters. Big data can be analyzed for insights that lead to better decisions and strategic business moves.
3. Big data refers to a scale of data that cannot be managed by conventional means. Internet-scale data has fostered the creation of new architectures and applications that are able to process this new class of data. These architectures are highly scalable and efficiently process data in parallel across a sea of servers.
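
The parallel processing described in note 3 can be illustrated with a small split-count-merge sketch. This is only an assumption-laden illustration: threads on a single machine stand in for the "sea of servers", and the names `count_words`, `parallel_word_count` and the sample lines are hypothetical, not taken from any source cited here.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_words(chunk: list[str]) -> Counter:
    """Map step: count the words in one partition of the data."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts


def parallel_word_count(lines: list[str], workers: int = 4) -> Counter:
    """Split the input into partitions, count each in parallel, then merge."""
    # Partition the input into roughly equal slices, one per worker.
    size = max(1, -(-len(lines) // workers))  # ceiling division
    chunks = [lines[i:i + size] for i in range(0, len(lines), size)]
    total = Counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Reduce step: merge the partial counts into a single result.
        for partial in pool.map(count_words, chunks):
            total.update(partial)
    return total


sample = ["big data is big", "data is everywhere", "big servers process data"]
result = parallel_word_count(sample, workers=2)
print(result["big"], result["data"], result["is"])
```

Each partition is processed independently, so the same pattern extends in principle to many machines; real systems add data distribution, scheduling and fault tolerance, which this sketch leaves out.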

S: 1. OD – (last access: 11 May 2017); MW – (last access: 11 May 2017). 2. (last access: 11 May 2017). 3. TERMIUM PLUS – (last access: 11 May 2017).

SYN: massive data, large data, big dataset, massive dataset, large dataset.

S: GDT – (last access: 11 May 2017)

CR: artificial intelligence, blockchain, computer science.