2.2.1 Big Data in Modern Computer Networks
Big Data
In simple terms, big data refers to everything that enables an organization to create, manipulate, and manage very large data sets (measured in terabytes, petabytes, exabytes, and so on) and the facilities in which these are stored. Distributed data centers, data warehouses, and cloud-based storage are common aspects of today’s enterprise networks. Many factors have contributed to the merging of “big data” and business networks, including continuing declines in storage costs, the maturation of data mining and business intelligence (BI) tools, and government regulations and court cases that have caused organizations to stockpile large masses of structured and unstructured data, including documents, e-mail messages, voice-mail messages, text messages, and social media data. Other data sources being captured, transmitted, and stored include web logs, Internet documents, Internet search indexing, call detail records, scientific research data and results, military surveillance, medical records, video archives, and e-commerce transactions.
big data
A collection of data on such a large scale that standard data analysis and management tools are not adequate. More broadly, big data refers to the volume, variety, and velocity of structured and unstructured data pouring through networks into processors and storage devices, along with the conversion of such data into business advice for enterprises.
Data sets continue to grow as more and more data is gathered by remote sensors, mobile devices, cameras, microphones, radio frequency identification (RFID) readers, and similar technologies. One study from a few years ago estimated that 2.5 exabytes (2.5 × 10^18 bytes) of data are created each day, and that 90 percent of the data in the world had been created in the preceding two years [IBM11]. Those numbers are likely higher today.
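To put the 2.5 exabytes per day estimate in networking terms, it can be converted into an average per-second rate. The short sketch below does only that arithmetic. The function name `average_rate` is illustrative, decimal units are assumed (1 TB = 10^12 bytes), and the result is a global average, not the load on any single network.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def average_rate(bytes_per_day: float) -> dict:
    """Convert a daily data volume into average per-second rates."""
    bytes_per_second = bytes_per_day / SECONDS_PER_DAY
    return {
        "terabytes_per_second": bytes_per_second / 1e12,
        "terabits_per_second": bytes_per_second * 8 / 1e12,
    }


daily_bytes = 2.5e18  # 2.5 exabytes, the estimate quoted in the text
rates = average_rate(daily_bytes)
print(f"{rates['terabytes_per_second']:.1f} TB/s")
print(f"{rates['terabits_per_second']:.1f} Tbit/s")
```

Under these assumptions the average works out to roughly 29 TB/s, or about 231 Tbit/s.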
Big Data Infrastructure Considerations
Traditional business data storage and management technologies include relational database management systems (RDBMS), network-attached storage (NAS), storage-area networks (SANs), data warehouses (DWs), and business intelligence (BI) analytics.
Traditional data warehouse and BI analytics systems tend to be highly centralized within an enterprise infrastructure. These often include a central data repository with an RDBMS, high-performance storage, and analytics software, such as online analytical processing (OLAP) tools for mining and visualizing data.
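As a rough illustration of the kind of query an OLAP tool answers, the sketch below performs a "roll-up": it sums a measure (revenue) while keeping only selected dimensions. The sales records, dimension names, and the `roll_up` helper are all hypothetical and use only the Python standard library; real OLAP products work against a data warehouse and are far more capable.

```python
from collections import defaultdict

# Hypothetical point-of-sale facts: (region, product, quarter, revenue)
SALES = [
    ("East", "router", "Q1", 1200.0),
    ("East", "switch", "Q1", 800.0),
    ("West", "router", "Q1", 950.0),
    ("East", "router", "Q2", 1400.0),
    ("West", "switch", "Q2", 700.0),
]
DIMENSIONS = ("region", "product", "quarter")


def roll_up(facts, keep):
    """Sum revenue over every dimension that is not listed in `keep`."""
    idx = [DIMENSIONS.index(name) for name in keep]
    totals = defaultdict(float)
    for *dims, revenue in facts:
        key = tuple(dims[i] for i in idx)
        totals[key] += revenue
    return dict(totals)


by_region = roll_up(SALES, ["region"])
by_region_quarter = roll_up(SALES, ["region", "quarter"])
print(by_region)
print(by_region_quarter)
```

Keeping fewer dimensions gives a coarser summary (revenue per region); keeping more gives a finer one (revenue per region per quarter).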
analytics
Analysis of massive amounts of data, particularly with a focus on decision making.
Increasingly, big data applications are becoming a source of competitive value for businesses, especially those that aspire to build data products and services to profit from the huge volumes of data that they capture and store. There is every indication that the exploitation of data will become increasingly important to enterprises in the years ahead as more and more businesses reap the benefits of big data applications.
Big Data Networking Example
To get some feel for the networking requirements of a typical big data system, consider the example ecosystem of Figure 2.3 (compare with Figure 1.1 in Chapter 1).
FIGURE 2.3 Big Data Networking Ecosystem
Key elements within the enterprise include the following:
Data warehouse: The DW holds integrated data from multiple data sources, used for reporting and data analysis.
Data management servers: Large banks of servers serve multiple functions with respect to big data. The servers run data analysis applications, such as data integration tools and analytics tools. Other applications integrate and structure data from enterprise activity, such as financial data, point-of-sale data, and e-commerce activity.
Workstations / data processing systems: Other systems involved in the use of big data applications and in the generation of input to big data warehouses.
Network management server: One or more servers responsible for network management, control, and monitoring.
Not shown in Figure 2.3 are other important network devices, including firewalls, intrusion detection/prevention systems (IDS/IPS), LAN switches, and routers.
The enterprise network can involve multiple sites distributed regionally, nationally, or globally. Depending on the nature of the big data system, an enterprise can also receive data from other enterprise servers, from dispersed sensors and other devices in an Internet of Things, and multimedia content from content delivery networks.
The networking environment for big data is complex. The impact of big data on an enterprise’s networking infrastructure is driven by the so-called three V’s:
- Volume (growing amounts of data)
- Velocity (increasing speed in storing and reading data)
- Variety (growing number of data types and sources)
These drivers raise several areas of concern for the network infrastructure:
Network capacity: Running big data analytics requires substantial capacity on its own; the issue is magnified when big data and day-to-day application traffic are combined over an enterprise network.
Latency: The real-time or near-real-time nature of big data demands a network architecture with consistently low latency to achieve optimal performance.
Storage capacity: Massive amounts of highly scalable storage are required to address the insatiable appetite of big data, yet these resources must be flexible enough to handle many different data formats and traffic loads.
Processing: Big data can add significant pressure on computational, memory, and storage systems, which, if not properly addressed, can negatively impact operational efficiency.
Secure data access: Big data projects combine sensitive information from many sources, such as customer transactions, GPS coordinates, and video streams, all of which must be protected from unauthorized access.
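The network capacity concern can be made concrete with a back-of-the-envelope calculation. The sketch below estimates how long a bulk data set takes to cross a link when big data traffic gets only a fraction of the bandwidth because day-to-day application traffic uses the rest. The 10 TB data set, the 10 Gbit/s link, the 40 percent share, and the function name `transfer_time_seconds` are assumptions chosen for illustration; protocol overhead and latency are ignored.

```python
def transfer_time_seconds(data_bytes, link_bits_per_second, share=1.0):
    """Time to move data over a link when only `share` of it is available."""
    if not 0 < share <= 1:
        raise ValueError("share must be in (0, 1]")
    usable_bits_per_second = link_bits_per_second * share
    return data_bytes * 8 / usable_bits_per_second


TB = 1e12    # decimal terabyte, in bytes
GBPS = 1e9   # gigabit per second, in bits per second

dataset = 10 * TB  # hypothetical analytics data set

# Link dedicated to the big data transfer
dedicated = transfer_time_seconds(dataset, 10 * GBPS)
# Same link, but only 40 percent is left after other application traffic
shared = transfer_time_seconds(dataset, 10 * GBPS, share=0.4)

print(f"dedicated link: {dedicated / 3600:.2f} hours")
print(f"shared link:    {shared / 3600:.2f} hours")
```

With these illustrative numbers the transfer takes about 2.2 hours on a dedicated link and about 5.6 hours on the shared one, which shows why combining big data and everyday traffic on one network magnifies the capacity problem.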