Blog 02 | Summary of week 1: Introduction to Big Data and Business Intelligence
Lecture Summary
During week 1 of this program, we went over the terms business intelligence, big data, analytics, and data science. Furthermore, we studied the paradigm shifts caused by big data and how we can harness its power.
The paradigm shifts caused by big data:
1. Datafication
The term datafication means that every activity we do is, or can be, tracked as small pieces of data. When these pieces are looked at collectively, one can see patterns and make observations. Companies can leverage these to better understand the needs of their audience, measure performance, and make sound business decisions.
2. Rich in semantics and highly dynamic content and interactions
This refers to the variety of data being produced, which is not limited to database transactions but also includes social media interactions, video and photo data, and more. The content is therefore both dynamic and varied in type.
3. Billions of users and massive traces of human activities. “N = ALL”
Previously, we had to sample data to understand and analyze behaviors or patterns. Now, with such vast amounts of data and users, we can work with the whole population directly, eliminating the need for samples and the sampling errors that come with them.
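To make the "N = ALL" idea a bit more concrete, here is a minimal Python sketch. It is only an illustration: the data is synthetic, and the names (population, sample_size, and so on) are made up for this example and do not come from the lecture. It compares an average estimated from a small sample with the average computed over the whole population.

```python
import random
import statistics

random.seed(42)  # fixed seed so the run is repeatable

# Hypothetical "population": daily minutes online for 100,000 made-up users
population = [random.gauss(120, 30) for _ in range(100_000)]
population_mean = statistics.fmean(population)

# Traditional approach: estimate the mean from a small random sample
sample_size = 100
sample = random.sample(population, sample_size)
sample_mean = statistics.fmean(sample)

# Sampling error is the gap between the sample estimate and the true value
sampling_error = abs(sample_mean - population_mean)

print(f"Population mean (N = ALL): {population_mean:.2f}")
print(f"Sample mean (n = {sample_size}): {sample_mean:.2f}")
print(f"Sampling error: {sampling_error:.2f}")
```

When the full population is available, the first number can be computed directly and the third number no longer applies.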
Big Data is essentially a treasure trove of information/data.
The three V’s of big data are:
1. Volume: the vast amounts of data being generated
2. Velocity: the speed of data generation
3. Variety: Structured and unstructured data types
Business Intelligence (BI) refers to the applications, tools, technologies, and techniques used to gather, store, and analyze data in order to provide actionable insights that help organizations make sound business decisions, measure and manage their performance, and continuously evolve.
BI has evolved from its traditional architecture, which mostly used transactional data, to a more modern architecture that not only takes transactional data into account but also augments it with data from nontraditional sources.
BI Implementation: iterative life cycle
Source: Lecture 02 Slide 25
Applications of BI are widespread, ranging from neonatal care, trading advantage, and customer retention to traffic control, fraud prevention, and much more.
Article: https://www.techtarget.com/searchdatamanagement/feature/Top-trends-in-big-data-for-2021-and-beyond
This article talks about how organizations that utilize big data are realizing its benefits ranging from increased efficiency to optimization of products and services provided to their customers. This has resulted in increased innovation leading to the emergence of new techniques, practices, and evolved approaches to big data technologies.
In addition, it discusses four major trends in big data:
1. Edge Computing
· Edge computing is where data preprocessing is handled on the device itself before the data is sent to the servers.
· It optimizes performance and storage by reducing the need for data to flow through the networks. This reduces computing and processing costs, helps speed up data analysis, and provides faster responses to users.
2. Cloud and Hybrid cloud computing
· To work with the increased amounts of data being generated, organizations are spending their resources on storing data in a range of cloud-based systems that are better equipped for all the V’s of big data. Industries bound by technical limitations or heavy regulations can use more regulation-friendly infrastructures, along with hybrid approaches that combine third-party cloud systems with on-premise computing and storage to meet specific infrastructural needs.
· Advancements (current and future) in public and hybrid cloud infrastructures have enabled organizations to take advantage of cloud computing, where they can store, process, and analyze data by transferring responsibility to the cloud provider while increasing their data handling capabilities.
3. Data Lake
· Instead of trying to centralize data storage in a data warehouse, which requires complex and time-sensitive ETL (extract, transform, load) processes, organizations are moving to a new data architecture approach that allows them to handle the challenges that come with the three V’s of big data.
· Data lakes store structured and unstructured datasets in their native formats, transferring the responsibility of transformation and processing to the endpoints having different data needs. They can also provide shared services for data preprocessing and analytics.
4. Adoption of advanced analytics, machine learning, and AI technologies
· The use of machine learning and AI technologies is revolutionary for big data analytics.
· Enterprises can process and analyze vast amounts of data to gain deeper insights into patterns, provide better customer service, optimize processes, and innovate in data visualization to better understand their data.
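To illustrate the first trend above (edge computing), here is a rough Python sketch of preprocessing on the device before anything is sent to a server. This is not from the article: the function summarize_on_device, the threshold value, and the temperature readings are all hypothetical and only meant to show the idea of sending a small summary instead of every raw reading.

```python
import statistics

def summarize_on_device(readings, threshold):
    """Reduce raw sensor readings to a small summary before upload.

    Hypothetical helper for illustration only.
    """
    return {
        "count": len(readings),
        "mean": statistics.fmean(readings),
        "max": max(readings),
        # Keep only the unusual readings so the server can still react to them
        "alerts": [r for r in readings if r > threshold],
    }

# Made-up temperature readings collected on the device
raw_readings = [21.0, 21.5, 22.0, 35.5, 21.8, 22.1]

# Only this summary would travel over the network, not the raw readings
payload = summarize_on_device(raw_readings, threshold=30.0)
print(payload)
```

With only six readings the saving is small, but the summary stays the same size however many raw readings the device collects, which is where the reduction in network traffic would come from.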
Very nice summary of week 1, and thank you for recommending the article on 7 Unusual Uses of Big Data (https://datafloq.com/read/7-unusual-uses-of-big-data/). The animal migration one was very interesting, and I think my friend mentioned that she did a Tableau assignment on animal migration. I was impressed with the range of articles that our classmates have posted, especially the ones about sports.
Thanks for sharing the four major trends in big data too. I also went down a rabbit hole to learn more about data lakes and then stopped when 'they' started to mention data swamps and their negative connotation [when swamps are actually good ecosystems]. sigh.
See you in class! -Kat Francisco
Dear Kat,
Thank you for taking the time to read my summary and the article.
I agree, I love the range of interesting topics that are posted for the class.
It's hard not to hyperfocus on something new. I do that all the time, even in my daily life situations hehe
See you in class :)