No discussion of Big Data is complete without discussing Data Science and its relation to Big Data. Data Science can be considered the extraction of knowledge from large volumes of data that are structured (e.g. RDBMS, Excel) or unstructured (e.g. emails, videos, photos, social media, and other user-generated content). It may also be considered a continuation of the fields of data mining and predictive analytics.
According to the concept of the 3Vs, BigData is data that may be very large (Volume), may arrive very fast for processing, as with continuous streaming data (Velocity), and may be very diverse, such as structured data, unstructured data, NoSQL database data, etc. (Variety). Data is the most important part of BigData; without data, there is no BigData. So we will discuss how data is generated, the types of data, where the data is stored, and the various challenges of managing and processing it.
Learning the techniques and concepts of BigData is always good before learning any BigData-related technology like Hadoop. It gives you a fair idea of how things fit together. This is just a quick introduction to BigData concepts such as definitions, applications, and differences from small data.
This note can be used as a quick primer or a quick refresher for BigData concepts. For detailed learning, you may refer to the reference notes or tutorials mentioned.
Though it says interview questions, this page lists questions that can also be used to test your understanding of the basics of BigData and Hadoop, and of the component technologies that make up the Hadoop technology stack. It doesn’t go deep into any component of the stack. Having the bigger picture and knowing how the components fit together will help you choose the right component and use it in the right way.
This is a personal technical blog where we share our understanding of various concepts. It is neither an official page nor documentation for the products described here, nor does it represent the official views of the companies we work with.
Keywords used in this website are trademarks of their respective owners. This website is not affiliated with Oracle™ and/or any of the JEE frameworks like Spring™, Struts™, Hibernate™ and JSF™.
All content and materials are provided freely without any warranty or liability, and nothing within the site should be considered professional advice. If in doubt, please ask, and we will try to help you based on our knowledge. Please let us know if you feel anything is not right here (including any copyright violation) and we will act upon it as fast as we can.