Question

The Big Data space is a rapidly growing and evolving arena. From predictive analytics and virtualization on the user end, to NoSQL and distribution methods on the back end, there are quite a lot of technologies to encompass. First, read this brief article overview of new technologies: http://www.forbes.com/sites/gilpress/2016/03/14/top-10-hot-big-data-technologies/#3c7eba937f26 Pick two of the technologies listed on this trajectory graphic in the article and consider the following: What does this technology do? How is it unique? Which problems is it trying to solve? How does it interact with other big data technologies? What companies/services are available that provide it? How mature is this technology?

Explanation / Answer

Predictive analytics

Predictive analytics includes a variety of statistical techniques from predictive modelling, machine learning, and data mining that analyse current and historical facts to make predictions about future or otherwise unknown events. The patterns found in historical and transactional data can be used to identify future risks and opportunities.
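As a rough illustration of predicting from historical data, here is a minimal sketch in Python that fits a straight line to past values by ordinary least squares and extrapolates one step ahead. The data (`months`, `units_sold`) and the helper names `fit_line` and `predict` are made up for this example; real predictive analytics products use far richer models.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Evaluate the fitted line at a new x value."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical historical data: month number and units sold in that month.
months = [1, 2, 3, 4, 5]
units_sold = [100, 120, 140, 160, 180]

model = fit_line(months, units_sold)
forecast = predict(model, 6)  # estimate for a month that has not happened yet
print(forecast)
```

The pattern learned from months 1 to 5 (sales rising by a fixed amount each month) is what drives the estimate for month 6.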

Uniqueness: it is unique because it uses all the available data to make predictions, i.e., it tries to judge something that has not happened yet. Doing so can save companies a lot of money and time, and can save lives too (medical assistance).

Problem to be solved —

Interaction —

Companies —

Maturity—

This technology is very mature. Thousands of organisations currently use it: companies use it to improve sales of a product, weather forecasting relies on it, and doctors use it for testing the life of a drug. Because it is used in almost every field, it is a very mature technology.

Distributed file stores

A distributed file store is a computer network in which data is stored in multiple places or servers to improve the performance of the system by providing parallel access to the data. Generally, several servers hold the same data, i.e., replicas of the original data. Replication also prevents data loss if any of the servers fails due to human error or natural disaster.
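The replication idea can be sketched with a toy in-memory model in Python. The class `ReplicatedStore`, its methods, and the server names are hypothetical and exist only for this illustration. For simplicity it copies every value to every server, whereas real systems usually keep a fixed number of replicas per block (HDFS, for instance, defaults to three).

```python
class ReplicatedStore:
    """Toy model: every write is copied to all servers that are up."""

    def __init__(self, server_names):
        # One dictionary per server stands in for that server's disk.
        self.servers = {name: {} for name in server_names}
        self.down = set()

    def write(self, key, value):
        for name, storage in self.servers.items():
            if name not in self.down:
                storage[key] = value

    def fail(self, name):
        """Simulate a server failure."""
        self.down.add(name)

    def read(self, key):
        # Any live replica can answer the read.
        for name, storage in self.servers.items():
            if name not in self.down and key in storage:
                return storage[key]
        raise KeyError(key)


store = ReplicatedStore(["server-a", "server-b", "server-c"])
store.write("report.txt", "quarterly numbers")
store.fail("server-a")  # one server is lost
recovered = store.read("report.txt")  # still served by a surviving replica
print(recovered)
```

Because any live replica can answer a read, the same structure that protects against failure also allows reads to be spread across servers in parallel.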

Uniqueness: it is unique for two main reasons. First, it creates multiple copies of the same data to safeguard the data from server failures. Second, it speeds up the processes that depend on that data, mainly websites. For example, Yahoo uses 40000 servers for this technology.

Problems to be solved—

Interaction:

It interacts with most of the other technologies. For Data Integration Tools, it provides the data that must be linked or accessed using technologies such as MongoDB (NoSQL). For Data Quality tools, keeping many replicas helps retain the quality of the original information and makes checking for errors convenient.

Companies:

Maturity:

It is also a mature technology. There are millions of websites on the internet today, and hosting companies provide distributed file systems to them to safeguard their data and improve performance. For example, Google, which runs one of the largest search engines, uses this technology, and it has been more than 16 years since Google started using it.