

Question

Please provide detailed answers to all the questions below. Thank you so much.

1. Provide some examples of transforming unstructured data into structured data. Can structured data be transformed into unstructured data? Explain it.

2. From a management perspective, comment on how managers should understand the distinction between these two structural categories. Provide examples to support your answer.

3. Examine Web sites that offer data for financial markets, stocks, bonds, etc., such as https://finance.yahoo.com. What are some issues of data inconsistency that can occur when examining stock data for your favorite publicly traded companies?

4. Referring to the firm’s attempts to design their quiz application, comment on some issues that could occur between the back-end database development team and the front-end design team.

5. What are the potential costs of implementing a database system?

Explanation / Answer

1. Unstructured data is data that has no standard format and must be processed before it is useful for a particular purpose. In practice, unstructured data is converted into structured data by applying methods that extract the useful information from it. A few examples:

* For textual data, there are established methods for extracting information from unstructured text. One approach is to convert words into a representation that is easier to compute with, such as word2vec. word2vec is trained on a large amount of text and represents each unique word in it as a numeric vector. Vectors are easier for computers to process, so the unstructured text now has a definite structure.

* Another example comes from music technology. There are big data projects that process songs, extract particular features from them (such as pitch and beats per minute), and store those features in a database for later use. The unstructured audio is thereby converted into a structured format in which each song is represented by values such as pitch and bpm.
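
A third, simpler kind of transformation is pulling named fields out of free-form text. The sketch below is only an illustration: the note wording, the `OrderRecord` fields, and the regular expression are all assumptions made up for this example, not part of the original question.

```python
import re
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class OrderRecord:
    """Structured form of a free-text order note (hypothetical fields)."""
    order_id: int
    quantity: int
    item: str
    unit_price: Decimal


# Assumed note format: "... order <id> ... <n> unit(s) of <item> at $<price> ..."
NOTE_PATTERN = re.compile(
    r"order\s+(?P<order_id>\d+)\D+(?P<quantity>\d+)\s+units?\s+of\s+"
    r"(?P<item>[\w ]+?)\s+at\s+\$(?P<price>\d+(?:\.\d{1,2})?)",
    re.IGNORECASE,
)


def parse_note(text: str) -> Optional[OrderRecord]:
    """Return a structured record, or None if the note does not match."""
    match = NOTE_PATTERN.search(text)
    if match is None:
        return None
    return OrderRecord(
        order_id=int(match.group("order_id")),
        quantity=int(match.group("quantity")),
        item=match.group("item"),
        unit_price=Decimal(match.group("price")),
    )


if __name__ == "__main__":
    note = ("Customer called about order 1042: 3 units of blue widget "
            "at $19.99, wants faster shipping.")
    print(parse_note(note))
```

Once the notes are in this form, each record can be stored as a row in a table and queried by order, item, or price. Notes that do not follow the assumed wording simply return `None` and would need to be handled separately.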

Whether the original unstructured data can be recovered from its structured form depends on the type of data and on how much information the transformation kept. For example, if one has the word2vec vector of every word in a text, in order, together with the model's vocabulary, each vector can be mapped back to its word and the original text recovered. If the vectors have been averaged or otherwise aggregated, the text cannot be recovered. Likewise, if one only has the pitch and bpm of a particular song, it is impossible to regenerate the original song from this data.
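
The difference between a reversible and a lossy transformation can be sketched with a toy example. This is not word2vec: it assumes a much simpler encoding in which each word gets an integer ID, and compares it with plain word counts. The function names are made up for illustration.

```python
from collections import Counter
from typing import Dict, List


def build_vocabulary(tokens: List[str]) -> Dict[str, int]:
    """Assign each distinct token an integer ID in order of first appearance."""
    vocab: Dict[str, int] = {}
    for token in tokens:
        vocab.setdefault(token, len(vocab))
    return vocab


def encode_as_ids(tokens: List[str], vocab: Dict[str, int]) -> List[int]:
    """Order-preserving encoding: one ID per token."""
    return [vocab[token] for token in tokens]


def decode_ids(ids: List[int], vocab: Dict[str, int]) -> List[str]:
    """Invert encode_as_ids using the same vocabulary."""
    id_to_token = {idx: token for token, idx in vocab.items()}
    return [id_to_token[idx] for idx in ids]


def encode_as_counts(tokens: List[str]) -> Counter:
    """Summary encoding: keeps how often each word occurs, drops the order."""
    return Counter(tokens)


if __name__ == "__main__":
    first = "the cat saw the dog".split()
    second = "the dog saw the cat".split()

    vocab = build_vocabulary(first)
    ids = encode_as_ids(first, vocab)
    # The ID sequence plus the vocabulary recovers the original sentence.
    print(decode_ids(ids, vocab) == first)
    # Two different sentences share the same counts, so counts alone
    # cannot tell us which sentence we started with.
    print(encode_as_counts(first) == encode_as_counts(second))
```

The ID sequence plays the role of a per-word representation that keeps enough information to go back to the text, while the counts behave like the pitch and bpm of a song: a useful summary from which the original cannot be rebuilt.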

2. It is very important for managers to understand the distinction between structured and unstructured data. Managers should generally understand the following points of distinction between the two categories of data.

* Unstructured data is readily available, while structured data is rarer. For example, one can easily find raw text corpora on the internet, but text that has already been labeled with properties such as tone or sentiment is much harder to find.

* Unstructured data generally needs much more processing before it becomes useful, whereas structured data is usually usable from the start with very little processing. Hence the costs associated with working with unstructured data are higher. For example, in order to work effectively with stock market data, one has to parse the data streams coming from exchanges all over the world, and programmers have to spend time processing this data before it can be used.

3. There are a number of issues that one might face due to data inconsistencies while examining stock data for companies traded on public exchanges. A few of them are as follows:

* Difference in units: Different sources may use different units for their data. For example, Yahoo Finance might list prices in cents while Google Finance lists them in US dollars. Another example is a website that uses one unit for historical prices and a different unit for current prices, making the data harder for the user to interpret.

* Inconsistency in values: Different sources may parse the original data from the stock exchange differently, so the same stock can show different prices on different sites. Acting on an incorrect price can lead to losses.

* Timing inconsistencies: Different sources deliver data with different latencies, so their figures may refer to different moments in time. This makes it hard to cross-check data from multiple sources reliably.
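
The three issues above can be checked for mechanically once quotes from different sources are put into a common form. The following is a minimal sketch under stated assumptions: the `Quote` fields, the unit labels `"USD"` and `"USD_CENTS"`, the source names, and the tolerances are all hypothetical and do not describe how any real site publishes its data.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from decimal import Decimal
from typing import List


@dataclass(frozen=True)
class Quote:
    source: str       # hypothetical source name
    symbol: str
    price: Decimal
    unit: str         # "USD" or "USD_CENTS" (assumed labels)
    as_of: datetime   # timezone-aware time the price refers to


# Conversion factors to a single common unit (US dollars).
UNIT_TO_USD = {"USD": Decimal("1"), "USD_CENTS": Decimal("0.01")}


def price_in_usd(quote: Quote) -> Decimal:
    """Normalize the price so that units no longer differ between sources."""
    return quote.price * UNIT_TO_USD[quote.unit]


def find_inconsistencies(
    a: Quote,
    b: Quote,
    price_tolerance: Decimal = Decimal("0.01"),
    max_skew: timedelta = timedelta(seconds=60),
) -> List[str]:
    """Return labels for the timing and value problems found between two quotes."""
    if a.symbol != b.symbol:
        raise ValueError("quotes are for different symbols")
    issues = []
    if abs(a.as_of - b.as_of) > max_skew:
        issues.append("timing")
    if abs(price_in_usd(a) - price_in_usd(b)) > price_tolerance:
        issues.append("value")
    return issues


if __name__ == "__main__":
    t0 = datetime(2024, 1, 2, 15, 0, tzinfo=timezone.utc)
    in_cents = Quote("site_a", "XYZ", Decimal("12345"), "USD_CENTS", t0)
    in_dollars = Quote("site_b", "XYZ", Decimal("123.45"), "USD",
                       t0 + timedelta(seconds=30))
    print(find_inconsistencies(in_cents, in_dollars))
```

Normalizing units first matters: comparing the raw numbers 12345 and 123.45 would report a large value mismatch even though both quotes describe the same price.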

4. Let us assume that the firm wants to design a quiz application and has assigned two different teams to back-end database development and front-end design. These teams need to coordinate very closely to avoid a number of issues.

* The back-end team might not understand the exact requirements of the front-end team, and might therefore design the schema in such a way that some important piece of data cannot be retrieved efficiently. This would make the application slow.

* The back-end team might need to change the schema later on, which would break front-end code if the back-end and front-end systems are closely coupled. Each change in the back end may then require a corresponding change in the front end.
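
One common way to reduce this coupling is to agree on a stable payload that sits between the database rows and the front end. The sketch below is only an illustration of that idea: the column names (`question_text`, `choices_csv`, `prompt`, `choices_json`) and the payload keys are hypothetical, not taken from the firm's actual design.

```python
import json
from typing import Any, Dict


def payload_from_old_row(row: Dict[str, Any]) -> Dict[str, Any]:
    """Map a row from the assumed original schema to the agreed payload."""
    return {
        "id": row["id"],
        "text": row["question_text"],
        "choices": row["choices_csv"].split(","),
    }


def payload_from_new_row(row: Dict[str, Any]) -> Dict[str, Any]:
    """Map a row from the assumed revised schema to the same payload."""
    return {
        "id": row["question_id"],
        "text": row["prompt"],
        "choices": json.loads(row["choices_json"]),
    }


def render_question(payload: Dict[str, Any]) -> str:
    """Front-end code: depends only on the payload keys, never on columns."""
    lines = [payload["text"]]
    for number, choice in enumerate(payload["choices"], start=1):
        lines.append(f"{number}. {choice}")
    return "\n".join(lines)


if __name__ == "__main__":
    old_row = {"id": 1, "question_text": "Capital of France?",
               "choices_csv": "Paris,Rome,Oslo"}
    new_row = {"question_id": 1, "prompt": "Capital of France?",
               "choices_json": '["Paris", "Rome", "Oslo"]'}
    # Both schemas produce the same rendering, so the front end is unaffected.
    print(render_question(payload_from_old_row(old_row)))
    print(render_question(payload_from_new_row(new_row))
          == render_question(payload_from_old_row(old_row)))
```

With this arrangement a schema change only requires a new mapping function on the back end, and `render_question` stays untouched. The cost is that both teams must agree on the payload up front and keep it documented.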

5. There are a number of costs associated with implementing a brand new database system. Some of them are as follows:

* Hardware costs: Processing large amounts of data generally requires a dedicated database server with a high-performance CPU and a large amount of storage. This increases the hardware costs of the system.

* Personnel costs: An efficient and well-designed database system generally needs the expertise of an experienced database administrator who knows how to structure the data for all the required use cases. This might mean hiring a new engineer, which increases costs.

* Data management costs: One might need to import data from an external source to create a useful database. This requires programmer hours, which ultimately show up as costs for the organization.
