From Book: Text Data Analysis and Management by ChengXiang Zhai and Sean Massung

Question

Ch-3

Exercise 3.1: In what way is NLP related to text mining?

Exercise 3.3: Given a collection of documents for a specific topic, how can we use maximum likelihood estimation to create a topic unigram language model?

Exercise 3.7: A unigram language model as defined in this chapter can take a sequence of words as input and output its probability. Explain how this calculation has strong independence assumptions.

Exercise 3.9: An n-gram language model records sequences of n words. How does the number of possible parameters change if we decided to use a 2-gram (bigram) language model instead of a unigram language model? How about a 3-gram (trigram) model? Give your answer in terms of V, the unigram vocabulary size.

Ch-5

Exercise 5.3: Often, push and pull modes are combined in a single system. Give an example of such an application.

Exercise 5.5: In a future chapter, we will discuss recommender systems. These are systems in push mode that deliver information to users. What are some specific applications of recommender systems? Can you name some services available to you that fit into this access mode?

Exercise 5.7: Design a text information system used to explore musical artists. For example, you can search for an artist’s name directly. The results are displayed as a graph, with edges to similar artists (as measured by some similarity algorithm). Use TIS access mode vocabulary to describe this system and any enhancements you could make to satisfy different information needs.

Ch-6

Exercise 6.1: Here’s a query and document vector. What is the score for the given document using dot product similarity?

d = {1, 0, 0, 0, 1, 4}, q = {2, 1, 0, 1, 1, 1}

Exercise 6.3: Let d be a document in a corpus. Suppose we add another copy of d to the collection. How does this affect the IDF of all words in the corpus?

Exercise 6.6: If you perform stemming on words in V to create V′, then |V′| > |V|. True or false? Why?

Ch-7

Exercise 7.1: How should you set the Rocchio parameters α, β, and γ depending on what type of feedback you are using? That is, should the parameters be set differently if you are using pseudo feedback compared to user-supplied relevance judgements? What about implicit feedback through clickthrough data?

Exercise 7.9: Design a heuristic to automatically determine the best _ for mixture model feedback on a query-by-query basis. You could look at the query itself, the number of matching documents, or the distribution of ranking scores in the original results. Test your heuristic by doing experiments.

Explanation / Answer

In what way is NLP related to text mining?

Natural language processing (NLP) deals with the automatic processing and analysis of unstructured textual information. One direction of NLP research relies on statistical techniques, typically involving the processing of words found in texts [7]. Another approach uses rule-based techniques, leveraging knowledge resources such as ontologies, taxonomies, and linguistic rule bases.

Statistical human language processing systems require collections of training material that exemplify the desirable (and/or undesirable) relationships and dependencies, and any subsequent modification of the system requires some degree of retraining. Rule-based techniques need no training material; instead they require knowledge in the form of online dictionaries and established linguistic theories, and they can leverage existing classification systems or taxonomic frameworks. NLP applications may use either or both techniques, and the choice often depends on the availability of training materials, external resources, and the text analysis tasks the application must perform.
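As an illustration of the statistical direction, the unigram language model from Exercise 3.3 can be estimated by maximum likelihood: count each word in the topic's documents and divide by the total word count. The following is only a minimal Python sketch, assuming lowercasing and whitespace tokenization; the function names `unigram_mle` and `sequence_probability` and the toy documents are hypothetical and not taken from the book.

```python
from collections import Counter


def unigram_mle(documents):
    """Maximum likelihood estimate: p(w) = count(w) / total word count.

    Assumption: tokens are lowercased and split on whitespace.
    """
    counts = Counter()
    for doc in documents:
        counts.update(doc.lower().split())
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def sequence_probability(model, words):
    """Probability of a word sequence under the unigram model.

    Each word is treated as independent of every other word, so the
    sequence probability is just the product of per-word probabilities.
    Unseen words get probability 0 because no smoothing is applied.
    """
    prob = 1.0
    for word in words:
        prob *= model.get(word, 0.0)
    return prob


# Toy topic collection (made up for illustration).
topic_docs = ["text mining uses text", "mining text data"]
model = unigram_mle(topic_docs)

p_text = model["text"]  # 3 occurrences out of 7 words
p_seq = sequence_probability(model, ["text", "mining"])  # (3/7) * (2/7)
```

Because the product ignores word order and context, "text mining" and "mining text" receive the same probability, which is the strong independence assumption that Exercise 3.7 asks about.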

Often, push and pull modes are combined in a single system. Give an example of such an application.

Definitions are very important to create context for this article. A supply chain is, at a minimum, a network of a business, its suppliers, and its customers. Thus, supply chain management by a particular business is the management of human capital, processes, materials, and information between that business, its suppliers, and its customers that ensures maximum customer service at maximum margin to that business. Importantly, while all participants in a supply chain can benefit from improvements to the functioning of the supply chain as a whole, rarely do they benefit equally.

Customer or demand push is usually defined as a business response in anticipation of customer demand, and customer or demand pull as a response to actual customer demand. However, from a whole supply chain viewpoint, deciding whether a particular supply chain is push or pull is often difficult and generally depends on the perspective of what constitutes the supply chain and where particular participants are placed in the chain. For example, the manufacture of Toyota automobiles is heralded as a leading example of a demand-driven supply chain.

However, the mining of the iron ore or operation of blast furnaces that process the iron ore for ultimate manufacture of automobiles is not. At some point in most supply chains, in their widest sense, demand push meets demand pull, and at this point inventory accumulates. This point is referred to as the push-pull interface or as the supply chain decoupling point.
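Returning to Exercise 6.1 in the question above, the dot product score is the sum of the element-wise products of the document and query vectors. The short Python sketch below checks the arithmetic, using the vectors as given in the exercise; the helper name `dot_product` is hypothetical.

```python
def dot_product(doc_vec, query_vec):
    """Sum of element-wise products of two equal-length vectors."""
    if len(doc_vec) != len(query_vec):
        raise ValueError("vectors must have the same length")
    return sum(d_i * q_i for d_i, q_i in zip(doc_vec, query_vec))


d = [1, 0, 0, 0, 1, 4]
q = [2, 1, 0, 1, 1, 1]

# 1*2 + 0*1 + 0*0 + 0*1 + 1*1 + 4*1 = 7
score = dot_product(d, q)
```

Only the dimensions where both vectors are nonzero (the first, fifth, and sixth) contribute to the score.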
