
Given a training set of 10 million tuples with 10 attributes each taking 8 bytes


Question

Given a training set of 10 million tuples with 10 attributes, each taking 8 bytes of space. One attribute is a class label with two distinct values, whereas each of the other attributes has 50 distinct values. Assume your machine cannot hold the whole dataset in main memory. Outline a method that constructs a Naive Bayes classifier efficiently, and answer the following questions explicitly:

(i) How many scans of the database does your algorithm take?

Answer:    

(ii) What is the maximum memory space your algorithm will use during induction?

Answer:
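For parts (i) and (ii), here is a minimal sketch of the standard one-scan, count-based construction, assuming the tuples can be streamed from disk block by block (all function and variable names below are illustrative, not part of the question):

    from collections import defaultdict

    def train_naive_bayes(tuple_stream):
        """Single scan over the data: accumulate class counts and per-attribute
        value counts conditioned on the class. Only these small count tables are
        kept in memory (2 class counts plus 9 attributes x 50 values x 2 classes
        = 900 conditional counts), no matter how many tuples sit on disk."""
        class_counts = defaultdict(int)
        cond_counts = defaultdict(int)          # key: (attr_index, value, class)
        for attrs, label in tuple_stream:       # attrs = the 9 non-class attributes
            class_counts[label] += 1
            for i, v in enumerate(attrs):
                cond_counts[(i, v, label)] += 1
        return class_counts, cond_counts

    def predict(attrs, class_counts, cond_counts, n_values=50):
        """Pick argmax_c P(c) * prod_i P(attr_i | c), with Laplace smoothing."""
        total = sum(class_counts.values())
        best_class, best_score = None, 0.0
        for c, n_c in class_counts.items():
            score = n_c / total
            for i, v in enumerate(attrs):
                score *= (cond_counts[(i, v, c)] + 1) / (n_c + n_values)
            if score > best_score:
                best_class, best_score = c, score
        return best_class

Under this sketch the training phase reads the database exactly once, and the working memory is dominated by the small count tables rather than by the roughly 10M x 10 x 8-byte (~800 MB) dataset.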

(b) [6] For each of the following measures, give a situation in which it is most appropriate for measuring the quality of classification:

(1) sensitivity

Answer:


(2) specificity

Answer:

(3) ROC curve

Answer:   

(c) [5] People say that if each classifier is better than a random guess, an ensemble of multiple such classifiers will lead to a nontrivial increase in classification accuracy. Do you agree with this statement? Give your reasoning.

Answer:   

(d) [5] What are the similarities and differences between semi-supervised classification and active learning?

Answer:   

(e) [5] Suppose one would like to work out a model to classify U. of Michigan webpages based on the model you have learned from the UIUC website. Is it easy to do this by transfer learning? How would you suggest the person proceed?

Answer:  

Explanation / Answer

I know some of the answers, not all...

(b) For each of the following measures, give a situation in which it is most appropriate for measuring the quality of classification:

(1) Sensitivity -> Sensitivity measures the proportion of positives that are correctly identified as such, e.g., the percentage of sick people who are correctly identified as having the condition. The calculation of sensitivity does not take indeterminate test results into account; if a test cannot be repeated, indeterminate samples should either be excluded from the analysis or treated as false negatives.

(2) Specificity -> Specificity measures the proportion of negatives that are correctly identified as such, e.g., the percentage of healthy people who are correctly identified as not having the condition.

Specificity relates to the test's ability to correctly detect patients without a condition. Consider the example of a medical test for diagnosing a disease: the specificity of the test is the proportion of healthy patients, known not to have the disease, who will test negative for it.
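As a small illustration of the two definitions above, here is a sketch that computes both measures from true and predicted labels (the convention 1 = sick, 0 = healthy and all names are mine, purely for illustration):

    def sensitivity_specificity(y_true, y_pred):
        """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
        tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
        return tp / (tp + fn), tn / (tn + fp)

    # Example: 10 patients, 4 truly sick; 3 of the sick and 5 of the healthy
    # are classified correctly, so sensitivity = 0.75 and specificity = 5/6.
    y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]
    print(sensitivity_specificity(y_true, y_pred))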

(3) ROC Curve -> An ROC (receiver operating characteristic) curve is a graphical plot that illustrates the performance of a binary classifier system as its discrimination threshold is varied. The curve is created by plotting the true positive rate against the false positive rate at various threshold settings. The true positive rate is also known as sensitivity.

ROC analysis provides tools to select possibly optimal models and to discard suboptimal ones independently of the cost context or the class distribution. ROC analysis is related in a direct and natural way to cost/benefit analysis of diagnostic decision making.
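Here is a minimal sketch of how those ROC points are generated by sweeping the threshold over classifier scores (assuming the scores are estimated probabilities of the positive class; the function name is illustrative):

    def roc_points(y_true, scores):
        """Sweep the decision threshold over every observed score and record
        one (false positive rate, true positive rate) point per threshold."""
        pos = sum(y_true)
        neg = len(y_true) - pos
        points = [(0.0, 0.0)]
        for thr in sorted(set(scores), reverse=True):
            tp = sum(1 for t, s in zip(y_true, scores) if t == 1 and s >= thr)
            fp = sum(1 for t, s in zip(y_true, scores) if t == 0 and s >= thr)
            points.append((fp / neg, tp / pos))
        return points   # runs from (0, 0) up to (1, 1) as the threshold drops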

(c) If each classifier is better than a random guess, an ensemble of multiple such classifiers will lead to a nontrivial increase in classification accuracy.

The success of an ensemble system - that is, its ability to correct the errors of some of its members - rests squarely on the diversity of the classifiers that make up the ensemble. After all, if all classifiers provided the same output, correcting a possible mistake would not be possible. Therefore, individual classifiers in an ensemble system need to make different errors on different instances.

The intuition, then, is that if each classifier makes different errors, then a strategic combination of these classifiers can reduce the total error, a concept not too dissimilar to low pass filtering of the noise. Specifically, an ensemble system needs classifiers whose decision boundaries are adequately different from those of others. Such a set of classifiers is said to be diverse. Classifier diversity can be achieved in several ways. Preferably, the classifier outputs should be class-conditionally independent, or better yet negatively correlated.

Ensemble methods come with built-in combination rules, such as simple majority voting for bagging, weighted majority voting for AdaBoost, a separate combiner classifier for stacking, etc. An ensemble of classifiers can be trained simply on different subsets of the training data, with different parameters of the classifiers, or even with different subsets of features, as in random subspace models.
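A short sketch of the simplest of these combination rules, plain majority voting over base classifiers trained on bootstrap samples (bagging); train_fn and the returned models are placeholders, not a specific library API:

    import random
    from collections import Counter

    def bagging_predict(train_fn, data, labels, x, n_models=11):
        """Train n_models base classifiers on bootstrap samples of the training
        data, then combine their predictions for x by simple majority vote."""
        n = len(data)
        votes = []
        for _ in range(n_models):
            idx = [random.randrange(n) for _ in range(n)]   # bootstrap sample
            model = train_fn([data[i] for i in idx], [labels[i] for i in idx])
            votes.append(model(x))                          # model is a callable
        return Counter(votes).most_common(1)[0][0]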

(d) What are the similarities and differences between semi-supervised classification and active learning?

Similarities

Active and semi-supervised learning are important techniques when labeled data are scarce. We combine the two under a Gaussian random field model. Labeled and unlabeled data are represented as vertices in a weighted graph, with edge weights encoding the similarity between instances. The semi-supervised learning problem is then formulated in terms of a Gaussian random field on this graph, the mean of which is characterized in terms of harmonic functions. Active learning is performed on top of the semi-supervised learning scheme by greedily selecting queries from the unlabeled data to minimize the estimated expected classification error; in the case of Gaussian fields the risk is efficiently computed using matrix methods. We present experimental results on synthetic data, handwritten digit recognition, and text classification tasks. The active learning scheme requires a much smaller number of queries to achieve high accuracy compared with random query selection.
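A compact sketch of the harmonic-function step described above, assuming the graph is given as a symmetric affinity matrix W and the labels are binary (numpy only; the index handling is illustrative):

    import numpy as np

    def harmonic_solution(W, labeled_idx, unlabeled_idx, y_labeled):
        """Mean of the Gaussian random field on the unlabeled vertices:
        solve L_uu f_u = W_ul f_l, where L = D - W is the graph Laplacian."""
        D = np.diag(W.sum(axis=1))
        L = D - W
        L_uu = L[np.ix_(unlabeled_idx, unlabeled_idx)]
        W_ul = W[np.ix_(unlabeled_idx, labeled_idx)]
        f_u = np.linalg.solve(L_uu, W_ul @ np.asarray(y_labeled, dtype=float))
        return f_u   # soft labels in [0, 1]; threshold at 0.5 for hard decisions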

Differences

1)

Semi-supervised learning targets the common situation where labeled data are scarce but unlabeled data are abundant. Under suitable assumptions, it uses unlabeled data to help supervised learning tasks. Various semi-supervised learning methods have been proposed and show promising results.

Active learning, by contrast, is often used in conjunction with semi-supervised learning; that is, we might allow the learning algorithm to pick a set of unlabeled instances to be labeled by a domain expert.
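A sketch of that query loop; note the selection rule here is simple uncertainty sampling rather than the expected-error criterion quoted above, just to keep the example short (train_fn, oracle, and the rest are placeholders):

    def active_learning_loop(train_fn, pool, oracle, labeled, budget=10):
        """Repeatedly train on the current labeled set, pick the unlabeled
        instance the model is least certain about, ask the oracle (domain
        expert) for its label, and move it into the labeled set."""
        for _ in range(budget):
            model = train_fn(labeled)               # model(x) -> P(class 1 | x)
            query = min(pool, key=lambda x: abs(model(x) - 0.5))
            labeled.append((query, oracle(query)))
            pool.remove(query)
        return train_fn(labeled)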

(e) If one would like to work out a model to classify U. of Michigan webpages based on the model learned from the UIUC website: it is not easy to do directly.
