Question
Use RStudio and the data provided to evaluate the variables/factors that may affect the quality of life (such as fitness level, happiness, stress, productivity).
1. Come up with 5 hypotheses about factors that affect the daily quality of life.
2. Clean the data. Data can be found: https://drive.google.com/file/d/0B1D66qK8jxd0bUhMRHItLS1QQzA/view?usp=sharing
3. Design and create a set of visualizations within a dashboard/storyboard that provides insight into validating the hypotheses.
Requirements:
- must outline 5 questions that can be evaluated using a data-driven approach
- with at minimum 1 interactive graphical element
Explanation / Answer
Conceptually, factors are variables in R that take on a limited number of different values; such variables are often referred to as categorical variables. One of the most important uses of factors is in statistical modeling: since categorical variables enter into statistical models differently than continuous variables, storing data as factors ensures that the modeling functions treat such data correctly.
Factors in R are stored as a vector of integer values with a corresponding set of character values to use when the factor is displayed. The factor function is used to create a factor. Its only required argument is a vector of values, which is returned as a vector of factor values. Both numeric and character variables can be made into factors, but a factor's levels will always be character values. You can see the possible levels of a factor with the levels function.
To change the order in which the levels are displayed from their default sorted order, give the levels= argument a vector of all the possible values of the variable in the order you desire. If the ordering should also be used when performing comparisons, add the optional ordered=TRUE argument; the result is known as an ordered factor.
The levels of a factor are used when displaying the factor's values. You can change these levels at the time you create a factor by passing a vector with the new values through the labels= argument. Note that this actually changes the internal levels of the factor; to change the labels of a factor after it has been created, use the assignment form of the levels function. To illustrate this point, consider a factor taking on integer values which we want to display as Roman numerals.
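The code for this step did not survive on the page, so the following is only a minimal sketch of what the paragraph describes. The vector data and its values are made up for illustration; fdata matches the name used in the next sentence, and rdata is a hypothetical name.

```r
# Illustrative integer data (values are an assumption, not from the source)
data <- c(1, 2, 2, 3, 1, 2, 3, 3, 1, 2, 3, 3, 1)

# Default factor: levels are the sorted unique values, held as character strings
fdata <- factor(data)
print(levels(fdata))

# Same data, with display labels supplied when the factor is created
rdata <- factor(data, labels = c("I", "II", "III"))
print(rdata)

# The underlying storage is still a vector of integer codes
print(as.integer(rdata))
```

Printing rdata shows the Roman numerals, while as.integer(rdata) shows that only the integer codes are stored per element.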
To convert the default factor fdata to Roman numerals, we use the assignment form of the levels function:
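A sketch of that assignment, assuming fdata was built from illustrative integer data as below (the values are not from the source):

```r
# Rebuild the default factor so this snippet runs on its own
fdata <- factor(c(1, 2, 2, 3, 1, 2, 3, 3, 1, 2, 3, 3, 1))

# Assignment form of levels(): relabels the existing levels in place, in order
levels(fdata) <- c("I", "II", "III")
print(fdata)
```

The replacement vector is matched to the existing levels by position, so it must list the new labels in the same order as the current levels.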
Factors are a compact way to store character values, because each unique character value is stored only once and the data itself is stored as a vector of integers. For this reason, versions of R before 4.0.0 had read.table automatically convert character variables to factors unless the as.is= argument was specified; since R 4.0.0 the default is stringsAsFactors = FALSE, so character columns stay character unless you request the conversion.
As an example of an ordered factor, consider data consisting of the names of months:
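The month data itself is missing from the page; the sketch below uses a made-up vector of month names (mons is a hypothetical name) to show the starting point, an ordinary unordered factor.

```r
# Illustrative month names (values are an assumption, not from the source)
mons <- c("March", "April", "January", "November", "January", "September",
          "October", "September", "November", "August", "January", "November",
          "November", "February", "May", "August", "July", "December",
          "August", "August", "September", "November", "February", "April")

# Plain factor: levels default to sorted (alphabetical) order
mons <- factor(mons)

# The counts come out alphabetically, not in calendar order
print(table(mons))
```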
Although the months clearly have an ordering, this is not reflected in the output of the table function. Additionally, comparison operators are not supported for unordered factors. Creating an ordered factor solves these problems:
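A minimal sketch of the ordered version, again on made-up month names. It uses R's built-in month.name constant for the calendar order; the variable name mons is an assumption.

```r
# Illustrative month names (values are an assumption, not from the source)
mons <- c("March", "April", "January", "November", "January", "September")

# levels= fixes the calendar order; ordered = TRUE enables comparisons
mons <- factor(mons, levels = month.name, ordered = TRUE)

# Comparison operators now work: March comes before April
print(mons[1] < mons[2])

# table() reports the levels in calendar order (unused months count as 0)
print(table(mons))
```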
While it may be necessary to convert a numeric variable to a factor for a particular application, it is often very useful to convert the factor back to its original numeric values, since even simple arithmetic operations will fail when using factors. Since the as.numeric function will simply return the internal integer values of the factor, the conversion must be done using the levels attribute of the factor.
Suppose we are studying the effects of several levels of a fertilizer on the growth of a plant. For some analyses, it might be useful to convert the fertilizer levels to an ordered factor:
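The fertilizer data is not shown on the page, so this sketch assumes three made-up dose levels (10, 20, 50) and the hypothetical names fert and fert_num. It shows both the ordered factor and the conversion back to the original numbers through the levels.

```r
# Illustrative fertilizer doses (values are an assumption, not from the source)
fert <- c(10, 20, 20, 50, 10, 20, 10, 50, 20)
fert <- factor(fert, levels = c(10, 20, 50), ordered = TRUE)

# as.numeric() alone returns the internal codes (1, 2, 3), not the doses
print(as.numeric(fert))

# Index the numeric levels by the factor to recover the original values
fert_num <- as.numeric(levels(fert))[fert]
print(mean(fert_num))
```

Indexing the levels by the factor uses the integer codes as positions, which is why it returns the original doses rather than 1, 2, 3.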