The file "flowers.csv" contains information on measurements of the iris flowers
Question
The file "flowers.csv" contains information on measurements of the iris flowers. Create an R data frame by the name "flower.data" that contains the data in the file. The following R code shows an example of how to round a vector of numbers to zero decimal places and then calculate some statistics using the rounded numbers. You might need some of the calculations for this assignment, but you might not need others. You would replace example$years with the name of the R object that you want to analyze (in other programming languages, you might call example$years a variable).
> x <- round(example$years, 0)
> freq <- table(x)
> rel.freq <- freq/sum(freq)
> cumsum(rel.freq)
Cumulative Frequency Table for Petal Length
Use the following table to answer tasks 2-4:
1. What is the sum of the first three frequencies in the frequency table for sepal width? _____
2. What does your answer to the previous question represent (in terms of sepal width and frequency and the percentage of all sepal measurements) ____
3. What is the sum of the last three frequencies in the frequency table for sepal width? _____
4. How many flowers in the sample had sepal widths less than 4 (do NOT round the sepal width numbers for this, but you can round your final answer to 3 decimal places)? _________
5. What does the tallest bar in the plot represent?_________
6. Create a frequency table that shows the frequencies for each species of flower in the sample. Paste your R command and output into your answer (do NOT display data from a data frame, display data using the table() command)_________
7. Explain two things about the table that you created for the previous task: Why did the frequency table for flower species contain words in the first row as opposed to numbers?______ What is the meaning of the numbers in the second row of the table? ___________________
Value:                          1    2    3    4    5    6    7
Cumulative Relative Frequency: .16  .33  .35  .58  .81  .97  1.00

Explanation / Answer
library(datasets)
data(iris)
#Make a copy of the dataset iris. We will be working on this copy
iris1 = iris
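The answer works from R's built-in iris data, while the question asks for a data frame named flower.data read from flowers.csv. The file itself is not shown here, so the following is only a sketch under the assumption that flowers.csv has the same columns as iris. It writes a stand-in file to a temporary folder (the csv.path name is illustrative) so that the code runs anywhere.

```r
# Stand-in file so the sketch runs anywhere. Assumption: the real
# flowers.csv has the same columns as R's built-in iris data.
csv.path <- file.path(tempdir(), "flowers.csv")
write.csv(iris, csv.path, row.names = FALSE)

# With the real file in the working directory, this would be
# read.csv("flowers.csv", stringsAsFactors = TRUE).
flower.data <- read.csv(csv.path, stringsAsFactors = TRUE)

dim(flower.data)            # number of rows and columns
names(flower.data)          # column names
class(flower.data$Species)  # a factor, because stringsAsFactors = TRUE
```

If the real file uses different column names, the later commands would need those names in place of Sepal.Width and Species.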
#Top 6 rows of the dataset
head(iris1)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa
#Variable names
names(iris1)
[1] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width" "Species"
#Freq table after rounding the values
freq = table(round(iris1$Sepal.Width,0))
freq
  2   3   4
 19 106  25
#Cumulative Freq Table
rel.freq <- freq/sum(freq)
cumsum(rel.freq)
        2         3         4
0.1266667 0.8333333 1.0000000
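One detail behind the rounded table is worth checking: R's round() sends exact halves to the nearest even number, so a width of 2.5 lands in group 2 while 3.5 lands in group 4. A short sketch using the built-in iris data (the variable names are illustrative):

```r
# round() sends exact halves to the even neighbour
halves <- round(c(0.5, 1.5, 2.5, 3.5), 0)
halves   # 0 2 2 4

w <- iris$Sepal.Width
sum(w == 2.5)   # flowers measured at exactly 2.5, counted in group 2
sum(w == 3.5)   # flowers measured at exactly 3.5, counted in group 4

# Same frequency table as above
freq <- table(round(w, 0))
freq
```

Someone who expects halves always to round up would get different counts for groups 2 and 3, so it helps to know which rule produced the table.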
1. What is the sum of the first three frequencies in the frequency table for sepal width? _____
Ans: The sum is 150. After rounding, only three distinct values remain, so the first three frequencies cover the whole table (19 + 106 + 25 = 150).
2. What does your answer to the previous question represent (in terms of sepal width and frequency and the percentage of all sepal measurements) ____
Ans: After rounding the sepal widths, we observe 19 flowers with sepal width 2, 106 flowers with sepal width 3, and 25 flowers with sepal width 4, i.e. they constitute 12.7%, 70.7%, and 16.7% of the sample respectively. Their sum, 150, is the total number of flowers, which is 100% of the sepal width measurements.
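The percentages quoted in this answer can be reproduced with prop.table(). This is a minimal sketch on the built-in iris data, with pct as an illustrative name:

```r
freq <- table(round(iris$Sepal.Width, 0))

# Share of each rounded width, as a percentage with one decimal place
pct <- round(100 * prop.table(freq), 1)
pct         # 12.7 70.7 16.7

sum(freq)   # 150, the total number of flowers
```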
3. What is the sum of the last three frequencies in the frequency table for sepal width? _____
Ans: The answer is the same as in part 1, i.e. 150, because only three groups are present.
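The sums for tasks 1 and 3 can also be computed rather than read off the table. The sketch below indexes the first and last three entries of the frequency table. The names n, first3 and last3 are illustrative, and the bounds are written so the code still works if a table has fewer than three groups.

```r
freq <- table(round(iris$Sepal.Width, 0))
n <- length(freq)

# Sum of the first three frequencies
first3 <- sum(freq[seq_len(min(3, n))])

# Sum of the last three frequencies
last3 <- sum(freq[seq(max(1, n - 2), n)])

first3   # 150
last3    # 150
```

Both sums equal 150 here only because the rounded table has exactly three groups. With a table that has more groups, the two sums would generally differ.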
4. How many flowers in the sample had sepal widths less than 4 (do NOT round the sepal width numbers for this, but you can round your final answer to 3 decimal places)? _________
Ans: 146
#No of flowers whose sepal width is less than 4
length(which(iris1$Sepal.Width < 4 ))
[1] 146
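An equivalent and slightly shorter way to get this count is to sum the logical vector directly. The task also mentions rounding to 3 decimal places, which only matters if a proportion is wanted, so the sketch shows that as well (n.below and p.below are illustrative names):

```r
# TRUE counts as 1, so sum() gives the number of flowers below 4
n.below <- sum(iris$Sepal.Width < 4)
n.below   # 146

# Proportion of flowers below 4, rounded to 3 decimal places
p.below <- round(mean(iris$Sepal.Width < 4), 3)
p.below   # 0.973
```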
5. What does the tallest bar in the plot represent?_________
Ans: The tallest bar in a frequency bar plot marks the value (or range of values) that occurs most often in the dataset, and its height is that value's frequency.
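The plot from the assignment is not reproduced here. Assuming it is a bar plot of the rounded sepal widths, a sketch like the following draws it and identifies the tallest bar (the axis labels and the names tallest and height are illustrative):

```r
freq <- table(round(iris$Sepal.Width, 0))

# One bar per rounded sepal width
barplot(freq, xlab = "Sepal width (rounded)", ylab = "Frequency")

# Label and height of the tallest bar
tallest <- names(freq)[which.max(freq)]
height <- max(freq)
tallest   # "3"
height    # 106
```

Under that assumption, the tallest bar says that a rounded sepal width of 3 is the most common value, with 106 flowers.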
6. Create a frequency table that shows the frequencies for each species of flower in the sample. Paste your R command and output into your answer (do NOT display data from a data frame, display data using the table() command)_________
Ans:
#Table of frequencies of each flower
table(iris1$Species)
    setosa versicolor  virginica
        50         50         50
7. Explain two things about the table that you created for the previous task: Why did the frequency table for flower species contain words in the first row as opposed to numbers?______ What is the meaning of the numbers in the second row of the table? ___________________
Ans:
The table() command puts the levels of the factor in the first row and the corresponding frequencies in the second row. Because Species is a factor whose levels are text labels (setosa, versicolor, virginica), the first row shows words rather than numbers. The numbers in the second row are the counts of flowers of each species, 50 for each.
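This can be checked directly in R. The sketch below, on the built-in iris data, inspects the Species column and the two parts of the table separately, with a small numeric vector for contrast (species.freq and num.freq are illustrative names):

```r
class(iris$Species)    # "factor"
levels(iris$Species)   # "setosa" "versicolor" "virginica"

species.freq <- table(iris$Species)
names(species.freq)       # first row: the factor levels
as.vector(species.freq)   # second row: 50 50 50

# For contrast, numeric input gives number-like labels in the first row
num.freq <- table(c(3, 3, 4))
names(num.freq)   # "3" "4"
```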