Question

The file "flowers.csv" contains measurements of iris flowers. Create an R data frame named "flower.data" that contains the data in the file. The following R code shows an example of how to round a vector of numbers to zero decimal places and then calculate some statistics using the rounded numbers. You might need some of the calculations for this assignment, but you might not need others. Replace example$years with the name of the R object that you want to analyze (in other programming languages, you might call example$years a variable).

> x <- round(example$years, 0)
> freq <- table(x)
> rel.freq <- freq/sum(freq)
> cumsum(rel.freq)

Cumulative Frequency Table for Petal Length

Use the following table to answer tasks 2-4:

1. What is the sum of the first three frequencies in the frequency table for sepal width? _____

2. What does your answer to the previous question represent (in terms of sepal width and frequency and the percentage of all sepal measurements) ____

3. What is the sum of the last three frequencies in the frequency table for sepal width? _____

4. How many flowers in the sample had sepal widths less than 4 (do NOT round the sepal width numbers for this, but you can round your final answer to 3 decimal places)? _________

5. What does the tallest bar in the plot represent?_________

6. Create a frequency table that shows the frequencies for each species of flower in the sample. Paste your R command and output into your answer (do NOT display data from a data frame, display data using the table() command)_________

7. Explain two things about the table that you created for the previous task: Why did the frequency table for flower species contain words in the first row as opposed to numbers?______ What is the meaning of the numbers in the second row of the table? ___________________

Value:                          1     2     3     4     5     6     7
Cumulative Relative Frequency:  .16   .33   .35   .58   .81   .97   1.00

Explanation / Answer

library(datasets)
data(iris)

# Make a copy of the iris dataset; all work below uses this copy
iris1 <- iris

# Top 6 rows of the dataset
head(iris1)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa

# Variable names
names(iris1)
[1] "Sepal.Length" "Sepal.Width"  "Petal.Length" "Petal.Width"  "Species"

# Frequency table after rounding the values
freq <- table(round(iris1$Sepal.Width, 0))
freq

  2   3   4
 19 106  25

# Cumulative relative frequency table
rel.freq <- freq/sum(freq)
cumsum(rel.freq)
        2         3         4
0.1266667 0.8333333 1.0000000
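
The task asks for a data frame named flower.data read from flowers.csv, whereas the code above works on R's built-in iris data. A minimal sketch of the file-based route follows, assuming flowers.csv has the same columns as iris. Because that file is not available here, the sketch first writes iris to a temporary CSV as a stand-in; the name csv.path is hypothetical.

```r
# Stand-in for flowers.csv: write the built-in iris data to a temporary file.
# With the real file, skip these two lines and point csv.path at it instead.
csv.path <- file.path(tempdir(), "flowers.csv")
write.csv(iris, csv.path, row.names = FALSE)

# Read the file into a data frame named flower.data.
# stringsAsFactors = TRUE keeps Species as a factor, as in the built-in iris.
flower.data <- read.csv(csv.path, stringsAsFactors = TRUE)

# Inspect the structure: column names, types, and the first few values
str(flower.data)
```

With the data loaded this way, flower.data can be used wherever iris1 appears in this answer.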

1. What is the sum of the first three frequencies in the frequency table for sepal width? _____

Ans: The sum is 150. After rounding, only three distinct values (2, 3, and 4) remain, so the first three frequencies make up the whole table: 19 + 106 + 25 = 150.

2. What does your answer to the previous question represent (in terms of sepal width and frequency and the percentage of all sepal measurements) ____

Ans: After rounding the sepal widths, we observe 19 flowers with sepal width 2, 106 flowers with sepal width 3, and 25 flowers with sepal width 4. They constitute 12.7%, 70.7%, and 16.7% of the sample, respectively, so the sum of 150 covers every sepal measurement (100% of the flowers).

3. What is the sum of the last three frequencies in the frequency table for sepal width? _____

Ans: The answer is the same as in part 1, i.e. 150, because the table has only three groups.
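
The sums in parts 1 and 3 and the percentages in part 2 can be checked directly from the frequency table. A small sketch, using the built-in iris data as in the rest of this answer (the names first.three, last.three, and pct are introduced here for illustration):

```r
freq <- table(round(iris$Sepal.Width, 0))
n <- length(freq)

# Sum of the first three frequencies and of the last three frequencies
first.three <- sum(freq[1:3])
last.three <- sum(freq[(n - 2):n])
first.three
last.three

# Share of each rounded width, as a percentage of all flowers
pct <- round(100 * freq / sum(freq), 1)
pct
```

Because the table has exactly three entries, both sums cover the same three cells.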

4. How many flowers in the sample had sepal widths less than 4 (do NOT round the sepal width numbers for this, but you can round your final answer to 3 decimal places)? _________

Ans: 146

       
# Number of flowers whose sepal width is less than 4
length(which(iris1$Sepal.Width < 4))
[1] 146
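
An equivalent and slightly shorter way to get the same count is to sum the logical vector, since TRUE counts as 1. The sketch below also computes the corresponding proportion rounded to three decimal places, in case that is what the rounding hint in the task refers to; that reading is an assumption, since the task only asks how many.

```r
widths <- iris$Sepal.Width

# Count of flowers with an unrounded sepal width below 4
n.below.4 <- sum(widths < 4)
n.below.4

# Proportion of the sample below 4, rounded to 3 decimal places
prop.below.4 <- round(mean(widths < 4), 3)
prop.below.4
```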

5. What does the tallest bar in the plot represent?_________

Ans: The tallest bar in a frequency bar plot marks the value (or range of values) that occurs most often in the dataset.
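
The plot the task refers to is not reproduced here. As an illustration only, the sketch below draws a bar plot of the rounded sepal-width frequencies and looks up which value has the tallest bar. That the task's plot shows these frequencies is an assumption.

```r
freq <- table(round(iris$Sepal.Width, 0))

# Bar plot of the frequency table: one bar per rounded sepal width
barplot(freq, xlab = "Sepal width (rounded)", ylab = "Frequency")

# The tallest bar belongs to the most frequent value
tallest <- names(freq)[which.max(freq)]
tallest
max(freq)
```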

6. Create a frequency table that shows the frequencies for each species of flower in the sample. Paste your R command and output into your answer (do NOT display data from a data frame, display data using the table() command)_________

Ans:

# Table of frequencies for each species
table(iris1$Species)

    setosa versicolor  virginica
        50         50         50

7. Explain two things about the table that you created for the previous task: Why did the frequency table for flower species contain words in the first row as opposed to numbers?______ What is the meaning of the numbers in the second row of the table? ___________________

Ans:

The table() command prints the levels of the factor in the first row and the corresponding frequencies in the second row. Because Species is a factor whose levels are text labels (setosa, versicolor, virginica), the first row shows words rather than numbers. The numbers in the second row are the counts of flowers of each species in the sample (50 each).
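
To see where the words come from, one can inspect the Species column and the labels stored in the table. A short sketch on the built-in iris data (the name species.freq is introduced here for illustration):

```r
# Species is stored as a factor, i.e. a categorical variable with text labels
class(iris$Species)
levels(iris$Species)

# table() counts the rows in each level; the level labels become the first row
species.freq <- table(iris$Species)
names(species.freq)

# The counts themselves, without labels
as.vector(species.freq)
```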
