Using Jupyter Notebook and Python, Numpy, Pandas, sqlite3 as db: import numpy as
ID: 3918860 • Letter: U
Question
Using Jupyter Notebook and Python, Numpy, Pandas, sqlite3 as db:
import numpy as np
import pandas as pd
from pandas import DataFrame, Series
import sqlite3 as dbmight help :
from pandasql import sqldf
Read the following data set:
https://archive.ics.uci.edu/ml/machine-learning-databases/adult/
Rename the columns as per the description from this file:
https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.names
Task:
Create a sql db from adult dataset and name it sqladb
1. Select 10 records from the adult sqladb
2. Show me the average hours per week of all men who are working in private sector
3. Show me the frequency table for education, occupation and relationship, separately
4. Are there any people who are married, working in private sector and having a masters
degree
5. What is the average, minimum and maximum age group for people working in
different sectors
6. Calculate age distribution by country
7. Compute a new column as 'Net-Capital-Gain' from the two columns 'capital-gain' and
'capital-loss'
Explanation / Answer
Create DB
sqlite3 sqladb.db
Create Table
CREATE TABLE adult (
age int,
workclass varchar(40),
fnlwgt int,
education varchar(40),
education_num int,
marital_status varchar(40),
occupation varchar(20),
relationship varchar(40),
race varchar(20),
sex varchar(10),
capital_gain int,
capital_loss int,
hours_per_week int,
native_country varchar(50),
label varchar(10))
1) SELECT * FROM adult LIMIT 10;
2)
SELECT AVG(hours_per_week)
FROM adult
WHERE workclass =' Private' and sex=' Male';
OUTPUT: 42
3)
4) SELECT * FROM adult
WHERE
OUTPUT: 234 people
5)
SELECT AVG(age) Average ,MIN(age) Minimum ,MAX(age) Maximum,workclass
FROM adult GROUP BY
OUTPUT:
47 19 72 Without-pay
44 17 90 Self-emp-not-inc
36 17 90 Private
46 17 84 Self-emp-inc
39 17 81 State-gov
42 17 90 Federal-gov
20 17 30 Never-worked
41 17 90 Local-gov
40 17 90 ?
6)
SELECT AVG(age) Average ,MIN(age) Minimum ,MAX(age) Maximum,native_country
FROM adult GROUP BY native_country
OUTPUT:
Average Minimum Maximum native_country
35 18 66 Jamaica
38 21 63 Outlying-US(Guam-USVI-etc)
34 19 56 Laos
39 18 74 Germany
38 17 90 United-States
37 18 65 Cambodia
39 18 75 Columbia
38 17 61 India
42 22 75 China
43 17 85 Poland
36 23 68 Ireland
39 17 90 Philippines
40 18 62 Scotland
33 19 67 Nicaragua
33 17 81 Mexico
36 21 90 Ecuador
41 17 61 Trinadad&Tobago
33 19 60 Hong
33 18 58 Honduras
38 20 66 Yugoslavia
38 19 90 South
32 32 32 Holand-Netherlands
49 24 81 Hungary
46 22 65 Greece
34 19 73 Vietnam
40 19 78 Portugal
34 19 55 Thailand
41 17 90 England
46 19 77 Italy
34 17 79 El-Salvador
38 17 63 Haiti
32 19 66 Guatemala
35 17 69 Peru
33 20 61 Taiwan
38 19 61 Japan
38 20 64 France
42 17 80 Canada
40 17 90 Puerto-Rico
39 22 63 Iran
37 18 78 Dominican-Republic
45 21 82 Cuba
38 17 90 ?
7) SELECT * ,(capital_gain - capital_loss) 'Net-Capital-Gain' FROM adult
Related Questions
drjack9650@gmail.com
Navigate
Integrity-first tutoring: explanations and feedback only — we do not complete graded work. Learn more.