Created
June 10, 2024 01:32
-
-
Save ckashby/4d121939bd106e120515a462d968cefa to your computer and use it in GitHub Desktop.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
# ## Aggregating Statistics with GROUP BY | |
# 45) What is the average age of `pets` by species? | |
pd.read_sql("SELECT AVG(age) FROM pets GROUP BY species;", conn) | |
# dogs 6.5, cat 4.3, lobster 3 | |
# 46) Repeat the previous problem but make sure the species label is also displayed! | |
pd.read_sql("SELECT species, AVG(age) FROM pets GROUP BY species;", conn) | |
# Assume this behavior is always being asked of you any time you use `GROUP BY`. | |
# 47) What is the count, mean, minimum, and maximum age by species in `pets`? | |
pd.read_sql("SELECT species, COUNT(age), AVG(age), MIN(age), MAX(age) FROM pets GROUP BY species;", conn) | |
# 48) Show the mean salaries of each job title in `employees`. | |
pd.read_sql("SELECT job, AVG(salary) FROM employees GROUP BY job;", conn) | |
# 49) Show the mean salaries in New Zealand dollars of each job title in `employees`. | |
# * _NOTE:_ 1 USD = 1.65 NZD. | |
pd.read_sql("SELECT job, AVG(salary) * 1.65 AS 'New Zealand Dollars' FROM employees GROUP BY job;", conn) | |
# 50) Show the mean, min, and max salaries of each job title in `employees`, as well as the numbers of employees in each category. | |
pd.read_sql("SELECT job, AVG(salary), MIN(salary), MAX(salary), COUNT(*) FROM employees GROUP BY job;", conn) | |
# 51) Show the mean salaries of each job title in `employees` sorted descending by salary. | |
pd.read_sql("SELECT job, AVG(salary) FROM employees GROUP BY job ORDER BY AVG(salary) DESC;", conn) | |
# 52) What are the top 5 most common first names among `employees`? | |
pd.read_sql("SELECT firstname, COUNT(*) FROM employees GROUP BY firstname ORDER BY COUNT(*) DESC LIMIT 5;", conn) | |
# 53) Show all first names which have exactly 2 occurrences in `employees`. | |
pd.read_sql("SELECT firstname, COUNT(*) FROM employees GROUP BY firstname HAVING COUNT(*) = 2;", conn) | |
# 54) Take a look at the `transactions` table to get a idea of what it contains. | |
# Note that a transaction may span multiple rows if different items are purchased as part of the same order. | |
# The employee who made the order is also given by their ID. | |
pd.read_sql("SELECT * FROM transactions;", conn) | |
# 55) Show the top 5 largest orders (and their respective customer) in terms of the numbers of items purchased in that order. | |
pd.read_sql("SELECT customer, SUM(quantity) FROM transactions GROUP BY customer ORDER BY SUM(quantity) DESC LIMIT 5;", conn) | |
# 56) Show the total cost of each transaction. | |
# * _Hint:_ The `unit_price` column is the price of _one_ item. The customer may have purchased multiple. | |
# * _Hint2:_ Note that transactions here span multiple rows if different items are purchased. | |
pd.read_sql("SELECT order_id, SUM(quantity * unit_price) FROM transactions GROUP BY order_id;", conn) | |
# 57) Show the top 5 transactions in terms of total cost. | |
pd.read_sql("SELECT order_id, SUM(quantity * unit_price) FROM transactions GROUP BY order_id ORDER BY SUM(quantity * unit_price) DESC LIMIT(5);", conn) | |
# 58) Show the top 5 customers in terms of total revenue (ie, which customers have we done the most business with in terms of money?) | |
pd.read_sql("SELECT customer, SUM(quantity * unit_price) FROM transactions GROUP BY customer ORDER BY SUM(quantity * unit_price) DESC LIMIT 5;", conn) | |
# 59) Show the top 5 employees in terms of revenue generated (ie, which employees made the most in sales?) | |
pd.read_sql("SELECT employee_id, SUM(quantity * unit_price) FROM transactions GROUP BY employee_id ORDER BY SUM(quantity * unit_price) DESC LIMIT 5;", conn) | |
# 60) Which customer worked with the largest number of employees? | |
# * _Hint:_ This is a tough one! Check out the `DISTINCT` keyword. | |
pd.read_sql("SELECT customer, COUNT(DISTINCT employee_id) FROM transactions GROUP BY customer ORDER BY COUNT(DISTINCT employee_id) DESC LIMIT 1;", conn) | |
# 61) Show all customers who've done more than $80,000 worth of business with us. | |
pd.read_sql("SELECT customer, SUM(quantity * unit_price) FROM transactions GROUP BY customer HAVING SUM(quantity * unit_price) > 80000;", conn) | |
# pd.read_sql("SELECT * FROM transactions;", conn) |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment