Last active
August 15, 2023 10:00
-
-
Save zzzeek/5f58d007698c4a0c372edd95ab8e0267 to your computer and use it in GitHub Desktop.
The SQL is just as easy as an ORM challenge
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
""" "Writing SQL is just as fast as using an ORM" proof of concept | |
Below is a simple Python object model, where we represent a database that | |
stores the names of employees at a company, some of whom are "engineers", | |
and a list of the jobs they do and the programming languages they use. | |
We'd like to persist the state represented by this Python object model | |
in a relational database, using our Python objects as a start. Then we'd | |
like to write SQL queries for rows in this database, and we get back instances | |
of Python objects exactly as they were created. | |
This is of course the job that an ORM does. We'd like to show that ORMs | |
are unnecessary if one knows SQL, that the work is just as fast and easy | |
writing raw SQL as long as one knows SQL. | |
The assigment is as follows: | |
1. create a schema to store the Python object model below relationally. | |
a. There should be six tables: "employee", "engineer", "employee_job", | |
"job", "engineer_language", and "language". | |
b. An object of type Engineer should have a row in the | |
"employee" and "engineer" tables, which are associated via foreign | |
key. | |
c. "employee_job" has foreign key constraints referring to "employee" | |
and "job". | |
d. "engineer_language" has foreign key constraints referring to | |
"engineer" and "language" | |
2. store the "emp1" and "emp2" object structures in the database. | |
Primary keys should use Postgresql SERIAL columns using an auto-incrementing | |
integer that comes from the sequence implicitly associated with a SERIAL | |
column. It's likely best to use RETURNING with INSERT statements to get | |
these back. | |
3. run this SQL query, or one like it, and get back "emp1" and "emp2" as Python | |
objects: | |
SELECT emp.id, emp.name FROM employee | |
4. The "engineer" table will also need to be queried in some way (either | |
part of the first query, or as some additional query) to determine which row is | |
an "Engineer" and which is an "Employee"; emp1 should come back as an Engineer | |
object instance and emp2 should come back as an Employee object instance. | |
The Python objects returned should also have the appropriate Job and | |
Language collections present on them as well, which would require additional | |
SQL to be emitted (again, either as joins or separate statements, whatever. | |
With SQLAlchemy, these different kinds of schemes can be selected based | |
on simple configuration settings.). | |
5. Things that aren't part of the challenge: setting up transaction blocks, rolling | |
back when errors occur, connection pooling, dealing with tricky datatypes, | |
ensuring in-memory uniqueness of objects (after all if two parts of my program | |
update ProgrammingLanguage in the same transaction do they conflict?), optimistic | |
object versioning, all things that are also not "writing SQL" that the ORM does, | |
we'll leave those out. | |
There's a psycopg2 Connection object near the end of this program. | |
Are you ready to show how easy writing raw SQL is and how there's nothing | |
else you need you need when programming real applications? Great! Let's go! | |
""" | |
class Employee(object): | |
def __init__(self, name, jobs): | |
self.name = name | |
self.jobs = jobs | |
class Engineer(Employee): | |
def __init__(self, name, jobs, languages): | |
super(Engineer, self).__init__(name, jobs) | |
self.languages = languages | |
class Job(object): | |
def __init__(self, name): | |
self.name = name | |
class ProgrammingLanguage(object): | |
def __init__(self, name): | |
self.name = name | |
j1, j2 = Job("programming"), Job("meeting") | |
python, java = ProgrammingLanguage("python"), ProgrammingLanguage("java") | |
emp1 = Engineer(name="some engineer", languages=[python, java], jobs=[j1, j2]) | |
emp2 = Employee(name="some employee", jobs=[j2]) | |
# here's a start: | |
import psycopg2 | |
connection = psycopg2.connect( | |
user="scott", password="tiger", host="localhost", database="test" | |
) | |
# go! create the schema, store the data, load it back! efficiently! | |
# lets read the data now | |
for emp in employees: | |
print(emp.name) | |
print([j.name for j in emp.jobs]) | |
if isinstance(emp, Engineer): | |
print([l.name for l in emp.languages]) |
Here's the ORM version. Deep SQLAlchemy users might notice I used a little trick to get joined table inheritance without a discriminator column :)
"""ORM solution.
As folks will point out, there's something in the SQL version that the ORM
version doesn't do yet, which is calls INSERT .. VALUES (), (), ()...RETURNING
to get primary keys back. SQLAlchemy might want to add this feature
(edit: this feature is now in 1.4,
see https://docs.sqlalchemy.org/en/14/changelog/migration_14.html#change-5263 )
even though it only works on Postgresql (MySQL and SQLite have VALUES (), (), () but don't
support RETURNING). But anyway that's not the goal on this exercise. This is
the "SQL is just as quick and easy as an ORM" exercise.
"""
from sqlalchemy import case
from sqlalchemy import Column
from sqlalchemy import create_engine
from sqlalchemy import ForeignKey
from sqlalchemy import Integer
from sqlalchemy import String
from sqlalchemy import Table
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import column_property
from sqlalchemy.orm import relationship
from sqlalchemy.orm import Session
Base = declarative_base()
class Employee(Base):
__tablename__ = "employee"
id = Column(Integer, primary_key=True)
name = Column(String)
jobs = relationship("Job", secondary="employee_job", lazy="selectin")
__mapper_args__ = {
"polymorphic_identity": "employee",
"with_polymorphic": "*",
}
class Engineer(Employee):
__tablename__ = "engineer"
id = Column(ForeignKey("employee.id"), primary_key=True)
languages = relationship(
"ProgrammingLanguage", secondary="engineer_language", lazy="selectin"
)
__mapper_args__ = {"polymorphic_identity": "engineer"}
Employee.is_engineer = column_property(
case([(Engineer.id.isnot(None), "engineer")], else_="employee")
)
Employee.__mapper__.polymorphic_on = Employee.is_engineer
class Job(Base):
__tablename__ = "job"
id = Column(Integer, primary_key=True)
name = Column(String)
def __init__(self, name):
self.name = name
class ProgrammingLanguage(Base):
__tablename__ = "language"
id = Column(Integer, primary_key=True)
name = Column(String)
def __init__(self, name):
self.name = name
Table(
"employee_job",
Base.metadata,
Column("employee_id", ForeignKey("employee.id"), primary_key=True),
Column("job_id", ForeignKey("job.id"), primary_key=True),
)
Table(
"engineer_language",
Base.metadata,
Column("engineer_id", ForeignKey("employee.id"), primary_key=True),
Column("language_id", ForeignKey("language.id"), primary_key=True),
)
j1, j2 = Job("programming"), Job("meeting")
python, java = ProgrammingLanguage("python"), ProgrammingLanguage("java")
emp1 = Engineer(name="some engineer", languages=[python, java], jobs=[j1, j2])
emp2 = Employee(name="some employee", jobs=[j2])
# here's a start
e = create_engine("postgresql+psycopg2://scott:tiger@localhost/test")
# go!
Base.metadata.create_all(e)
s = Session(e)
s.add_all([emp1, emp2])
s.commit()
for emp in s.query(Employee):
print(emp.name)
print([j.name for j in emp.jobs])
if isinstance(emp, Engineer):
print([l.name for l in emp.languages])
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Here's the raw SQL version. in some ways, it's able to do a couple of things a little better than current SQLAlchemy releases (EDIT: 1.3 only. In 1.4 we've improved PostgreSQL returning support) do (though you have to write it all out), and in other ways, it's not at all able to capture important behavioral concepts like maintaining an identity map or handling transactions in any reasonable way. It is of course vastly more verbose.