Tutorial
SQLAlchemy 官方提供了一个统一教程来详细介绍如何使用他,这里我们将它简化并提供了 Core 和 ORM 等价的案例。
安装
首先我们要安装 SQLAlchemy 以及数据库驱动:
这里我们使用 SQLite 作为示例,因为 SQLite 数据库驱动由 Python 标准库提供因此就不再需要安装了。之后我们可以测试:
创建 Engine 对象: 建立与数据库的连接
无论是 Core 还是 ORM 的方式使用 SQLAlchemy 的第一步都是构建Engine对象来与数据库构建连接,他是一个抽象的连接池:
>>> from sqlalchemy import create_engine
>>> engine = create_engine("sqlite+pysqlite:///:memory:", echo=True)
使用 Connection 对象: 使用 DBAPI 和事务来与数据库交互
Engine 用于管理连接池,而与数据库真正的交互是通过Connection对象实现的:
>>> with engine.connect() as conn:
... result = conn.execute(text("select 'hello world'"))
... print(result.all())
...
2024-10-31 15:09:46,919 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2024-10-31 15:09:46,920 INFO sqlalchemy.engine.Engine select 'hello world'
2024-10-31 15:09:46,920 INFO sqlalchemy.engine.Engine [generated in 0.00157s] ()
[('hello world',)]
2024-10-31 15:09:46,923 INFO sqlalchemy.]eng]ine.Engine ROLLBACK
上下文管理器创建了数据库连接 Connection 对象,我们在第一次执行 execute 是会自动开启事务(autobegin),如果连接被释放到连接池前没有显式的提交事务(commit)则会自动调用 ROLLBACK 来回滚事务,他保证了之后其他语句再次使用该连接时是“干净”的。当然上面只是查询语句所以事务并不是必须的,ROLLBACK 更多的是一种冗余操作。
DBAPI 是不会自动提交的,因此如果是创建、插入等事务操作就必须 commit 才能够真正的提交:
>>> with engine.connect() as conn:
... conn.execute(text("CREATE TABLE some_table (x int, y int)"))
... conn.execute(text("INSERT INTO some_table(x, y) VALUES (:x, :y)"),[{"x":1, "y":1},{"x":2, "y":4}])
... conn.commit()
...
2024-10-31 15:56:27,041 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2024-10-31 15:56:27,042 INFO sqlalchemy.engine.Engine CREATE TABLE some_table (x int, y int)
2024-10-31 15:56:27,042 INFO sqlalchemy.engine.Engine [generated in 0.00036s] ()
<sqlalchemy.engine.cursor.CursorResult object at 0x102a98e50>
2024-10-31 15:56:27,045 INFO sqlalchemy.engine.Engine INSERT INTO some_table(x, y) VALUES (?, ?)
2024-10-31 15:56:27,045 INFO sqlalchemy.engine.Engine [generated in 0.00019s] [(1, 1), (2, 4)]
<sqlalchemy.engine.cursor.CursorResult object at 0x102a98b40>
2024-10-31 15:56:27,045 INFO sqlalchemy.engine.Engine COMMIT
可以看到整个操作的流程:
- BEGIN 开启事务(implicit 即隐式,他表示 autobegin)
- 执行 CREATE 和 INSERT 操作
- COMMIT 来提交事务,这里因为事务已经提交就不需要在关闭连接时 ROLLBACK 了
上面的事务管理风格被称为 Commit As You Go
,而更为推荐的是使用预先定义的 begin 事务块来提交:
>>> with engine.begin() as conn:
... conn.execute(text("INSERT INTO some_table (x, y) VALUES (:x, :y)"),[{"x": 6, "y": 8}, {"x": 9, "y": 10}],)
2024-10-31 16:26:03,446 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2024-10-31 16:26:03,447 INFO sqlalchemy.engine.Engine INSERT INTO some_table (x, y) VALUES (?, ?)
2024-10-31 16:26:03,447 INFO sqlalchemy.engine.Engine [generated in 0.00030s] [(6, 8), (9, 10)]
<sqlalchemy.engine.cursor.CursorResult object at 0x102a98d00>
2024-10-31 16:26:03,448 INFO sqlalchemy.engine.Engine COMMIT
Tips
他们之间在内部流程上没什么区别,只不过声明 begin 事务块能够更加精准的控制事务的执行范围
execute() -> Result: 执行语句并返回结果
通过 Connection.execute()
来执行 SQL 语句,该方法接受一个 ClauseElement 类对象,最简单的构造方式就是通过 text 来包裹一个 SQL 语句:
>>> with engine.connect() as conn:
... result = conn.execute(text("SELECT x,y FROM some_table"))
... for row in result:
... print(f"x: {row.x} y:{row.y}")
...
2024-11-05 15:18:54,986 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2024-11-05 15:18:54,988 INFO sqlalchemy.engine.Engine SELECT x,y FROM some_table
2024-11-05 15:18:54,989 INFO sqlalchemy.engine.Engine [cached since 4.281e+05s ago] ()
x: 1 y:1
x: 2 y:4
x: 6 y:8
x: 9 y:10
2024-11-05 15:18:54,994 INFO sqlalchemy.engine.Engine ROLLBACK
其中执行返回Result 对象,该对象被封装成一个可迭代对象以允许用户从其中获取数据。这在 SELECT 语句中有非常直观的体现。当然有很多中方法返回结果列表,只不过迭代是最 Pythonic 的。
Result 对象中包含的是 Row 对象,他表示数据库中表的一行,其本身类似于 Python 的具名元组,因此要调用其中的值可以:
# 解构
for x, y in result:
pass
# 整数索引
for row in result:
x = row[0]
# 属性名
for row in result:
print(f"x: {row.x} y:{row.y}")
# 字典, mappings() 即返回通用的 dict 版本
for dict_row in result.mappings():
x = dict_row['x']
y = dict_row['y']
execute 接受参数
SQL 语句通常伴随与语句本身一起传递的数据,例如 WHERE 子句以及更为普遍的 INSERT 语句。他们被称为绑定参数(bind parameters)。他实现了安全的将本地数据传入数据库服务器的方式(避免 SQL 注入的发生)。
Python 的 DBAPI 规范允许六种不同的绑定参数形式,SQLALchemy 将这些抽象为一种形式(即不管后面使用的 DBAPI 是什么在 SQLALchemy 系统中之使用这一种形式),即通过 : 命名,之后通过字典传入:
>>> with engine.connect() as conn:
... result = conn.execute(text("SELECT x,y FROM some_table WHERE y > :y"), {"y": 2})
... for row in result:
... print(f"x: {row.x} y:{row.y}")
...
2024-11-05 16:18:11,569 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2024-11-05 16:18:11,570 INFO sqlalchemy.engine.Engine SELECT x,y FROM some_table WHERE y > ?
2024-11-05 16:18:11,570 INFO sqlalchemy.engine.Engine [generated in 0.00218s] (2,)
x: 2 y:4
x: 6 y:8
x: 9 y:10
2024-11-05 16:18:11,575 INFO sqlalchemy.engine.Engine ROLLBACK
对于需要多个值传入的可以使用字典列表:
>>> with engine.connect() as conn:
... conn.execute(
... text("INSERT INTO some_table (x, y) VALUES (:x, :y)"),
... [{"x": 11, "y": 12}, {"x": 13, "y": 14}],
... )
... conn.commit()
...
2024-11-05 16:25:36,549 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2024-11-05 16:25:36,549 INFO sqlalchemy.engine.Engine INSERT INTO some_table (x, y) VALUES (?, ?)
2024-11-05 16:25:36,549 INFO sqlalchemy.engine.Engine [cached since 4.32e+05s ago] [(11, 12), (13, 14)]
<sqlalchemy.engine.cursor.CursorResult object at 0x102a98f30>
2024-11-05 16:25:36,552 INFO sqlalchemy.engine.Engine COMMIT
接受多个参数相当于在每一个字典上运行一次给定的 INSERT 语句,不同之处在于该操作的性能会被更好的优化。
Tips
始终应该使用 bind parameters 的方式来传递参数,SQLALchemy 会确保传入的参数被正确转义。
Database MetaData: 构建数据库元数据
上面我们执行 SQL 语句都是 text(sql_string)
的形式,实际上 SQLALchemy 提供了更高级的SQL Expression Language,他提供了面向对象的 SQL 语句构造方式,但是使用它们的前提是需要将数据库中的表、列等概念映射到 Python 对象上,这是通过构造数据库元数据(Database Metadata)来实现的。
当我们使用关系型数据库时,所有操作都有一个主体 table,在 SQLALchemy 中使用 Table 对象来表示。我们可以通过编程方式构建(即直接实例化 Table 对象)或者使用映射类(大致可以理解为继承 Table 类)。无论哪一种方式总是要从一个集合开始,这个集合将保存我们所有表,他相当于抽象了数据库本身。
Note
大致可以将实例化 Table 对象称为 Core 形式的构建,映射类称为 ORM 形式的构建。
对于编程方式直接创建 MetaData,对于映射类这需要创建 DeclarativeBase 的子类:
MetaData 或 DeclarativeBase 本质上就是对数据库的抽象,一旦有了数据库就可以声明 Table 对象了,他才是数据操作的核心。无非就是一个通过构建 Table 对象实例(他接受一个 MetaData 参数来表示位于的数据库),一个直接通过继承来实现(实际上 DeclarativeBase 内部维护了一个registry集合来容纳 Table):
>>> from sqlalchemy import Table, Column, Integer, String
>>> user_table = Table(
... "user_account",
... metadata_obj,
... Column("id", Integer, primary_key=True),
... Column("name", String(30)),
... Column("fullname", String),
... )
>>> from sqlalchemy import ForeignKey
>>> address_table = Table(
... "address",
... metadata_obj,
... Column("id", Integer, primary_key=True),
... Column("user_id", ForeignKey("user_account.id"), nullable=False),
... Column("email_address", String, nullable=False),
... )
>>> from typing import List
>>> from typing import Optional
>>> from sqlalchemy.orm import Mapped
>>> from sqlalchemy.orm import mapped_column
>>> from sqlalchemy.orm import relationship
>>>
>>> class User(Base):
... __tablename__ = "user_account"
... id: Mapped[int] = mapped_column(primary_key=True)
... name: Mapped[str] = mapped_column(String(30))
... fullname: Mapped[Optional[str]]
... addresses: Mapped[List["Address"]] = relationship(back_populates="user")
... def __repr__(self) -> str:
... return f"User(id={self.id!r}, name={self.name!r}, fullname={self.fullname!r})"
...
class Address(Base):
__tablename__ = "address"
id: Mapped[int] = mapped_column(primary_key=True)
email_address: Mapped[str]
user_id = mapped_column(ForeignKey("user_account.id"))
user: Mapped[User] = relationship(back_populates="addresses")
def __repr__(self) -> str:
return f"Address(id={self.id!r}, email_address={self.email_address!r})">>> class Address(Base):
... __tablename__ = "address"
... id: Mapped[int] = mapped_column(primary_key=True)
... email_address: Mapped[str]
... user_id = mapped_column(ForeignKey("user_account.id"))
... user: Mapped[User] = relationship(back_populates="addresses")
... def __repr__(self) -> str:
... return f"Address(id={self.id!r}, email_address={self.email_address!r})"
无论是那种方式都是用于构造 Table 的,他们都接受一个字符串作为 table_name, 之后就是定义表中的字段了,他们都是 Column 对象,其接受字符串作为列名,紧接着就是列数据类型和列约束。无非就是定义的形式不一样而已。当然一个是 table 的实例,一个是 DeclarativeBase 类,他们的使用也有区别,这实际上就是 Core 和 ORM 操作上的区别:
Note
Table 对象以及 Column 对象(通过 Table.c.[column_name]
获取)是构造 SQL Expression Language
的基础
注意以上仅仅是定义,目前只存在于我们的内存中,他真正与数据库建立链接需要两个步骤:
- 转换为 DDL 语句(即 CREATE 语句) -> 由 SQLALchemy 自动完成
- 通过 Connection 发送到数据库来执行 -> 需要 Engine 来实现与真正的数据库建立链接
注意这里声明式和映射类就统一了,都是通过 MetaData.create_all(engine)
来实现与数据库通信创建表,无非是映射类需要间接的通过 DelarativeBase.metadata
来获取 MetaData 对象:
>>> metadata_obj.create_all(engine)
2024-11-06 16:24:03,401 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2024-11-06 16:24:03,405 INFO sqlalchemy.engine.Engine PRAGMA main.table_info("user_account")
2024-11-06 16:24:03,405 INFO sqlalchemy.engine.Engine [raw sql] ()
2024-11-06 16:24:03,411 INFO sqlalchemy.engine.Engine PRAGMA temp.table_info("user_account")
2024-11-06 16:24:03,411 INFO sqlalchemy.engine.Engine [raw sql] ()
2024-11-06 16:24:03,413 INFO sqlalchemy.engine.Engine PRAGMA main.table_info("address")
2024-11-06 16:24:03,413 INFO sqlalchemy.engine.Engine [raw sql] ()
2024-11-06 16:24:03,414 INFO sqlalchemy.engine.Engine PRAGMA temp.table_info("address")
2024-11-06 16:24:03,414 INFO sqlalchemy.engine.Engine [raw sql] ()
2024-11-06 16:24:03,417 INFO sqlalchemy.engine.Engine
CREATE TABLE user_account (
id INTEGER NOT NULL,
name VARCHAR(30),
fullname VARCHAR,
PRIMARY KEY (id)
)
2024-11-06 16:24:03,417 INFO sqlalchemy.engine.Engine [no key 0.00012s] ()
2024-11-06 16:24:03,419 INFO sqlalchemy.engine.Engine
CREATE TABLE address (
id INTEGER NOT NULL,
user_id INTEGER NOT NULL,
email_address VARCHAR NOT NULL,
PRIMARY KEY (id),
FOREIGN KEY(user_id) REFERENCES user_account (id)
)
2024-11-06 16:24:03,419 INFO sqlalchemy.engine.Engine [no key 0.00010s] ()
2024-11-06 16:24:03,420 INFO sqlalchemy.engine.Engine COMMIT
反省: 自动构建已有表的元数据
如果数据库中已经存在了对应的表,那么 SQLALchemy 提供一种机制来实现对数据库的反省:
>>> some_table = Table("some_table", metadata_obj, autoload_with=engine)
2024-11-06 17:53:54,601 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2024-11-06 17:53:54,601 INFO sqlalchemy.engine.Engine PRAGMA main.table_xinfo("some_table")
2024-11-06 17:53:54,601 INFO sqlalchemy.engine.Engine [raw sql] ()
2024-11-06 17:53:54,602 INFO sqlalchemy.engine.Engine SELECT sql FROM (SELECT * FROM sqlite_master UNION ALL SELECT * FROM sqlite_temp_master) WHERE name = ? AND type in ('table', 'view')
2024-11-06 17:53:54,602 INFO sqlalchemy.engine.Engine [raw sql] ('some_table',)
2024-11-06 17:53:54,603 INFO sqlalchemy.engine.Engine PRAGMA main.foreign_key_list("some_table")
2024-11-06 17:53:54,603 INFO sqlalchemy.engine.Engine [raw sql] ()
2024-11-06 17:53:54,603 INFO sqlalchemy.engine.Engine PRAGMA temp.foreign_key_list("some_table")
2024-11-06 17:53:54,603 INFO sqlalchemy.engine.Engine [raw sql] ()
2024-11-06 17:53:54,603 INFO sqlalchemy.engine.Engine SELECT sql FROM (SELECT * FROM sqlite_master UNION ALL SELECT * FROM sqlite_temp_master) WHERE name = ? AND type in ('table', 'view')
2024-11-06 17:53:54,603 INFO sqlalchemy.engine.Engine [raw sql] ('some_table',)
2024-11-06 17:53:54,606 INFO sqlalchemy.engine.Engine PRAGMA main.index_list("some_table")
2024-11-06 17:53:54,606 INFO sqlalchemy.engine.Engine [raw sql] ()
2024-11-06 17:53:54,607 INFO sqlalchemy.engine.Engine PRAGMA temp.index_list("some_table")
2024-11-06 17:53:54,607 INFO sqlalchemy.engine.Engine [raw sql] ()
2024-11-06 17:53:54,607 INFO sqlalchemy.engine.Engine PRAGMA main.table_info("some_table")
2024-11-06 17:53:54,607 INFO sqlalchemy.engine.Engine [raw sql] ()
2024-11-06 17:53:54,607 INFO sqlalchemy.engine.Engine PRAGMA main.index_list("some_table")
2024-11-06 17:53:54,607 INFO sqlalchemy.engine.Engine [raw sql] ()
2024-11-06 17:53:54,608 INFO sqlalchemy.engine.Engine PRAGMA temp.index_list("some_table")
2024-11-06 17:53:54,608 INFO sqlalchemy.engine.Engine [raw sql] ()
2024-11-06 17:53:54,608 INFO sqlalchemy.engine.Engine PRAGMA main.table_info("some_table")
2024-11-06 17:53:54,608 INFO sqlalchemy.engine.Engine [raw sql] ()
2024-11-06 17:53:54,608 INFO sqlalchemy.engine.Engine SELECT sql FROM (SELECT * FROM sqlite_master UNION ALL SELECT * FROM sqlite_temp_master) WHERE name = ? AND type in ('table', 'view')
2024-11-06 17:53:54,608 INFO sqlalchemy.engine.Engine [raw sql] ('some_table',)
2024-11-06 17:53:54,610 INFO sqlalchemy.engine.Engine ROLLBACK
他和手动通过构建 Table 示例的结果相同:
>>> some_table
Table('some_table', MetaData(), Column('x', INTEGER(), table=<some_table>), Column('y', INTEGER(), table=<some_table>), schema=None)
映射类本身也可以进行反省,不过太依赖于上面的方式:
>>> class SomeTable(Base):
... __table__ = Table("some_table", Base.metadata, autoload_with=engine)
构建 SQL Expression Language 执行
上面 execute()
运行 SQL 语句是通过 text()
包装字符串来实现的,一旦我们构建了数据库元数据就可以使用更好的方式即使用SQL Expression language来构建 SQL 语句。
对于 Insert 语句他的核心是包含 Value 子句,更加结构化的构建方式如下:
>>> from sqlalchemy import insert
>>> stmt = insert(user_table).values(name="spongebob", fullname="Spongebob Squarepants")
他们在内部被转换为 SQL 语句字符串,可以直接 print 来查看下:
他们在内部是使用 ClauseElement.compile()
函数来构建的,他返回一个 StrSQLCompiler 对象:
>>> compiled = stmt.compile()
>>> compiled.string
'INSERT INTO user_account (name, fullname) VALUES (:name, :fullname)'
>>> compiled.params
{'name': 'spongebob', 'fullname': 'Spongebob Squarepants'}
之后就是执行该语句,他同样是使用 execute 来执行,这个 text()
包裹的字符串没有两样:
>>> with engine.connect() as conn:
... result = conn.execute(stmt)
... conn.commit()
...
2024-11-07 10:04:33,031 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2024-11-07 10:04:33,032 INFO sqlalchemy.engine.Engine INSERT INTO user_account (name, fullname) VALUES (?, ?)
2024-11-07 10:04:33,032 INFO sqlalchemy.engine.Engine [generated in 0.00136s] ('spongebob', 'Spongebob Squarepants')
2024-11-07 10:04:33,035 INFO sqlalchemy.engine.Engine COMMIT
Tips
尽管 execute 结果会返回 Result 对象,但是对于 Insert 语句来说并不会返回任何行,不过我们可以通过 result.inserted_primary_keys
来查看所有插入成功的主键信息
上面我们构建 Insert 是携带了 Values 子句的,但是他们被 stmt.compile()
后分割为 string 和 params 两个部分,这是有意义的。我们可以直接在 execute 中携带 params:
>>> with engine.connect() as conn:
... result = conn.execute(
... insert(user_table),
... [
... {"name": "sandy", "fullname": "Sandy Cheeks"},
... {"name": "patrick", "fullname": "Patrick Star"},
... ],
... )
... conn.commit()
...
2024-11-07 10:09:34,905 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2024-11-07 10:09:34,906 INFO sqlalchemy.engine.Engine INSERT INTO user_account (name, fullname) VALUES (?, ?)
2024-11-07 10:09:34,906 INFO sqlalchemy.engine.Engine [generated in 0.00082s] [('sandy', 'Sandy Cheeks'), ('patrick', 'Patrick Star')]
2024-11-07 10:09:34,906 INFO sqlalchemy.engine.Engine COMMIT
Tips
在构建语句是如果定不准就直接 print 或者 compile 看一下。
Select 语句也是同样的构造方法,需要什么子句就携带什么子句方法:
>>> from sqlalchemy import select
>>> stmt = select(user_table).where(user_table.c.name == "spongebob")
同样我们可以使用 compile 来解析下 stmt 的内容:
>>> stmt.compile().string
'SELECT user_account.id, user_account.name, user_account.fullname \nFROM user_account \nWHERE user_account.name = :name_1'
>>> stmt.compile().params
{'name_1': 'spongebob'}
要想执行还是需要 execute 命令,select 会真正的获取数据,返回的是Result对象,通常我们可以迭代他来获取行:
>>> with engine.connect() as conn:
... for row in conn.execute(stmt):
... print(row)
...
2024-11-07 10:14:47,961 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2024-11-07 10:14:47,961 INFO sqlalchemy.engine.Engine SELECT user_account.id, user_account.name, user_account.fullname
FROM user_account
WHERE user_account.name = ?
2024-11-07 10:14:47,961 INFO sqlalchemy.engine.Engine [generated in 0.00086s] ('spongebob',)
(1, 'spongebob', 'Spongebob Squarepants')
2024-11-07 10:14:47,962 INFO sqlalchemy.engine.Engine ROLLBACK
对于 ORM 的包装类来说是一个道理,其中不再是 user_table 这样的 Table 实例而是像 User 这样的列名,对应的 user_table.c.[column]
也编程了 User.[attr]
这样的形式:
>>> from sqlalchemy.orm import Session
>>> with Session(engine) as session:
... for row in session.execute(select(User)):
... print(row)
...
2024-11-07 10:22:27,758 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2024-11-07 10:22:27,759 INFO sqlalchemy.engine.Engine SELECT user_account.id, user_account.name, user_account.fullname
FROM user_account
2024-11-07 10:22:27,759 INFO sqlalchemy.engine.Engine [generated in 0.00026s] ()
(User(id=1, name='spongebob', fullname='Spongebob Squarepants'),)
(User(id=2, name='sandy', fullname='Sandy Cheeks'),)
(User(id=3, name='patrick', fullname='Patrick Star'),)
2024-11-07 10:22:27,760 INFO sqlalchemy.engine.Engine ROLLBACK
并且他返回的不再是元组而是一个 User 对象。这实际上是 ORM 和 Core 其中一个核心的区别。