Basic Relationship Patterns
A quick walkthrough of the basic relational patterns.
The imports used for each of the following sections are as follows:
from sqlalchemy import Column, ForeignKey, Integer, String, Table
from sqlalchemy.orm import declarative_base, relationship

Base = declarative_base()
One To Many
A one to many relationship places a foreign key on the child table referencing
the parent. relationship() is then specified on the parent, as referencing
a collection of items represented by the child:
class Parent(Base):
    __tablename__ = "parent"
    id = Column(Integer, primary_key=True)
    children = relationship("Child")


class Child(Base):
    __tablename__ = "child"
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey("parent.id"))
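As a minimal sketch of how this mapping behaves at runtime, the following assumes an in-memory SQLite database; the engine URL and the variable names are illustrative choices rather than part of the pattern itself. Child objects placed in Parent.children receive the parent's primary key in parent_id when the session flushes.

```python
from sqlalchemy import Column, ForeignKey, Integer, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class Parent(Base):
    __tablename__ = "parent"
    id = Column(Integer, primary_key=True)
    children = relationship("Child")


class Child(Base):
    __tablename__ = "child"
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey("parent.id"))


# Assumption: an in-memory SQLite database is enough for the demonstration.
engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    parent = Parent(children=[Child(), Child()])
    # Adding the parent also adds its children (default save-update cascade).
    session.add(parent)
    session.commit()

    # Each child row now carries the parent's primary key as its foreign key.
    parent_id = parent.id
    child_fks = [child.parent_id for child in parent.children]
```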
To establish a bidirectional relationship in one-to-many, where the “reverse”
side is a many to one, specify an additional relationship() and connect
the two using the relationship.back_populates parameter:
class Parent(Base):
    __tablename__ = "parent"
    id = Column(Integer, primary_key=True)
    children = relationship("Child", back_populates="parent")


class Child(Base):
    __tablename__ = "child"
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey("parent.id"))
    parent = relationship("Parent", back_populates="children")
Child will get a parent attribute with many-to-one semantics.
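The effect of relationship.back_populates can be observed in Python before any database is involved. The sketch below, which uses hypothetical variable names, shows that a change made on either side is mirrored on the other side in memory.

```python
from sqlalchemy import Column, ForeignKey, Integer
from sqlalchemy.orm import declarative_base, relationship

Base = declarative_base()


class Parent(Base):
    __tablename__ = "parent"
    id = Column(Integer, primary_key=True)
    children = relationship("Child", back_populates="parent")


class Child(Base):
    __tablename__ = "child"
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey("parent.id"))
    parent = relationship("Parent", back_populates="children")


parent = Parent()

# Appending to the collection sets the scalar attribute on the child.
first = Child()
parent.children.append(first)
synced_from_collection = first.parent is parent

# Assigning the scalar attribute adds the child to the parent's collection.
second = Child()
second.parent = parent
synced_from_scalar = second in parent.children
```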
Alternatively, the relationship.backref option may be used on a single
relationship() instead of using relationship.back_populates:
class Parent(Base):
    __tablename__ = "parent"
    id = Column(Integer, primary_key=True)
    children = relationship("Child", backref="parent")
Configuring Delete Behavior for One to Many
It is often the case that all Child objects should be deleted when their
owning Parent is deleted. To configure this behavior, the delete cascade
option described at delete is used. An additional option is that a Child
object can itself be deleted when it is deassociated from its parent. This
behavior is described at delete-orphan.
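A small sketch of both options together, assuming an in-memory SQLite database and the cascade string "all, delete-orphan" on the parent's collection. The first commit after removing a child from the collection deletes the orphaned row; deleting the parent then deletes the remaining child.

```python
from sqlalchemy import Column, ForeignKey, Integer, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class Parent(Base):
    __tablename__ = "parent"
    id = Column(Integer, primary_key=True)
    children = relationship(
        "Child", back_populates="parent", cascade="all, delete-orphan"
    )


class Child(Base):
    __tablename__ = "child"
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey("parent.id"))
    parent = relationship("Parent", back_populates="children")


engine = create_engine("sqlite://")  # assumption: throwaway in-memory database
Base.metadata.create_all(engine)

with Session(engine) as session:
    parent = Parent(children=[Child(), Child()])
    session.add(parent)
    session.commit()

    # delete-orphan: a child removed from the collection is deleted on flush.
    parent.children.pop()
    session.commit()
    count_after_orphan = session.query(Child).count()

    # delete: deleting the parent deletes the children it still owns.
    session.delete(parent)
    session.commit()
    count_after_delete = session.query(Child).count()
```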
Many To One
Many to one places a foreign key in the parent table referencing the child.
relationship() is declared on the parent, where a new scalar-holding
attribute will be created:
class Parent(Base):
    __tablename__ = "parent"
    id = Column(Integer, primary_key=True)
    child_id = Column(Integer, ForeignKey("child.id"))
    child = relationship("Child")


class Child(Base):
    __tablename__ = "child"
    id = Column(Integer, primary_key=True)
Bidirectional behavior is achieved by adding a second relationship()
and applying the relationship.back_populates parameter in both directions:
class Parent(Base):
    __tablename__ = "parent"
    id = Column(Integer, primary_key=True)
    child_id = Column(Integer, ForeignKey("child.id"))
    child = relationship("Child", back_populates="parents")


class Child(Base):
    __tablename__ = "child"
    id = Column(Integer, primary_key=True)
    parents = relationship("Parent", back_populates="child")
Alternatively, the relationship.backref parameter may be applied to a single
relationship(), such as Parent.child:
class Parent(Base):
    __tablename__ = "parent"
    id = Column(Integer, primary_key=True)
    child_id = Column(Integer, ForeignKey("child.id"))
    child = relationship("Child", backref="parents")
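To illustrate the many-to-one direction, the sketch below (variable names are hypothetical) points two Parent objects at the same Child. Parent.child holds a single object, while the Child.parents collection generated by the backref gathers every Parent that refers to it.

```python
from sqlalchemy import Column, ForeignKey, Integer
from sqlalchemy.orm import declarative_base, relationship

Base = declarative_base()


class Parent(Base):
    __tablename__ = "parent"
    id = Column(Integer, primary_key=True)
    child_id = Column(Integer, ForeignKey("child.id"))
    child = relationship("Child", backref="parents")


class Child(Base):
    __tablename__ = "child"
    id = Column(Integer, primary_key=True)


shared = Child()
p1 = Parent(child=shared)
p2 = Parent(child=shared)

# The scalar side refers to one object; the backref side is a collection.
scalar_is_shared = p1.child is shared and p2.child is shared
parents_of_shared = list(shared.parents)
```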
One To One
One To One is essentially a bidirectional relationship with a scalar attribute on both sides. Within the ORM, “one-to-one” is considered as a convention where the ORM expects that only one related row will exist for any parent row.
The “one-to-one” convention is achieved by applying a value of False to the
relationship.uselist parameter of the relationship() construct, or in some
cases the backref() construct, applying it on the “one-to-many” or
“collection” side of a relationship.
In the example below we present a bidirectional relationship that includes
both one-to-many (Parent.children) and a many-to-one (Child.parent)
relationships:
class Parent(Base):
    __tablename__ = "parent"
    id = Column(Integer, primary_key=True)

    # one-to-many collection
    children = relationship("Child", back_populates="parent")


class Child(Base):
    __tablename__ = "child"
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey("parent.id"))

    # many-to-one scalar
    parent = relationship("Parent", back_populates="children")
Above, Parent.children is the “one-to-many” side referring to a collection,
and Child.parent is the “many-to-one” side referring to a single object.
To convert this to “one-to-one”, the “one-to-many” or “collection” side
is converted into a scalar relationship using the uselist=False flag,
renaming Parent.children to Parent.child for clarity:
class Parent(Base):
    __tablename__ = "parent"
    id = Column(Integer, primary_key=True)

    # previously one-to-many Parent.children is now
    # one-to-one Parent.child
    child = relationship("Child", back_populates="parent", uselist=False)


class Child(Base):
    __tablename__ = "child"
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey("parent.id"))

    # many-to-one side remains, see tip below
    parent = relationship("Parent", back_populates="child")
Above, when we load a Parent object, the Parent.child attribute will refer to
a single Child object rather than a collection. If we replace the value of
Parent.child with a new Child object, the ORM’s unit of work process will
replace the previous Child row with the new one, setting the previous
child.parent_id column to NULL by default unless there are specific cascade
behaviors set up.
Tip
As mentioned previously, the ORM considers the “one-to-one” pattern as a
convention, where it makes the assumption that when it loads the Parent.child
attribute on a Parent object, it will get only one row back. If more than one
row is returned, the ORM will emit a warning.
However, the Child.parent side of the above relationship remains as a
“many-to-one” relationship and is unchanged, and there is no intrinsic system
within the ORM itself that prevents more than one Child object from being
created against the same Parent during persistence. Instead, techniques
such as unique constraints may be used in the actual database schema to
enforce this arrangement, where a unique constraint on the Child.parent_id
column would ensure that only one Child row may refer to a particular Parent
row at a time.
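The following sketch combines uselist=False with a schema-level unique constraint, as the tip suggests. It assumes an in-memory SQLite database and declares the constraint with unique=True on Child.parent_id, which is one of several ways to express it. The ORM hands back a single Child for Parent.child, and the database rejects a second Child row that points at the same Parent.

```python
from sqlalchemy import Column, ForeignKey, Integer, create_engine
from sqlalchemy.exc import IntegrityError
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class Parent(Base):
    __tablename__ = "parent"
    id = Column(Integer, primary_key=True)
    child = relationship("Child", back_populates="parent", uselist=False)


class Child(Base):
    __tablename__ = "child"
    id = Column(Integer, primary_key=True)
    # Assumption: the one-to-one rule is enforced with a unique constraint.
    parent_id = Column(Integer, ForeignKey("parent.id"), unique=True)
    parent = relationship("Parent", back_populates="child")


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    parent = Parent(child=Child())
    session.add(parent)
    session.commit()

    # Parent.child is a single object, not a list.
    is_scalar = isinstance(parent.child, Child)

    # A second row referring to the same parent violates the constraint.
    session.add(Child(parent_id=parent.id))
    try:
        session.commit()
        rejected = False
    except IntegrityError:
        session.rollback()
        rejected = True
```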
In the case where the relationship.backref parameter is used to define the
“one-to-many” side, this can be converted to the “one-to-one” convention
using the backref() function which allows the relationship generated by the
relationship.backref parameter to receive custom parameters, in this case
the uselist parameter:
from sqlalchemy.orm import backref


class Parent(Base):
    __tablename__ = "parent"
    id = Column(Integer, primary_key=True)


class Child(Base):
    __tablename__ = "child"
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey("parent.id"))
    parent = relationship("Parent", backref=backref("child", uselist=False))
Many To Many
Many to Many adds an association table between two classes. The association
table is indicated by the relationship.secondary argument to relationship().
Usually, the Table uses the MetaData object associated with the declarative
base class, so that the ForeignKey directives can locate the remote tables
with which to link:
association_table = Table(
    "association",
    Base.metadata,
    Column("left_id", ForeignKey("left.id")),
    Column("right_id", ForeignKey("right.id")),
)


class Parent(Base):
    __tablename__ = "left"
    id = Column(Integer, primary_key=True)
    children = relationship("Child", secondary=association_table)


class Child(Base):
    __tablename__ = "right"
    id = Column(Integer, primary_key=True)
Tip
The “association table” above has foreign key constraints established that
refer to the two entity tables on either side of the relationship. The data
type of each of association.left_id and association.right_id is normally
inferred from that of the referenced table and may be omitted.
It is also recommended, though not in any way required by SQLAlchemy,
that the columns which refer to the two entity tables are established within
either a unique constraint or more commonly as the primary key constraint;
this ensures that duplicate rows won’t be persisted within the table regardless
of issues on the application side:
association_table = Table(
    "association",
    Base.metadata,
    Column("left_id", ForeignKey("left.id"), primary_key=True),
    Column("right_id", ForeignKey("right.id"), primary_key=True),
)
For a bidirectional relationship, both sides of the relationship contain a
collection. Specify using relationship.back_populates, and for each
relationship() specify the common association table:
association_table = Table(
    "association",
    Base.metadata,
    Column("left_id", ForeignKey("left.id"), primary_key=True),
    Column("right_id", ForeignKey("right.id"), primary_key=True),
)


class Parent(Base):
    __tablename__ = "left"
    id = Column(Integer, primary_key=True)
    children = relationship(
        "Child", secondary=association_table, back_populates="parents"
    )


class Child(Base):
    __tablename__ = "right"
    id = Column(Integer, primary_key=True)
    parents = relationship(
        "Parent", secondary=association_table, back_populates="children"
    )
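A runnable sketch of the bidirectional mapping, assuming an in-memory SQLite database. One Child is attached to two Parent objects; the ORM writes one row per link into the association table, and Child.parents reflects both parents.

```python
from sqlalchemy import Column, ForeignKey, Integer, Table, create_engine, func, select
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()

association_table = Table(
    "association",
    Base.metadata,
    Column("left_id", ForeignKey("left.id"), primary_key=True),
    Column("right_id", ForeignKey("right.id"), primary_key=True),
)


class Parent(Base):
    __tablename__ = "left"
    id = Column(Integer, primary_key=True)
    children = relationship(
        "Child", secondary=association_table, back_populates="parents"
    )


class Child(Base):
    __tablename__ = "right"
    id = Column(Integer, primary_key=True)
    parents = relationship(
        "Parent", secondary=association_table, back_populates="children"
    )


engine = create_engine("sqlite://")  # assumption: throwaway in-memory database
Base.metadata.create_all(engine)

with Session(engine) as session:
    p1, p2 = Parent(), Parent()
    shared = Child()
    p1.children.append(shared)
    p2.children.append(shared)
    session.add_all([p1, p2])
    session.commit()

    # One association row exists per parent/child link.
    parent_count = len(shared.parents)
    row_count = session.execute(
        select(func.count()).select_from(association_table)
    ).scalar_one()
```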
When using the relationship.backref parameter instead of
relationship.back_populates, the backref will automatically use the same
relationship.secondary argument for the reverse relationship:
association_table = Table(
    "association",
    Base.metadata,
    Column("left_id", ForeignKey("left.id"), primary_key=True),
    Column("right_id", ForeignKey("right.id"), primary_key=True),
)


class Parent(Base):
    __tablename__ = "left"
    id = Column(Integer, primary_key=True)
    children = relationship("Child", secondary=association_table, backref="parents")


class Child(Base):
    __tablename__ = "right"
    id = Column(Integer, primary_key=True)
The relationship.secondary argument of relationship() also accepts a callable
that returns the ultimate argument, which is evaluated only when mappers are
first used. Using this, we can define the association_table at a later point,
as long as it’s available to the callable after all module initialization is
complete:
class Parent(Base):
    __tablename__ = "left"
    id = Column(Integer, primary_key=True)
    children = relationship(
        "Child",
        secondary=lambda: association_table,
        backref="parents",
    )
With the declarative extension in use, the traditional “string name of the table”
is accepted as well, matching the name of the table as stored in
Base.metadata.tables:
class Parent(Base):
    __tablename__ = "left"
    id = Column(Integer, primary_key=True)
    children = relationship("Child", secondary="association", backref="parents")
Warning
When passed as a Python-evaluable string, the relationship.secondary argument
is interpreted using Python’s eval() function. DO NOT PASS UNTRUSTED INPUT TO
THIS STRING. See Evaluation of relationship arguments for details on
declarative evaluation of relationship() arguments.
Deleting Rows from the Many to Many Table
A behavior which is unique to the relationship.secondary argument to
relationship() is that the Table which is specified here is automatically
subject to INSERT and DELETE statements, as objects are added or removed from
the collection. There is no need to delete from this table manually. The act
of removing a record from the collection will have the effect of the row being
deleted on flush:
# row will be deleted from the "secondary" table
# automatically
myparent.children.remove(somechild)
A question which often arises is how the row in the “secondary” table can be
deleted when the child object is handed directly to Session.delete():
session.delete(somechild)
There are several possibilities here:
- If there is a relationship() from Parent to Child, but there is not a
  reverse-relationship that links a particular Child to each Parent,
  SQLAlchemy will not have any awareness that when deleting this particular
  Child object, it needs to maintain the “secondary” table that links it to
  the Parent. No delete of the “secondary” table will occur.
- If there is a relationship that links a particular Child to each Parent,
  suppose it’s called Child.parents, SQLAlchemy by default will load in the
  Child.parents collection to locate all Parent objects, and remove each row
  from the “secondary” table which establishes this link. Note that this
  relationship does not need to be bidirectional; SQLAlchemy is strictly
  looking at every relationship() associated with the Child object being
  deleted.
- A higher performing option here is to use ON DELETE CASCADE directives with
  the foreign keys used by the database. Assuming the database supports this
  feature, the database itself can be made to automatically delete rows in the
  “secondary” table as referencing rows in “child” are deleted. SQLAlchemy can
  be instructed to forego actively loading in the Child.parents collection in
  this case using the relationship.passive_deletes directive on
  relationship(); see Using foreign key ON DELETE cascade with ORM
  relationships for more details on this.
Note again, these behaviors are only relevant to the relationship.secondary
option used with relationship(). If dealing with association tables that are
mapped explicitly and are not present in the relationship.secondary option of
a relevant relationship(), cascade rules can be used instead to automatically
delete entities in reaction to a related entity being deleted - see Cascades
for information on this feature.
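The first two behaviors can be observed with a short sketch, assuming an in-memory SQLite database and a hypothetical helper named association_rows() that counts rows in the “secondary” table. Removing a child from the collection deletes its association row, and because Child.parents exists here (through the backref), handing a child to Session.delete() also clears its association row.

```python
from sqlalchemy import Column, ForeignKey, Integer, Table, create_engine, func, select
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()

association_table = Table(
    "association",
    Base.metadata,
    Column("left_id", ForeignKey("left.id"), primary_key=True),
    Column("right_id", ForeignKey("right.id"), primary_key=True),
)


class Parent(Base):
    __tablename__ = "left"
    id = Column(Integer, primary_key=True)
    children = relationship("Child", secondary=association_table, backref="parents")


class Child(Base):
    __tablename__ = "right"
    id = Column(Integer, primary_key=True)


def association_rows(session):
    """Hypothetical helper: count the rows in the "secondary" table."""
    stmt = select(func.count()).select_from(association_table)
    return session.execute(stmt).scalar_one()


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    parent = Parent(children=[Child(), Child()])
    session.add(parent)
    session.commit()
    rows_initial = association_rows(session)

    # Removing from the collection deletes the association row on flush.
    parent.children.remove(parent.children[0])
    session.commit()
    rows_after_remove = association_rows(session)

    # Child.parents exists via the backref, so deleting the Child also
    # removes the association row that links it to the Parent.
    session.delete(parent.children[0])
    session.commit()
    rows_after_delete = association_rows(session)
```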
Association Object
The association object pattern is a variant on many-to-many: it’s used
when your association table contains additional columns beyond those
which are foreign keys to the left and right tables. Instead of using
the relationship.secondary argument, you map a new class directly to the
association table. The left side of the relationship references the
association object via one-to-many, and the association class references the
right side via many-to-one. Below we illustrate an association table mapped to
the Association class which includes a column called extra_data, which is a
string value that is stored along with each association between Parent and
Child:
class Association(Base):
    __tablename__ = "association"
    left_id = Column(ForeignKey("left.id"), primary_key=True)
    right_id = Column(ForeignKey("right.id"), primary_key=True)
    extra_data = Column(String(50))
    child = relationship("Child")


class Parent(Base):
    __tablename__ = "left"
    id = Column(Integer, primary_key=True)
    children = relationship("Association")


class Child(Base):
    __tablename__ = "right"
    id = Column(Integer, primary_key=True)
As always, the bidirectional version makes use of relationship.back_populates
or relationship.backref:
class Association(Base):
    __tablename__ = "association"
    left_id = Column(ForeignKey("left.id"), primary_key=True)
    right_id = Column(ForeignKey("right.id"), primary_key=True)
    extra_data = Column(String(50))
    child = relationship("Child", back_populates="parents")
    parent = relationship("Parent", back_populates="children")


class Parent(Base):
    __tablename__ = "left"
    id = Column(Integer, primary_key=True)
    children = relationship("Association", back_populates="parent")


class Child(Base):
    __tablename__ = "right"
    id = Column(Integer, primary_key=True)
    parents = relationship("Association", back_populates="child")
Working with the association pattern in its direct form requires that child objects are associated with an association instance before being appended to the parent; similarly, access from parent to child goes through the association object:
# create parent, append a child via association
p = Parent()
a = Association(extra_data="some data")
a.child = Child()
p.children.append(a)

# iterate through child objects via association, including association
# attributes
for assoc in p.children:
    print(assoc.extra_data)
    print(assoc.child)
To enhance the association object pattern such that direct access to the
Association object is optional, SQLAlchemy provides the Association Proxy
extension. This extension allows the configuration of attributes which will
access two “hops” with a single access, one “hop” to the associated object,
and a second to a target attribute.
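A minimal sketch of the two-hop access, using association_proxy() from sqlalchemy.ext.associationproxy. The attribute name child_associations and the creator lambda are illustrative choices, not names taken from the text above. Appending a Child to the proxied Parent.children collection builds the intermediate Association object behind the scenes.

```python
from sqlalchemy import Column, ForeignKey, Integer, String
from sqlalchemy.ext.associationproxy import association_proxy
from sqlalchemy.orm import declarative_base, relationship

Base = declarative_base()


class Association(Base):
    __tablename__ = "association"
    left_id = Column(ForeignKey("left.id"), primary_key=True)
    right_id = Column(ForeignKey("right.id"), primary_key=True)
    extra_data = Column(String(50))
    child = relationship("Child")
    parent = relationship("Parent", back_populates="child_associations")


class Parent(Base):
    __tablename__ = "left"
    id = Column(Integer, primary_key=True)
    child_associations = relationship("Association", back_populates="parent")

    # First hop: child_associations; second hop: Association.child.
    # The creator builds the Association when a Child is appended directly.
    children = association_proxy(
        "child_associations",
        "child",
        creator=lambda child: Association(child=child),
    )


class Child(Base):
    __tablename__ = "right"
    id = Column(Integer, primary_key=True)


p = Parent()
c = Child()
p.children.append(c)  # creates Association(child=c) implicitly

assoc = p.child_associations[0]
proxied_children = list(p.children)
```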
Warning
The association object pattern does not coordinate changes with a separate relationship that maps the association table as “secondary”.
Below, changes made to Parent.children will not be coordinated with changes
made to Parent.child_associations or Child.parent_associations in Python;
while all of these relationships will continue to function normally by
themselves, changes on one will not show up in another until the Session is
expired, which normally occurs automatically after Session.commit():
class Association(Base):
    __tablename__ = "association"
    left_id = Column(ForeignKey("left.id"), primary_key=True)
    right_id = Column(ForeignKey("right.id"), primary_key=True)
    extra_data = Column(String(50))
    child = relationship("Child", backref="parent_associations")
    parent = relationship("Parent", backref="child_associations")


class Parent(Base):
    __tablename__ = "left"
    id = Column(Integer, primary_key=True)
    children = relationship("Child", secondary="association")


class Child(Base):
    __tablename__ = "right"
    id = Column(Integer, primary_key=True)
Additionally, just as changes to one relationship aren’t reflected in the
others automatically, writing the same data to both relationships will cause
conflicting INSERT or DELETE statements as well, such as below where we
establish the same relationship between a Parent and Child object twice:
p1 = Parent()
c1 = Child()
p1.children.append(c1)

# redundant, will cause a duplicate INSERT on Association
p1.child_associations.append(Association(child=c1))
It’s fine to use a mapping like the above if you know what you’re doing,
though it may be a good idea to apply the viewonly=True parameter to the
“secondary” relationship to avoid the issue of redundant changes being logged.
However, to get a foolproof pattern that allows a simple two-object
Parent->Child relationship while still using the association object pattern,
use the association proxy extension as documented at Association Proxy.
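One way the viewonly=True suggestion might look is sketched below, assuming an in-memory SQLite database. All writes go through the association objects, while Parent.children is a read-only view over the same table, so only one INSERT is issued per link.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine, func, select
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class Association(Base):
    __tablename__ = "association"
    left_id = Column(ForeignKey("left.id"), primary_key=True)
    right_id = Column(ForeignKey("right.id"), primary_key=True)
    extra_data = Column(String(50))
    child = relationship("Child", backref="parent_associations")
    parent = relationship("Parent", backref="child_associations")


class Parent(Base):
    __tablename__ = "left"
    id = Column(Integer, primary_key=True)
    # Read-only view: changes to this collection are never flushed.
    children = relationship("Child", secondary="association", viewonly=True)


class Child(Base):
    __tablename__ = "right"
    id = Column(Integer, primary_key=True)


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    p1 = Parent()
    c1 = Child()
    # Write through the association object only.
    p1.child_associations.append(Association(child=c1, extra_data="some data"))
    session.add(p1)
    session.commit()

    # After the commit expires p1, the view loads the linked Child.
    viewed_children = list(p1.children)
    row_count = session.execute(
        select(func.count()).select_from(Association.__table__)
    ).scalar_one()
```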
Late-Evaluation of Relationship Arguments
Many of the examples in the preceding sections illustrate mappings
where the various relationship() constructs refer to their target
classes using a string name, rather than the class itself:
class Parent(Base):
    # ...

    children = relationship("Child", back_populates="parent")


class Child(Base):
    # ...

    parent = relationship("Parent", back_populates="children")
These string names are resolved into classes in the mapper resolution stage,
which is an internal process that occurs typically after all mappings have
been defined and is normally triggered by the first usage of the mappings
themselves. The registry object is the container in which these names are
stored and resolved to the mapped classes they refer to.
In addition to the main class argument for relationship(), other arguments
which depend upon the columns present on an as-yet undefined class may also be
specified either as Python functions, or more commonly as strings. For most of
these arguments except that of the main argument, string inputs are
evaluated as Python expressions using Python’s built-in eval() function,
as they are intended to receive complete SQL expressions.
Warning
As the Python eval() function is used to interpret the late-evaluated string
arguments passed to relationship() mapper configuration construct, these
arguments should not be repurposed such that they would receive untrusted
user input; eval() is not secure against untrusted user input.
The full namespace available within this evaluation includes all classes mapped
for this declarative base, as well as the contents of the sqlalchemy
package, including expression functions like desc() and
sqlalchemy.sql.functions.func:
class Parent(Base):
    # ...

    children = relationship(
        "Child",
        order_by="desc(Child.email_address)",
        primaryjoin="Parent.id == Child.parent_id",
    )
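To see the string arguments take effect, the sketch below fills in the elided columns with assumed definitions (an email_address string column and a parent_id foreign key) and uses an in-memory SQLite database. Because order_by is resolved to desc(Child.email_address), the collection loads in descending order of email address.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class Parent(Base):
    __tablename__ = "parent"
    id = Column(Integer, primary_key=True)
    # Both strings are evaluated lazily, once all mappings are available.
    children = relationship(
        "Child",
        order_by="desc(Child.email_address)",
        primaryjoin="Parent.id == Child.parent_id",
    )


class Child(Base):
    __tablename__ = "child"
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey("parent.id"))
    email_address = Column(String(100))  # assumed column for the example


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    parent = Parent(
        children=[
            Child(email_address="a@example.com"),
            Child(email_address="c@example.com"),
            Child(email_address="b@example.com"),
        ]
    )
    session.add(parent)
    session.commit()

    # The commit expires the collection; reloading it applies ORDER BY ... DESC.
    ordered = [child.email_address for child in parent.children]
```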
For the case where more than one module contains a class of the same name, string class names can also be specified as module-qualified paths within any of these string expressions:
class Parent(Base):
    # ...

    children = relationship(
        "myapp.mymodel.Child",
        order_by="desc(myapp.mymodel.Child.email_address)",
        primaryjoin="myapp.mymodel.Parent.id == myapp.mymodel.Child.parent_id",
    )
The qualified path can be any partial path that removes ambiguity between
the names. For example, to disambiguate between myapp.model1.Child and
myapp.model2.Child, we can specify model1.Child or model2.Child:
class Parent(Base):
    # ...

    children = relationship(
        "model1.Child",
        order_by="desc(model1.Child.email_address)",
        primaryjoin="Parent.id == model1.Child.parent_id",
    )
The relationship() construct also accepts Python functions or lambdas as input
for these arguments. This has the advantage of providing more compile-time
safety and better support for IDEs and PEP 484 scenarios.
A Python functional approach might look like the following:
from sqlalchemy import desc


def _resolve_child_model():
    from myapplication import Child

    return Child


class Parent(Base):
    # ...

    children = relationship(
        # pass the function itself so the import is deferred until
        # mappers are configured
        _resolve_child_model,
        order_by=lambda: desc(_resolve_child_model().email_address),
        primaryjoin=lambda: Parent.id == _resolve_child_model().parent_id,
    )
The full list of parameters which accept Python functions/lambdas or strings
that will be passed to eval() are:
relationship.order_by
relationship.primaryjoin
relationship.secondaryjoin
relationship.secondary
relationship.remote_side
relationship.foreign_keys
relationship._user_defined_foreign_keys
Changed in version 1.3.16: Prior to SQLAlchemy 1.3.16, the main
relationship.argument to relationship() was also evaluated through eval().
As of 1.3.16 the string name is resolved from the class resolver directly
without supporting custom Python expressions.
Warning
As stated previously, the above parameters to relationship()
are evaluated as Python code expressions using eval(). DO NOT PASS
UNTRUSTED INPUT TO THESE ARGUMENTS.
It should also be noted that in a similar way as described at
Appending additional columns to an existing Declarative mapped class, any
MapperProperty construct can be added to a declarative base mapping at any
time. If we wanted to implement this relationship() after the Child
class were available, we could also apply it afterwards:
# first, module A, where Child has not been created yet,
# we create a Parent class which knows nothing about Child
class Parent(Base):
    ...


# ... later, in Module B, which is imported after module A:
class Child(Base):
    ...


from module_a import Parent

# assign the Parent.children relationship as a class variable. The
# declarative base class will intercept this and map the relationship.
Parent.children = relationship(Child, primaryjoin=Child.parent_id == Parent.id)
Note
Assignment of mapped properties to a declaratively mapped class will only
function correctly if the “declarative base” class is used, which also
provides for a metaclass-driven __setattr__() method which will intercept
these operations. It will not work if the declarative decorator provided by
registry.mapped() is used, nor will it work for an imperatively mapped class
mapped by registry.map_imperatively().
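A single-file sketch of the same idea, with both classes defined in one module for brevity and with assumed column definitions in place of the elided class bodies. The relationship is attached after both classes exist, and the declarative base class maps it on assignment.

```python
from sqlalchemy import Column, ForeignKey, Integer
from sqlalchemy.orm import declarative_base, relationship

Base = declarative_base()


class Parent(Base):
    __tablename__ = "parent"
    id = Column(Integer, primary_key=True)  # assumed column


class Child(Base):
    __tablename__ = "child"
    id = Column(Integer, primary_key=True)  # assumed column
    parent_id = Column(Integer, ForeignKey("parent.id"))  # assumed column


# Attach the relationship once Child is available; the declarative
# metaclass intercepts the assignment and maps the new attribute.
Parent.children = relationship(Child, primaryjoin=Child.parent_id == Parent.id)

parent = Parent()
parent.children.append(Child())
child_total = len(parent.children)
```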
Late-Evaluation for a many-to-many relationship
Many-to-many relationships include a reference to an additional, typically
non-mapped Table object that is typically present in the MetaData collection
referred to by the registry. The late-evaluation system also includes support
for having this attribute be specified as a string argument which will be
resolved from this MetaData collection. Below we specify an association table
keyword_author, sharing the MetaData collection associated with our
declarative base and its registry. We can then refer to this Table by name in
the relationship.secondary parameter:
keyword_author = Table(
    "keyword_author",
    Base.metadata,
    Column("author_id", Integer, ForeignKey("authors.id")),
    Column("keyword_id", Integer, ForeignKey("keywords.id")),
)


class Author(Base):
    __tablename__ = "authors"
    id = Column(Integer, primary_key=True)
    keywords = relationship("Keyword", secondary="keyword_author")
For additional detail on many-to-many relationships see the section Many To Many.
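For completeness, a runnable sketch of the string form follows. It supplies a Keyword class, which the excerpt above refers to but does not define, so its columns are an assumption, and it uses an in-memory SQLite database. The string "keyword_author" is resolved to the Table when the mappers are first used.

```python
from sqlalchemy import Column, ForeignKey, Integer, Table, create_engine, func, select
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()

keyword_author = Table(
    "keyword_author",
    Base.metadata,
    Column("author_id", Integer, ForeignKey("authors.id")),
    Column("keyword_id", Integer, ForeignKey("keywords.id")),
)


class Author(Base):
    __tablename__ = "authors"
    id = Column(Integer, primary_key=True)
    # The table is named by string and looked up in Base.metadata lazily.
    keywords = relationship("Keyword", secondary="keyword_author")


class Keyword(Base):
    # Assumed definition; only the table name is implied by the text above.
    __tablename__ = "keywords"
    id = Column(Integer, primary_key=True)


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    author = Author(keywords=[Keyword(), Keyword()])
    session.add(author)
    session.commit()

    keyword_total = len(author.keywords)
    link_rows = session.execute(
        select(func.count()).select_from(keyword_author)
    ).scalar_one()
```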