Skip to content

Filtering via correlated subquery after joining across polymorphic subclasses is joining on wrong alias starting in 1.4.37 #8162

Closed
@jdimmerman

Description

@jdimmerman

Describe the bug

Consider having two polymorphic subclasses that can be joined to eachother via some mapping table. If you have a query that starts with one sublcass, joins to the mapping table, and then joins to the second subclass, a subsequent filter that users a correlated subquery against the mapping table is correlating using the incorrect instance of the base class's table.

To Reproduce

Note - I distilled a much more complex configuration into the following. There are clearly simpler ways to configure the goal of this example, but the complexity I added was needed to demonstrate the issue.

Consider:

from sqlalchemy import Column, Integer, String, ForeignKey
from sqlalchemy.orm import declarative_base, relationship

Base = declarative_base()

class Employee(Base):
    __tablename__ = "employee"
    id = Column(Integer, primary_key=True)
    name = Column(String(50))
    type = Column(String(50))

    __mapper_args__ = {
        "polymorphic_identity": "employee",
        "polymorphic_on": type,
    }


class Engineer(Employee):
    __tablename__ = "engineer"
    base_id = Column(Integer, ForeignKey(Employee.id), primary_key=True)
    engineer_name = Column(String(30))

    project_membership = relationship(
        "ProjectMembership",
        primaryjoin="Engineer.id == ProjectMembership.engineer_id",
        uselist=False,
    )

    __mapper_args__ = {
        "polymorphic_identity": "engineer",
    }


class Manager(Employee):
    __tablename__ = "manager"
    base_id = Column(Integer, ForeignKey(Employee.id), primary_key=True)
    manager_name = Column(String(30))

    __mapper_args__ = {
        "polymorphic_identity": "manager",
    }


class ProjectMembership(Base):
    __tablename__ = "project_membership"
    id = Column(Integer, primary_key=True)
    project_name = Column(String(50))
    manager_id = Column(Integer, ForeignKey(Manager.id), nullable=True)
    engineer_id = Column(Integer, ForeignKey(Engineer.id), nullable=True)

and the query

query = (
    Engineer.query
    .outerjoin(ProjectMembership, ProjectMembership.engineer_id == Engineer.id)
    .outerjoin(Manager, ProjectMembership.manager_id == Manager.id)
    .filter(~Engineer.project_membership.has())
)

In 1.4.36, this emits:

SELECT employee.id,
    employee.name,
    employee.type,
    engineer.base_id,
    engineer.engineer_name
FROM employee
    JOIN engineer ON employee.id = engineer.base_id
    LEFT OUTER JOIN project_membership ON project_membership.engineer_id = employee.id
    LEFT OUTER JOIN (
        employee AS employee_1
        JOIN manager AS manager_1 ON employee_1.id = manager_1.base_id
    ) ON project_membership.manager_id = employee_1.id
WHERE NOT (
        EXISTS (
            SELECT 1
            FROM project_membership
            WHERE employee.id = project_membership.engineer_id
        )
    )

Starting in 1.4.37, this emits

SELECT employee.id,
    employee.name,
    employee.type,
    engineer.base_id,
    engineer.engineer_name
FROM employee
    JOIN engineer ON employee.id = engineer.base_id
    LEFT OUTER JOIN project_membership ON project_membership.engineer_id = employee.id
    LEFT OUTER JOIN (
        employee AS employee_1
        JOIN manager AS manager_1 ON employee_1.id = manager_1.base_id
    ) ON project_membership.manager_id = employee_1.id
WHERE NOT (
        EXISTS (
            SELECT 1
            FROM project_membership
            WHERE employee_1.id = project_membership.engineer_id
        )
    )

Notice the WHERE clause of the correlated subquery. In 1.4.36, this references employee.id, which is the instance associated with Engineer. In 1.4.37, this references employee_1.id, which is the instance associated with Manager.

Error

N/A

Versions

  • OS: Debian GNU/Linux 11 (bullseye)
  • Python: 3.10.2
  • SQLAlchemy: 1.4.37 (as compared to behavior 1.4.36)
  • Database: postgres:12 (docker image)
  • DBAPI (eg: psycopg, cx_oracle, mysqlclient): psycopg2 2.9.3

Additional context

Thanks!

Activity

added
bugSomething isn't working
and removed
requires triageNew issue that requires categorization
on Jun 23, 2022
zzzeek

zzzeek commented on Jun 23, 2022

@zzzeek
Member

hi -

This is indeed a behavioral change that is likely unintended. However I would note that your code emits a warning under both 1.4.36 and 1.4.37. If you follow the warning's guidance, the issue is fixed.

the warning:

SAWarning: An alias is being generated automatically against joined entity mapped class Manager->manager due to overlapping tables.  This is a legacy pattern which may be deprecated in a later release.  Use the aliased(<entity>, flat=True) construct explicitly, see the linked example. (Background on this error at: https://sqlalche.me/e/14/xaj2)

following the warning's guidance will fix the issue as SQLAlchemy no longer has to guess what the scope of the aliased "employee" table is:

s = Session()

mm = aliased(Manager, flat=True)

query = (
    s.query(Engineer)
    .outerjoin(ProjectMembership, ProjectMembership.engineer_id == Engineer.id)
    .outerjoin(mm, ProjectMembership.manager_id == mm.id)
    .filter(~Engineer.project_membership.has())
)

print(query)
SELECT employee.id AS employee_id, employee.name AS employee_name, employee.type AS employee_type, engineer.base_id AS engineer_base_id, engineer.engineer_name AS engineer_engineer_name 
FROM employee JOIN engineer ON employee.id = engineer.base_id LEFT OUTER JOIN project_membership ON project_membership.engineer_id = employee.id LEFT OUTER JOIN (employee AS employee_1 JOIN manager AS manager_1 ON employee_1.id = manager_1.base_id) ON project_membership.manager_id = employee_1.id 
WHERE NOT (EXISTS (SELECT 1 
FROM project_membership 
WHERE employee.id = project_membership.engineer_id))
added this to the 1.4.x milestone on Jun 23, 2022
zzzeek

zzzeek commented on Jun 23, 2022

@zzzeek
Member

from #8064 it's the change in util/ORMAdapter/_include_fn

added
regressionsomething worked and was broken by a change
on Jun 23, 2022
sqla-tester

sqla-tester commented on Jun 23, 2022

@sqla-tester
Collaborator

Mike Bayer has proposed a fix for this issue in the main branch:

refine _include_fn to not include sibling mappers https://gerrit.sqlalchemy.org/c/sqlalchemy/sqlalchemy/+/3942

sqla-tester

sqla-tester commented on Jun 23, 2022

@sqla-tester
Collaborator

Mike Bayer has proposed a fix for this issue in the rel_1_4 branch:

refine _include_fn to not include sibling mappers https://gerrit.sqlalchemy.org/c/sqlalchemy/sqlalchemy/+/3943

jdimmerman

jdimmerman commented on Jun 23, 2022

@jdimmerman
Author

Makes sense, thanks @zzzeek. We were filtering out warnings, which I'll fix and we'll make that reference explicit.

added a commit that references this issue on Jun 23, 2022
c4ee403
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't workingormregressionsomething worked and was broken by a change

    Type

    No type

    Projects

    No projects

    Relationships

    None yet

      Development

      No branches or pull requests

        Participants

        @zzzeek@sqla-tester@jdimmerman

        Issue actions

          Filtering via correlated subquery after joining across polymorphic subclasses is joining on wrong alias starting in 1.4.37 · Issue #8162 · sqlalchemy/sqlalchemy