Skip to content

IN conditions with Array value broken in 1.4 #7177

Closed
@mschmitzer

Description

@mschmitzer

Describe the bug

Upgrading to SQLAlchemy 1.4 breaks queries that use IN to compare Array-typed columns (Postgresql) to multiple values. This used to work under 1.3.x.

To Reproduce

#!/usr/bin/env python
import sys

from sqlalchemy import Column, Integer, engine
from sqlalchemy.orm import Session
from sqlalchemy.dialects.postgresql import ARRAY
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class Foo(Base):
    __tablename__ = "foo"
    id = Column(Integer, primary_key=True)
    arr = Column(ARRAY(Integer))

if __name__ == "__main__":
    eng = engine.create_engine(sys.argv[1])
    session = Session(eng)
    res = list(session.query(Foo).filter(Foo.arr.in_([[1], [2], [3]])))
    print(res)

Error

Traceback (most recent call last):
  File "/home/marc/src/sqlalchemy/lib/sqlalchemy/engine/base.py", line 1699, in _execute_context
    context = constructor(
  File "/home/marc/src/sqlalchemy/lib/sqlalchemy/engine/default.py", line 1015, in _init_compiled
    expanded_state = compiled._process_parameters_for_postcompile(
  File "/home/marc/src/sqlalchemy/lib/sqlalchemy/sql/compiler.py", line 1198, in _process_parameters_for_postcompile
    statement = re.sub(
  File "/usr/lib/python3.8/re.py", line 210, in sub
    return _compile(pattern, flags).sub(repl, string, count)
  File "/home/marc/src/sqlalchemy/lib/sqlalchemy/sql/compiler.py", line 1196, in process_expanding
    return replacement_expressions[m.group(1)]
KeyError: 'arr_1])::INTEGER['

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "./alchemy_repro.py", line 23, in <module>
    sys.exit(main(sys.argv))
  File "./alchemy_repro.py", line 19, in main
    res = list(session.query(Foo).filter(Foo.arr.in_([[1], [2], [3]])))
  File "/home/marc/src/sqlalchemy/lib/sqlalchemy/orm/query.py", line 2839, in __iter__
    return self._iter().__iter__()
  File "/home/marc/src/sqlalchemy/lib/sqlalchemy/orm/query.py", line 2846, in _iter
    result = self.session.execute(
  File "/home/marc/src/sqlalchemy/lib/sqlalchemy/orm/session.py", line 1689, in execute
    result = conn._execute_20(statement, params or {}, execution_options)
  File "/home/marc/src/sqlalchemy/lib/sqlalchemy/engine/base.py", line 1611, in _execute_20
    return meth(self, args_10style, kwargs_10style, execution_options)
  File "/home/marc/src/sqlalchemy/lib/sqlalchemy/sql/elements.py", line 323, in _execute_on_connection
    return connection._execute_clauseelement(
  File "/home/marc/src/sqlalchemy/lib/sqlalchemy/engine/base.py", line 1478, in _execute_clauseelement
    ret = self._execute_context(
  File "/home/marc/src/sqlalchemy/lib/sqlalchemy/engine/base.py", line 1705, in _execute_context
    self._handle_dbapi_exception(
  File "/home/marc/src/sqlalchemy/lib/sqlalchemy/engine/base.py", line 2023, in _handle_dbapi_exception
    util.raise_(
  File "/home/marc/src/sqlalchemy/lib/sqlalchemy/util/compat.py", line 207, in raise_
    raise exception
  File "/home/marc/src/sqlalchemy/lib/sqlalchemy/engine/base.py", line 1699, in _execute_context
    context = constructor(
  File "/home/marc/src/sqlalchemy/lib/sqlalchemy/engine/default.py", line 1015, in _init_compiled
    expanded_state = compiled._process_parameters_for_postcompile(
  File "/home/marc/src/sqlalchemy/lib/sqlalchemy/sql/compiler.py", line 1198, in _process_parameters_for_postcompile
    statement = re.sub(
  File "/usr/lib/python3.8/re.py", line 210, in sub
    return _compile(pattern, flags).sub(repl, string, count)
  File "/home/marc/src/sqlalchemy/lib/sqlalchemy/sql/compiler.py", line 1196, in process_expanding
    return replacement_expressions[m.group(1)]
sqlalchemy.exc.StatementError: (builtins.KeyError) 'arr_1])::INTEGER['
[SQL: SELECT foo.id AS foo_id, foo.arr AS foo_arr 
FROM foo 
WHERE foo.arr IN ([POSTCOMPILE_arr_1])::INTEGER[]]
[parameters: [{}]]

Versions

  • OS: Linux (Ubuntu 20.04)
  • Python: 3.8.10
  • SQLAlchemy: 1.4.25
  • Database: Postgresql
  • DBAPI: psycopg

Additional context

Note that the reproducer doesn't require the database schema to be set up as the error happens in the compilation phase.

The problem can be "fixed" by tightening the regular expression at sql/compiler.py:1199 to exclude closing square brackets. But that just triggers a different problem.

Activity

added
bugSomething isn't working
datatypesthings to do with database types, like VARCHAR and others
regressionsomething worked and was broken by a change
and removed
requires triageNew issue that requires categorization
on Oct 11, 2021
added this to the 1.4.x milestone on Oct 11, 2021
CaselIT

CaselIT commented on Oct 11, 2021

@CaselIT
Member

hi,

this is a combination of the cast heuristic an the post-compile of the in. I think this happens also in other dbapi. (meaning that the cast we add wrong here)

thanks for reporting

zzzeek

zzzeek commented on Oct 11, 2021

@zzzeek
Member

this maybe should figure out to not fall into the "expanding IN" category at all so there is no need to work with the regexp here. but that means it has to generate a cache key against the literal values.

Here are three workarounds:

from sqlalchemy import tuple_
from sqlalchemy.dialects.postgresql import array

res = list(session.query(Foo).filter(Foo.arr.in_([array([1]), array([2]), array([3])])))
print(res)

res = list(session.query(Foo).filter(Foo.arr.in_(tuple_(array([1]), array([2]), array([3])))))
print(res)

res = list(session.query(Foo).filter(Foo.arr.in_(tuple_([1], [2], [3]))))
print(res)
zzzeek

zzzeek commented on Oct 11, 2021

@zzzeek
Member

well if it is in "expanding IN" then caching still works. array() skips caching for now, but the tuple_() form does cache, which means if you were querying with different length lists every time it would fill up your cache with an entry for every length. so tuple_() is not that safe to use here.

mschmitzer

mschmitzer commented on Oct 11, 2021

@mschmitzer
Author

The array workaround seems to work. Thanks for the quick responses!

zzzeek

zzzeek commented on Oct 11, 2021

@zzzeek
Member

wow this one sucks

zzzeek

zzzeek commented on Oct 11, 2021

@zzzeek
Member

oh. something just popped out that is easier. why is that.

zzzeek

zzzeek commented on Oct 11, 2021

@zzzeek
Member

ok here is the minimal fix for this exact situation:

diff --git a/lib/sqlalchemy/dialects/postgresql/psycopg2.py b/lib/sqlalchemy/dialects/postgresql/psycopg2.py
index a71bdf7606..c2e9153087 100644
--- a/lib/sqlalchemy/dialects/postgresql/psycopg2.py
+++ b/lib/sqlalchemy/dialects/postgresql/psycopg2.py
@@ -649,8 +649,7 @@ class PGCompiler_psycopg2(PGCompiler):
             bindparam.type._is_array or bindparam.type._is_type_decorator
         ):
             typ = bindparam.type._unwrapped_dialect_impl(self.dialect)
-
-            if typ._is_array:
+            if not bindparam.expanding and typ._is_array:
                 text += "::%s" % (
                     elements.TypeClause(typ)._compiler_dispatch(
                         self, skip_bind_expression=skip_bind_expression, **kw
diff --git a/lib/sqlalchemy/sql/compiler.py b/lib/sqlalchemy/sql/compiler.py
index 333ed36f41..9a754fcbe1 100644
--- a/lib/sqlalchemy/sql/compiler.py
+++ b/lib/sqlalchemy/sql/compiler.py
@@ -1196,7 +1196,7 @@ class SQLCompiler(Compiled):
             return replacement_expressions[m.group(1)]
 
         statement = re.sub(
-            r"\[POSTCOMPILE_(\S+)\]", process_expanding, self.string
+            r"\[POSTCOMPILE_(\S+?)\]", process_expanding, self.string
         )
 
         expanded_state = ExpandedState(
@@ -2013,7 +2013,7 @@ class SQLCompiler(Compiled):
                     [parameter.type], parameter.expand_op
                 )
 
-        elif isinstance(values[0], (tuple, list)):
+        elif isinstance(values[0], (tuple, list)) and not parameter.type._is_array:
             to_update = [
                 ("%s_%s_%s" % (name, i, j), value)
                 for i, tuple_element in enumerate(values, 1)

we can start with that

zzzeek

zzzeek commented on Oct 11, 2021

@zzzeek
Member

the mcve

from sqlalchemy import Column
from sqlalchemy import engine
from sqlalchemy import Integer
from sqlalchemy.dialects.postgresql import ARRAY
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import Session

Base = declarative_base()

class Foo(Base):
    __tablename__ = "foo"
    id = Column(Integer, primary_key=True)
    arr = Column(ARRAY(Integer))

if __name__ == "__main__":
    eng = engine.create_engine("postgresql://scott:tiger@localhost/test", echo=True)
    Base.metadata.drop_all(eng)
    Base.metadata.create_all(eng)
    session = Session(eng)
    session.add(Foo(arr=[2]))
    session.commit()

    res = list(session.query(Foo).filter(Foo.arr.in_([[1], [2], [3]])))

    print(res)

3 remaining items

Loading
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't workingdatatypesthings to do with database types, like VARCHAR and otherspostgresqlregressionsomething worked and was broken by a change

    Type

    No type

    Projects

    No projects

    Relationships

    None yet

      Development

      No branches or pull requests

        Participants

        @zzzeek@mschmitzer@sqla-tester@CaselIT

        Issue actions

          IN conditions with Array value broken in 1.4 · Issue #7177 · sqlalchemy/sqlalchemy