Closed
Description
Describe the bug
Upgrading to SQLAlchemy 1.4 breaks queries that use IN
to compare Array-typed columns (Postgresql) to multiple values. This used to work under 1.3.x.
To Reproduce
#!/usr/bin/env python
import sys
from sqlalchemy import Column, Integer, engine
from sqlalchemy.orm import Session
from sqlalchemy.dialects.postgresql import ARRAY
from sqlalchemy.ext.declarative import declarative_base
Base = declarative_base()
class Foo(Base):
__tablename__ = "foo"
id = Column(Integer, primary_key=True)
arr = Column(ARRAY(Integer))
if __name__ == "__main__":
eng = engine.create_engine(sys.argv[1])
session = Session(eng)
res = list(session.query(Foo).filter(Foo.arr.in_([[1], [2], [3]])))
print(res)
Error
Traceback (most recent call last):
File "/home/marc/src/sqlalchemy/lib/sqlalchemy/engine/base.py", line 1699, in _execute_context
context = constructor(
File "/home/marc/src/sqlalchemy/lib/sqlalchemy/engine/default.py", line 1015, in _init_compiled
expanded_state = compiled._process_parameters_for_postcompile(
File "/home/marc/src/sqlalchemy/lib/sqlalchemy/sql/compiler.py", line 1198, in _process_parameters_for_postcompile
statement = re.sub(
File "/usr/lib/python3.8/re.py", line 210, in sub
return _compile(pattern, flags).sub(repl, string, count)
File "/home/marc/src/sqlalchemy/lib/sqlalchemy/sql/compiler.py", line 1196, in process_expanding
return replacement_expressions[m.group(1)]
KeyError: 'arr_1])::INTEGER['
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "./alchemy_repro.py", line 23, in <module>
sys.exit(main(sys.argv))
File "./alchemy_repro.py", line 19, in main
res = list(session.query(Foo).filter(Foo.arr.in_([[1], [2], [3]])))
File "/home/marc/src/sqlalchemy/lib/sqlalchemy/orm/query.py", line 2839, in __iter__
return self._iter().__iter__()
File "/home/marc/src/sqlalchemy/lib/sqlalchemy/orm/query.py", line 2846, in _iter
result = self.session.execute(
File "/home/marc/src/sqlalchemy/lib/sqlalchemy/orm/session.py", line 1689, in execute
result = conn._execute_20(statement, params or {}, execution_options)
File "/home/marc/src/sqlalchemy/lib/sqlalchemy/engine/base.py", line 1611, in _execute_20
return meth(self, args_10style, kwargs_10style, execution_options)
File "/home/marc/src/sqlalchemy/lib/sqlalchemy/sql/elements.py", line 323, in _execute_on_connection
return connection._execute_clauseelement(
File "/home/marc/src/sqlalchemy/lib/sqlalchemy/engine/base.py", line 1478, in _execute_clauseelement
ret = self._execute_context(
File "/home/marc/src/sqlalchemy/lib/sqlalchemy/engine/base.py", line 1705, in _execute_context
self._handle_dbapi_exception(
File "/home/marc/src/sqlalchemy/lib/sqlalchemy/engine/base.py", line 2023, in _handle_dbapi_exception
util.raise_(
File "/home/marc/src/sqlalchemy/lib/sqlalchemy/util/compat.py", line 207, in raise_
raise exception
File "/home/marc/src/sqlalchemy/lib/sqlalchemy/engine/base.py", line 1699, in _execute_context
context = constructor(
File "/home/marc/src/sqlalchemy/lib/sqlalchemy/engine/default.py", line 1015, in _init_compiled
expanded_state = compiled._process_parameters_for_postcompile(
File "/home/marc/src/sqlalchemy/lib/sqlalchemy/sql/compiler.py", line 1198, in _process_parameters_for_postcompile
statement = re.sub(
File "/usr/lib/python3.8/re.py", line 210, in sub
return _compile(pattern, flags).sub(repl, string, count)
File "/home/marc/src/sqlalchemy/lib/sqlalchemy/sql/compiler.py", line 1196, in process_expanding
return replacement_expressions[m.group(1)]
sqlalchemy.exc.StatementError: (builtins.KeyError) 'arr_1])::INTEGER['
[SQL: SELECT foo.id AS foo_id, foo.arr AS foo_arr
FROM foo
WHERE foo.arr IN ([POSTCOMPILE_arr_1])::INTEGER[]]
[parameters: [{}]]
Versions
- OS: Linux (Ubuntu 20.04)
- Python: 3.8.10
- SQLAlchemy: 1.4.25
- Database: Postgresql
- DBAPI: psycopg
Additional context
Note that the reproducer doesn't require the database schema to be set up as the error happens in the compilation phase.
The problem can be "fixed" by tightening the regular expression at sql/compiler.py:1199
to exclude closing square brackets. But that just triggers a different problem.
Metadata
Metadata
Assignees
Labels
Type
Projects
Milestone
Relationships
Development
No branches or pull requests
Activity
CaselIT commentedon Oct 11, 2021
hi,
this is a combination of the cast heuristic an the post-compile of the in. I think this happens also in other dbapi. (meaning that the cast we add wrong here)
thanks for reporting
zzzeek commentedon Oct 11, 2021
this maybe should figure out to not fall into the "expanding IN" category at all so there is no need to work with the regexp here. but that means it has to generate a cache key against the literal values.
Here are three workarounds:
zzzeek commentedon Oct 11, 2021
well if it is in "expanding IN" then caching still works. array() skips caching for now, but the tuple_() form does cache, which means if you were querying with different length lists every time it would fill up your cache with an entry for every length. so tuple_() is not that safe to use here.
mschmitzer commentedon Oct 11, 2021
The
array
workaround seems to work. Thanks for the quick responses!zzzeek commentedon Oct 11, 2021
wow this one sucks
zzzeek commentedon Oct 11, 2021
oh. something just popped out that is easier. why is that.
zzzeek commentedon Oct 11, 2021
ok here is the minimal fix for this exact situation:
we can start with that
zzzeek commentedon Oct 11, 2021
the mcve
3 remaining items