Skip to content

reduce JSON operator precedence to 5 and add tests #9836

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
jmriebold opened this issue May 25, 2023 · 8 comments
Closed

reduce JSON operator precedence to 5 and add tests #9836

jmriebold opened this issue May 25, 2023 · 8 comments
Assignees
Labels
bug Something isn't working great mcve An issue with a great mcve near-term release addition to the milestone which indicates this should be in a near-term release postgresql sql
Milestone

Comments

@jmriebold
Copy link

jmriebold commented May 25, 2023

Describe the bug

I'm trying to query for rows where a JSONB column contains any of an array of JSON values. The raw SQL query I wrote works (cribbing from this post), and looks like this:

SELECT t1.id, t1.jsonb_data
FROM t1
WHERE t1.jsonb_data @> ANY (CAST(ARRAY['[{"foo": "bar"}]', '[{"foo": "baz"}]'] AS JSONB[]))

which I translated into SQLA as

stmt = select(t1).filter(
    t1.c.jsonb_data.contains(any_(cast(array([str([{'foo': ii}]) for ii in ['bar', 'baz']]), ARRAY(JSONB))))
)

However, that produces the following SQL, which results in a syntax error:

SELECT t1.id, t1.jsonb_data
FROM t1
WHERE t1.jsonb_data @> (ANY (CAST(ARRAY['[{"foo": "bar"}]', '[{"foo": "baz"}]'] AS JSONB[])))

It seems the extra parentheses break the query.

Optional link from https://docs.sqlalchemy.org which documents the behavior that is expected

No response

SQLAlchemy Version in Use

2.0.15

DBAPI (i.e. the database driver)

asyncpg==0.27.0

Database Vendor and Major Version

PostgreSQL 15.2

Python Version

3.10

Operating system

Linux

To Reproduce

from asyncio import run

from sqlalchemy import Column
from sqlalchemy import Integer
from sqlalchemy import MetaData
from sqlalchemy import Table
from sqlalchemy import any_
from sqlalchemy import cast
from sqlalchemy import select
from sqlalchemy.dialects.postgresql import array, ARRAY, JSONB
from sqlalchemy.ext.asyncio import create_async_engine

meta = MetaData()
t1 = Table('t1', meta, Column('id', Integer(), primary_key=True, autoincrement=True),
           Column('jsonb_data', JSONB(none_as_null=True)))


async def main():
    engine_string = 'postgresql+asyncpg://postgres:password@localhost:5432/postgres'
    engine = create_async_engine(engine_string, echo=True)
    async with engine.begin() as conn:
        await conn.run_sync(meta.create_all)

    async with engine.connect() as conn:
        stmt = select(t1).filter(
            t1.c.jsonb_data.contains(any_(cast(array([str([{'foo': ii}]) for ii in ['bar', 'baz']]), ARRAY(JSONB))))
        )
        await conn.execute(stmt)

    await engine.dispose()


if __name__ == '__main__':
    run(main())

Error

Traceback (most recent call last):
  File "/home/user/example/local/lib/python3.10/site-packages/sqlalchemy/dialects/postgresql/asyncpg.py", line 399, in _prepare_and_execute
    prepared_stmt, attributes = await adapt_connection._prepare(
  File "/home/user/example/local/lib/python3.10/site-packages/sqlalchemy/dialects/postgresql/asyncpg.py", line 641, in _prepare
    prepared_stmt = await self._connection.prepare(operation)
  File "/home/user/example/local/lib/python3.10/site-packages/asyncpg/connection.py", line 565, in prepare
    return await self._prepare(
  File "/home/user/example/local/lib/python3.10/site-packages/asyncpg/connection.py", line 583, in _prepare
    stmt = await self._get_statement(
  File "/home/user/example/local/lib/python3.10/site-packages/asyncpg/connection.py", line 397, in _get_statement
    statement = await self._protocol.prepare(
  File "asyncpg/protocol/protocol.pyx", line 168, in prepare
asyncpg.exceptions.PostgresSyntaxError: syntax error at or near "ANY"

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/user/example/local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1802, in _execute_context
    self.dialect.do_execute(
  File "/home/user/example/local/lib/python3.10/site-packages/sqlalchemy/engine/default.py", line 732, in do_execute
    cursor.execute(statement, parameters)
  File "/home/user/example/local/lib/python3.10/site-packages/sqlalchemy/dialects/postgresql/asyncpg.py", line 460, in execute
    self._adapt_connection.await_(
  File "/home/user/example/local/lib/python3.10/site-packages/sqlalchemy/util/_concurrency_py3k.py", line 76, in await_only
    return current.driver.switch(awaitable)
  File "/home/user/example/local/lib/python3.10/site-packages/sqlalchemy/util/_concurrency_py3k.py", line 129, in greenlet_spawn
    value = await result
  File "/home/user/example/local/lib/python3.10/site-packages/sqlalchemy/dialects/postgresql/asyncpg.py", line 435, in _prepare_and_execute
    self._handle_exception(error)
  File "/home/user/example/local/lib/python3.10/site-packages/sqlalchemy/dialects/postgresql/asyncpg.py", line 370, in _handle_exception
    self._adapt_connection._handle_exception(error)
  File "/home/user/example/local/lib/python3.10/site-packages/sqlalchemy/dialects/postgresql/asyncpg.py", line 663, in _handle_exception
    raise translated_error from error
sqlalchemy.dialects.postgresql.asyncpg.AsyncAdapt_asyncpg_dbapi.ProgrammingError: <class 'asyncpg.exceptions.PostgresSyntaxError'>: syntax error at or near "ANY"

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/user/example/repro.py", line 35, in <module>
    run(main())
  File "/usr/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/usr/lib/python3.10/asyncio/base_events.py", line 646, in run_until_complete
    return future.result()
  File "/home/user/example/repro.py", line 29, in main
    await conn.execute(stmt)
  File "/home/user/example/local/lib/python3.10/site-packages/sqlalchemy/ext/asyncio/engine.py", line 457, in execute
    result = await greenlet_spawn(
  File "/home/user/example/local/lib/python3.10/site-packages/sqlalchemy/util/_concurrency_py3k.py", line 134, in greenlet_spawn
    result = context.throw(*sys.exc_info())
  File "/home/user/example/local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1614, in _execute_20
    return meth(self, args_10style, kwargs_10style, execution_options)
  File "/home/user/example/local/lib/python3.10/site-packages/sqlalchemy/sql/elements.py", line 325, in _execute_on_connection
    return connection._execute_clauseelement(
  File "/home/user/example/local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1481, in _execute_clauseelement
    ret = self._execute_context(
  File "/home/user/example/local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1845, in _execute_context
    self._handle_dbapi_exception(
  File "/home/user/example/local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 2026, in _handle_dbapi_exception
    util.raise_(
  File "/home/user/example/local/lib/python3.10/site-packages/sqlalchemy/util/compat.py", line 207, in raise_
    raise exception
  File "/home/user/example/local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1802, in _execute_context
    self.dialect.do_execute(
  File "/home/user/example/local/lib/python3.10/site-packages/sqlalchemy/engine/default.py", line 732, in do_execute
    cursor.execute(statement, parameters)
  File "/home/user/example/local/lib/python3.10/site-packages/sqlalchemy/dialects/postgresql/asyncpg.py", line 460, in execute
    self._adapt_connection.await_(
  File "/home/user/example/local/lib/python3.10/site-packages/sqlalchemy/util/_concurrency_py3k.py", line 76, in await_only
    return current.driver.switch(awaitable)
  File "/home/user/example/local/lib/python3.10/site-packages/sqlalchemy/util/_concurrency_py3k.py", line 129, in greenlet_spawn
    value = await result
  File "/home/user/example/local/lib/python3.10/site-packages/sqlalchemy/dialects/postgresql/asyncpg.py", line 435, in _prepare_and_execute
    self._handle_exception(error)
  File "/home/user/example/local/lib/python3.10/site-packages/sqlalchemy/dialects/postgresql/asyncpg.py", line 370, in _handle_exception
    self._adapt_connection._handle_exception(error)
  File "/home/user/example/local/lib/python3.10/site-packages/sqlalchemy/dialects/postgresql/asyncpg.py", line 663, in _handle_exception
    raise translated_error from error
sqlalchemy.exc.ProgrammingError: (sqlalchemy.dialects.postgresql.asyncpg.ProgrammingError) <class 'asyncpg.exceptions.PostgresSyntaxError'>: syntax error at or near "ANY"
[SQL: SELECT t1.id, t1.jsonb_data 
FROM t1 
WHERE t1.jsonb_data @> (ANY (CAST(ARRAY[%s, %s] AS JSONB[])))]
[parameters: ("[{'foo': 'bar'}]", "[{'foo': 'baz'}]")]
(Background on this error at: https://sqlalche.me/e/14/f405)

Process finished with exit code 1

Additional context

No response

@jmriebold jmriebold added the requires triage New issue that requires categorization label May 25, 2023
@CaselIT CaselIT added postgresql sql and removed requires triage New issue that requires categorization labels May 25, 2023
@CaselIT
Copy link
Member

CaselIT commented May 25, 2023

Hi,

thanks for reporting.
@zzzeek Does this suggests that the precedence of contains is wrong? the any one seems fine since it works when used with =.

Example:

import sqlalchemy as sa
from sqlalchemy.dialects import postgresql as PG

t = sa.table("t", sa.column("a", sa.Integer), sa.column("b", PG.JSONB))

any_ = sa.any_(sa.values(t.c.a).data([(1,), (2,), (3,)]).scalar_values())
q = t.c.a == any_
print(q.compile(dialect=PG.dialect()))
q2 = t.c.b.contains(any_)
print(q2.compile(dialect=PG.dialect()))

this prints

t.a = ANY (VALUES (%(param_1)s), (%(param_2)s), (%(param_3)s))
t.b @> (ANY (VALUES (%(param_1)s), (%(param_2)s), (%(param_3)s)))

@CaselIT CaselIT added the bug Something isn't working label May 25, 2023
@CaselIT CaselIT added this to the 2.0.x milestone May 25, 2023
@zzzeek zzzeek added the great mcve An issue with a great mcve label May 26, 2023
@zzzeek
Copy link
Member

zzzeek commented May 26, 2023

I wonder why the parenthesis dont work? how is PG parsing this that the parens are confusing it? usually PG is the DB where we dont have to worry about silly things like this

@zzzeek
Copy link
Member

zzzeek commented May 26, 2023

So here is the patch to remove the parens:

diff --git a/lib/sqlalchemy/dialects/postgresql/json.py b/lib/sqlalchemy/dialects/postgresql/json.py
index 232f058042..41ad983a71 100644
--- a/lib/sqlalchemy/dialects/postgresql/json.py
+++ b/lib/sqlalchemy/dialects/postgresql/json.py
@@ -16,7 +16,7 @@ from ...sql import operators
 
 __all__ = ("JSON", "JSONB")
 
-idx_precedence = operators._PRECEDENCE[operators.json_getitem_op]
+idx_precedence = 5
 
 ASTEXT = operators.custom_op(
     "->>",

That's one way at least, this sets precedence at 5 which is what CONTAINS has for PG ARRAY, not sure why JSON is different.

but the test case still fails (desired SQL though) can we get something that works? or is this another issue?

sqlalchemy.exc.DBAPIError: (sqlalchemy.dialects.postgresql.asyncpg.Error) <class 'asyncpg.exceptions.InvalidTextRepresentationError'>: invalid input syntax for type json
DETAIL:  Token "'" is invalid.
[SQL: SELECT t1.id, t1.jsonb_data 
FROM t1 
WHERE t1.jsonb_data @> ANY (CAST(ARRAY[$1::VARCHAR, $2::VARCHAR] AS JSONB[]))]
[parameters: ("[{'foo': 'bar'}]", "[{'foo': 'baz'}]")]

@zzzeek
Copy link
Member

zzzeek commented May 26, 2023

don't need the cast. just use literal with a type:

async def main():
    engine_string = "postgresql+asyncpg://scott:tiger@localhost:5432/postgres"
    engine = create_async_engine(engine_string, echo=True)
    async with engine.begin() as conn:
        await conn.run_sync(meta.create_all)

    async with engine.connect() as conn:
        stmt = select(t1).filter(
            t1.c.jsonb_data.contains(
                any_(
                    array(
                        [
                            literal([{"foo": ii}], JSONB)
                            for ii in ["bar", "baz"]
                        ]
                    ),
                )
            )
        )
        await conn.execute(stmt)

    await engine.dispose()

@zzzeek zzzeek added the near-term release addition to the milestone which indicates this should be in a near-term release label May 26, 2023
@zzzeek zzzeek changed the title Using any_() with contains() on a Postgres JSONB column reduce JSON operator precedence to 5 and add tests May 26, 2023
@zzzeek
Copy link
Member

zzzeek commented May 26, 2023

output SQL is:

SELECT t1.id, t1.jsonb_data 
FROM t1 
WHERE t1.jsonb_data @> ANY (ARRAY[$1::JSONB, $2::JSONB])
2023-05-26 14:49:05,356 INFO sqlalchemy.engine.Engine [generated in 0.00021s] ('[{"foo": "bar"}]', '[{"foo": "baz"}]')

@CaselIT
Copy link
Member

CaselIT commented May 26, 2023

this is somewhat connected to #9041

@CaselIT CaselIT self-assigned this May 26, 2023
@sqla-tester
Copy link
Collaborator

Federico Caselli has proposed a fix for this issue in the main branch:

Improve PostgreSQL custom operators https://gerrit.sqlalchemy.org/c/sqlalchemy/sqlalchemy/+/4636

@jmriebold
Copy link
Author

Thanks so much for the help and quick turnaround, @CaselIT and @zzzeek!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working great mcve An issue with a great mcve near-term release addition to the milestone which indicates this should be in a near-term release postgresql sql
Projects
None yet
Development

No branches or pull requests

4 participants