Skip to content

Event after_commit doesn't execute hooks after each commit() #4794

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
dmitry-moroz opened this issue Aug 2, 2019 · 10 comments
Closed

Event after_commit doesn't execute hooks after each commit() #4794

dmitry-moroz opened this issue Aug 2, 2019 · 10 comments
Labels
bug Something isn't working events SQLAlchemy event hooks, the event system
Milestone

Comments

@dmitry-moroz
Copy link

dmitry-moroz commented Aug 2, 2019

Hi. I have found strange behavior of events in sqlalchemy. If I use after_commit event then do commit and continue set new hooks and make new commits - some hooks are not executed.

Code:

from sqlalchemy import create_engine, event
from sqlalchemy.orm import sessionmaker, Session


class MySession(Session):

    def on_commit(self, hook, *args, **kwargs):
        def real_hook(session):
            return hook(*args, **kwargs)

        event.listen(self, 'after_commit', real_hook, once=True)


db_factory = sessionmaker(
    class_=MySession,
    bind=create_engine('postgresql+psycopg2://test:***@localhost/test'),
    autocommit=False,
    autoflush=False,
    expire_on_commit=False,
)


def test_func(x):
    print(x)


db = db_factory()


for i in range(100):
    db.on_commit(test_func, i)
    db.commit()

Full output:

0
1
2
4
6
8
10
12
14
16
18
20
22
24
26
28
30
32
34
36
38
40
42
43
44

Env:

psycopg2==2.8.3
SQLAlchemy==1.3.6

Python 3.6.9

Ubuntu 16.04.6 LTS

Kernel: 4.4.0-157-generic

This bug can be reproduced with old version, e.g. SQLAlchemy==1.1.14 and Python 2.7.9.

Looks like usage of weakref in event/registry.py is a cause of the problem. Hook is not executed when his id() was used before. So, when Python reuse the same id() (saves new hook in the same part of memory where previous hook was saved) this new hook will not be executed. Because this hook is seen as "already in listeners" for sqlalchemy, and that is why new hook will not be added to listeners and will not be executed because there is option once=True :)

So, as a result you can use following workarounds:

  1. This code forces Python to not remove old hooks from the memory and allocate for each new hook new unique part of memory.
class MySession(Session):

    hooks = []

    def on_commit(self, hook, *args, **kwargs):
        def real_hook(session):
            return hook(*args, **kwargs)

        self.hooks.append(real_hook)
        event.listen(self, 'after_commit', real_hook, once=True)
  1. This code checks new hook for existence and removes them if it exist in registry.
class MySession(Session):

    def on_commit(self, hook, *args, **kwargs):
        def real_hook(session):
            return hook(*args, **kwargs)

        hook_args = (self, 'after_commit', real_hook)

        if event.contains(*hook_args):
            event.remove(*hook_args)

        event.listen(*hook_args, once=True)
@dmitry-moroz dmitry-moroz changed the title Event after_commit doesn't execute hooks after each commit Event after_commit doesn't execute hooks after each commit() Aug 2, 2019
@zzzeek
Copy link
Member

zzzeek commented Aug 3, 2019

At the moment, I don't think any of the weakref usage in registry.py or elsewhere is actually involved with the issue. The function is being GC'ed because the once wrapper is losing references to it, it looks like the fix, at least the most expedient one, would be this:


diff --git a/lib/sqlalchemy/util/langhelpers.py b/lib/sqlalchemy/util/langhelpers.py
index d923a6aa15..e3fe81009c 100644
--- a/lib/sqlalchemy/util/langhelpers.py
+++ b/lib/sqlalchemy/util/langhelpers.py
@@ -1459,6 +1459,7 @@ def only_once(fn):
     once = [fn]
 
     def go(*arg, **kw):
+        fn_please_dont_gc = fn
         if once:
             once_fn = once.pop()
             return once_fn(*arg, **kw)

this is because event.listen(fn) was meant to imply that "fn" is now strongly referenced. this is because it needs to fire off even if it isn't referenced anywhere else. which you might note indicates the collection of listeners will keep growing in your example, even though only one is active. The "once" flag was never intended to manage memory and it was assumed that hooks added with "once", even though they are only invoked once, would last in the collection forever. "once" wasn't intended as a means of adding event hooks per-call that then get automatically deregistered afterwards. But since you are using it that way, maybe it should work anyway. but that's not likely something I can commit to 1.3 if it's even doable since this implies event de-registration within the scope of the event being fired which I'm not sure is possible (haven't tried though).

the documentation doesn't state much about "once" too explicitly, that it wasn't intended for this use is sort of implied in "Event registration and removal is not intended to be a "high velocity" operation; it is a configurational operation. For systems that need to quickly associate and deassociate with events at high scale, use a mutable structure that is handled from inside of a single listener.".

So basically if I fix the bug using the above strong reference, your application will leak memory. If I figure out automatic de-registration, that would likely be based on the go() wrapper sending out some kind of signal.

overall your immediate issue is better served by not using "once" to imply automatic de-registration, and to just register a hook once on your Session and handle swapping out internal state separately.

@zzzeek zzzeek added bug Something isn't working orm labels Aug 3, 2019
@zzzeek zzzeek added this to the 1.3.xx milestone Aug 3, 2019
@zzzeek zzzeek added events SQLAlchemy event hooks, the event system and removed orm labels Aug 3, 2019
@zzzeek
Copy link
Member

zzzeek commented Aug 3, 2019

eeyeup, this might be not possible-ish, well, not easy-ish:

diff --git a/lib/sqlalchemy/event/registry.py b/lib/sqlalchemy/event/registry.py
index 07b961c012..ae9ffdc035 100644
--- a/lib/sqlalchemy/event/registry.py
+++ b/lib/sqlalchemy/event/registry.py
@@ -213,7 +213,7 @@ class _EventKey(object):
             stub_function._sa_warn()
 
         if once:
-            self.with_wrapper(util.only_once(self._listen_fn)).listen(
+            self.with_wrapper(util.only_once(self._listen_fn, self.remove)).listen(
                 *args, **kw
             )
         else:
diff --git a/lib/sqlalchemy/util/langhelpers.py b/lib/sqlalchemy/util/langhelpers.py
index d923a6aa15..9038bd50e7 100644
--- a/lib/sqlalchemy/util/langhelpers.py
+++ b/lib/sqlalchemy/util/langhelpers.py
@@ -1452,7 +1452,7 @@ def warn_limited(msg, args):
     warnings.warn(msg, exc.SAWarning, stacklevel=2)
 
 
-def only_once(fn):
+def only_once(fn, cleanup=None):
     """Decorate the given function to be a no-op after it is called exactly
     once."""
 
@@ -1461,7 +1461,9 @@ def only_once(fn):
     def go(*arg, **kw):
         if once:
             once_fn = once.pop()
-            return once_fn(*arg, **kw)
+            result = once_fn(*arg, **kw)
+            cleanup()
+            return result
 
     return go
 

and we get:

Traceback (most recent call last):
  File "test3.py", line 33, in <module>
    db.commit()
  File "/home/classic/dev/sqlalchemy/lib/sqlalchemy/orm/session.py", line 1028, in commit
    self.transaction.commit()
  File "/home/classic/dev/sqlalchemy/lib/sqlalchemy/orm/session.py", line 502, in commit
    self.session.dispatch.after_commit(self.session)
  File "/home/classic/dev/sqlalchemy/lib/sqlalchemy/event/attr.py", line 296, in __call__
    for fn in self.listeners:
RuntimeError: deque mutated during iteration

@zzzeek
Copy link
Member

zzzeek commented Aug 3, 2019

OK here is the third and most promising approach, however there is a test failing on this, this registration stuff is EXTREMELY complicated:


diff --git a/lib/sqlalchemy/event/registry.py b/lib/sqlalchemy/event/registry.py
index 07b961c012..eccab11c74 100644
--- a/lib/sqlalchemy/event/registry.py
+++ b/lib/sqlalchemy/event/registry.py
@@ -72,7 +72,10 @@ def _stored_in_collection(event_key, owner):
     owner_ref = owner.ref
     listen_ref = weakref.ref(event_key._listen_fn)
 
-    if owner_ref in dispatch_reg:
+    if (
+        owner_ref in dispatch_reg
+        and dispatch_reg[owner_ref]() is event_key._listen_fn
+    ):
         return False
 
     dispatch_reg[owner_ref] = listen_ref

@zzzeek
Copy link
Member

zzzeek commented Aug 3, 2019

the above patch means all the go() functions stay in the collection, but if a new one comes in with the same key, meaning the inner function was GC'ed, it gets replaced here.

@zzzeek
Copy link
Member

zzzeek commented Aug 3, 2019

the previous approach makes it impossible to detect the same listener function being registered twice in a wrapped scenario (at least I can't get it to work yet, test/base/test_events.py -> test_double_event_wrapped, for #3199).

Another approach. Store weakrefs to the functions instead of numerical ids, but then we aren't managing memory anymore, the collection grows unbounded just as if we had the only_once() wrapper strong reference it, so this seems more complicated than it needs to be:

diff --git a/lib/sqlalchemy/event/registry.py b/lib/sqlalchemy/event/registry.py
index 07b961c012..128ecc6022 100644
--- a/lib/sqlalchemy/event/registry.py
+++ b/lib/sqlalchemy/event/registry.py
@@ -156,9 +156,9 @@ class _EventKey(object):
         self.identifier = identifier
         self.fn = fn
         if isinstance(fn, types.MethodType):
-            self.fn_key = id(fn.__func__), id(fn.__self__)
+            self.fn_key = weakref.ref(fn.__func__), id(fn.__self__)
         else:
-            self.fn_key = id(fn)
+            self.fn_key = weakref.ref(fn)
         self.fn_wrap = _fn_wrap
         self.dispatch_target = dispatch_target

all along, the assumption is, if we are applying a wrapper to the incoming function, the wrapper is still strong referencing the function. so I can "fix" the issue with the first approach but that means once=True is not automatically de-registering or managing memory.

@sqla-tester
Copy link
Collaborator

Mike Bayer has proposed a fix for this issue in the master branch:

Strong reference listen function wrapped by "once" https://gerrit.sqlalchemy.org/1393

@sqla-tester
Copy link
Collaborator

Mike Bayer has proposed a fix for this issue in the rel_1_3 branch:

Strong reference listen function wrapped by "once" https://gerrit.sqlalchemy.org/1394

sqlalchemy-bot pushed a commit that referenced this issue Aug 5, 2019
Fixed issue in event system where using the ``once=True`` flag with
dynamically generated listener functions would cause event registration of
future events to fail if those listener functions were garbage collected
after they were used, due to an assumption that a listened function is
strongly referenced.  The "once" wrapped is now modified to strongly
reference the inner function persistently, and documentation is updated
that using "once" does not imply automatic de-registration of listener
functions.

Fixes: #4794
Change-Id: I06388083d1633dcc89e8919eb1e51270f966df38
(cherry picked from commit e22d2b0)
@zzzeek zzzeek modified the milestones: 1.3.xx, 1.3.x Dec 18, 2019
@reritom
Copy link

reritom commented Aug 16, 2021

Has there been any consideration to add a session.on_commit(..) method? I find that using the ORM events statically isn't very flexible.

If I have a user and want to update theirs permissions (assuming relationships like User 1-n UserPermission n-1 Permission), the signal that I want to send after_commit is "user_permissions_updated". At the dao/dal layer, I know exactly what signal I want to send once committed, so it makes sense to be able to register the callback at that time.

@staticmethod
def update_user_permissions(user_id, permission_ids):
    # Create some UserPermissions
    # Delete some UserPermissions
    session.on_commit(lambda: user_permissions_updated_signal.send(user_id))

If I want to use the after_commit event, based on what I have seen and used before I would have do:

@event.listens_for(session, "after_commit"):
def create_signals(mapper, connection, target):
    # Somehow determine the changes to determine the correct signal to emit
   ...

Which is very indirect.

A simpler case is something like the flow for verifying a user. In the dao I can do:

@staticmethod
def verify_user(user_id: int) -> None:
    user = session.query(User).get(user_id)
    user.is_verified = True
    session.add(user)
    session.on_commit(lambda: user_verified.send(user.id))

But if I don't have the on_commit, I would need to listen to the session commit, then find the models that have been changed, then determine what the change was, then determine which signal to send.

I have tried implementing something similar to the solution of @dmitry-moroz , but I am not sure how robust it is, and am having problems handling rollbacks, etc.

It would be very useful to have something similar to this https://docs.djangoproject.com/en/3.2/topics/db/transactions/#performing-actions-after-commit - where you can register callbacks, which are discarded on rollbacks.

@zzzeek
Copy link
Member

zzzeek commented Aug 16, 2021

hi @reritom -

While I appreciate that you are looking for an API that resembles what the user posting this bug report was originally trying to do, this current document regards a small bug in the event registration system that is unrelated to your topic. Can I ask that going forward you create a new discussion to introduce new topics going forward? you can of course refer to issues that mention similar issues by indicating their number. I've added your question and an appropriate recipe for this use case at #6888.

@reritom
Copy link

reritom commented Aug 16, 2021

@zzzeek Apologies for bringing it up in the wrong place. This was the only thread anywhere that I could find that mentioned the desired API.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working events SQLAlchemy event hooks, the event system
Projects
None yet
Development

No branches or pull requests

4 participants