-
-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
test and identify reference cycles in high volume objects #5056
Comments
see #5050 |
Mike Bayer has proposed a fix for this issue in the master branch: Test for short term reference cycles and resolve as many as possible https://gerrit.sqlalchemy.org/1630 |
Here's a really significant source of cycles that probably can be backported to 1.3:
some other cycles I'm removing from mapper options are more intricate and risky. |
Mike Bayer has proposed a fix for this issue in the rel_1_3 branch: Test for short term reference cycles and resolve as many as possible https://gerrit.sqlalchemy.org/1631 |
Added test support and repaired a wide variety of unnecessary reference cycles created for short-lived objects, mostly in the area of ORM queries. For the 1.3 backport, includes the prefix_anon_map() optimization from 1.4 / master which inlines the anonymous symbol generation into a single object. This removes a cycle from the compiler that otherwise results in a signficantly higher number of unreachable cycles. Fixes: sqlalchemy#5056 Change-Id: Ifd93856eba550483f95f9ae63d49f36ab068b85a (cherry picked from commit 492930ed572de5f5550d514bc2ca52a57f108350)
@zzzeek Are we going to eliminate some other recursive function calls that cause ref cycles? clone:
visit:
|
I generally want to spend time only with those reference cycles that are subject to frequent collecion. the bits of code you refer towards above seem like they are more relevant to fixed mapper configurations. if you can provide more short tests to add to the test_memusage->CycleTest suite, that would be best. |
I am interested in all these changes because they will reduce the unreachable objects in my application to a minimum. I'll try to figure out some simple tests that will show the effect of the changes. Also for the record, I found an added benefit of reducing ref cycles which is that my profiling tool is showing more accurate results now instead of showing random weakref remove calls at the top of the call stack due to object allocation and deallocation triggered gc. The reason is gc detects cycle, removes objects then triggers some weakkeydict / weakvaluedict's remove method. |
when your application runs, Python's GC runs periodically. Within a few seconds of your program starting up and the mappers have been configured, GC will kick in and these particular cycles will be gone. Based on my research, cycles in Python are not documented as causing explosive growth in memory, only a little bit of latency added due to the GC background thread.
I've found gc.DEBUG_COLLECTABLE to be the best flag to use to see what actual cycles the garbage collector is collecting, it shows exactly what it found and nothing else. |
Yes, at this point I am convinced that the memory usage is not solely caused by ref cycles, but a mix of high concurrency, some ref cycles and memory fragmentation. If the object cannot be collected in gen 0, it goes to gen 1 and stay for a longer time, and so on. My hypothesis is when the application is busy enough, object may stay for a longer time than we like.
In my application, even in staging env, it will take 100% CPU because it prints too much. :) A simple PK query creates 10k garbage objects before the patch and it now creates <500. A complex query with eager load options and others creates a few thousand now, which is much better than before. |
please provide the MCVE for the few thousand cycles with eager load options version. Query-time cycles I'm looking to bring to near zero. |
you can erase all cycles created by mapper -level configuration like this:
the configure step occurs before your ORM classes are able to do anything at all, so if your ORM classes are in play, that gc.collect() will have run ahead of time. |
where did I miss that while we can't "del clone" in Python 2 we can just set it to None and that works ? im trying to understand how that works at all right now, in fact, there's still a function that points to itself. |
that makes this easier. of course we can add "visit = None" where needed. |
Added test support and repaired a wide variety of unnecessary reference cycles created for short-lived objects, mostly in the area of ORM queries. For the 1.3 backport, includes the prefix_anon_map() optimization from 1.4 / master which inlines the anonymous symbol generation into a single object. This removes a cycle from the compiler that otherwise results in a signficantly higher number of unreachable cycles. Fixes: #5056 Change-Id: Ifd93856eba550483f95f9ae63d49f36ab068b85a (cherry picked from commit 492930ed572de5f5550d514bc2ca52a57f108350)
I've committed what we have and I am glad that we have a test harness for this issue and a huge improvement in reference cycles for high volume areas. For the 1.3.x branch, this is a little bit more risk than I'm normally comfortable with as the changes to the mapper options and AliasedClass objects are significant, but I do want to get these out into the wild and I'm pretty confident that the huge number of tests we have for these areas should keep us in good shape, with my concern being an odd weakref related issue for the AliasedClass part and some elaborate pathing issue for the loader options part, however in both cases I'd want to know about these new cases and add them to testing. for ongoing, as you are locating relatively simple cycles that just require a simple break at the end, feel free to submit these as PRs with tests added to the new Cycle suite. if you can produce MCVEs for many hundreds of cycles still created for eager loading scenarios, file new issues for those so I can break it down into the compoinents that produce cycles as we now have a clear path to fixing them. and of course thanks for all the help and happy new year! 🎉 🎉 🎉 |
Happy new year! Thanks for the swift response and prompt action. I will keep working on the ref cycle issues and let you know. |
test for Core and ORM operations with quickly disposed objects that nonetheless create reference cycles. these should at least be tested and for very common objects like Query, loader option objects, etc. there should preferably few to zero cycles generated.
The text was updated successfully, but these errors were encountered: