Deadlock in ColdFusion 8.0.1
By Aaron Foote at 4:04 in: Adobe | ColdFusion
ColdFusion has a deadlock that will hang the server. Request a WSDL and a regular CFM page at the same time, in the right circumstances, and bingo. The circumstance is the ColdFusion TemplateClassLoader loading classes into memory. Classes are loaded or reloaded by the TemplateClassLoader on the first request of a file after the server starts, when the template cache is cleared (even if you’re not using the “Trusted Cache”), or as the result of a Garbage Collection of objects in the TemplateClassLoader’s cache.
This issue was discovered while analysing a hung ColdFusion server for a client. The thread dump showed many threads stuck in the TemplateClassLoader which were waiting on a lock.
The following shows three of the threads: the two that caused the deadlock and a third that is essentially a victim. Any further CFM requests to the server will fall into this victim category.
WSDL Request: note that it holds the lock on 0x10d69910 and that it is waiting on 0x0f970e48.
CFM Request: note that it is waiting to lock 0x10d69910, which is locked by the WSDL thread.
Finally, our victim: this thread isn’t blocked in the traditional sense. It’s waiting for another thread, one that holds the lock on 0x0f970e48, to “notify” it.
It is this victim thread that is interesting. It’s waiting for another thread to acquire the lock on 0x0f970e48 and call .notify() on the lock object. Unusually, no thread holds the lock on 0x0f970e48. Under normal circumstances the CFM thread (above) would obtain the lock on 0x0f970e48 and release all the “victim” threads, but it’s deadlocked so it can’t.
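To make the shape of this hang concrete, here is a minimal sketch in plain Java. It is not ColdFusion’s code: the class name, thread names and lock objects are all made up for illustration, and the lambda syntax needs a newer JVM than the one ColdFusion 8 ships with. One thread holds an outer lock while it sits in wait() on a second object; the only thread that would ever call notifyAll() has to take the outer lock first, so it never gets there.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical demo class; none of these names come from ColdFusion.
class WaitNotifyDeadlockDemo {

    // Starts the two threads and returns {holder state, notifier state}
    // once both are stuck (polls for up to about five seconds).
    static Thread.State[] provoke() {
        final Object outer = new Object();      // plays the role of 0x10d69910
        final Object condition = new Object();  // plays the role of 0x0f970e48
        final CountDownLatch holderHasOuter = new CountDownLatch(1);

        Thread holder = new Thread(() -> {
            synchronized (outer) {
                holderHasOuter.countDown();
                synchronized (condition) {
                    try {
                        // wait() releases `condition` but keeps `outer`.
                        while (true) {
                            condition.wait();
                        }
                    } catch (InterruptedException e) {
                        // interrupted: fall through and exit
                    }
                }
            }
        }, "wsdl-like");

        Thread notifier = new Thread(() -> {
            try {
                holderHasOuter.await();
            } catch (InterruptedException e) {
                return;
            }
            synchronized (outer) { // blocks forever: holder never lets go
                synchronized (condition) {
                    condition.notifyAll(); // never reached
                }
            }
        }, "cfm-like");

        // Daemon threads so the JVM can still exit despite the hang.
        holder.setDaemon(true);
        notifier.setDaemon(true);
        holder.start();
        notifier.start();

        for (int i = 0; i < 50; i++) {
            if (holder.getState() == Thread.State.WAITING
                    && notifier.getState() == Thread.State.BLOCKED) {
                break;
            }
            try {
                Thread.sleep(100);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return new Thread.State[] { holder.getState(), notifier.getState() };
    }

    public static void main(String[] args) {
        Thread.State[] states = provoke();
        System.out.println("wsdl-like: " + states[0]);
        System.out.println("cfm-like:  " + states[1]);
    }
}
```

Any “victim” would simply be one more thread parked in condition.wait(). Because nobody owns the condition object while the threads wait on it, a thread dump of this sketch shows the same oddity as the real one: threads waiting on a monitor that no thread has locked.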
Not shown is the thread dump of Web Service requests that were also present. After investigation these proved to be victims of the deadlock, not contributors.
The key insight was the realisation that a garbage collection of the permanent generation had occurred shortly before the crash. To have garbage collection details logged, add the following flags to your jvm.config file: -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintTenuringDistribution
The first step in troubleshooting this situation was to replicate the problem locally. The first attempt was to simply use JMeter to place some load on the application, requesting CFM pages, WSDL files and Web Service calls. This proved to be fruitless. In an attempt to cause a permanent generation collection, as was implicated in the server logs, I manually called the Garbage Collector while the JMeter load was running. This too was futile; I could not force a collection of the permanent generation. To monitor the various JVM memory spaces I had JProfiler running.
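If you don’t have a profiler handy, the JVM’s own management beans give a rough view of the same memory spaces. The following is only a sketch of that idea, not what I ran: the class name is made up, it uses newer Java syntax than ColdFusion 8’s JVM supports, and pool names vary by JVM and collector (older HotSpot JVMs report a pool with “Perm Gen” in its name, while Java 8 and later report “Metaspace” instead).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class for illustration only.
class MemoryPoolSnapshot {

    // Returns the used bytes of every JVM memory pool, keyed by pool name.
    static Map<String, Long> usedBytesByPool() {
        Map<String, Long> used = new LinkedHashMap<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage != null) { // getUsage() returns null for an invalid pool
                used.put(pool.getName(), usage.getUsed());
            }
        }
        return used;
    }

    public static void main(String[] args) {
        // System.gc() is only a request; the JVM decides what it collects.
        System.gc();
        usedBytesByPool().forEach(
                (name, bytes) -> System.out.println(name + ": " + bytes + " bytes"));
    }
}
```

Printing a snapshot before and after an action (a manual GC, a cache clear) is a crude substitute for the graph a profiler draws, but it is enough to see whether the class storage pool grows or shrinks.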
I remembered that clearing the Template Cache caused ColdFusion to reload class files, even if you are not using the Trusted Cache feature. We can see this occurring in JProfiler – the following is a graph of the size of the permanent generation, while requesting the same CFM file and clearing the template cache. You can see that each time the template cache is cleared, the used space in the permanent generation increases. The drop at the end is a Garbage Collection – this shows that the TemplateClassLoader is releasing classes from the permanent generation allowing them to be collected.

To test this, I again ran the same JMeter load as before but this time manually cleared the Template Cache in the ColdFusion Administrator. Bingo! Each and every time the server would hang.
The next step was to make a synthetic example – one that did not rely on confidential client code. The primary reason was to have code I can publish so others in the community could reproduce this bug. It has also allowed me to further investigate the causes of this issue.
I started by making a simplistic replica of the client’s code. The main points were that the Web Service and CFM files share common code. The Web Service has an Application.cfc file with the common code loaded in the OnRequestStart method. The CFM files have an Application.cfc file with an OnRequest method.
Did you know that OnRequestStart gets called when you request a WSDL?
After a reasonable amount of investigation, it turned out that the common code between the Web Service and CFM had NOTHING to do with the issue. Nor did OnRequestStart or anything to do with Application.cfc. It simply comes down to a WSDL request and a CFM request, even if both are completely unrelated, occurring at the same time.
So we know this occurs when the Template Cache is cleared. We can assume that it can also occur at server start-up, although this would be much harder to reproduce. It is likely to occur in common usage too, such as when a WSDL and a new CFM page are each requested for the first time at the same moment.
But what about our client and the link to the permanent generation collection? The permanent generation collection itself is not a cause, but a symptom. The TemplateClassLoader uses Soft References in its data cache. This lets the TemplateClassLoader keep data in memory while still allowing that data to be Garbage Collected. That is exactly what occurred: the TemplateClassLoader’s cache was garbage collected, causing it to reload classes. This freed classes in the permanent generation, which were then garbage collected, and that was the event I saw initially.
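For anyone unfamiliar with Soft References, here is a minimal sketch of a cache built on them. This is my own illustration of the general technique, not the TemplateClassLoader’s implementation: the class, the loader function and the simulateCollection() method are all hypothetical, and the last one exists only so the “collector cleared my entry” case can be shown without actually running the JVM out of memory.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical soft-reference cache; not ColdFusion's TemplateClassLoader.
class SoftCache<K, V> {
    private final Map<K, SoftReference<V>> entries = new HashMap<>();
    private final Function<K, V> loader;
    private int loads = 0;

    SoftCache(Function<K, V> loader) {
        this.loader = loader;
    }

    // Returns the cached value, reloading it if it was never loaded
    // or if the collector has cleared the soft reference.
    synchronized V get(K key) {
        SoftReference<V> ref = entries.get(key);
        V value = (ref == null) ? null : ref.get();
        if (value == null) {
            value = loader.apply(key);
            loads++;
            entries.put(key, new SoftReference<>(value));
        }
        return value;
    }

    // Stand-in for memory pressure: clears every reference the way the
    // garbage collector is allowed to at any time.
    synchronized void simulateCollection() {
        for (SoftReference<V> ref : entries.values()) {
            ref.clear();
        }
    }

    synchronized int loadCount() {
        return loads;
    }

    public static void main(String[] args) {
        SoftCache<String, String> cache = new SoftCache<>(k -> "compiled:" + k);
        String first = cache.get("index.cfm");  // loads
        String again = cache.get("index.cfm");  // served from the cache
        cache.simulateCollection();
        String reloaded = cache.get("index.cfm"); // loads again
        System.out.println(first + " / " + again + " / " + reloaded);
        System.out.println("loads: " + cache.loadCount());
    }
}
```

The point to take away is that a reload can be triggered by the collector at a moment the application did not choose, which is how a “random” hang in production can trace back to the same class loading path as a manual template cache clear.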
Attached to this post is a zip file containing code that will reproduce the problem. clearTemplateCache.cfm clears the template cache using the Admin API; you will need to edit this file and put in your Administrator password. Having this as a file makes it easy to include in a JMeter test. I’ve included the JMeter test I used. The hostname I used is cfdeadlock; if you use something else, change the host name in the “HTTP Request Defaults” element. Clearing the template cache is set to start 5 seconds after the test starts.
The tests I ran had only 3 threads (concurrent requests). One CFM, one WSDL and one to clear the Template Cache. This issue is not related to load. I have no idea if this issue is present in any other version of ColdFusion – I’ll leave that as an exercise for the community.
Enjoy.














We have logged bug 83343 for this issue.
Note that this issue is not reproducible on latest builds. So it should work in the next release.
very very nice, thanks
This issue is already fixed in CF9.
Great article Aaron, impressive!
Impressive Article, Aaron!
Hrmm that was weird, my comment got eaten. Anyway I wanted to say that it’s nice to know that someone else also mentioned this as I had trouble finding the same info elsewhere. This was the first place that told me the answer. Thanks.