Tuesday, April 6, 2010

java.net.URL and JARURLConnection may cause file handle leaks

A couple of days ago I noticed that there was often a problem during the automatic redeployment of our web application to the central Apache Tomcat server. After some investigation it turned out that there was a file handle leak preventing Tomcat from redeploying the application:

INFO: Undeploying context [/myapp]
Apr 6, 2010 10:08:39 AM org.apache.catalina.startup.ExpandWar deleteDir
SEVERE: [C:\apache-tomcat\webapps\myapp\WEB-INF\lib] could not be completely deleted. The presence of the remaining files may cause problems

The result is that after redeployment the directory with the exploded app contains only WEB-INF\lib with one jar file inside it (the leaking one). In order to find out the cause of the leak I analyzed the application with JPicus. The result was the following























It showed that the root cause of the problem was the call to java.netURL.openStream which in turn delegated the call to a JarURLConnection.

URLConnection and its subclasses may open resources like files and sockets and keep them open "for a while" to improve efficiency. Some attempt to address this issue has been expressed in the response to this bug report. It suggests that getting a stream from the connection and closing it will close all underlying resources. This behavior is not explicitly stated in the javadoc and is hence JVM implementation specific. Even with OpenJDK, it is very likely to have resource leaks if the URL is referencing an entry in the jar file. For example, this code snippet, when executed on Java HotSpot(TM) Client VM 1.6. will keep the file open even though the stream is closed :

URL url = new URL("jar:file:///C:/Users/user/my.jar!/log4j.properties");
InputStream in = url.openStream();
in.close();


According to the javadoc, this behavior can be controlled by the property "useCaches" which probably defaults to "true". Unfortunately, in my case, this was not under my control, because the call to java.net.URL.openStream had been made indirectly by log4j Logger.getLogger.

In case the URL is not pointing to an entry in a jar file, closing the stream might actually help. For example, if Log4j is reading its properties file from the WEB-INF/classes directory, and closes the stream after that, there shall be no leaks. This, however, takes us to my next discovery. It appears that log4j has been handling the connection streams incorrectly until version 1.2.15, where this has been fixed. If your log4j.properties is not in a jar file, you may benefit from upgrading to version 1.2.15 or above.

Summary
It might be dangerous to use URL.openStream or URLConnection classes in an environment, where resource leaks are not tolerable. If you do so, you must take care to close all streams as soon as possible (applies to every other stream in general). In case you use URLs that refer to entries in a jar or a zip file, it is very likely that the underlying files will not be closed even though you have closed all streams properly. Then you are left at the mercy of the garbage collector to close the file upon finalization.

Note: When trying to run JPicus agent, the -javaagent option did not work when running Tomcat as a service. The workaround was to start Tomcat from the command line to perform the analysis.

2 comments:

  1. Hi, Great.. Tutorial is just awesome..It is really helpful for a newbie like me.. I am a regular follower of your blog. Really very informative post you shared here. Kindly keep blogging. If anyone wants to become a Java developer learn from Java Training in Chennai. or learn thru Java Online Training from India . Nowadays Java has tons of job opportunities on various vertical industry.

    ReplyDelete