Thursday, June 14, 2007

Getting through that damn proxy! (using Java)

If you are on a corporate network, you know how annoying it can be to try to get through the proxy. And if you are a developer, it can be even more frustrating, because you can't even test your networked application until you get through the proxy.

Sun has a great article on what is built into Java to handle network proxies. It covers the features available since Java 1.5 that let you control the use of a proxy per socket and per connection. Check it out first:

Java Networking and Proxies
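To give a quick taste of what "per-socket and per-connection" means, here is a minimal sketch using java.net.Proxy. The class name, the helper method names, and the proxy hosts and ports (proxy.example.com:8080, socks.example.com:1080) are placeholders I made up for illustration, not anything from the Sun article. Note that the plain Socket here is given a SOCKS proxy rather than an HTTP one.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.Socket;
import java.net.URI;
import java.net.URLConnection;

class PerConnectionProxySketch {

    // Describes an HTTP proxy. createUnresolved defers the DNS lookup
    // until a connection is actually made.
    static Proxy httpProxy(String host, int port) {
        return new Proxy(Proxy.Type.HTTP, InetSocketAddress.createUnresolved(host, port));
    }

    // Describes a SOCKS proxy in the same way.
    static Proxy socksProxy(String host, int port) {
        return new Proxy(Proxy.Type.SOCKS, InetSocketAddress.createUnresolved(host, port));
    }

    // Per-connection: only this URLConnection is routed through the given proxy.
    // Nothing is sent until the caller connects to it or reads from it.
    static URLConnection openVia(Proxy proxy, String url) throws IOException {
        return URI.create(url).toURL().openConnection(proxy);
    }

    // Per-socket: an unconnected Socket that will go through the SOCKS proxy
    // once connect() is called on it.
    static Socket socketVia(Proxy socksProxy) {
        return new Socket(socksProxy);
    }

    public static void main(String[] args) throws IOException {
        Proxy http = httpProxy("proxy.example.com", 8080);    // placeholder address
        Proxy socks = socksProxy("socks.example.com", 1080);  // placeholder address

        URLConnection viaProxy = openVia(http, "http://www.example.com/");
        Socket socket = socketVia(socks);

        System.out.println("Prepared " + viaProxy.getURL() + " via " + http.type() + " proxy");
        System.out.println("Socket connected yet? " + socket.isConnected());
        socket.close();
    }
}
```

Passing Proxy.NO_PROXY to openConnection instead forces a direct connection for that one request, whatever the system-wide proxy settings say.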

Now, that article is pretty good, but it doesn't cover how to do proxy authentication. Thankfully, there is another article from Sun which covers java.net.Authenticator. This class, together with its related classes, is what you will use to authenticate. The details are covered here:

Http Authentication
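As a rough idea of what that looks like, here is a minimal sketch of an Authenticator that only answers challenges coming from a proxy. The class names and the "alice"/"secret" credentials are hypothetical; in real code you would pull the credentials from configuration or prompt the user for them.

```java
import java.net.Authenticator;
import java.net.PasswordAuthentication;

class ProxyAuthSketch {

    // Supplies credentials only when the challenge comes from a proxy,
    // so the proxy password is never offered to an origin server.
    static class ProxyAuthenticator extends Authenticator {
        private final String user;
        private final char[] password;

        ProxyAuthenticator(String user, char[] password) {
            this.user = user;
            this.password = password.clone();
        }

        @Override
        protected PasswordAuthentication getPasswordAuthentication() {
            if (getRequestorType() == RequestorType.PROXY) {
                return new PasswordAuthentication(user, password);
            }
            return null; // no credentials for anything that is not a proxy
        }
    }

    // Registers the authenticator JVM-wide; the built-in HTTP support
    // consults it whenever a server or proxy asks for credentials.
    static void install(String user, String password) {
        Authenticator.setDefault(new ProxyAuthenticator(user, password.toCharArray()));
    }

    public static void main(String[] args) {
        install("alice", "secret"); // hypothetical credentials
        System.out.println("Proxy authenticator installed");
    }
}
```

Checking getRequestorType() is a design choice on my part: without it, the same user name and password would also be handed to any web server that asks for authentication.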

But wait! There is one prominent piece missing from what's built into Java for dealing with proxies: it cannot make regular TCP tunnels through an HTTP proxy (in the Java versions covered here, using a Proxy.Type.HTTP Proxy with a Socket won't work). HTTP proxies support this kind of tunnel through the HTTP CONNECT method. Before getting into how to handle this in Java, let me just give a quick overview of how the process works:


  • First, the client opens a socket to the HTTP proxy and sends it an HTTP CONNECT request. This request indicates which host and port we actually want to connect to through the proxy.

  • The proxy establishes a connection to the host and port we requested.

  • The proxy then sets up a tunnel which passes data we send to the proxy through to the actual host, and vice versa.

  • Now, the proxy sends back an HTTP OK (200) response on the socket we used to send it the request (if everything goes right - if not, you'll get a response with an HTTP error code).

  • Right after the HTTP OK response, that same socket is what we use for all communication with the actual host we wanted to connect to. The proxy will tunnel data between us and that remote host.
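
To make those steps concrete, here is a bare-bones sketch of the handshake done by hand over a plain Socket. It is only an illustration under some assumptions: the class and method names are mine, it does no proxy authentication, it treats any 2xx status as success, and it reads the proxy's response one byte at a time so that it never swallows data that already belongs to the tunnel.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

class ConnectTunnelSketch {

    // Step 1: the request names the host and port we really want to reach.
    static String buildConnectRequest(String host, int port) {
        String target = host + ":" + port;
        return "CONNECT " + target + " HTTP/1.1\r\n"
             + "Host: " + target + "\r\n"
             + "\r\n";
    }

    // Reads the proxy's status line and headers, stopping right after the
    // blank line (CRLF CRLF) so the next byte read belongs to the tunnel.
    static String readResponseHead(InputStream in) throws IOException {
        byte[] end = {'\r', '\n', '\r', '\n'};
        ByteArrayOutputStream head = new ByteArrayOutputStream();
        int matched = 0; // how many bytes of the terminator we have seen so far
        int b;
        while (matched < 4 && (b = in.read()) != -1) {
            head.write(b);
            matched = (b == end[matched]) ? matched + 1 : (b == '\r' ? 1 : 0);
        }
        if (matched < 4) {
            throw new IOException("Proxy closed the connection before finishing its response");
        }
        return new String(head.toByteArray(), StandardCharsets.ISO_8859_1);
    }

    // Pulls the numeric code out of a status line such as "HTTP/1.1 200 Connection established".
    static int parseStatusCode(String head) throws IOException {
        String statusLine = head.split("\r\n", 2)[0];
        String[] parts = statusLine.split(" ", 3);
        if (parts.length < 2 || !parts[0].startsWith("HTTP/")) {
            throw new IOException("Unexpected response from proxy: " + statusLine);
        }
        try {
            return Integer.parseInt(parts[1]);
        } catch (NumberFormatException e) {
            throw new IOException("Unexpected response from proxy: " + statusLine);
        }
    }

    // Sends CONNECT and waits for the proxy's answer; throws if the proxy refuses.
    static void handshake(InputStream in, OutputStream out, String host, int port) throws IOException {
        out.write(buildConnectRequest(host, port).getBytes(StandardCharsets.ISO_8859_1));
        out.flush();
        int status = parseStatusCode(readResponseHead(in));
        if (status / 100 != 2) {
            throw new IOException("Proxy refused CONNECT with HTTP status " + status);
        }
    }

    // Returns a socket that, after a successful handshake, talks to host:port through the proxy.
    static Socket openTunnel(String proxyHost, int proxyPort, String host, int port) throws IOException {
        Socket socket = new Socket(proxyHost, proxyPort);
        try {
            handshake(socket.getInputStream(), socket.getOutputStream(), host, port);
            return socket;
        } catch (IOException e) {
            socket.close();
            throw e;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 4) {
            System.err.println("usage: ConnectTunnelSketch proxyHost proxyPort targetHost targetPort");
            return;
        }
        try (Socket tunnel = openTunnel(args[0], Integer.parseInt(args[1]), args[2], Integer.parseInt(args[3]))) {
            System.out.println("Tunnel established to " + args[2] + ":" + args[3]);
        }
    }
}
```

Once openTunnel returns, you read and write the socket's streams exactly as if you had connected to the remote host directly.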



Now, this HTTP CONNECT method for establishing a tunnel through an HTTP proxy is typically used for SSL connections. This means the proxy is usually not going to monitor the actual traffic that goes through once the tunnel is established (if it were SSL traffic, it would be encrypted anyway). But because the method is typically used for SSL tunnels, HTTP proxies are most likely going to have the following restriction: if you try to establish a tunnel to a remote port other than 443 (the SSL port), the proxy is probably going to reject your request. That is, you probably couldn't use the CONNECT method to connect to somehost.com:3001, because 3001 is not the SSL port.

So, keeping all that in mind, do you have to write your own implementation for dealing with HTTP proxy tunneling? Nope - luckily, there is the Jakarta Commons HttpClient. It has a ProxyClient class which can set all this up for you and return the socket which is ready to have actual data sent over it (that is, after the HTTP OK response has been received from the proxy). For a quick start, check out the ProxyTunnelDemo sample.