Reverse Proxy and Load Balancing using Apache mod_proxy

A reverse proxy is a proxy server that retrieves resources on behalf of the client, so that the response appears to come from the reverse proxy itself. One useful application of a reverse proxy is exposing a server on your internal network to the internet.

Load balancing is a technique for distributing requests across several backend servers, providing redundancy and scalability. Many Java EE app servers come with a built-in clustering feature that allows sessions to be replicated across all nodes in the cluster.

Apache has a few modules that can be used to set up both. The following is a guide on how to do it.

  1. Ensure you have Apache httpd installed. This guide is written against httpd 2.4. If you’re on OS X, you most likely already have httpd installed.
  2. Find out where your httpd.conf file is located. Typically it’s at $HTTPD_ROOT/conf/httpd.conf.
  3. Enable the following modules. A module can be enabled by uncommenting its LoadModule directive (removing the ‘#’ character at the beginning of the line) in your httpd.conf:
    LoadModule proxy_module modules/mod_proxy.so
    LoadModule proxy_http_module modules/mod_proxy_http.so
    LoadModule proxy_balancer_module modules/mod_proxy_balancer.so
    LoadModule slotmem_shm_module modules/mod_slotmem_shm.so
    LoadModule lbmethod_byrequests_module modules/mod_lbmethod_byrequests.so
    
  4. (Optional) Depending on your operating system setup, you might need to change the user and group that httpd runs as. On my OS X machine I had to change them to my own user / group (or root). Use the User and Group directives for this:
    User gerrytan
    Group staff
    
  5. Add the following proxy / load balancer configuration:
    <Proxy balancer://mycluster>
      BalancerMember http://192.168.1.101:8080 route=node1
      BalancerMember http://192.168.1.102:8080 route=node2
      ProxySet stickysession=BALANCEID
    </Proxy>
    
    ProxyPassMatch ^(/.*)$ balancer://mycluster$1
    ProxyPassReverse / http://192.168.1.101:8080/
    ProxyPassReverse / http://192.168.1.102:8080/
    ProxyPassReverseCookieDomain 192.168.1.101 localhost
    ProxyPassReverseCookieDomain 192.168.1.102 localhost
    

    The <Proxy> section defines a load balancer named balancer://mycluster with two backend servers (each specified by a BalancerMember directive). In my case the backend servers are 192.168.1.101:8080 and 192.168.1.102:8080. You can add additional IPs to your network interface to simulate multiple backend servers running on different hosts. Note the route attribute, which is explained shortly.

    The ProxySet directive sets the stickysession attribute, which names a cookie that causes requests to be forwarded to the same member. In this case I used BALANCEID, and in my webapp I wrote code that sets the cookie value to balancer.node1 or balancer.node2, depending on the node. When Apache detects this cookie, it forwards the request only to the BalancerMember whose route attribute matches the string after the dot (in this case either node1 or node2). Thanks to this blog article for explaining the step: http://www.markround.com/archives/33-Apache-mod_proxy-balancing-with-PHP-sticky-sessions.html.

    The ProxyPassMatch directive captures and proxies the requests that come into the Apache httpd server (the proxy server). It takes a regular expression to match the request path. For example, assuming I installed httpd on localhost:80 and made a request to http://localhost/foo, this directive will (internally) forward the request to one of the balancer members, for example http://192.168.1.101:8080/foo.

    The ProxyPassReverse directive rewrites the Location header of HTTP redirects (such as a 302) sent by the backend server, so that the client follows the redirect through the reverse proxy instead of bypassing it.

    ProxyPassReverseCookieDomain is similar to ProxyPassReverse, but it rewrites the domain in the Set-Cookie headers sent by the backend server.
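
To make the directives above a bit more concrete, here is a rough Python sketch of the routing decisions they describe. It is only a simulation under simplifying assumptions, not how httpd is implemented: the names MEMBERS, backend_url and rewrite_location are made up for illustration, the proxy is assumed to be reachable at http://localhost/ as in the example above, and the fallback for requests without a usable cookie is a plain alternation between the members.

```python
import itertools
import re

# Balancer members keyed by their route attribute, copied from the
# BalancerMember lines in the configuration above.
MEMBERS = {
    "node1": "http://192.168.1.101:8080",
    "node2": "http://192.168.1.102:8080",
}
STICKY_COOKIE = "BALANCEID"            # ProxySet stickysession=BALANCEID
PATH_PATTERN = re.compile(r"^(/.*)$")  # ProxyPassMatch ^(/.*)$ ...

# Assumption: a crude stand-in for the byrequests method that simply
# alternates between the members.
_round_robin = itertools.cycle(sorted(MEMBERS))


def route_from_cookie(cookies):
    """Return the route named after the first dot of the sticky cookie, if any."""
    value = cookies.get(STICKY_COOKIE, "")
    _, dot, route = value.partition(".")
    return route if dot and route in MEMBERS else None


def backend_url(path, cookies):
    """Map a request path (plus cookies) to the backend URL that would serve it."""
    match = PATH_PATTERN.match(path)
    if match is None:
        return None  # the path does not match, so the request is not proxied
    route = route_from_cookie(cookies) or next(_round_robin)
    return MEMBERS[route] + match.group(1)


def rewrite_location(location, proxy_base="http://localhost/"):
    """Mimic ProxyPassReverse: point backend redirects back at the proxy."""
    for base in MEMBERS.values():
        if location.startswith(base + "/"):
            return proxy_base + location[len(base) + 1:]
    return location  # redirects to other hosts are left untouched
```

With this sketch, backend_url("/foo", {"BALANCEID": "balancer.node2"}) always picks the 192.168.1.102 member, while a request without the cookie may land on either one. rewrite_location("http://192.168.1.101:8080/login") gives back http://localhost/login.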

A few words of warning

There are plenty of ways to configure a reverse proxy. This guide uses an HTTP reverse proxy, which might not deliver the best performance. You might also want to consider an AJP reverse proxy for better performance.

Installing GNU Toolchain on OS X 10.8

Having the GNU Toolchain handy is always a good thing, because you never know when you’ll get the urge to “download the source and compile it yourself”. However, it took me a while to figure out how to get it the first time I got my MacBook Air.

A lot of posts and forum threads suggest installing Xcode, which comes bundled with a toolchain, but newer versions of Xcode (v 4.4.1 at the time this post was written) apparently no longer install the binaries into the standard locations (/usr/bin, /usr/lib and so on). Instead everything is placed inside /Applications/Xcode.app to fit the newer App Store style packaging. This means you can’t access the toolchain from a terminal shell.

Thanks to http://stackoverflow.com/a/9420451/179630, the trick is once you’ve got Xcode installed, go to Xcode > Preferences > Downloads and install the Command Line Tools component.
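
Once the Command Line Tools are installed, a quick way to confirm the toolchain is reachable from the shell is to look the tools up on your PATH. Below is a minimal Python sketch of that check. The function name find_tools and the default list of tools (gcc, make, ld) are just my choice for illustration, so adjust them to whatever you need.

```python
import shutil


def find_tools(names=("gcc", "make", "ld")):
    """Map each tool name to its full path on PATH, or None if it is missing."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_tools().items():
        print(f"{name}: {path or 'not found on PATH'}")
```

If a tool still shows up as not found after the install, the binaries are probably not in a directory listed in your PATH.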

How to Add an Additional IP to a Network Interface on Mac OS X

See here for a similar guide on Windows.

After half an hour of surfing various pages, I’ve finally figured out how to assign an additional IP to a network interface on Mac OS X (thanks to this Apple discussion thread: https://discussions.apple.com/thread/885005?start=0&tstart=0).

Adding another IP to your network interface is a handy way of running multiple instances of a process listening on the same port (eg: a Tomcat container). This is very useful for trying out a High-Availability (HA) setup locally on a laptop.

This guide is written against OS X Mountain Lion (10.8), but hopefully it is not much different on older / newer OS X versions.

  1. Open Network Preferences (System Preferences > Network). The list box on the left shows each connection made through the available network interfaces. In my case I have two connections configured, for my wireless and Bluetooth network interfaces respectively.
  2. On OS X you need to “duplicate a connection” to add an IP to an existing network interface. Typically you would add the additional IP to your primary network interface (eg: your wireless or LAN card) so it’s reachable from other hosts in your network. Since my MacBook Air doesn’t have a LAN port, I used the Wi-Fi network interface. Select your primary network connection (“Wi-Fi” in my case), click the gear icon at the bottom and select Duplicate Service… You might need to click the padlock icon first to be allowed to perform administrative tasks.
  3. Enter a name for the new connection, eg: Tomcat NIC 2, and hit Duplicate. The new connection will be created. Although it’s a different connection, it still uses the same network interface (in my case the Wi-Fi network interface). Initially the connection might not have an automatically assigned IP.
  4. If the connection does not have an automatically assigned IP, assign one manually. From Network Preferences, select the new connection and click the Advanced… button. Go to the TCP/IP tab, select Configure IPv4 Manually, and provide an IPv4 address and subnet mask. Ensure the IP isn’t already used in your network (you can check this using ping).
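
Before typing an address into the TCP/IP tab, it can help to sanity-check the candidate. The Python sketch below is a hypothetical helper (the name is_usable_alias is mine), and it assumes the new IP should sit in the same subnet as your primary IP so that other hosts on the network can reach it. It does not replace the ping check, since it only knows about the addresses you pass in as taken.

```python
import ipaddress


def is_usable_alias(candidate, primary_ip, netmask, taken=()):
    """Check that candidate is a free host address in the primary IP's subnet."""
    # strict=False lets us build the network from a host address plus mask.
    network = ipaddress.ip_network(f"{primary_ip}/{netmask}", strict=False)
    address = ipaddress.ip_address(candidate)
    reserved = {network.network_address, network.broadcast_address}
    already_used = {ipaddress.ip_address(ip) for ip in taken}
    return (
        address in network
        and address not in reserved
        and address not in already_used
    )
```

For example, with a primary IP of 192.168.1.2 and a mask of 255.255.255.0, the candidate 192.168.1.102 passes, while 192.168.2.5 (different subnet) and 192.168.1.255 (broadcast address) do not.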

You can verify that the additional IP has been assigned by using the ifconfig command. Notice that in my case more than one IP is assigned to my Wi-Fi interface (en0):


$ ifconfig
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 16384
 options=3<RXCSUM,TXCSUM>
 inet6 fe80::1%lo0 prefixlen 64 scopeid 0x1
 inet 127.0.0.1 netmask 0xff000000
 inet6 ::1 prefixlen 128
gif0: flags=8010<POINTOPOINT,MULTICAST> mtu 1280
stf0: flags=0<> mtu 1280
en0: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
 ether 70:56:81:c0:c1:4f
 inet6 fe80::7256:81ff:fec0:c14f%en0 prefixlen 64 scopeid 0x4
 inet 192.168.1.101 netmask 0xffffff00 broadcast 192.168.1.255
 inet 192.168.1.2 netmask 0xffffff00 broadcast 192.168.1.255
 inet 192.168.1.102 netmask 0xffffff00 broadcast 192.168.1.255
 media: autoselect
 status: active
p2p0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 2304
 ether 02:56:81:c0:c1:4f
 media: autoselect
 status: inactive
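
If you would rather not eyeball the output, the check can be scripted. The following is a minimal Python sketch that groups the IPv4 (inet) addresses by interface. The function name ipv4_by_interface is hypothetical, and the parsing assumes the output layout shown above (an interface header line starting with the name and "flags=", followed by indented "inet" lines).

```python
import re
from collections import defaultdict


def ipv4_by_interface(ifconfig_output):
    """Group the 'inet' (IPv4) addresses in ifconfig output by interface name."""
    addresses = defaultdict(list)
    current = None
    for line in ifconfig_output.splitlines():
        # Interface header, e.g. "en0: flags=8863<UP,...> mtu 1500"
        header = re.match(r"^(\w+): flags=", line)
        if header:
            current = header.group(1)
            continue
        # Indented IPv4 line, e.g. " inet 192.168.1.101 netmask ..."
        # ("inet6" lines do not match because of the space after "inet").
        inet = re.match(r"^\s+inet (\d+\.\d+\.\d+\.\d+)\b", line)
        if inet and current:
            addresses[current].append(inet.group(1))
    return dict(addresses)
```

Feeding it the output above would give three addresses for en0 (192.168.1.101, 192.168.1.2 and 192.168.1.102) and 127.0.0.1 for lo0. To run it against your own machine, pass in the text returned by subprocess.run(["ifconfig"], capture_output=True, text=True).stdout.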