Reverse Proxy Server 2: mod proxy html

From The Uniform Server Wiki

 

Uniform Server 5.0-Nano
Reverse Proxy.

If you are going to do any serious work with reverse proxies you will require one essential module, mod_proxy_html, which allows embedded page links to be rewritten on-the-fly. It is a third-party module and is not included in the standard Apache package. This page explains where to obtain it and how to use it.

To understand why mod_proxy_html is essential, the following pages provide working examples and issues that may be encountered.

Downloads

mod proxy html

Download the latest version of mod_proxy_html from Apachelounge

Go direct to Download page

Description: mod_proxy_html-3.0.1-w32.zip (27 Jun '08, 485K) - output filter to rewrite HTML links in a proxy situation

Installing - mod proxy html

Extract all files from mod_proxy_html-3.0.1-w32.zip. This creates a new folder mod_proxy_html-3.0.1-w32; inside it you will find a folder mod_proxy_html containing the files that need to be copied to Apache as follows:

  • Create a new folder C:\server_a\UniServer\usr\local\apache2\modules\mod_proxy_html
  • Copy file libxml2.dll and proxy_html.conf to the above folder
  • Copy file mod_proxy_html.so to the same folder, C:\server_a\UniServer\usr\local\apache2\modules\mod_proxy_html, so that it matches the LoadModule path used in httpd.conf
  • Copy file proxy_html.conf to folder C:\server_a\UniServer\usr\local\apache2\conf

httpd.conf

Edit Apache's configuration file: C:\server_a\UniServer\usr\local\apache2\conf\httpd.conf

LoadModule proxy_module modules/mod_proxy.so
#LoadModule proxy_ajp_module modules/mod_proxy_ajp.so
#LoadModule proxy_balancer_module modules/mod_proxy_balancer.so
LoadModule proxy_connect_module modules/mod_proxy_connect.so
LoadModule proxy_ftp_module modules/mod_proxy_ftp.so
LoadModule proxy_http_module modules/mod_proxy_http.so
LoadModule proxy_html_module modules/mod_proxy_html/mod_proxy_html.so

  • Add the final line, LoadModule proxy_html_module modules/mod_proxy_html/mod_proxy_html.so.
  • This loads module mod_proxy_html.so

Include conf/proxy_html.conf
NameVirtualHost *
<VirtualHost *>
ServerName localhost:80
DocumentRoot C:/server_a/UniServer/www

  • Towards the end of the configuration file
  • Add the line Include conf/proxy_html.conf (the first line shown above).
  • Place it just above the line NameVirtualHost.

Note: Examples use default settings in proxy_html.conf


Proxy commands - revisited

When using mod_proxy_html the two basic commands, ProxyPass and ProxyPassReverse, are still required. However, ProxyPassReverse is now contained in a Location block. This is more robust and allows additional commands to be added that specifically target the requested location.

Carrying on from the previous page, the following shows server_a with the original proxy commands, followed by the new Location block.

ProxyPass /info/ http://localhost:82/
ProxyPassReverse  /info/  http://localhost:82/

ProxyPass: Any requests to folder info are passed to the back-end server localhost:82. This command maps a remote server into your proxy server’s name space.

ProxyPassReverse: Has the same syntax as ProxyPass. It rewrites URLs in the response headers so that they keep pointing at the same place as seen by a browser.

ProxyPass  /info/  http://localhost:82/

<Location /info/>
 ProxyPassReverse http://localhost:82/
</Location>

ProxyPassReverse http://localhost:82/ is actually shorthand for ProxyPassReverse /info/ http://localhost:82/. This shorthand is possible because the ProxyPassReverse directive is inside a <Location> block, which supplies the /info/ path.

This command masquerades one server as another. Apache adjusts the URLs in the Location, Content-Location and URI headers from the reverse-proxied server so they look as if they came from the server running the proxy engine.

Both of the above produce the same result: when a user types http://some_domain/info/ into a browser, the contents from server http://localhost:82/ will seamlessly appear in folder info.
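
As a rough illustration of what ProxyPassReverse does to a redirect, the following Python sketch rewrites a Location header value. It is not Apache's implementation: the function name proxy_pass_reverse and the public_base parameter are assumptions made for this example, and only the Location header case is modelled.

```python
def proxy_pass_reverse(location,
                       local_path="/info/",
                       backend="http://localhost:82/",
                       public_base="http://some_domain"):
    """Rewrite a back-end redirect target so it points back at the proxy.

    Simplified model of: ProxyPassReverse /info/ http://localhost:82/
    """
    if location.startswith(backend):
        # Swap the back-end prefix for the proxy's public address and folder.
        return public_base + local_path + location[len(backend):]
    # Redirects to anywhere else are left untouched.
    return location


print(proxy_pass_reverse("http://localhost:82/page2.html"))
print(proxy_pass_reverse("http://example.org/other.html"))
```

The first call yields a URL under http://some_domain/info/, so the browser never sees localhost:82. Note that this only covers headers; links inside the page body are not touched, which is the gap mod_proxy_html fills.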


Example 4 - Standard Apache reverse proxy

An interesting experiment is to proxy our test server using only ProxyPass and ProxyPassReverse. Change Uniform Server’s config file httpd.conf as shown below.

NameVirtualHost *

<VirtualHost *>
 ServerName localhost:80
 DocumentRoot /www

ProxyRequests off
<Proxy *>
  Order deny,allow
  Deny from all
  Allow from 127.0.0.1
</Proxy>

ProxyPass /info/ http://localhost:82/
<Location /info/>
  ProxyPassReverse  http://localhost:82/
</Location>

</VirtualHost>
  • Start servers
  • Result: Page 2 still fails because it uses web-root relative links.

Repeat the following code for each back-end server you wish to test.

ProxyPass /info/ http://localhost:82/
<Location /info/>
  ProxyPassReverse  http://localhost:82/
</Location>
  • Replace folder name and port number as appropriate
  • Check for any link problems

With the configuration structure in place it is easy to apply mod_proxy_html to any servers showing link problems.
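
If several back-end servers are to be tested, the repeated block can be generated rather than typed by hand. The Python sketch below is one possible way to do that; the helper name proxy_block and the second back-end (folder docs on port 83) are hypothetical and only serve as an example.

```python
def proxy_block(folder, port, host="localhost"):
    """Return the ProxyPass/Location block for one back-end server."""
    path = "/" + folder.strip("/") + "/"
    backend = f"http://{host}:{port}/"
    return "\n".join([
        f"ProxyPass {path} {backend}",
        f"<Location {path}>",
        f"  ProxyPassReverse  {backend}",
        "</Location>",
    ])


# Hypothetical back-ends: folder name -> port number.
backends = {"info": 82, "docs": 83}

# Print one block per back-end, ready to paste inside the VirtualHost section.
print("\n\n".join(proxy_block(folder, port) for folder, port in backends.items()))
```

The output is plain text; it still has to be pasted into httpd.conf and the server restarted.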


Example 5 - mod proxy html

To apply mod_proxy_html to our test server, change Uniform Server’s config file httpd.conf as shown below.

Include conf/proxy_html.conf

NameVirtualHost *

<VirtualHost *>
 ServerName localhost:80
 DocumentRoot /www

ProxyRequests off
<Proxy *>
  Order deny,allow
  Deny from all
  Allow from 127.0.0.1
</Proxy>

ProxyPass /info/ http://localhost:82/
ProxyHTMLURLMap http://localhost:82 /info
<Location /info/>
  ProxyPassReverse  http://localhost:82/
  SetOutputFilter proxy-html
  ProxyHTMLURLMap /          /info/
  ProxyHTMLURLMap /info      /info
</Location>



</VirtualHost>

Note: The block below contains three ProxyHTMLURLMap directives and a SetOutputFilter directive.

ProxyPass /info/ http://localhost:82/
ProxyHTMLURLMap http://localhost:82 /info
<Location /info/>
  ProxyPassReverse  http://localhost:82/
  SetOutputFilter proxy-html
  ProxyHTMLURLMap /           /info/
  ProxyHTMLURLMap /info      /info
</Location>
  • The first ProxyHTMLURLMap directive informs mod_proxy_html to rewrite any occurrences of http://localhost:82 to /info.
    • This means any absolute URLs will be rewritten to location /info/, which will then get proxied appropriately.
  • SetOutputFilter informs Apache to pass the proxied content through the proxy-html filter, which performs all the rewriting.
  • The next ProxyHTMLURLMap directive handles URLs that are server web-root relative, for example /test3/. This will be rewritten to /info/test3/ and proxied correctly.
  • The last ProxyHTMLURLMap directive prevents infinite looping.
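
The effect of the three mapping rules on individual links can be sketched in Python. This is a simplified model, not mod_proxy_html itself: the names URL_MAP, map_url and rewrite_links are invented for the example, only href and src attributes in double quotes are handled, and the order in which the sketch tries the rules (no-op rule before the catch-all) is a choice made for this illustration rather than a description of the module's internals.

```python
import re

# (from, to) pairs mirroring the three ProxyHTMLURLMap lines above.
URL_MAP = [
    ("http://localhost:82", "/info"),  # absolute back-end URLs
    ("/info", "/info"),                # no-op: already-mapped links stay as they are
    ("/", "/info/"),                   # web-root relative links
]


def map_url(url):
    """Apply the first rule whose 'from' prefix matches the URL."""
    for prefix, replacement in URL_MAP:
        if url.startswith(prefix):
            return replacement + url[len(prefix):]
    return url  # plain relative links such as "page3.html" pass through


def rewrite_links(html):
    """Rewrite href/src attribute values in a fragment of HTML."""
    return re.sub(r'(href|src)="([^"]*)"',
                  lambda m: f'{m.group(1)}="{map_url(m.group(2))}"',
                  html)


print(rewrite_links('<a href="/test3/">3</a> <a href="http://localhost:82/page2.html">2</a>'))
```

Without the no-op rule, a link that already starts with /info would be caught by the catch-all rule and gain a second /info/ prefix, which is the kind of repeated rewriting the last directive is there to stop.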


Test

Note: Before running the test, clear the browser cache.

Start both servers, then type http://localhost/info/ into a browser to reach the proxy server (server_a).

Depending on your browser you will see something like the following:

Firefox:

  • Content Encoding Error
  • The page you are trying to view cannot be shown because it uses an invalid or unsupported form of compression.

Internet Explorer:

  • Internet Explorer cannot display the webpage
  • You are not connected to the Internet.
  • The website is encountering problems.
  • There might be a typing error in the address.

Opera:

  • ‹uÁ ƒ0†ï‚ïPú=·=ÍÃ.CÐÁ®ÝŒµ£µR

Certainly not the result you were expecting.

Note:

If you have tried out the mini-proxy server you will find it works. More importantly, it uses the same mod_proxy_html component. So what’s the difference? Mini servers are minimal and as such mask real configuration issues and interaction problems.


Problem

Of the three browsers, Opera has displayed the correct content; Firefox detects an error or mismatch and provides a clue, while Internet Explorer provides a useless blanket message.

The output Opera displays is a compressed version of the requested page. During content negotiation Opera agrees to accept a compressed page and therefore decompresses what it receives; it still cannot render the result, decides it’s a text file and displays it accordingly.
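
Why link rewriting and compression do not mix can be seen with a short Python sketch. It assumes gzip compression and a hypothetical one-line page repeated a few times; naive_rewrite is a stand-in that rewrites links by plain text matching and is not how mod_proxy_html parses HTML.

```python
import gzip


def naive_rewrite(body: bytes) -> bytes:
    """Prefix web-root relative href links with /info (plain text matching only)."""
    return body.replace(b'href="/', b'href="/info/')


page = b'<a href="/test3/">Page 3</a>\n' * 5   # hypothetical back-end page
compressed = gzip.compress(page, mtime=0)       # what a compressing back-end might send

print(naive_rewrite(page) != page)              # True: links found and rewritten
print(naive_rewrite(compressed) == compressed)  # True: nothing recognisable to rewrite
print(gzip.decompress(compressed) == page)      # True: the links are still in there
```

Once the body is compressed the link text is no longer visible as text, so anything that works on the page content has to see it uncompressed.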

Solution 1

The above sounds like a plausible explanation! A quick way to check this is to turn off compression.

Edit the two configuration files:

  • C:\server_a\UniServer\usr\local\apache2\conf\httpd.conf
  • C:\server_b\UniServer\usr\local\apache2\conf\httpd.conf

Locate this line in each file and comment out as shown:

#LoadModule deflate_module modules/mod_deflate.so

Test:

  • Clear browser cache
  • Repeat the above test
  • Works as expected
  • All links on page 2 are rewritten.


Solution 2

Although solution 1 is viable it significantly increases bandwidth use, and the reason for running compression in the first place was to reduce bandwidth. There are two compression paths: the first between browser and proxy, the second between proxy and back-end server.

Problems started when mod_proxy_html was introduced. This rewrites links in pages served from the back-end server, so it is logical to assume the compression issue lies with the back-end server and not the front-end.

Edit configuration file:

  • C:\server_a\UniServer\usr\local\apache2\conf\httpd.conf
  • Enable compression: un-comment the line as shown:
    LoadModule deflate_module modules/mod_deflate.so
  • This enables compression between browser and proxy server.

Test:

  • Clear browser cache
  • Restart server_a
  • Repeat the above test
  • Works as expected
  • All links on page 2 are rewritten.

This solution is preferable: it preserves the reduction in Internet bandwidth, assuming the back-end server is on a local network where the bandwidth hit is less significant.


Summary

Apache’s basic reverse proxy is ideal for sites that use only relative links; however, it was never intended to proxy sites containing absolute links or web-root relative links.

Proxying such sites requires a third-party module, mod_proxy_html, which rewrites links within a page before the page is served.

I have highlighted issues that arise when it is run in conjunction with compression and offered a partial solution. On the next page I cover a complete solution.



Ric