Reverse Proxy Server: mod_proxy_html
Uniform Server 3.5-Apollo Reverse Proxy.
If you are going to do any serious work with reverse proxies you will need one essential module, mod_proxy_html, which allows links embedded in a page to be rewritten on the fly. It is a third-party module not included in the standard Apache package. This page explains where to obtain it and how to use it.
To show why mod_proxy_html is essential, the examples on this page proxy real applications (MediaWiki, WordPress and Joomla) using the ready-to-go mini-servers.
Downloads
Mini-Servers
I have assumed you have downloaded the following mini-servers; for details see Mini Servers Ready To Go:
- Mini Server 15 - MediaWiki 1.12.0
- Mini Server 16 - WordPress 2.6.1
- Mini Server 18 - Joomla 1.5.6
- Mini Server 20 - Reverse Proxy (note: includes Mini Server 6)
mod_proxy_html
Mini Server 20 includes mod_proxy_html 3.0.1; for the latest version check Apache Lounge.
The mini server uses the files extracted from:
mod_proxy_html-3.0.1-w32.zip 27 Jun '08 485K - output filter to rewrite HTML links in a proxy situation
Found on the download page
Installing - mod_proxy_html
Extract all files from mod_proxy_html-3.0.1-w32.zip. This creates a new folder, mod_proxy_html-3.0.1-w32; inside it you will find the folder mod_proxy_html, which contains the files that need to be copied to Apache as follows:
- Copy file mod_proxy_html.so to folder \udrive\usr\local\apache2\modules
- Copy file libxml2.dll to folder \udrive\usr\local\apache2\bin
- Copy file proxy_html.conf to folder \udrive\usr\local\apache2\conf
httpd.conf
Open Apache's configuration file, httpd.conf, located in folder \udrive\usr\local\apache2\conf, and find the LoadModule section:

# ===================================================
LoadModule proxy_module modules/mod_proxy.so
LoadFile /usr/local/apache2/bin/libxml2.dll
LoadModule proxy_html_module modules/mod_proxy_html.so
LoadModule rewrite_module modules/mod_rewrite.so

The two lines to add are:
- LoadFile /usr/local/apache2/bin/libxml2.dll - forces the libxml2 library to be loaded before the module that depends on it.
- LoadModule proxy_html_module modules/mod_proxy_html.so - loads the Apache module.
Towards the end of the configuration file add the Include line just above the virtual host:

# ===================================================
Include conf/proxy_html.conf
NameVirtualHost *
<VirtualHost *>

Note: The examples use the default settings in proxy_html.conf.
Proxy commands - revisited
When using mod_proxy_html the two basic directives, ProxyPass and ProxyPassReverse, are still required. However, ProxyPassReverse is now contained in a Location block. This is more robust and allows additional directives to be added that specifically target the requested location.
From the previous page, the following shows server 6 first with the original proxy commands and then with the new Location block.
Original commands:

ProxyPass /info/ http://localhost:8086/
ProxyPassReverse /info/ http://localhost:8086/

ProxyPass: Any request to folder info is passed to the back-end server localhost:8086. This directive maps a remote server into your proxy server's name space.
ProxyPassReverse: Has the same syntax as ProxyPass; it rewrites the response URI to keep it pointing at the same place as seen by a browser.

With the Location block:

ProxyPass /info/ http://localhost:8086/
<Location /info/>
  ProxyPassReverse http://localhost:8086/
</Location>

ProxyPassReverse http://localhost:8086/ is actually shorthand for ProxyPassReverse /info/ http://localhost:8086/. This shorthand is possible because the ProxyPassReverse directive is inside a <Location> block, which supplies the local path. This directive masquerades one server as another: Apache adjusts URLs in the Location, Content-Location and URI headers from the reverse-proxied server so they look as if they came from the server running the proxy engine.
Both of the above produce the same result: when a user types http://some_domain/info/ into a browser, the content from server http://localhost:8086/ seamlessly appears in folder info.
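As a rough illustration of what ProxyPassReverse does to response headers, the commented sketch below assumes the back-end on port 8086 answers a request with a redirect. The page names old.html and test.html and the header values are hypothetical, chosen only for illustration; they are not output captured from the mini-servers.

```apache
# Browser requests:    http://some_domain/info/old.html
# Proxy forwards to:   http://localhost:8086/old.html    (ProxyPass)
#
# Assumed back-end response header (hypothetical redirect):
#   Location: http://localhost:8086/test.html
# Header as passed on to the browser (ProxyPassReverse):
#   Location: http://some_domain/info/test.html
ProxyPass        /info/ http://localhost:8086/
ProxyPassReverse /info/ http://localhost:8086/
```

Only response headers are adjusted here. Links inside the page body are left untouched, which is the gap mod_proxy_html is meant to fill.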
Example 4 - Standard Apache reverse proxy
An interesting experiment is to proxy our test servers using only ProxyPass and ProxyPassReverse. Change Uniform Server’s config file httpd.conf as shown below.
If using Mini Server 20, open folder example 4 and copy file httpd.conf up one level into folder \udrive\usr\local\apache2\conf, letting it overwrite the existing file.

NameVirtualHost *
<VirtualHost *>
  ServerName localhost:80
  DocumentRoot /www
  ProxyRequests off
  <Proxy *>
    Order deny,allow
    Deny from all
    Allow from 127.0.0.1
  </Proxy>

  ProxyPass /info/ http://localhost:8086/
  <Location /info/>
    ProxyPassReverse http://localhost:8086/
  </Location>

  ProxyPass /wiki/ http://localhost:8095/wiki/
  <Location /wiki/>
    ProxyPassReverse http://localhost:8095/wiki/
  </Location>

  ProxyPass /wordpress/ http://localhost:8096/wordpress/
  <Location /wordpress/>
    ProxyPassReverse http://localhost:8096/wordpress/
  </Location>

  ProxyPass /joomla/ http://localhost:8098/joomla/
  <Location /joomla/>
    ProxyPassReverse http://localhost:8098/joomla/
  </Location>
</VirtualHost>
Start servers
Results are as follows:
- Info: With the exception of Test 3, the sites are proxied with no problems. Test 3 fails because it uses web-root relative links, while Test 1 and Test 2 use relative links.
- MediaWiki: Check out some of the wiki links and note that they work, as do images. I would conclude from this that MediaWiki uses relative links.
- WordPress: At first sight it appears to work correctly. On the hello world page click either of the two links, Uncategorized or 1 Comment, and note what is displayed in your browser's address bar. It is the real WordPress server address (domain), which clearly shows that absolute addresses break out of the proxy server.
- Joomla: Check out some of the links and note that they work, as do images. I would conclude from this that Joomla uses relative links.
With the configuration structure in place, it is easy to apply mod_proxy_html to any servers showing link problems, in this case Info and WordPress. The other servers do not require rewriting; if you apply it to them anyway, they will incur a small amount of extra processing time.
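A minimal sketch of that selective approach follows. It assumes the same ports and paths as Example 4, that mod_proxy_html is loaded and conf/proxy_html.conf is included as described earlier, and it borrows the SetOutputFilter and ProxyHTMLURLMap lines from Example 5. Only the Info and WordPress blocks get the filter. It is untested against the mini-servers and is meant to stand in for the four proxy blocks inside the <VirtualHost *> section.

```apache
# Info: Test 3 uses web-root relative links, so filter its pages
ProxyPass /info/ http://localhost:8086/
ProxyHTMLURLMap http://localhost:8086 /info
<Location /info/>
  ProxyPassReverse http://localhost:8086/
  SetOutputFilter proxy-html
  ProxyHTMLURLMap / /info/
  ProxyHTMLURLMap /info /info
</Location>

# MediaWiki: links worked in Example 4, so no output filter
ProxyPass /wiki/ http://localhost:8095/wiki/
<Location /wiki/>
  ProxyPassReverse http://localhost:8095/wiki/
</Location>

# WordPress: absolute links escaped the proxy, so filter its pages
ProxyPass /wordpress/ http://localhost:8096/wordpress/
ProxyHTMLURLMap http://localhost:8096/wordpress /wordpress
<Location /wordpress/>
  ProxyPassReverse http://localhost:8096/wordpress/
  SetOutputFilter proxy-html
  ProxyHTMLURLMap / /wordpress/
  ProxyHTMLURLMap /wordpress /wordpress
</Location>

# Joomla: links worked in Example 4, so no output filter
ProxyPass /joomla/ http://localhost:8098/joomla/
<Location /joomla/>
  ProxyPassReverse http://localhost:8098/joomla/
</Location>
```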
Example 5 - mod_proxy_html
For completeness, and to show how symmetrical the configuration is, I have applied mod_proxy_html to all of our test servers.
Change Uniform Server’s config file httpd.conf as shown below.
If using Mini Server 20, open folder example 5 and copy file httpd.conf up one level into folder \udrive\usr\local\apache2\conf, letting it overwrite the existing file.

Include conf/proxy_html.conf
NameVirtualHost *
<VirtualHost *>
  ServerName localhost:80
  DocumentRoot /www
  ProxyRequests off
  <Proxy *>
    Order deny,allow
    Deny from all
    Allow from 127.0.0.1
  </Proxy>

  ProxyPass /info/ http://localhost:8086/
  ProxyHTMLURLMap http://localhost:8086 /info
  <Location /info/>
    ProxyPassReverse http://localhost:8086/
    SetOutputFilter proxy-html
    ProxyHTMLURLMap / /info/
    ProxyHTMLURLMap /info /info
  </Location>

  ProxyPass /wiki/ http://localhost:8095/wiki/
  ProxyHTMLURLMap http://localhost:8095/wiki /wiki
  <Location /wiki/>
    ProxyPassReverse http://localhost:8095/wiki/
    SetOutputFilter proxy-html
    ProxyHTMLURLMap / /wiki/
    ProxyHTMLURLMap /wiki /wiki
  </Location>

  ProxyPass /wordpress/ http://localhost:8096/wordpress/
  ProxyHTMLURLMap http://localhost:8096/wordpress /wordpress
  <Location /wordpress/>
    ProxyPassReverse http://localhost:8096/wordpress/
    SetOutputFilter proxy-html
    ProxyHTMLURLMap / /wordpress/
    ProxyHTMLURLMap /wordpress /wordpress
  </Location>

  ProxyPass /joomla/ http://localhost:8098/joomla/
  ProxyHTMLURLMap http://localhost:8098/joomla /joomla
  <Location /joomla/>
    ProxyPassReverse http://localhost:8098/joomla/
    SetOutputFilter proxy-html
    ProxyHTMLURLMap / /joomla/
    ProxyHTMLURLMap /joomla /joomla
  </Location>
</VirtualHost>
Note: Each block contains three ProxyHTMLURLMap directives and a SetOutputFilter directive. Looking at the first block:

ProxyPass /info/ http://localhost:8086/
ProxyHTMLURLMap http://localhost:8086 /info
<Location /info/>
  ProxyPassReverse http://localhost:8086/
  SetOutputFilter proxy-html
  ProxyHTMLURLMap / /info/
  ProxyHTMLURLMap /info /info
</Location>

The first ProxyHTMLURLMap directive tells mod_proxy_html to rewrite any occurrence of http://localhost:8086 to /info. This means any absolute URLs will be rewritten to location /info/, which will then get proxied appropriately.
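The remaining directives can be read in the same way. The annotated copy of the first block below is a sketch of how I understand each line. The sample link page.html in the comments is hypothetical, and the note on the third map is an interpretation rather than something stated on this page, so check it against the mod_proxy_html documentation for your version.

```apache
ProxyPass /info/ http://localhost:8086/

# Absolute links: http://localhost:8086/page.html -> /info/page.html
ProxyHTMLURLMap http://localhost:8086 /info

<Location /info/>
  # Response headers (redirects and the like) are adjusted here
  ProxyPassReverse http://localhost:8086/

  # Pass the HTML through mod_proxy_html before it is served
  SetOutputFilter proxy-html

  # Web-root relative links: /page.html -> /info/page.html
  ProxyHTMLURLMap / /info/

  # Links already under /info map to themselves. Assumption: this is
  # a guard so such links are not given the /info prefix a second time.
  ProxyHTMLURLMap /info /info
</Location>
```

Plain relative links such as page.html start with none of these patterns and are left alone, which fits the observation that they already worked in Example 4.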
Summary
Apache's basic reverse proxy is ideal for sites that use only relative links; however, it was never intended to proxy sites containing absolute links or web-root relative links. Proxying these sites requires a third-party module, mod_proxy_html, which rewrites links within a page before the page is served.
References: Running a Reverse Proxy in Apache
Ric