Caching reverse proxy (with Apache) as media mirror

The advantage of using a caching reverse proxy to mirror media files (images etc.) instead of rsync-ing them is that only the files actually requested are transferred to the mirrors, not all the old ones that are almost never accessed. The mirror therefore needs less disk space, and setting up a new mirror skips the bandwidth- and time-consuming initial sync of all files.

In principle HTML files can be cached as well, but because they are modified very often (comments etc.) it is hard to tell when a cached copy needs to be refreshed, and getting the configuration right takes considerably more effort.

On a Debian (lenny) system I took the following steps to set up the caching reverse proxy.

  • Create a directory for the Apache cache (it must be writable by www-data): here I used /var/wwwcache.

  • Create /etc/apache2/sites-available/media.de.indymedia.org-proxy:

<VirtualHost *:80>
   ServerName media.de.indymedia.org

   # disable forward proxying
   ProxyRequests Off
   <Proxy *>
      Order deny,allow
      Allow from all
   </Proxy>
   # pass requests to the server that has all files
   ProxyPass / http://www3.de.indymedia.org/
   ProxyPassReverse / http://www3.de.indymedia.org/

   # should exist and be writable by apache:
   CacheRoot /var/wwwcache
   CacheDirLevels 3
   CacheDirLength 1
   # cache these paths:
   CacheEnable disk /rtsp
   CacheEnable disk /images
   CacheEnable disk /media
   CacheEnable disk /icon
   CacheEnable disk /style
   CacheEnable disk /static
   # 32 MB (in bytes):
   CacheMaxFileSize 33554432
   CacheMinFileSize 1
   # 24 h:
   CacheDefaultExpire 86400
   # 10 d:
   CacheMaxExpire 864000
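   # serve from the cache even when the client asks for a refresh
   # (request sent with Cache-Control: no-cache or Pragma: no-cache)
   CacheIgnoreCacheControl On
   # also cache responses that carry no Last-Modified header
   CacheIgnoreNoLastMod On
   # also cache requests and responses marked Cache-Control: no-store
   CacheStoreNoStore On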
   # because media items will never need a query string
   CacheIgnoreQueryString On

   LogFormat "noip - - %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" %T %V" noip
   CustomLog  /dev/null noip
   ErrorLog /dev/null
</VirtualHost>
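
The size and expiry values in the vhost above are plain byte and second counts. As a quick sanity check, the following shell arithmetic reproduces them. The variable names are only illustrative and are not Apache settings.

```shell
# Illustrative variable names; only the resulting numbers go into the vhost.
max_file_size=$((32 * 1024 * 1024))    # CacheMaxFileSize: 32 MB in bytes
default_expire=$((24 * 60 * 60))       # CacheDefaultExpire: 24 h in seconds
max_expire=$((10 * 24 * 60 * 60))      # CacheMaxExpire: 10 d in seconds

echo "CacheMaxFileSize $max_file_size"
echo "CacheDefaultExpire $default_expire"
echo "CacheMaxExpire $max_expire"
```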

  • Enable required modules (proxy, proxy_http, cache, disk_cache):

a2enmod proxy proxy_http cache disk_cache

  • Enable the newly created vhost and restart Apache so that the new modules and the vhost are loaded:

a2ensite media.de.indymedia.org-proxy
/etc/init.d/apache2 restart
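
The first step above (creating the cache directory) is not spelled out as a command. A minimal sketch could look like this. The helper name prepare_cache_root and the mode 750 are assumptions, not part of the original setup.

```shell
# Sketch: create a cache directory for mod_disk_cache.
# prepare_cache_root is a hypothetical helper, not part of Apache or Debian.
prepare_cache_root() {
    dir=$1      # e.g. /var/wwwcache (the CacheRoot of the vhost)
    owner=$2    # e.g. www-data, the user Apache runs as on Debian
    mkdir -p "$dir" || return 1
    # Changing ownership needs root; skip it otherwise.
    if [ "$(id -u)" -eq 0 ]; then
        chown "$owner" "$dir" || return 1
    fi
    # Owner: read/write/traverse; group: read/traverse; others: no access.
    chmod 750 "$dir"
}
```

Run as root, the call for this setup would be: prepare_cache_root /var/wwwcache www-data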

TODO: To prevent the cache from filling up the disk, htcacheclean needs to be run periodically.
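
One possible way to do this is a small script in /etc/cron.daily. This is only a sketch and has not been tested as part of this setup: the file name and the 2000M size limit are assumptions. The options -n (nice mode), -t (delete empty directories), -p (cache path) and -l (size limit) are standard htcacheclean options.

```shell
#!/bin/sh
# Sketch of a daily cleanup job, e.g. saved as /etc/cron.daily/wwwcache-clean
# (hypothetical file name). The size limit is an assumed value; adjust it
# to the available disk space.
CACHE_ROOT=/var/wwwcache
CACHE_LIMIT=2000M

# -n: be nice (slower, less I/O load)   -t: delete empty directories
# -p: cache root                        -l: total size limit of the cache
clean_cmd="htcacheclean -n -t -p$CACHE_ROOT -l$CACHE_LIMIT"

if command -v htcacheclean >/dev/null 2>&1 && [ -d "$CACHE_ROOT" ]; then
    $clean_cmd
else
    echo "skipped: $clean_cmd" >&2
fi
```

htcacheclean can also run as a daemon (option -d with an interval in minutes) instead of being started from cron.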

Documentation on caching and proxying with Apache:

-- BriKs - 14 Feb 2010