How to set up a mirror of nyc.indymedia.org

New York Indymedia needs solid, well-connected servers for mirroring nyc.indymedia.org. We look for multihomed T3 connections or better, in order to ensure high reliability and low latency. Currently, we do not monitor the individual bandwidth consumption of the site, although it has not been as great as that of some other Mir sites. Disk space used is currently (November 2005) 11G.

Dedicated servers are generally preferred. In particular, we advise against running critical applications on the same host; if you do, please plan to closely monitor resource usage (including allocation of memory and processor time) and the results of the scripts to update the mirror. We attempt to provide timely notification of problems, but we're a volunteer organization running a 24/7 network and techies are not always available as needed.

A dedicated IP address for the site is also desirable in order that the mirror can be set up to run via HTTPS.

There are other Indymedia sites that don't have as many resources as the New York one. If you can offer capacity for other sites, please add your contact info to the IMCServerOffers page. See also the global MirrorHowTo page.

A lot of the New York wiki Mir documentation is based upon UK information: check out UkHostingComparisons for things to consider when hosting a new publishing server for Mir Indymedia sites, and UkCrypto for ideas about using more encryption.

help and contact info

Please ask in #nyc on irc.indymedia.org or on the imc-nyc-web list if you need help. You are also advised to join the list.

httpd.conf

Use the imc-nyc-mirror-httpd.conf file you can get here as a basis for your mirror:

Please retain the config for not logging IP addresses. The sample file has comments explaining what you might need to change.

Compression

You can save a lot of bandwidth if you serve HTML files compressed with gzip. You can test whether you are doing so using apachebench:

  ab -v 4 -n 1 -H "Accept-Encoding: gzip" http://publish.nyc.indymedia.org/ 

If it's working then you will get this line in the response header:

  Content-Encoding: gzip
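
If apachebench is not installed, the same check can be scripted. The following is a minimal sketch, assuming `curl` is available (it is not mentioned elsewhere on this page); the helper name `is_gzip_response` is made up for illustration. It reads response headers on standard input and succeeds only if a `Content-Encoding: gzip` header is present.

```shell
# Hypothetical helper: reads HTTP response headers on stdin and returns 0
# if they contain "Content-Encoding: gzip" (case-insensitive), 1 otherwise.
is_gzip_response() {
    tr -d '\r' | grep -qi '^content-encoding:[[:space:]]*gzip[[:space:]]*$'
}

# Example use (needs network access, so it is left commented out;
# curl is an assumption, the page itself only uses ab):
#   curl -s -o /dev/null -D - -H 'Accept-Encoding: gzip' \
#       http://publish.nyc.indymedia.org/ | is_gzip_response && echo "gzip is on"
```

Keeping the header check separate from the download means the check can be tried against saved headers without touching the server.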

Apache 2.x has mod_deflate built in, and you just need to add this configuration file to enable it on Fedora / Red Hat. On Debian you also need to enable the module:

 /usr/sbin/a2enmod deflate

Apache 1.3.x needs to have mod_gzip compiled in. On Debian it's simply a matter of doing:

  apt-get install libapache-mod-gzip

And then installing this configuration file:

download the site

NOTE: from here on hasn't been edited properly yet!!!

Before using rsync to update the site, use ncftp to download the tgz archive (10 GB). Don't use wget, as it often has a 2.5 GB file size limit:

ncftpget ftp://ftp.indymedia.org.uk/pub/imc-uk.tar.gz

Then extract it:

tar -zxvf imc-uk.tar.gz

And put it somewhere, for example:

/var/www/www.indymedia.org.uk
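
Since the archive and the extracted site both need room, it may be worth checking free disk space before downloading. This is a minimal sketch, not part of the original instructions: the helper name `has_free_space` is hypothetical, and the figure in the usage example is only a rough estimate (archive size plus extracted size).

```shell
# Hypothetical helper: succeeds if the filesystem holding DIR has at least
# NEEDED_KB kilobytes free. "df -Pk" gives POSIX-format output in 1024-byte
# blocks, with the available space in the fourth column of the second line.
has_free_space() {
    dir=$1
    needed_kb=$2
    avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ -n "$avail_kb" ] && [ "$avail_kb" -ge "$needed_kb" ]
}

# Example use (the 22000000 KB figure is an assumption, roughly 21 GB):
#   has_free_space /var/www 22000000 || echo "not enough room for the mirror" >&2
```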

rsync

You can do the first update of the downloaded site using rsync like this:

rsync -vazL  rsync://rsync.nyc.indymedia.org/nyc.indymedia.org/ /var/www/nyc.indymedia.org/

If rsync fails, this doesn't matter; just keep running the command until you have it all updated. Alternatively, use the pull.sh script below to do the initial sync.
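
Re-running the command by hand can be automated with a small retry loop. This is a sketch under stated assumptions: the helper name `retry`, the attempt count and the delay are all made up for illustration, and rsync simply resumes where the previous attempt stopped because files already transferred are skipped.

```shell
# Hypothetical helper: run a command until it succeeds, at most MAX times,
# sleeping DELAY seconds between attempts.
# Returns 0 on the first success, 1 if every attempt failed.
retry() {
    max=$1
    delay=$2
    shift 2
    attempt=1
    while [ "$attempt" -le "$max" ]; do
        if "$@"; then
            return 0
        fi
        echo "attempt $attempt of $max failed" >&2
        attempt=$((attempt + 1))
        # Only sleep if another attempt is still to come.
        [ "$attempt" -le "$max" ] && sleep "$delay"
    done
    return 1
}

# Example use (left commented out because it contacts the rsync server):
#   retry 5 60 rsync -vazL rsync://rsync.nyc.indymedia.org/nyc.indymedia.org/ /var/www/nyc.indymedia.org/
```

A generous delay between attempts is kinder to the rsync server than retrying immediately.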

Please make sure you have an up-to-date copy of the site before you set up the cron jobs, because it's very bad for the server if there are too many rsyncs running at once!
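
One way to make sure a mirror never runs two syncs at once is to wrap the sync command in a lock. The following is a minimal sketch, not taken from the pull.sh scripts: the helper name `with_lock`, the lock path and the exit code 75 are assumptions chosen for illustration.

```shell
# Hypothetical helper: run a command only if no other run holds the lock.
# mkdir either creates the directory or fails, atomically, so only one
# process at a time can get past it.
with_lock() {
    lockdir=$1
    shift
    if ! mkdir "$lockdir" 2>/dev/null; then
        echo "another sync seems to be running ($lockdir exists), skipping" >&2
        return 75   # arbitrary non-zero code meaning "try again later"
    fi
    status=0
    "$@" || status=$?
    rmdir "$lockdir"
    return "$status"
}

# Example use (the lock path is an assumption):
#   with_lock /tmp/nyc-mirror.lock /usr/local/bin/pull.sh
```

If a sync is killed before it can clean up, the lock directory stays behind and has to be removed by hand before the next run will start.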

rsync scripts

There are a couple of scripts for updating the site:

The pull.sh one should be run every hour or two; it does an rsync of the whole site. The fast-pull.sh one can be run a lot more often, perhaps every 10 or 20 minutes. Please run these scripts on the command line and time them before setting up the cron jobs.
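
Timing a script can be as simple as prefixing it with `time`. Where a plain number of seconds is more convenient, for example to compare against the cron interval, a sketch like the following may help. The helper name `elapsed_seconds` is hypothetical, and `date +%s` is assumed to be supported (it is on GNU and BSD systems, though not required by POSIX).

```shell
# Hypothetical helper: run a command and print how many whole seconds it took.
# The command's own output and exit status are discarded so that only the
# number is printed.
elapsed_seconds() {
    start=$(date +%s)
    "$@" >/dev/null 2>&1 || true
    end=$(date +%s)
    echo $((end - start))
}

# Example use (the script path is an assumption):
#   elapsed_seconds /usr/local/bin/pull.sh
```

If a script takes longer than the gap between its cron runs, the runs will overlap, so the interval should be widened accordingly.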

rsync cron job

Then you will need to set up a cron job to get the site every hour, do this with crontab -e (you can do this as any user as long as they have write permissions for the data) and add the following (change to the path for the scripts you have installed):

 # rsync cron jobs for nyc.indymedia.org
 # ----------------------------------------
 # full sync
 # get the whole site every hour using the pull.sh script
 # (first number is the number of minutes past the hour; please use an unused number from the table below, and update the Wiki!)
 40  *   *  *  * /usr/local/bin/rsync-nyc.indy.sh
 # fast sync
 # get this month's content using the fast-pull.sh script
 # (change the first number from 0 so that your mirror doesn't try to update at the same time as others) 
 0   */9 *  *  * /usr/local/bin/full-rsync-nyc.indy.sh  

The full sync time above should be changed to allow only one sync to happen at a time. Please use the following table:

* * * www0
05 * * * * www1
50 * * * * www2
10 * * * * www10
15 * * * * www4
20 * * * * www12
25 * * * * www8
35 * * * * www7
40 * * * * www10
45 * * * * alt1
50 * * * * www11
55 * * * * www9
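
To pick a slot, it can help to check whether a given minute already appears in a saved copy of the table. This is only a sketch: the helper name `minute_is_free` and the file name in the usage example are hypothetical, and the Wiki table remains the authoritative list.

```shell
# Hypothetical helper: reads schedule lines such as "05 * * * * www1" on stdin
# and succeeds only if no line already uses MINUTE as its first field.
# Lines whose first field is not a number are ignored.
minute_is_free() {
    awk -v want="$1" '
        $1 ~ /^[0-9]+$/ && ($1 + 0) == (want + 0) { taken = 1 }
        END { exit taken ? 1 : 0 }
    '
}

# Example use (schedule.txt is an assumed local copy of the table):
#   minute_is_free 30 < schedule.txt && echo "minute 30 is not taken"
```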

Follow this link for instructions on how to set up a UkRsyncServer - it is probably similar for nyc!!

dns

Round-robin DNS is used so that www.indymedia.org.uk resolves to a different IP address at different times; the pages you get will come from one of these mirrors:

If you have a mirror set up, updating, and on a reliable connection, then get in contact saying what your server's IP address is, and we can give it an nycX subdomain and also add it to the round-robin DNS.
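
Once a mirror has been added, the rotation can be inspected by listing the addresses a name resolves to. A minimal sketch, assuming the `dig` tool is installed; the helper name `count_addresses` is hypothetical and the hostname in the usage example is only an illustration.

```shell
# Hypothetical helper: reads one address per line on stdin (the format
# printed by "dig +short NAME A") and prints how many distinct IPv4
# addresses it saw. Lines that do not look like IPv4 addresses are ignored.
count_addresses() {
    grep -E '^[0-9]{1,3}(\.[0-9]{1,3}){3}$' | sort -u | wc -l | tr -d ' '
}

# Example use (needs network access and dig, so it is left commented out):
#   dig +short nyc.indymedia.org A | count_addresses
```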

trouble shooting

Various oddities...

You will probably need to delete the front page and symlink it to the one in en, e.g.:

cd /var/www/nyc.indymedia.org/
rm index.html
ln -s en/index.html index.html
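
Because a full sync may put the original front page back, the fix may need repeating. The following sketch does the same thing in a form that is safe to run more than once; the helper name `link_front_page` is hypothetical, and it assumes an `ln` that supports `-f` and `-n` (GNU and BSD both do).

```shell
# Hypothetical helper: replace DOCROOT/index.html with a relative symlink
# to en/index.html. Refuses to act if the English front page is missing.
link_front_page() {
    docroot=$1
    [ -f "$docroot/en/index.html" ] || {
        echo "missing $docroot/en/index.html" >&2
        return 1
    }
    # -f replaces an existing file or link, -n avoids following an old link.
    ln -sfn en/index.html "$docroot/index.html"
}

# Example use:
#   link_front_page /var/www/nyc.indymedia.org
```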

debian apache modules

Apache 1.3.x

The site requires SSI, so you might need to install mod_include and also mod_env if your apache doesn't come with them.

# Re the SSI stuff; if your www-server startup script complains it does
# not know a certain command, chances are that you need to load another
# SSI module (see http://httpd.apache.org/docs/mod ). In the block below
# the modules needed with apache 1.3.2 in Debian Woody are loaded.
 LoadModule autoindex_module /usr/lib/apache/1.3/mod_autoindex.so
 LoadModule alias_module /usr/lib/apache/1.3/mod_alias.so
 LoadModule env_module /usr/lib/apache/1.3/mod_env.so
 LoadModule includes_module /usr/lib/apache/1.3/mod_include.so
 LoadModule dir_module /usr/lib/apache/1.3/mod_dir.so
 LoadModule mime_module /usr/lib/apache/1.3/mod_mime.so
 LoadModule access_module /usr/lib/apache/1.3/mod_access.so
 LoadModule config_log_module /usr/lib/apache/1.3/mod_log_config.so
 Options +Includes

Apache 2.x

First off you need to enable some modules:

  $ sudo /usr/sbin/a2enmod mime_magic
  $ sudo /usr/sbin/a2enmod include
  $ sudo /usr/sbin/a2enmod deflate
  $ sudo /usr/sbin/a2enmod headers

This fixes SSI.

Then to use mod_deflate you need to copy deflate.conf into /etc/apache2/mods-available and symlink it:

  cd /etc/apache2/mods-enabled
  ln -s /etc/apache2/mods-available/deflate.conf
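
The deflate.conf file itself is not reproduced on this page. As a rough illustration only, the sketch below writes a minimal mod_deflate configuration; the helper name `write_deflate_conf` and the choice of MIME types are assumptions, not the contents of the real file.

```shell
# Hypothetical helper: write a minimal mod_deflate configuration to the
# file named by the first argument. The directives are an assumption about
# what a deflate.conf for this site might contain.
write_deflate_conf() {
    cat > "$1" <<'EOF'
<IfModule mod_deflate.c>
    # Compress text responses only; images and archives are already compressed.
    AddOutputFilterByType DEFLATE text/html text/plain text/xml text/css
</IfModule>
EOF
}

# Example use (run as root; the path matches the one mentioned above):
#   write_deflate_conf /etc/apache2/mods-available/deflate.conf
```

Apache needs to be reloaded before a new configuration file takes effect.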

You can check it's serving content gzipped:

  ab -H 'Accept-Encoding: gzip' -v 4 http://nyc.indymedia.org/ | grep Content-Encoding

Note: this last command has not been verified.

status

  • There is no mirror status page yet.

home mirrors

Even if you just have a broadband connection and a desire to help, we still want to hear from you. These connections won't be put in the main rotation because they will not be able to cope with the load and are generally not reliable enough, but they are great for backups and for a possible distributed DSL scheme we'd like to look into with other IMCs.

If you have a domestic mirror set up, ask on the imc-nyc-web list or in #nyc on irc.indymedia.org for a domain name to be set up for your site.

Even if you do not have any hard drive space or extensive technical knowledge then we may be able to help you to obtain these. Please see the other contact info below.


-- GarconDuMonde - 22 Nov 2005
Topic revision: r1 - 23 Nov 2005, GarconDuMonde