For the time being, I thought it might be helpful to keep all the server information on one twiki rather than spreading it out with one twiki per server. If this does not work, we can create multiple pages.
NB: people seem incapable of adding years to their dates ... and they aren't ... so they're just spaced out, right?
Please note that some services are monitored: MonitoringSetup
_Please use the following convention when editing the wiki: most recent additions go at the top of the list; use dating consistent with the other entries. (Add year if blank.)_
Fri Oct 31 10:15:23 PST 2003
- Sarai is back up, appears to have been a brownout at the colo, mail spool is being processed and sent out.
Thu Oct 30 23:45:00 PST 2003
- Colocation where lists server is located goes offline, other servers hosted there are also unavailable. Someone was contacted at 8am PST who is local and can investigate -- micah
Sat Oct 11 11:58:35 EDT 2003
- bad disk identified, copying data to a better disk, running integrity tests on the hardware.
Fri Oct 10, time unknown
- Database server connected to staughton had a massive disk failure, files have not been lost, just "orphaned", folks are working hard at getting things back up. All sites on staughton affected.
Fri Jul 11 19:30:00 UTC 2003
- Stallman back on line. Raid problem dealt with - although will need to deal with degraded state of raid soon.
Fri Jul 11 13:29:62 CEST 2003
- there is a replacement page for sites on stallman, hosted on ahimsa, at 220.127.116.11. The page is "/imc/custom/index.html"; waiting for someone with DNS access to point stallman.indymedia.org to 18.104.22.168
Fri Jul 11 00:00:27 PST 2003
- the scsi raid may have degraded; the scsi alarm was certainly on, and the system was unable to boot (kernel panic, unable to mount root fs on device ....). expected resolution time unknown, cross your fingers.
Tue Apr 1 00:14:27 PST 2003
- speakeasy colocation power turned off for testing purposes. we were not notified of this outage in time to properly shut down our boxes, and we hope everything comes back up gracefully when they are finished.... expected resolution time unknown, cross your fingers that it is short.
31 Oct, Fri, 2003 noon GMT
sarai is down
11:58 < zl2tod> it looks like there's a routing loop at 20 * e2.valor.qa.cortland.net
(22.214.171.124) 1107.261 ms !H *
12:03 < zl2tod> port 80 connects, but acts like a black hole
3 Oct, Fri, 2003 early morning GMT
until at least 6 Oct Mon, 2003 06:00 GMT
- lists.indymedia.org has been down more than 72 hours :((( . Was this a planned shutdown or a hardware/software crash? (Or is the paranoid hypothesis, authorities, correct?) Could any techie reading this who has an idea what's happening please at least say something?
26 Mar, Wed, 2003 02:35 UT
- note that one needs to remove the power cord from the machine in order to reboot it. i.e., pushing the power button is not sufficient to cause it to boot properly. Shut it down, pull the cord out, put the cord back in, then press the power button to turn it on. Anyway, we had to fsck it, but it was relatively painless as fscks go. It rebooted and mounted the drives cleanly.
25 Mar, Tue, 2003 noon PST
- asked NOC to reboot machine, but video is unavailable (eternally flaky video card) and the box seems to have stopped during boot. noc has offered to hook up a serial console. no clue yet why sarai went down again, as the graph of temperature, etc. at http://riseup.net/mrtg/env/ shows very consistent readings subsequent to installation of the new fan. we will make another trip out there this evening.
25 Mar, Tue, 2003 7:20am PST
- sarai not reachable by ping.
24 Mar, Mon, 2003 8:45pm PST
- group of philly folks went to NOC to install new fan. sarai is as temperamental as ever but it did seem to respond well once we got it to boot (!). mail delivery seemed to be cruising along
27 Feb, Thu, 2003 4:00pm PST
- rolando rec'd a new CPU fan for sarai; will install when shoji can get me into the data center
26 Feb, Wed, 2003 12:30pm PST
- sarai down again, was noticed around 8am, came back up approximately at 3pm
23 Feb, Sun, 2003 12:07 PST
- sarai is rebooted and back up
22 Feb, Sat, 2003 7:10pm PST
- received ticket number [datarealm #99394] for reboot request.
22 Feb, Sat, 2003 5:45pm PST
- sarai appears down, micah sent an email to firstname.lastname@example.org to see about getting it rebooted. At a prior meeting, three people were designated as having the ability to call to get it rebooted. We don't know who they are.
30 Mar, Sat, 21:26:51 EST 2002
- sarai is having problems. mail is being queued up but not delivered. The problem is with qrunner. Don't know how long it'll take to fix it.
29 Mar, Friday 17:28 GMT
- sarai is back online, and micah upgrading kernel. heat and/or ventilation may be issues in its instability; it will soon be moving to a 24/7 data center with easier reboot support.
29 Mar, Friday 0700 GMT
- from #tech sarai is down for the night (?). we've talked to shoji and he's going to get someone to reboot it in the morning. i (rolando) am (is) going to try to get in front of that to make sure that le rebooteur reads the screen and writes down what's on it before rebooting.
29 Mar, Fri 04:43:34 GMT 2002
- sarai is down again; witnessed a load spike and inodes in use above 0.75 of the maximum. will need to look into increasing the limit, as per http://lists.suse.com/archive/suse-security/2000-Aug/0307.html when system is up
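The entry above mentions inode usage above 0.75 of the maximum. It is not clear from the entry whether that figure refers to the kernel's inode table or to a filesystem's inodes; the sketch below assumes the latter and simply reports the fraction of inodes in use on a given filesystem. The function names and the choice of `/` as the default path are made up for illustration, and only the 0.75 threshold comes from the log.

```python
import os

# Threshold taken from the log entry above (fraction of the maximum in use).
INODE_WARN = 0.75


def inode_usage(path="/"):
    """Return the fraction of inodes in use on the filesystem holding `path`.

    Assumption: "inodes in use" means per-filesystem inodes as reported by
    statvfs. Returns None when the filesystem does not report inode counts.
    """
    st = os.statvfs(path)
    if st.f_files == 0:
        return None
    return (st.f_files - st.f_ffree) / st.f_files


def exceeds_threshold(usage, threshold=INODE_WARN):
    """True when a known usage fraction is above the warning threshold."""
    return usage is not None and usage > threshold


if __name__ == "__main__":
    usage = inode_usage("/")
    if usage is None:
        print("inode counts not reported for this filesystem")
    else:
        print("inode usage: %.2f, warn: %s" % (usage, exceeds_threshold(usage)))
```

Run from cron or a monitoring script, something like this would flag the condition before the box falls over, though it says nothing about the load spike itself.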
28 Mar, Thursday around midnight GMT
- sarai's web interface is offline
28 Mar, Thursday 16:40 GMT
- sarai is back on-line
27 Mar, Wednesday 17:15 GMT
- sarai not responding
25 Mar, Monday 10:30 GMT
- sarai rebooted. logs indicate it stopped working at 19:31 local time (? - 0:31 GMT 24 mar)
24 May Sunday 21:30 GMT
- no email is currently being delivered for the Indymedia domain. Archives of the email lists (list.indymedia.org) are not accessible either. Since it is the weekend, it is hard to get access to the server, so the problem might persist until Monday.
23 May, Thurs, 13:00 GMT
sarai not accessible, email not delivered - toni
05 Apr, Fri, 09:40 PST 17:40 UTC
- image posted to tech site, of temperature data from sarai. possible diurnal fluctuations: http://tech.indymedia.org/front.php3?article_id=389&group=webcast
- 25 Jan, Fri 10:00pm PDT ish - Ender was offline for approximately 14 hours due to complications at our ISP caused by the MS SQL Slammer worm.
Amazing ... this information dates from Saturday January 25, 2003 ... but could the writer include a complete date? Heck no. Why would a human being possibly think that /year/ matters ... we all live in the ''now'', right? (yaa yaa yaa, no reason to be sour ... year after year of dealing with individuals who shouldn't have passed Grade 3 existentialism makes me a happy camper ... inexplicably lame, again and again and again ... but you're here for the fun, so this is inappropriate. Pfffffft)
- 21 Jan, Tue 1:30 EST - Micah is working on Inglis Stallman connectivity, might be temporary connection problems.
- 13 Jun, Thurs 10:20 CDT ish - Stallman ssh connections failed and about 10 minutes later, the server stopped serving pages. Currently (11:30 CDT) it is still down and we cannot get a hold of anyone nearby.
- 25 Apr, Thurs 23:59:50 - pg going down.........postgres is shut down as all dbs are now on inglis; postal was also laid down
- 21 Apr, Sunday, 19:00 GMT (UTC) - [stallman] /www was full. stefani moved /www/uploads seattle to /var/uploads/seattle, and changed the symlink in seattle's local/webcast after running the remove-dups and remove-bad-thumbs scripts.
- 18 Apr, Thursday 13:30 GMT - [stallman] WEB SERVING PROBLEMS - yesterday we had problems with the interface that connects to inglis; that seems to be fixed. we also had problems with too few file descriptors, which squid was freaking out about; that was fixed too. however, apache is hitting a high number of processes and then stops serving. matze applied a temporary fix: http://lists.indymedia.org/mailman/public/imc-sysadmin/2002-April/001616.html
- 29 Mar, Friday 14:30 GMT - [stallman] On reboot the postgres taming script postal is not run; Micah ran it and load began to return to normal fluctuations. For some reason stallman rebooted, as evidenced by lastlog: reboot system boot 2.2.18 Thu Mar 28 21:24 (2+10:52)
- 29 Mar, Friday 13:20 GMT - [stallman] experiencing high load (20-30) for the past 2 hours, very difficult to access web pages (waiting time nearly one minute)
- 18 May 2002 - apparently /var filled up, and squid stopped responding. i freed up space in /var, even though a df does not show it. stopped and restarted squid, apache and postgres (the last on inglis), have tried to update the newswires, refresh, but no luck. some articles are visible if you know the id, but the summaries.inc file is
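The 21 Apr entry in the list above describes freeing a full partition by moving an uploads directory elsewhere and repointing a symlink at the new location. A minimal sketch of that kind of move is below. It is not the procedure that was actually run on stallman; the function name and the temporary-link naming are made up for illustration, and it assumes the link is already a symlink (or does not exist yet) on a POSIX system.

```python
import os
import shutil


def relocate_with_symlink(src, dst, link):
    """Move directory `src` to `dst`, then point the symlink `link` at `dst`.

    Hypothetical helper for illustration. `dst` is assumed not to exist yet.
    Returns the resolved path that `link` points to afterwards.
    """
    parent = os.path.dirname(dst)
    if parent:
        os.makedirs(parent, exist_ok=True)

    # shutil.move copies across filesystems when a plain rename is not
    # possible, which is the case when moving between partitions.
    shutil.move(src, dst)

    # Create the new symlink under a temporary name, then rename it over the
    # old one so the link path never goes missing in between.
    tmp = link + ".tmp"
    if os.path.lexists(tmp):
        os.remove(tmp)
    os.symlink(dst, tmp)
    os.replace(tmp, link)

    return os.path.realpath(link)
```

With the paths from the entry, the call would look roughly like `relocate_with_symlink(old_uploads_dir, "/var/uploads/seattle", symlink_path)`, where the first and last arguments are whatever the old directory and the symlink under local/webcast are actually called.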
21 Jan, Tue 1:30 EST
- Micah is working on Inglis Stallman connectivity, might be temporary connection problems.
28 May, Tues, 09:03:25 PDT 2002
Publishing has been broken since yesterday, put the old kernel on, but that killed the net connection. In process of getting this fixed.
27 May, Mon, 15:55:18 PDT 2002
Rebooted inglis to force the netcard to full duplex 100, right now it is not back up... hopefully soon
Doesn't look like it is coming up, headed down to speakeasy to fix. -- micah
24 Apr, Wed
- all dbs are migrated from stallman
21 May, Tues 09:45 GMT-8
- blackcat is back up. sorry for the inconvenience -gekked
21 May, Tues, 00:32 GMT-8
- blackcat has been down for the past few hours, and we are currently not able to reach the individuals with access to the physical location. box can be pinged but no services are running. we estimate being back online within the next 10 hours. - email@example.com
22 Apr, Monday Apr 22 19:00 UTC
- after being down for about 12 hours, blackcat is back up and all services should be back to normal. it took some time to get the box back up due to a miscommunication between me and the ISP, and due to a bad rc script. also, postgres seems to have had some problems over the weekend, so sites relying on that database may have seen some ill behaviour. i'll be looking into it and contacting those IMCs affected directly. - firstname.lastname@example.org
Kropotkin is the development server.
Kropotkin recently moved from Los Angeles to the SF Bay Area where it is awaiting a new colocation berth. Gentoo linux has been installed. A backup of the old (hacked) hard drive contents is being uploaded to Berkman. Contact email@example.com
Gauss hosts <contact.indymedia.org>
- 26 Jun, Sat - judi has been unreachable for a month or two, with intermittent problems before that.
- 9 Oct, Wed 8:20pm PDT ish - Judi's upstream provider seems to be having a problem. I'm on dsl just half a mile away and am connecting fine. I've tried calling over to the center but nobody is there. I had keys until they had to change the locks because of some inconsiderate earth firsters. I'm not sure what to do. Hopefully it'll come back online. -rabble
Berkman hosts <radio.indymedia.org>
- 01 Jun 2002