September 27, 2003

Egg everywhere

The current Recommind engine, when it has as much data in memory as this one has, has a little problem with garbage collection. The simple model, garbage collecting frequently, essentially creates tiny outages every minute or so. Last night we restarted the engine in order to switch over to garbage collecting once a night. Recommind's new version will obviate the problem, but for the next few weeks we're choosing the overnight outage over the frequent micro-outages.

From top: Memory: 24G real, 370M free, 28G swap in use, 40G swap free

So, we also ran a backup of the DB2 database last night. The previous backup, it turns out, was significantly smaller. That may be because we haven't deleted some of the test tables we made for determining whether clustered indexing would improve performance. I doubt we've had *that* many folks register.

I am aware of this size differential because last night's backup failed, hanging the system from 3 am to 5:45 am. That's when my spouse tapped me and said, "Did you get up when your 3 am alarm went off?" The system then hung for another handful of minutes as I tracked down the phone number of the DBA. He made more space and began the backup all over again. I would have liked to postpone until midnight tonight, but apparently we would have had to block all registrations all day at best. The full backup took about three hours and created two parts, each 199.98 GB in size.

So we're back up now.

I spent the time looking at the resolved addresses of the denied parties. We had sent out a message to all the RLG Member representatives yesterday at 4 pm. I had a nice sense of how global the membership is as I watched denials from Oxford to Hawaii.

If anyone sees Murphy, please tell him to throw the book at someone else for a while.

Posted by judielaine at September 27, 2003 11:11 AM