Archive for May 2nd, 2012

cron screwyness

While trying to diagnose a problem with my VMs (namely, why starting a fedora 16 based vm fails to bring up the network connection), I ran into a strange issue – on my dom0 and some of the domUs, logrotate hasn’t been running, leaving me with insanely long logfiles!

So I set out trying to understand where the problem came from. Both the f16-based dom0 and domU had old logs, but my f14 domU was still working. That pointed to an update failing. And because logrotate never ran, I’ve got the yum logs of what was installed.

logrotate by default is setup to append the date to the filename, so I knew when logrotate was last run – 20111127. So anything between 27th November and 4th December could be the culprit.  To start with, I looked at cron in particular, since that’s what supposed to run logrotate.

When I checked in my dom0, I got this:

[[email protected] log]# grep cron yum.log
 Dec 01 00:23:46 Updated: cronie-anacron-1.4.8-2.fc14.x86_64
 Dec 01 00:23:48 Updated: cronie-1.4.8-2.fc14.x86_64
 Dec 01 01:38:54 Updated: cronie-anacron-1.4.8-2.fc15.x86_64
 Dec 01 01:38:56 Updated: cronie-1.4.8-2.fc15.x86_64
 Dec 01 01:38:58 Updated: crontabs-1.11-2.20101115git.fc15.noarch
 Dec 01 03:05:41 Updated: cronie-anacron-1.4.8-10.fc16.x86_64
 Dec 01 03:05:43 Updated: cronie-1.4.8-10.fc16.x86_64

Which showed me that I upgraded my dom0 to f16 on December 1st. Which is kinda helpful – I did upgrade cronie and cronie-anacron. Also, crontabs was upgraded, but only in F15. Then I looked at my domU. Again, judging from the dates in /var/log, I was looking for something between November 7 – 13. And lo and behold, I found lines that looked exactly like those in my dom0:

[[email protected] log]# grep cron yum.log
Aug 20 15:20:05 Updated: cronie-1.4.8-2.fc14.x86_64
Aug 20 15:20:05 Updated: cronie-anacron-1.4.8-2.fc14.x86_64
Nov 07 18:33:50 Updated: cronie-anacron-1.4.8-2.fc15.x86_64
Nov 07 18:33:54 Updated: cronie-1.4.8-2.fc15.x86_64
Nov 07 18:33:54 Updated: crontabs-1.11-2.20101115git.fc15.noarch
Nov 07 20:50:42 Updated: cronie-anacron-1.4.8-10.fc16.x86_64
Nov 07 20:50:44 Updated: cronie-1.4.8-10.fc16.x86_64

Except the F14 update was done in August, before the problem started. So, hmm, maybe the upgrade to Fedora 15 broke it. It would certainly explain why the F14 domU is still working fine. And I just happened to have an F15 domU that I had yet to upgrade to F16.

But it was strange – when I checked it, logrotate was working fine. /var/log had stuff timestamped 20120429, so the upgrade to F15 couldn’t have been the problem. logrotate kept an old version of yum.log, so I was able to look for cron related entries:

[[email protected] log]# grep cron yum.log*
yum.log:Apr 19 13:16:38 Updated: cronie-anacron-1.4.8-4.fc15.x86_64
yum.log:Apr 19 13:16:40 Updated: cronie-1.4.8-4.fc15.x86_64
yum.log-20120101:Sep 12 18:54:29 Updated: cronie-anacron-1.4.8-2.fc15.x86_64
yum.log-20120101:Sep 12 18:54:30 Updated: cronie-1.4.8-2.fc15.x86_64

Which wasn’t very helpful – I was using virtually the exact same versions that I had upgraded my F16 systems to – 1.4.2. So it wasn’t the F14 -> F15 upgrade. So maybe it was the F15 -> F16 upgrade?

I puzzled over this, then I saw a log named ‘cron’. Which turned out to really be key:

[[email protected] log]# tail -n 5 cron
Dec  1 01:39:01 elemental crond[2011]: (*system*) RELOAD (/etc/cron.d/0hourly)
Dec  1 01:41:01 elemental crond[2011]: (*system*) RELOAD (/etc/cron.d/smolt)
Dec  1 01:43:01 elemental crond[2011]: (*system*) RELOAD (/etc/cron.d/sysstat)
May  1 23:49:29 elemental crontab[9780]: (root) BEGIN EDIT (root)
May  1 23:49:41 elemental crontab[9780]: (root) END EDIT (root)

The crontab made sense – I run crontab -e to see if logrotate was in root’s crontab (which it wasn’t), but the crond entries were interesting – for one, it showed me the exact time it stopped, but also told me that cron has a daemon. And guess what was part of the F15 -> F16 upgrade? Moving stuff to the new systemd.

So I quickly checked crond’s status on a F16 machine:

[[email protected] log]# systemctl status crond.service
crond.service - Command Scheduler
          Loaded: loaded (/lib/systemd/system/crond.service; disabled)
          Active: inactive (dead)
          CGroup: name=systemd:/system/crond.service

Which, oops, that looks bad. But the F15 machine looked a lot better:

[[email protected] log]# systemctl status crond.service
crond.service - Command Scheduler
          Loaded: loaded (/lib/systemd/system/crond.service)
          Active: active (running) since Mon, 30 Apr 2012 14:16:01 +0800; 1 day and 10h ago
        Main PID: 831 (crond)
          CGroup: name=systemd:/system/crond.service
                  â 831 /usr/sbin/crond -n

And finally the F14 machine looked like this:

[[email protected] init.d]# service crond status
crond (pid  1118) is running...

So it seems the fix is to enable crond. I have no clue why crond would be disabled between F15 -> F16 upgrades, but I’m seeing it on two different F16 systems, both of which were upgraded from F15. I might do a clean install of F16 and look, as well as try an F15 -> F16 upgrade, but that’s for later.

For now, I’m doing a systemctl enable crond.service followed by systemctl start crond.service on the three F16 systems and calling it a day.

, ,

No Comments