Making Linux systems that don't suck. Part II
Introduction
Welcome to part II of this series. Here is Part I and here is Part 0.
Today we will talk about scheduling unattended tasks.
The classical way to do this in Unix is using the at and cron systems.
I say systems because they are not simple commands: each one is a whole set of commands and daemons. And there are two of them.
On one hand, you have at, atq, atrm, batch and atd.
On the other, there's crond and crontab.
So, all things considered, there are 7 commands you need to consider to manage unattended tasks [1].
While one of the pillars of Unix is that each command should do one thing and do it well, we have here one of the most peculiar results. You have seven commands to do one or two things, and they do them rather badly.
What's wrong with them?
The artificial division of work.
The usual reason for having two command sets is:
At is meant for one-shot, non-recurring tasks.
Cron is meant for repetitive tasks.
Let's see that again.
Do you use any kind of software to schedule your appointments?
Do you use two programs, one for things that repeat, one for those that don't?
Wouldn't that usage strike you as slightly nuts?
There is absolutely no reason why you should have two sets of programs with completely different workflows, syntaxes, CLIs, spools and whatever else for this. It's trivial to extend either one to do the job of the other.
But it's not done because...
This is old unmaintained code
Look at the Debian package page for cron and for at. These are the most common (but not the only) implementations used on Linux distros.
First unusual thing: they have no webpage. You know why? Because their last release is from when webpages were not all that common.
In the cron sources, the newest file is dated 08/06/96.
In at, it's dated from last year, but that's a bit deceiving: look at the at changelog:
A minor update in 2006.
A larger patch in 2005.
A critical patch in 2002.
Yes, this piece of software sees maintenance every two years or so. And it's not because it's finished and perfect, it's fixing things like "allow usernames longer than 8 characters" in 2005!
Did I mention that...
at/atq/atrm are SUID root?
crond runs as root?
Yup. 1996 code.
Which is why these programs....
Suck really, really bad for many uses.
First, cron (keep in mind that this is just an example, there are dozens of things cron can't do right).
Here's how you run a cron job on the last day of each month (a common business use case):
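0 3 * * * if [ "`date -d tomorrow +\%d`" = 01 ]; then run_the_script; fi
Note the backslash: in a crontab a bare % means "newline", so it has to be escaped. Also, date prints the day zero-padded (01, not 1). Of course that may work only on Linux, check this discussion for more fun.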
That's also the answer to "How do you run a cron job every X minutes" when X is not a divisor of 60. Or every Y hours, when Y is not a divisor of 24: run it every day/hour/minute and make it bail out when it shouldn't run.
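The "run it often and bail out" trick can be wrapped in small guard functions. What follows is only a sketch: the function names are made up for this example, it assumes GNU date (for -d and +%s), and the every-N-minutes guard counts minutes since the Unix epoch, which is just one possible way to define "every N minutes".

```shell
# Hypothetical guard functions, assuming GNU date.

# Succeeds when the given date (default: today) is the last day of its month,
# i.e. when the following day is day 01.
is_last_day_of_month() {
    day=${1:-today}
    [ "$(date -d "$day +1 day" +%d)" = 01 ]
}

# Succeeds when the number of whole minutes since the epoch is a multiple of $1.
# $2 is a timestamp in epoch seconds (default: now).
due_every_n_minutes() {
    interval=$1
    now=${2:-$(date +%s)}
    [ $(( (now / 60) % interval )) -eq 0 ]
}

# Possible crontab usage (the path is an assumption):
#   * * * * * . /usr/local/lib/cron-guards.sh && due_every_n_minutes 7 && run_the_script
```

Counting from the epoch means a 7-minute job keeps its rhythm across hour boundaries, instead of restarting at minute 0 every hour.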
Now think about how to make it run on the last Monday of each month. Come on, I'll give you ten minutes to figure it out.
Now, let's consider at.
Here are some examples of time specification syntax that work with at:
teatime + 4 hours
tomorrow
now + 6 hours
now + 23 minutes
10am Jul 31
And here are some that don't work:
now + 6 hours 23 minutes
now + 6 hours + 23 minutes
Jul 31 10am
What on earth is that syntax? What's wrong with simply specifying a date and time / a time from now?
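Since at takes a single increment, one workaround for "6 hours 23 minutes from now" is to fold everything into minutes first. A minimal sketch; at_offset is a name invented for this example:

```shell
# Hypothetical helper: turn HOURS and MINUTES into a single-unit offset
# of the form at does accept ("now + N minutes").
at_offset() {
    hours=$1
    minutes=$2
    echo "now + $(( hours * 60 + minutes )) minutes"
}

# Possible usage (needs a running atd); the unquoted expansion is deliberate,
# so that at receives the words as separate arguments:
#   echo 'ls /tmp' | at $(at_offset 6 23)
```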
By the way, here's what the at man page tells you to read to know the way to specify a time. Please, please check it out. It's a yacc grammar. Yes. You are supposed to understand yacc grammars to figure out at.
Now, assume I scheduled a job (just "ls /tmp"), and want to change it. Here's what I get to work with (not for the weary):
#!/bin/sh
# atrun uid=500 gid=100
# mail ralsina 0
umask 22
MANPATH=/usr/man:/usr/X11R6/man:/opt/java/man:/opt/java/jre/man:/opt/kde/man:/opt/plan9/man:/opt/qt/man:/opt/qt4/man; export MANPATH
:
:
ANOTHER 50 lines of environment variables
:
:
cd /mnt/centos/home/ralsina || {
         echo 'Execution directory inaccessible' >&2
         exit 1
}
ls /tmp
The idea seems to be running the job in conditions as close as possible to those at the moment you scheduled it. Although that's already arguably wrong (I prefer my scheduled tasks to run in a controlled environment, not in whatever mess my xterm is!), I am pretty sure there is a less messy way to do this. Like an "edit environment" switch separate from "edit the job".
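For what it's worth, a "controlled environment" is easy to get by hand with env -i, which starts the command with an empty environment plus whatever you list explicitly. A small sketch; run_clean and the chosen variables are just assumptions for the example:

```shell
# Hypothetical wrapper: run a command with a minimal, predictable environment
# instead of whatever the calling shell happens to have exported.
run_clean() {
    env -i PATH=/usr/bin:/bin HOME="${HOME:-/}" LANG=C "$@"
}

# Possible usage when scheduling a job:
#   echo 'run_clean /path/to/job' | at now + 5 minutes
```

This doesn't stop at from recording your whole environment in the job file, it only keeps the job itself from depending on it.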
So, if cron and at suck, what should you use? Well there are some...
Alternative implementations
If you are going to keep on using the cron/at system, please don't use those. Investigate alternatives, here are a few pointers.
One conservative alternative for cron is bcron [2]:
It has been worked on this century.
It's designed to be secure (no suid binaries).
Has a maintainer.
Seems to be fully compatible with Vixie Cron (what you probably use now)
On the other hand, it adds no new features, and cron really needs some.
There is also fcron:
Actively maintained
Works well if your system is not up 24/7 (no need for anacron or other ugliness)
Many nice and useful new features.
Bruce Guenter (author of bcron) claims fcron has some issues like lack of /etc/cron.d and /etc/crontab and some parsing incompatibilities with Vixie cron, but I have not verified it, and they look like you can work around them.
There are other crons:
- Dillon's cron, last updated in 2005.
- Based on Vixie cron, last updated in 2002.
- An interesting cron aiming to be small, secure and not rely on mail for notifications. Last updated in 2005.
- Written in Guile, allows for alternative config files in scheme. Interesting but a bit exotic for my taste. Last updated in 2006.
I am sure I am missing a few more.
At replacements:
- Regular at, without the too-smart syntax. There seems to be no support for running commands as different users unless each runs a copy of the daemon, so I don't recommend this.
That's it. I can't find a decent implementation of at [3]. Sucks, doesn't it?
But maybe you don't need to use these because there are...
Incompatible Alternatives
If you are willing to be more daring, you can consider alternative job scheduling systems.
Here are some I have found:
-
Advantages:
You can schedule repetitive tasks starting after a certain date/time
You can schedule non-repetitive tasks as well as repetitive.
Says in the docs "Unix cron often needs a separated at daemon to execute one-time-jobs. This is nothing more than a design problem in cron." which means the author is not braindead.
Secure: no SUID crap.
Disadvantages:
Can't fully emulate cron, even ignoring syntax
Runs a copy of the daemon for each user who needs it (but it's small).
Last updated in 2004. Could mean it's stable, could mean it's abandoned.
- Maybe someday, the cron/at replacement feature is planned, at least.
- Intriguing but I can't find enough information (or a Linux port)
- Several batch scheduling systems. Oriented generally towards replacing at and running tasks with controlled environments.
This will not be an easy ride. Since your distro is currently planned around at and cron, when you install software it will also install cron jobs. If you switch completely to a non-compatible system, then many tasks will simply not be executed, and your system will rot.
And really, there are a bazillion packagers and software writers and distro managers and getting them all to abandon cron/at is not going to happen soon.
So, maybe we could have...
A reasonable replacement (a proposal)
Write a reasonable scheduler (something like uschedule) and add compatibility layers.
- Create a daemon. One that doesn't run as root (like bcron's or uschedule's). That daemon will take care of running the tasks.
- Add an interface, one that makes sense, like a DB of scheduled tasks, with a 21st century syntax and capabilities (i.e. not cron's). The interfaces mentioned below are simply translated into this. This is the preferred interface. The others are just compatibility layers. Say that a lot. Give the admins nice web-based, GUI-based and CLI-based interfaces to this.
- Add to that daemon a Vixie-cron compatible interface (i.e. crontab command, /etc/crontab, /etc/cron.d). Again, bcron already has this, uschedule doesn't (sadly). Maybe hack something out of both?
- Implement an at command that schedules tasks against that daemon. Doing this for uschedule should be pretty simple, especially if you ignore most of the crazy at syntax.
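To show how little separates "at jobs" from "cron jobs" once there is a single task table, here is a toy sketch. The table format and the function names are invented for this example, and it is nowhere near a real daemon (no locking, no users, no calendar rules): each line holds the next run time, a repeat interval in seconds, and the command, with interval 0 meaning "one-shot".

```shell
# Table format (an assumption for this sketch), one task per line:
#   NEXT_RUN_EPOCH INTERVAL_SECONDS COMMAND...
# INTERVAL_SECONDS = 0 is a one-shot task (at's job); anything else repeats (cron's job).

# Print the commands that are due at time $1, reading the table from stdin.
due_commands() {
    now=$1
    while read -r next interval cmd; do
        [ -n "$cmd" ] || continue
        if [ "$next" -le "$now" ]; then
            printf '%s\n' "$cmd"
        fi
    done
    return 0
}

# Print the table as it should look after running everything due at time $1:
# one-shot tasks are dropped, repeating ones move forward past $1.
advance_table() {
    now=$1
    while read -r next interval cmd; do
        [ -n "$cmd" ] || continue
        if [ "$next" -gt "$now" ]; then
            printf '%s %s %s\n' "$next" "$interval" "$cmd"
        elif [ "$interval" -gt 0 ]; then
            while [ "$next" -le "$now" ]; do
                next=$(( next + interval ))
            done
            printf '%s %s %s\n' "$next" "$interval" "$cmd"
        fi
    done
    return 0
}
```

A crontab-compatible front end and an at-compatible front end would then only differ in how they write lines into that table.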
It should not be terribly hard to do. But we won't see it anytime soon.
Roberto: Very good article. It keeps impressing on me the idea of "don't use what everyone uses just because; ask yourself whether they are the best tools", which you already instilled in me when I attended your talk at CaFeConf, and that is a good thing.
One thing to criticize. Maybe others find it convenient, but as a faithful reader of your blog, I find the "snapshots" window that pops up when hovering the mouse over links extremely annoying.
Regards, and thanks for continuing to share your knowledge.
I think I'm going to remove it. Thanks for the comment!
Thanks for your great article!
I am working on internet cafe managing software for Linux. At first I tried to use the standard tools like cron and at as much as possible to perform scheduled logouts, messaging etc. I've encountered a lot of problems using cron and at. Especially the 'feature' that grabs the environment settings and tries to mimic those circumstances when the scheduled commands actually execute.
I've ended up with a home brewed solution where all the events are stored in a mysql database. There's one script that (initiated by cron) executes every minute to query the database for any scheduled tasks where execution time < now(). With that system I have control over the environment variables no matter who puts new tasks in the database.
There is a need for a "free software" alternative to the bigger guys' systems; anyone accustomed to mainframe systems will have *wildly* higher expectations than the very primitive things that cron offers.
Notably, things like
- Logging results ("in email" doesn't count)
- Ability to manage jobs across multiple hosts so that I don't have to log onto 50 servers to look at the crontab on each one.
- Some notion of dependencies; after completing job A, run job B, then job C, and so forth
- Some degree of logic; if job A runs OK, run B next, otherwise run C next
- Some awareness of system state so that if I'm doing maintenance on a host, jobs will get deferred
- Ability to write scripts that manipulate the queues
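A couple of these items (logging results, and "if A runs OK, run B, otherwise run C") can at least be approximated with a small wrapper today. A rough sketch only; logged and branch are hypothetical names, and in this simplified form each job has to be a single command name without arguments:

```shell
# Hypothetical helper: run a command, appending its output and its exit
# status to a log file, and pass the exit status through.
logged() {
    logfile=$1
    shift
    "$@" >>"$logfile" 2>&1
    rc=$?
    printf '%s exit=%s\n' "$*" "$rc" >>"$logfile"
    return "$rc"
}

# Usage: branch LOGFILE JOB_A JOB_B JOB_C
# Runs JOB_A; if it succeeds runs JOB_B, otherwise runs JOB_C.
branch() {
    if logged "$1" "$2"; then
        logged "$1" "$3"
    else
        logged "$1" "$4"
    fi
}
```

It does nothing for the cross-host or deferral items, which really do need a proper scheduler.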
For doing cross-host work, it only makes sense to use a proper database system, as opposed to hacking something up using hash tables or such. The OSS database offering proper temporal types and such is PostgreSQL, of course.
Doing something on the last Monday of the month isn't too hard (though it did take me about 10 minutes to get to this version of it). Check today's month against a week from today. When they're different, it's time to run.
0 3 * * * [ $(date +%m) -ne $(date -d '+7 days' +%m) ] && run_the_script
Pete: almost there!
It should be 0 3 * * 1
DOH! You are completely right, although I really had meant to put that in. I sometimes type three splats without even thinking about it... Which reminds me what a good thing peer review is. :)
Thanks for catching that. It's what I get for being so smug after asking a co-worker how he'd do the same thing. :) :) :)
Best regards!
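Putting the two comments above together, the "last Monday" check can be written as a function that can be tried against arbitrary dates. A sketch assuming GNU date; is_last_monday is a name made up for this example:

```shell
# Succeeds when the given date (default: today) is a Monday and the same
# weekday one week later falls in a different month.
is_last_monday() {
    day=${1:-today}
    [ "$(date -d "$day" +%u)" = 1 ] &&
        [ "$(date -d "$day" +%m)" != "$(date -d "$day +7 days" +%m)" ]
}

# Equivalent crontab entry, with the day-of-week field set to Monday.
# In a crontab a bare % means "newline", so each % is escaped:
#   0 3 * * 1 [ "$(date +\%m)" != "$(date -d '+7 days' +\%m)" ] && run_the_script
```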
The reason I hate cron is that when a cron job starts up, the cron job's environment is NOT the same as your own.
In particular, your startup file isn't source'd, so the PATH isn't the same. The reason why this is so terrible is that "non-standard" programs like ruby and perl sometimes get put into /usr/bin or /usr/local/bin or ...
which means if you ever move your ruby or perl scripts, you're constantly tweaking the paths.
Another thing to hate:
For reasons I don't understand, SOMETIMES errors get sent to the email of the person running the code and sometimes they don't.
The third thing I hate is that if the command line is TOO long, cron silently fails.