For months I've had the multilog rotating issue where /var/log/mail/send/current and /var/log/mail/smtp/current do not rotate correctly, hanging qmail, smtpd, rblsmtpd, etc.
Doing some further debugging of this issue, its possible to always cause it to hang during rotating when triggered with a kill -ALRM of the multilog process.
Doing a ktrace of this shows the following during the actual rotating:
Any thoughts/guidance?
14084 multilog CALL sigprocmask(0x2,0xbfbfeaa8,0)
14084 multilog RET sigprocmask 0
14084 multilog CALL sigprocmask(0x2,0xbfbfeaa4,0)
14084 multilog RET sigprocmask 0
14084 multilog CALL read(0,0x804ca80,0x400)
14084 multilog RET read -1 errno 4 Interrupted system call
14084 multilog PSIG SIGALRM caught handler=0x80499da mask=0x0 code=0x0
14084 multilog CALL sigreturn(0xbfbfe7b0)
14084 multilog RET sigreturn JUSTRETURN
14084 multilog CALL sigprocmask(0x1,0xbfbfea94,0)
14084 multilog RET sigprocmask 0
14084 multilog CALL sigprocmask(0x1,0xbfbfea90,0)
14084 multilog RET sigprocmask 0
14084 multilog CALL fchdir(0x4)
14084 multilog RET fchdir 0
14084 multilog CALL fsync(0x6)
14084 multilog RET fsync 0
14084 multilog CALL close(0x6)
14084 multilog RET close 0
14084 multilog CALL rename(0x804b3b6,0x804b32b)
14084 multilog NAMI "current"
14084 multilog NAMI "previous"
14084 multilog RET rename 0
14084 multilog CALL open(0x804b3b6,0x20d,0x180)
14084 multilog NAMI "current"
14084 multilog RET open 6
14084 multilog CALL fcntl(0x6,0x2,0x1)
14084 multilog RET fcntl 0
14084 multilog CALL fchmod(0x6,0x1a4)
14084 multilog RET fchmod 0
14084 multilog CALL chmod(0x804b32b,0x1e4)
14084 multilog NAMI "previous"
14084 multilog RET chmod 0
14084 multilog CALL fork
14084 multilog RET fork 14151/0x3747
14084 multilog CALL wait4(0x3747,0xbfbfeab0,0,0)
14084 multilog RET wait4 14151/0x3747
14084 multilog CALL write(0x2,0x804d2e0,0x43)
14084 multilog GIO fd 2 wrote 67 bytes
"multilog: warning: processor failed in /var/log/mail/smtp, pausing
"
14084 multilog RET write 67/0x43
14084 multilog CALL gettimeofday(0xbfbfea40,0)
14084 multilog RET gettimeofday 0
14084 multilog CALL gettimeofday(0xbfbfea40,0)
14084 multilog RET gettimeofday 0
14084 multilog CALL poll(0xbfbfea58,0,0x139b)
14084 multilog RET poll 0
Starting with a fresh /var/log/mail/smtp -- here are the files generated by this run:
root:envy|8:52:35am|/var/log/mail/smtp: ls -la
total 20
drwxr-xr-x 2 qmaill qnofiles 512 Mar 1 08:52 .
drwxr-xr-x 10 qmaill qnofiles 512 Mar 1 08:48 ..
-rw-r--r-- 1 qmaill qnofiles 0 Mar 1 08:52 current
-rw------- 1 root qnofiles 5042 Mar 1 08:52 ktrace.out
-rw------- 1 qmaill qnofiles 0 Mar 1 08:50 lock
-rw-r--r-- 1 qmaill qnofiles 0 Mar 1 08:52 newstate
-rwxr--r-- 1 qmaill qnofiles 161 Mar 1 08:51 previous
-rw-r--r-- 1 qmaill qnofiles 0 Mar 1 08:52 processed
-rwxr-xr-x 1 qmaill qnofiles 7946 Mar 1 08:48 smtplog
-rw-r--r-- 1 qmaill qnofiles 0 Mar 1 08:50 state
As can be seen -- the file sits in previous, and doesn't get to create a new timestamped file
when I kill -9 multilog then start services again, it looks like this:
root:envy|8:53:56am|/var/log/mail/smtp: ls -la
total 32
drwxr-xr-x 2 qmaill qnofiles 512 Mar 1 08:53 .
drwxr-xr-x 10 qmaill qnofiles 512 Mar 1 08:48 ..
-rwxr--r-- 1 qmaill qnofiles 161 Mar 1 08:51 @4000000045e6daf4184a1de4.u
-rw-r--r-- 1 qmaill qnofiles 0 Mar 1 08:52 @4000000045e6daf418db3284.u
-rw-r--r-- 1 qmaill qnofiles 411 Mar 1 08:53 current
-rw------- 1 root qnofiles 16102 Mar 1 08:53 ktrace.out
-rw------- 1 qmaill qnofiles 0 Mar 1 08:50 lock
-rwxr-xr-x 1 qmaill qnofiles 7946 Mar 1 08:48 smtplog
-rw-r--r-- 1 qmaill qnofiles 0 Mar 1 08:53 state
for reference:
root:envy|8:54:35am|/var/log/mail/smtp: uname -a
FreeBSD envy.logicx.net 6.2-STABLE FreeBSD 6.2-STABLE #6: Sun Feb 25 14:39:36 EST 2007 root@envy.logicx.net:/usr/obj/usr/src/sys/LNS-i386 i386
LogicX,
Did you ever resolve this problem?
This appears to be happening to me on a 5.4->6.2 upgrade which I've just upgraded all of the toaster components for.
I've re-installed daemontools, but that didn't seem to fix it.
I've removed the svscan log/run files and regenerated them with toaster_setup.pl -s qmailconf
When multilog gets confused, it stops smtp, rblsmptd, etc. until I kill -9 the multilog process.
Terribly aggravating since it seems to die every few minutes.
The shower is the ideal place for troubleshooting epiphany.
I remembered this morning that during the toaster upgrade process, I'd gotten a warning to change my toaster-watcher.conf permissions to 600, because the new file had something more open.
At the time, it seemed rational to also change the toaster.conf permissions. That was the problem.
When smtplog or sendlog or pop3log were being called, they would fail silently because they couldn't get to their config.
I changed the toaster.conf to 644 perms, and now it's merrily rotating through those programs.
I am experiencing this issue too.
However I change the mods to 644 and its still occuring.
ls -l /usr/local/etc/toaster*
-rw-r--r-- 1 root wheel 27826 Nov 14 11:39 /usr/local/etc/toaster-watcher.conf
-rw-r--r-- 1 root wheel 6766 Oct 11 15:36 /usr/local/etc/toaster.conf
see this email for a couple ideas.
http://mail.tnpi.net/toaster/?0::11137
Well opening up old wounds...
We recently had an outage on our toaster, and fsck had to fix things on /usr/local and /var
We know that /usr/local/var/couriersslcache was broken, but not what else was damaged.
We started getting the issue mentioned here, where multilog hangs hanging the server for port 25 logins, they just time out. This as people said before happens every 3-4 hours.
After reading the thread I had a look at perms and ownership on toaster-watcher.conf and it was
-rw-r----- 1 root 501 27401 Oct 29 11:27 /usr/local/etc/toaster-watcher.conf
so I changed it to 644 so everything could read it, but haven't fixed the group yet as I'm not sure what group this file should belong to, obviously 501 is a bit suspicious.
After reading the mail at http://mail.tnpi.net/toaster/?0::11137
I did a
diff /usr/local/sbin/maillogs /var/log/mail/smtp/smtplog
which returned no output. But just to be double sure I copied the /usr/local/sbin/maillogs over /var/log/mail/smtp/smtplog
This didn't seem to fix the problem.
I need a quick and dirty solution so I've cronned a script to kill the logging and restart the server every few hours. I'd obviously like to really fix it.
I've noticed a couple of processes...
qmaill 6635 0.0 0.2 5312 1640 ?? S 3:31PM 0:00.02 sh -c ./smtplog
qmaill 6636 0.0 0.3 9152 3264 ?? R 3:31PM 0:00.13 /usr/local/bin/perl ./smtplog
running from time to time, what are these meant to be doing and how do I test that it's ok?
Oh yeah and when I run ./smtplog
[root@vega /var/log/mail/smtp]# ./smtplog
^C
It sits there doing nowt until I kill it.
interestingly
[root@vega /var/log/mail/smtp]# diff /usr/local/bin/maillogs /usr/local/src/Mail-Toaster-5.09/bin/maillogs
1,4c1
< #!/usr/local/bin/perl
<
< eval 'exec /usr/local/bin/perl -S $0 ${1+"$@"}'
< if 0; # not running under some shell
---
> #!/usr/bin/perl
But that seem to suggest that the one in use a) uses the right path to perl and b) checks to see if it's running in some shell or no, which wouldn't seem to be an issue.
So does anyone have any ideas as to what I might be able to do to fix this issue.
thanks in advance
M
Obviously something is not quite right. The best way to determine what is manually running smtplog. If you just run ./smtplog, then it'll just sit there waiting to read something in STDIN. So, give it something and see what happens:
cat @NNNN* | ./smtplog
and then see what happens.