vmysql: sql error[8]: MySQL server has gone away

Started by xerion, April 06, 2005, 01:22:34 PM

Previous topic - Next topic

xerion

First, I understand this is not a Toaster issue, but a vpopmail issue.  I am posting here in hopes someone has run into this and has some insight.

Since upgrading to vpopmail 5.4.6_1, I get the following email about once per day (always exactly the same):
Subject: Cron <root@mail> /usr/local/vpopmail/bin/clearopensmtpvmysql: error creating table 'relay': MySQL server has gone awayvmysql: sql error[8]: MySQL server has gone away

Repeated runs (100+) of clearopensmtp do not reproduce this message.

A quick review of the vauth.c/vmysql.c shows that error[8] occurs in the vupdate_rules function, namely when the "select ip_addr from relay" SQL statement fails.

I can conclude the issue is not with the select statement - if it were, the problem would not be as sporadic.  Also, repeated (100+) runs of this query with the mysql client do not generate this error.  As this is an extremely low-volume mail server, there are typically only 1-5 records in the relay table at any given time.

I am fairly certain I see what is occurring - vpopmail maintains persistent connections to the MySQL server - by default, MySQL closes inactive connections after 8 hours.  I believe this may be the cause for this behavior.  Unfortunately I cannot increase/disable the wait_time to test this theory, as I use the MySQL server for other purposes as well.  What really baffles me is why this was not an issue prior to 5.4.6_1.  I suppose I could  browse the CVS repository and see what changed recently, but I'm lazy. Very Happy

Obviously the message is not critical, it is simply a minor nuisance, which I can live with if need-be.

Has anyone else encountered this?  Any insight would be much appreciated.

xerion

(If I'm posting these to the wrong forum, I apologize - please move them)

This problem has now gotten MUCH worse.  I seem to recall having this issue with a previous version of vpopmail, but it was fixed!

Two subsequent delivery attempts to a local address failed with chkusr saying the address was unknown.  The mail log shows the error in the subject twice.  The delivery attempts were 33 seconds apart.

Reviewing vauth.c/vmysql.c I can see this error is occurring in valias_select, with the following SQL statement:
select valias_line from valias where alias = '%s' and domain = '%s'

The obvious side-effect is that deliveries are now being intermittently rejected with a permanent error.  At the present time, unless I can find a fix, the only work-around I can think of is to rebuild qmail-smtpd without the chkusr patch.

Suggestions?  Anyone?

LogicX

tail your mysql logs
tail the mysql error log
view your mysql status using mytop
verify your aren't exceeding your connection limits
if so, raise them in my.cnf or reduce your usage
--- May this post be indexed by spiders, and archived for all to see as my internet epitaph.
http://fpux.com" target="_blank">http://fpux.com

xerion

LogicX, thank you for your reply.

All the MySQL log shows is as follows:
050316 16:53:47  mysqld started
050316 16:53:48  InnoDB: Started; log sequence number 0 368950522
/usr/local/libexec/mysqld: ready for connections.
Version: '4.1.10a-log'  socket: '/tmp/mysql.sock'  port: 3306  FreeBSD port: mysql-server-4.1.10a

max_connections is is 500, of which typically 20-40 are in use.
mysql> show processlist;+--------+-------------+-----------+----------+-------------+---------+----------------------------------------------------------------+------------------+| Id     | User        | Host      | db       | Command     | Time    | State                                                          | Info             |+--------+-------------+-----------+----------+-------------+---------+----------------------------------------------------------------+------------------+|  87330 | vpopmail    | localhost | vpopmail | Sleep       |     343 |                                                                | NULL             ||  87331 | vpopmail    | localhost | vpopmail | Sleep       |     252 |                                                                | NULL             ||  89599 | replication | XXX:1211  | NULL     | Binlog Dump | 1142384 | Has sent all binlog to slave; waiting for binlog to be updated | NULL             || 109754 | vpopmail    | localhost | vpopmail | Sleep       |     327 |                                                                | NULL             || 109756 | vpopmail    | localhost | vpopmail | Sleep       |     967 |                                                                | NULL             || 109757 | vpopmail    | localhost | vpopmail | Sleep       |     371 |                                                                | NULL             || 109758 | vpopmail    | localhost | vpopmail | Sleep       |     353 |                                                                | NULL             || 109760 | vpopmail    | localhost | vpopmail | Sleep       |     393 |                                                                | NULL             || 109761 | vpopmail    | localhost | vpopmail | Sleep       |     666 |                                                                | NULL             || 109766 | vpopmail    | localhost | vpopmail | Sleep       |     325 |                                                                | NULL             || 109767 | vpopmail    | localhost | vpopmail | Sleep       |     666 |                                                                | NULL             || 181973 | wimgam      | XXX:50977 | wimg     | Sleep       |   15933 |                                                                | NULL             || 227191 | wimgam      | XXX:2938  | wimg     | Sleep       |    3980 |                                                                | NULL             || 236871 | wimgam      | XXX:2645  | wimg     | Sleep       |     998 |                                                                | NULL             || 237070 | wimgam      | XXX:1088  | wimg     | Sleep       |    1220 |                                                                | NULL             || 237644 | wimgam      | XXX:2218  | wimg     | Sleep       |    4709 |                                                                | NULL             || 238088 | wimgam      | XXX:2254  | wimg     | Sleep       |      10 |                                                                | NULL             || 238175 | wimgam      | XXX:1029  | wimg     | Sleep       |     267 |                                                                | NULL             || 238816 | wimgam      | XXX:4441  | wimg     | Sleep       |    1753 |                                                                | NULL             || 239411 | xerion      | XXX:4539  | NULL     | Sleep       |      75 |                                                                | NULL             || 241553 | syslog      | localhost | syslog   | Sleep       |    1148 |                                                                | NULL             || 241718 | xerion      | XXX:1675  | NULL     | Sleep       |       1 |                                                                | NULL             || 241727 | xerion      | localhost | dummy    | Query       |       0 | NULL                                                           | show processlist |+--------+-------------+-----------+----------+-------------+---------+----------------------------------------------------------------+------------------+23 rows in set (0.00 sec)


Anything else I should check?

LogicX

yes, turn on mysql logging
log=/path/to/file.log (in my.cnf)
and watch as the process connects, see what errors occur.
also use mysqladmin/mysqlcheck and verify that the database(s) are fine -- do a check, do an analyze/optimize
do various levels of checks

I enjoy having this in my my.cnf:
#this does an auto-check on tables during startup and repairs them
myisam-recover=BACKUP,FORCE

YMMV
--- May this post be indexed by spiders, and archived for all to see as my internet epitaph.
http://fpux.com" target="_blank">http://fpux.com

xerion

The vpopmail database tables are all in MyISAM format.
mysql> check table dir_control, lastauth, relay, valias, vpopmail;+----------------------+-------+----------+----------+| Table                | Op    | Msg_type | Msg_text |+----------------------+-------+----------+----------+| vpopmail.dir_control | check | status   | OK       || vpopmail.lastauth    | check | status   | OK       || vpopmail.relay       | check | status   | OK       || vpopmail.valias      | check | status   | OK       || vpopmail.vpopmail    | check | status   | OK       |+----------------------+-------+----------+----------+5 rows in set (0.09 sec)


I have enabled mysql logging.  An interesting side-effect of restarting MySQL is that I no longer see idle connections from vpopmail.  I will monitor the server and if the problem arises again, I will try to isolate the log events from that query.  Sample log entry:
050406 18:31:49       2 Connect     vpopmail@localhost on                      2 Init DB     vpopmail                      2 Query       select pw_name, pw_passwd, pw_uid, pw_gid, pw_gecos, pw_dir, pw_shell , pw_clear_passwd from vpopmail where pw_name = "fiery" and pw_domain = "domain.net"                      2 Query       replace into lastauth set user='fiery', domain='domain.net', remote_ip='192.168.2.75', timestamp=1112837509                      2 Query       replace into relay ( ip_addr, timestamp ) values ( '192.168.2.75', 1112837509 )                      2 Quit


Thanks for your time.  I'll keep you posted.

xerion

Okay I'm not sure if this provides any insight, but I find it curious.

Notice in my previous post that vpopmail connected on MySQL thead #2 and immediately disconnected when it was done.  Compare that to below, where thread #248 was never closed, and several queries have been issued on it.  The querylog shows this behavior on several threads -- not just #248.

050406 19:29:48     248 Connect     vpopmail@localhost on vpopmail                    248 Query       select pw_name, pw_passwd, pw_uid, pw_gid, pw_gecos, pw_dir, pw_shell , pw_clear_passwd from vpopmail where pw_name = "user1" and pw_domain = "domain.net"                    248 Query       replace into lastauth set user='user1', domain='domain.net', remote_ip='imap', timestamp=1112840988050406 19:52:07     248 Query       select pw_name, pw_passwd, pw_uid, pw_gid, pw_gecos, pw_dir, pw_shell , pw_clear_passwd from vpopmail where pw_name = "user2" and pw_domain = "domain.net"                    248 Query       replace into lastauth set user='user2', domain='domain.net', remote_ip='imap', timestamp=1112842327050406 20:39:57     248 Query       select pw_name, pw_passwd, pw_uid, pw_gid, pw_gecos, pw_dir, pw_shell , pw_clear_passwd from vpopmail where pw_name = "user1" and pw_domain = "domain.net"                    248 Query       replace into lastauth set user='user1', domain='domain.net', remote_ip='imap', timestamp=1112845197050406 20:43:19     248 Query       select pw_name, pw_passwd, pw_uid, pw_gid, pw_gecos, pw_dir, pw_shell , pw_clear_passwd from vpopmail where pw_name = "user3" and pw_domain = "domain.net"                    248 Query       replace into lastauth set user='user3', domain='domain.net', remote_ip='imap', timestamp=1112845399050406 21:56:14     248 Query       select pw_name, pw_passwd, pw_uid, pw_gid, pw_gecos, pw_dir, pw_shell , pw_clear_passwd from vpopmail where pw_name = "user4" and pw_domain = "domain.net"                    248 Query       replace into lastauth set user='user4', domain='domain.net', remote_ip='imap', timestamp=1112849774050406 22:00:15     248 Query       select pw_name, pw_passwd, pw_uid, pw_gid, pw_gecos, pw_dir, pw_shell , pw_clear_passwd from vpopmail where pw_name = "user4" and pw_domain = "domain.net"                    248 Query       replace into lastauth set user='user4', domain='domain.net', remote_ip='imap', timestamp=1112850015                    248 Query       select pw_name, pw_passwd, pw_uid, pw_gid, pw_gecos, pw_dir, pw_shell , pw_clear_passwd from vpopmail where pw_name = "postmaster" and pw_domain = "domain.net"                    248 Query       replace into lastauth set user='postmaster', domain='domain.net', remote_ip='imap', timestamp=1112850357


Again, show processlist is showing many idle connections from vpopmail.  After restarting MySQL, I checked my mail via POP3, IMAP and sent myself mail from gmail.  After each process, vpopmail closed its connection.  I am at a loss as to why it has stopped doing that:
mysql> show processlist;+------+-------------+-----------+----------+----------------+-------+----------------------------------------------------------------+------------------+| Id   | User        | Host      | db       | Command        | Time  | State                                                          | Info             |+------+-------------+-----------+----------+----------------+-------+----------------------------------------------------------------+------------------+|    5 | replication | XXX:1055  | NULL     | Binlog Dump    | 11379 | Has sent all binlog to slave; waiting for binlog to be updated | NULL             ||   28 | vpopmail    | localhost | vpopmail | Sleep          |   259 |                                                                | NULL             ||  162 | vpopmail    | localhost | vpopmail | Sleep          |   259 |                                                                | NULL             ||  248 | vpopmail    | localhost | vpopmail | Sleep          |   259 |                                                                | NULL             ||  327 | vpopmail    | localhost | vpopmail | Sleep          |   259 |                                                                | NULL             ||  406 | vpopmail    | localhost | vpopmail | Sleep          |   236 |                                                                | NULL             ||  427 | vpopmail    | localhost | vpopmail | Sleep          |    12 |                                                                | NULL             ||  428 | vpopmail    | localhost | vpopmail | Sleep          |   259 |                                                                | NULL             ||  429 | vpopmail    | localhost | vpopmail | Sleep          |   259 |                                                                | NULL             ||  431 | vpopmail    | localhost | vpopmail | Sleep          |   259 |                                                                | NULL             ||  432 | vpopmail    | localhost | vpopmail | Sleep          |   259 |                                                                | NULL             ||  494 | syslog      | localhost | syslog   | Sleep          |   171 |                                                                | NULL             || 1506 | xerion      | XXX:2603  | NULL     | Sleep          |   199 |                                                                | NULL             || 1515 | DELAYED     |           | syslog   | Delayed insert |   171 | Waiting for INSERT                                             |                  || 1539 | xerion      | localhost | dummy    | Query          |     0 | NULL                                                           | show processlist |+------+-------------+-----------+----------+----------------+-------+----------------------------------------------------------------+------------------+15 rows in set (0.00 sec)

Does anyone else see vpopmail maintaining persistent connections to MySQL on their systems?

xerion

Okay one more tidbit of info... restarting courier-authdaemon clears the vpopmail connections, so obivously it's Courier-IMAP maintaining the persistent MySQL connections.  Perhaps I'm looking in the wrong direction on this one - certainly Courier has nothing to do with qmail-smtp/chkusr, and I have yet to see a failure authenticating to the server while checking mail. Confused

xerion

This problem just occured again - unfortunately the querylog does not show any activity:
2005-04-07 13:47:09.664656500 vmysql: error creating table 'valias': MySQL server has gone away2005-04-07 13:47:09.664692500 vmysql: sql error[j]: MySQL server has gone away050407 13:46:53    7789 Connect     syslog@localhost on                   7789 Init DB     syslog                   7789 Query       SELECT host, date, time, message FROM syslog ORDER BY date DESC,time DESC LIMIT 0, 25                   7789 Quit                   7790 Connect     vpopmail@localhost on                   7790 Init DB     vpopmail                   7790 Query       select pw_name, pw_passwd, pw_uid, pw_gid, pw_gecos, pw_dir, pw_shell , pw_clear_passwd from vpopmail where pw_name = "fiery" and pw_domain = "domain.net"                   7790 Query       replace into lastauth set user='fiery', domain='domain.net', remote_ip='192.168.2.75', timestamp=1112906813                   7790 Query       replace into relay ( ip_addr, timestamp ) values ( '192.168.2.75', 1112906813 )                   7790 Quit050407 13:47:04    7791 Connect     vpopmail@localhost on                   7791 Init DB     vpopmail                   7791 Query       select pw_name, pw_passwd, pw_uid, pw_gid, pw_gecos, pw_dir, pw_shell , pw_clear_passwd from vpopmail where pw_name = "fiery" and pw_domain = "domain.net"                   7791 Query       replace into lastauth set user='fiery', domain='domain.net', remote_ip='192.168.2.75', timestamp=1112906824050407 13:47:14    7791 Query       replace into relay ( ip_addr, timestamp ) values ( '192.168.2.75', 1112906834 )                   7791 Quit050407 13:47:24    7793 Connect     vpopmail@localhost on                   7793 Init DB     vpopmail                   7793 Query       select pw_name, pw_passwd, pw_uid, pw_gid, pw_gecos, pw_dir, pw_shell , pw_clear_passwd from vpopmail where pw_name = "fiery" and pw_domain = "domain.net"                   7793 Query       replace into lastauth set user='fiery', domain='domain.net', remote_ip='192.168.2.75', timestamp=1112906844                   7793 Query       replace into relay ( ip_addr, timestamp ) values ( '192.168.2.75', 1112906844 )                   7793 Quit


Anything else to try/check before recompiling qmail-smtpd?

LogicX

why not recompile vpopmail?
be sure your toaster file is updated so the latest version gets installed
--- May this post be indexed by spiders, and archived for all to see as my internet epitaph.
http://fpux.com" target="_blank">http://fpux.com