Bug Report: Infinite MySQL Reconnection Loop on "Duplicate column name" (Error 1060) #129

@jefferyjfbao

Summary

The sniffer can get stuck in an infinite MySQL reconnection loop during startup or when a new sensor connects. This is triggered when the system attempts to add a custom header column that already exists in the cdr_next table, resulting in MySQL Error 1060 (ER_DUP_FIELDNAME). Because this specific error code is not handled or ignored during the schema sync process, the SQL driver enters a retry-and-reconnect loop, blocking the main initialization or connection process.

Symptoms

In the syslogs, you will see repeating entries similar to:

Oct 28 05:43:18 voipmonitor[3945371]: query error in [ALTER TABLE cdr_next ADD COLUMN custom_header__X-Cluster-Tag VARCHAR(255);]: 1060 - Duplicate column name 'custom_header__X-Cluster-Tag'
Oct 28 05:43:18 voipmonitor[3945371]: next attempt 1 - query: ALTER TABLE cdr_next ADD COLUMN ...
...
Oct 28 05:43:18 voipmonitor[3945371]: reconnecting to mysql...
The sniffer appears to be "stuck" or unresponsive during this period.

Root Cause Analysis

Cache Inconsistency in existsColumn: The check sqlDb->existsColumn(this->fixedTable, ...) sometimes returns false even if the column exists. This typically happens if the cdr_next table metadata was cached before the column was added (e.g., during a previous partial initialization or a crash/restart scenario like MEMORY IS FULL).
Infinite Retry Loop: When existsColumn incorrectly returns false, the sniffer executes ALTER TABLE ... ADD COLUMN .... MySQL returns Error 1060.
Missing Error Handling: In sql_db.cpp, the error 1060 is not part of the ignoreErrorCodes list for this operation. The default behavior for an unhandled SQL error is to log it, increment the attempt counter, call this->reconnect(), and try again. This repeats up to maxQueryPass times, causing a significant hang.
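
To make the loop described above concrete, here is a minimal, self-contained model of it. This is only a sketch of the behavior as I understand it, not the actual code from sql_db.cpp: RetryResult, runWithRetry, and the exec callback are hypothetical names introduced for illustration, and the real driver may count attempts and reconnects differently.

```cpp
#include <functional>
#include <set>

// Hypothetical model of the retry behavior; NOT the real sql_db.cpp code.
struct RetryResult {
    bool ok;         // true if the query was treated as successful
    int attempts;    // number of times the query was sent
    int reconnects;  // number of reconnects triggered by unhandled errors
};

// exec returns 0 on success or a MySQL error code (e.g. 1060).
// Errors listed in ignoreErrorCodes are treated as success;
// any other error causes a reconnect and another attempt.
RetryResult runWithRetry(const std::function<int()> &exec,
                         const std::set<int> &ignoreErrorCodes,
                         int maxQueryPass) {
    RetryResult r{false, 0, 0};
    for (int pass = 0; pass < maxQueryPass; ++pass) {
        ++r.attempts;
        int err = exec();
        if (err == 0 || ignoreErrorCodes.count(err) > 0) {
            r.ok = true;
            return r;
        }
        ++r.reconnects;  // unhandled error: reconnect, then retry
    }
    return r;
}
```

With a query that always returns 1060, an empty ignore set uses up every pass and reconnects each time, while an ignore set containing 1060 returns after the first attempt with no reconnect.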

Proposed Fix

The fix is to explicitly ignore ER_DUP_FIELDNAME (1060) during the custom header column synchronization in calltable.cpp. Since the goal is to ensure the column exists, receiving a "Duplicate column" error means the objective is already met.

// Suggested change in calltable.cpp (CustomHeaders::createColumnsForFixedHeaders)
void CustomHeaders::createColumnsForFixedHeaders(SqlDb *sqlDb) {
    ...
    // Define ER_DUP_FIELDNAME if not already available (typically 1060)
    #ifndef ER_DUP_FIELDNAME
    #define ER_DUP_FIELDNAME 1060
    #endif
    // Treat "Duplicate column name" as success while syncing columns
    sqlDb->setIgnoreErrorCode(ER_DUP_FIELDNAME);
    for(map<sCH_index, sCustomHeaderData>::iterator iter = custom_headers.begin(); iter != custom_headers.end(); iter++) {
        if(iter->first.i1 == 0) {
            if(!sqlDb->existsColumn(this->fixedTable, "custom_header__" + iter->second.first_header())) {
                sqlDb->query(string("ALTER TABLE ") + this->fixedTable + " ADD COLUMN custom_header__" + iter->second.first_header() + " VARCHAR(255);");
            }
        }
    }
    // Restore normal error handling for subsequent queries
    sqlDb->clearIgnoreErrorCodes();
    ...
}
This ensures that even if existsColumn returns a stale false because of cache issues, the resulting Error 1060 from ALTER TABLE is ignored rather than retried, so the sniffer proceeds without entering the reconnect loop.
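
One possible refinement, offered only as a sketch: judging by its name, clearIgnoreErrorCodes() presumably drops every ignored code, including any that were set before this function ran. I have not verified this against the real SqlDb class. If that matters, a small scope guard could restore the previous set instead. IgnoreCodeStore and ScopedIgnoreErrorCode below are hypothetical stand-ins, not existing sniffer classes.

```cpp
#include <set>

// Hypothetical stand-in for wherever SqlDb keeps its ignored error codes.
struct IgnoreCodeStore {
    std::set<int> codes;
};

// Adds one error code for the lifetime of the guard, then restores
// the exact set of codes that was present before the guard was created.
class ScopedIgnoreErrorCode {
public:
    ScopedIgnoreErrorCode(IgnoreCodeStore &store, int code)
        : store_(store), saved_(store.codes) {
        store_.codes.insert(code);
    }
    ~ScopedIgnoreErrorCode() { store_.codes = saved_; }

    // Non-copyable: copying would restore the saved set twice.
    ScopedIgnoreErrorCode(const ScopedIgnoreErrorCode &) = delete;
    ScopedIgnoreErrorCode &operator=(const ScopedIgnoreErrorCode &) = delete;

private:
    IgnoreCodeStore &store_;
    std::set<int> saved_;
};
```

The guard also restores the previous state on early return or exception, which the explicit set/clear pair in the proposed fix does not.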
