Database driver instrumentors' sqlcommenter should skip server-side prepared statements #3164
Description
What problem do you want to solve?
tl;dr: Injecting a unique sqlcomment into a MySQL statement, i.e. with a unique traceparent
, causes mysql-connector to treat the statement as new. This imposes new, extra overhead if a user-side mysql-connector cursor is enabled for server-side prepared statements.
More information
What is a server-side prepared statement
A server-side prepared statement is a call to PREPARE
on a MySQL server, typically followed by more than one EXECUTE
call with different params to save on resources. Please read this for more info.
Only mysql-connector supports true server-side prepared statements for MySQL server. PyMySQL does not. I've not found a explicit issue for mysqlclient (docs). So I've also tested instrumentation of mysqlclient and PyMySQL and neither exhibits the performance penalty outlined below. What mysqlclient and PyMySQL drivers do is an emulation of prepared statements (not true server-side statements), and it's probably client-side prepared statements for some sort of optimization. With those, the MySQL server only logs QUERY
.
What happens with prepared statements (without sqlcommenter)
If an app-side, mysql-connector cursor is set with prepared=True
, mysql-connector will perform native prepared statement execution (see docs here). Here is an example of 'prepared' cursor setup and usage in an instrumented db client app (without sqlcommenting):
import mysql.connector
from opentelemetry.instrumentation.mysql import MySQLInstrumentor
MySQLInstrumentor().instrument()
cnx = mysql.connector.connect(
host="foo-host",
port="3306",
user="foo-user",
password="bar-password",
database="foo-db",
)
cursor = cnx.cursor(prepared=True)
stmt = "SELECT * FROM city WHERE id = %s" # (1)
cursor.execute(stmt, (1818,)) # (2)
cursor.execute(stmt, (1900,)) # (3)
cursor.execute(stmt, (2000,)) # (4)
✅ The mysql-connector cursor will correctly optimize by only doing a Prepare
phase for the execute
marked with (2)
. The execute
and (3)
and (4)
will skip Prepare
and only do Execute
on the server. Here are the MySQL general logs that result:
2024-11-25T20:19:14.260210Z 13 Prepare SELECT * FROM city WHERE id = ?
2024-11-25T20:19:14.260479Z 13 Reset stmt
2024-11-25T20:19:14.260896Z 13 Execute SELECT * FROM city WHERE id = 1818
2024-11-25T20:19:14.262569Z 13 Reset stmt
2024-11-25T20:19:14.262897Z 13 Execute SELECT * FROM city WHERE id = 1900
2024-11-25T20:19:14.263474Z 13 Reset stmt
2024-11-25T20:19:14.263810Z 13 Execute SELECT * FROM city WHERE id = 2000
The problem: What happens with prepared statements (WITH sqlcommenter)
Using changes from #3163 : if we use mysql-connector cursor with prepared=True
and we enable sqlcommenting, native prepared statement execution would introduce a new performance penalty unless we take precautions. Here is the same client app setup but with enable_commenter=True
:
import mysql.connector
from opentelemetry.instrumentation.mysql import MySQLInstrumentor
MySQLInstrumentor().instrument(
enable_commenter=True, # !!!
)
cnx = mysql.connector.connect(
host="foo-host",
port="3306",
user="foo-user",
password="bar-password",
database="foo-db",
)
cursor = cnx.cursor(prepared=True)
stmt = "SELECT * FROM city WHERE id = %s" # (1)
cursor.execute(stmt, (1818,)) # (2)
cursor.execute(stmt, (1900,)) # (3)
cursor.execute(stmt, (2000,)) # (4)
❌ This time, the mysql-connector cursor will NOT correctly optimize and do more work because of the introduction of sqlcomment uniqueness! Every client-side execute
((2)
, (3)
, and (4)
) results in a Prepare
+ Execute
, plus a Close
in between! More processing, more caching, more memory use, and increased latency:
2024-11-25T20:28:37.423818Z 35 Prepare SELECT * FROM city WHERE id = ? /*db_driver='mysql.connector%%3A8.4.0',dbapi_level='2.0',dbapi_threadsafety=1,driver_paramstyle='pyformat',mysql_client_version='8.4.0',traceparent='00-cdd1e98d2821c94d4ae17c92a2f4c6e8-b80c8f24dd4f8025-01'*/
2024-11-25T20:28:37.423935Z 35 Reset stmt
2024-11-25T20:28:37.424463Z 35 Execute SELECT * FROM city WHERE id = 1818 /*db_driver='mysql.connector%%3A8.4.0',dbapi_level='2.0',dbapi_threadsafety=1,driver_paramstyle='pyformat',mysql_client_version='8.4.0',traceparent='00-cdd1e98d2821c94d4ae17c92a2f4c6e8-b80c8f24dd4f8025-01'*/
2024-11-25T20:28:37.425580Z 35 Close stmt
2024-11-25T20:28:37.425800Z 35 Prepare SELECT * FROM city WHERE id = ? /*db_driver='mysql.connector%%3A8.4.0',dbapi_level='2.0',dbapi_threadsafety=1,driver_paramstyle='pyformat',mysql_client_version='8.4.0',traceparent='00-cdd1e98d2821c94d4ae17c92a2f4c6e8-bdccd0b081753e66-01'*/
2024-11-25T20:28:37.425907Z 35 Reset stmt
2024-11-25T20:28:37.426158Z 35 Execute SELECT * FROM city WHERE id = 1900 /*db_driver='mysql.connector%%3A8.4.0',dbapi_level='2.0',dbapi_threadsafety=1,driver_paramstyle='pyformat',mysql_client_version='8.4.0',traceparent='00-cdd1e98d2821c94d4ae17c92a2f4c6e8-bdccd0b081753e66-01'*/
2024-11-25T20:28:37.426905Z 35 Close stmt
2024-11-25T20:28:37.427029Z 35 Prepare SELECT * FROM city WHERE id = ? /*db_driver='mysql.connector%%3A8.4.0',dbapi_level='2.0',dbapi_threadsafety=1,driver_paramstyle='pyformat',mysql_client_version='8.4.0',traceparent='00-cdd1e98d2821c94d4ae17c92a2f4c6e8-72dec2e23f36f12f-01'*/
2024-11-25T20:28:37.427220Z 35 Reset stmt
2024-11-25T20:28:37.427675Z 35 Execute SELECT * FROM city WHERE id = 2000 /*db_driver='mysql.connector%%3A8.4.0',dbapi_level='2.0',dbapi_threadsafety=1,driver_paramstyle='pyformat',mysql_client_version='8.4.0',traceparent='00-cdd1e98d2821c94d4ae17c92a2f4c6e8-72dec2e23f36f12f-01'*/
This does not happen if mysql-connector cursor is set with the default prepared=False
, without nor with sqlcommenting. MySQL server ends up doing n * QUERY
instead of PREPARE
+ (n * EXECUTE
)
2024-11-25T20:33:43.151505Z 46 Query SELECT * FROM city WHERE id = 1818
2024-11-25T20:33:43.152935Z 46 Query SELECT * FROM city WHERE id = 1900
2024-11-25T20:33:43.153776Z 46 Query SELECT * FROM city WHERE id = 2000
2024-11-25T20:34:51.497504Z 49 Query SELECT * FROM city WHERE id = 1818 /*db_driver='mysql.connector%%3A8.4.0',dbapi_level='2.0',dbapi_threadsafety=1,driver_paramstyle='pyformat',mysql_client_version='8.4.0',traceparent='00-8e131b2efc67b5d49a373bcd074ea04b-1687fb39738d86f9-01'*/
2024-11-25T20:34:51.499008Z 49 Query SELECT * FROM city WHERE id = 1900 /*db_driver='mysql.connector%%3A8.4.0',dbapi_level='2.0',dbapi_threadsafety=1,driver_paramstyle='pyformat',mysql_client_version='8.4.0',traceparent='00-8e131b2efc67b5d49a373bcd074ea04b-ad727cb2192ad5a0-01'*/
2024-11-25T20:34:51.499537Z 49 Query SELECT * FROM city WHERE id = 2000 /*db_driver='mysql.connector%%3A8.4.0',dbapi_level='2.0',dbapi_threadsafety=1,driver_paramstyle='pyformat',mysql_client_version='8.4.0',traceparent='00-8e131b2efc67b5d49a373bcd074ea04b-76fa05d5df541c4f-01'*/
This is not an Otel nor sqlcommenter correctness issue because the traceparent values are all correct. It's a severe performance issue introduced under specific conditions.
Describe the solution you'd like
Database driver instrumentation should continue to generate trace, no matter which type of cursor is created and instrumented on database client side. It is common practice to have multiple cursors in an (instrumented) app for organization and sometimes concurrency. For example: one cursor per Flask route.
Database driver instrumentation should continue to support sqlcommenter when cursors are created and do not perform server-side prepared statements (the default cnx.cursor()
, or cnx.cursor(prepared=True)
). This is in order to correlate as much trace as possible from database client to records of actual queries performed on database server.
Database driver instrumentation should intercept when database client app side makes server-side prepared statements (e.g. cnx.cursor(prepared=True)
). One approach is to skip all sqlcomment for those queries. A warning should be logged.
I mentioned already that this is isolated to mysql-connector. However, it might work to make this a DB-API instrumentor update to limit duplicate code. Therefore the issue title is Database driver instrumentors ...
instead of mysql-connector instrumentor ...
Describe alternatives you've considered
I attempted to confine changes to mysql-connector instrumentor in Closed PR #3023. However I ended up editing the DB-API code anyway and still introduced several duplicates. From my testing this worked, but there is too much developer overhead. A simpler implementation, plus doc updates in #3162, will make for better dev experience.
Additional Context
Found while doing tests for #3163 as part of #2902 .
Would you like to implement a fix?
Yes