Skip to content

Database driver instrumentors' sqlcommenter should skip server-side prepared statements #3164

Open
@tammy-baylis-swi

Description

What problem do you want to solve?

tl;dr: Injecting a unique sqlcomment into a MySQL statement, i.e. with a unique traceparent, causes mysql-connector to treat the statement as new. This imposes new, extra overhead if a user-side mysql-connector cursor is enabled for server-side prepared statements.

More information

What is a server-side prepared statement

A server-side prepared statement is a call to PREPARE on a MySQL server, typically followed by more than one EXECUTE call with different params to save on resources. Please read this for more info.

Only mysql-connector supports true server-side prepared statements for MySQL server. PyMySQL does not. I've not found a explicit issue for mysqlclient (docs). So I've also tested instrumentation of mysqlclient and PyMySQL and neither exhibits the performance penalty outlined below. What mysqlclient and PyMySQL drivers do is an emulation of prepared statements (not true server-side statements), and it's probably client-side prepared statements for some sort of optimization. With those, the MySQL server only logs QUERY.

What happens with prepared statements (without sqlcommenter)

If an app-side, mysql-connector cursor is set with prepared=True, mysql-connector will perform native prepared statement execution (see docs here). Here is an example of 'prepared' cursor setup and usage in an instrumented db client app (without sqlcommenting):

import mysql.connector
from opentelemetry.instrumentation.mysql import MySQLInstrumentor

MySQLInstrumentor().instrument()
cnx = mysql.connector.connect(
    host="foo-host",
    port="3306",
    user="foo-user",
    password="bar-password",
    database="foo-db",
)
cursor = cnx.cursor(prepared=True)
stmt = "SELECT * FROM city WHERE id = %s"  # (1)
cursor.execute(stmt, (1818,))              # (2)
cursor.execute(stmt, (1900,))              # (3)
cursor.execute(stmt, (2000,))              # (4)

✅ The mysql-connector cursor will correctly optimize by only doing a Prepare phase for the execute marked with (2). The execute and (3) and (4) will skip Prepare and only do Execute on the server. Here are the MySQL general logs that result:

2024-11-25T20:19:14.260210Z	   13 Prepare	SELECT * FROM city WHERE id = ?
2024-11-25T20:19:14.260479Z	   13 Reset stmt
2024-11-25T20:19:14.260896Z	   13 Execute	SELECT * FROM city WHERE id = 1818
2024-11-25T20:19:14.262569Z	   13 Reset stmt
2024-11-25T20:19:14.262897Z	   13 Execute	SELECT * FROM city WHERE id = 1900
2024-11-25T20:19:14.263474Z	   13 Reset stmt
2024-11-25T20:19:14.263810Z	   13 Execute	SELECT * FROM city WHERE id = 2000

The problem: What happens with prepared statements (WITH sqlcommenter)

Using changes from #3163 : if we use mysql-connector cursor with prepared=True and we enable sqlcommenting, native prepared statement execution would introduce a new performance penalty unless we take precautions. Here is the same client app setup but with enable_commenter=True:

import mysql.connector
from opentelemetry.instrumentation.mysql import MySQLInstrumentor

MySQLInstrumentor().instrument(
    enable_commenter=True,  # !!!
)
cnx = mysql.connector.connect(
    host="foo-host",
    port="3306",
    user="foo-user",
    password="bar-password",
    database="foo-db",
)
cursor = cnx.cursor(prepared=True)
stmt = "SELECT * FROM city WHERE id = %s"  # (1)
cursor.execute(stmt, (1818,))              # (2)
cursor.execute(stmt, (1900,))              # (3)
cursor.execute(stmt, (2000,))              # (4)

❌ This time, the mysql-connector cursor will NOT correctly optimize and do more work because of the introduction of sqlcomment uniqueness! Every client-side execute ((2), (3), and (4)) results in a Prepare + Execute, plus a Close in between! More processing, more caching, more memory use, and increased latency:

2024-11-25T20:28:37.423818Z	   35 Prepare	SELECT * FROM city WHERE id = ? /*db_driver='mysql.connector%%3A8.4.0',dbapi_level='2.0',dbapi_threadsafety=1,driver_paramstyle='pyformat',mysql_client_version='8.4.0',traceparent='00-cdd1e98d2821c94d4ae17c92a2f4c6e8-b80c8f24dd4f8025-01'*/
2024-11-25T20:28:37.423935Z	   35 Reset stmt
2024-11-25T20:28:37.424463Z	   35 Execute	SELECT * FROM city WHERE id = 1818 /*db_driver='mysql.connector%%3A8.4.0',dbapi_level='2.0',dbapi_threadsafety=1,driver_paramstyle='pyformat',mysql_client_version='8.4.0',traceparent='00-cdd1e98d2821c94d4ae17c92a2f4c6e8-b80c8f24dd4f8025-01'*/
2024-11-25T20:28:37.425580Z	   35 Close stmt
2024-11-25T20:28:37.425800Z	   35 Prepare	SELECT * FROM city WHERE id = ? /*db_driver='mysql.connector%%3A8.4.0',dbapi_level='2.0',dbapi_threadsafety=1,driver_paramstyle='pyformat',mysql_client_version='8.4.0',traceparent='00-cdd1e98d2821c94d4ae17c92a2f4c6e8-bdccd0b081753e66-01'*/
2024-11-25T20:28:37.425907Z	   35 Reset stmt
2024-11-25T20:28:37.426158Z	   35 Execute	SELECT * FROM city WHERE id = 1900 /*db_driver='mysql.connector%%3A8.4.0',dbapi_level='2.0',dbapi_threadsafety=1,driver_paramstyle='pyformat',mysql_client_version='8.4.0',traceparent='00-cdd1e98d2821c94d4ae17c92a2f4c6e8-bdccd0b081753e66-01'*/
2024-11-25T20:28:37.426905Z	   35 Close stmt
2024-11-25T20:28:37.427029Z	   35 Prepare	SELECT * FROM city WHERE id = ? /*db_driver='mysql.connector%%3A8.4.0',dbapi_level='2.0',dbapi_threadsafety=1,driver_paramstyle='pyformat',mysql_client_version='8.4.0',traceparent='00-cdd1e98d2821c94d4ae17c92a2f4c6e8-72dec2e23f36f12f-01'*/
2024-11-25T20:28:37.427220Z	   35 Reset stmt
2024-11-25T20:28:37.427675Z	   35 Execute	SELECT * FROM city WHERE id = 2000 /*db_driver='mysql.connector%%3A8.4.0',dbapi_level='2.0',dbapi_threadsafety=1,driver_paramstyle='pyformat',mysql_client_version='8.4.0',traceparent='00-cdd1e98d2821c94d4ae17c92a2f4c6e8-72dec2e23f36f12f-01'*/

This does not happen if mysql-connector cursor is set with the default prepared=False, without nor with sqlcommenting. MySQL server ends up doing n * QUERY instead of PREPARE + (n * EXECUTE)

2024-11-25T20:33:43.151505Z	   46 Query	SELECT * FROM city WHERE id = 1818
2024-11-25T20:33:43.152935Z	   46 Query	SELECT * FROM city WHERE id = 1900
2024-11-25T20:33:43.153776Z	   46 Query	SELECT * FROM city WHERE id = 2000
2024-11-25T20:34:51.497504Z	   49 Query	SELECT * FROM city WHERE id = 1818 /*db_driver='mysql.connector%%3A8.4.0',dbapi_level='2.0',dbapi_threadsafety=1,driver_paramstyle='pyformat',mysql_client_version='8.4.0',traceparent='00-8e131b2efc67b5d49a373bcd074ea04b-1687fb39738d86f9-01'*/
2024-11-25T20:34:51.499008Z	   49 Query	SELECT * FROM city WHERE id = 1900 /*db_driver='mysql.connector%%3A8.4.0',dbapi_level='2.0',dbapi_threadsafety=1,driver_paramstyle='pyformat',mysql_client_version='8.4.0',traceparent='00-8e131b2efc67b5d49a373bcd074ea04b-ad727cb2192ad5a0-01'*/
2024-11-25T20:34:51.499537Z	   49 Query	SELECT * FROM city WHERE id = 2000 /*db_driver='mysql.connector%%3A8.4.0',dbapi_level='2.0',dbapi_threadsafety=1,driver_paramstyle='pyformat',mysql_client_version='8.4.0',traceparent='00-8e131b2efc67b5d49a373bcd074ea04b-76fa05d5df541c4f-01'*/

This is not an Otel nor sqlcommenter correctness issue because the traceparent values are all correct. It's a severe performance issue introduced under specific conditions.

Describe the solution you'd like

Database driver instrumentation should continue to generate trace, no matter which type of cursor is created and instrumented on database client side. It is common practice to have multiple cursors in an (instrumented) app for organization and sometimes concurrency. For example: one cursor per Flask route.

Database driver instrumentation should continue to support sqlcommenter when cursors are created and do not perform server-side prepared statements (the default cnx.cursor(), or cnx.cursor(prepared=True)). This is in order to correlate as much trace as possible from database client to records of actual queries performed on database server.

Database driver instrumentation should intercept when database client app side makes server-side prepared statements (e.g. cnx.cursor(prepared=True)). One approach is to skip all sqlcomment for those queries. A warning should be logged.

I mentioned already that this is isolated to mysql-connector. However, it might work to make this a DB-API instrumentor update to limit duplicate code. Therefore the issue title is Database driver instrumentors ... instead of mysql-connector instrumentor ...

Describe alternatives you've considered

I attempted to confine changes to mysql-connector instrumentor in Closed PR #3023. However I ended up editing the DB-API code anyway and still introduced several duplicates. From my testing this worked, but there is too much developer overhead. A simpler implementation, plus doc updates in #3162, will make for better dev experience.

Additional Context

Found while doing tests for #3163 as part of #2902 .

Would you like to implement a fix?

Yes

Metadata

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions