Skip to content

Conversation

@mmcnabb-vms
Copy link
Contributor

@mmcnabb-vms mmcnabb-vms commented Dec 24, 2025

MARS prefetch window

The first issue is with the sliding window calculation for fetching a large result set. It hardcodes a prefetch size of 4, i.e. the first fetch will request the server to send 4 packets, which FreeTDS dequeues into a packet cache. However, on completion of processing of each one of those packets, it requests the server to send another 4 packets. This causes the number of packets cached in the client to increase indefinitely. I have changed it to base the prefetch window growth on the sequence number of the packet just processed, not the latest sequence out of all the cached packets.

MARS "socket" lifetime

The second problem relates to the decision of when to free stmt->tds for a MARS session "socket". (To recap for anyone unfamiliar with MARS, a TDSSOCKET on a MARS connection is not actually a socket; it's a virtual channel being multiplexed over one TCP/IP socket). When beginning a SQLExecute() when MARS is active: if stmt->tds == NULL then a new channel is allocated, saved as stmt->tds and logically owned by the stmt.

Observed behaviour was that an insert into command with parameters would cause stmt->tds to be freed just before SQLExecute() returns; whereas if there were no parameters (e.g. the new values were hardcoded in the command), then stmt->tds would not be freed.

It seems to me this probably wasn't intentional (as I can't think of any reason for that behaviour); the upshot was that if we do, for example, a 10,000 row fetch, and execute one insert for each row fetched, then the TDSSOCKET is created and destroyed 10,000 times -- but only if the insert used parameterization. This has a performance impact.

NB: I have been trying to track down a memory leak observed on one system associated with creating and freeing this object, a leak of about 512 bytes per insert was observed -- I haven't gotten to the bottom of it yet, but creating the TDSSOCKET once instead of 10,000 times certainly helps. I'm continuing to work on that.

The code responsible for this behaviour is at the end of SQLExecute(), in odbc.c line 3745:

	switch (result_type) {
	case TDS_CMD_DONE:
		odbc_unlock_statement(stmt);

A non-parameterized insert (or update or delete) gets a 0xFD(DONE) TDS response which translates to result_type == TDS_DONE_RESULT, but a parameterized insert gets an 0xFF(DONEINPROC) response which translates to result_type == TDS_CMD_DONE.

My solution to this issue is: I don't really see why stmt->tds needs to be freed at any point earlier than stmt being freed, so I have moved that freeing code from odbc_unlock_statement into the part of the code that frees stmt.

Let me know what you think of this and if there might be any other issues I am overlooking . The MARS1 test still passes but it's not a very comprehensive test.

MARS1 test

Expanded this test to test both parameterized and non-parameterized inserts, and made it easier to do a large data set for leak detection, although I'm not sure how we might go about actually including automated leak detection in the test. (I could write the VMS code for this!)

On my test systems the MARS1 test did increase memory usage due to the packet sequence number problem, but did not show any visible leaks once that was fixed. I came across the TDSSOCKET over-destruction problem while investigating a memory leak in another test case I have (that's not part of the FreeTDS test suite) - investigation ongoing.

Signed-off-by: Matt McNabb <matthew.mcnabb@vmssoftware.com>
Signed-off-by: Matt McNabb <matthew.mcnabb@vmssoftware.com>
Signed-off-by: Matt McNabb <matthew.mcnabb@vmssoftware.com>
Signed-off-by: Matt McNabb <matthew.mcnabb@vmssoftware.com>
src/tds/packet.c Outdated
if ((int32_t) (tds->recv_seq + 2 - tds->recv_wnd) >= 0)
tds_update_recv_wnd(tds, tds->recv_seq + 4);
/* Look ahead by up to 4 packets */
memcpy(&this_seq, ((char const *)packet->buf) + offsetof(TDS72_SMP_HEADER, seq), sizeof this_seq);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would use

this_seq = TDS_GET_A4LE(&((const TDS72_SMP_HEADER *) packet->buf)->seq);

also, consider that other code will still use tds->recv_seq, perhaps it would be better to update that in a different way.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree about using TDS_GET_A4LE ; but is packet guaranteed to be correctly aligned for that cast?

The code in tds_connection_put_packet makes the same cast, so I guess it's not a problem. The code in tds_packet_read() uses memcpy(&mars_header, packet->buf, sizeof(mars_header));.

src/tds/packet.c Outdated
tds_update_recv_wnd(tds, tds->recv_seq + 4);
/* Look ahead by up to 4 packets */
memcpy(&this_seq, ((char const *)packet->buf) + offsetof(TDS72_SMP_HEADER, seq), sizeof this_seq);
if (this_seq + 2 >= tds->recv_wnd)
Copy link
Contributor

@freddy77 freddy77 Dec 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This formulae is wrong if you have wrapping, the old one was correct.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, you're right about the wrapping issue. I didn't like the old one as the C Standard doesn't define the behaviour of converting out-of-range unsigned values to signed integer, but FreeTDS is unlikely to be used on any system that doesn't behave as expected here.

Perhaps we could improve readability by borrowing TCP sequence macros, e.g. from BSD but with TDS_INT as the cast type (it should be the signed counterpart of TDS_UINT).

ODBC_BUF *odbc_buf = NULL;

printf("%s\n", command);
if (command[0] == '!')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It surely sounds useful and not conflicting with SQL statements. About the character used not any language in my mind use the exclamation point in that way. I would use an at sign (@) instead, both make and DOS batches use this symbol with that semantic. It's also used in T-SQL for variable names but I cannot see any statement starting with that symbol.

What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, sounds good

tds = tds_alloc_additional_socket(dbc_tds->conn);
if (tds)
{
tdsdump_log(TDS_DBG_INFO1, "MARS SID %d allocated new TDSSOCKET\n", tds->sid);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe MARS SID %d allocated for new TDSSOCKET\n ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well, both of them are allocated really (the SID and the TDSSOCKET); how about with instead of for

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It sounds good to me

* iteration of the insert if and only if it was a parameterized insert. So we need
* to test both parameterized and non-parameterized inserts.
*/
long bind1;
Copy link
Contributor

@freddy77 freddy77 Dec 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The correct type to bind a SQL_C_SLONG is SQLINTEGER. Yes, surely confusing, but I think initial ODBC specifications were written with 16 bits in mind.

SET_STMT(stmt2);
odbc_reset_statement();

goto done;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Debug code? Why skipping part of the test?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, yes this was debug code sorry

done:
EndTransaction(SQL_COMMIT);

/* TODO test receiving large data should not take much memory */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I cannot see the implementation of this in the test.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I replaced that comment with /* TODO a way for this test to detect memory leak here. */ a bit further up -- unless you meant something else? Receiving one row of large data would take as much memory as it takes to process that data, so I took your comment to mean that repeated iterations of the test should not keep increasing memory.

src/odbc/odbc.c Outdated
Comment on lines 4471 to 4474
tdsdump_log(TDS_DBG_INFO1, "MARS SID %d socket freed\n", tds->sid);
if (!(tds->state == TDS_IDLE || tds->state == TDS_DEAD)) {
tdsdump_log(TDS_DBG_WARN, "MARS SID %d was not idle/dead\n", tds->sid);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These looks like an use after free.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes you're right; will have to reorder those statements (and re-word the messages, or remember the variables being logged)

@freddy77
Copy link
Contributor

About leaks (or at least testing high memory usage) in src/odbc/unittests/tvp.c there's a memory_usage function you could reuse.

@freddy77
Copy link
Contributor

About the leak, if the original mars1 test caused some leaks (or some other code) I would prefer to have it fixed instead of changing the test. Note that some of my test environment (https://freetds.sourceforge.net/) use Valgrind to detect leaks so if there are a leak in the test if should happen with some different environment.

Signed-off-by: Matt McNabb <Matthew.mcnabb@vmssoftware.com>
Signed-off-by: Matt McNabb <Matthew.mcnabb@vmssoftware.com>
@mmcnabb-vms
Copy link
Contributor Author

mmcnabb-vms commented Dec 29, 2025

By the way, I have stumbled on another issue, that might be a bug in ODBC32.dll ... if we enable odbc_reset_statement() in the mars1 test's main insert loop (and of course add code to rebind parameters if a parameterized insert is coming), then: on the exactly 2039th call to odbc_reset_statement(), it appears to yield a bogus value for stmt2. The first 2038 calls yielded the same handle value, but the 2039th call returns a new handle value, and the new handle leads to segfaults or failure codes when passed to SQLBindParameter or odbc_command with parameterized insert.

This only happens in Windows

Signed-off-by: Matt McNabb <Matthew.mcnabb@vmssoftware.com>
@mmcnabb-vms
Copy link
Contributor Author

mmcnabb-vms commented Dec 30, 2025

Added a heap log check to the branch (you don't have to pull that into master necessarily, it could just stay here for development purposes) .

If I revert the commit "Don't need to free MARS socket until statement free" then I observe the leak running MARS1 test in VMS, but not Windows or Linux. (so leave that to me to try and track down). It's fairly small , something like 24 bytes per allocate/release cycle.

Committed some fixes as per your suggestions (this time I didn't squash and force push - let me know your preference in that respect)

Signed-off-by: Matt McNabb <Matthew.mcnabb@vmssoftware.com>
@mmcnabb-vms
Copy link
Contributor Author

@freddy77 are you waiting on anything else to merge this PR? (I can squash the fixes to your comments and rebase up to current master, if you'd like to start a fresh review from that point).

The appveyor Windows tests appear to be frozen as I write this comment, but on previous pushes I have seen:

  • o_error caused a database deadlock (surely nothing to do with this PR)
  • o_closestmt intermittently segfaults. I haven't been able to reproduce this segfault on my local system, and also don't see how anything on this PR would affect o_closestmt.

@freddy77
Copy link
Contributor

@freddy77 are you waiting on anything else to merge this PR? (I can squash the fixes to your comments and rebase up to current master, if you'd like to start a fresh review from that point).

Sorry, just pretty busy with other things

The appveyor Windows tests appear to be frozen as I write this comment, but on previous pushes I have seen:

* o_error caused a database deadlock (surely nothing to do with this PR)

Recently tests run more in parallel, having some issues with locking. Some tests rely on system tables and long queries. error test uses sysobjects but not for long, probably it's another test causing a longer lock. Don't worry. I removed some but they are not related to your changes.

* o_closestmt intermittently segfaults.  I haven't been able to reproduce this segfault on my local system, and also don't see how anything on this PR would affect o_closestmt.

It looks like sometimes there are intermittent segfaults on AppVeyor... but even the compiler sometimes segfault which surely is not FreeTDS code! Obviously if the fault in closestmt is more frequent it can be the test itself for some reasons.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants