Skip to content

SCM function storage fixes.#537

Closed
MiranDMC wants to merge 4 commits into
masterfrom
Fix_SCM_function_storage
Closed

SCM function storage fixes.#537
MiranDMC wants to merge 4 commits into
masterfrom
Fix_SCM_function_storage

Conversation

@MiranDMC
Copy link
Copy Markdown
Collaborator

@MiranDMC MiranDMC commented Mar 7, 2026

Introduced distinction between function id and index in storage.
Default value 0 of CRunningScript::ScmFunction no longer can identify valid function.
Some internal properties of ScmFunction are now private.
Used std::array instead of c-arrays.
Removed some magic numbers.
Showing errors on errors instead of pretending everything is fine.

@MiranDMC MiranDMC force-pushed the Fix_SCM_function_storage branch 2 times, most recently from a580150 to b4377c1 Compare March 7, 2026 00:13
@MiranDMC MiranDMC force-pushed the Fix_SCM_function_storage branch from b4377c1 to 7447d73 Compare March 7, 2026 00:16
@MiranDMC MiranDMC marked this pull request as ready for review March 7, 2026 00:21
@x87
Copy link
Copy Markdown

x87 commented Mar 7, 2026

Can it be (unit) tested?

@MiranDMC
Copy link
Copy Markdown
Collaborator Author

MiranDMC commented Mar 7, 2026

All unit tests already heavy depend on SCM functions. Nothing broken.
Indentation levels still are fine in script log.
Added error messages are not occurring, while previously Get method was constantly called with non existing ids and returning nullptr.

Still no idea if this fixes the error with looping SCM function call stack, but for sure it patches at least one hole.

@MiranDMC
Copy link
Copy Markdown
Collaborator Author

MiranDMC commented Mar 7, 2026

image

Apparently original problem remains unsolved.

@x87
Copy link
Copy Markdown

x87 commented Mar 7, 2026

Hard to tell without code.

Looking at the changes, I'm thinking we can use R*'s CPool to maintain pool of functions, all methods are conveniently here already (via psdk). You just need to create new CPool with 0x400 elements

@MiranDMC
Copy link
Copy Markdown
Collaborator Author

MiranDMC commented Mar 7, 2026

I doubt the container management itself is the problem. If anything maybe it would be better to just have array of elements instead of array of pointers for performance reasons.

Maybe it has something to do with the cleanup and after cleanup use? Seems to occur more frequently after starting new session.
I never seen ids with value 0 or greater than 1023 so it suggests it is not random value.

@x87
Copy link
Copy Markdown

x87 commented Mar 7, 2026

Try some older CLEO version? Maybe recent regression

@MiranDMC
Copy link
Copy Markdown
Collaborator Author

MiranDMC commented Mar 7, 2026

Thing is I did not used script log before. Activated it after doing some memory handling errors in script.
You should keep script log on yourself.

@MiranDMC MiranDMC marked this pull request as draft March 7, 2026 22:39
@MiranDMC MiranDMC marked this pull request as ready for review March 8, 2026 23:24
@x87
Copy link
Copy Markdown

x87 commented Mar 9, 2026

I don't have enough context for this refactoring. Are we solving some actual issues here? Can they be expressed in unit tests?

@MiranDMC
Copy link
Copy Markdown
Collaborator Author

MiranDMC commented Mar 9, 2026

I don't have enough context for this refactoring. Are we solving some actual issues here? Can they be expressed in unit tests?

I would basically repeat the description of the PR. ScmFunction manager was allowing 0 as function identifier while 0 is default value for scripts with empty callstack. Class was also returning nullptr when junk id/index was used in Get method.
I wasted 2 weeks questioning my scripting and chasing rabbits because SCM function storage was masking problems occurring in other places.

@x87
Copy link
Copy Markdown

x87 commented Mar 9, 2026

So why not just change default id to 1?

@MiranDMC
Copy link
Copy Markdown
Collaborator Author

MiranDMC commented Mar 9, 2026

So why not just change default id to 1?

What do you mean? It is common memory is usually initialized as 0. For sure all foreign scripts will have that value set to 0.
If somebody asks for function with non existent id it is reason to show error. Next time won't be that lucky and it will hit some existing function.
Previously storage was only used during cleo_return calls. Now script log uses it during every command.

Encapsulating some properties of the storage also makes sense, as it forces to use proper methods.

@x87
Copy link
Copy Markdown

x87 commented Mar 9, 2026

Show me the exact issue that this change fixes. Write a test.

@x87
Copy link
Copy Markdown

x87 commented Mar 9, 2026

So why not just change default id to 1?

What do you mean? It is common memory is usually initialized as 0. For sure all foreign scripts will have that value set to 0. If somebody asks for function with non existent id it is reason to show error. Next time won't be that lucky and it will hit some existing function. Previously storage was only used during cleo_return calls. Now script log uses it during every command.

Encapsulating some properties of the storage also makes sense, as it forces to use proper methods.

If I understand correctly you need to change all ScmFunction::allocationPlace = 0 to ScmFunction::allocationPlace = 1. No additional changes needed.

@MiranDMC
Copy link
Copy Markdown
Collaborator Author

MiranDMC commented Mar 9, 2026

There need to be no other scripts using SCM functions.

{$CLEO}
script_name {name} "A"

foo()

debug_on
trace "after foo"
terminate_this_custom_script

function foo()
    while true
        wait {time} 0
        print_formatted_now {format} "inside foo" {time} 50
    end
end
{$CLEO}
script_name {name} "B"

wait {time} 5000
cleo_return 0

terminate_this_custom_script

"After foo" is printed, meaning script B is executing code from script A, plus destroying its call stack.

@MiranDMC
Copy link
Copy Markdown
Collaborator Author

MiranDMC commented Mar 9, 2026

So why not just change default id to 1?

What do you mean? It is common memory is usually initialized as 0. For sure all foreign scripts will have that value set to 0. If somebody asks for function with non existent id it is reason to show error. Next time won't be that lucky and it will hit some existing function. Previously storage was only used during cleo_return calls. Now script log uses it during every command.
Encapsulating some properties of the storage also makes sense, as it forces to use proper methods.

If I understand correctly you need to change all ScmFunction::allocationPlace = 0 to ScmFunction::allocationPlace = 1. No additional changes needed.

But after a full circle allocation place will be warped to 0.
Now there is ID and INDEX, where index is encapsulated. Getters and setters are doing +/-1 when needed and throwing errors if somebody asks for Id_None (0).

@x87
Copy link
Copy Markdown

x87 commented Mar 9, 2026

So why not just change default id to 1?

What do you mean? It is common memory is usually initialized as 0. For sure all foreign scripts will have that value set to 0. If somebody asks for function with non existent id it is reason to show error. Next time won't be that lucky and it will hit some existing function. Previously storage was only used during cleo_return calls. Now script log uses it during every command.
Encapsulating some properties of the storage also makes sense, as it forces to use proper methods.

If I understand correctly you need to change all ScmFunction::allocationPlace = 0 to ScmFunction::allocationPlace = 1. No additional changes needed.

But after a full circle allocation place will be warped to 0. Now there is ID and INDEX, where index is encapsulated. Getters and setters are doing +/-1 when needed and throwing errors if somebody asks for Id_None (0).

No, which is why I mentioned 'all'. After warping, you skip 0 again

if (++allocationPlace >= Store_Size) allocationPlace = 1; // end of store reached, skip 0

@x87
Copy link
Copy Markdown

x87 commented Mar 9, 2026

Slot 0 will be used for the "empty call stack" function, then you don't need to check for 0

   size_t ScmFunction::GetCallStackSize() const
    {
        return Get(prevScmFunctionId)->GetCallStackSize();
    }

also simplifies +1/-1 in other places where it is easy to make mistakes

@MiranDMC
Copy link
Copy Markdown
Collaborator Author

MiranDMC commented Mar 9, 2026

Maybe it simplifies something or it just moves +1 into another place and wastes one slot in the storage.
There still needs to be validation of provided input argument to show errors and encapsulation to make sure nobody is tampering with the storage.

@x87
Copy link
Copy Markdown

x87 commented Mar 9, 2026

{$CLEO}
script_name {name} "B"

wait {time} 5000
cleo_return 0

terminate_this_custom_script

what error does it produce with this change?

@MiranDMC
Copy link
Copy Markdown
Collaborator Author

MiranDMC commented Mar 9, 2026

{$CLEO}
script_name {name} "B"

wait {time} 5000
cleo_return 0

terminate_this_custom_script

what error does it produce with this change?

image

@x87
Copy link
Copy Markdown

x87 commented Mar 9, 2026

So how to get SHOW_ERROR("CLEO function with id %d not found in the storage!", id); ?

@x87
Copy link
Copy Markdown

x87 commented Mar 9, 2026

encapsulation to make sure nobody is tampering with the storage.

how's that possible? We don't provide public access to it to my knowledge

@MiranDMC
Copy link
Copy Markdown
Collaborator Author

MiranDMC commented Mar 9, 2026

So how to get SHOW_ERROR("CLEO function with id %d not found in the storage!", id); ?

Have script with SCMFunction field corrupted or uninitialized. The fact I have no idea how to reproduce it right now from script in other way than using memory write does not mean it can not happen in the wild.
It might happen in case of other bugs we do not know yet about or invalid script struct arriving from external sources.
If we get errors like that it will be indicator that call stack storage probably should not be global but per script.

@MiranDMC
Copy link
Copy Markdown
Collaborator Author

MiranDMC commented Mar 9, 2026

encapsulation to make sure nobody is tampering with the storage.

how's that possible? We don't provide public access to it to my knowledge

Encapsulation exists for a reason. To prevent us from ourself. Now it is obvious it is impossible to do something unexpected.
Same kind of question why to use const if we do not write that variable anyway.

@MiranDMC
Copy link
Copy Markdown
Collaborator Author

MiranDMC commented Mar 9, 2026

So how to get SHOW_ERROR("CLEO function with id %d not found in the storage!", id); ?

Tricky to unit test as the error is shown only when non existent function id is requested. Due to global nature of the storage it is not predictable.

Comment thread source/ScmFunction.cpp Outdated
if (++allocationPlace >= Store_Size) allocationPlace = 0; // end of store reached
if (allocationPlace == start_search)
lastAllocIdx++;
if (lastAllocIdx >= Store_Size) lastAllocIdx = 0; // warp around
Copy link
Copy Markdown

@x87 x87 Mar 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we creating a functon at slot 0 after warping?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Slot like any other.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see you changed it already

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pushed update. Not sure if that is an improvement. +1 just migrated into other places and slot 0 is now unused. Which is weird thing to do.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but isn't that what you ultimately tried to do? if the script calls return with uninitialized ScmFunctionId it should not get access to valid function

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In other words, 0 is never a valid slot for the function

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

0 is not valid function ID. Index is internal businesses of storage class.

@MiranDMC MiranDMC force-pushed the Fix_SCM_function_storage branch from 6322bbb to 40b941f Compare March 10, 2026 18:39
@x87
Copy link
Copy Markdown

x87 commented Mar 11, 2026

Showing errors on errors instead of pretending everything is fine.

#550 (comment)

@x87
Copy link
Copy Markdown

x87 commented Mar 11, 2026

I should've posted here, instead of #550.

This MR does not solve the fundamental problem in the SCM functions architecture, which is that all scripts use the linear shared array for functions and access to them is given based on an unrestricted mutable field. Any script can modify its function id (intentionally or not) and there is no way to validate legitimacy of that access, as demonstrated in #550 (comment).

This PR only shifts already existing validation from return command to storage getter, providing different error message, when the function does not exist. In my opinion, renaming index to id, adding private class properties, etc, is just a ceremony that does not fix the actual fundamental issue described above.

So far, we identified and fixed one specific edge case, which could bite the scripter even on legitimate code, which is the stale function index in terminated scripts. #548 + #550 combined solve THIS particular edge case. The other issue remains present.

Which is why I'm against doing changes that bring no real value.

@x87 x87 closed this Mar 11, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants