Python: Modernize File Not Always Closed query #18845
QHelp previews: python/ql/src/Resources/FileNotAlwaysClosed.qhelp

File is not always closed

When a file is opened, it should always be closed. A file opened for writing that is not closed when the application exits may result in data loss, where not all of the data written may be saved to the file. A file opened for reading or writing that is not closed may also use up file descriptors, which is a resource leak that in long-running applications could lead to a failure to open additional files.

Recommendation

Ensure that opened files are always closed, including when an exception could be raised. The best practice is often to use a `with` statement.

Example

In the following examples, in the case marked BAD, the file may not be closed if an exception is raised. In the cases marked GOOD, the file is always closed.

def bad():
    f = open("filename", "w")
    f.write("could raise exception")  # BAD: This call could raise an exception, leading to the file not being closed.
    f.close()

def good1():
    with open("filename", "w") as f:
        f.write("always closed")  # GOOD: The `with` statement ensures the file is always closed.

def good2():
    f = open("filename", "w")
    try:
        f.write("always closed")
    finally:
        f.close()  # GOOD: The `finally` block always ensures the file is closed.
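The difference between the BAD and GOOD cases can be observed at run time. The following is a minimal sketch, not part of the QHelp: the `opened` list and the `closed_after_failure` helper are hypothetical names introduced only so the file object can be inspected after the exception, and the failure is forced by writing `bytes` to a text-mode file (which raises `TypeError`).

```python
import os
import tempfile


def bad(path, opened):
    f = open(path, "w")
    opened.append(f)        # keep a reference so the caller can inspect the file
    f.write(b"not a str")   # raises TypeError on a text-mode file
    f.close()               # never reached


def good(path, opened):
    with open(path, "w") as f:
        opened.append(f)
        f.write(b"not a str")  # raises TypeError, but `with` still closes f


def closed_after_failure(func):
    """Run func, swallow the forced TypeError, and report whether the file was closed."""
    opened = []
    with tempfile.TemporaryDirectory() as d:
        try:
            func(os.path.join(d, "demo.txt"), opened)
        except TypeError:
            pass
        was_closed = opened[0].closed
        opened[0].close()  # clean up so the temporary directory can be removed
    return was_closed
```

Calling `closed_after_failure(bad)` reports that the file was left open, while `closed_after_failure(good)` reports that it was closed.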
Good stuff! I've added a couple of comments, though many of them are more me musing about how this fits into the grander scheme of things. In the interest of expedience, I would be happy to leave some of the broader changes I suggest as potential future work. (Once we have more of these quality queries ported over, we'll probably have a better feel for what would make sense as a framework for this sort of thing.)
private DataFlow::TypeTrackingNode fileOpenInstance(DataFlow::TypeTracker t) {
  t.start() and
  result instanceof FileOpenSource
  or
  exists(DataFlow::TypeTracker t2 | result = fileOpenInstance(t2).track(t2, t))
}
Is a type tracker actually needed here? As far as I can tell, `FileOpenSource` only contains API graph nodes at the moment, so I would hope that the type tracking done within the API graph calculation would be sufficient.
/** A node where a file is closed. */
abstract class FileClose extends DataFlow::CfgNode {
  /** Holds if this file close will occur if an exception is thrown at `e`. */
By `e`, do you mean `raises`?
private predicate fileLocalFlowStep(DataFlow::Node nodeFrom, DataFlow::Node nodeTo) {
  DataFlow::localFlowStep(nodeFrom, nodeTo)
  or
  exists(FileWrapperCall fw | nodeFrom = fw.getWrapped() and nodeTo = fw)
}
This extension of local flow makes me wonder if it would make more sense to rewrite this part of the query as a proper data-flow query (with an additional step for file wrapper calls). My main worry is that calculating the `fileLocalFlow` relation might result in bad performance.
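For context, the kind of Python code that a step from a wrapped file to its wrapper is presumably meant to follow looks like the sketch below. The choice of `io.TextIOWrapper` is an assumption made for illustration; which wrappers the query actually recognises is not shown in this hunk. The `close_via_wrapper` and `demo` names are hypothetical.

```python
import io
import os
import tempfile


def close_via_wrapper(path):
    raw = open(path, "rb")                             # the file-open source
    wrapper = io.TextIOWrapper(raw, encoding="utf-8")  # a call that wraps `raw`
    wrapper.close()                                    # closing the wrapper also closes `raw`
    return raw.closed


def demo():
    """Create a small temporary file and report whether the wrapped file ended up closed."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "demo.txt")
        with open(path, "w", encoding="utf-8") as f:
            f.write("hello\n")
        return close_via_wrapper(path)
```

Here `raw.close()` is never called directly, so an analysis that does not connect `raw` to `wrapper` would see an open with no matching close.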
private predicate fileLocalFlow(DataFlow::Node source, DataFlow::Node sink) {
  fileLocalFlowStep*(source, sink)
}
Unless I've misread the rest of the code, `source` will in fact always be a `FileOpen` instance. If this is true, it might make sense to specialise the argument to that class (a kind of manual "magic"), as this would certainly reduce the size of the `fileLocalFlow` predicate (assuming the compiler hasn't already figured out that this kind of magic is available).
FileWrapperCall() {
  wrapped = this.getArg(_).getALocalSource() and
  this.getFunction() = classTracker(_)
I'm slightly puzzled by this very weak restriction. Do we really not require anything of the class that's wrapping it?
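To make the concern concrete, here is a sketch of a class call that takes a file as an argument without taking ownership of it. `LineCounter`, `count_lines`, and `demo` are hypothetical names; whether the query would actually treat such a call as a file wrapper depends on code not shown in this hunk.

```python
import os
import tempfile


class LineCounter:
    """Hypothetical class: reads from the file in its constructor but never stores or closes it."""

    def __init__(self, f):
        self.count = sum(1 for _ in f)


def count_lines(path):
    f = open(path)
    counter = LineCounter(f)  # a class instantiation with the file as an argument
    return counter.count, f   # `f` is still open: LineCounter did not close it


def demo():
    """Run count_lines on a three-line temporary file and report (count, still_open)."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "demo.txt")
        with open(path, "w") as out:
            out.write("a\nb\nc\n")
        count, f = count_lines(path)
        still_open = not f.closed
        f.close()  # clean up so the temporary directory can be removed
    return count, still_open
```

If any class call receiving the file counts as a wrapper, code like `count_lines` could look as though responsibility for closing had been handed off when it had not.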
(
  retVal = ret.getValue()
  or
  retVal = ret.getValue().(List).getAnElt()
  or
  retVal = ret.getValue().(Tuple).getAnElt()
) and
This feels like it's a subset of a more generic concept of "returning some structure containing a thing of interest". For instance, what if the file object is put in a dict that's returned?
If we rewrite the query to use data-flow, I could see this potentially being more widely useful as a standard set of additional flow steps.
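The tuple and dict cases look like this on the Python side. This is an illustrative sketch with hypothetical function names: the tuple form matches the shapes handled in the hunk above, while the dict form is the one raised in this comment.

```python
import os
import tempfile


def open_in_tuple(path):
    # The file escapes as an element of the returned tuple.
    return path, open(path)


def open_in_dict(path):
    # The file escapes as a value of the returned dict.
    return {"path": path, "handle": open(path)}


def demo():
    """Open a temporary file both ways and report the closed-state before and after the caller closes."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "demo.txt")
        with open(path, "w") as out:
            out.write("hello\n")
        _, f1 = open_in_tuple(path)
        f2 = open_in_dict(path)["handle"]
        before = (f1.closed, f2.closed)
        f1.close()  # in both cases the caller is responsible for closing
        f2.close()
        after = (f1.closed, f2.closed)
    return before, after
```

In both functions the open call has no local close, yet neither leaks as long as the caller closes the returned handle, which is why returns of containers need to be treated as an escape.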
predicate fileMayNotBeClosedOnException(FileOpen fo, DataFlow::Node raises) {
  fileIsClosed(fo) and
  exists(DataFlow::CfgNode fileRaised |
Maybe move the `raises` argument into this `exists` (as it doesn't seem to be used anywhere outside of this predicate)?
Rewrites the `py/file-not-closed` query to not rely on `pointsTo` analysis.
Reviewing per-commit may be helpful.