-
-
Notifications
You must be signed in to change notification settings - Fork 2
Description
I tested the performance of the original gditools vs gditools3, and on Python 2.x vs Python 3, when extracting all files from the Samba de Amigo GDI. This was on Windows 10 with plenty of RAM for I/O caching. Listed times are the best of 3 runs.
The command was python2 -m cProfile gditools.py -i "Samba De Amigo Ver. 2000 v1.001 (2000)(Sega)(NTSC)(JP)[!].gdi" --extract-all -o x (changing the parameters when required, of course).
gditools, Python 2.7.15: "1113648 function calls (1112882 primitive calls) in 4.269 s".
gditools3, Python 2.7.15: "1120856 function calls (1120079 primitive calls) in 3.688 s".
gditools3, Python 3.4.1: "1151809 function calls (1150367 primitive calls) in 49.048 s".
gditools3, Python 3.7.0: "1151359 function calls (1149891 primitive calls) in 27.732 s" (but usually 37 s).
Sorting by cumulative time, here's an extract of the profiler result on Python 3.4.1 (very similar to 3.7.0):
ncalls tottime percall cumtime percall filename:lineno(function)
363 0.084 0.000 0.084 0.000 {built-in method utime}
365 0.177 0.000 0.177 0.000 {built-in method open}
496052 0.179 0.000 0.179 0.000 {method 'seek' of '_io.BytesIO' objects}
507742 0.281 0.000 0.281 0.000 {method 'read' of '_io.BytesIO' objects}
1278 0.448 0.000 0.448 0.000 {method 'write' of '_io.BufferedWriter' objects}
1321 0.552 0.000 0.552 0.000 {method 'read' of '_io.BufferedReader' objects}
1321 46.920 0.036 47.958 0.036 gditools3.py:455(read) <<<===
1321 0.123 0.000 48.103 0.036 gditools3.py:533(read)
1321 0.004 0.000 48.114 0.036 gditools3.py:580(read)
1321 0.005 0.000 48.131 0.036 gditools3.py:700(read)
363 0.071 0.000 48.682 0.134 gditools3.py:1001(_copy_buffered)
363 0.008 0.000 48.996 0.135 gditools3.py:256(dump_file_by_record)
1 0.003 0.003 49.043 49.043 gditools3.py:305(dump_all_files)
1 0.000 0.000 49.045 49.045 gditools3.py:1058(main)
1 0.000 0.000 49.124 49.124 gditools3.py:19(<module>)
56/1 0.003 0.000 49.124 49.124 {built-in method exec}
So the issue on Python 3 is in gditools3.py:455(read). On Python 2, the result is very different:
ncalls tottime percall cumtime percall filename:lineno(function)
1321 1.849 0.001 2.655 0.002 gditools3.py:455(read) <<<===
Looking at the results of pprofile, it is clear the problem is with BytesIO:
Total duration: 38.5193s
File: gditools3.py
File duration: 37.6029s (97.62%)
Line #| Hits| Time| Time per hit| %|Source code
------+----------+-------------+-------------+-------+-----------
455| 1322| 0.0079844| 6.03964e-06| 0.02%| def read(self, length = None):
456| 1321| 0.00499964| 3.78474e-06| 0.01%| if self.__mode == 2048:
457| 0| 0| 0| 0.00%| return self.file.read(length)
458| 0| 0| 0| 0.00%|
459| 1321| 0.0169811| 1.28548e-05| 0.04%| elif self.__mode in [2352, 2336]:
460| 1321| 0.00699306| 5.29376e-06| 0.02%| if length == None:
461| 0| 0| 0| 0.00%| length = self.length - self.binpointer
462| 0| 0| 0| 0.00%|
463| 0| 0| 0| 0.00%| # Amount of bytes left until beginning of next sector
464| 1321| 0.00599122| 4.53537e-06| 0.02%| tmp = 2048 - self.binpointer % 2048
465| 1321| 0.00899363| 6.8082e-06| 0.02%| FutureOffset = self.binpointer + length
466| 1321| 0.025969| 1.96586e-05| 0.07%| realLength = self.realOffset(FutureOffset) - \
(call)| 1321| 0.0109961| 8.32408e-06| 0.03%|# gditools3.py:437 realOffset
467| 1321| 0.01599| 1.21045e-05| 0.04%| self.realOffset(self.binpointer)
(call)| 1321| 0.0099926| 7.56442e-06| 0.03%|# gditools3.py:437 realOffset
468| 0| 0| 0| 0.00%| # This will (hopefully) accelerates readings on HDDs at the
469| 0| 0| 0| 0.00%| # cost of more memory use.
470| 1321| 0.53275| 0.000403293| 1.38%| buff = BytesIO(self.file.read(realLength))
471| 0| 0| 0| 0.00%| # The first read can be < 2048 bytes
472| 1321| 0.0129924| 9.83526e-06| 0.03%| data = buff.read(tmp)
473| 1321| 0.00699544| 5.29556e-06| 0.02%| length -= tmp
474| 1321| 0.00499678| 3.78257e-06| 0.01%| buff.seek(self._skipToNext, 1)
475| 0| 0| 0| 0.00%| # The middle reads are all 2048 so we optimize here!
476| 494324| 2.26847| 4.58904e-06| 5.89%| for i in range(length // 2048):
477| 493003| 29.1512| 5.91298e-05| 75.68%| data += buff.read(2048)
478| 493003| 3.29499| 6.68351e-06| 8.55%| buff.seek(self._skipToNext, 1)
479| 0| 0| 0| 0.00%| # The last read can be < 2048 bytes
480| 1321| 0.00699615| 5.29611e-06| 0.02%| data += buff.read(length % 2048)
481| 0| 0| 0| 0.00%| # Seek back to where we should be
482| 1321| 0.0189881| 1.43741e-05| 0.05%| self.seek(FutureOffset)
(call)| 1321| 0.141838| 0.000107372| 0.37%|# gditools3.py:520 seek
483| 1321| 0.00499725| 3.78293e-06| 0.01%| return data
That didn't help me much, though. The fact is that Python 3 is slower than Python 2 on the same code.
What are my options? I could try to reproduce the problem with the minimum amount of code and:
- profile it again
- debug / trace it thoroughly
- rewrite it to work around any implementation bugs or limitations (perhaps creating new bugs/limitations in the process)
Any suggestions?