Commit bc14f1d

Annih authored and halostatue committed
Improve ldiff binary files detection
The former code scanned the first 4K of each file for a '\0' character and, if either file contained one, treated the input as binary.

This commit does not change the heuristic, only its implementation. Instead of calling scan with a regexp, it uses a simple include?. This fixes compatibility issues with UTF-8 escape sequences and also improves performance:

1. It does not invoke the Regexp engine.
2. It stops at the first occurrence; the worst case is O(n).
3. It stores almost nothing, since no array of matches is built.

Also, the old code called .empty? on the scan result, which signals a non-binary file, so the result had to be negated afterwards. The call to include? returns the binary status directly. IMHO it is clearer.

Note: the same result could have been achieved simply by replacing .empty? with .any?, but the other improvements listed above motivated the change.
1 parent fec781d commit bc14f1d
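
The claim that only the implementation changes, not the heuristic, can be checked with a small sketch. The helper names `text_by_scan?` and `binary_by_include?` and the constant `SAMPLE_SIZE` are hypothetical and exist only for this illustration; the commit itself inlines the logic in ldiff.rb.

```ruby
# Hypothetical names for illustration; ldiff.rb inlines these expressions.
SAMPLE_SIZE = 4096

# Former approach: collect every NUL match in the sample, then test
# whether the resulting array is empty (true means "looks like text").
def text_by_scan?(data)
  data[0, SAMPLE_SIZE].scan(/\0/).empty?
end

# New approach: ask directly whether the sample contains a NUL byte
# (true means "looks like binary").
def binary_by_include?(data)
  data[0, SAMPLE_SIZE].include?("\0")
end

samples = [
  "plain text\n",        # no NUL byte
  "PK\0\0binary",        # NUL bytes near the start
  "",                    # empty file
  ("a" * 5000) + "\0"    # NUL byte outside the first 4096 characters
]

samples.each do |data|
  # The two checks should always be logical negations of each other.
  raise "mismatch for #{data[0, 10].inspect}" unless binary_by_include?(data) == !text_by_scan?(data)
end
```

Note that both versions miss a NUL byte located after the sampled prefix, as the last sample shows; that limitation belongs to the heuristic and is unchanged by this commit.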

1 file changed

+3
-3
lines changed

lib/diff/lcs/ldiff.rb

Lines changed: 3 additions & 3 deletions
@@ -104,9 +104,9 @@ def run(args, _input = $stdin, output = $stdout, error = $stderr) # :nodoc:
 
     # Test binary status
     if @binary.nil?
-      old_txt = data_old[0, 4096].scan(/\0/).empty?
-      new_txt = data_new[0, 4096].scan(/\0/).empty?
-      @binary = !old_txt || !new_txt
+      old_bin = data_old[0, 4096].include?("\0")
+      new_bin = data_new[0, 4096].include?("\0")
+      @binary = old_bin || new_bin
     end
 
     unless @binary
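
The performance argument from the commit message can be explored with a rough timing sketch. It assumes a worst case for the old code, a 4096-character sample made only of NUL bytes, and uses a hypothetical `time_it` helper; actual timings depend on the Ruby version and machine, so none are quoted here.

```ruby
# Hypothetical helper: wall-clock seconds taken by the given block.
def time_it
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

# Assumed worst case for scan: every character matches, so it builds a
# 4096-element array, while include? can return at the first character.
sample = "\0" * 4096
iterations = 1_000

scan_time = time_it do
  iterations.times { sample.scan(/\0/).empty? }
end

include_time = time_it do
  iterations.times { sample.include?("\0") }
end

puts format("scan(/\\0/).empty?: %.4fs", scan_time)
puts format("include?(\"\\0\"):    %.4fs", include_time)
```

For a sample with no NUL byte at all, both approaches have to read the full 4096 characters, so the gap is expected to be smaller in that case.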
