str.count("\n") is 1.3-170 times faster than str.lines.count or str.each_line.count depending on the string size #220

Open
@ilyazub

str.count("\n") is 1.3-170 times faster than str.lines.count or str.each_line.count, depending on the string size (ref: https://serpapi.com/blog/lines-count-failed-deployments/). The speed difference grows with the number of lines.
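One thing to keep in mind when swapping these calls: they only return the same number when the string is empty or ends with a newline. A minimal sketch of the difference follows; the helper names `newline_count` and `line_count` are made up for illustration and are not part of the benchmark.

```ruby
# Illustrative helpers (hypothetical names, not from the benchmark).

# Counts newline characters without building per-line strings.
def newline_count(str)
  str.count("\n")
end

# Splits the string into an array of lines, then counts the elements.
def line_count(str)
  str.lines.count
end

with_trailing    = "a\nb\n"
without_trailing = "a\nb"

puts newline_count(with_trailing)     # 2
puts line_count(with_trailing)        # 2
puts newline_count(without_trailing)  # 1
puts line_count(without_trailing)     # 2 (the last line has no "\n")
```

The benchmark string used in this issue ends with "\n", so all variants report the same count there.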

$ ruby tmp/string_count_benchmark.rb
Warming up --------------------------------------
  String#count('\n')    86.000  i/100ms
   String#lines.size     1.000  i/100ms
  String#lines.count     1.000  i/100ms
String#each_line.count
                         1.000  i/100ms
Calculating -------------------------------------
  String#count('\n')    771.031  (± 6.6%) i/s -      3.870k in   5.041849s
   String#lines.size      4.785  (± 0.0%) i/s -     24.000  in   5.037242s
  String#lines.count      4.513  (± 0.0%) i/s -     23.000  in   5.112095s
String#each_line.count
                          4.763  (± 0.0%) i/s -     24.000  in   5.075882s

Comparison:
  String#count('\n'):      771.0 i/s
   String#lines.size:        4.8 i/s - 161.12x  (± 0.00) slower
String#each_line.count:        4.8 i/s - 161.87x  (± 0.00) slower
  String#lines.count:        4.5 i/s - 170.86x  (± 0.00) slower

Benchmark code:

require "benchmark/ips"

HTML = "\nruby\n" * 1024 * 1024

def fastest
  HTML.count("\n")
end

def faster
  HTML.lines.size
end

def fast
  HTML.lines.count
end

def slow
  HTML.each_line.count
end

Benchmark.ips do |x|
  x.report("String#count('\\n')")     { fastest }
  x.report("String#lines.size")       { faster  }
  x.report("String#lines.count")      { fast    }
  x.report("String#each_line.count")  { slow    }
  x.compare!
end
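To see how the gap changes with string size, a rough sketch using only the standard library's Benchmark module could look like the following. The sizes in `SIZES` and the helper `time_line_counts` are assumptions picked for illustration; single-shot timings like these are noisy, so benchmark-ips remains the better tool for real measurements.

```ruby
require "benchmark"

# Hypothetical repeat counts, chosen only for illustration.
SIZES = [1, 1024, 64 * 1024]

# Times one run of each line-counting approach on a string built
# by repeating "\nruby\n" the given number of times.
def time_line_counts(repeats)
  str = "\nruby\n" * repeats
  {
    count:     Benchmark.realtime { str.count("\n") },
    lines:     Benchmark.realtime { str.lines.count },
    each_line: Benchmark.realtime { str.each_line.count },
  }
end

SIZES.each do |n|
  t = time_line_counts(n)
  puts format(
    "%8d repeats: count=%.6fs lines.count=%.6fs each_line.count=%.6fs",
    n, t[:count], t[:lines], t[:each_line]
  )
end
```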

I'd like to add this benchmark to fast-ruby. What do you think?


Based on our updates to @guilhermesimoes' very helpful gist: https://gist.github.com/guilhermesimoes/d69e547884e556c3dc95?permalink_comment_id=4687645#gistcomment-4687645
