JetBrains-Research/HumanEval-Nagini

HumanEval-Nagini

Examples from HumanEval translated to Nagini. Nagini is a verification framework built on top of the Python language; its purpose is to prove functional properties of programs.

We added such properties to the translated programs in Bench. The different parts of each program's proof and executable code are also marked there, so that one can, for example, remove all invariants from the code and evaluate one's invariant-generation algorithm on this benchmark. We distinguish the following parts of the code: invariants, function preconditions, function postconditions, function implementations, and pure functions.
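As a rough illustration of how the marked parts can be used, the sketch below removes every region of one kind from a program. The marker format (`# invariants-start` / `# invariants-end` and so on) and the sample program are assumptions made for this example; check the files in Bench for the markers actually used.

```python
# Sample program held as a string; the marker comments are an ASSUMED format.
SAMPLE = '''\
from nagini_contracts.contracts import *

def sum_to_n(n: int) -> int:
    # pre-conditions-start
    Requires(n >= 0)
    # pre-conditions-end
    # post-conditions-start
    Ensures(Result() == n * (n + 1) // 2)
    # post-conditions-end
    # impl-start
    total = 0
    i = 0
    while i < n:
        # invariants-start
        Invariant(0 <= i and i <= n)
        Invariant(total == i * (i + 1) // 2)
        # invariants-end
        i += 1
        total += i
    return total
    # impl-end
'''


def strip_section(source: str, name: str) -> str:
    """Drop every '# <name>-start' ... '# <name>-end' region, markers included."""
    start = f"# {name}-start"
    end = f"# {name}-end"
    kept = []
    skipping = False
    for line in source.splitlines():
        stripped = line.strip()
        if stripped == start:
            skipping = True   # begin dropping lines
            continue
        if stripped == end:
            skipping = False  # resume keeping lines
            continue
        if not skipping:
            kept.append(line)
    return "\n".join(kept) + "\n"


if __name__ == "__main__":
    # Print the program with its loop invariants removed.
    print(strip_section(SAMPLE, "invariants"))
```

The output keeps the preconditions, postconditions, and implementation, so an invariant-generation tool can be asked to fill in the missing invariants and the result can be handed back to the verifier.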

In src we keep a copy of the Nagini verifier's code so that we can use it in our CI infrastructure. The CI checks that every program in Bench verifies within a time limit of 600 seconds. Currently the CI is failing because Nagini's results are inconsistent between runs and the same task sometimes needs more time to verify.
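A time-limited check of this kind could be sketched as follows. This is not the repository's CI script: the helper name `run_with_time_limit` is hypothetical, and treating exit code 0 as "verified" is an assumption about the verifier's command-line behavior.

```python
import subprocess
import sys
from typing import Sequence


def run_with_time_limit(cmd: Sequence[str], limit_seconds: float) -> str:
    """Run one command and classify it as 'verified', 'failed', or 'timeout'.

    ASSUMPTION: the command exits with code 0 on successful verification.
    """
    try:
        proc = subprocess.run(
            list(cmd),
            capture_output=True,
            text=True,
            timeout=limit_seconds,  # the child is killed when the limit is hit
        )
    except subprocess.TimeoutExpired:
        return "timeout"
    return "verified" if proc.returncode == 0 else "failed"


if __name__ == "__main__":
    # Stand-in command for demonstration; a real run would pass the verifier
    # command and a program from Bench, with limit_seconds=600.
    print(run_with_time_limit([sys.executable, "-c", "pass"], 600))
```

Reporting a timeout separately from a failed verification makes it easier to tell a slow run apart from a genuinely rejected proof, which matters when results vary between runs.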

In WIP you can find examples from HumanEval that are in the process of being translated.

There is also a similar benchmark of HumanEval translated to Dafny.

Current status:

  • 0. has_close_elements
  • 1. separate_paren_groups
  • 2. truncate
  • 3. below_zero
  • 4. mean_absolute_derivation
  • 5. intersperse
  • 6. parse_nested_parens
  • 7. filter_by_substring
  • 8. sum_product
  • 9. rolling_max
  • 10. is_palindrome
  • 11. string_xor
  • 12. longest
  • 13. greatest_common_divisor
  • 14. all_prefixes
  • 15. string_sequence
  • 16. count_distinct_characters
  • 17. parse_music
  • 18. how_many_times
  • 19. sort_numbers
  • 20. find_closest_elements
  • 21. rescale_to_unit
  • 22. filter_integers
  • 23. strlen
  • 24. largest_divisor
  • 25. factorize
  • 26. remove_duplicates
  • 27. flip_case
  • 28. concatenate
  • 29. filter_by_prefix
  • 30. get_positive
  • 31. is_prime
  • 32. poly
  • 33. sort_third
  • 34. unique
  • 35. max_element
  • 36. fizz_buzz
  • 37. sort_even
  • 38. encode_cyclic
  • 39. prime_fib
  • 40. triples_sum_to_zero
  • 41. car_race_collision
  • 42. incr_list
  • 43. pairs_sum_to_zero
  • 44. change_base
  • 45. triangle_area
  • 46. fib4
  • 47. median
  • 48. is_palindrome
  • 49. modp
  • 50. encode_shift
  • 51. remove_vowels
  • 52. below_threshold
  • 53. add
  • 54. same_chars
  • 55. fib
  • 56. correct_bracketing
  • 57. monotonic
  • 58. common
  • 59. largest_prime_factor
  • 60. sum_to_n
  • 61. correct_bracketing
  • 62. derivative
  • 63. fibfib
  • 64. vowels_count
  • 65. circular_shift
  • 66. digitSum
  • 67. fruit_distribution
  • 68. pluck
  • 69. search
  • 70. strange_sort_list
  • 71. triangle_area
  • 72. will_it_fly
  • 73. smallest_change
  • 74. total_match
  • 75. is_multiply_prime
  • 76. is_simple_power
  • 77. iscube
  • 78. hex_key
  • 79. decimal_to_binary
  • 80. is_happy
  • 81. numerical_letter_grade
  • 82. prime_length
  • 83. starts_one_ends
  • 84. solve
  • 85. add
  • 86. anti_shuffle
  • 87. get_row
  • 88. sort_array
  • 89. encrypt
  • 90. next_smallest
  • 91. is_bored
  • 92. any_int
  • 93. encode
  • 94. skjkasdkd
  • 95. check_dict_case
  • 96. count_up_to
  • 97. multiply
  • 98. count_upper
  • 99. closest_integer
  • 100. make_a_pile
  • 101. words_string
  • 102. choose_num
  • 103. rounded_avg
  • 104. unique_digits
  • 105. by_length
  • 106. f
  • 107. even_odd_palindrome
  • 108. count_nums
  • 109. move_one_ball
  • 110. exchange
  • 111. histogram
  • 112. reverse_delete
  • 113. odd_count
  • 114. minSubArraySum
  • 115. max_fill
  • 116. sort_array
  • 117. select_words
  • 118. get_closest_vowel
  • 119. match_parens
  • 120. maximum
  • 121. solution
  • 122. add_elements
  • 123. get_odd_collatz
  • 124. valid_date
  • 125. split_words
  • 126. is_sorted
  • 127. intersection
  • 128. prod_signs
  • 129. minPath
  • 130. tri
  • 131. digits
  • 132. is_nested
  • 133. sum_squares
  • 134. check_if_last_char_is_a_letter
  • 135. can_arrange
  • 136. largest_smallest_integers
  • 137. compare_one
  • 138. is_equal_to_sum_even
  • 139. special_factorial
  • 140. fix_spaces
  • 141. file_name_check
  • 142. sum_squares
  • 143. words_in_sentence
  • 144. simplify
  • 145. order_by_points
  • 146. specialFilter
  • 147. get_max_triples
  • 148. bf
  • 149. sorted_list_sum
  • 150. x_or_y
  • 151. double_the_difference
  • 152. compare
  • 153. Strongest_Extension
  • 154. cycpattern_check
  • 155. even_odd_count
  • 156. int_to_mini_roman
  • 157. right_angle_triangle
  • 158. find_max
  • 159. eat
  • 160. do_algebra
  • 161. solve
  • 162. string_to_md5
  • 163. generate_integers
