Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
22 commits
Select commit Hold shift + click to select a range
0dadd45
Change iNat import form to use textarea for ID list
JoeCohen Jan 13, 2026
b23ec37
Change caption to allow up to "900" ids.
JoeCohen Jan 13, 2026
549431a
Increase iNat import ID list limit to 9,984 characters
JoeCohen Jan 13, 2026
df5eccd
Change InatImports inat_ids column from string to text
JoeCohen Jan 14, 2026
04012cd
Clarify test failure message
JoeCohen Jan 14, 2026
bbc6a18
Add coment re iNat "Mushroom Observer URL" observation field
JoeCohen Jan 14, 2026
61b64a7
Delete unmirrored? validation
JoeCohen Jan 14, 2026
db85886
Continue after warn about listed prior import
JoeCohen Jan 14, 2026
a284c73
Change max ids to import to ~ 384
JoeCohen Jan 15, 2026
d987ac3
Fix test failure from prior commit
JoeCohen Jan 15, 2026
6bd356c
Fix rubocop offense
JoeCohen Jan 15, 2026
af6cd87
Clean trailing whitespace and commas from inat_id list
JoeCohen Jan 15, 2026
32ce1ed
Fix test fialure from last commit
JoeCohen Jan 15, 2026
967a64d
Hack to fix bug where tracker displayed wrong total results.
JoeCohen Jan 15, 2026
3981036
Prevent cookie overflow due to overlong flash
JoeCohen Jan 15, 2026
cec05fe
Update test/controllers/inat_imports_controller_test.rb
JoeCohen Jan 16, 2026
409e6ca
Update test/controllers/inat_imports_controller_test.rb
JoeCohen Jan 16, 2026
e1cac8c
Update app/jobs/inat_import_job.rb
JoeCohen Jan 16, 2026
4bea811
Update app/classes/inat/page_parser.rb
JoeCohen Jan 16, 2026
54b9983
Update app/controllers/inat_imports_controller/validators.rb
JoeCohen Jan 16, 2026
1b2276f
Update test/controllers/inat_imports_controller_test.rb
JoeCohen Jan 16, 2026
2be9712
Reverse ineffective commit
JoeCohen Jan 16, 2026
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 4 additions & 1 deletion app/classes/inat/page_parser.rb
Original file line number Diff line number Diff line change
Expand Up @@ -58,7 +58,10 @@ def next_request(**args)
user_login: @import.inat_username,
# only fungi and slime molds
iconic_taxa: ICONIC_TAXA,
# and which haven't been exported from or inported to MO
# and which haven't been exported from or imported to MO
# This field was written by iNat's defunct Import from MO feature
# is written by Pulk's mirror script, and by
# ObservationImporter#update_mushroom_observer_url_field
without_field: "Mushroom Observer URL"
}.merge(args)
# super_importers can import observations of other users.
Expand Down
32 changes: 30 additions & 2 deletions app/controllers/inat_imports_controller.rb
Original file line number Diff line number Diff line change
Expand Up @@ -82,6 +82,7 @@
def create
return reload_form unless params_valid?

warn_about_listed_previous_imports
assure_user_has_inat_import_api_key
init_ivars
request_inat_user_authorization
Expand All @@ -92,11 +93,22 @@
private

def reload_form
@inat_ids = params[:inat_ids]
# clean trailing commas and whitespace
@inat_ids = params[:inat_ids].sub(/[,\s]+\z/, "")
@inat_username = params[:inat_username]
render(:new)
end

# Were any listed iNat IDs previously imported?
def warn_about_listed_previous_imports
return if importing_all? || !listing_ids?

previous_imports = Observation.where(inat_id: inat_id_list)
return if previous_imports.none?

flash_warning(:inat_previous_import.t(count: previous_imports.count))
end

def assure_user_has_inat_import_api_key
key = APIKey.find_by(user: @user, notes: MO_API_KEY_NOTES)
key = APIKey.create(user: @user, notes: MO_API_KEY_NOTES) if key.nil?
Expand All @@ -111,8 +123,8 @@
importables: importables_count,
imported_count: 0,
avg_import_time: @inat_import.initial_avg_import_seconds,
inat_ids: params[:inat_ids],
inat_username: params[:inat_username].strip,
inat_ids: clean_inat_ids,
response_errors: "",
token: "",
log: [],
Expand All @@ -131,6 +143,22 @@
params[:inat_ids].split(",").length
end

def clean_inat_ids
# clean trailing commas and whitespace
inat_ids = params[:inat_ids]&.sub(/[,\s]+\z/, "")

Check failure

Code scanning / CodeQL

Polynomial regular expression used on uncontrolled data High

This
regular expression
that depends on a
user-provided value
may run slow on strings with many repetitions of '\t'.
This
regular expression
that depends on a
user-provided value
may run slow on strings with many repetitions of '\t'.
Copy link
Member

@mo-nathan mo-nathan Jan 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is low enough of a risk that it is not a blocker. However, I got this from CC:

⏺ Summary

Confirmed: Line 148 (and more critically line 97) use uncontrolled user input in regex operations, triggering a security warning.

Actual risk: Low

  • The regex is not vulnerable to ReDoS
  • Validation exists but happens after line 97
  • Line 148 is called after validation passes

Recommended fix: Extract sanitization to a helper method that validates before using the regex:

  def sanitize_inat_ids(ids)
    return "" if ids.blank?
    return "" if /[^\d ,]/.match?(ids)  # Explicit validation
    ids.sub(/[,\s]+\z/, "")
  end

Then use sanitize_inat_ids(params[:inat_ids]) in both reload_form (line 97) and clean_inat_ids (line 148).

This satisfies the security scanner by ensuring validation happens at point-of-use, eliminates code duplication, and follows the principle of defense-in-depth.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mo-nathan. 1. Thanks for the review.
2. The CC suggestion appears to be delusional.
But I don't understand how it solves the problem. Bad user input will still fall through to the 3rd line.
And the second line might make things worse. inat_ids should not end up having a number followed by a space followed by a comma.

previous_imports = Observation.where(inat_id: inat_id_list)
return inat_ids if previous_imports.none?

# remove previously imported ids
# just in case the iNat user deleted the Mushroom_Observer_URL field
# NOTE: Also useful in manual testing when writes of iNat obss are
# commented out temporarily. jdc 2026-01-15
previous_ids = previous_imports.pluck(:inat_id).map(&:to_s)
remaining_ids =
inat_ids.split(",").map(&:strip).reject { |id| previous_ids.include?(id) }
remaining_ids.join(",")
end

def request_inat_user_authorization
redirect_to(INAT_AUTHORIZATION_URL, allow_other_host: true)
end
Expand Down
50 changes: 10 additions & 40 deletions app/controllers/inat_imports_controller/validators.rb
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,13 @@
module InatImportsController::Validators
SITE = "https://www.inaturalist.org"

# Maximum size of id list param, based on
# Puma max query string size (1024 * 10)
# MAX_COOKIE_SIZE
# - ~256 to allow for other stuff
MAX_ID_LIST_SIZE =
[1024 * 10, ActionDispatch::Cookies::MAX_COOKIE_SIZE].min - 256

private

def params_valid?
Expand All @@ -23,8 +30,6 @@ def imports_valid?
imports_unambiguously_designated? &&
valid_inat_ids_param? &&
list_within_size_limits? &&
fresh_import? &&
unmirrored? &&
not_importing_all_anothers?
end

Expand Down Expand Up @@ -58,51 +63,16 @@ def contains_illegal_characters?

def list_within_size_limits?
return true if importing_all? || # ignore list size if importing all
params[:inat_ids].length <= 255
params[:inat_ids].length <= MAX_ID_LIST_SIZE

flash_warning(:inat_too_many_ids_listed.t)
false
end

# Are the listed iNat IDs fresh (i.e., not already imported)?
def fresh_import?
return true if importing_all?

previous_imports = Observation.where(inat_id: inat_id_list)
return true if previous_imports.none?

previous_imports.each do |import|
flash_warning(:inat_previous_import.t(inat_id: import.inat_id,
mo_obs_id: import.id))
end
false
end

def inat_id_list
params[:inat_ids].delete(" ").split(",").map(&:to_i)
end
return [] unless listing_ids?

def unmirrored?
return true if importing_all? # cannot test check this if importing all

conditions = inat_id_list.map do |inat_id|
Observation[:notes].matches("%Mirrored on iNaturalist as <a href=\"https://www.inaturalist.org/observations/#{inat_id}\">%")
end
previously_mirrored = Observation.where(conditions.inject(:or)).to_a
return true if previously_mirrored.blank?

previously_mirrored.each do |obs|
flash_warning(:inat_previous_mirror.t(inat_id: mirrored_inat_id(obs),
mo_obs_id: obs.id))
end
false
end

# When Pulk's `mirror`Python script copies an MO Obs to iNat,
# it adds a link to the iNat obs in the MO Observation notes
def mirrored_inat_id(obs)
match = %r{#{SITE}/observations/(?'inat_id'\d+)}o.match(obs.notes.to_s)
match[:inat_id]
params[:inat_ids].delete(" ").split(",").map(&:to_i)
end

# Block importing **all** of another user's iNat observations
Expand Down
12 changes: 9 additions & 3 deletions app/jobs/inat_import_job.rb
Original file line number Diff line number Diff line change
Expand Up @@ -109,14 +109,20 @@ def parsing_should_stop?(parsed_page)
end

def import_parsed_page_of_observations(parsed_page)
log("Got parsed page with iNat " \
"#{parsed_page["results"].first["id"]}-" \
"#{parsed_page["results"].last["id"]}")
log_new_page(parsed_page)
inat_import.update(importables: parsed_page["total_results"])
observation_importer.import_page(parsed_page)
log("Finished importing observations on parsed page")
end

def log_new_page(parsed_page)
log("Got parsed page with iNat " \
"#{parsed_page["results"].first["id"]}-" \
"#{parsed_page["results"].last["id"]}")
log("Results on this page: #{parsed_page["results"].size}")
log("Total results: #{parsed_page["total_results"]}")
end

def more_pages?(parsed_page)
parsed_page["total_results"] > parsed_page["page"] * parsed_page["per_page"]
end
Expand Down
4 changes: 2 additions & 2 deletions app/views/controllers/inat_imports/new.html.erb
Original file line number Diff line number Diff line change
Expand Up @@ -14,8 +14,8 @@
<%= panel.with_heading { :inat_choose_observations.l } %>
<%= panel.with_body do %>
<%= f.label(:inat_id, "#{:inat_import_list.l}: ") %>
<%= f.text_field(:inat_ids, size: 10, value: @inat_ids,
class: "form-control") %>
<%= f.text_area(:inat_ids, value: @inat_ids,
class: "form-control") %>
<%= check_box_with_label(form: f, field: :all,
class: "mt-3", label: :inat_import_all.l) %>
<% end %>
Expand Down
5 changes: 2 additions & 3 deletions config/locales/en.txt
Original file line number Diff line number Diff line change
Expand Up @@ -3098,7 +3098,7 @@
# observation/inat_imports
inat_import_create_title: Import from iNat
inat_choose_observations: "Inat Observations to Import:"
inat_import_list: "Comma separated list of up to 10 observation #s"
inat_import_list: "Comma separated list of up to about 384 observation #s"
inat_import_all: Import all my iNat observations instead of a list
inat_import_consent: "(Required) I consent to creating MushroomObserver Observation(s) based on my iNaturalist observations, including their photos."
inat_consent_required: Your consent is required to import Inat observations. Please check the checkbox to give your consent.
Expand All @@ -3107,8 +3107,7 @@
inat_missing_username: Missing your iNat Username
inat_no_imports_designated: You must list the iNat observation(s) to be imported
inat_list_xor_all: "Either (a) list the iNat observation(s), or (b) check \"Import all ...\" (not both)"
inat_previous_import: iNat [inat_id] previously imported as MO Observation [mo_obs_id]
inat_previous_mirror: iNat [inat_id] is a "mirror" of existing MO Observation [mo_obs_id]
inat_previous_import: "[count] listed ids previously imported and will not be re-imported"
inat_too_many_ids_listed: Too many ids listed
inat_importing_all_anothers: You cannot import all observations of another person
inat_observed_missing_date: Observed Missing Date
Expand Down
2 changes: 1 addition & 1 deletion db/cache_schema.rb
Original file line number Diff line number Diff line change
Expand Up @@ -11,7 +11,7 @@
# It's strongly recommended that you check this file into your version control system.

ActiveRecord::Schema[7.2].define(version: 2024_02_03_041048) do
create_table "solid_cache_entries", charset: "utf8mb3", force: :cascade do |t|
create_table "solid_cache_entries", charset: "utf8mb4", collation: "utf8mb4_0900_ai_ci", force: :cascade do |t|
t.binary "key", limit: 1024, null: false
t.binary "value", size: :long, null: false
t.datetime "created_at", null: false
Expand Down
9 changes: 9 additions & 0 deletions db/migrate/20260114002432_change_inat_ids_to_text.rb
Original file line number Diff line number Diff line change
@@ -0,0 +1,9 @@
class ChangeInatIdsToText < ActiveRecord::Migration[7.2]
def up
change_column(:inat_imports, :inat_ids, :text)
end

def down
change_column(:inat_imports, :inat_ids, :string)
end
end
4 changes: 2 additions & 2 deletions db/schema.rb
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,7 @@
#
# It's strongly recommended that you check this file into your version control system.

ActiveRecord::Schema[7.2].define(version: 2026_01_01_095531) do
ActiveRecord::Schema[7.2].define(version: 2026_01_14_002432) do
create_table "api_keys", id: :integer, charset: "utf8mb3", force: :cascade do |t|
t.datetime "created_at", precision: nil
t.datetime "last_used", precision: nil
Expand Down Expand Up @@ -209,7 +209,7 @@
create_table "inat_imports", charset: "utf8mb4", collation: "utf8mb4_0900_ai_ci", force: :cascade do |t|
t.integer "user_id"
t.integer "state", default: 0
t.string "inat_ids"
t.text "inat_ids"
t.datetime "created_at", null: false
t.datetime "updated_at", null: false
t.string "token"
Expand Down
106 changes: 64 additions & 42 deletions test/controllers/inat_imports_controller_test.rb
Original file line number Diff line number Diff line change
Expand Up @@ -35,8 +35,8 @@ def test_new_inat_import

assert_response(:success)
assert_form_action(action: :create)
assert_select("input#inat_ids", true,
"Form needs a field for inputting iNat ids")
assert_select("textarea#inat_ids", true,
"Form needs a textarea for inputting iNat ids")
assert_select("input#inat_username", true,
"Form needs a field for inputting iNat username")
assert_select("input[type=checkbox][id=consent]", true,
Expand Down Expand Up @@ -152,11 +152,66 @@ def test_create_no_consent
assert_flash_text(:inat_consent_required.l)
end

def test_allows_maximum_ids
user = users(:rolf)
inat_username = "rolf" # use different inat_username to test if it's updated
inat_import = inat_imports(:rolf_inat_import)
assert_equal("Unstarted", inat_import.state,
"Need a Unstarted inat_import fixture")
assert_equal(0, inat_import.total_imported_count.to_i,
"Test needs InatImport fixture without prior imports")

id = 1_234_567_890
reps = InatImportsController::Validators::MAX_ID_LIST_SIZE / id.to_s.length
id_list = (id.to_s * reps).chop

stub_request(:any, authorization_url)
login(user.login)

assert_no_difference(
"Observation.count",
"Authorization request to iNat shouldn't create MO Observation(s)"
) do
post(:create,
params: { inat_ids: id_list, inat_username: inat_username,
consent: 1 })
end

assert_response(:redirect)
assert_equal(id_list, inat_import.reload.inat_ids,
"Failed to save inat_ids at maximum length")
end

def test_strips_trailing_commas_and_space_chars_from_id_list
inat_import = inat_imports(:rolf_inat_import)
user = inat_import.user
assert_equal("Unstarted", inat_import.state,
"Need a Unstarted inat_import fixture")
id_list = "123,456,789, \n"
expected_saved_id_list = "123,456,789"
stub_request(:any, authorization_url)
login(user.login)

post(:create,
params: { inat_ids: id_list,
inat_username: "", # omit this to force form reload
consent: 1 })

assert_form_action(action: :create)
assert_select(
"textarea#inat_ids", { text: expected_saved_id_list, count: 1 },
"inat_ids textarea should have trailing commas and whitespace stripped"
)
end

def test_create_too_many_ids_listed
# generate an id list that's barely too long
id_list = ""
id = 1_234_567_890
id_list += "#{id += 1}," until id_list.length > 255
until id_list.length > InatImportsController::Validators::MAX_ID_LIST_SIZE
id_list += "#{id += 1},"
end

params = { inat_username: "anything", inat_ids: id_list, consent: 1 }

login
Expand Down Expand Up @@ -185,44 +240,11 @@ def test_create_previously_imported
post(:create, params: params)
end

# NOTE: 2024-09-04 jdc
# I'd prefer that the flash include links to both obss,
# and that this (or another) assertion check for that.
# At the moment, it's taking too long to figure out how.
assert_flash_text(/iNat #{inat_id} previously imported/)
end

def test_create_previously_mirrored
user = users(:rolf)
inat_id = "1234567"
mirrored_obs = Observation.create(
where: "North Falmouth, Massachusetts, USA",
user: user,
when: "2023-09-08",
inat_id: nil,
# When Pulk's `mirror`Python script copies an MO Obs to iNat,
# it adds a text in this form to the MO Obs notes
# See https://github.com/JacobPulk/mirror
notes: { Other: "Mirrored on iNaturalist as <a href=\"https://www.inaturalist.org/observations/#{inat_id}\">observation #{inat_id}</a> on December 18, 2023" }
)
params = { inat_username: "anything", inat_ids: inat_id, consent: 1 }

login
assert_no_difference(
"Observation.count",
"Imported an iNat obs which had been 'mirrored' from MO"
) do
post(:create, params: params)
end

# NOTE: 2024-09-04 jdc
# I'd prefer that the flash include links to both obss,
# and that this (or another) assertion check for that.
# At the moment, it's taking too long to figure out how.
assert_flash_text(
"iNat #{inat_id} is a &#8220;mirror&#8221; of " \
"existing MO Observation #{mirrored_obs.id}"
)
assert_flash_text(/#{:inat_previous_import.l(count: 1)}/)
# It should continue even if some ids were previously imported
# The job will exclude previous imports via the iNat API
# `without_field: "Mushroom Observer URL"` param.
assert_redirected_to(INAT_AUTHORIZATION_URL)
end

def test_create_strip_inat_username
Expand Down Expand Up @@ -280,7 +302,7 @@ def test_create_authorization_request
consent: 1 })
end

assert_response(:redirect)
assert_redirected_to(INAT_AUTHORIZATION_URL)
assert_equal(inat_ids.split(",").length, inat_import.reload.importables,
"Failed to save InatImport.importables")
assert_equal("Authorizing", inat_import.reload.state,
Expand Down
2 changes: 1 addition & 1 deletion test/jobs/inat_import_job_test.rb
Original file line number Diff line number Diff line change
Expand Up @@ -946,7 +946,7 @@ def test_inat_api_request_post_observation_field_response_error

# The observation should have been destroyed after the error
assert_equal(obs_count_before, Observation.count,
"Observation should be destroyed after error")
"MO Observation should be destroyed after iNat API error")

errors = JSON.parse(@inat_import.response_errors, symbolize_names: true)
assert_equal(status, errors[:error], "Incorrect error status")
Expand Down