Generating piece url based on curio multiaddr and contract mapper #22

Merged
lukasz-wal merged 12 commits into main from feature/curio-or-boost on Nov 17, 2025

@lukasz-wal
Collaborator

No description provided.

let (_, retrievability_percent) = match timeout(
Duration::from_secs(RETRIEVABILITY_TIMEOUT_SEC),
- url_tester::get_retrivability_with_head(urls),
+ url_tester::get_retrivability_with_get(urls),

Collaborator

Suggested change
- url_tester::get_retrivability_with_get(urls),
+ url_tester::get_retrievability_with_get(urls),

Collaborator Author

done

Comment on lines +92 to +97
let final_addr = if trimmed.ends_with("/https") {
trimmed.replace("/https", "/tcp/443/https")
} else if trimmed.ends_with("/http") {
trimmed.replace("/http", "/tcp/80/http")
} else {
trimmed.to_string()

Collaborator

We shouldn't work on this as a string; we should parse it with a proper multiaddr lib.

@CodeWarriorr did we try any? Were there any issues?

Collaborator Author

@lukasz-wal Oct 27, 2025

This is an example of a "multiaddr" from curio:
/dns/isabella.hegsx.com/https/http-path/%2Fipni-provider%2F12D3KooWSbAPqdYw1MGrQSV8LgjZDYBvha8bQ2YtHUCzRYyqLCzy
When we try to parse the original address with the lib, it fails with UnknownProtocolString("http-path").
After removing the unknown parts, our internal parser still needs a port:
Failed to convert multiaddr: "/dns/isabella.hegsx.com/https" to URL: "Missing port"
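
The two failures above (an unknown `http-path` component, then a missing port) can be illustrated with a minimal std-only sketch. It is not the parser used in this PR: `curio_multiaddr_to_url` is a hypothetical name, the set of accepted protocols is an assumption, and the `http-path` value is simply dropped, mirroring the "removing unknown parts" step described in the comment. When no `/tcp/<port>` component is present, the port is defaulted from the scheme.

```rust
/// Hypothetical helper: turn a curio-style multiaddr into a base URL.
/// Returns `None` for anything this sketch does not understand.
fn curio_multiaddr_to_url(addr: &str) -> Option<String> {
    // Multiaddrs are `/`-separated; skip the empty piece before the leading slash.
    let mut parts = addr.trim().split('/').filter(|p| !p.is_empty());
    let mut host: Option<&str> = None;
    let mut port: Option<u16> = None;
    let mut scheme: Option<&str> = None;

    while let Some(proto) = parts.next() {
        match proto {
            // Protocols that carry a host value.
            "dns" | "dns4" | "dns6" | "ip4" => host = Some(parts.next()?),
            // An explicit port, if the provider supplied one.
            "tcp" => port = Some(parts.next()?.parse::<u16>().ok()?),
            // Protocols without a value.
            "http" | "https" => scheme = Some(proto),
            // Consume and ignore the percent-encoded path value.
            "http-path" => {
                parts.next()?;
            }
            // Assumption: reject everything else instead of guessing.
            _ => return None,
        }
    }

    let (scheme, host) = (scheme?, host?);
    // Default the port from the scheme when `/tcp/<port>` is absent.
    let port = port.unwrap_or(if scheme == "https" { 443 } else { 80 });
    Some(format!("{scheme}://{host}:{port}"))
}
```

With the example address from this comment, the sketch yields `https://isabella.hegsx.com:443`; an address that already carries `/tcp/8443` keeps that port.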

}

pub async fn valid_curio_provider(address: &str) -> Result<Option<String>> {
let rpc_url = "https://api.node.glif.io/rpc/v1";

Collaborator

Should be an env variable.

Collaborator Author

done
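
One way to act on this request, sketched with the standard library only. The variable name `GLIF_RPC_URL`, the helper names, and the decision to fall back to the previously hard-coded URL are all assumptions for illustration; the thread does not show how the merged code reads its configuration.

```rust
use std::env;

/// Fallback endpoint: the URL that was hard-coded in the reviewed diff.
const DEFAULT_RPC_URL: &str = "https://api.node.glif.io/rpc/v1";

/// Use the configured value when it is present and non-blank,
/// otherwise fall back to the default.
fn resolve_rpc_url(configured: Option<String>) -> String {
    configured
        .map(|value| value.trim().to_string())
        .filter(|value| !value.is_empty())
        .unwrap_or_else(|| DEFAULT_RPC_URL.to_string())
}

/// Read the hypothetical `GLIF_RPC_URL` variable from the process environment.
fn rpc_url_from_env() -> String {
    resolve_rpc_url(env::var("GLIF_RPC_URL").ok())
}
```

Keeping the lookup (`rpc_url_from_env`) separate from the decision (`resolve_rpc_url`) lets the fallback logic be tested without touching the process environment.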

Comment on lines +37 to +40
let contract_address: &str = "0x14183aD016Ddc83D638425D6328009aa390339Ce";

let miner_peer_id_contract = Address::parse_checksummed(contract_address, None)
.map_err(|e| eyre!("Failed to parse miner id contact: {e}"))?;

Collaborator

Use https://docs.rs/alloy-primitives/latest/alloy_primitives/macro.address.html instead (that way it will verify that the address is correct at build time).

Collaborator Author

done

let rpc_provider = ProviderBuilder::new()
.connect(rpc_url)
.await
.map_err(|err| eyre!("Building provider failed: {}", err))?;

Collaborator

Is this map_err needed here? It doesn't seem to add anything valuable, and err already implements std::error::Error.

Collaborator Author

done
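
The point behind this comment is that `?` already converts any error implementing `std::error::Error` into a catch-all error type through `From`, so a `map_err` that only re-wraps the message adds little. The sketch below shows the mechanism with `Box<dyn Error + Send + Sync>` as a std-only stand-in; `eyre::Report` should behave the same way for errors that are `Send + Sync + 'static`. `parse_port` and `load_port` are hypothetical names, not code from this PR.

```rust
use std::error::Error;
use std::num::ParseIntError;

/// Stand-in for a fallible library call whose error implements `std::error::Error`.
fn parse_port(raw: &str) -> Result<u16, ParseIntError> {
    raw.parse::<u16>()
}

/// `?` converts `ParseIntError` into the boxed error via `From`,
/// so no explicit `map_err` is required.
fn load_port(raw: &str) -> Result<u16, Box<dyn Error + Send + Sync>> {
    let port = parse_port(raw)?;
    Ok(port)
}
```

A `map_err` is still worth keeping where it adds context the original error lacks, such as which address or URL was being processed.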

.map_err(|e| eyre!("Transaction failed: {e}"))?;

let peer_data: PeerData =
PeerData::abi_decode(response.as_ref()).map_err(|e| eyre!("Decode failed: {e}"))?;

Collaborator

Ditto.

Collaborator Author

done

const RETRI_TIMEOUT_SEC: u64 = 15;

/// return first working url through head requests
/// let't keep both head and get versions for now

Collaborator

Suggested change
- /// let't keep both head and get versions for now
+ /// let's keep both head and get versions for now

Collaborator Author

done

Comment on lines +87 to +91
&& matches!(
content_type,
Some("application/octet-stream") | Some("application/piece")
)
&& etag.is_some()

Collaborator

Why check for these headers?

Collaborator Author

Just verifying that the response is a file. I noticed that both curio and boost responses contain an etag, with these content types:
curio - application/octet-stream
boost - application/piece
But generally, yes, it is not necessary :-)

Comment on lines +202 to +237
match client.get(&url).send().await {
Ok(resp) => {
let content_type = resp
.headers()
.get("content-type")
.and_then(|v| v.to_str().ok());
let etag = resp.headers().get("etag");

// check if the response has a content-type:
// * application/octet-stream => curio
// * application/piece => boost
// and an etag
// in header to indicating a file
if resp.status().is_success()
&& matches!(
content_type,
Some("application/octet-stream") | Some("application/piece")
)
&& etag.is_some()
{
tracing::info!("url WORKING: {:?}", url);
success_clone.fetch_add(1, Ordering::SeqCst);
Some(url)
} else {
debug!("Retrivability: URL::GET not working url: {:?}", url);
None
}
}
Err(err) => {
debug!(
"Retrivability: Get request for working url failed for {:?}: {:?}",
url, err
);
None
}
}

Collaborator

This is big and complicated enough to extract and deduplicate with filter_working_with_get

Collaborator Author

done
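
A possible shape for the extracted check, written against plain values so it can be shared by both call sites and tested without an HTTP client. `looks_like_piece` is a hypothetical name, and the thread does not show how the merged code was actually deduplicated. The 2xx range stands in for `resp.status().is_success()`, and the content-type comparison is an exact match, as in the reviewed diff.

```rust
/// Hypothetical shared predicate: does this response look like a piece file?
/// * `application/octet-stream` => curio
/// * `application/piece` => boost
fn looks_like_piece(status: u16, content_type: Option<&str>, has_etag: bool) -> bool {
    // Any 2xx status counts as success.
    let success = (200..300).contains(&status);
    // Exact match only: a value with parameters (e.g. `; charset=...`) is rejected.
    let type_matches = matches!(
        content_type,
        Some("application/octet-stream") | Some("application/piece")
    );
    success && type_matches && has_etag
}
```

A caller would extract the status code, the `content-type` header as a string, and whether `etag` is present from the response, then pass them in.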

}

/// return retrivable percent of the urls
/// let't keep both head and get versions for now

Collaborator

Suggested change
- /// let't keep both head and get versions for now
+ /// let's keep both head and get versions for now

Collaborator Author

done

None => &cleaned,
};

let final_addr = if trimmed.ends_with("/https") {

Collaborator

Is it possible that tcp is already there?

Collaborator Author

No idea; we don't know of anyone who has 'tcp' in a multiaddr from curio.
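
If such addresses do turn up, the port insertion can be made safe to apply either way. The sketch below is an assumption about how that could look, not the merged code; `ensure_tcp_port` is a hypothetical name. It leaves the address untouched when a `tcp` component is already present and uses `strip_suffix` instead of `replace`, since `replace` rewrites every occurrence of `/http`, including one at the start of a host name such as `http.example.com`.

```rust
/// Hypothetical helper: add a default `/tcp/<port>` before a trailing
/// `/http` or `/https`, unless the address already names a TCP port.
fn ensure_tcp_port(addr: &str) -> String {
    let trimmed = addr.trim_end_matches('/');

    // Leave the address alone if a `tcp` component is already present.
    if trimmed.split('/').any(|part| part == "tcp") {
        return trimmed.to_string();
    }

    // Only the final component is rewritten; the host is never touched.
    if let Some(prefix) = trimmed.strip_suffix("/https") {
        format!("{prefix}/tcp/443/https")
    } else if let Some(prefix) = trimmed.strip_suffix("/http") {
        format!("{prefix}/tcp/80/http")
    } else {
        trimmed.to_string()
    }
}
```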

.with_state(app_state.clone());

- let server_addr = SocketAddr::from(([0, 0, 0, 0], 3010));
+ let server_addr = SocketAddr::from(([0, 0, 0, 0], 3030));

Collaborator

This breaks docker-compose.

Collaborator Author

done

chrono = { version = "0.4.38", features = ["serde"] }
regex = "1.11.1"
moka = { version = "0.12.10", default-features = false, features = ["future"] }
alloy = "1.0.41"

Collaborator

alloy has a lot built into its default feature set, which adds to build time.
Given that we use only a small part of it, we should narrow it down.
Something like this should work:

alloy = { version = "1.0.41", default-features = false, features = [
    "sol-types",
    "network",
    "providers",
    "rpc-types-eth",
] }

Collaborator Author

@lukasz-wal Oct 27, 2025

adjusted a little bit and done

@lukasz-wal force-pushed the feature/curio-or-boost branch 2 times, most recently from 68fea97 to a254055 on October 27, 2025 14:55

}
}
Err(e) => {
println!("Failed to parse multiaddr: {:?} due to {:?}", addr, e);

Collaborator

Why do we need a second log, and most importantly, why is it println! instead of tracing::{error,info,debug}?

Collaborator Author

My mistake, it was left over from local testing. Thanks! Fixed.

result: ResultCode::Success,
working_url: first_url,
- retrievability_percent,
+ retrievability_percent: retrievability_percent.unwrap(),

Collaborator

Why do we need unwrap() here? It seems unnecessary and will panic on None.
Maybe use unwrap_or(), or always return a default of 0.0_f64.

Collaborator Author

done
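
The suggested `unwrap_or` fallback can be sketched in isolation. The `Option<f64>` type is an assumption based on the reviewer's mention of an `f64` default, and `percent_or_zero` is a hypothetical name; the thread does not show which variant was merged.

```rust
/// Hypothetical helper: report 0.0 when no retrievability figure is available,
/// instead of panicking on `None` as `unwrap()` would.
fn percent_or_zero(retrievability_percent: Option<f64>) -> f64 {
    retrievability_percent.unwrap_or(0.0)
}
```

One trade-off to keep in mind: a 0.0 default makes "not measured" indistinguishable from "measured and nothing was retrievable", so keeping the field as an `Option` in the response may be preferable where callers need that distinction.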

break;
}
Err(e) => {
println!("Attempt {attempt}/3 failed: {e} for address: {address}");

Collaborator

println!

Collaborator Author

done

@lukasz-wal merged commit c5163e3 into main on Nov 17, 2025
5 checks passed

3 participants