Similarity trait Rust crate

documentation • source • llms.txt • crate • email

The Similarity trait defines one function with one input and one output, so you can compare any kind of input value and return any kind of output value.

We use this trait in our programs to build several kinds of similarity functionality, for example to try out various similarity algorithms that share the same input type and the same output type.
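To illustrate trying several algorithms that share one input type and one output type, here is a minimal sketch. It does not use the crate itself: the trait below is a local stand-in whose shape is inferred from the examples in this README, and the names AbsoluteDifference, SquaredDifference, and compare_with are hypothetical.

```rust
// Local stand-in for the crate's trait. Its shape is an assumption inferred
// from the README examples, not copied from the crate's source.
trait SimilarityIO<Input, Output> {
    fn similarity(input: Input) -> Output;
}

/// Hypothetical algorithm 1: absolute difference of a pair.
struct AbsoluteDifference;

impl SimilarityIO<(f64, f64), f64> for AbsoluteDifference {
    fn similarity(pair: (f64, f64)) -> f64 {
        (pair.1 - pair.0).abs()
    }
}

/// Hypothetical algorithm 2: squared difference of a pair.
struct SquaredDifference;

impl SimilarityIO<(f64, f64), f64> for SquaredDifference {
    fn similarity(pair: (f64, f64)) -> f64 {
        (pair.1 - pair.0).powi(2)
    }
}

/// Hypothetical helper: run any algorithm that accepts the same input type
/// and returns the same output type.
fn compare_with<S: SimilarityIO<(f64, f64), f64>>(pair: (f64, f64)) -> f64 {
    S::similarity(pair)
}

fn main() {
    let pair = (1.0, 4.0);
    assert_eq!(compare_with::<AbsoluteDifference>(pair), 3.0);
    assert_eq!(compare_with::<SquaredDifference>(pair), 9.0);
}
```

Because both structs implement the trait with identical type parameters, calling code can swap algorithms by changing only the type argument.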

For more examples, please see the examples directory.

Similarity of a pair

One way to use this trait is to calculate the similarity of a pair of values, such as two numbers, two strings, or two images.

This is sometimes known as pairwise similarity or pair matching.

Example: given two numbers, return the absolute difference.

use similarity_trait::*;
struct MyStruct;

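impl SimilarityIO<(i32, i32), Option<i32>> for MyStruct {
    /// Similarity of a pair of numbers via absolute difference.
    /// Returns None if the subtraction or the absolute value overflows.
    fn similarity(input: (i32, i32)) -> Option<i32> {
        input.1.checked_sub(input.0)?.checked_abs()
    }
}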

let absolute_difference = MyStruct::similarity((100, 120));
assert_eq!(absolute_difference, Some(20));
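As a variation on the pair example, here is a minimal sketch that returns the percent change between two numbers. The trait is a local stand-in inferred from the examples in this README (an assumption, not the crate's source), and the struct name PercentChange is hypothetical.

```rust
// Local stand-in for the crate's trait (assumed shape, see the lead-in).
trait SimilarityIO<Input, Output> {
    fn similarity(input: Input) -> Output;
}

/// Hypothetical struct for this sketch.
struct PercentChange;

impl SimilarityIO<(f64, f64), Option<f64>> for PercentChange {
    /// Percent change from the first number to the second.
    /// Returns None when the first number is zero, because the ratio is undefined.
    fn similarity(pair: (f64, f64)) -> Option<f64> {
        let (old, new) = pair;
        if old == 0.0 {
            return None;
        }
        // Multiply before dividing so these small examples stay exact.
        Some((new - old) * 100.0 / old.abs())
    }
}

fn main() {
    assert_eq!(PercentChange::similarity((100.0, 120.0)), Some(20.0));
    assert_eq!(PercentChange::similarity((0.0, 120.0)), None);
}
```

Returning an Option lets the caller decide how to handle a zero baseline instead of receiving an infinite or NaN value.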

Similarity of a collection

Another way to use this trait is to calculate the similarity of a collection of values, such as an array of numbers, a vector of strings, or a set of images.

This is sometimes called intra-group similarity or statistical correlation.

Example: given a collection of numbers, return the population standard deviation.

use similarity_trait::SimilarityIO;
struct MyStruct;

impl SimilarityIO<&Vec<f64>, Option<f64>> for MyStruct {
    /// Similarity of numbers via population standard deviation
    fn similarity(numbers: &Vec<f64>) -> Option<f64> {
        if numbers.is_empty() { return None }
        let mean = numbers.iter().sum::<f64>() / numbers.len() as f64;
        let variance = numbers.iter().map(|x| (x - mean).powi(2)).sum::<f64>() / numbers.len() as f64;
        Some(variance.sqrt())
    }
}

let numbers = vec![2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0];
let population_standard_deviation = MyStruct::similarity(&numbers).expect("similarity");
assert!(population_standard_deviation > 1.999 && population_standard_deviation < 2.001);


Similarity of a pair or a collection

You can choose whether to calculate the similarity of a pair (such as two strings) or of a collection (such as a vector of strings).

Example: given a pair of strings, return the Hamming distance.

use similarity_trait::SimilarityIO;
struct MyStruct;

impl SimilarityIO<(&str, &str), usize> for MyStruct {
    /// Similarity of a pair of strings via Hamming distance.
    /// Assumes equal-length strings; zip stops at the end of the shorter one.
    fn similarity(pair: (&str, &str)) -> usize {
        pair.0.chars().zip(pair.1.chars()).filter(|(c1, c2)| c1 != c2).count()
    }
}

let pair = ("information", "informatics");
let hamming_distance = MyStruct::similarity(pair);
assert_eq!(hamming_distance, 2);

Example: given a collection of strings, return the maximum pairwise Hamming distance.

use similarity_trait::SimilarityIO;
struct MyStruct;

impl SimilarityIO<Vec<&str>, usize> for MyStruct {
    /// Similarity of a collection of strings via maximum Hamming distance.
    fn similarity(collection: Vec<&str>) -> usize {
        let mut max = 0;
        for i in 0..collection.len() {
            for j in (i + 1)..collection.len() {
                let distance = collection[i]
                    .chars()
                    .zip(collection[j].chars())
                    .filter(|(c1, c2)| c1 != c2)
                    .count();
                max = std::cmp::max(max, distance);
            }
        }
        max
    }
}

let collection = vec!["information", "informatics", "affirmation"];
let maximum_hamming_distance = MyStruct::similarity(collection);
assert_eq!(maximum_hamming_distance, 5);
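The pair and collection examples above can also live on one struct. Here is a minimal sketch, assuming the trait is generic over its input and output types as the examples suggest. The trait definition is a local stand-in (an assumption, not the crate's source), and the struct name Hamming is hypothetical.

```rust
// Local stand-in for the crate's trait (assumed shape, see the lead-in).
trait SimilarityIO<Input, Output> {
    fn similarity(input: Input) -> Output;
}

/// Hypothetical struct that handles both a pair and a collection.
struct Hamming;

impl<'a> SimilarityIO<(&'a str, &'a str), usize> for Hamming {
    /// Hamming distance of a pair; assumes equal-length strings.
    fn similarity(pair: (&'a str, &'a str)) -> usize {
        pair.0.chars().zip(pair.1.chars()).filter(|(c1, c2)| c1 != c2).count()
    }
}

impl<'a> SimilarityIO<Vec<&'a str>, usize> for Hamming {
    /// Maximum pairwise Hamming distance, reusing the pair implementation.
    fn similarity(collection: Vec<&'a str>) -> usize {
        let mut max = 0;
        for (i, a) in collection.iter().enumerate() {
            for b in &collection[i + 1..] {
                // Name the pair implementation explicitly, since the struct has two.
                let distance =
                    <Hamming as SimilarityIO<(&str, &str), usize>>::similarity((*a, *b));
                max = max.max(distance);
            }
        }
        max
    }
}

fn main() {
    let pair_distance: usize = Hamming::similarity(("information", "informatics"));
    assert_eq!(pair_distance, 2);

    let collection = vec!["information", "informatics", "affirmation"];
    let maximum_distance: usize = Hamming::similarity(collection);
    assert_eq!(maximum_distance, 5);
}
```

The compiler picks the implementation from the argument type, so the caller writes the same function name for a pair and for a collection.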

How to learn more

Wikipedia links:

Item-item collaborative filtering
Edit distance
Hamming distance
Levenshtein distance
Paired difference test
Cosine similarity
Euclidean distance
Correlation coefficient
Intraclass correlation
Rank correlation
Polychoric correlation
Goodman and Kruskal's gamma
Pearson correlation coefficient, also known as product-moment correlation coefficient
Jaccard index, also known as coefficient of community, intersection over union, ratio of verification, critical success index, or Tanimoto index

Similarity research papers about patient matching:
