@nivalis/string-similarity

Finds the degree of similarity between two strings, based on Dice's Coefficient, which in most cases gives better results than Levenshtein distance.

Usage

Installation

Install using npm:

npm install @nivalis/string-similarity

Or using other package managers:

# Using yarn
yarn add @nivalis/string-similarity

# Using pnpm
pnpm add @nivalis/string-similarity

# Using bun
bun add @nivalis/string-similarity

Basic Usage

This package provides ESM exports and is written in TypeScript:

import { compareTwoStrings, findBestMatch } from "@nivalis/string-similarity";

const similarity = compareTwoStrings("healed", "sealed");

const matches = findBestMatch("healed", ["edward", "sealed", "theatre"]);

Note: This package is ESM-only and requires Node.js 16+ or a modern bundler that supports ESM.

API

The package exports two functions:

compareTwoStrings(string1, string2)

Returns a fraction between 0 and 1, which indicates the degree of similarity between the two strings. 0 indicates completely different strings, 1 indicates identical strings. The comparison is case-sensitive.

Arguments
  1. string1 (string): The first string
  2. string2 (string): The second string

The order of the two arguments does not affect the result.

Returns

(number): A fraction from 0 to 1, both inclusive. A higher number indicates greater similarity.

Examples
import { compareTwoStrings } from "@nivalis/string-similarity";

compareTwoStrings("healed", "sealed");
// → 0.8

compareTwoStrings(
  "Olive-green table for sale, in extremely good condition.",
  "For sale: table in very good condition, olive green in colour.",
);
// → 0.6060606060606061

compareTwoStrings(
  "Olive-green table for sale, in extremely good condition.",
  "For sale: green Subaru Impreza, 210,000 miles",
);
// → 0.2558139534883721

compareTwoStrings(
  "Olive-green table for sale, in extremely good condition.",
  "Wanted: mountain bike with at least 21 gears.",
);
// → 0.1411764705882353
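
To give a feel for how a rating like the ones above can arise, here is a minimal sketch of Dice's Coefficient over character bigrams. It is an illustration under stated assumptions, not the package's source: the names `bigramCounts` and `diceCoefficient` are hypothetical, and whitespace stripping is assumed from the 3.0.0 release note about disregarding spaces. The score is twice the number of shared bigrams divided by the total number of bigrams in both strings.

```typescript
// Illustrative sketch only: NOT the package's actual implementation.
// `bigramCounts` and `diceCoefficient` are hypothetical names.

/** Count every two-character substring (bigram) of `text`. */
function bigramCounts(text: string): Map<string, number> {
  const counts = new Map<string, number>();
  for (let i = 0; i < text.length - 1; i++) {
    const bigram = text.substring(i, i + 2);
    counts.set(bigram, (counts.get(bigram) ?? 0) + 1);
  }
  return counts;
}

/** Dice's Coefficient over character bigrams, ignoring whitespace. */
function diceCoefficient(first: string, second: string): number {
  // Assumption: whitespace is removed before comparing.
  const a = first.replace(/\s+/g, "");
  const b = second.replace(/\s+/g, "");

  if (a === b) return 1; // identical strings (including two empty ones)
  if (a.length < 2 || b.length < 2) return 0; // no bigrams to compare

  // Consume matching bigrams from `a` so repeats are not double-counted.
  const remaining = bigramCounts(a);
  let shared = 0;
  for (let i = 0; i < b.length - 1; i++) {
    const bigram = b.substring(i, i + 2);
    const available = remaining.get(bigram) ?? 0;
    if (available > 0) {
      remaining.set(bigram, available - 1);
      shared++;
    }
  }

  // 2 * |shared bigrams| / (|bigrams of a| + |bigrams of b|)
  return (2 * shared) / (a.length - 1 + (b.length - 1));
}

diceCoefficient("healed", "sealed");
// → 0.8 (4 shared bigrams out of 5 + 5)
```

Each string is scanned once and bigram lookups go through a `Map`, so the sketch runs in time linear in the combined length of the inputs.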

findBestMatch(mainString, targetStrings)

Compares mainString against each string in targetStrings.

Arguments
  1. mainString (string): The string to match each target string against.
  2. targetStrings (Array): Each string in this array will be matched against the main string.

Returns

(Object): An object with the following properties:

  • ratings: a similarity rating for each target string
  • bestMatch: the target string that was most similar to the main string
  • bestMatchIndex: the index of bestMatch in the targetStrings array

Examples
import { findBestMatch } from "@nivalis/string-similarity";

findBestMatch("Olive-green table for sale, in extremely good condition.", [
  "For sale: green Subaru Impreza, 210,000 miles",
  "For sale: table in very good condition, olive green in colour.",
  "Wanted: mountain bike with at least 21 gears.",
]);
// → {
//   ratings: [
//     { target: "For sale: green Subaru Impreza, 210,000 miles",
//       rating: 0.2558139534883721 },
//     { target: "For sale: table in very good condition, olive green in colour.",
//       rating: 0.6060606060606061 },
//     { target: "Wanted: mountain bike with at least 21 gears.",
//       rating: 0.1411764705882353 },
//   ],
//   bestMatch: {
//     target: "For sale: table in very good condition, olive green in colour.",
//     rating: 0.6060606060606061,
//   },
//   bestMatchIndex: 1,
// }
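
A common follow-up is to accept the best match only above some cut-off, or to list the targets from most to least similar. The sketch below shows one way to do that with the documented result shape. The `Rating` and `BestMatchResult` interfaces, the helpers `acceptBestMatch` and `rankTargets`, and the default threshold of 0.5 are all assumptions made for illustration. They are not exports of the package, which may ship its own type names.

```typescript
// Shapes mirror the documented return value of findBestMatch.
// The interface names are hypothetical, not package exports.
interface Rating {
  target: string;
  rating: number;
}

interface BestMatchResult {
  ratings: Rating[];
  bestMatch: Rating;
  bestMatchIndex: number;
}

/**
 * Hypothetical helper: return the best target only if its rating reaches
 * `minRating` (0.5 is an arbitrary default), otherwise undefined.
 */
function acceptBestMatch(
  result: BestMatchResult,
  minRating = 0.5,
): string | undefined {
  return result.bestMatch.rating >= minRating
    ? result.bestMatch.target
    : undefined;
}

/** Hypothetical helper: ratings ordered from most to least similar. */
function rankTargets(result: BestMatchResult): Rating[] {
  // Copy before sorting so result.ratings keeps the targetStrings order.
  return [...result.ratings].sort((x, y) => y.rating - x.rating);
}

// Example input: the result object from the findBestMatch example above.
const example: BestMatchResult = {
  ratings: [
    { target: "For sale: green Subaru Impreza, 210,000 miles",
      rating: 0.2558139534883721 },
    { target: "For sale: table in very good condition, olive green in colour.",
      rating: 0.6060606060606061 },
    { target: "Wanted: mountain bike with at least 21 gears.",
      rating: 0.1411764705882353 },
  ],
  bestMatch: {
    target: "For sale: table in very good condition, olive green in colour.",
    rating: 0.6060606060606061,
  },
  bestMatchIndex: 1,
};

acceptBestMatch(example);      // the "table" listing (0.606… ≥ 0.5)
acceptBestMatch(example, 0.9); // undefined: nothing is similar enough
rankTargets(example);          // table, Subaru, mountain bike
```

A suitable threshold depends on the data, so it is worth checking against real inputs before relying on any particular value.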

Release Notes

2.0.0

  • Removed production dependencies
  • Updated to ES6 (this breaks backward-compatibility for pre-ES6 apps)

3.0.0

  • Performance improvement for compareTwoStrings(..): now O(n) instead of O(n^2)
  • The algorithm now disregards spaces and word boundaries. This changes the rating values slightly, but not enough to make a significant difference
  • Added bestMatchIndex to the results of findBestMatch(..), pointing to the best match in the supplied targetStrings array

3.0.1

  • Refactoring: removed unused functions; used substring instead of substr
  • Updated dependencies

4.0.1

  • Distributed as a UMD build for use in browsers.

4.0.2

  • Updated dependencies to the latest versions.

4.0.3

  • Made compatible with IE and ES5; updated dependencies (see PR56)

4.0.4

  • Simplified some conditional statements; updated dependencies (see PR50)

5.0.0

  • BREAKING: Converted to TypeScript and ESM-only
  • BREAKING: Changed from default export to named exports
  • BREAKING: Removed UMD/browser builds - use a bundler or modern browser with ESM support
  • Updated to use modern TypeScript and build tools
  • Package now scoped as @nivalis/string-similarity
