Similar text in C#

By May 24, 2016


Preferencesoft

PHP has a function called similar_text, which takes two strings and returns a measure of how similar they are. Its algorithm is based on the sum of the lengths of the longest substrings common to both strings.
In this article, we will implement such a function in C#.

Rate of similarity, algorithm

Multiple definitions can be considered for the rate of similarity.

This rate may be a function of the number of common characters, the number of common words, or the number of longest common substrings. It may also take into account the order of the characters or words, and so on.

The algorithm of the PHP function calculates the sum S(str1, str2) of the lengths of the longest substrings common to the two strings str1 and str2 as follows:

  • Search for a longest string str common to str1 and str2

  • Initialize a variable sum to Length(str)

  • If str is not empty

    • If the strings s1 and s2 to the left of str in str1 and str2 are not empty, then sum = sum + S(s1, s2)

    • If the strings s1 and s2 to the right of str in str1 and str2 are not empty, then sum = sum + S(s1, s2)

  • Return sum

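Because each common character found appears in both str1 and str2, the sum is multiplied by 2.
The rate of similarity, expressed as a percentage, is defined by:

rate = 2 × S(str1, str2) / (Length(str1) + Length(str2)) × 100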
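The recursive steps listed above can be sketched in C# as follows. This is a minimal illustration, not the SimilarString class from the download: the names SimilarTextSketch, LongestCommon, Sum and Percent are hypothetical, the longest common substring is found by brute force, and the percentage assumes the formula PHP's similar_text uses, 2 × S / (Length(str1) + Length(str2)) × 100.

```csharp
using System;

// Hypothetical sketch; not the SimilarString class from the article's download.
public static class SimilarTextSketch
{
    // Finds the first longest substring common to a and b (brute force).
    // Returns its length; posA and posB receive its start index in each string.
    private static int LongestCommon(string a, string b, out int posA, out int posB)
    {
        int best = 0;
        posA = 0;
        posB = 0;
        for (int i = 0; i < a.Length; i++)
        {
            for (int j = 0; j < b.Length; j++)
            {
                int k = 0;
                while (i + k < a.Length && j + k < b.Length && a[i + k] == b[j + k])
                {
                    k++;
                }
                if (k > best)
                {
                    best = k;
                    posA = i;
                    posB = j;
                }
            }
        }
        return best;
    }

    // S(str1, str2): length of the longest common substring, plus the sums
    // computed recursively on the parts to its left and to its right.
    public static int Sum(string str1, string str2)
    {
        int p1, p2;
        int len = LongestCommon(str1, str2, out p1, out p2);
        if (len == 0)
        {
            return 0;
        }
        int sum = len;
        // Left parts: recurse only if both are non-empty.
        if (p1 > 0 && p2 > 0)
        {
            sum += Sum(str1.Substring(0, p1), str2.Substring(0, p2));
        }
        // Right parts: recurse only if both are non-empty.
        if (p1 + len < str1.Length && p2 + len < str2.Length)
        {
            sum += Sum(str1.Substring(p1 + len), str2.Substring(p2 + len));
        }
        return sum;
    }

    // Similarity as a percentage: 2 * S / (total length) * 100.
    public static double Percent(string str1, string str2)
    {
        int total = str1.Length + str2.Length;
        return total == 0 ? 0.0 : 2.0 * Sum(str1, str2) * 100.0 / total;
    }

    public static void Main()
    {
        // Sample inputs chosen for illustration only.
        Console.WriteLine(Sum("World", "Word"));     // 4
        Console.WriteLine(Percent("World", "Word"));
    }
}
```

The brute-force search keeps the sketch short at the cost of speed: each call compares every pair of starting positions, so it is only suitable for short strings.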

Here is an example of the calculation:

[Figure: similar strings]

 

First, search for the longest string common to str1 and str2: str = "QWERTY"
Hence sum = 6.

In str1, the substring to the left of str is empty.

In str1, the substring to the right of str is: s1 = "AB"
In str2, the substring to the right of str is: s2 = "B"

 

[Figure: strings]

The longest string common to s1 and s2 is: str = "B"

Finally, sum = 6 + 1 = 7
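The arithmetic of this example can be written out, together with the resulting rate. The full input strings are not given in the text, so the lengths below rest on an assumption: str1 = "QWERTYAB" (8 characters, as the steps imply) and a hypothetical str2 = "QWERTYB" (7 characters, taking the part of str2 to the left of "QWERTY" to be empty as well). The rate is taken to be the percentage that PHP's similar_text reports, twice the sum divided by the total length, times 100.

```latex
\begin{align*}
S(\text{str1}, \text{str2}) &= \underbrace{6}_{\text{QWERTY}} + \underbrace{1}_{\text{B}} = 7 \\
\text{rate} &= \frac{2 \cdot 7}{8 + 7} \times 100 = \frac{1400}{15} \approx 93.3\,\%
\end{align*}
```

With a different str2, the sum of 7 could still hold, but the denominator and therefore the rate would change.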

 

Program in CSharp

 

You can download the program on CodePlex at: https://similartext.codeplex.com

The class SimilarString contains the method SimilarText, which returns the number of common characters found and, through an out parameter, the percentage of similarity.

Here is an example of use:

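static void Main(string[] args)
{
    // Read the two strings to compare, one per line.
    string s1 = Console.In.ReadLine();
    string s2 = Console.In.ReadLine();

    SimilarString ss = new SimilarString();
    double p;
    // SimilarText returns the number of common characters;
    // p receives the percentage of similarity.
    int n = ss.SimilarText(s1, s2, out p);
    Console.Out.WriteLine("{0}   percent:{1}", n, p);

    // Wait for Enter so that the console window stays open.
    Console.In.ReadLine();
}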

 

