Count character occurrences in C#

By May 15, 2016

Description:

Count character occurrences in CSharp

Preferencesoft

In this article, we will examine several methods to count the occurrences of each character in a given string and implement these methods in C# to compare their performance.

Algorithms

It is quite simple to calculate the number of occurrences of a given character within a string. But a problem arises when one wants to calculate the number of occurrences of each character in a string.  Must be careful not to recalculate several times the number of occurrences, in case of multiple instances of a same character in the string.

One approach is to consider the first character in the string and go throw string to find the number of occurrences of this character, to remove it from the string and start again on the new string until get an empty string.

A second method uses a labeling table, allowing not to repeat several times the counting, if a character appears more than once.

A third method is to increment the counter of a character in a sorted list.

Algorithm 1

We copy the given string in another called str. Until str is not empty, it determines the number of occurrences of the character str[0] in str with a loop, then removes this character from str.

Algorithm 2

We declare a Boolean array named check of the same length as the input string s and initializes it to false. We browse the string s and stops on a character such as the corresponding element in the table contains false. It resets a counter, and from this position, is marked to true in the table all identical characters, incrementing the counter.

Algorithm 3

It declares a sorted list named sorted, the key and value are integers. The value represents the number of occurrences of a character. It runs through the string and for each encountered character, we check its presence in the sorted list. If present, the corresponding value is incremented otherwise we created equally key code (Unicode) character and sets the value to 1.

Program

In the program, we declare the following methods corresponding to the 3 previous algorithms:

  •          CountOccurrencesDelete

  •          CountOccurrencesTable

  •          CountOccurrences

 

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
 
namespace CharacterOccurrences
{
    class Program
    {
        static public string CountOccurrencesDelete(string s)
        {
            string str = s;
            string r = "";
            int i = 0;
            while (str != "")
            {
                char first = str[0];
                int count = 1;
                for (int j = 1; j < str.Length; j++)
                {
                    if (str[j] == first)
                    {
                        count++;
                    }
                }
                r += "Character: " + first + ": " + count + " occurrences\n";
                str = str.Replace(new String(first, 1), "");
            }
            return r;
        }
 
        static public string CountOccurrencesTable(string s)
        {
            int len = s.Length;
            bool[] check = new bool[len];
            string r = "";
            for (int i = 0; i < len; i++)
            {
                check[i] = false;
            }
            int k = 0;
            while (k < len)
            {
                if (check[k])
                {
                    k++;
                }
                else
                {
                    check[k] = true;
                    char first = s[k];
                    int count = 1;
                    for (int j = k+1; j < len; j++)
                    {
                        if (s[j]==first)
                        {
                            count++;
                            check[j] = true;
                        }
                    }
                    r += "Character: " + first + ": " + count + " occurrences\n";
                } 
            }
            return r;
        }
        static public string CountOccurrences(string s)
        {
            SortedList<int, int> sorted = new SortedList<int, int>();
            int l = s.Length;
            for (int i = 0; i < l; i++)
            {
                int c = (int)s[i];
 
                if (sorted.IndexOfKey(c) < 0)
                {
                    sorted.Add(c, 1);
                }
                else
                {
                    sorted[c]++;
                }
            }
 
            string r = "";
 
            foreach (var pair in sorted)
            {
                r += "Character: " + (char)pair.Key + ": " + pair.Value + " occurrences\n";
            }
 
            return r;
        }
 
        static public string GenerateString(int n)
        {
            string s = "";
            Random rnd = new Random();
            for (int i = 0; i < n; i++)
            {
                s+= ((char)(48 + rnd.Next(200))).ToString();
            }
            return s;
        }
        static void Main(string[] args)
        {
            //string input = Console.In.ReadLine();
            string input = GenerateString(200000);
            var s1 = Stopwatch.StartNew();
            Console.Out.Write(CountOccurrences(input));
            s1.Stop();
            Console.Out.WriteLine();
            var s2 = Stopwatch.StartNew();
            Console.Out.Write(CountOccurrencesTable(input));
            s2.Stop();
            Console.Out.WriteLine();
            var s3 = Stopwatch.StartNew();
            Console.Out.Write(CountOccurrencesDelete(input));
            s3.Stop();
            Console.Out.WriteLine();
            Console.WriteLine("With sorted list "+((double)(s1.Elapsed.TotalMilliseconds)).ToString("0.00 ms"));
            Console.WriteLine("With table "+((double)(s2.Elapsed.TotalMilliseconds)).ToString("0.00 ms"));
            Console.WriteLine("By deletion "+((double)(s3.Elapsed.TotalMilliseconds)).ToString("0.00 ms"));
 
            Console.Read();
 
            Console.In.ReadLine();
        }
    }
}

And the result:

With sorted list 266,98 ms

With table 394,99 ms

By deletion 438,33 ms

 

For the string:
“aezrtyuio MFGHU789 45F”

 

Character:  : 2 occurrences

Character: 4: 1 occurrences

Character: 5: 1 occurrences

Character: 7: 1 occurrences

Character: 8: 1 occurrences

Character: 9: 1 occurrences

Character: F: 2 occurrences

Character: G: 1 occurrences

Character: H: 1 occurrences

Character: M: 1 occurrences

Character: U: 1 occurrences

Character: a: 1 occurrences

Character: e: 1 occurrences

Character: i: 1 occurrences

Character: o: 1 occurrences

Character: r: 1 occurrences

Character: t: 1 occurrences

Character: u: 1 occurrences

Character: y: 1 occurrences

Character: z: 1 occurrences

 

Character: a: 1 occurrences

Character: e: 1 occurrences

Character: z: 1 occurrences

Character: r: 1 occurrences

Character: t: 1 occurrences

Character: y: 1 occurrences

Character: u: 1 occurrences

Character: i: 1 occurrences

Character: o: 1 occurrences

Character:  : 2 occurrences

Character: M: 1 occurrences

Character: F: 2 occurrences

Character: G: 1 occurrences

Character: H: 1 occurrences

Character: U: 1 occurrences

Character: 7: 1 occurrences

Character: 8: 1 occurrences

Character: 9: 1 occurrences

Character: 4: 1 occurrences

Character: 5: 1 occurrences

 

Character: a: 1 occurrences

Character: e: 1 occurrences

Character: z: 1 occurrences

Character: r: 1 occurrences

Character: t: 1 occurrences

Character: y: 1 occurrences

Character: u: 1 occurrences

Character: i: 1 occurrences

Character: o: 1 occurrences

Character:  : 2 occurrences

Character: M: 1 occurrences

Character: F: 2 occurrences

Character: G: 1 occurrences

Character: H: 1 occurrences

Character: U: 1 occurrences

Character: 7: 1 occurrences

Character: 8: 1 occurrences

Character: 9: 1 occurrences

Character: 4: 1 occurrences

Character: 5: 1 occurrences

 

With sorted list 22,83 ms

With table 27,85 ms

By deletion 30,29 ms

 


CSharp

Categories

Share

Follow


KodFor Privacy Policy