Count Occurrences of Each Unique Element in a List

Date published: 2018-02-15

Category: Python

Subcategory: Beginner Concepts

Tags: dictionaries, collections, lists

Import Libraries

In [1]:

from collections import Counter
from collections import defaultdict

Count Occurrences of Each Unique Element - 3 Different Ways

Let's say your friend gets a list of grades for the semester. We want a count for each letter grade.

In [2]:

high_school_semester_grades = ["B+", "A", "B+", "A", "A+", "A-", "A"]

There are several Pythonic ways to easily do this.

Create a Dictionary

The dictionary (dict) is a built-in Python type, so no import is needed. We loop over the list and either start a new count or increment an existing one.

In [3]:

dict_count_occurences_of_letters = {}

for letter in high_school_semester_grades:
    if letter in dict_count_occurences_of_letters:
        # grade seen before: increment its count
        dict_count_occurences_of_letters[letter] += 1
    else:
        # first time this grade appears: start its count at 1
        dict_count_occurences_of_letters[letter] = 1

In [4]:

dict_count_occurences_of_letters

Out[4]:

{'A': 3, 'A+': 1, 'A-': 1, 'B+': 2}

In [5]:

type(dict_count_occurences_of_letters)

Out[5]:

dict
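As a variation on the loop above, the if/else branch can be folded into a single line with dict.get, which returns a fallback value when the key is missing. This is a minimal sketch rather than the approach used in this notebook; the names grades and counts are placeholders chosen for illustration.

```python
# Same grades as in the example above, repeated so the snippet runs on its own
grades = ["B+", "A", "B+", "A", "A+", "A-", "A"]

counts = {}
for grade in grades:
    # counts.get(grade, 0) is the current count, or 0 if the grade is new
    counts[grade] = counts.get(grade, 0) + 1

print(counts["A"])   # 3
print(counts["B+"])  # 2
```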

Create a defaultdict

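defaultdict is a dict subclass from Python's collections module. With defaultdict(int), a missing key starts at 0, so we can increment without first checking whether the key exists.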

In [6]:

defaultdict_count_occurences_of_letters = defaultdict(int)

In [7]:

for letter in high_school_semester_grades:
    defaultdict_count_occurences_of_letters[letter] += 1

In [8]:

defaultdict_count_occurences_of_letters

Out[8]:

defaultdict(int, {'A': 3, 'A+': 1, 'A-': 1, 'B+': 2})

In [9]:

type(defaultdict_count_occurences_of_letters)

Out[9]:

collections.defaultdict
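To see why no membership check is needed, it may help to look at what defaultdict(int) does with a key it has never seen. The sketch below uses a placeholder variable named counts and a grade "C" that is not in the original list; it also shows one caveat, which is that simply reading a missing key inserts it.

```python
from collections import defaultdict

counts = defaultdict(int)

# int() returns 0, so a never-seen key starts at 0 before the += 1 is applied
counts["A"] += 1
print(counts["A"])    # 1

# Caveat: reading a missing key also inserts it with the default value
missing = counts["C"]
print(missing)        # 0
print("C" in counts)  # True
```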

Create a Counter

Counter is a dict subclass from Python's collections module, designed for counting hashable objects. Passing it the list counts every element in a single call.

In [10]:

counter_count_of_letters = Counter(high_school_semester_grades)

In [11]:

counter_count_of_letters

Out[11]:

Counter({'A': 3, 'A+': 1, 'A-': 1, 'B+': 2})

In [12]:

type(counter_count_of_letters)

Out[12]:

collections.Counter
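Because Counter is a dict subclass, individual counts can be looked up like dictionary values. A short sketch of that, using a placeholder name grade_counts and a grade "C" that does not appear in the list: unlike a plain dict, a Counter returns 0 for a missing element instead of raising KeyError, and unlike a defaultdict it does not insert the key when it is read.

```python
from collections import Counter

grade_counts = Counter(["B+", "A", "B+", "A", "A+", "A-", "A"])

print(grade_counts["A"])               # 3
print(grade_counts["C"])               # 0, no KeyError for a missing element
print("C" in grade_counts)             # False, the lookup did not insert the key
print(isinstance(grade_counts, dict))  # True
```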

Additional: Sort Data Structures by Count of Letters

Dictionary sort values

In [13]:

sorted(dict_count_occurences_of_letters.items(), key=lambda x: x[1], reverse=True)

Out[13]:

[('A', 3), ('B+', 2), ('A+', 1), ('A-', 1)]
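In the sort above, grades with the same count ('A+' and 'A-') keep whatever relative order they had in the dictionary, because sorted is stable. If a predictable order for ties matters, one option is to sort on a tuple key. The sketch below is an assumption about what a reader might want, not part of the original notebook; it sorts by count descending and then by grade string ascending, using a placeholder dictionary named counts.

```python
counts = {"B+": 2, "A": 3, "A+": 1, "A-": 1}

# Negating the count sorts high-to-low; the grade string breaks ties low-to-high
by_count_then_grade = sorted(counts.items(), key=lambda item: (-item[1], item[0]))

print(by_count_then_grade)
# [('A', 3), ('B+', 2), ('A+', 1), ('A-', 1)]
```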

defaultdict sort values

This uses the same approach as the dictionary above, since a defaultdict supports the same items() method.

In [14]:

sorted(defaultdict_count_occurences_of_letters.items(), key=lambda x: x[1], reverse=True)

Out[14]:

[('A', 3), ('B+', 2), ('A+', 1), ('A-', 1)]

Counter sort values

Counter has a built-in method called most_common that returns a list of (element, count) tuples sorted from most common to least common.

In [15]:

counter_count_of_letters.most_common()

Out[15]:

[('A', 3), ('B+', 2), ('A+', 1), ('A-', 1)]
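most_common also accepts an optional argument n that limits the result to the n most frequent elements, which is handy when only the top few grades matter. A brief sketch, with grade_counts, top_one and top_two as placeholder names:

```python
from collections import Counter

grade_counts = Counter(["B+", "A", "B+", "A", "A+", "A-", "A"])

top_one = grade_counts.most_common(1)
top_two = grade_counts.most_common(2)

print(top_one)  # [('A', 3)]
print(top_two)  # [('A', 3), ('B+', 2)]
```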

Evaluation of Three Data Structures

For the scenario above, I prefer Counter because it produces the most concise and readable code. There are a lot of operations happening in the background, but they are abstracted away from us.

Just be careful when using specialized structures such as Counter, because other programmers may be unfamiliar with them. I'd recommend linking to the relevant official Python documentation.
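As a quick sanity check on the comparison, the three approaches can be run side by side to confirm they produce the same counts. This is only an illustrative sketch; the names plain_counts, default_counts and counter_counts are placeholders, and the defaultdict and Counter are converted to plain dicts before comparing so that only the key/count pairs are compared.

```python
from collections import Counter, defaultdict

grades = ["B+", "A", "B+", "A", "A+", "A-", "A"]

# 1. Plain dictionary with an explicit membership check
plain_counts = {}
for grade in grades:
    if grade in plain_counts:
        plain_counts[grade] += 1
    else:
        plain_counts[grade] = 1

# 2. defaultdict(int): missing keys start at 0
default_counts = defaultdict(int)
for grade in grades:
    default_counts[grade] += 1

# 3. Counter: counts everything in one call
counter_counts = Counter(grades)

print(plain_counts == dict(default_counts) == dict(counter_counts))  # True
```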