The set Type in Python

Now let’s talk about sets in Python. The set class lets you create an unordered collection of unique, hashable (typically immutable) elements. One way to work with sets is to use the set() constructor. Called with no arguments, it creates an empty set, which shows up in the shell as set(), that is, the word set followed by an empty pair of parentheses. If you pass the constructor an iterable, like this tuple, it creates a set containing those elements, displayed with curly braces around them. Notice that a set is unordered: the elements may print in any order.

 # The set class provides an unordered collection of unique, hashable elements
 empty_set = set()
 print('empty_set ->', empty_set)
 alpha = set(('a', 'b', 'c', 'd'))
 print('alpha ->', alpha)
 dup_list = ['c', 'd', 'c', 'd', 'e', 'f']
 beta = set(dup_list)  # duplicates are dropped
 print('beta ->', beta)
 uniq_list = list(beta)
 print('uniq_list ->', uniq_list)
 gamma = alpha.union(beta)
 print('gamma ->', gamma)
 gamma = alpha | beta
 print('gamma ->', gamma)
 delta = alpha.intersection(beta)
 print('delta ->', delta)
 delta = alpha & beta
 print('delta ->', delta)
 epsilon = alpha.difference(beta)
 print('epsilon ->', epsilon)
 epsilon = alpha - beta
 print('epsilon ->', epsilon)
 eta = alpha.symmetric_difference(beta)
 print('eta ->', eta)
 eta = alpha ^ beta
 print('eta ->', eta)
 
 The output (element order may vary from run to run):
 empty_set -> set()
 alpha -> {'a', 'b', 'd', 'c'}
 beta -> {'e', 'f', 'd', 'c'}
 uniq_list -> ['e', 'f', 'd', 'c']
 gamma -> {'f', 'd', 'c', 'e', 'b', 'a'}
 gamma -> {'f', 'd', 'c', 'e', 'b', 'a'}
 delta -> {'d', 'c'}
 delta -> {'d', 'c'}
 epsilon -> {'a', 'b'}
 epsilon -> {'a', 'b'}
 eta -> {'f', 'e', 'b', 'a'}
 eta -> {'f', 'e', 'b', 'a'}
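
One reason the empty set prints as set() rather than as a pair of braces is that {} already means an empty dictionary. The short sketch below illustrates that, along with the set literal syntax and the requirement that elements be hashable. The names empty_braces, vowels and bad are made up for this illustration.

```python
# {} creates an empty dict, not an empty set; use set() for an empty set.
empty_braces = {}
empty_set = set()
print(type(empty_braces).__name__)  # dict
print(type(empty_set).__name__)     # set

# A non-empty set can be written directly as a literal with curly braces.
vowels = {'a', 'e', 'i'}
print(vowels == set(('a', 'e', 'i')))  # True

# Elements must be hashable, so a mutable list cannot be a set element.
try:
    bad = {['a', 'b']}
except TypeError as exc:
    print('TypeError:', exc)
```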

One use of sets is to eliminate duplicates. Say you have a list containing some duplicate elements. If you create a set from it, that set, beta here, has no duplicates. You can then use the list() function to convert the set back into a list of unique elements, so notice that uniq_list no longer has any duplicates in it (though the original order of the list is not guaranteed to survive the round trip). Of course, the other purpose of sets is set operations. You can combine two sets with a union: calling alpha’s union() method with beta yields the combination of the two sets with duplicates removed. Since both contain c and d, those appear only once. As an alternative to the union() method, you can use the vertical bar operator, |, to create a union of two sets.
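
Because a set is unordered, the de-duplicated list can come back in any order. Here is a minimal sketch of two common ways to get a predictable result, using a hypothetical list named scrambled. The dict.fromkeys() approach is an alternative that does not use a set at all; it is shown only for comparison.

```python
# Hypothetical input with duplicates, deliberately not in sorted order.
scrambled = ['d', 'c', 'd', 'e', 'c']

# Option 1: go through a set, then sort for a deterministic order.
uniq_sorted = sorted(set(scrambled))
print(uniq_sorted)  # ['c', 'd', 'e']

# Option 2: dict keys are unique and keep insertion order,
# so this keeps the first occurrence of each element in place.
uniq_in_order = list(dict.fromkeys(scrambled))
print(uniq_in_order)  # ['d', 'c', 'e']
```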

To find where two sets overlap, or intersect, you can use the intersection() method or the ampersand operator, &. We see that the delta set holds the two elements, c and d, that alpha and beta share. You can take a set difference with the difference() method or the minus operator: the result contains every element of the set you call the method on, except those that also appear in the set you pass in. So the difference in epsilon is the a and b from alpha, with the c and d removed because they are part of beta as well. A symmetric difference is the union minus the intersection. There’s a symmetric_difference() method for that, or you can use the caret operator, ^. So looking at the eta set, we see, similar to gamma, all of the elements except c and d, the ones in delta where the two sets overlapped. To check whether two sets share any elements at all, you can use the isdisjoint() method, which returns True when they have nothing in common.
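
The description of the symmetric difference as the union minus the intersection can be checked directly. This is a small self-contained sketch that rebuilds alpha and beta as literals; same_eta and by_definition are names introduced only for this illustration.

```python
alpha = {'a', 'b', 'c', 'd'}
beta = {'c', 'd', 'e', 'f'}

# Method form and operator form give the same result.
eta = alpha.symmetric_difference(beta)
same_eta = alpha ^ beta

# Build the same set by hand: union minus intersection.
by_definition = (alpha | beta) - (alpha & beta)

print(sorted(eta))                       # ['a', 'b', 'e', 'f']
print(eta == same_eta == by_definition)  # True
```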

Epsilon and delta show isdisjoint() to be True, because they have no elements in common. Epsilon and eta, however, do share some elements, so they are not disjoint. You can also check whether one set is a subset of another with issubset(). It returns True if the other set, in this case eta, contains all of the elements of the set epsilon. Since eta does have all of the elements of epsilon, epsilon is a subset of eta. On the other hand, epsilon.issubset(beta) is False: beta does not have all of the elements of epsilon.

 #In the example that follows, the following sets are being referenced
 #epsilon -> {'a', 'b'}
 #delta -> {'d', 'c'}
 #eta -> {'f', 'e', 'b', 'a'}
 #beta -> {'e', 'f', 'd', 'c'}

 print('epsilon.isdisjoint(delta) ->', epsilon.isdisjoint(delta))
 print('epsilon.isdisjoint(eta) ->', epsilon.isdisjoint(eta))
 print('epsilon.issubset(eta) ->', epsilon.issubset(eta))
 print('epsilon.issubset(beta) ->', epsilon.issubset(beta))
 
 The output:
 epsilon.isdisjoint(delta) -> True
 epsilon.isdisjoint(eta) -> False
 epsilon.issubset(eta) -> True
 epsilon.issubset(beta) -> False
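
Just as union and intersection have operator forms, the subset test can also be written with comparison operators. The sketch below rebuilds epsilon and eta as literals so that it runs on its own.

```python
epsilon = {'a', 'b'}
eta = {'a', 'b', 'e', 'f'}

print(epsilon <= eta)  # True: same test as epsilon.issubset(eta)
print(epsilon < eta)   # True: proper subset, eta has extra elements
print(eta <= eta)      # True: every set is a subset of itself
print(eta < eta)       # False: but never a proper subset of itself
```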

Turning the tables around 180 degrees, we have issuperset(). If the set eta contains all of the elements of epsilon, then eta is considered a superset of epsilon. Let’s see: eta does have all of the elements of epsilon, so it is a superset of it. On the other hand, beta.issuperset(epsilon) is False, because beta does not have all of the elements of epsilon. Up to this point, we could have been using either a set or something similar called a frozenset. The difference is that frozensets are immutable. You can do the same set operations and comparisons with them, but you cannot modify a frozenset. So what we’ll talk about for the remainder applies only to sets, not frozensets.

 #In the example that follows, the following sets are being referenced
 #epsilon -> {'a', 'b'}
 #delta -> {'d', 'c'}
 #eta -> {'f', 'e', 'b', 'a'}
 #beta -> {'e', 'f', 'd', 'c'}

 print('eta.issuperset(epsilon) ->', eta.issuperset(epsilon))
 print('beta.issuperset(epsilon) ->', beta.issuperset(epsilon))
 feta = frozenset(eta)  # frozensets are immutable and have no updating methods
 print('feta ->', feta)
 
 The output:
 eta.issuperset(epsilon) -> True
 beta.issuperset(epsilon) -> False
 feta -> frozenset({'f', 'e', 'a', 'b'})
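
To make the difference between the two types concrete, here is a small sketch of what a frozenset can and cannot do. The labels dictionary is a made-up example of a place where only a hashable, immutable set will work.

```python
eta = {'a', 'b', 'e', 'f'}
feta = frozenset(eta)

# Non-mutating operations work just as they do on a set.
print(sorted(feta & {'a', 'z'}))    # ['a']
# With mixed operands, the result takes the type of the left operand.
print(type(feta | {'z'}).__name__)  # frozenset

# The mutating methods are simply not there.
print(hasattr(feta, 'add'))         # False
print(hasattr(eta, 'add'))          # True

# Because it is hashable, a frozenset can be used as a dictionary key.
labels = {feta: 'the eta elements'}
print(labels[frozenset({'a', 'b', 'e', 'f'})])  # the eta elements
```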

With a set, you can add elements. A set stores each element only once, so if you add the same value more than once, it still appears in the set a single time. We can see the creation of the empty set zeta, adding 3 once, and trying to add it again; it’s still in there just once. Then we add 4 to that set. The discard() method removes an element if it’s present, and if it’s not present it doesn’t raise an error. Initially there was an a in the gamma set; it was discarded, and we see that it’s gone. When we try to discard z, which is not present, no error occurs. Had we tried to remove() z, a KeyError would have been raised. We remove b, and we can see that b is no longer an element. The pop() method also removes an element from a set and returns it, but which element you get is arbitrary. In this case, the element happened to be f, so f is no longer present in that set.

 zeta = set()
 print('zeta ->', zeta)
 zeta.add(3)
 print('zeta ->', zeta)
 zeta.add(3)  # adding a duplicate has no effect
 print('zeta ->', zeta)
 zeta.add(4)
 print('zeta ->', zeta)
 print('gamma ->', gamma)
 gamma.discard('a')
 print('gamma ->', gamma)
 gamma.discard('z')  # not present, so nothing happens
 print('gamma ->', gamma)
 gamma.remove('b')  # remove() raises KeyError if the element is missing
 print('gamma ->', gamma)
 random_element = gamma.pop()  # removes and returns an arbitrary element
 print('random_element ->', random_element)
 print('gamma ->', gamma)
 
 The output:
 zeta -> set()
 zeta -> {3}
 zeta -> {3}
 zeta -> {3, 4}
 gamma -> {'f', 'd', 'c', 'e', 'b', 'a'}
 gamma -> {'f', 'd', 'c', 'e', 'b'}
 gamma -> {'f', 'd', 'c', 'e', 'b'}
 gamma -> {'f', 'd', 'c', 'e'}
 random_element -> f
 gamma -> {'d', 'c', 'e'}
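
The example above never actually triggers the error that remove() raises for a missing element. This sketch shows it being caught, starting from a fresh three-element gamma so that it runs on its own. The name popped is introduced just for this illustration.

```python
gamma = {'c', 'd', 'e'}

gamma.discard('z')  # 'z' is not present: silently does nothing
try:
    gamma.remove('z')  # 'z' is not present: raises KeyError
except KeyError as exc:
    print('KeyError:', exc)  # KeyError: 'z'

# pop() removes and returns an arbitrary element,
# so only the size of what is left is predictable.
popped = gamma.pop()
print(popped in gamma)  # False
print(len(gamma))       # 2
```

Calling pop() on an empty set also raises KeyError, so check that the set is non-empty first if that can happen.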

Be careful when modifying a set: a plain assignment does not copy it, it only creates another reference to the same set, so changes show up through both names. If you want an independent copy of a set, use the copy() method. The clear() method removes all elements from a set. Zeta previously had the elements 3 and 4 in it. After executing clear(), we can see that zeta is back to being an empty set. Notice too that zeta_ref, created by simple assignment, is also empty, while zeta_copy, created with the copy() method, is not affected.

 zeta_ref = zeta
 zeta_copy = zeta.copy() 
 zeta.clear() 
 print('zeta ->', zeta)
 print('zeta_ref ->', zeta_ref) 
 print('zeta_copy ->', zeta_copy) 
 
 The output: 
 zeta -> set() 
 zeta_ref -> set() 
 zeta_copy -> {3, 4}
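
One way to tell a reference from a copy without modifying anything is the is operator, which checks whether two names point at the same object. A minimal sketch, starting from a fresh zeta:

```python
zeta = {3, 4}
zeta_ref = zeta          # a second name for the same set object
zeta_copy = zeta.copy()  # a new, independent set with equal contents

print(zeta_ref is zeta)   # True: same object
print(zeta_copy is zeta)  # False: different object
print(zeta_copy == zeta)  # True: equal contents, for now

zeta.add(5)
print(sorted(zeta_ref))   # [3, 4, 5]: the reference sees the change
print(sorted(zeta_copy))  # [3, 4]: the copy does not
```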

Finally, the same set operations are available as update methods, which modify the set you call them on in place instead of returning a new set. In each of these examples, a copy is made before the update method is called so that the original alpha set is not affected. Each method then updates the copy to be just the difference, just the intersection, or the symmetric difference. The plain update() method adds all the elements of the other set to the one you call it on, so the result is the union of the two. Now you know how to work with sets in Python: how to create them, perform basic set operations on them, and update them. You’re also aware that there is a frozenset, which cannot be updated because it’s immutable.

 #The following sets are being referenced:
 #alpha -> {'a', 'b', 'd', 'c'}
 #beta -> {'e', 'f', 'd', 'c'}
 print('alpha ->', alpha)
 alpha_diff = alpha.copy()
 alpha_diff.difference_update(beta)
 print('alpha_diff ->', alpha_diff)
 alpha_intersect = alpha.copy()
 alpha_intersect.intersection_update(beta)
 print('alpha_intersect ->', alpha_intersect)
 alpha_sym_diff = alpha.copy()
 alpha_sym_diff.symmetric_difference_update(beta)
 print('alpha_sym_diff ->', alpha_sym_diff)
 alpha_union = alpha.copy()
 alpha_union.update(beta)
 print('alpha_union ->', alpha_union)
 
 The output:
 alpha -> {'a', 'b', 'd', 'c'}
 alpha_diff -> {'b', 'a'}
 alpha_intersect -> {'d', 'c'}
 alpha_sym_diff -> {'f', 'e', 'b', 'a'}
 alpha_union -> {'f', 'd', 'c', 'e', 'b', 'a'}
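
Each of the update methods also has an augmented assignment operator that does the same in-place update. The sketch below rebuilds alpha and beta as literals and reuses one hypothetical variable, working, for each operator in turn.

```python
alpha = {'a', 'b', 'c', 'd'}
beta = {'c', 'd', 'e', 'f'}

working = alpha.copy()
working -= beta  # like difference_update(beta)
print(sorted(working))  # ['a', 'b']

working = alpha.copy()
working &= beta  # like intersection_update(beta)
print(sorted(working))  # ['c', 'd']

working = alpha.copy()
working ^= beta  # like symmetric_difference_update(beta)
print(sorted(working))  # ['a', 'b', 'e', 'f']

working = alpha.copy()
working |= beta  # like update(beta)
print(sorted(working))  # ['a', 'b', 'c', 'd', 'e', 'f']

# alpha itself is untouched because every update ran on a copy.
print(sorted(alpha))  # ['a', 'b', 'c', 'd']
```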