|
Chapter 9: Collections
9 CollectionsCollections are one of the cornerstones of a modern programming language. Java has not always had good collection support, and only since Java 1.2 has the platform benefited from a strong collection class library. The collections provided in Microsoft .NET are less flexible than those in Java but include some interesting features that help to address the differences.The first part of this chapter covers indexers, which are a useful language construct employed by all of the collection classes. We move on to explain the interfaces that define the basic collection contracts; then we discuss the collections themselves. Much of the chapter is given over to the individual collection classes. The most commonly used collections from Java are available and remain largely unchanged. Some new collections aren't available in Java, and we explain their purpose and use. However, the .NET collection framework also has significant omissions, which Java developers will miss. We finish this chapter with a section on thread-safe collections and the classes that are available to build strongly typed custom collection types.
IndexersIndexers are a new C# feature that allows classes to be indexed in a similar way to arrays. Indexers are particularly applicable for use in conjunction with collection data members. Using the array-style syntax provides a logical mechanism through which safe access can be provided to the underlying collection data. The .NET standard collection classes use indexers extensively, as we will see later in this chapter.
Of particular use is the ability to overload an indexer, providing access based on different key types. The following example demonstrates the implementation of an indexer to provide access to a collection using both string and int as keys. This example uses an array as the underlying collection, but the same approach is equally applicable to any collection. The example we use is a set of paintbrushes. A brush can be selected by its position or its color. For the purpose of the example, only three brushes are manually defined.
public struct Brush {The indexers (indicated by boldface) allow a Brush to be retrieved using either its position in the array or its color. In the case of the color, we use a switch statement because we know the set of brush names. We could have used a foreach loop to look through the array and select the brush with the correct name if the number of brushes had been greater.
Collection InterfacesLike Java, the .NET Framework declares interfaces that define the core collection functionality; these interfaces provide common functionality that can be extended to achieve specialized behavior.A class that implements one of the collection interfaces is known as a collection class. Note that a collection class is not required to implement every interface member and may throw a NotSupportedException when unimplemented members are invoked. All of the interfaces covered in this section are found in the System.Collections namespace.
ICollectionThis is the foundation of the collections namespace and is implemented by all the collection classes. It defines only the most basic collection functionality, as shown in Table 9-1.Table 9-1 The ICollection Interface
The equivalent Java interface is java.util.Collection, containing 15 methods that provide commonality across all collection classes. .NET takes the approach of pushing method definitions down into specialized interfaces for lists and dictionaries.
IEnumerable, IEnumerator, and the foreach KeywordThe IEnumerable interface is very simple but useful. It contains only one method:
IEnumerator.GetEnumerator(); Classes that implement this method must return a class that implements the IEnumerator interface. The IEnumerator interface defines the notion of a cursor that moves over the elements of a collection. It has three members, as described in Table 9-2, for moving the cursor and retrieving elements from the collection. Table 9-2 The IEnumerator Interface
In common with Java, .NET enumerators are not thread safe, and when implementing IEnumerator, take care to provide a copy of the elements or to ensure that modifications cannot be made to the underlying data while the enumerator is in use. The following example demonstrates use of IEnumerator and IEnumerable:
using System ; Here's a fragment that demonstrates using the EnumExample class, iterating over a string array:
EnumExample x_example = new EnumExample( And here's the output:
Element: allen We can use the IEnumerator implementation to provide support for an IEnumerable implementation. In this example, we pass the underlying array into the enumerator.
class Enumerator : IEnumerable {To get hold of the enumerator, a call is made to the GetEnumerator method. Any class that implements the IEnumerable interface can be iterated over using the foreach statement, which provides a clean syntax when iterating over collections. Unfortunately, the foreach statement provides no mechanism to access the index of the element being worked with. If index access is a requirement, we suggest using a for loop.
IComparer and IComparableLike Java, .NET defines two interfaces used to sort collections. One is for classes that will make comparisons and the other for classes that will be compared; these are similar to the Java interfaces for sorting. .NET defines the interface System.IComparable, which is a direct equivalent of java.lang.Comparable. Both the Java and .NET interfaces define a single method that takes an object argument. The return value is an int stating the relative rank of the IComparable to the object argument. The second interface is System.Collections.IComparer, implemented by classes that are able to sort the elements of a collection. The only method in this interface is Compare, which takes two object arguments and returns their rankings.Most of the fundamental classes implement IComparable, including the struct implementations of all simple value types, making sorting with simple types straightforward. .NET provides two utility classes that can be used to assist with simple sorting. These are System.Collection.Comparer and System.Collection.CaseInsensitiveComparer. These classes are used in conjunction with sorting routines such as those implemented by System.Array. Here's an example:
string[] x_arr = new string[] {"Allen", "Jones", The output follows:
STR: Adam Instances of both comparers are obtained using their static Default field and are not constructed directly. The single instance can be shared between sorting operations in different threads and classes.
Other Collection InterfacesTwo other collection interfaces are worth noting: IList and IDictionary. These provide the foundation for indexed array collections (such as ArrayList) and key/value collections (such as Hashtable). We'll discuss the classes that implement these interfaces in the next section.
Basic CollectionsThe .NET Framework includes classes that compare to some of the core collections found in Java, but the overall support and flexibility fall short. This section covers the basic .NET collections, which are similar to their Java equivalents. We begin with coverage of arrays; .NET introduces some clever changes that allow arrays to be treated like collections.
ArraysThe most fundamental type of collection is the array. The use of arrays in .NET is similar to that in Java, with one significant enhancement: all arrays derive from the System.Array class. By deriving all arrays from the System.Array class, .NET provides useful functionality for array creation and manipulation. Although Java arrays are technically classed as objects, the language exposes methods only from the java.lang.Object class.Predominantly, arrays are created, assigned, and accessed using the integral language support provided by C#. For example:
int[] x_arr = new int[10];
The remainder of this section focuses on the object properties of arrays and their use as a collection. Arrays as Objects All the array actions that are normally handled using language syntax are possible through the members of System.Array. For example, the following code fragments are functionally equivalent:
// Using language syntax These examples demonstrate the object capabilities of .NET arrays. Bear in mind that while it is possible to use the System.Array class to handle all array features, there is no need to do so and the native support is recommended for most array operations. The primary advantage of the System.Array class is that it implements ICollection, IList, and IEnumerable, which allow arrays to be treated as collections. In addition, there are some useful static methods in Array that support sorting and searching. This feature will not revolutionize programming, but it is clever, and we admire the forethought of the designers. Table 9-3 summarizes the methods and properties available in the .NET System.Array class and identifies their Java equivalents. Because Java doesn't include a class comparable to System.Array, the alternatives are all static methods from the java.util.Arrays and java.lang.System utility classes. Table 9-3 Comparison Between Java and C# Array Methods
HashtableThe System.Collections.Hashtable class is similar to the Java java.util.HashMap class. The only important difference is that key/value pairs are stored and accessed via an indexer. Here's an example of how to set and get a Hashtable element:
Hashtable x_hash = new Hashtable(); The .NET Hashtable also provides the Add method, which adds a new key/ value pair to the element set only if the key does not already exist. If the key does exist, an exception will be thrown. Table 9-4 contrasts the principal methods of the System.Collections.Hashtable and java.lang.HashMap classes. Table 9-4 Comparison Between C#'s Hashtable and Java's HashMap Classes
ArrayListThe .NET System.Collections.ArrayList is equivalent to java.util.ArrrayList. Like Hashtable, ArrayList implements an indexer used to get and set the value at a specific index. The ArrayList class implements the IList interface and provides the unusual static method Adapter, which makes the methods for binary searches and sorting available on any IList by wrapping the collection in an ArrayList instance. This is not as useful as it sounds because the operations aren't optimized for the underlying structure of the list. Apart from the indexer and the adapter support, the Java and .NET classes are much the same. Table 9-5 compares the Java and C# ArrayList implementations.Table 9-5 Comparison of the Java and C# ArrayList Implementations
QueueA queue is a first in, first out (FIFO) collection, in which objects are retrieved in the order they were added. This is useful for processing messages or events. Although Java doesn't provide a class that enforces the FIFO constraint, we've seen examples of the LinkedList class being used to represent a queue. The problem with this approach is that LinkedList elements can be manipulated in ways contradictory to the queue model. Happily,.NET provides the System.Collections.Queue class, a concrete implementation of a queue that enforces the FIFO rule.The queue implementation is based on a circular array, and objects are inserted at one end and removed from the other. As with the ArrayList class, if the number of elements exceeds the capacity of the queue, the underlying array is resized to provide more space. Here's a simple example demonstrating how to use the Queue class:
Queue x_queue = new Queue(); The results of this example follow:
Dequeued: first element Elements are added to the Queue using the Enqueue method and are removed using Dequeue. The Queue class implements the IEnumerable interface and can be used in a foreach loop. In contrast with an ArrayList, it's not possible to get an element by index. However, it's possible to see what value is at the head of the queue by calling Peek. It can be determined that a value is contained in the Queue by calling the Contains method. The number of elements in the queue is available using the Count property.
StackUnlike a queue, a stack is a last in, first out (LIFO) collection, in which the most recently added element is the one that will be returned first. Java and .NET both provide stack implementations. The .NET stack class is System.Collections.Stack. Here's an example using Stack:
Stack x_stack = new Stack(); The result of this example follows:
POPPED: third element The Java java.net.Stack class has been around since Java 1.0 and is derived from the Vector class. This presents a potential misuse problem whereby a reference to a Stack can be cast to a Vector, allowing the underlying elements to be manipulated as a list. The .NET implementation isn't derived from any other collection and can't be misused in this manner. Table 9-6 shows the mapping between the Java and .NET stack classes. Table 9-6 Comparison Between the Java and C# Stack Implementations
SortedListThe System.Collections.SortedList class is a cross between a Hashtable and an ArrayList. Two arrays are maintained, one for keys and one for values. The value array is used when a value is requested by index, like an ArrayList. When a value is requested by key, the index is obtained by searching the key array and then the matching value is obtained from the value array, which behaves like a Hashtable but requires more operations.When a new item is added, the key array is re-sorted, and the value array is adjusted to reflect the changes. This is an unusual approach, and it can be seen that special support was added to the System.Array class to assist in the sort operations. (For example, look at the Sort(Array, Array) method as an example of this special support.) The closest analog in Java is the java.util.SortedSet interface and related concrete implementations. In general, requesting values by key is slower with a SortedList than a Hashtable. However, the advantage of being able to request a value by index makes this class valuable for some problems. Here's a simple example for this class:
SortedList x_list = new SortedList(); The output from this fragment is
First element: freeman The sorting operations can be handled by using the CompareTo method of classes that implement IComparable or by an instance of IComparator.
Specialized CollectionsThe System.Collections.Specialized namespace contains a series of collections that either are strongly typed or offer functionality that is not widely required.
Strongly Typed CollectionsThe strongly typed collections all deal exclusively with strings.NameObjectCollectionBase and NameValueCollection The NameObjectCollectionBase class is an abstract class that is based on a Hashtable but that accepts only strings as key types. The only concrete implementation of this class in the collections namespace is NameValueCollection. The NameValueCollection class is derived from NameObjectCollectionBase but can be used to store multiple values for each key. However, both the key and the values must be strings. The most obvious use of this class is processing HTTP headers. Here's an example:
NameValueCollection x_collection = new NameValueCol lection(); The result of this fragment follows:
VALUE: value1 StringCollection The StringCollection class represents an implementation of the IList interface such that the interface handles only strings. StringDictionary The StringDictionary class is the dictionary equivalent of the StringCollection class. In essence, it's an implementation of a Hashtable that accepts only strings as keys.
Unusual CollectionsThe following two collections have characteristics that make them useful in specific circumstances but not as everyday collections.ListDictionary The ListDictionary class implements the IDictionary interface using a single-linked array. It behaves like a Hashtable, which implements the same interface, but is faster when dealing with up to ten items. This is based on the premise that iterating over a small array is faster than computing hash codes. For more than ten items, the iteration takes longer than hashing and the benefits of this class are negated; a Hashtable will offer better performance HybridDictionary The HybridDictionary class provides the best of both a ListDictionary and a Hashtable. If the number of elements in the collection is small, they are maintained in a ListDictionary. Once the number goes above what's optimal for a ListDictionary, they are automatically transferred to a Hashtable for better performance. This results in a one-off performance hit. If the number of elements falls below the ListDictionary threshold, they remain in the Hashtable.
The CollectionsUtil ClassThe System.Collections.Specialized namespace also contains the CollectionUtil class, which we think would be better located in the main System.Collections namespace. Static methods of this factory class create case-insensitive instances of the ArrayList and Hashtable classes discussed earlier. This is achieved by using case-insensitive implementations of the underlying hash code provider and comparer.
SynchronizationJava has the legacy of the classic collections (Hashtable, Vector, and so forth) and the new collections that were introduced in Java 1.2 (HashMap, ArrayList, and so forth). Although some implementation differences exist, the main change was that none of the methods in the new collections are synchronized, but the classic collections are all thread-safe..NET has a different approach to thread-safe collections. All of the concrete classes contained in the System.Collections namespace are concurrently safe for multiple readers and a single writer. Multiple concurrent writers will cause problems. A synchronized (thread-safe) implementation of a concrete collection can be obtained by calling the Synchronized method, which returns a wrapped version of the collection, just like calling java.util.Collections.synchronizedCollection. This wrapped version is thread safe and can be used by concurrent writers. In addition, the ICollection interface provides a method to determine whether a collection is synchronized, and it provides an object that can be used for locking when writing thread-safe wrappers or using monitors.
Here's an example:
Hashtable x_unsafe = new Hashtable(); The result of these statements is shown below:
Sync state: False Even though it's possible to generate a thread-safe collection, enumerators are still troublesome. .NET enumerators are backed by the underlying collection, and any changes to the elements can cause problems. .NET takes the approachin common with Javaof throwing an exception if the elements are changed during the life of an enumerator.
Custom CollectionsLike Java, .NET makes it difficult to subclass concrete collection classes to provide specialized behavior without implementing large amounts of functionality. There always seems to be some vital method or inner class that is not accessible. Both languages provide abstract classes as the mechanism for custom designs, but we think that the .NET classes are better thought out.The abstract classes in .NET are listed in Table 9-7. Table 9-7 Abstract .NET Collection Classes
The most interesting aspect of these classes is that none of the collections detailed in this chapter are derived from them. There is no specific reason why custom collections have to be subclassed from one of these bases, but it is generally good practice to do so.
SummaryThis chapter has illustrated that collections in .NET are broadly similar to those in Java, although less comprehensive in nature. The most frequently used classes (Hashtable and ArrayList) are almost identical, allowing for basic differences between the Java and C# language.There are some striking omissions from the .NET class library, most notably support for sets and the base classes for providing least recently used (LRU) maps, such as the LinkedHashMap Java class. However, the ability to treat arrays as collections is a useful feature.
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||