Softpanorama
May the source be with you, but remember the KISS principle ;-)

Introduction to Perl 5.10 for Unix System Administrators

(Perl 5.10 without excessive complexity)

by Dr Nikolai Bezroukov


3.4. Operations on hashes

Associative arrays, or hashes, generalize regular arrays (lists) to non-numeric indexes. They provide a built-in dictionary capability in Perl. You put values into a hash by defining key-value pairs. Like Perl arrays, hashes grow and shrink automatically as you add or delete pairs. The main difference is that an array index is converted to a number before the value is retrieved, while a hash key is not converted and can therefore be an arbitrary string (for a regular array, any non-numeric index is equivalent to index 0). The first language to introduce this type of data structure was probably Snobol.

Definition and Initialization

To define a hash we use the usual parenthesis notation for lists, but the variable name is prefixed with a % sign. Suppose we want to create a hash that maps the host names of sites to their IP addresses. It would look like this:

%hosts = ("www.yahoo.com", "131.111.11.1", "www.northenlight.com", "128.11.1.1");

For readability it is possible to use => instead of the comma. This is mostly syntactic sugar: no additional diagnostics are generated, and the only extra effect is that a simple identifier to the left of => is quoted automatically. Still, it makes the pairs easier to see and as such is strongly recommended. Using this notation the previous example can be rewritten as
%hosts = ("www.yahoo.com" => "131.111.11.1", 
   "www.northenlight.com" => "128.11.1.1"
   );
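
The relationship between the comma and => can be sketched as follows. This is a minimal illustration, not part of the original example; the hash names %with_comma and %with_arrow are invented for it. A host name such as www.yahoo.com contains dots, so it is not a simple identifier and still needs quotes even with =>.

```perl
use strict;
use warnings;

# The same data written twice: once with commas, once with =>.
my %with_comma = ('US', 110, 'Russia', 220);
my %with_arrow = (US => 110, Russia => 220);   # the barewords US and Russia are quoted by =>

# Both hashes hold the same pairs.
foreach my $country (sort keys %with_comma) {
    print "$country: $with_comma{$country} / $with_arrow{$country}\n";
}
```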

Now we can find the IP addresses of the sites with the following expressions:

$hosts{"www.yahoo.com"}; # Returns 131.111.11.1
$hosts{"www.northenlight.com"}; # Returns 128.11.1.1

If the same key appears several times in the list, the last value is used, for example:

%voltages = ('US'=> 110, 'Russia'=>220, 'Russia'=> 127);
print "$voltages{'Russia'}"; # will print 127

Notice that, as with list arrays, the % sign changes to a $ when you access an individual element, because that element is a scalar. Unlike list arrays, the index (in this case the host name) is enclosed in curly braces. The idea is that associative arrays are different from arrays and do not convert the index to a numeric value, so the brackets should be different too. The index of an associative array, called the key, is always a string, so technically the {} operation converts any value to a string. There is no alternative numeric index for retrieving a value, and the sequential order of elements in a hash is undefined. For example:

$hosts{2}; # Not an error, but returns undef: it looks up the key "2", not the second pair that we put into the hash
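
The conversion of keys to strings can be seen in a small sketch. The hash %by_number is invented for this illustration; the point is only that a number used as a key is stored in its string form.

```perl
use strict;
use warnings;

my %hosts = ("www.yahoo.com" => "131.111.11.1");

# The key 2 is converted to the string "2"; no such key exists, so the result is undef.
my $missing = $hosts{2};
print defined($missing) ? "found\n" : "no pair with the key '2'\n";

# Numbers used as keys are stored as strings.
my %by_number;
$by_number{2}    = 'two';
$by_number{2.0}  = 'two again';   # 2.0 becomes the string "2": the same key
$by_number{"02"} = 'zero-two';    # the string "02" is a different key
my @stored = sort keys %by_number;
print "keys: @stored\n";
```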

A hash can be converted back into a list array just by assigning it to a list array variable. Again, the order in which the pairs are assigned sequential indexes is undefined. A list array can be converted into a hash by assigning it to a hash variable. The elements are treated as pairs, so the list should have an even number of elements for all pairs to make sense; otherwise the last key will have the value undef.

@host_lst = %hosts;
# @host_lst is a list array. It now has 4 elements
%hosts = @host_lst; # convert an array to a hash (even number of elements expected)
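
A short sketch of both conversions, including the odd-length case. The list @odd and the hash %months are invented for this illustration. With warnings enabled, Perl reports an odd-length assignment, so the warning is switched off locally here.

```perl
use strict;
use warnings;

my %hosts = ("www.yahoo.com"        => "131.111.11.1",
             "www.northenlight.com" => "128.11.1.1");

# Hash to list: keys and values alternate, the order of the pairs is unspecified.
my @host_lst = %hosts;
print scalar(@host_lst), " elements\n";

# List to hash: an odd number of elements leaves the last key without a value.
my @odd = ('jan', 1, 'feb');
my %months;
{
    no warnings 'misc';   # silence "Odd number of elements in hash assignment"
    %months = @odd;
}
print exists($months{feb})  ? "feb exists\n"  : "feb missing\n";
print defined($months{feb}) ? "feb defined\n" : "feb has the value undef\n";
```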

To change the value for any single key you can say:

$hosts{'www.yahoo.com'} = '131.1.1.1';

There is also a good but rarely used method to initialize hashes whose keys and values are single words, using the qw quoting operator:

%months = qw( 
	jan 1 
	feb 2 
	mar 3 
	); 

As you can see, for keys and values that are single words this method avoids quotes and thus quote-related errors (unmatched quotes). The latter is a pretty frequent error, especially with large lists.

Operations on hashes

Assignment is possible both from and to hash elements, much like from and to elements of a list. The only differences are that curly braces are used and that the index (key) is a string:
$month{"Jan"}=31;

It is important to remember that hashes do not have a "natural" order of elements like lists do, but it is possible to access all the elements in turn using the keys function, the values function, or the each function. These functions are usually used in loops and we will discuss them later. Right now we just provide a couple of typical examples without explanations:

foreach $dns (keys %hosts){
	print "The address of the site $dns is $hosts{$dns}\n";
}
foreach $ip (values %hosts){
	print "One of the portals has IP address $ip. Guess which one.\n";
}
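
Because the order returned by keys is unspecified, a common way to get repeatable output is to sort the keys first. The following is a minimal sketch of that approach; the array @lines exists only to collect the output for this illustration.

```perl
use strict;
use warnings;

my %hosts = ("www.yahoo.com"        => "131.111.11.1",
             "www.northenlight.com" => "128.11.1.1");

# sort puts the keys into alphabetical order before the loop starts.
my @lines;
foreach my $dns (sort keys %hosts) {
    push @lines, "$dns has IP address $hosts{$dns}";
}
print "$_\n" for @lines;
```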

There is also a function, each, which returns a two-element list (a key/value pair). Every time each is called it returns the next key/value pair:

while (($dns, $ip) = each(%hosts)){ 
   print "$dns has IP address $ip\n"; 
} 

Please note the difference: unlike the two previous examples, a while loop is used instead of a foreach loop. Please think about why a foreach loop would not work in this case.

Built-in Functions for Hashes

There are six main built-in functions for hashes: exists, delete, undef, keys, values, and each. As we have already seen, the functions keys and values return the key part and the value part of the pairs, respectively. The each function returns both elements (a key-value pair).

exists function

Generally, the standard way to test whether a scalar has a value is to check it for undef with the built-in function defined. In a hash each pair has two components, a key and a value. Here we need to check the existence of the key, not of the value, and thus we need a different function. The exists function does exactly that:

if (exists($hosts{'www.yahoo.com'})) {
   print $hosts{'www.yahoo.com'};
}
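
The difference between exists and defined shows up when a key is present but its value is undef. The sketch below assumes two made-up host names, unresolved.example and absent.example, and collects the results in a hash %report that exists only for this illustration.

```perl
use strict;
use warnings;

my %hosts = (
    "www.yahoo.com"      => "131.111.11.1",
    "unresolved.example" => undef,          # the key is present, the value is undef
);

my %report;
foreach my $dns ("www.yahoo.com", "unresolved.example", "absent.example") {
    my $has_key   = exists($hosts{$dns})  ? "yes" : "no";
    my $has_value = defined($hosts{$dns}) ? "yes" : "no";
    $report{$dns} = "key: $has_key, value: $has_value";
    print "$dns => $report{$dns}\n";
}
```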

delete function

The delete function removes one element from a hash. For example:

# Can be useful in security-related scripts
delete $ENV{PATH}; # deletes PATH environment variable.
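
Two further properties of delete are worth a sketch: it returns the value it removed, and it accepts a hash slice to remove several pairs at once. The host names a.example and b.example and their addresses are invented for this illustration.

```perl
use strict;
use warnings;

my %hosts = (
    "www.yahoo.com"        => "131.111.11.1",
    "www.northenlight.com" => "128.11.1.1",
    "a.example"            => "10.0.0.1",
    "b.example"            => "10.0.0.2",
);

my $removed = delete $hosts{"www.yahoo.com"};          # returns the removed value
my $nothing = delete $hosts{"no.such.host"};           # no such key: returns undef
my @gone    = delete @hosts{"a.example", "b.example"}; # hash slice: removes both pairs

print "removed $removed\n";
print "slice removed @gone\n";
print "pairs left: ", scalar(keys %hosts), "\n";
```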

undef function

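Applied to a hash, the undef function operates on the hash as a whole. For example:

undef(%hosts);

has the effect of removing every pair from the hash %hosts. To remove a single pair use delete instead: applying undef to one element, as in undef($hosts{$key}), only sets its value to undef and leaves the key in the hash.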
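
How undef behaves on a single element compared with a whole hash can be checked with a short sketch. The variables $still_there, $count_before and $count_after are invented for this illustration.

```perl
use strict;
use warnings;

my %hosts = ("www.yahoo.com"        => "131.111.11.1",
             "www.northenlight.com" => "128.11.1.1");

# undef on one element clears the value but keeps the key.
undef $hosts{"www.yahoo.com"};
my $still_there  = exists($hosts{"www.yahoo.com"}) ? 1 : 0;
my $count_before = scalar(keys %hosts);

# undef on the whole hash removes every pair.
undef %hosts;
my $count_after = scalar(keys %hosts);

print "key kept: $still_there, pairs before: $count_before, pairs after: $count_after\n";
```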

keys function

When keys is called it returns a list of the keys (indices) of the hash. It is usually used in loops to iterate through a hash. For example, the following code fragment prints out all the key-value pairs in the hash %hosts:

foreach $dns (keys %hosts){
   print "$dns => $hosts{$dns}\n";
}

It can also be used to get a list of keys from a given hash:

@all_dns_names = keys (%hosts);

which assigns to @all_dns_names the keys of %hosts in some random order.

values function

The values function is complementary to the keys function: it returns the values of the pairs instead of the keys. In scalar context both functions return the number of pairs in the hash.

In list context values returns a list of the values of the hash. It returns them in the same order as the keys function returns the keys, but this order has nothing to do with the order in which the elements were entered. It is usually used in loops, as we saw above. One can also use the values function outside loops to get a list of the values of a given hash:

@all_ip_addr = values(%hosts);
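
As long as the hash is not modified in between, keys and values return their lists in matching order, and in scalar context keys gives the number of pairs. The sketch below checks both; the variables @names, @addrs, $count and $consistent are invented for this illustration.

```perl
use strict;
use warnings;

my %hosts = ("www.yahoo.com"        => "131.111.11.1",
             "www.northenlight.com" => "128.11.1.1");

my @names = keys %hosts;
my @addrs = values %hosts;
my $count = keys %hosts;      # scalar context: the number of pairs

# Element i of @addrs is the value that belongs to element i of @names.
my $consistent = 1;
for my $i (0 .. $#names) {
    $consistent = 0 unless $hosts{ $names[$i] } eq $addrs[$i];
}
print "$count pairs, orders match: ", ($consistent ? "yes" : "no"), "\n";
```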

However, be aware of the overhead: keys and values build a list with one element for every pair, so a huge hash produces a huge list. If this is the case, you probably want to use each, described below.

each function

The each function is the way to get the pairs of a hash one at a time. Each call returns the next ($key, $value) pair. After the last pair has been returned, each returns an empty list (undef in scalar context), which ends the while loop.

Following is an example of the usage of each:

while (($dns, $ip) = each (%hosts)) {
   if ($ip eq "127.0.0.1") {
      next; # skip the loopback address
   }
   print "$dns has IP address $ip\n";
}

Note: You should not insert or delete elements, or change keys, in the hash while iterating through it using each. In this case each can become confused and the results are unpredictable. In the example below the user tried to change all keys to uppercase:

while ( ($dns, $addr) = each (%hosts) ) {
   $hosts{uc($dns)} = $addr; # bad idea: inserts new keys during iteration
}

In this case the keys function is a safer solution, because it builds the complete list of keys before the loop starts:

foreach $dns (keys %hosts){
   print "I know the ip of $dns, but will not tell what it is\n";
}
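
The uppercase conversion itself can be done safely on top of keys. This is one possible sketch, not the only way to do it: the list of keys is built once before the loop, so inserting and deleting pairs inside the loop does not disturb the iteration. The variable $upper is invented for this illustration.

```perl
use strict;
use warnings;

my %hosts = ("www.yahoo.com"        => "131.111.11.1",
             "www.northenlight.com" => "128.11.1.1");

foreach my $dns (keys %hosts) {            # the key list is complete before the loop runs
    my $upper = uc($dns);
    next if $upper eq $dns;                # already uppercase, nothing to rename
    $hosts{$upper} = delete $hosts{$dns};  # move the value to the new key
}

print "$_ => $hosts{$_}\n" for sort keys %hosts;
```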

Summary

A common way to initialize a hash is to use the => notation:

%mconvert=('Jan'=>1, 'Feb'=>2, 'Mar'=>3,
	'Apr'=>4, 'May'=>5, 'Jun'=>6,
	'Jul'=>7, 'Aug'=>8, 'Sep'=>9,
	'Oct'=>10, 'Nov'=>11, 'Dec'=>12);


Last modified: July 07, 2013