Perl Script to Parse Log File

Problem

You have a log file on a UNIX operating system. Each line in the log file contains an IP address in the 10th column, where columns (or fields) are separated by a space. Write a Perl program to parse the log file and print the unique IP addresses it contains.

Solution

Read the file line by line and split each line into an array of tokens using whitespace as the separator. Push the 10th token into a result array. There are two ways to print the unique IP addresses from that array. The first is to sort the array, which is an O(n log n) operation, and then walk through the sorted array, printing each element only if it differs from the previously printed one.
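
As a minimal sketch of the tokenizing step on its own, the snippet below pulls the 10th field out of a single line. The helper name tenth_field and the sample line are made up for illustration; a real log line will look different.

```perl
use strict;
use warnings;

# Hypothetical helper: return the 10th whitespace-separated field of a line,
# or undef when the line has fewer than 10 fields.
sub tenth_field {
    my ($line) = @_;
    my @fields = split(' ', $line);   # ' ' splits on runs of whitespace
    return $fields[9];                # index 9 is the 10th field
}

# Made-up sample line with the IP address in the 10th column
my $sample = "f1 f2 f3 f4 f5 f6 f7 f8 f9 192.168.1.10 f11\n";
print tenth_field($sample), "\n";
```

Splitting on ' ' (a single space in quotes) is a special case in Perl: it skips leading whitespace and treats any run of whitespace as one separator, so a trailing newline never ends up in the last token.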

Code

Here is the code in Perl

    #!/usr/bin/perl
    use strict;
    use warnings;

    # Open the log file for reading
    open(my $in, '<', 'mylog.txt') or die "cannot open mylog.txt: $!";

    # Array to collect every IP seen (duplicates are removed when printing)
    my @uniqueIPs = ();

    # Read the log file line by line
    while (my $line = <$in>)
    {
        # Split the current line into whitespace-separated tokens
        my @fields = split(' ', $line);
        # Skip lines that have fewer than 10 fields
        next if @fields < 10;
        # The 10th token (index 9) is the IP address
        push(@uniqueIPs, $fields[9]);
    }
    close($in);

    # Sort the result array so that duplicates become adjacent
    @uniqueIPs = sort @uniqueIPs;

    # Go through the sorted list and print an IP only when it
    # differs from the previously printed one
    my $prev;
    for my $ip (@uniqueIPs)
    {
        next if defined $prev && $ip eq $prev;
        print "$ip\n";
        $prev = $ip;
    }

Solution

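The other way is to store the IP addresses in a hash, where the key is the IP itself and the value is the count (or any placeholder, since only the keys matter here). Once the hash is fully populated, go through it and print the keys. Note that the keys come back in no particular order.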

Code

Here is the code in Perl

    #!/usr/bin/perl
    use strict;
    use warnings;

    # Open the log file for reading
    open(my $in, '<', 'mylog.txt') or die "cannot open mylog.txt: $!";

    # Hash whose keys are the unique IPs
    my %uniqueIPs = ();

    # Read the log file line by line
    while (my $line = <$in>)
    {
        # Split the current line into whitespace-separated tokens
        my @fields = split(' ', $line);
        # Skip lines that have fewer than 10 fields
        next if @fields < 10;
        # The 10th token (index 9) is the IP address and becomes the
        # hash key. The value is not important, so it is hardcoded to 1.
        $uniqueIPs{$fields[9]} = 1;
    }
    close($in);

    # Go through the hash and print the keys, which are the
    # unique IPs (in no particular order)
    for my $ip (keys %uniqueIPs)
    {
        print "$ip\n";
    }
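
The description of the hash approach mentions storing a count as the value, while the listing hardcodes 1. Here is a minimal sketch of the counting variant, assuming the IPs have already been extracted into a list. The sub name count_ips and the sample addresses are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: map each IP to the number of times it appears.
sub count_ips {
    my @ips = @_;
    my %count;
    $count{$_}++ for @ips;    # a missing key starts as undef; ++ makes it 1
    return %count;
}

# Made-up sample data
my %count = count_ips('10.0.0.1', '10.0.0.2', '10.0.0.1');

# keys returns hash keys in no particular order, so sort for stable output
for my $ip (sort keys %count) {
    print "$ip $count{$ip}\n";
}
```

The set of keys is the same as in the hardcoded version, so the unique IPs are still available from keys, and the counts come along at no extra cost.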