Plotting ASCII histograms in Perl with Gnuplot

Histograms are one of the simplest and handiest data visualization tools. They give you a quick view of how tightly or loosely your data clusters, whether it looks normally distributed, and whether it has outliers. If you spend a lot of time processing data in Perl and don't need a high-fidelity plot, it is nice to just plot using ASCII art. Luckily, the Perl library Chart::Gnuplot gives you access to Gnuplot, the powerful command-line plotting tool, which can do exactly that.

First off, we load a few libraries: strict and warnings to help catch obvious coding mistakes, POSIX to give us the floor and ceil functions, and Chart::Gnuplot for obvious reasons...

use strict; 
use warnings; 
use POSIX qw(floor ceil); # floor() and ceil() functions
use Chart::Gnuplot; # Perl Gnuplot library

Next, generate an array of data for which you want to plot a histogram. This would typically be data you've parsed out of a text file or calculated based on its contents. For this example I generated some fake classroom grade data for 20 students in Excel using the formula norm.inv(rand(), 70, 10) (that's a Normal distribution with a mean of 70 and standard deviation of 10). 

my @data=(
61.98482, 65.68389, 56.50473, 82.21215, 54.66456, 
76.90416, 70.34352, 47.24502, 53.32660, 66.73437, 
76.98941, 75.92255, 76.78182, 66.64782, 68.04657, 
54.65173, 68.58608, 63.18633, 89.06680, 69.02130,
);
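
If you would rather stay in Perl than go through Excel, fake grades like these can also be drawn with the Box-Muller transform. This is only a sketch: normal_sample is a hypothetical helper name, the seed is arbitrary, and the values it produces will differ from the ones listed above.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the original post): draw one sample from
# a Normal distribution with the given mean and standard deviation, using
# the Box-Muller transform.
sub normal_sample {
    my ($mean, $sd) = @_;
    my $pi = 4 * atan2(1, 1);
    my $u1 = 1 - rand();    # in (0, 1], so log() never sees zero
    my $u2 = rand();
    my $z  = sqrt(-2 * log($u1)) * cos(2 * $pi * $u2);    # standard Normal
    return $mean + $sd * $z;
}

srand(42);    # arbitrary fixed seed so the fake grades can be regenerated
my @generated = map { normal_sample(70, 10) } 1 .. 20;
printf "%.5f\n", $_ for @generated;
```

Paste the printed values into @data, or assign @generated to it directly, to plot them instead of the Excel numbers.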

Now convert the raw data into the data for the histogram plot itself:

# Generate the histogram plot data
my $binwidth = 10;
my @histdata = sort { $a <=> $b } @data; # Sort data numerically
my $x_min = floor($histdata[0] / $binwidth) * $binwidth; # Beginning of x plot range
my $x_max = ceil($histdata[-1] / $binwidth) * $binwidth; # End of x plot range
my @bins = map { $binwidth * $_ } $x_min / $binwidth .. $x_max / $binwidth; # Lower edge of each bin
my @histo = (0) x @bins; # One counter per bin, all starting at zero
foreach (@histdata) { # Count how many values fall into each bin
	$histo[ int( ( $_ - $x_min ) / $binwidth ) ]++;
}
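
Before handing anything to Gnuplot, it can help to check the bin counts by eye. The sketch below wraps the same arithmetic in a sub and prints one row per bin. The names bin_counts and @grades are made up for this illustration, and it uses min and max from List::Util in place of the sort.

```perl
use strict;
use warnings;
use POSIX qw(floor ceil);
use List::Util qw(min max);

# Hypothetical helper: the same binning arithmetic as in the post.
# Returns references to the bin lower edges and the per-bin counts.
sub bin_counts {
    my ($binwidth, @values) = @_;
    my $x_min = floor(min(@values) / $binwidth) * $binwidth;
    my $x_max = ceil(max(@values) / $binwidth) * $binwidth;
    my @bins  = map { $binwidth * $_ } $x_min / $binwidth .. $x_max / $binwidth;
    my @histo = (0) x @bins;
    $histo[ int(($_ - $x_min) / $binwidth) ]++ for @values;
    return (\@bins, \@histo);
}

my @grades = (
    61.98482, 65.68389, 56.50473, 82.21215, 54.66456,
    76.90416, 70.34352, 47.24502, 53.32660, 66.73437,
    76.98941, 75.92255, 76.78182, 66.64782, 68.04657,
    54.65173, 68.58608, 63.18633, 89.06680, 69.02130,
);

my ($bins, $histo) = bin_counts(10, @grades);

# One row per bin: lower edge, count, and a bar of asterisks.
printf "%3d  %2d  %s\n", $bins->[$_], $histo->[$_], '*' x $histo->[$_]
    for 0 .. $#{$bins};
```

For the grades above this gives counts of 1, 4, 8, 5, 2 and 0 for the bins starting at 40 through 90. The last bin only fills when the largest value lands exactly on a bin edge.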

After that, we create the Gnuplot chart and configure it for histogram plotting:

my $tmpfile = "temp.tmp";
my $chart = Chart::Gnuplot->new(
	terminal=>'dumb', # ASCII plotting
	output =>$tmpfile, # Gnuplot will plot to this temp file
	title=>"Final exam grades distribution",
	xlabel=>"Grade",
	ylabel=>'',
	border => {
		sides    => "bottom, left",
	},
	xtics => {
	  mirror => "off",
	},
	ytics => {
	  mirror => "off",
	},
);

Setting the terminal type to 'dumb' makes Gnuplot plot in ASCII art to the file denoted by 'output'.

After that, we add our histogram data as a data set in the chart, then tell Gnuplot to plot it. Finally, we read the ASCII plot back from the temp file and print it to the console or a summary file...

my $dataset = Chart::Gnuplot::DataSet->new(
	using => "2:xticlabels(1)",
	style => "histograms",
	xdata => \@bins,
	ydata => \@histo,	
);
 
$chart->plot2d($dataset);
 
my $content;
open(my $fh, '<', $tmpfile) or die "cannot open file $tmpfile: $!";
{
	local $/; # Undefine the record separator so <$fh> reads the whole file
	$content = <$fh>;
}
close($fh);
print($content);
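
A fixed name like "temp.tmp" can collide with another run and is left behind afterwards. One possible alternative, sketched here under the assumption that the core File::Temp module is acceptable, is to let Perl pick a unique temp file and delete it on exit. The slurp helper is a made-up name, and the placeholder write stands in for the plot2d call so the sketch runs without Gnuplot.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: read a whole file into one string.
sub slurp {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "cannot open file $path: $!";
    local $/;    # undefine the record separator so <$fh> returns everything
    my $content = <$fh>;
    close($fh);
    return $content;
}

# tempfile() returns an open handle and a unique path; UNLINK => 1 asks
# File::Temp to remove the file when the program exits.
my ($tmp_fh, $tmpfile) = tempfile(SUFFIX => '.txt', UNLINK => 1);

# Stand-in for $chart->plot2d($dataset): write a line to the file so this
# sketch does not depend on Gnuplot being installed.
print {$tmp_fh} "placeholder plot\n";
close($tmp_fh);

print slurp($tmpfile);
```

In the real script you would close $tmp_fh right after creating it, pass $tmpfile as the chart's output option, and call slurp once plot2d has run.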

Putting it all together gives the full code at the bottom of this post. Running it prints the raw data, followed by this plot:

                        Final exam grades distribution
 
  8 +-+                           *****
    |                             *   *
  7 |-+                           *   *
    |                             *   *
  6 |-+                           *   *
    |                             *   *
  5 |-+                           *   *      ****
    |                             *   *      *  *
  4 |-+                 *****     *   *      *  *
    |                   *   *     *   *      *  *
    |                   *   *     *   *      *  *
  3 |-+                 *   *     *   *      *  *
    |                   *   *     *   *      *  *
  2 |-+                 *   *     *   *      *  *      ****
    |                   *   *     *   *      *  *      *  *
  1 |-+       *****     *   *     *   *      *  *      *  *
    |         *   *     *   *     *   *      *  *      *  *      +
  0 +----------------------------------------------------------------------+
             40        50        60         70        80        90
                                     Grade

If you found this useful, please let me know about it in the comments below. Thanks for reading! 

Full code for this post:

use strict; 
use warnings; 
use POSIX qw(floor ceil); # floor() and ceil() functions
use Chart::Gnuplot; # Perl Gnuplot library
 
my @data=( # Some dummy class grades generated in Excel
61.98482, 65.68389, 56.50473, 82.21215, 54.66456, 
76.90416, 70.34352, 47.24502, 53.32660, 66.73437, 
76.98941, 75.92255, 76.78182, 66.64782, 68.04657, 
54.65173, 68.58608, 63.18633, 89.06680, 69.02130,
);
 
print "The raw data:\n", join("\n", @data), "\n";
 
# Generate the histogram plot data
my $binwidth = 10;
my @histdata = sort { $a <=> $b } @data; # Sort data numerically
my $x_min = floor($histdata[0] / $binwidth) * $binwidth; # Beginning of x plot range
my $x_max = ceil($histdata[-1] / $binwidth) * $binwidth; # End of x plot range
my @bins = map { $binwidth * $_ } $x_min / $binwidth .. $x_max / $binwidth; # Lower edge of each bin
my @histo = (0) x @bins; # One counter per bin, all starting at zero
foreach (@histdata) { # Count how many values fall into each bin
	$histo[ int( ( $_ - $x_min ) / $binwidth ) ]++;
}
 
# Create a Gnuplot chart and configure it for histogram plotting
my $tmpfile = "temp.tmp";
my $chart = Chart::Gnuplot->new(
	terminal=>'dumb', # ASCII plotting
	output =>$tmpfile, # Gnuplot will plot to this temp file
	title=>"Final exam grades distribution",
	xlabel=>"Grade",
	ylabel=>'',
	border => {
		sides    => "bottom, left",
	},
	xtics => {
	  mirror => "off",
	},
	ytics => {
	  mirror => "off",
	},
);
 
# Add the data set in @histo, with bins demarcated by @bins
my $dataset = Chart::Gnuplot::DataSet->new(
	using => "2:xticlabels(1)",
	style => "histograms",
	xdata => \@bins,
	ydata => \@histo,	
);
 
# Plot the histogram - this line plots to the temp file denoted above
$chart->plot2d($dataset);
 
# Read the plot back from the temp file...
my $content;
open(my $fh, '<', $tmpfile) or die "cannot open file $tmpfile: $!";
{
	local $/; # Undefine the record separator so <$fh> reads the whole file
	$content = <$fh>;
}
close($fh);
# ... and print to the terminal
print($content);

 
