Category ArchiveCoding Tips
Coding Tips & Computers & Generic Mapping Tools 01 Aug 2008 03:37 pm
Creating Pie Wedge plots in GMT
A coworker recently approached me if asked if I knew how to make Pie Wedge plots in the Generic Mapping tools using the -Sw(w) switch. I had never done this before, but I thought it would be a cool thing to do so I tried my hand a making one.
It was tougher than I thought, and while I have seen these types of plots quite a bit in the fisheries world, there didn’t seem to be any examples of how to do this. So I thought I would post up here what I did, both for myself in the future and for anyone else in the world who may be interested in this type of plot.
Basically what I want to do is to make a dummy plot with a pie chart centered at every 5×5 degree box with a different size outer circle based on the total number in the box, and wedges representing percentages of that total amount.
For this exercise I am using the PXSY routine of GMT4.2.0 with my default MEASURE_UNIT = in.
To make this chart you have to have 5 columns of data:
Longitude — Latitude — Radius — StartAngle — EndAngle
So say I want a pie chart to represent how many types of widgets I sold in the area from 160-155W, 18-23N, with each wedge a portion of the widgets and centered in the middle of the 5×5 box. Here’s the data:
#lon lat blueWidget greenWidget redWidget
-157.5 20.5 200 200 400
So let’s say I make a map where each inch = 1 degree. The largest that I want my pie wedge diameter is 1 inch, so I know that for this example (only one data point) I will make the radius 0.5. I also know that the total for this example point is 800, so I convert this into angles. I actually have to make 3 rows of data now since I have three widgets. I also converted to 0-360 degrees.
$>cat pienc.xy
#lon lat radius startAngle endAngle
202.5 20.5 0.5 0 90 #end angle is 360 * (200/800)
202.5 20.5 0.5 90 180 #start angle is row-1 end angle
202.5 20.5 0.5 180 360 #Finish circle
So, a nice shell script to plot this up with the output below:
$>cat pienc
#!/bin/ksh
psfile=pie.ps
psbasemap -Jm1 -R200/206/17/23 -Bf1a1g1/f1a1g1WeSn -X1.5 -Y4 -P -K > $psfile
pscoast -Jm -R -O -K -Di -G200/200/200 -W1/0/0/0 >> $psfile
psxy pienc.xy -Sw -Jm -R -O -K >> $psfile
echo “203 24 12 0 0 6 Pie Chart Example” | pstext -Jm -R -O -N >> $psfile
$>display pie.ps
So that’s all well and good, except it would be nice to have different colors for each wedge representing a different widget. To get color in there you have to give a new column of values that will be mapped to a color value in a color lookup table (a cptfile in GMT). This column must be in the third row and then the -C switch must be given in the psxy call.
$>cat pie.xy
#lon lat COLORVALUE radius startAngle endAngle
202.5 20.5 1 0.5 0 90 #end angle is 360 * (200/800)
202.5 20.5 2 0.5 90 180 #start angle is row-1 end angle
202.5 20.5 3 0.5 180 360 #Finish circle
And my cptfile:
$>cat pie.cpt
0 0 0 255 1.1 0 0 255
1.1 0 255 0 2.1 0 255 0
2.1 255 0 0 3.1 255 0 0
The adjusted script:
$>cat pie
#!/bin/ksh
psfile=pie.ps
psbasemap -Jm1 -R200/206/17/23 -Bf1a1g1/f1a1g1WeSn -X1.5 -Y4 -P -K > $psfile
pscoast -Jm -R -O -K -Di -G200/200/200 -W1/0/0/0 >> $psfile
psxy pie.xy -Sw -Jm -R -O -K -Cpie.cpt >> $psfile
echo “203 24 12 0 0 6 Pie Chart Example with Color” | pstext -Jm -R -O -N >> $psfile
$>display pie.ps
And that’s pretty much it. Now to go sell some more widgets.
Linux & Mac OS X & R & python 22 Jul 2008 09:11 pm
Trying Sage mathematical software part II - Running trials (#1) Continued
This was quite possibly the worst idea for title naming that I could have thought of. Anyway, I played around a bit more tonight, and I thought that I would give an update to the three people that are waiting with bated breath.
Anywho, I decided to continue trying to map the data from the netcdf file onto a projection, and here’s what I ran into.
It looks like the basemap module is installed (as basemap) but that it depends on matplotlib > 0.98 and 0.91 is installed. I tried to be tricky and move my locally installed matplotlib over to the sage/local/lib/python2.5/site-packages directory but then that version of matplotlib needed a newer version of numpy than what was installed. At this point I tried
hostname $> sage -upgrade
to see if updated packages/modules were available. This started a huge chain reaction of downloads and source compiling to get to the latest, greatest versions. This process took exactly 59m10.482s to complete (I know because it told me!).
But once again, I get this error:
sage: from basemap import basemap
ImportError: your matplotlib is too old - basemap requires version 0.98 or higher, you have version 0.91.1
At this point though, it’s not working on either the linux or OSX platforms due to outdated dependencies, so either I need to find another way to plot mapped projections or use something else.
Again, this isn’t a knock against Sage, because I really don’t think that is an ideal test for this software. But honestly, a lot of why I went for this approach was to avoid having to use separate approaches for data manipulation and visualization, and this would be a common task. Matlab’s mapping toolbox is useless to me for plotting, so I end up using m_map, which is still not as good as GMT, but it gets the job done in house.
My main thoughts at this point are that it seems easy to get into dependency hell here, as one module upgrade can force another, and so on. At this point it’s another block of time spent on setup, and no result. Time to stop for the time being.
Technorati Tags:
Computers, linux, Mac OS X, programming, python, R, sage, scientific programming
Computers & Linux & Mac OS X & R & python 21 Jul 2008 09:43 pm
Trying Sage mathematical software part II - Running trials (#1)
Part 1 of the sage experience was just installing the software. This was incredibly easy on both OSX and linux (CentOS 5.2 and Fedora 9). For the Fedora 9 install I just downloaded the latest version of sage which was compiled for Fedora 8, and this seemed to be just fine.
So for me, I really just wanted to be able to do a few different examples which would be close to “real world applications” for me.
Some things that I would like to be able to do in sage:
1. Load in a 2-D NetCDF satellite data file and display it as a map projection. This should be really simple. I would usually just use GMT for this (a small shell script wrapping psbasemap, grdimage, and pscoast).
2. Load in a data series with dates and locations, and match this to corresponding satellite data in time and space. Normally I would use a perl script that I wrote many moons ago to do this. I would basically sort the data, then match a block of data at a time using GMT’s grdtrack function. I know that this is inefficient, and really I would like to be able to pull extra data in x,y, or t and take the mean or median value, which would be more CPU intensive, but better than matching just one point in space and time to the nearest pixel.
3. Load in a multivariate data series and do multivariate statistics (e.g. LME, GLM/GAM, RDA). This is where the R interface would come into play. Normally I would prepare the data elsewhere, then import the flat table into R and use the R functions. This may involve installing more packages (nlme, mgcv, etc).
4. Load in a 3-D set (x,y,t) of satellite data files and perform an EOF analysis on them (akin to SVD in Matlab). Normally I would do this in Matlab or Ferret. I’m just curious how easy it would be to do this here.
There are other things that I could do, but these are a few off the top of my head, and things that I am doing now, so it would be incentive to try Sage out with. For tonight, I’ll just work on #1, which should be really fast.
The data file I’m using is just a NetCDF file (created by GMT) which I can read with pupynere in python. Here I’m going to use the scipy.io.netcdf module (which is actually based on pupynere I believe).
sage: from scipy.io.netcdf import *
sage: from pylab import *
# Read in file metadata to object
sage: ncfile = netcdf_file(’RS2006001_2006031_sst.grd’)
# get the variables in the data file
sage: ncfile.variables
{’x': <scipy.io.netcdf.netcdf_variable object at 0xb47b08c>,
’y': <scipy.io.netcdf.netcdf_variable object at 0xb47b16c>,
’z': <scipy.io.netcdf.netcdf_variable object at 0xb47b1ec>}
# Yank out data
sage: longitude = netcdf.variables['x'][:]
sage: latitude = netcdf.variables['y'][:]
sage: sst = netcdf.variables['z'][:]
# just plot sst to test 2D image plotting
sage: plot(sst)
[<matplotlib.AxesImage instance at 0xc03636c>]
Nice, but it’s upside down. Let’s flip it vertically.
sage: clf
sage: plot(flipud(sst))
[<matplotlib.AxesImage instance at 0xb86a2ac>]
sage: savefig(’temp.png’)
Easy, but I want to put this on a projection. Normally I would use the basemap tools which are an add on to matplotlib. I don’t see these installed, and I didn’t see them in the extra sage packages on line, so I downloaded them from SourceForge and installed them.
The first step you have to do is to install the geos package, just read the README in the geos folder and hit
./configure
make
and then we get our first epic fail. Something in the geos chain won’t compile, and I’m just about fried enough to call it quits for this evening.
At this point I’ve been playing with this for more than 2 hours, and I still have yet to make a simple map on a projection. There has to be something I’m missing, but at this point I’m going to pause until tomorrow. So not the best testing evening, but there are some positives so far. The bundling of most packages is a plus, and the ease of loading in NetCDF files is nice. Data displays well using the Pylab interface, even though I am still forced to save to a file at this point.
So immediate goals:
1. Get a backend working for viewing plots in widgets (akin to ipython -pylab)
2. Get the basemap tools installed so that I can make a map with a projection!
Technorati Tags:
Computers, linux, Mac OS X, python, R, scientific programming, sage
Coding Tips & Computers 16 Jul 2008 04:17 pm
Making seasonal climatologies in FERRET
In my travels one of the things that I end up doing is to make seasonal climatologies out of 3 dimensional data sets (longitude, latitude, time). For example, I may have a series of monthly sea surface temperature images for the North Pacific from January 1950 through December 2007 and I want to get the spatial averages for each seasonal. In other words for each spatial point I want to average all the points in time for all January-March months (season 1), April-June months (season 2) etc.
What I would usually do is to do something kludgy like either write a shell script to cycle through all the months that I wanted to collapse, and then dump the data from the GMT NetCDF format into binary files and then run blockmean and regrid them in GMT’s NetCDF format. Sometimes if the data was reasonable in size I would just load them directly into Matlab and index the months that I wanted, subset the array and then do a nanmean(data,3).
Neither one of these methods is the easiest or most efficient way to do things, this I know. So with my newfound friend Ferret I figured that there must be a really simple, built-in way to make the climatologies from this XYT data, and there was.
Basically Ferret allows for a Modulo regridding to a built-in climatological time axis. What would have taken quite a bit of time in my previous attempts took me 10 seconds in Ferret.
! load example data set
SET DATA levitus_climatology
USE climatological_axes
LET sst_clim = SST[d=1,X=100:260,Y=10;60,GT=seasonal_reg@MOD]
CANCEL DATA climatological_axes
SHADE sst_clim[L=1] ! shade Quarter 1
This now provides an L=4 data set of seasonal climatologies centered in time on [15-Feb, 16-May, 16-Aug, 15-Nov].
So, while 10 years ago half the fun may have trying to be clever and code a faster solution than the last one that I had made, in my old age I tend to just want to get this portion of it done so that I can interpret the results. This is very nice indeed.
Technorati Tags:
Computers, linux, programming, scientific programming
Linux & Mac OS X & Matlab & python 10 Jul 2008 10:04 am
Matlab 2008a activation woes
OK, so I spent a bit of time yesterday thinking about some alternatives to Matlab versus the time/money balance of learning a completely new system. Today’s adventure activating Matlab 2008a has me definitely leaning in the “alternative” direction.
A little background on Matlab 2008a and the activation model. With 2008a Matlab has moved to an activation model, where you have to basically register your computer with The Math Works. With an individual license there are two choices you have: lock the license to a computer, or lock it to an individual. As I use Matlab at home, at work, and on my laptop I chose to lock it to myself.
Now Matlab 2008a came out in February 2008, and I’m just trying to install it now, which is all based on my fear of activation, since usually I am eager to download and install the latest version. So I bit the bullet yesterday and installed the OSX (laptop) and linux versions. Usually I get a nice CD package with installations for all platforms but that must have ended as well.
So this morning I figured that I would give installation a go. Usually a 5-10 minute affair, I spent 90+ minutes on this, and it still doesn’t work on the linux box. Basically what would happen is that I would go through the automated steps (generate the file key, license file, download and give info where needed, rinse, repeat) and watch Matlab completely have a hissy fit that either my username didn’t match, or my host ID didn’t match or that I didn’t have the desk pointed northeast (well, not that last one). I did my civic duty and dug around the troubleshooting site, where I found exactly three entries for activation problems. This must mean that I am completely unlucky to both have a problem and to have a non-standard problem that could not be fixed with answers #1-3.
At any rate, time is money right? So I figured I would give tech support a call. I got through on the second try, and actually go a human who was surprisingly helpful. The main problem on both machines was that the host ID generated was from the MAC address of the active internet connection, which is not what the license manager wants to see. So the laptop fix was an easy one after all, just use the MAC address of the primary connection, which was the wired connection (I was using the wireless at the time). In hindsight, yeah, that makes sense, but there’s still too much voodoo involved for me.
The linux box was worse, where I do not have eth0 enabled since I have problems with it, and I use eth1 instead. So while to the operating system there is only eth1 (as far as I understand) the license file wants to communicate with eth0, which of course doesn’t exist in the software space (now I may be way off on this, but this is what I could come up with for now). So my options are to either rename eth1 to eth0, or activate eth0. Well guess what, at this point I’m inclined to just not run Matlab 2008a since I don’t really feel like risking the possibility of borking my internet connection just to get the latest version of Matlab running. So remind me again why I am paying those maintenance fees? Maybe I’m no activation super genius, but should it really be this difficult to get software that you’ve paid good money for to run? Sure, there’s still an open thread with tech support but how much more time do I want to spend on this today? Well that’s easy, none.
Now I may feel differently in a bit and try again. Or I may just drink more coffee and start installing Sage or Enthought and see how that goes. Or maybe I’ll just try to get some work done.
Technorati Tags:
linux, Mac OS X, python, matlab
Computers & Linux & Mac OS X & Matlab & R & python 09 Jul 2008 08:22 am
Matlab, Python, or R… Time versus Money
I’ve been a Matlab user for 15 years, and over that time period I’ve of course become fairly dependent on it to get things done quickly. The downside? It’s expensive. It’s a pretty penny to buy the base package, toolboxes are extra, and there are recurrent “maintenance” costs each year to get upgrades.
Sure, that’s standard practice, but each year I have to stand up and justify to my boss why we need to pay these costs for our multiple Matlab users in our shop (a multi-user concurrent license is out of the question, don’t even ask). So what’s a user to do?
For years we’ve just bit the bullet and paid the fee, but with options such as R and Numpy/SciPy out there it may be time to loosen the chain a bit. Or maybe not.
A couple of possible alternatives to Matlab and their respective pros and cons:
R
R is a really nice statistical environment which has pretty much become the industry standard, replacing the very expensive S-Plus. It’s easy to install, has an excellent GUI on OS X, and has a ton of community released packages which are usually made during the preparation of scientific papers. There are some downsides, as there can be multiple (sometimes possibly conflicting) packages (e.g. gam vs mgcv) but choice is good, right? The cons for me are that it’s a new language to learn, and even though I write an m-script for everything, I find the scripting in R a bit clunky, even writing in TextWrangler and then hitting CTRL-R to have the SendToR script source the code for me. It’s just something new, and while the built in functions are really nice, the learning curve for coding things is higher, and will it be faster in the long run than just using Matlab?
Numpy/SciPy
The Numpy/SciPy combo in Python is a viable alternative to Matlab, even having a page dedicated to showing you how easy it is to transition from Matlab. As with R, it’s free, and there are a ton of functions available, but there is a downside for me. I’ve successfully installed it on CentOS 5.1 and OS X 10.5, but it was a bit complicated. I know that these are packaged in many distributions, but not in CentOS, and I had to install from either source or .egg files, which isn’t all that tough, but took some time. I’m not writing the 24.3 steps I did to get it installed because honestly, I didn’t write it down and I don’t remember what I did. Next time I promise to list it out! On OS X I did it all through MacPorts on the MP version of python 2.5. Again, it took some massaging to get it all set up since I was using the non-default install of python.
Overall though, the reason for this little diatribe is that while there are alternatives to Matlab, they all involve learning new ways to do things which, after I successfully learn them, may not be faster than just doing it in Matlab. Most of the time I just need to get things done, and the $7/day cost of Matlab may be well worth it if I’m saving more than 10 minutes of time during that day (assuming for a minute that I am earning $42/hour).
I’m rambling a bit here, but these are just questions that I ask myself as I code things up at the desk. For each of these tools has their place, and in terms of maximum comfort and speed, I use each of them for their strengths. The main dilemma is that in a perfect situation I would drop the commercial Matlab for the free/open source alternatives, but at a minimal cost in dollars and time.
Technorati Tags:
linux, Mac OS X, programming, python, R, scientific programming
Mac OS X & R 02 Jul 2008 10:31 pm
Using R2WinBUGS in Mac OS X
As I alluded to in a previous post, the main reason for getting WinBUGS to run on the Mac was to be able to run WinBUGS through R using the R2WinBUGS R package.
Once I got DarWine up and running it was really only a matter of fixing some variables in the R2WinBUGS call of my model.
While written in Japanese, this guide had enough information to get me started down the path (no pun intended).
One of the main things that you have to do is to define WINEPATH and WINE in the call, which are of course buried a bit in OS X
bug.out <- bugs(…,
useWINE=TRUE,
WINEPATH = “/Applications/Darwine/Wine.bundle/Contents/bin/winepath”,
WINE = “/Applications/Darwine/Wine.bundle/Contents/bin/wine”…)
It works, but as of now it ain’t pretty. I also am not seeing faster model runs than when I use a virtual machine in Parallels, so I may need to test this further.
Mac OS X & R 02 Jul 2008 08:24 am
Running WinBUGS on Mac OS X
I’m working on a project using a state-space model that was built in WinBUGS, and obviously, I’d like to run it on anything but Windows XP.
The main problem is that it’s WinBUGS, not OSXinBUGS or LinBUGS, so natively I’m out of luck. All is not lost however, for some people much more clever than I have provided instructions to get WinBUGS up and running on OS X using DarWine, which is the Darwin version of Wine which is in rough terms a compatibility layer for running Windows programs on *nix.
The easiest way to get all of this up and running is to follow the excellent tutorial at http://idiom.ucsd.edu/~rlevy/winbugsonmacosx.pdf, which is how I got up and running

The next step is to get this layered version recognized by R so that I can call WinBUGS directly from R.
Technorati Tags:
R, Mac OS X, WinBUGS
Coding Tips 03 Jun 2008 12:06 pm
Examples for using Ferret
This is mostly for me, but I wanted to post some examples of using Ferret so that maybe others could get a handle on how to start using this program.
To load data in Ferret:
[Local data file] - yes? set DATA datafile
[Remote data file] - yes? set DATA ‘http://www.address.com/file.nc’
To see what data is loaded:
yes? SHOW DATA
To see what’s in the data:
yes? SHOW GRID <DATAVAR> ! Where datavar is like CHLA for Chlorophyll a
yes? SHOW GRID <DATAVAR[d=2]> ! if datavar is in the second dataset loaded (from SHOW DATA)
To display the first four time points:
yes? SHOW GRID/L=1:4 <DATAVAR[d=1]>
To make a shaded plot of the first timestep of the CHLA variable over a smaller range:
yes? SET REGION/X=180:220/Y=0:40/L=1
yes? SHADE CHLA
To make a map similar to GMT:
yes? SET REGION/X=180:220/Y=0:40/L=1
yes? go mp_mercator
yes? go mp_grid CHLA
yes? go mp_aspect
yes? SHADE/NOAXIS/PALETTE=rainbow_bylevels/lev=(0,0.5,0.05) CHLA
yes? go fland 5 black ! add land in black from etopo5
To print to a postscript file:
yes? SET MODE METAFILE
<make your plot>
yes? PPL CLSPLT
Then on the command line use the Fprint command to convert the metafile to postscript
Linux & Matlab 24 Apr 2008 10:03 am
Command line OpenDAP access in Matlab 2007b
In searching for a way to access OpenDAP data in Matlab I got to the Matlab Ocean Toolbox, which calls on the Loaddap backend. To get this running was pretty simple, although I went a bit backwards about it.
1. Compiling libdap-3.7.3 under CentOS 5.1
Basically there was an RPM for libdap that came with the Ocean Toolbox package, but I had already compiled the library from source. The only snag I hit was here during the make phase:
Vector.h:141: error: extra qualification ‘Vector::’ on member ‘value’
Vector.h:142: error: extra qualification ‘Vector::’ on member ‘value’
Vector.h:143: error: extra qualification ‘Vector::’ on member ‘value’
Vector.h:144: error: extra qualification ‘Vector::’ on member ‘value’
Vector.h:145: error: extra qualification ‘Vector::’ on member ‘value’
Vector.h:146: error: extra qualification ‘Vector::’ on member ‘value’
Vector.h:147: error: extra qualification ‘Vector::’ on member ‘value’
Vector.h:148: error: extra qualification ‘Vector::’ on member ‘value’
What I ended up doing was just to delete
Vector::
on all lines in the Vector.h file and it compiled. To get loaddap on there, I just did a simple
rpm -ivh –nodeps loaddap-3.6.0-1.i386.rpm
with the –nodeps switch to get rid of the matlab dependency issues. So far so good!





