Do it Yourself Climate Science with R and GHCN

As a scientist but yet-to-be-convinced about climate change, I was challenged to have a look at the climate data myself. After all, statistical modelling is what currently interests me and the open source software R does modelling particularly well.

It turns out that the main surface temperature dataset upon which all of the models are based is publically available and can be analysed with R

First you must get and install R – the program is available for all platforms from http://cran.r-project.org/

then you need to run r and install package RghcnV3 the package written to access and transform the GHCN data

the actual data can be downloaded from ftp://ftp.ncdc.noaa.gov/pub/data/ghcn/v3/

monthly average, min or max data can be downloaded. This needs to be unzipped to the .dat file

this file is then loaded into R

> require(RghcnV3)
> v<-readV3Data(filename='c:\\yourdirectory\\foo.dat') #assuming your data has been renamed foo.dat and you have a windows machine - syntax is slightly different for a linux or mac
> stationNames<-dimnames(v) # get the station names

Now you need the id of the station you want. This is available here http://data.giss.nasa.gov/gistemp/station_data_v2/v2.temperature.inv.txt

for example - I tried Omeo : 50194911000

so now which column is Omeo ?

> which(stationNames[[1]]==”50194911000″)

this gives column 5778

so let’s get the data for that :

> odata<-data.frame(v[5776,,])

this creates a data frame of the omeo data. you can have a look at it by typing odata

X1941 X1942 X1943 X1944 X1945 X1946 X1947 X1948 X1949 X1950 X1951 X1952
Jan NA NA 23.0 25.1 24.5 26.0 24.8 21.8 22.4 22.6 23.7 24.6
Feb NA 23.3 24.0 22.3 22.4 23.6 25.2 23.6 21.8 21.9 23.5 23.2
Mar NA 21.9 22.4 20.2 19.1 18.0 20.3 18.4 20.5 20.2 21.7 21.7
Apr NA 16.9 14.7 13.6 15.9 14.8 16.3 16.2 14.3 15.5 13.3 15.4
May NA 13.2 11.1 11.0 11.8 11.2 13.5 10.3 11.3 11.7 11.3 10.8
Jun NA 9.8 7.8 7.7 10.8 7.3 9.2 8.9 7.3 9.0 10.2 9.1
Jul NA 8.0 7.1 7.8 7.6 8.9 8.2 6.6 7.7 9.3 7.5 6.9
Aug NA 9.7 7.8 9.1 9.8 8.5 8.5 9.5 8.8 9.3 8.1 9.4
Sep NA 11.6 11.8 12.8 11.2 11.0 12.0 12.4 11.7 12.4 12.2 11.0
Oct NA 15.1 15.2 16.4 14.6 14.6 13.7 13.7 15.3 14.4 13.8 14.1
Nov NA 18.9 17.1 21.7 18.9 20.2 16.9 18.0 17.0 17.5 16.1 15.7
Dec NA 24.2 21.7 21.5 24.6 22.6 21.1 22.6 20.7 21.9 21.1 20.4

This is a small piece of the data - in Omeo's case available mostly from 1943 to 2011.

To get a rough and ready plot, you can stack it into a single vector

> stackedOmeo<-stack(odata)

and this can be plotted

> plot(stackedOmeo$values,type=’l')

now, there are many missing values from 1701 so let’s just plot the ones we have

> plot(stackedOmeo$values[2900:3700],type=’l')

and you can try and spot the warming.

R has many tools to fit a warming line : linear model, robust linear model, polynomials etc so have fun

North West Foot and Ankle Clinic

Daniel Monteleone has established the North West Foot and Ankle Clinic on our premises at Gladstone Park Shopping Centre.

Podiatry is a natural fit with ophthalmology as it is vitally important that the thousands of diabetic patients that we see have their feet checked regularly. Diabetic vascular disease and neuropathy can have disastrous consequences on the feet. Our patients can now access podiatry as well as nutrition services on site

Update on Sessional Consulting Room Space

We have now moved into our new premesis on the roof of Gladstone Park Shopping Centre. We have a beautiful light consulting room for rent by medical or other health practitioners. We can provide telephony with an dedicated direct number answered in the practitioner’s name, reception, HICAPS, transcription, records or other services as required. Rent free, fitout and marketing assistance is available. Please refer to our website for more information or call Olivija

Rudd’s $90b back-of-a-postage-stamp health blueprint

It has been interesting reading the analysis and opinion around the new Rudd health plan. In fact, I am confident that the commentators have put an order of magnitude more of thought into the plan than the authors of the plan itself. The idea is a flawed diversion destined for failure.

The first reason for this is that there has been as yet no extra money allocated to the health system. The entire premise is that an equal number of bureaucrats (presumably the existing ones relocated) will somehow magically extract new efficiencies from the same health dollar. Efficiencies that these same bureaucrats have been unable to think of or implement until now. It is analogous to Victoria hoping that all the public transport woes will be solved by covering over the name Connex with stickers that say Metro.

The second reason is the cowardice that keeps the States ultimately responsible for health delivery. The states have to underwrite the so called efficient price. The states have to negotiate enterprise agreements with the staff of the hospital. It seems to me this is a recipe for more rather than less blame shifting. The courageous alternative would be for all public hospitals to be run by the Commonwealth so that failures and blame would unambiguously rest there.

The third reason is that this plan lacks any new methods of rationing of healthcare. Although rationing is one of those words which must remain unspoken, the central problem of economics is after all the reconciliation of unlimited wants against limited supply. It is patently ridiculous that for many elective operations there exists no agreed threshold for surgery and no prioritization of those waiting according to the magnitude or impact of their symptoms. Admitting the fact that expectations will always exceed demand and having a rational debate about this point is essential to the design of a new system.

Caution is recommended in interpreting what is meant by independent – as in independently determined efficient price. This is from the same crowd that think that Fair Work Australia and the Henry review of taxation are exemplars of independence.

The cynical view is that this is a political exercise designed to fail and to leave the excuse that “we would have fixed all this if only the (State Bureaucrats / Liberal Party / Greedy Ophthalmologists / insert your villain of the day) had let us so you can’t blame us – we tried and its not our fault.

Sample Size Calculations for Ethics Committee Applications

One of the key things that ethics committees look at when determining whether to accept a research proposal is the statistical justification. The ethical issue is twofold:

1. Is an excessive number of subjects being requested while ;
2. Is the number of subjects sufficient to reasonably expect to get a significant result.

Doing the statistical analysis for the application can be daunting and confusing for a prospective researcher. The online resources are often not very good either. I have found the easiest program to use for these calculations to be sigmastat. It is, however, quite expensive even for an academic license.

Fortunately, there is a free alternative. The free software package
r

can do all the power calculations for you and is available for windows, mac and even linux. It is not the most user friendly package so this guide is designed to help you use the program for your ethics commitee application.

First decide what test you are going to use.

  1. For two groups with a continuous (assume normally distributed ) variable – for example the length of male vs female elephant tails – use a t test. You must also decide whether the test needs to consider whether the first group is larger or the same as the second (a one tailed test) or whether the first group could be larger, smaller or the same as the second (a two tailed test)
  2. For more than one group with a continuous (assume normally distributed) variable – for example the average height of asians, africans and north americans – use an ANOVA
  3. For the comparison of success or failure as a proportion of two or more groups – for example the proportion of women who conceived using a fertility treatment – use a test of proportions

Let’s consider these one by one

  1. T-test
    work out your parameters – in this example assume that the standard deviation of the elephant tails is 20% and we hypothesize that the male ones will on average be 40% greater.

    Fire up R and type ?power.t.test this will display the help screen

    In this case the command to get your analysis is

    power.t.test(delta=0.4,sd=0.2,sig.level=0.05,power=0.8)

    this gives you 5 subjects per group

    if you want the one tailed option (ie we are only testing whether male tails are longer male tails being significantly shorter is not a possibility)- you need
    power.t.test(delta=0.4,sd=0.2,sig.level=0.05,power=0.8,alternative=”one.sided”)

  2. ANOVA test

    This is the most tricky of the three. We first need to know our expected difference in the means and the standard deviations of the groups – is this example let’s say standard deviation again is 20% and we are looking for a difference in the means of 40%.

    The first step is to work out Cohen’s D value – this is the difference in the means divided by the standard deviation – in this case 2.0

    The next step is to work out Cohen’s effect size – this is given by

    so in our example we have f = 0.707

    now within r we need to load an additional module so start r and type

    require(pwr)

    then ?pwr.anova.test for the help screen

    and for our power calculation

    pwr.anova.test(k=6,f=0.707,sig.level=0.05,power=0.8)

    which gives us 6 subjects per group (you can’t use fractions of a subject)

  3. Test of Proportions
    this one is fairly easy – for this example let’s say that we guess that 80% of women on a fertility treatment will conceive and of those on the placebo 10% will conceive

    In R type

    ?power.prop.test

    and for our example

    power.prop.test(p1=0.8,p2=0.1,sig.level=0.05,power=0.8)

    which gives us 7 per group

  4. I hope this is useful and i might revise this with some more tricky power calculations later. Any experts are welcome to comment

Roxon you’re wrong

You know if a proposed measure is opposed by both the coalition and the greens that there is a problem with it. You can understand the proposed change to the cataract rebate better if you look at it as a test case to see how much the Government can get away with beating up the doctors and patients of private medicine. I am sure there will be some negotiated deal eventually – maybe cutting the rebate by 25% but this is a prelude to cutting all rebates for all procedures and consultations by a similar amount.

The justification for a 50% cut is nonsense as I have previously written. Faced with the fact, all the Minister can come up with is to insult the profession. Minister, I know you have a dystopian vision of a health care system without any doctors but really – you should be able to do better than this. Claims of ophthalmologists with incomes of $500 000 per year or more conveniently ignore the fact that this is gross practice income. The most efficient practices cost 50% or more to run. There is about 1 hour of non-medical time spent on every consultation and rents, insurance, capital, computers, utilities etc are all going up.

The intolerable situation of patients now being faced with thousands of dollars out of pocket or their operations being cancelled for months must not be allowed to stand. Bouquets to the Greens and Brickbats to our irrational Minister

bunk beds and portaloos for our hospitals

The ALP ‘solution’ to the problem of overcrowding of the Christmas Island immigration facility of installing bunk beds and portaloos in the recreation facilities is very cost effective. In fact, we might expect our health minister to broaden this strategy to our public hospitals and install bunk beds and portaloos in the hospital corridors as a remedy to the recently reported overcrowding of these facilities.