Reflecting on crakar and snowman

(why am I thinking about cocaine now?)

So recently two very prolific climate contrarian commenters picked up their toys and went home. Skip did a nice piece on that surprise event.

crakar was one of my most prolific commenters, contributing about 100 comments per month since last December. He always struck me as a congenial fellow but he was definitely antagonistic to the science of global warming and contributed mainly misinformation and misunderstanding. Nevertheless, I am actually a bit sorry to see him go on a personal level even though his presence was on balance a negative contribution.

I think that crakar approached this issue somewhat like a game, where he had his chosen side and used whatever he could to defend it and whatever he could to attack the other side. The problem with this is that this is not a game, this is a globally unfolding event with serious and potentially tragic consequences for millions upon millions, if not all of humanity. It is that serious, which is why tolerance of his brand of wilful ignorance is not really an option despite his likely (and clearly thoughtless) sincerity.

I got flak from otherwise friendly followers of AFTIC for not banning him and perhaps I should have. I am still internally debating this question. As it happened, things worked out even better in the sense that his continuous regurgitation of fully debunked arguments has ceased and I did not have to use the nuclear option on him.

My feelings about Snowman are very different. In my opinion, it is very likely he was here on assignment. He openly acknowledged that his idea of the debate was political and not scientific even while he tried to argue about the science. He wrote very eloquently, with a stylish flourish that was matched only by its complete detachment from reality. His first couple of days here, he tried being his own sockpuppet but when called on it silently ceased and did not try again. While I found crakar to be totally wrong but generally sincere, Snowman I would describe as sociopathic in his disregard for intellectual honesty.

To me, crakar represents a demographic that should be talked to. His “type”, if I may, can possibly be reached and if not at least he can be effectively and publicly refuted. Time consuming, I know! Snowman on the other hand is just here to kick sand in everyone’s face all the while with a disingenuous smile. Him, I will not miss!

92 thoughts on “Reflecting on crakar and snowman”

  1. Kind of a broad question, Paul, and in what sense is it related to G’s point about the perverse subsidization of fossil fuel companies?


  2. Perverse subsidy is an economic term. Basically it’s the idea that we make doing something destructive profitable to some group; we collectively make it in their interest to do something that, while fine for them (at least in the short run), is bad for the rest of us.

    For example, American drug policy (aggressive arrest and interdiction) is arguably a form of perverse subsidy. We are paying money allegedly to stop drugs but in reality we make the drug trade profitable to those who would violate the law, murder, etc.

    Fiscal conservatives argue that reactive public health care through medicaid treatments is arguably a form of perverse subsidy. It in essence “pays” people to be poor, irresponsible, and unhealthy.

    In terms of American energy policy, the oil industry is one of the most spoiled. Besides a number of tax codes that favor expansion of drilling, etc (I am not keen on those details at the moment), it gains from the use of public expenditures to pay for roads and the vast US military which secures our access to mideast oil fields. As consumers we don’t see these costs because we don’t pay them at the pump (GFW’s point as I perceived it) but in essence we are perversely subsidizing the oil industry.

    It’s perverse because it’s in the industry’s interest to keep us gorging on “cheap” oil despite its associated military and logistical costs and its effects on atmospheric CO2 and the potential environmental calamities that could result.

    The solution is for the price of oil to represent its true “cost”, which is not currently the case.

    More clear?


  3. Oil was here before concrete roads. Trails have been improved for centuries, whether that is a subsidy for the mode of transport may be debatable. In the end cars and roads won out.
    The tax on gas, pays for roads. To the extent other payments contribute, I don’t know.

    Drugs, agreed.
    Medicare, medicaid, agreed.

    We’ll need to disagree on true cost.
    Military cost is another poor policy choice, and I’m sure we’ll disagree as to why.

    Basically sounds like we agree that subsidy can not exist without government butting in where it doesn’t belong.

    What kind of subsidy is ethanol? Wind Power? Solar thermal or electric?

    My comment to G has to do with how wealth in one area betters others in other areas.
    For example, our efficiency in the economy with the use of oil frees up time for people to improve medical knowledge, which helps all globally.

    If one does not view it that way, then one may point to a single part of a society (burning carbon for example) and call it bad, selfish, etc. without regard for the total impact.


  4. In the end cars and roads won out.

    Do you know how?

    The tax on gas, pays for roads.

    Not entirely.

    To the extent other payments contribute, I don’t know.

    Neither do I with precision but I do know that we subsidize the commuter culture.

    We’ll need to disagree on true cost.

    How do you know if by your own admission you don’t even know what they are? (And neither do I but that’s my point: let us consider the relative “costs” of current policy. Let’s explore that issue together. It’s a subject that every single denier I have *ever* encountered dodges. Always. Without exception.)

    Military cost is another poor policy choice,

    Please rephrase. This could mean anything.

    and I’m sure we’ll disagree as to why.

    Lacking your certainty I need clarification.

    Basically sounds like we agree that subsidy can not exist without government butting in where it doesn’t belong.

    What kind of subsidy is ethanol? Wind Power? Solar thermal or electric?

    What are you implying with these questions? That you do not know? And if so, why, if at all, do you oppose them?

    our efficiency in the economy with the use of oil . . .

    *Exactly* the thing in dispute, Paul. It only looks “efficient” because it is *subsidized*. My key point.

    . . . frees up time for people to improve medical knowledge, which helps all globally.

    I am tempted to swear at this statement. There is nothing efficient about it. Our patterns in oil use are an *indulgence*. We can advance medically, scientifically, socially and however else *without* driving cars that average less than 20 mpg.

    If one does not view it that way, then one may point to a single part of a society (burning carbon for example) and call it bad, selfish, etc. without regard for the total impact.

    I fear I have tried and simply cannot decipher this. Forgive my denseness, Paul.


  5. Skip,

    Are you the one that posted on Amazon inviting discussion and debate?

    I’d like to have some discussion with you (or anyone knowledgeable about climate data) on using the GISS data. Where should I go to do that? (I’m not real in tune yet with the workings of these blog sites.)


  6. Hey Murf:

    Yeah hey, hozit going? Welcome to Coby’s humble but quite enlightening little blog. I have only a vague familiarity with the GISS “controversy” but it has been extensively discussed in this blog in other links.

    I’ll do some fishing around. I have not committed these discussions to memory but among the better authorities on this are frequent posters Dhogaza, Dappled Water, and GFW if memory serves.

    Anyway I hope this is fun for you; if Dhogaza bites, don’t worry; he doesn’t have rabies.



  7. Skip
    I’m insulted you didn’t include me among the luminaries in your list – oh well, have to study harder then.
    By GISS, I assume you mean the Goddard Institute for Space Studies. And I also assume you want to talk about the ‘controversy’ regarding the corrections to their datasets?
    If so – go for it. But I’m not sure why you would bother. The issue was put to bed years ago.


  8. Thanks for the responses. I hope you’ll bear with me–I’m pretty new to this whole climate issue and these blogs. I can see this is old hat to you guys.

    Actually, Mandas, I’m not interested at all in the corrections to the GISS data at the moment.

    Rather, I’m trying to do a little analysis for myself to see what the GISS data actually show, ignoring any and all issues of data quality–I just want to see what the basic data, as given and without any further adjustments, show in terms of a single global annual series.

    Here’s where I am: I’m presently using, if I understand it correctly, the global set of GISS temperature data [raw GHCN + USHCN corrections]. Looking at the dataset as a matrix of year-months x country-stations, how should one go about getting the data into a single global average annual series, given that there’s so many missing values?

    So far, I’ve considered two ways one might produce a single global series:
    (1) average all the available data over all stations by year-month, disregarding any missing values, then average the monthly series by year to get average annual;
    (2) average each country by year, omitting any years for a country where there are one or more months missing in the station’s data, then average over all the countries by year.

    What do you think of either of these methods? What are other (better?) ways to do it? Or, is there some reason it doesn’t make sense to do this at all?
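
    As a rough illustration of how the two approaches above can diverge, here is a minimal sketch. The array layout (years x months x stations), the function names and the toy numbers are all assumptions made for the example, not the GISS format, and stations stand in for the countries of method (2).

```python
import numpy as np

def method_1(temps):
    """temps has shape (years, 12, stations); NaN marks a missing value.
    Average over available stations for each year-month, then over months."""
    monthly = np.nanmean(temps, axis=2)   # shape (years, 12)
    return monthly.mean(axis=1)           # shape (years,)

def method_2(temps):
    """Per-station annual means using only complete years,
    then average over the stations that have a complete year."""
    annual = temps.mean(axis=1)           # NaN wherever any month is missing
    return np.nanmean(annual, axis=1)     # shape (years,)

# Hypothetical toy data: one year, two stations.
# Station 0 is a steady 10 degrees; station 1 is 0 in Jan-Jun and 20 in Jul-Dec.
temps = np.empty((1, 12, 2))
temps[0, :, 0] = 10.0
temps[0, :6, 1] = 0.0
temps[0, 6:, 1] = 20.0
complete = method_1(temps)[0]             # 10.0 with nothing missing

temps[0, :6, 1] = np.nan                  # lose station 1's cold months
m1 = method_1(temps)[0]                   # 12.5: the warm months are over-represented
m2 = method_2(temps)[0]                   # 10.0: station 1 is dropped for the year
```

    In this toy case method (2) happens to land on the complete-data answer only because the two stations share the same annual mean; in general dropping a station shifts the result too.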



  9. Murf,
    I can see what you are suggesting/attempting, but I would advise caution in drawing any conclusions based on – what appears to me at least – to be a fairly cursory re-analysis of the data. And I apologise if I am underestimating your capabilities.
    Firstly though, a detailed response to your question is a little beyond my field of expertise. I could offer you some considerations based on statistics and how to account for gaps and weightings in data etc, but that may not be sufficient for your use.
    However, specific to your request, I can say that you cannot simply average all the data points to determine a global average temperature, nor can you simply omit data points and average the remainder. I would also suggest that you cannot ignore the issue of data quality nor adjustments. Not all points should receive equal weighting, as they may cover smaller or larger areas of interest, or they may be inherently more ‘reliable’ or accurate (and should therefore include smaller error margins). You would also need to know the position etc of all the data points and their positions relative to each other etc, plus the temporal issues for both the available and missing data.

    Here is a link you may follow to get some more information (but it appears you already have if you are discussing grids and matrices). If not, it may help. There are also some references on the site to papers outlining calculation methods which may assist you:

    Sorry I can’t be more help.


  10. First let me confess rank unfamiliarity with how these data are averaged and weighted to calculate changes in mean temperatures.

    However, out of curiosity, Murf: What is your training and source of your interest here?



  11. Mandas wrote: “I could offer you some considerations based on statistics and how to account for gaps and weightings in data etc…I can say that you cannot simply average all the data points to determine a global average temperature, nor can you simply omit data points and average the remainder.”

    Mandas, I’d like to hear your considerations (in first part) and the reasons (in second part).

    Mandas wrote: “I would also suggest that you cannot ignore the issue of data quality nor adjustments.”

    Mandas, No doubt you’re ultimately right–it’s more complicated than my simplistic approach allows, but I want to start simple and just see what I get at that level. If nothing else, it should help me understand the data issues better. From what I’ve seen in these climate blogs, it looks like just about everything is ultimately and endlessly arguable, especially with regard to data quality and adjustments; I don’t want to deal with all that at the moment.

    Skip, I’m just a curious old geezer with a bit of available time who likes to ‘play’ with data.


  12. Murf
    Before I would begin to offer you any advice on how to apply statistics, I would need to know exactly how much you already know.
    It would be pointless me attempting to explain complicated statistical methods if you only have a high school education – you just wouldn’t understand the maths (I’m not a statistician either and no expert in mathematics – just someone who did statistics as part of my science degrees). It would also be pointless – and I suspect rather patronising – if I explained basic statistical concepts if you are a mathematician or similar.
    I know this because I have attempted to explain some very basic statistical concepts (such as the difference between causality and correlation) elsewhere on this blog to a well-known frequent poster – and it felt a little like beating my head against a brick wall.
    I suspect from your request for me to give reasons why you cannot simply average all the available data, that you have only a very basic statistical understanding. Let me give you a VERY simple lay example. If there are five measuring stations in a relatively small area (such as city), but only one measuring station in a relatively large area (such as a rural area), then you cannot simply add up all the numbers, divide by 6, and get the mean temperature for the region. Any calculation applied this way would be heavily biased towards the city readings. You would have to apply a weighting to increase the value of the rural station, or decrease the value of the city stations, because of their effective areas of coverage.
    Similarly, if you are missing some data points (it may be because a reading wasn’t recorded for some reason), you cannot simply add up all the rest and divide by the number of readings to get the mean – the calculated mean would only be valid if the missing data was exactly equal to the mean, and that would be highly unlikely. Consequently, you would need some mechanism to extrapolate for the missing data, or you would have to apply appropriate error bands to demonstrate your confidence (or lack of it) in the results.
    But I have gone on long enough. If you really want to understand some of these concepts better, this is not the place for me to teach my limited knowledge of statistics – you should read a text book. But even that would not be adequate to understand the concepts involved in the calculations used in this science, which are very complex indeed.
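
    The five-city-stations-and-one-rural-station example can be put into numbers with a short sketch. The temperatures and the 10%/90% area split are invented for illustration, and crediting each city station with an equal share of the city's area is just one simple assumption about how weights might be assigned.

```python
import numpy as np

# Hypothetical readings: five stations in a city, one in a large rural area.
city = np.array([16.0, 16.0, 16.0, 16.0, 16.0])
rural = np.array([10.0])
temps = np.concatenate([city, rural])

# Assumed areas: the city covers 10% of the region, the rural area 90%.
# Each city station is credited with an equal share of the city's area.
weights = np.array([0.02] * 5 + [0.90])

simple_mean = temps.mean()                           # add up and divide by 6
weighted_mean = np.average(temps, weights=weights)   # area-weighted mean
```

    With these made-up numbers the simple mean is 15.0 while the area-weighted mean is 10.6, so the unweighted figure is pulled strongly towards the city readings.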


  13. I know this because I have attempted to explain some very basic statistical concepts (such as the difference between causality and correlation) elsewhere on this blog to a well-known frequent poster – and it felt a little like beating my head against a brick wall.

    If you’re referring to our lengthy discussion regarding the usefulness of tree chronologies as temperature proxies, please give your head a rest, because I am *well* aware of the difference between causality and correlation.

    Our differing view of the usefulness of the work by Briffa and others lies elsewhere.


  14. Not referring to you dhogaza – I am referring to the namesake of this thread.
    And yes – we can differ on the issues about tree rings etc, but I think we are in broad agreement on most other issues here.


  15. Mandas, I think my question is, actually, why do you think there’s likely to be bias in this set of over 6,000 stations? In other words, what leads you to believe the missing values are not randomly distributed around the true population means?

    Assuming decent reason to suspect non-randomness, how would you ‘extrapolate’ for the missing data or develop appropriate error bands?

    Finally, where do I go to find out precisely how the well-known publishers of these temperature graphs handle the missing data problem? I looked on the GISS site, but I didn’t uncover it, if it’s there.

    There’s a reasonable chance I can pick up on your data analysis expertise, so give me a shot.


  16. I just located an interesting-sounding article:

    Filling missing temperature values in weather data banks
    By Kotsiantis, S.; Kostoulas, A.; Lykoudis, S.; Argiriou, A.; Menagias, K.
    Intelligent Environments, 2006. IE 06. 2nd IET International Conference on
    Page(s): 327 – 334
    5-6 July 2006
    Volume: 1 Issue:

    I’m a little tied up at the moment, but I hopefully can read through this in the next couple of days.


  17. Murf,
    Seems you are answering your own questions – but to respond briefly to your post to me:
    Without analysing the data, there is no way of knowing whether the missing data is random or otherwise. In general terms, the more data that was missing from more locations over a great period, the more likely it is to be random.
    If some sort of clustering were observed (spatially or temporally), then the more likely it is to be non-random.
    Extrapolation is a difficult concept and I can only speak from datasets I use myself in my own work. An example may be (and this is a VERY simple example only) if there was an observed correlation between two data sets (eg at adjacent measuring stations), you MAY be able to use the data from one to extrapolate missing data from the other. Another (more robust) method is called ‘regression analysis’ to predict missing data based on known information. Here is a link to an explanation of how it works:
    Hope this helps.
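
    A minimal sketch of the adjacent-station idea, using an ordinary least-squares line fitted on the months where both stations reported. The station values are invented, and this is only one simple way such a fill-in might be done, not a description of how any published dataset handles gaps.

```python
import numpy as np

# Hypothetical monthly means at two adjacent stations; NaN marks a gap at B.
station_a = np.array([2.0, 4.0, 8.0, 12.0, 16.0, 20.0])
station_b = np.array([1.0, 3.0, np.nan, 11.0, 15.0, np.nan])

# Fit B as a linear function of A using only months where B was observed.
observed = ~np.isnan(station_b)
slope, intercept = np.polyfit(station_a[observed], station_b[observed], deg=1)

# Predict the missing B values from the corresponding A values.
filled_b = station_b.copy()
filled_b[~observed] = slope * station_a[~observed] + intercept
```

    The filled values are only as good as the relationship between the two stations, so in practice one would also want to report the residual scatter of the fit as an error band.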


  18. Does anyone have any specific reasons, either a priori or from analyzing the data — specific, that is, to the GISS temperature yr-mo x station dataset, not just general analytical considerations — why missing values are likely to be non-random with respect to the global annual means? (Actually, relative-to-means may be an overly stringent specification since it’s basically the years relative to one another, i.e., the trend rather than the actual mean values, that’s of primary interest to me, but I’ll ignore that.)


  19. why missing values are likely to be non-random with respect to the global annual means.

    To be perfectly honest, I didn’t realize they were (I have only followed this casually). Murf, I don’t suppose I could persuade you to explain again what you’re driving at.

    Is it that there is more/less missing data for “warmer” years, and thus this might call into question any observed “trends”? Help catch up the lowly social scientist here.



  20. Skip, Thanks for the response.

    What I’m driving at is that whether or not it’s necessary to do anything to the data to deal with missing values depends on whether they are randomly distributed about the global means (this is perhaps an unnecessarily strong condition, I think, as previously noted, but I’ll go with it).

    So what I’m looking for is any input where someone has reason to think they are biased (nonrandom), either a priori suspicion specific to the dataset or conclusions from actually examining the data.

    I’m at a little of a loss to know how to explain it any more. I’m also a little confused by your question, as I’m not saying the values are biased, I’m just asking if anybody has good (specific) reason to think they are, such that you would need to attempt the kind of things (roughly speaking) to the data that Mandas alludes to.

    If I’m missing what you’re saying or asking, try me again.

    I’m also a little surprised that nobody seems curious as to what the data show without further adjustment, even if there are missing value bias problems or other problems. I think I’d nearly always want to look at that, just as a matter of course, at least unless I had decent reason to think bias really does exist in the dataset under examination. It’s just one step in data analysis.

    Maybe that article I mentioned but have not yet read will shed some light.

    Maybe, also, I’m in the wrong forum here. If there’s better places that you can recommend, where people might have actually worked with the data, I’d be happy to go there.


  21. I’m at a little of a loss to know how to explain it any more.

    No I think I get it now. But yeah, you’re right I really don’t know.

    As a layman I have to assume that the obvious possibility of confounding effects of data bias were accounted for in the temperature range calculations. Exactly how this was done I could not say.

    My point was simply that I seriously doubt there is a blind spot here that climate scientists measuring trends in mean temps have bungled and missed.

    Wish I could be more helpful. However, if you *do* dig up something that shows, for example, that the missing/distorted data are *not* randomly distributed around the means, that this has *not* been taken into account by climate researchers, and that this has implications one way or another for our understanding of AGW, then that is the kind of finding/contention that is quite pertinent to this forum.


  22. Skip, Just to try to make sure it’s clear: I’m not even remotely suggesting the preparers of the data for the published graphs have not properly accounted for any such bias. I have no reason to suggest anything one way or the other–I’m not in that battle.

    I’m just trying to find out how they do it (and possibly why they think it’s necessary) to see what I can make of the data doing my own analysis with it.


  23. Hmm.

    Well I wish I knew where to begin. My guess is Goddard, among others, explains how they make these adjustments but again I confess I don’t really know.

    I don’t suppose if/when you dig something up you’d post the relevant links here. We can all afford to learn more.


  24. Skip, I’ll post if I find anything that looks like it might be worthwhile. I’m always encouraged when someone is interested in learning more–learning stuff is the fun of analysis for me.


  25. Murf
    With regard to your on-going query regarding the randomness or otherwise of the missing data, I’m not sure if I made my earlier point clear.
    There is no way of determining whether the missing data is random or otherwise without examining it. That way you can determine if it is random (ie no observed clustering), or if it is biased (ie an observed clustering about a specific point or set of points).
    Because I haven’t seen the data, I can’t say either way how it should be interpolated, and I can’t say if I have any reasons either way if it is biased.
    Alternatively, if you want an expert opinion, you could ask the experts at the NOAA National Climatic Data Center. Here is their email address:


  26. Mandas, after giving it a little thought, it seems to me fairly easy in principle to get a pretty good idea from the data of whether there’s serious missing value bias in the (‘unadjusted’) global annual means or not. Either there’s location bias (a priori, very possible I’d think) or month bias (not very likely I’d think).

    Location bias appears a bit more challenging to handle, but I have an idea or two for an approach that might be sufficiently satisfactory for my purposes (I’m not publishing).

    I’m going to check it out in the next day or two, hopefully. I’ll let you know the results.
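
    One simple way such a check might look is to tabulate the fraction of missing values by calendar month and by latitude band. The sketch below assumes a (years x months x stations) array with NaN for missing readings; the function names, band edges and toy data are hypothetical.

```python
import numpy as np

def missing_fraction_by_month(temps):
    """temps: (years, 12, stations), NaN = missing. Returns 12 fractions."""
    return np.isnan(temps).mean(axis=(0, 2))

def missing_fraction_by_band(temps, lats, edges):
    """Average fraction missing among the stations in each latitude band."""
    band = np.digitize(lats, edges)                  # band index per station
    per_station = np.isnan(temps).mean(axis=(0, 1))  # fraction missing per station
    return {int(b): float(per_station[band == b].mean()) for b in np.unique(band)}

# Hypothetical toy data: 2 years, 4 stations, one high-latitude station
# missing its entire first year.
temps = np.full((2, 12, 4), 10.0)
temps[0, :, 3] = np.nan
lats = np.array([-60.0, -10.0, 20.0, 70.0])
edges = [-30.0, 30.0]          # bands: south of 30S, 30S to 30N, north of 30N

by_month = missing_fraction_by_month(temps)
by_band = missing_fraction_by_band(temps, lats, edges)
```

    Roughly flat fractions across months alongside very uneven fractions across bands, as in this toy case, would point towards location bias rather than month bias.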


  27. an approach that might be sufficiently satisfactory for my purposes (I’m not publishing).

    Ah, the luxuries of retirement, eh? I saw that and had to let loose with a good-natured laugh. Let us know what you find, Murf.

    Thanks for your contribution,


  28. I (and a cohort) have spent several days now looking at and working with the GISS data (makes us experts, of course).

    I was a bit surprised when I first realized how extremely lumpy the global geographic distribution of weather stations actually is.

    We attempted to grid the globe and come up with station samples that would be reasonable approximations to geographic homogeneity. We’ve not been able to get anything that appears satisfactory enough to have confidence that we’re ‘global’ and that lat-lon drift biases over time are accounted for sufficiently.

    Also, I have serious doubts about anyone being able to do anything legit with missing values for this data–maybe, but I’d have to see the proof and the procedural details.

    I think I’ll try emailing the NOAA folks. It’s looking like I’m going to have to be skeptical about ANYONE’S derivation of global mean temperatures from this data until someone can detail to me the steps they’ve taken to derive statistically acceptable global means.

    It does appear one can get reasonable means for the U.S., so maybe I’ll take a look at that.

    I may yet change my mind on the global means as I play with the data more or get more info. But, with my current experience with the data, that’s where I am now.


  29. Interesting, Murf.

    Thanks for your investigative approach and keeping us updated.

    Who knows, maybe you’ll bring down the whole AGW paradigm.

    I for one would be relieved.


  30. Aha. I have found some materials detailing the interpolation techniques used for this type of data (or so it appears). It will take me a bit of time to gain some familiarity with them and, assuming they are demonstrated to be sound, see if I can apply them myself.


  31. I will have to say, after spending a bit of time in this, trying to calculate an unbiased global annual temp means time-series with a reasonable degree of confidence from GISS weather stations monthly data is different from anything I’ve encountered before.

    Keeping in mind I’m an absolute beginner at dealing with climate data, I’ve been thinking about trying to understand what’s involved by considering what would constitute an ideal data set for the purpose of deriving an unbiased global mean time-series from station readings at some sufficiently small grid interval around the globe (how ‘sufficiently small’ would be established, I’m not sure).

    Here’s what seems to me would be the ideal specs for such a dataset (which constitutes a sample, of course, of a larger ‘population’ of readings):

    (1) the data’s geographic distribution, mapped to a grid
    (a) the grid must be tight enough to allow a high degree of confidence in the sample global time-series means (I don’t know how such a thing (i.e., grid size) can be determined–can anyone help me here?)
    (b) must be weighted to account for the disparities in land areas among evenly-spaced lat-lon grids (this, I think, can be readily done for any grid size)

    (2) the data values
    (a) there would be at least one reading with no missing month values for all years under consideration for each grid tract
    (b) the values would be fixed or fully random across the station grid-year matrix with respect to certain other potential temperature-affecting factors, such as population concentration or altitude above sea level.

    Does this make sense? Are there other significant factors to consider?
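
    For point (1)(b), one common approximation is to weight each cell of an evenly spaced lat-lon grid by the cosine of its central latitude, since cell area on a sphere shrinks roughly in that proportion towards the poles. The sketch below is a minimal illustration under that assumption; the function name and the toy grid are made up, and empty cells are simply left out, which quietly assumes they resemble the rest of the globe.

```python
import numpy as np

def global_mean(grid, lat_centers):
    """grid: (n_lat, n_lon) cell means, NaN = empty cell.
    Weights each cell by cos(latitude), roughly proportional to its area."""
    w = np.cos(np.deg2rad(lat_centers))[:, None] * np.ones_like(grid)
    w = np.where(np.isnan(grid), 0.0, w)      # empty cells get zero weight
    return np.nansum(grid * w) / w.sum()

# Hypothetical 3 x 2 grid: cold at 60S and 60N, warm at the equator.
lat_centers = np.array([-60.0, 0.0, 60.0])
grid = np.array([[0.0, 0.0],
                 [30.0, 30.0],
                 [0.0, 0.0]])

unweighted = grid.mean()                      # treats every cell as equal in area
weighted = global_mean(grid, lat_centers)     # equatorial cells count double
```

    Here the unweighted mean is 10.0 and the cosine-weighted mean is 15.0, because each equatorial cell covers about twice the area of a cell at 60 degrees.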

    The actual data, as I’ve learned, falls far short of this ideal, which is where, I suppose, the story gets interesting. The big deal appears to be the large number of missing values with respect to a mapping to a grid that would be tightly enough defined (assuming that can be determined) to confidently yield sufficiently accurate unbiased global means.

    I’m finding lots of papers related to the missing values problem, but I’ve not run across one that empirically verifies, for this type of data and objective, any of the methods offered. A principal components type method seems to be the technique of choice for climate researchers, judging by what I’ve seen so far.

    I’m no doubt old-school, but I don’t think I or any of the people I’ve worked with over the years were ever very comfortable with data imputation, at least in any situation remotely resembling this one. But I would certainly yield if someone can point me to any empirical verifications.
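
    One way an empirical check of this kind might be set up is a hold-out test: hide values that are actually known, impute them, and compare against the truth. The sketch below uses a very simple iterated truncated-SVD fill as a stand-in for the EOF/PCA-style methods; it is an assumed illustration, not the procedure from any of the papers, and the exactly rank-1 toy matrix is a best case that real station data would not match.

```python
import numpy as np

def svd_impute(x, rank=1, n_iter=100):
    """Fill NaNs by iterating a truncated SVD (a simple EOF/PCA-style scheme)."""
    missing = np.isnan(x)
    filled = np.where(missing, np.nanmean(x, axis=0), x)   # start from column means
    for _ in range(n_iter):
        u, s, vt = np.linalg.svd(filled, full_matrices=False)
        approx = (u[:, :rank] * s[:rank]) @ vt[:rank]      # low-rank reconstruction
        filled[missing] = approx[missing]                  # only overwrite the gaps
    return filled

# Hypothetical "truth": 4 time steps x 3 stations with a single shared pattern.
truth = np.outer([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0])

# Hold out one known value, impute it, and measure the error.
held_out = truth.copy()
held_out[1, 2] = np.nan
estimate = svd_impute(held_out)
error = abs(estimate[1, 2] - truth[1, 2])
```

    Repeating this with many different hold-out patterns, including clustered ones, would give a rough empirical picture of how far the imputed values can be trusted.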


  32. I must say this is a really impressive hobby you’ve picked up on Murf. If I were you I’d be watching the playoffs (like I am right now as the Cowboys go down in flames–haha.)

    Although I cannot keep up with your technical explanation, and I fully appreciate your “old school” skepticism of data imputation and other “tricks” (oh dear did I use that word; I must be an academic fraud) for accounting for it, a general question I might ask is:

    How badly does the missing data problem compromise the robustness of the generally accepted conclusions regarding warming? On a much smaller scale I have the same problem in a data set I used for a recent article submission (nothing nearly as advanced as what goes into estimating temperature changes) but my argument to the anonymous reviewers is essentially, “Yeah my data are incomplete, even lousy on some factors, but all the data we *do* have point toward the same general conclusion, I have in several of my estimations biased them *against* those conclusions, and the apparent selection bias for missing data would be that the more we knew the more it almost certainly would support what I’ve found.”

    My perception is that the evidence of warming is pretty sound whatever the limits of temp data. But keep having fun and reporting back. Go Vikings.


  33. Skip,

    Point well taken.

    I’m still hoping to locate a paper or two with some empirical testing of the EOF (PCA) imputation technique as applied to this type of spatial-temporal data.

    I have run out of free time to explore this much further for now–interesting as it’s been–as I’ve picked up some economic data contract work that I have to get busy on. I’ll check back off and on, though, to see if there’s anything new.

    Thanks for your responses.


