2013-05-14

Whither TwoTwentyTwo?

This post follows up on mumblings from various voices about how Malaysia should have more of the quantitative insight that was available to the public in the United States around its 2008 and 2012 general elections. In particular, Nate Silver's popular polling aggregation website FiveThirtyEight has been held up as an example of what would be nice to have in the Malaysian political environment.

Well, if you want that, you first need to understand,

what it is:


Educated guessing...


Or, in loose parlance, science. Or quantitative analysis involving statistics. Or probabilistic logic, what have you. It's just a bunch of operations on data. Which brings me to the next point.

... based on data...


They have loads and loads of data, for this sort of thing, over in the States. A lot of the data comes from polls. I'm not sure what the good data sources are in Malaysia, at this point in time. Maybe there will be more, next time.

... that is available to the public...


Obviously, any political party worth its salt has its own research unit for obtaining data that gives it a competitive advantage. Don't expect these data sets to be floating around for free.

In Malaysia, the only decent published polling data that seems to be professionally* produced comes from the Merdeka Centre for Opinion Research. (* I would have said "I use this term loosely," but really I don't even know what I'm talking about well enough to say that.)

For example, here's a report that the Merdeka Centre published shortly before the last general election in Malaysia, with a sample size of 1,600. Of course, the report is a summary, and what you really want is to beg, borrow, buy, or steal their raw, unabstracted data, so that you can run additional analyses on it.
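To get a feel for what a sample of that size buys you, here is a rough sketch of the textbook margin-of-error calculation. It assumes simple random sampling, a 95% confidence level (z = 1.96), and the worst-case proportion of 0.5; none of these assumptions come from the report itself, and the function name is made up for illustration.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate margin of error for a proportion p estimated from n respondents.

    Assumes simple random sampling; real polls use weighting and
    stratification, so treat this as a ballpark figure only.
    """
    return z * math.sqrt(p * (1 - p) / n)

moe = margin_of_error(1600)
print(f"{moe:.2%}")  # roughly two and a half percentage points
```

The catch is that this applies to the sample as a whole. Slice the respondents into smaller demographic groups and n shrinks, so the margin widens, which is one more reason to want the raw data rather than the summary.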

... that is in context...


Preferably, anyone processing data wants that data to be fresh. Doing periodic surveys could eventually yield further insight about the effectiveness of various political campaigns, on various demographics. Time series are lovely, if you know what to do with them.

Also, the "experimental design," or the choice of which respondent characteristics are worth recording, determines the metadata available for each sample, and therefore whether that sample can be intersected with other data sets, past, present, and future.
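As a toy illustration of why the recorded metadata matters, the sketch below groups survey responses by a couple of shared fields and intersects them with a second data set keyed the same way. Everything in it is made up: the field names (`region`, `age_group`, `approve`), the records, and the population counts are assumptions for the sake of the example, not real data.

```python
from collections import defaultdict

# Made-up survey responses; the field names are illustrative assumptions.
responses = [
    {"region": "A", "age_group": "21-30", "approve": True},
    {"region": "A", "age_group": "21-30", "approve": False},
    {"region": "A", "age_group": "31-40", "approve": True},
    {"region": "B", "age_group": "21-30", "approve": True},
]

# Made-up second data set (say, population counts) keyed by the same fields.
population = {
    ("A", "21-30"): 1000,
    ("A", "31-40"): 800,
    ("B", "31-40"): 500,
}

def approval_by_group(rows, keys):
    """Share of approving respondents for each combination of `keys`."""
    counts = defaultdict(lambda: [0, 0])  # group -> [approvals, total]
    for row in rows:
        group = tuple(row[k] for k in keys)
        counts[group][0] += row["approve"]  # True counts as 1, False as 0
        counts[group][1] += 1
    return {g: yes / total for g, (yes, total) in counts.items()}

def intersect(approval, other):
    """Keep only the groups that appear in both data sets."""
    return {g: (approval[g], other[g]) for g in approval if g in other}

approval = approval_by_group(responses, ("region", "age_group"))
joined = intersect(approval, population)
```

Only the groups described by the same fields in both data sets survive the join. Had the survey never recorded `age_group`, there would be nothing to match on.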

... that exists.


And the only thing stopping it from existing is... itself, open for discussion.
