Innings-level penalty runs – Data version updated to 0.8

47 Test matches were played in 2016 but, as dedicated observers of Cricsheet will have noted, I have until now only provided data for 46 of them. The 3rd Test of the 2016 New Zealand tour of India was the sole omission. As of today that has been rectified, and we now provide full Test data for 2016.

In that 3rd Test, while batting, Ravindra Jadeja persistently ran on the pitch, resulting in 5 penalty runs being awarded to New Zealand before the start of their 1st innings. This method of applying penalty runs was not one that my existing data format could support, as it assumed penalty runs would always be applied on a particular delivery. I’ve therefore had to make a small update to the format, which has moved the data format version from 0.7 to 0.8.

The only change between versions is the addition of an optional penalty_runs field within each innings. If penalty runs were added to an innings, either before or after the innings, then this field will be provided (with pre or post used as appropriate).

Previous data files will be exactly as they were, save for the change of version number, while the newly-added data file for the aforementioned Test match will actually use the new field. If you’ve written code that uses the data I provide you may want to tweak it to take account of the new field.
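
For anyone adjusting their code, a minimal Python sketch of reading the new field might look like the following. It assumes an innings has already been parsed into a dictionary and that penalty_runs is a mapping keyed by pre and/or post. The exact layout, the helper name, and the sample innings are illustrative assumptions rather than a description of the actual files.

```python
def innings_penalty_runs(innings):
    """Return (pre, post) penalty runs for one parsed innings.

    Assumes `penalty_runs`, when present, is a mapping with optional
    `pre` and `post` keys. Innings without the field yield (0, 0).
    """
    penalty = innings.get("penalty_runs") or {}
    return penalty.get("pre", 0), penalty.get("post", 0)


# Hypothetical parsed innings, for illustration only.
with_penalty = {"team": "New Zealand", "penalty_runs": {"pre": 5}}
without_penalty = {"team": "India"}

pre_runs, post_runs = innings_penalty_runs(with_penalty)
no_penalty = innings_penalty_runs(without_penalty)
```

Because the field is optional, defaulting to zero means files from before version 0.8 pass through the same code path unchanged.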

A long overdue addition – Women’s data

I’ve been making data available on this site since 2009 and have gradually increased the number of files I provide to the point that, as I write, 2,780 matches are available. Over time I’ve expanded from just matches involving Full Members, to the Indian Premier League, non-ODI international one-day matches, and international T20s. This gradual expansion means that I’m now providing over 380 matches involving only the Associates and Affiliates, so I’m no longer just covering the Full Members. This has been an improvement, but one issue remains: I’ve only been providing data for Men’s cricket.

I’ve wanted to add data for Women’s cricket for a while. I started the project with the idea of providing cricket data, but I didn’t really think of anything beyond Men’s cricket. Raf Nicholson expressed very well the trap I let myself fall into.

At its heart, it comes down to this: The first “C” in ICC has always, since its formation in 1909, stood for “cricket”, though what it should really have been called, up until it took control of women’s cricket in 2005, was the IMCC – the International Men’s Cricket Council. When a male journalist says, “I am a cricket correspondent”, he means “I am a men’s cricket correspondent.” When a blog refers to itself as an “England cricket blog”, what this generally means is “an England men’s cricket blog”. And when ordinary cricket fans say “cricket”, almost without exception what they really mean is “men’s cricket”. In short, men’s cricket is the default setting.

I very much fell into the trap of viewing Men’s cricket data as cricket data, and not considering Women’s cricket at all. This is unfair, and something I’ve been planning to fix. As Allison McCann has noted, Men’s sports have awesome data, while Women’s sport is poorly served.

And just because the data doesn’t exist doesn’t mean we can’t compile it ourselves or make estimates based on what is available. I just think that in addition to praising the virtues of men’s sports data, we need to acknowledge that good women’s sports data is severely lacking.

I’m happy to announce that, as of today, Cricsheet will finally be providing data for Women’s cricket. The initial release consists of 257 matches, comprising 148 T20Is, 69 ODIs, 37 International T20s, and 3 Test Matches, and includes matches from as far back as 2009.

The addition of Women’s data has a practical implication for the data we already provide. The Data Format has just been changed to update the version from 0.6 to 0.7, to allow for the addition of gender as a new field in the info section. Right now this field contains either female or male, but I reserve the right to have other values in the future.
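
As a rough illustration of using the new field, the Python sketch below groups already-parsed matches by gender. It assumes each match is a dictionary with an info section containing gender; the function name and sample data are hypothetical. Since other values may appear in future, it counts whatever value it finds rather than assuming only female or male.

```python
from collections import Counter


def count_by_gender(matches):
    """Count parsed matches by the `gender` value in their `info` section.

    Files without the field (for example, ones parsed before the 0.7
    format) are counted under "unknown".
    """
    counts = Counter()
    for match in matches:
        gender = match.get("info", {}).get("gender", "unknown")
        counts[gender] += 1
    return counts


# Hypothetical parsed matches, for illustration only.
sample_matches = [
    {"info": {"gender": "female"}},
    {"info": {"gender": "male"}},
    {"info": {"gender": "female"}},
    {"info": {}},
]

gender_counts = count_by_gender(sample_matches)
```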

The Downloads page on the site has also been updated to allow users to download Women’s or Men’s matches in all of the variations we previously provided, as well as continuing to download all matches for all genders.

As Raf Nicholson wrote, men’s cricket is the default setting, and I’ve been guilty of having that mindset. Today is a small step on the path to changing that, and to no longer viewing men’s cricket as the default.

Until that changes, we have a problem. Until that changes, I’m going to keep telling the world that I am a feminist. Cricket needs feminism. End of story.

Version updated to 0.6

The data version included in every data file I provide, and explained on the format page of the site, has just been changed from 0.5 to 0.6. This actually reflects a relatively minor change, and is the first time I’ve bumped the version number since February 2013.

In the 1st Test of 2014 between Pakistan and Australia, Sarfraz Ahmed was dismissed and play stopped for tea. After the break Zulfiqar Babar, who had been batting with Ahmed, didn’t come back out and retired hurt. This meant that in the data I needed to record 2 dismissals related to a single delivery. A complication had arisen.

As I’d never even considered multiple wickets on a single delivery as a possibility, and since it had never occurred in the previous 31,271 wickets I provide data for, I’ve had to tweak the data format, along with numerous scripts, to allow for it. The change I’ve implemented allows the wicket entry on a delivery to contain a list of wickets, rather than always assuming just one. Balls where only a single wicket fell (all 31,271 of them thus far) are unchanged; this tweak simply allows for the possibility of something different.

If you’ve written code that uses the data I provide, you should make a small tweak to check for the existence of multiple wickets on a delivery. If you don’t, you’ll probably be fine, apart from when you try to process the single Test match where this issue occurs.
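
One way to make that tweak, sketched in Python, is to normalise the wicket entry so the rest of your code always sees a list. This assumes a delivery has been parsed into a dictionary with an optional wicket key; the player_out and kind keys in the sample data are assumptions made for illustration.

```python
def wickets_on_delivery(delivery):
    """Return the wickets on a parsed delivery as a list.

    Handles three cases: no wicket (empty list), a single wicket entry
    (wrapped in a list), and a list of wickets (returned as-is).
    """
    wicket = delivery.get("wicket")
    if wicket is None:
        return []
    if isinstance(wicket, list):
        return wicket
    return [wicket]


# Hypothetical deliveries, for illustration only.
no_wicket = {"batsman": "Batter A"}
single = {"wicket": {"player_out": "Batter A"}}
double = {
    "wicket": [
        {"player_out": "Sarfraz Ahmed"},
        {"player_out": "Zulfiqar Babar", "kind": "retired hurt"},
    ]
}

players_out = [w["player_out"] for w in wickets_on_delivery(double)]
```

Code written this way treats the common single-wicket case and the rare multiple-wicket case identically.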

There will be substantial changes to the data format over the next few months, which will add new information for many of the matches currently covered. These may require tweaks to some of your code, but I will be providing parallel versions of the data files for a period of time, allowing users to continue to use the older version while updating their code. More details on these changes soon.

A new data version

I’ve been adding matches to the site on a fairly regular basis for the last few years, despite the lack of new articles on the site. Now, however, that period of silence is finally over, as there is a new data version to announce. Today I’ve moved all of the data files to version 0.5 and made a few other small changes to the site. First of all we’ll deal with the data format changes for version 0.5. These are fairly minor for the most part.

The first change is the addition of a revision field to the meta section of the file. This is set to 1 for every file at the moment and will increment any time there is a revision to the file. This replaces the updated field, which I’ve decided was of little use.
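
If you keep a local copy of the files, the revision number offers a simple way to spot ones that have changed. The Python sketch below assumes a parsed file is a dictionary with a meta section; the function name and the idea of storing the last revision you processed are illustrative assumptions.

```python
def needs_reprocessing(match, last_seen_revision):
    """Return True if a parsed file is newer than the revision last processed.

    `last_seen_revision` is None for files never processed before.
    Files with no `revision` in `meta` are treated as revision 0.
    """
    revision = match.get("meta", {}).get("revision", 0)
    return last_seen_revision is None or revision > last_seen_revision


# Hypothetical parsed files, for illustration only.
unchanged = {"meta": {"revision": 1}}
revised = {"meta": {"revision": 2}}

unchanged_needs_work = needs_reprocessing(unchanged, 1)
revised_needs_work = needs_reprocessing(revised, 1)
```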

The second change is the addition of new fields to deal with the situation where a match is decided by a bowl-out. The first is a bowl_out field in the outcome part of the info section, which indicates which team won the bowl-out. The second bowl_out, an addition to info itself, is an array containing the details of the actual bowl-out. It lists each ball bowled, showing the bowler and the outcome. An example of a bowl-out can be seen in the file for the first West Indian T20 international in 2006.
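
To show how the two bowl_out fields might be read together, here is a minimal Python sketch. It assumes the info section has been parsed into a dictionary and that each ball in the array has bowler and outcome keys; those key names, the hit and miss values, and the team and bowler names are all assumptions made for illustration.

```python
from collections import Counter


def summarise_bowl_out(info):
    """Return (winner, outcome_tally) for a parsed `info` section.

    `winner` is None and the tally is empty when the match was not
    decided by a bowl-out.
    """
    winner = info.get("outcome", {}).get("bowl_out")
    balls = info.get("bowl_out", [])
    tally = Counter(ball["outcome"] for ball in balls)
    return winner, tally


# Hypothetical info section, for illustration only.
sample_info = {
    "outcome": {"bowl_out": "Team A"},
    "bowl_out": [
        {"bowler": "Bowler 1", "outcome": "hit"},
        {"bowler": "Bowler 2", "outcome": "miss"},
        {"bowler": "Bowler 3", "outcome": "hit"},
    ],
}

winner, tally = summarise_bowl_out(sample_info)
```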

The final change is the addition of a supersub entry to any delivery in which a super-substitution was made. This will be an array containing an entry for each substitution, containing in, out, and team fields showing which player came in, who was replaced, and which team made the substitution. You can see the only example on the site at this time in a South Africa vs New Zealand T20 match from 2005.
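
A short Python sketch of reading the supersub entry follows. It assumes a delivery has been parsed into a dictionary; the in, out, and team keys come from the description above, while the function name and the sample players are hypothetical.

```python
def supersubs_on_delivery(delivery):
    """Return (team, player_out, player_in) tuples for a parsed delivery.

    Deliveries without a `supersub` entry yield an empty list.
    """
    return [
        (sub["team"], sub["out"], sub["in"])
        for sub in delivery.get("supersub", [])
    ]


# Hypothetical deliveries, for illustration only.
with_sub = {
    "supersub": [{"in": "Player B", "out": "Player A", "team": "Team X"}]
}
without_sub = {"batsman": "Player C"}

substitutions = supersubs_on_delivery(with_sub)
```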

A number of changes are already in the works for version 0.6 of the data. More details on those will follow in the next few weeks.