rhandir wrote,
@ 2005-09-19 17:15:00
Megatokyo and Timeliness.
Abstract:
Many people "feel" that Megatokyo isn't updated regularly. This author examined the actual data on how often new comics are posted on megatokyo.com, and discovered a counterintuitive result.

Introduction
There are many things in life that we all "know" are true. Winter is snowy. Customer service lines are awful. Megatokyo updates so slooooooowly and irregularly that it's painful. As with many things that we "know", the truth is more complex than that. This author collected some data from the Megatokyo site and discovered that basic assumptions about how often it is updated are incorrect.

Methods
The JavaScript drop-down code from the Megatokyo front page was captured and brought into OpenOffice[0] as a text document. The capture was done early on Monday, September 19, 2005, at Comic #761.

Regex search and replace was used to reduce the drop-down to four tab-delimited columns: Comic #, Date, (duplicate) Comic #, and Title. [1] The file was brought into OpenOffice's spreadsheet program using the Open... dialogue and picking txt/csv format.[2] Columns were marked as the appropriate kind of data.
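
For anyone who would rather script this step than do it by hand, here is a minimal Python sketch of the same reduction. The layout of the captured lines is an assumption: the sample lines, the titles in them, and the pattern are all hypothetical stand-ins, not the site's actual markup, so the pattern would need adjusting against a real capture.

```python
import re

# Hypothetical sample of captured drop-down lines; the real markup may differ.
SAMPLE = """\
<option value="1">[1] 08/14/00 - Example title A</option>
<option value="2">[2] 08/16/00 - Example title B</option>
"""

# Assumed layout: value attribute, bracketed number, date, dash, title.
OPTION = re.compile(
    r'<option value="(\d+)">\[(\d+)\]\s+(\d{2}/\d{2}/\d{2})\s+-\s+(.*?)</option>'
)

def to_tsv(text):
    """Return tab-delimited rows: comic #, date, duplicate comic #, title."""
    rows = []
    for number, duplicate, date, title in OPTION.findall(text):
        rows.append("\t".join([number, date, duplicate, title]))
    return rows

for row in to_tsv(SAMPLE):
    print(row)
```

The rows it prints can be saved to a text file and opened in the spreadsheet as txt/csv, the same as the hand-edited version.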

A series of simple formulas were applied to analyze the data. Letters mark column values, numbers are row values. (For instance the first cell would be A1.)

Turn dates into days of the week: [3]
=CHOOSE(WEEKDAY(B2);"Sun";"Mon";"Tue";"Wed";"Thu";"Fri";"Sat")

Count how many times a word appears in a range of cells:
=COUNTIF(E2:E762;"SHORT")

Days between:
=DAYS(B762;B2)

Division:
=H6/H8

Multiplication (24 is a constant in this example):
=PRODUCT(M13;24)

Subtraction (56 is a constant in this example):
=56-N14

Addition:
=SUM(K31:M31)
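
For readers without a spreadsheet handy, the three non-trivial formulas above can be approximated in Python. This is a rough sketch, not a statement of how Calc works internally. The function names are my own, and I am assuming COUNTIF matches whole cells and ignores case.

```python
from datetime import date

# Same order as the CHOOSE list, which starts at Sunday.
DAY_NAMES = ["Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"]

def day_name(d):
    """Rough equivalent of CHOOSE(WEEKDAY(d); "Sun"; ...; "Sat")."""
    # isoweekday() gives Mon=1 .. Sun=7, so Sunday maps to index 0.
    return DAY_NAMES[d.isoweekday() % 7]

def count_word(cells, word):
    """Rough equivalent of COUNTIF(range; word), assumed case-insensitive."""
    return sum(1 for cell in cells if cell.lower() == word.lower())

def days_between(later, earlier):
    """Rough equivalent of DAYS(later; earlier)."""
    return (later - earlier).days

first = date(2000, 8, 14)  # the first listed date, used again in section II
print(day_name(first))
print(days_between(date(2001, 8, 14), first))
print(count_word(["Mon", "Wed", "Mon"], "Mon"))
```

Division, multiplication, subtraction, and addition are the ordinary operators, so they are left out.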

Discussion

I. Weekday Updates?
Fred Gallagher has stated that his goal is to update three days a week, ideally on Mondays, Wednesdays, and Fridays. Let's see how he does.

I turned the list of dates into a list of days of the week, and then counted how many times each day showed up. Here's the raw data:
Mon	Tues	Wed	Thurs	Fri	Sat	Sun
240	0	248	15	233	5	6
Hmm. Remarkable, especially since he's been doing this for five years.

What's that in percentages? (Given 761 comics)
Mon	Tues	Wed	Thurs	Fri	Sat	Sun
31.54%	0.00%	32.59%	1.97%	30.62%	0.66%	0.79%

Looks like Fred's on target, eh? Perfect numbers would be 33.33% each on Mon, Wed, and Fri. Being only human, Fred's getting 32%, 33%, and 31%.
Short summary: Fred updates Megatokyo on the days he says he will just under 95% of the time (the rounded figures above add up to 96%).
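
The percentages can be rechecked from the raw counts with a few lines of Python. This is only a sketch of the arithmetic. The counts and the total of 761 are copied from the tables above, and the names are my own.

```python
# Weekday counts as reported in the raw data above.
counts = {"Mon": 240, "Tues": 0, "Wed": 248, "Thurs": 15,
          "Fri": 233, "Sat": 5, "Sun": 6}
TOTAL_COMICS = 761  # the total used for the percentages in this post

def share(n, total=TOTAL_COMICS):
    """Percentage of all comics, rounded to two decimals."""
    return round(100 * n / total, 2)

shares = {day: share(n) for day, n in counts.items()}
on_schedule = sum(counts[d] for d in ("Mon", "Wed", "Fri"))

print(shares)
print(on_schedule, share(on_schedule))  # Mon/Wed/Fri strips and their share
print(sum(counts.values()))             # strips the weekday table accounts for
```

Run as written, the Monday/Wednesday/Friday share comes out a little under 95% before rounding, and the seven counts add up to 747 rather than 761, so a handful of strips are not accounted for in the weekday table.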

II. How many strips does he get done a year, then?
Using the first listed date (08/14/00) as the anniversary date, we end up with five periods of between 364 and 366 days each. That is close enough to 365 on average, considering that a solar year has nearly six extra hours in it anyway, and odds are one of those was a leap year.
Raw Data:
Days	Strips
364	158
366	139
364	151
366	147
364	152

In an ordinary calendar year, there are 52 weeks times 3 updates, or 156 possible timely updates, but that's not quite as exact as it could be. The number of days or hours between ideal updates is a slightly better measure. If he's supposed to be updating three times a week, ideally he would have 56 hours between updates. (7 days a week divided by 3 updates equals 2.33 days, or 56 hours, between updates.)

Fred's actual score, in hours:
End Year One		55.29
End Year Two		63.19
End Year Three		57.85
End Year Four		59.76
End Year Five		57.47
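
These figures are just the length of each period in hours divided by the number of strips posted in it. A small Python sketch of that calculation, using the days and strips from the raw data above (the names are my own):

```python
HOURS_PER_DAY = 24

# (days in period, strips posted) for years one to five, from the raw data.
periods = [(364, 158), (366, 139), (364, 151), (366, 147), (364, 152)]

def hours_between_strips(days, strips):
    """Average number of hours between strips over one period."""
    return days * HOURS_PER_DAY / strips

for year, (days, strips) in enumerate(periods, start=1):
    print(f"End Year {year}: {hours_between_strips(days, strips):.2f}")
```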

The right column is the average number of hours ahead of (positive) or behind (negative) schedule, that is, 56 minus the actual figure. Note that this means that in year one he was roughly 43 minutes ahead of schedule.
End Year One		0.71
End Year Two		-7.19
End Year Three		-1.85
End Year Four		-3.76
End Year Five		-1.47

Yeah, so ON AVERAGE, in no year has Fred been behind schedule by more than 7 hours and 11 minutes per strip, and that was three years ago.
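
To turn the decimal hours into hours and minutes, something like the following Python sketch would do. It assumes the 56-hour target from above and takes the yearly averages from the earlier table. The function names and the output format are my own choices.

```python
TARGET_HOURS = 7 * 24 / 3  # 56 hours between updates at three strips a week

# Average hours between strips for years one to five, from the table above.
actual_hours = [55.29, 63.19, 57.85, 59.76, 57.47]

def schedule_margin(hours):
    """Positive means ahead of schedule, negative means behind."""
    return TARGET_HOURS - hours

def as_hours_minutes(hours):
    """Format a margin in hours as a signed 'Hh MMm' string."""
    sign = "-" if hours < 0 else "+"
    total_minutes = round(abs(hours) * 60)
    return f"{sign}{total_minutes // 60}h {total_minutes % 60:02d}m"

for year, hours in enumerate(actual_hours, start=1):
    margin = schedule_margin(hours)
    print(f"End Year {year}: {margin:+.2f} hours ({as_hours_minutes(margin)})")
```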


III. But what about the guest strips, short stories, dead piro days, shirt guy dom days, etc.?
For convenience, call those "off topic" strips.
Not all of those were accurately marked in the listing on the site, so I'll have to go back and recount them. However, and this is a big however, out of the run of 761 strips, if you exclude "entertaining" strips (guest strips, Nani Nazi Megatokyos, Short Story strips, Adventures of Piro and Seraphim), you end up with only 108 "non-story" strips. In the past two years, there have been around 20 a year out of an average of 150. If you include the full run of "off topic" strips, it comes to 143 strips, or a little under a fifth of the total run.

Here's the ironic part:
A third of those "off topic strips" were in Year One.

IV. Protests

  • But you didn't count actual days between strips!

  • The averages in section two do look like they blur the length of some of those gaps. Rest assured, a team of trained hamsters is looking into this right now. We'll get you a histogram.

  • Exactly how many days did he run late?

  • Twenty-six, see section one. The way they were counted discards lateness that flowed over onto the next Monday, Wednesday, or Friday. But the average number of hours behind schedule couldn't be as low as it was if that happened often.

  • The count of "off topic" strips is inaccurate!

  • Yes.
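
Until the hamsters report back, here is a rough Python sketch of how the gap histogram mentioned in the first answer could be built from the list of posting dates. The dates in it are made up for illustration and are not real Megatokyo data. The function name is my own.

```python
from collections import Counter
from datetime import date

def gap_histogram(dates):
    """Count how often each gap (in days) occurs between consecutive posts."""
    ordered = sorted(dates)
    gaps = [(b - a).days for a, b in zip(ordered, ordered[1:])]
    return Counter(gaps)

# Made-up posting dates for illustration only, not real Megatokyo data.
sample = [date(2005, 1, 3), date(2005, 1, 5), date(2005, 1, 7),
          date(2005, 1, 10), date(2005, 1, 14)]

for gap, n in sorted(gap_histogram(sample).items()):
    print(f"{gap} days: {'#' * n}")
```

On a perfect Monday, Wednesday, Friday schedule every gap would be two or three days, so anything longer in the real data would stand out as a late or skipped strip.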


Conclusion
Fred Gallagher catches a lot of flak for being "late". Simple averages prove that he's not very late, and most likely isn't late at all. More detailed stats will be run in the future. I'll happily provide my data for others to analyze in a month or two.



[0]
Free at OpenOffice.org. Can import Microsoft and WordPerfect files too.
[1]
Use Find and Replace, select the regex checkbox, and use \t to replace characters with a tab and \r to replace characters with a line break. (This was under Windows.) If you pick regex, you cannot find and replace square brackets like these: [ ]. Edit: \r no longer seems to work. User error?
[2]
Using "import" gives an error.
[3]
This formula brought to you by http://www.openofficetips.com/
I could not have worked that one out by myself.

Revised November 2006 to add Public Domain Dedication.


Copyright-Only Dedication (based on United States law) or Public Domain Certification

The person or persons who have associated work with this document (the "Dedicator" or "Certifier") hereby either (a) certifies that, to the best of his knowledge, the work of authorship identified is in the public domain of the country from which the work is published, or (b) hereby dedicates whatever copyright the dedicators holds in the work of authorship identified below (the "Work") to the public domain. A certifier, moreover, dedicates any copyright interest he may have in the associated work, and for these purposes, is described as a "dedicator" below.

A certifier has taken reasonable steps to verify the copyright status of this work. Certifier recognizes that his good faith efforts may not shield him from liability if in fact the work certified is not in the public domain.

Dedicator makes this dedication for the benefit of the public at large and to the detriment of the Dedicator's heirs and successors. Dedicator intends this dedication to be an overt act of relinquishment in perpetuity of all present and future rights under copyright law, whether vested or contingent, in the Work. Dedicator understands that such relinquishment of all rights includes the relinquishment of all rights to enforce (by lawsuit or otherwise) those copyrights in the Work.

Dedicator recognizes that, once placed in the public domain, the Work may be freely reproduced, distributed, transmitted, used, modified, built upon, or otherwise exploited by anyone for any purpose, commercial or non-commercial, and in any way, including by methods that have not yet been invented or conceived.

Creative Commons License
This work is licensed under a Creative Commons Public Domain License.


This work is hereby released into the Public Domain. To view a copy of the public domain dedication, visit http://creativecommons.org/licenses/publicdomain/ or send a letter to Creative Commons, 543 Howard Street, 5th Floor, San Francisco, California, 94105, USA.




Awesome original research
(Anonymous)
2005-09-20 01:11 am UTC
The statistics you have definitely debunk the claim of irregular updates. You just have to make sure that your statistics are accurate.



ghastlycomic
2005-12-04 12:34 am UTC
I think the study needs to look at the time of day the strip goes up.

It's all well and good to say you update every MWF, but if your update doesn't go up until late in the evening, I think that counts as a late update even though it fell on the right day.

In fact I'd say if it hasn't gone up by 10am the day it's supposed to update it should be considered a late update.


Agreed!
rhandir
2005-12-08 01:06 pm UTC
Thanks for the reply, Ghastly. Yep, you nailed the weakness of the study: the definition of "late". Good catch.

I think you are on to the main thing, that it's not the lateness that bothers people, it's the violation of expectations. For instance, I _know_ Mac Hall will update sometime this week. Maybe. And I will enjoy it when it does. But I've never bothered to find out when they actually update, because there's no obvious pattern that I can see when I use the standard* "load my comics bookmarks into firefox tabs in the morning".

Fred, on the other hand, repeatedly refers to a MWF schedule, and has a kind of storyline that lends itself to cliffhangers (real and unintentional). So the expectation that "today" I'll find out "what happens next" gets violated a lot.

I really should go back, and think about how to quantify an expectancy violation. (10 am sounds like a good place to start.) That's a lot tougher than doing a straightforward number crunch.

Frankly, I have a problem with people who obviously love the series (or used to) slagging Fred for being tardy, when they aren't _really_ complaining about what they think they are complaining about. (Kinda like those arguments you see couples get into where they think they are fighting about the same thing, but really they are talking at cross purposes.)

I mean, if I get to find out "what happens next" three days a week, and those days are Tues, Thurs, and Saturday, because I'm not awake at 11:59 on MWF, that's good enough for me. I can certainly understand (and sympathize with) why that bothers people, but I was trying to gently point out that it really isn't that important.

It would be interesting to go back through a mirror of the site and see what the timestamps are on the individual comic images. Hmm. "Further research is required."

Thanks.

*okay, standard for me.
