rhandir @ 2005-09-19 17:15:00
Megatokyo and Timeliness.
Abstract:
Many people "feel" that Megatokyo isn't updated regularly. This author examined the actual data on how often new comics are posted on megatokyo.com, and discovered a counterintuitive result.
Introduction
There are many things in life that we all "know" are true. Winter is snowy. Customer service lines are awful. Megatokyo updates so slooooooowly and irregularly that it's painful. Like many things that we "know", the truth is more complex than that. This author collected some data from the Megatokyo site and discovered that basic assumptions about how often it is updated are incorrect.
Methods
The JavaScript drop-down code from the Megatokyo front page was captured and brought into OpenOffice[0] as a text document. The capture was done early on Monday, September 19, 2005, at Comic #761.
Regex search and replace was used to reduce the drop-down to four tab-delimited columns: Comic #, Date, (duplicate) Comic #, and Title. [1] The file was brought into OpenOffice's spreadsheet program using the Open... dialogue and picking txt/csv format.[2] Columns were marked as the appropriate kind of data.
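The same reduction can be sketched outside a word processor. What follows is a minimal Python sketch, not the procedure actually used here: the <option> layout in SAMPLE is a made-up stand-in for the real drop-down markup (which may look quite different), the titles are placeholders, and to_tsv is a name invented for this example. It only illustrates turning each entry into the four tab-delimited columns.

```python
import re

# HYPOTHETICAL drop-down markup: the real Megatokyo format may differ.
SAMPLE = """\
<option value="1">1 - 08/14/00 - Example title one</option>
<option value="2">2 - 08/16/00 - Example title two</option>
"""

# Groups: value attribute (Comic #), visible number (duplicate), date, title.
OPTION_RE = re.compile(
    r'<option value="(\d+)">(\d+) - (\d{2}/\d{2}/\d{2}) - (.*?)</option>'
)

def to_tsv(markup: str) -> str:
    """Return one tab-delimited line per entry: Comic #, Date, Comic #, Title."""
    rows = []
    for number, duplicate, date, title in OPTION_RE.findall(markup):
        rows.append("\t".join([number, date, duplicate, title]))
    return "\n".join(rows)

print(to_tsv(SAMPLE))
```

Saved to a .txt file, output like this should open in a spreadsheet with tab chosen as the separator.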
A series of simple formulas were applied to analyze the data. Letters mark column values, numbers are row values. (For instance the first cell would be A1.)
Turn dates into days of the week: [3]
=CHOOSE(WEEKDAY(B2);"Sun";"Mon";"Tue";"Wed";"Thu";"Fri";"Sat")
Count how many times a word appears in a range of cells:
=COUNTIF(E2:E762;"SHORT")
Days between:
=DAYS(B762;B2)
Division
=H6/H8
Multiplication
=PRODUCT(M13;24)
24 is a constant in this example
Subtraction
=56-N14
56 is a constant in this example
Addition
=SUM(K31:M31)
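For readers without a spreadsheet handy, the first and third formulas have close analogues in Python's standard library. This is a rough sketch, assuming dates in MM/DD/YY form like the first listed date (08/14/00); parse, weekday_name and days_between are names invented for this example.

```python
from datetime import date, datetime

# Same order as the CHOOSE list: WEEKDAY() counts Sunday as 1 by default.
NAMES = ["Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"]

def parse(text: str) -> date:
    """Parse an MM/DD/YY date string (assumed format)."""
    return datetime.strptime(text, "%m/%d/%y").date()

def weekday_name(d: date) -> str:
    """Rough analogue of the CHOOSE(WEEKDAY(...)) formula."""
    # date.weekday() is Monday=0 .. Sunday=6, so shift by one to start at Sunday.
    return NAMES[(d.weekday() + 1) % 7]

def days_between(later: date, earlier: date) -> int:
    """Rough analogue of =DAYS(later; earlier)."""
    return (later - earlier).days

first = parse("08/14/00")
print(weekday_name(first))                      # Mon
print(days_between(parse("08/14/01"), first))   # 365
```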
Discussion
I. Weekday Updates?
Fred Gallagher has stated that his goal is to update three days a week, ideally on Mondays, Wednesdays, and Fridays. Let's see how he does.
I turned the list of dates into a list of days of the week, and then counted how many times each day showed up. Here's the raw data:
Mon    Tues   Wed    Thurs  Fri    Sat    Sun
240    0      248    15     233    5      6

Hmm. Remarkable, especially since he's been doing this for five years.
What's that in percentages? (Given 761 comics)
Mon     Tues    Wed     Thurs   Fri     Sat     Sun
31.54%  0.00%   32.59%  1.97%   30.62%  0.66%   0.79%
Looks like Fred's on target, eh? Perfect numbers would be 33.33% on Mon, Wed, Fri. Being only human, Fred's getting 32%, 33%, and 31%.
Short summary: with rounding, Fred updates Megatokyo on the days he says he will 96% of the time.
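The percentages and the 96% figure can be rechecked from the raw counts in this section. A small Python sketch follows; the counts and the total of 761 are copied from the text, and the variable names are only illustrative.

```python
# Strips per weekday, copied from the raw data in this section.
counts = {"Mon": 240, "Tues": 0, "Wed": 248, "Thurs": 15,
          "Fri": 233, "Sat": 5, "Sun": 6}
TOTAL_COMICS = 761  # comics in the run at capture time

# Share of the whole run posted on each weekday, in percent.
percent = {day: 100 * n / TOTAL_COMICS for day, n in counts.items()}
for day, p in percent.items():
    print(f"{day:<5} {p:5.2f}%")

target_days = ("Mon", "Wed", "Fri")
rounded_sum = sum(round(percent[d]) for d in target_days)
exact_sum = sum(percent[d] for d in target_days)
print(rounded_sum)           # 96
print(f"{exact_sum:.2f}")    # 94.74
```

The 96% comes from adding the already-rounded figures, as the summary says; adding the unrounded shares gives a little under 95%. The seven listed counts also add up to 747 rather than 761, so the percentages total slightly less than 100%.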
II. How many strips does he get done a year, then?
Using the first listed date (08/14/00) as the anniversary date, we end up with five periods of between 364 and 366 days, which is close enough to 365 on average, considering that a solar year has a couple extra hours in it anyway, and odds are one of those was a leap year.
Raw Data:
Days   Strips
364    158
366    139
364    151
366    147
364    152
In an ordinary calendar year, there are 52 weeks times 3 updates, or 156 possible timely updates, but that's not quite as exact as it could be. The number of days or hours between ideal updates is a slightly better measure. If he's supposed to be updating three times a week, ideally he would have 56 hours between updates. (7 days a week divided by 3 updates equals 2.33 days, or 56 hours, between updates.)
Fred's actual score, in hours:
End Year One     55.29
End Year Two     63.19
End Year Three   57.85
End Year Four    59.76
End Year Five    57.47
Right column is 56 minus the actual figure, so a negative number is the average # of hours behind schedule. Note that the positive number means that in Year One he was almost 45 minutes ahead of schedule.
End Year One     0.71
End Year Two     -7.19
End Year Three   -1.85
End Year Four    -3.76
End Year Five    -1.47
Yeah, so ON AVERAGE, at no point has Fred been behind schedule more than 7 hours and 11 minutes, and that was three years ago.
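The hour figures in this section follow directly from the days and strips per year. Here is a minimal Python sketch of that arithmetic, using the numbers from the raw data; hours_between is a helper name made up for this example.

```python
IDEAL_HOURS = 7 * 24 / 3  # three updates a week -> 56 hours apart

# (days in period, strips posted), copied from the raw data in this section.
years = [(364, 158), (366, 139), (364, 151), (366, 147), (364, 152)]

def hours_between(days: int, strips: int) -> float:
    """Average number of hours between strips over one period."""
    return days * 24 / strips

for number, (days, strips) in enumerate(years, start=1):
    actual = hours_between(days, strips)
    margin = IDEAL_HOURS - actual  # negative means behind schedule
    print(f"Year {number}: {actual:.2f} h between strips, {margin:+.2f} h")
```

The worst margin, in Year Two, is where the "7 hours and 11 minutes" comes from.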
III. But what about the guest strips, short stories, dead piro days, shirt guy dom days, etc.?
For convenience, call those "off topic" strips.
Not all of those were accurately marked in the listing on the site. I'll have to go back and recount them. However, and this is a big however, out of the run of 761 strips (excluding "entertaining strips" like guest strips, Nani Nazi Megatokyos, Short Story strips, Adventures of Piro and Seraphim) you end up with only 108 "non-story" strips. In the past two years, there've been around 20 a year out of an average 150. If you include the full run of "off topic" strips, it comes to 143 strips, or about a fifth of the total run.
Here's the ironic part:
A third of those "off topic strips" were in Year One.
IV. Protests
- But you didn't count actual days between strips!
The averages in section two do look like they blur the length of some of those gaps. Rest assured, a team of trained hamsters is looking into this right now. We'll get you a histogram.
- Exactly how many days did he run late?
Twenty-six, see section one. The way they were counted discards lateness that flowed over onto the next Monday, Wednesday, or Friday. But the average number of hours behind schedule couldn't be as low as it was if that happened often.
- The count of "off topic" strips is inaccurate!
Yes.
Conclusion
Fred Gallagher catches a lot of flak for being "late". Simple averages prove that he's not very late, and most likely isn't late at all. More detailed stats will be run in the future. I'll happily provide my data for others to analyze in a month or two.
[0]
Free at OpenOffice.org. Can import Microsoft and WordPerfect stuff too.
[1]
Use Find and Replace, select the regex checkbox, and use \t to replace characters with a tab, \r to replace characters with a line break. (Under Windows.) If you pick regex, you cannot find and replace square brackets like these: [ ]. Edit: \r no longer seems to work. User error?
[2]
Using "import" gives an error.
[3]
This formula brought to you by http://www.openofficetips.com/
I could not have worked that one out by myself.
Revised November 2006 to add Public Domain Dedication.
Copyright-Only Dedication (based on United States law) or Public Domain Certification
The person or persons who have associated work with this document (the "Dedicator" or "Certifier") hereby either (a) certifies that, to the best of his knowledge, the work of authorship identified is in the public domain of the country from which the work is published, or (b) hereby dedicates whatever copyright the dedicators holds in the work of authorship identified below (the "Work") to the public domain. A certifier, moreover, dedicates any copyright interest he may have in the associated work, and for these purposes, is described as a "dedicator" below.
A certifier has taken reasonable steps to verify the copyright status of this work. Certifier recognizes that his good faith efforts may not shield him from liability if in fact the work certified is not in the public domain.
Dedicator makes this dedication for the benefit of the public at large and to the detriment of the Dedicator's heirs and successors. Dedicator intends this dedication to be an overt act of relinquishment in perpetuity of all present and future rights under copyright law, whether vested or contingent, in the Work. Dedicator understands that such relinquishment of all rights includes the relinquishment of all rights to enforce (by lawsuit or otherwise) those copyrights in the Work.
Dedicator recognizes that, once placed in the public domain, the Work may be freely reproduced, distributed, transmitted, used, modified, built upon, or otherwise exploited by anyone for any purpose, commercial or non-commercial, and in any way, including by methods that have not yet been invented or conceived.

This work is licensed under a Creative Commons Public Domain License.
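This work is hereby released into the Public Domain. To view a copy of the public domain dedication, visit http://creativecommons.org/licenses/publicdomain/ or send a letter to Creative Commons, 543 Howard Street, 5th Floor, San Francisco, California, 94105, USA.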