I'd like to see if there's a way of measuring if players make other players on the field better (or worse). Can someone write the code to go through Matt's play by play data to get the info?
Here's how I envision it.
Suppose we want to know: When Christopher Castonguay is on the field, does he make his teammates better? Each player as a proportion of goals for to goals against, broken down on offense and on defense for each game. For example:
Now, can we count the number of points that Brent and Chris played together? If we can, we can calculate the proportion of Brent's goals given CC is on the field. So, hypothetically:
Then with Chris on the field, Brent's numbers are:
Then we subtract Brent's score with Chris from Brent's score without Chris
If we sum that up we get 0.0858. Can we do this for all players who played with CC over all the weeks? This will give us a numerical number to say that CC is bringing players up or down.
Christopher Cas...
Wed, 2015-12-16 14:03
Permalink
This seems like it could be
This seems like it could be solved using Visual Basics, but I don't really know. Alex, if someone messages you, let me know because I would like to learn how to code something like this.
Curious why you only consider Offensive points.
Alex Bush
Wed, 2015-12-16 15:08
Permalink
I'll consider both O and D
I'll consider both O and D points, but I got tired of typing and switched to O for the example.
I thought you'd ask why I chose you...
Steve Bisang
Wed, 2015-12-16 21:34
Permalink
I am semi decent
With Excel, not really VBA proficient but pretty good with logic formulas.. but also shitty with math. I can kind of envision what needs to be done, but would need a good chunk of data to mess with... and I think it would take some serious effort to compute each player with everyone on their team. You would first need to find out how many points each player played with each player. Do you have that data easily?
Alex Bush
Wed, 2015-12-16 21:43
Permalink
data
It's in the public sheet under MM Play by Play
I also made a table, which I think is the easiest (AB - Workings). MY first approach was the figure out how many points each player played with each other player and then to see which of those were scored.
Send me an email
Steve Bisang
Wed, 2015-12-16 21:54
Permalink
I will try tomorrow
Not sure how busy I will be, but my initial feeling is a vlookup formula for each player against each player should do it somehow. I can't access Google docs at work though.. which is where I have all my sheets that I can pull stuff from
Andrew W Cameron
Wed, 2015-12-16 22:41
Permalink
I would think avoiding VBA would be good
I would think avoiding VBA would be best so we can base it directly in Google Docs. Question on that though: if there is a calculationally intense tab in the online sheet, will the rest of the sheet slow down? I did my 'who is throwing to whom' tab last year and had to keep the whole thing on manual. Once inputs were updated, I hit F9 and let it run for a number of minutes, then pasted the results into the online sheet. I'm not familiar enough with Google Docs to know how it handles this.
I would hate to a) put up a tab that is going to make the whole sheet unuseable for everyone and b) spend a whole bunch of development time for something that is going to cause breakage outlined in a).
Greg Cormier
Mon, 2016-01-04 10:42
Permalink
.
I did some fiddling with google docs formulas around week 2 to poke around the stats a bit. I am by no means an expert...
Google Docs is weird.. but I don't think it will become slow. In fact, it seems to be more of an issue to have it actually update.. I've generally found the best way is to put a button that when pushed does the calculations. Otherwise, gdocs will NOT refresh any custom functions on the fields, even if cells are updated. It only refreshes fields with built-in formulas if cells are updated. I think only the owner might be able to click that button as well since the document is read only to everyone else?
Also, if anyone wanted to see the tab with custom stuff.. it gets a bit complicated with allowing custom functions to run in the doc.
Otherwise, the built in scripting stuff isn't too terrible... it's just a finicky as VBA more or less.
Ian Ewing
Wed, 2015-12-16 23:03
Permalink
Christmas break project
I had planned to take a couple days over the holidays to try and put the play-by-play data in a more accessible (for computers) format. I've been trying to figure out the best (easiest) way to get it into a C++ text string (because I'm at least vaguely functional with C++...not so with VBA). Suggestions on how to best do this are welcome. I'm thinking one string per point - from there you can build counters for all the usual things (passes, Ds, Gs, As, TAs, turns in general), plus build a table of who's on with each other for each individual point and the result. Then you can just search for Brent && Chris in each string and count the results.
It also occurs to me that the data must originate, in the app, in a similar sort of string that then outputs to Google Sheets, but I don't know if those strings are preserved anywhere. Should probably ask Matt, because otherwise we're basically reverse engineering it.
My first goal was to try and recreate the sort of data you see here (in the Assists to Goals chart), so I could understand how my team currently functions - who passes to who, and also who ignores who. But calculating any one individual's effects on another (using the basic calculation Bush suggests, anyway) wouldn't be too hard to do. That said, I think building a table of ALL player pairs seems unwieldy (at least to present, if not also to code).
Christopher Keates
Thu, 2015-12-17 09:43
Permalink
Bush would like descrete
Bush would like descrete events, and he's probably correct in this, for a number of different analytically relevant reasons that I can think of (ease of sorting, comparing, aggregating, etc).
He wants:
EVENT_ORIGIN, EVENT_DESTINATION, EVENT_RESULT
This makes sense from a spreadsheet perspective, as well as a basic game flow perspective. Aggregate analysis works best on ordered, column data (which the current event chains are not).
e.g.:
KEATES, EWING, CATCH
EWING, BUSH, CATCH
BUSH, KEATES, CATCH
KEATES, NULL, THROW AWAY
NULL, AMOS, D
AMOS, NULL, PICKUP
AMOS, ASHLIN, GOAL
From a data perspective this lends itself very easily to a List object filled with Event objects. You can thus navigate a list of events like ordered nodes in a linked list, forward and back. I've a few working ideas for a DB of this stuff.
Andrew W Cameron
Fri, 2015-12-18 11:34
Permalink
Good start...
I like the line-by-line format. I would recommend adding a few items like:
This way it would make it easier to link to a table showing who is on the field and reducing computation time for aggregating over games, points, possessions, etc.
I had thought of including home team score and away team score as well, but I don't think that one is necessary.
Christopher Keates
Fri, 2015-12-18 12:30
Permalink
The data model I'm fiddling
The data model I'm fiddling with has events tracked by possessions, possessions by points, points by game. A point has 12 people on. A point has 1 or many possessions. A game has has 1 or many points. A game has 2 teams. A team has 12 players (and is unique by date given roster moves). Some other details I'm forgetting, this is all loosely detailed. All tied together by appropriate IDs as FKs. I should draw it out at some point...
From there you can drive most reports needed through SQL, which is actually pretty straight forward. It exports to JSON easily, exports to .csv easily, integrates well with any language needed.
Christmas plans, I guess.
Greg Linton
Fri, 2015-12-18 13:46
Permalink
Maybe not C++
If we're going to spend the time to convert the play-by-play into a "computer accessible" format, maybe we do it in a more... broadly usable manner. While I think we *all* love C++ (#NotReally), converting it to a format that anyone can use for any programming language would likely serve us better.
The obvious choice to me seems to be JSON, since pretty well any language can read and use it's data structure quite easily, and it's still human-readable.
It hadn't even occured to me to get this nerdy about Parity, but all y'all have got me really curious about how this might work.
EDIT:
Well, it seems my curiosity was slighty stronger than I thought, so I found a few minutes and whipped together a JSON data structure. This sample document contains the first 4 points of the first game of week 1. I haven't captured *every* possible event type/situation yet (intercetions for example, where a D and the subsequen throw are made by the same player) but it's easy simple to extend.
The finer points of the structure (which values are allowed for a given field) would need to be defined and standardized, but at least it's a start.
If this works for people, at least it gives something to work towards. If there's comments, feel free to let me know. I might even start a new thread...
Will Leckie
Fri, 2015-12-18 15:54
Permalink
{ "Source": "Al Colantonio
Greg Linton
Sat, 2015-12-19 22:17
Permalink
You know you're...
You know you're a nerd when you're chirping people in JSON...
Still though, *like*.
Ariel Untiveros
Thu, 2015-12-17 09:35
Permalink
Oh man!
So much geek in one thread. Shouldn't you all be camping out for the screening of this movie that's about to come out? ;)
Geofford Seaborn
Thu, 2015-12-17 14:55
Permalink
+1
+1
Karlis Bouse
Thu, 2015-12-17 17:31
Permalink
Meh...already seen it. ;) K
Meh...already seen it. ;)
K
Matt Masse
Mon, 2016-01-04 16:27
Permalink
If anyone is coding in VBA or GAS and needs any help. LMK
If anyone is coding in VBA or GAS and needs any help. LMK
Google Hangouts preferred :-).
Alex Bush
Thu, 2016-02-11 00:01
Permalink
So I managed to get this done...
I managed to get this done for the first 13 weeks of games. Basically, I measured the ratio of player A scoring when player B was on the field and the ratio of player A scoring when player B was not on the field, then I took the difference in those two values. I then summed that over all possible players and averaged it to get a value for player B that I think measures his or her effect on the other players on the field (obviously there's the effect of all 5 other players, but let's not worry too much about that. I would also need to break it down by O and D points, which I didn't do either.
You can read the number like: "On average, when player B is on the field, his teammates have x (higher/lower) chance of being on the field when their team scores"
Here are some highlights
Adam Mac: +5.14%
Al Colantonio: +2.45%
Keates: +10.35%
Geofford: +9.07%
Mehmet: +5.63%
Steve Chow: +11.42%
Take from it what you will.
Owen Lumley
Thu, 2016-02-11 09:08
Permalink
Not gonna share any lowlights?
Or maybe some middlelights or alllights?
Adam MacDonald
Thu, 2016-02-11 11:16
Permalink
So is this like W.A.R. stat
So is this like W.A.R. stat from baseball?
Alessandro Cola...
Thu, 2016-02-11 12:01
Permalink
What I'm taking from this is
What I'm taking from this is that my Points Above Replacement number is close to the average so something else must be why I'm on the list.
I am going to assume just being me makes me "noteworthy".
Alex Bush
Thu, 2016-02-11 12:58
Permalink
Not "noteworthy"
Chirpworthy
Ariel Untiveros
Thu, 2016-02-11 16:52
Permalink
...so you're on P.A.R. with
...so you're on P.A.R. with the league?!?
Alessandro Cola...
Thu, 2016-02-11 14:27
Permalink
Oh snap?
"Chirpworthy"! Oh snap?
Are there any noteworthy or chirpworthy women?
Alex Bush
Thu, 2016-02-11 14:30
Permalink
Middlelights
I'll keep the lowlights to myself:
Owen "Because I asked for it" Lumley: +6.39%
Jessie Robinson: +3.19%
Fred Caron +2.53%
Justice Price: +2.13%
Alex Bush: +1.58%
Stan Kent: +1.35%
Ashlin Kelly: +0.67%
Ian Ewing: -1.34%
Brent Burton: -1.78%
Christopher "I run OCUA" Castonguay: -2.39%
Alex Bush
Thu, 2016-02-11 14:51
Permalink
A couple more fun(?) things
Most points played together:
Andre Scott + Jessire Robinson played 160 points together. It helps that they've also played 13 games together!
The most scoring pair:
Jessie Robinson + Alex Bush were on the field together for 89 points. Watch out when they are on! Jessie throws it recklessly and Alex cuts poorly!
Best ratio of points scored to points played:
Steph White + Graham Brown score 90% of points played together. That's team work guys! (they only played 10 points together, but they scored 9 of them!)
The least scoring pair:
Mark Donahue + Alec Clark. Don't put them on together! When Mark is on the sideline, Alec is scoring 64% of his points! It doesn't go both ways though! When Alec is on the sideline, Mark is scoring a pedestrian 44.5% of his points. Gotta try harder, Donahue!
Owen Lumley
Thu, 2016-02-11 22:31
Permalink
What does
the average PAR per current team look like?
Alex Bush
Fri, 2016-02-12 12:57
Permalink
Average per team:
Average per team:
Moderate Expectations: 0.0096
Karma Down Under: -0.0127
Ignominious: 0.0034
Pirate Rob's: 0.0034
Luke Parity: -0.0021
Katy Parity: -0.0216
Stan's Sturgeons: 0.0101
Scoobers: -0.0127
Ariel Untiveros
Mon, 2016-02-15 11:39
Permalink
Stan's team
Stan's got the highest average? I demand a recount!