Who's good at coding in excel?

I'd like to see if there's a way of measuring if players make other players on the field better (or worse). Can someone write the code to go through Matt's play by play data to get the info?

Here's how I envision it.

 Suppose we want to know: When Christopher Castonguay is on the field, does he make his teammates better? Each player as a proportion of goals for to goals against, broken down on offense and on defense for each game. For example: 

  • In week 7 Brent Burton went: 1/9 (0.111) on offense, 4/10 (0.4) on defense.
  • In week 6 Brent Burton went: 7/11 (0.6364) on offense and 8/13 (0.6153) on defense.

 Now, can we count the number of points that Brent and Chris played together? If we can, we can calculate the proportion of Brent's goals given CC is on the field. So, hypothetically: 

  • Week 7, Brent and Chris played 2 offensive points, and they went 1/2
  • Week 6, Brent and Chris played 3 offensive points and they went 1/3.

 Then with Chris on the field, Brent's numbers are: 

  • Week 7 offense: 0.5
  • Week 6 offense, 0.3333

 Then we subtract Brent's score with Chris from Brent's score without Chris 

  • Week 7 offense: 0.3889
  • Week 6 offense: -0.3031

 If we sum that up we get 0.0858. Can we do this for all players who played with CC over all the weeks? This will give us a numerical number to say that CC is bringing players up or down.

This seems like it could be solved using Visual Basics, but I don't really know. Alex, if someone messages you, let me know because I would like to learn how to code something like this.

Curious why you only consider Offensive points.

I'll consider both O and D points, but I got tired of typing and switched to O for the example.

 

I thought you'd ask why I chose you...

With Excel, not really VBA proficient but pretty good with logic formulas.. but also shitty with math. I can kind of envision what needs to be done, but would need a good chunk of data to mess with... and I think it would take some serious effort to compute each player with everyone on their team. You would first need to find out how many points each player played with each player. Do you have that data easily?

It's in the public sheet under MM Play by Play

I also made a table, which I think is the easiest (AB - Workings). MY first approach was the figure out how many points each player played with each other player and then to see which of those were scored.

Send me an email

Not sure how busy I will be, but my initial feeling is a vlookup formula for each player against each player should do it somehow. I can't access Google docs at work though.. which is where I have all my sheets that I can pull stuff from

I would think avoiding VBA would be best so we can base it directly in Google Docs.  Question on that though: if there is a calculationally intense tab in the online sheet, will the rest of the sheet slow down?  I did my 'who is throwing to whom' tab last year and had to keep the whole thing on manual.  Once inputs were updated, I hit F9 and let it run for a number of minutes, then pasted the results into the online sheet.  I'm not familiar enough with Google Docs to know how it handles this.

I would hate to a) put up a tab that is going to make the whole sheet unuseable for everyone and b) spend a whole bunch of development time for something that is going to cause breakage outlined in a).

I did some fiddling with google docs formulas around week 2 to poke around the stats a bit. I am by no means an expert...

Google Docs is weird.. but I don't think it will become slow. In fact, it seems to be more of an issue to have it actually update.. I've generally found the best way is to put a button that when pushed does the calculations. Otherwise, gdocs will NOT refresh any custom functions on the fields, even if cells are updated. It only refreshes fields with built-in formulas if cells are updated. I think only the owner might be able to click that button as well since the document is read only to everyone else?

Also, if anyone wanted to see the tab with custom stuff.. it gets a bit complicated with allowing custom functions to run in the doc.

Otherwise, the built in scripting stuff isn't too terrible... it's just a finicky as VBA more or less.

I had planned to take a couple days over the holidays to try and put the play-by-play data in a more accessible (for computers) format. I've been trying to figure out the best (easiest) way to get it into a C++ text string (because I'm at least vaguely functional with C++...not so with VBA). Suggestions on how to best do this are welcome. I'm thinking one string per point - from there you can build counters for all the usual things (passes, Ds, Gs, As, TAs, turns in general), plus build a table of who's on with each other for each individual point and the result. Then you can just search for Brent && Chris in each string and count the results.

It also occurs to me that the data must originate, in the app, in a similar sort of string that then outputs to Google Sheets, but I don't know if those strings are preserved anywhere. Should probably ask Matt, because otherwise we're basically reverse engineering it.

My first goal was to try and recreate the sort of data you see here (in the Assists to Goals chart), so I could understand how my team currently functions - who passes to who, and also who ignores who. But calculating any one individual's effects on another (using the basic calculation Bush suggests, anyway) wouldn't be too hard to do. That said, I think building a table of ALL player pairs seems unwieldy (at least to present, if not also to code).

Bush would like descrete events, and he's probably correct in this, for a number of different analytically relevant reasons that I can think of (ease of sorting, comparing, aggregating, etc).

He wants:

EVENT_ORIGIN, EVENT_DESTINATION, EVENT_RESULT

This makes sense from a spreadsheet perspective, as well as a basic game flow perspective. Aggregate analysis works best on ordered, column data (which the current event chains are not).

e.g.: 

KEATES, EWING, CATCH

EWING, BUSH, CATCH

BUSH, KEATES, CATCH

KEATES, NULL, THROW AWAY

NULL, AMOS, D

AMOS, NULL, PICKUP

AMOS, ASHLIN, GOAL

From a data perspective this lends itself very easily to a List object filled with Event objects. You can thus navigate a list of events like ordered nodes in a linked list, forward and back. I've a few working ideas for a DB of this stuff.

I like the line-by-line format.  I would recommend adding a few items like:

  • GameID (increment the number each game)
  • Point ID (increment the number each time a point is scored)
  • Possession ID (increment the number each time there is a turnover, a point is started and maybe after an injury sub depending on how folks want to use this)

This way it would make it easier to link to a table showing who is on the field and reducing computation time for aggregating over games, points, possessions, etc.

I had thought of including home team score and away team score as well, but I don't think that one is necessary.

The data model I'm fiddling with has events tracked by possessions, possessions by points, points by game. A point has 12 people on. A point has 1 or many possessions. A game has has 1 or many points. A game has 2 teams. A team has 12 players (and is unique by date given roster moves). Some other details I'm forgetting, this is all loosely detailed. All tied together by appropriate IDs as FKs. I should draw it out at some point...

From there you can drive most reports needed through SQL, which is actually pretty straight forward. It exports to JSON easily, exports to .csv easily, integrates well with any language needed.

Christmas plans, I guess.

If we're going to spend the time to convert the play-by-play into a "computer accessible" format, maybe we do it in a more... broadly usable manner. While I think we *all* love C++ (#NotReally), converting it to a format that anyone can use for any programming language would likely serve us better.

The obvious choice to me seems to be JSON, since pretty well any language can read and use it's data structure quite easily, and it's still human-readable.

It hadn't even occured to me to get this nerdy about Parity, but all y'all have got me really curious about how this might work. 

EDIT:

Well, it seems my curiosity was slighty stronger than I thought, so I found a few minutes and whipped together a JSON data structure. This sample document contains the first 4 points of the first game of week 1. I haven't captured *every* possible event type/situation yet (intercetions for example, where a D and the subsequen throw are made by the same player) but it's easy simple to extend.

The finer points of the structure (which values are allowed for a given field) would need to be defined and standardized, but at least it's a start.

If this works for people, at least it gives something to work towards. If there's comments, feel free to let me know. I might even start a new thread...

{
    "Source": "Al Colantonio",
    "Destination": null,
    "Type": "Throw",
    "Result": "Throw Away"
}

You know you're a nerd when you're chirping people in JSON...

Still though, *like*.

So much geek in one thread.  Shouldn't you all be camping out for the screening of this movie that's about to come out? ;)

+1

Meh...already seen it.  ;)

K

If anyone is coding in VBA or GAS and needs any help. LMK

Google Hangouts preferred :-).

I managed to get this done for the first 13 weeks of games. Basically, I measured the ratio of player A scoring when player B was on the field and the ratio of player A scoring when player B was not on the field, then I took the difference in those two values. I then summed that over all possible players and averaged it to get a value for player B that I think measures his or her effect on the other players on the field (obviously there's the effect of all 5 other players, but let's not worry too much about that. I would also need to break it down by O and D points, which I didn't do either.

You can read the number like: "On average, when player B is on the field, his teammates have x (higher/lower) chance of being on the field when their team scores"

Here are some highlights

Adam Mac: +5.14%

Al Colantonio: +2.45%

Keates: +10.35%

Geofford: +9.07%

Mehmet: +5.63%

Steve Chow: +11.42%

 

Take from it what you will.

 

Or maybe some middlelights or alllights?

So is this like W.A.R. stat from baseball?

What I'm taking from this is that my Points Above Replacement number is close to the average so something else must be why I'm on the list.

I am going to assume just being me makes me "noteworthy".

Chirpworthy

...so you're on P.A.R. with the league?!?

"Chirpworthy"! Oh snap?

Are there any noteworthy or chirpworthy women?

I'll keep the lowlights to myself:

Owen "Because I asked for it" Lumley: +6.39%

Jessie Robinson: +3.19%

Fred Caron +2.53%

Justice Price: +2.13%

Alex Bush: +1.58%

Stan Kent: +1.35%

Ashlin Kelly: +0.67%

Ian Ewing: -1.34%

Brent Burton: -1.78%

Christopher "I run OCUA" Castonguay: -2.39%

Most points played together:

Andre Scott + Jessire Robinson played 160 points together. It helps that they've also played 13 games together!

The most scoring pair:

Jessie Robinson + Alex Bush were on the field together for 89 points. Watch out when they are on! Jessie throws it recklessly and Alex cuts poorly!

Best ratio of points scored to points played:

Steph White + Graham Brown score 90% of points played together. That's team work guys! (they only played 10 points together, but they scored 9 of them!)

The least scoring pair:

Mark Donahue + Alec Clark. Don't put them on together! When Mark is on the sideline, Alec is scoring 64% of his points! It doesn't go both ways though! When Alec is on the sideline, Mark is scoring a pedestrian 44.5% of his points. Gotta try harder, Donahue!

 

the average PAR per current team look like?  

Average per team:

 

Moderate Expectations: 0.0096

Karma Down Under: -0.0127

Ignominious: 0.0034

Pirate Rob's: 0.0034

Luke Parity: -0.0021

Katy Parity: -0.0216

Stan's Sturgeons: 0.0101

Scoobers: -0.0127

Stan's got the highest average?  I demand a recount!