SPaNK 2010 v.0.1.0 Available
Moderator: Angels
First of all, here is a link to the Google document which I'd like some help with. It's a list of players who I need DOB, Bats, Throws and Primary Position information on.
https://spreadsheets.google.com/ccc?key ... 2REE&hl=en
Secondly, here is a link to the 0.1.0 release for SPaNK 2010. A few things to keep in mind...
1. This is batters only; I plan on having pitchers in the next week or so.
2. This only includes hitters who played in MLB in the last two years. I fully intend to add minor leaguers for the final 2010 release; they're just even farther from being ready than these guys are.
3. These stats are intended to be park neutral.
4. There are still 3 or 4 more processes I intend to test out in this system, so don't count on any of these stats remaining as they are. This is v.0.1.0 because it's still very simplistic compared to what I hope to ultimately have produced for 2010.
5. Note that I projected all players out to 500 plate appearances, including pitchers, though that will change.
6. If anything looks really wonky, please let me know, preferably by posting here. Finding individual players whose numbers are screwy helps me improve the whole system.
https://spreadsheets.google.com/ccc?key ... 2TlE&hl=en
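On point 5 above: one simple way to read "projected out to 500 plate appearances" is a straight proportional scaling of a player's counting stats. The sketch below assumes exactly that; the function name and the sample stat line are made up for illustration and are not taken from the SPaNK workbooks.

```python
def scale_to_pa(counts, actual_pa, target_pa=500):
    """Scale raw counting stats so they correspond to target_pa plate appearances.

    Assumes plain proportional scaling, which may not match what SPaNK does.
    """
    if actual_pa <= 0:
        raise ValueError("actual_pa must be positive")
    factor = target_pa / actual_pa
    return {stat: value * factor for stat, value in counts.items()}


# Hypothetical line: 250 PA with 60 hits, 10 home runs and 25 walks.
line = {"H": 60, "HR": 10, "BB": 25}
scaled = scale_to_pa(line, actual_pa=250)
print(scaled)  # {'H': 120.0, 'HR': 20.0, 'BB': 50.0}
```

Rate stats such as AVG or OBP are unchanged by this kind of scaling, which is why a pitcher's line can look odd at 500 PA without the underlying rates being wrong.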
- Athletics
- Posts: 1926
- Joined: Fri May 21, 2010 1:00 am
- Location: San Diego, CA
- Name: Stephen d'Esterhazy
I looked it over quickly and, aside from a few errors with the pitchers, it looks decent. I mean, it's a nice place to start.
"My shit doesn't work in the playoffs. My job is to get us to the playoffs. What happens after that is fucking luck."
LAA 11 - 15 331W - 479L
LAA 16 - 20 477W - 333L 17-20 ALW
OAK 21 - 24 297W - 189L 21-22 ALW
Yeah, the pitcher hitting needs to be handled separately, obviously. I also have full L/R splits for all players.
I should also add, I'm not professing myself to be on the same level as the guys at BP with PECOTA. This is a project specifically designed to create a functional, readily updatable projection system for the IBC. I'm no whiz at this stuff, I just really like the idea of the IBC having a little more independence and one more thing to make us unique among sim leagues.
- Rangers
- Site Admin
- Posts: 4024
- Joined: Wed Feb 23, 2005 1:00 am
- Location: Prosper, TX
- Name: Brett Perryman
Bren, I mentioned this to JP as we were talking about disclosing the formula, but the only inherent problem I see here is that it will absolutely not work (imo) for a member (or a few members) of the league to be able to determine exactly what players' projections will be and no one else to have any idea. I don't see it as an obstacle, but you are either going to have to disclose the basic formula, not play (and no one wants that), or find a really good way to express the gist so that people will have the ability to approximate at least similarly to what you can.
But that is a different matter from the projections themselves, and I think the majority of us like the idea of not being slaves to zips or dmb.
Tigers wrote: "Bren, I mentioned this to JP as we were talking about disclosing the formula, but the only inherent problem I see here is that it will absolutely not work (imo) for a member (or a few members) of the league to be able to determine exactly what players' projections will be and no one else to have any idea. I don't see it as an obstacle, but you are either going to have to disclose the basic formula, not play (and no one wants that), or find a really good way to express the gist so that people will have the ability to approximate at least similarly to what you can.
But that is a different matter from the projections themselves, and I think the majority of us like the idea of not being slaves to zips or dmb."
JP and I already discussed this and I agree completely. I'll be making the full system available to several members of the ExCo for their review, so that there are no questions of fairness or bias.
- Guardians
- Posts: 4972
- Joined: Sun Jan 29, 2012 1:00 am
- Location: Tallahassee, FL
- Name: Pat Gillespie
Padres wrote: "JP and I already discussed this and I agree completely. i'll be making the full system available to several members of the ExCo for their review, so that there are no questions of fairness or bias."
But after the ExCo reviews, the formula will be made public to the league, correct?
- Rangers
- Site Admin
- Posts: 4024
- Joined: Wed Feb 23, 2005 1:00 am
- Location: Prosper, TX
- Name: Brett Perryman
Astros wrote: "But after the Exco reviews, the formula will be made public to the league, correct?"
At the least, I think that everyone needs to have a very good idea of the way it works. Since the formula is developed by a league member, no one should have an advantage over anyone else in terms of knowledge of how players will project. I don't see any way that it could possibly be fair otherwise.
Astros wrote: "But after the Exco reviews, the formula will be made public to the league, correct?"
Uncertain. I'd definitely share the general philosophy and process, but the specific details, I'm not sure. I'm of two minds on it.
On the one hand, I like the idea of approaching it as an open-source type of project, not just for the league's sake, but because I'm a big proponent of that sort of approach to development.
On the other hand, I've put an absurd amount of time into this project; it dwarfs what I put into setting up the league and keeping it going in the first place. I've been having dreams about doing Excel work and I've gone cross-eyed more times than I want to think about. And while I have absolutely no reservations about giving away the finished product to anyone and everyone who wants it, giving out the details of every step in the process is something I'm less sure of.
It's also not as simple a thing to share as "Hey, here's the formula... (H+BB+HBP)/(AB+BB+HBP+SF)". I've got over a dozen Excel workbooks I'm using so far, and that's just for the offensive projections for established major leaguers. I expect to add several more for the hitters still and haven't yet added pitching, defense or minor leaguers.
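For reference, the formula quoted there is the standard on-base percentage, (H + BB + HBP) / (AB + BB + HBP + SF). A minimal sketch in Python follows; the function name and the sample stat line are illustrative only and have nothing to do with the actual workbooks.

```python
def on_base_percentage(h, bb, hbp, ab, sf):
    """Standard OBP: times on base divided by AB + BB + HBP + SF."""
    denominator = ab + bb + hbp + sf
    if denominator == 0:
        # No qualifying plate appearances; return 0.0 rather than divide by zero.
        return 0.0
    return (h + bb + hbp) / denominator


# Hypothetical season: 150 H, 60 BB, 5 HBP, 500 AB, 5 SF.
obp = on_base_percentage(h=150, bb=60, hbp=5, ab=500, sf=5)
print(round(obp, 3))  # 0.377
```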
At this point the approach toward these projections is actually really simple.
First, I make sure all the stats I use are park-neutral. I'm using a blend of analysis sources for that data.
Second, I apply a very basic age progression rate to the last three years of park-neutral statistics.
Then I generate
My next steps in the full process are to polish up the pitching stats for release, hit the defense, then tackle the minor leaguers. I also want to improve the age progression and try to make it more position-specific, to see what sort of patterns that generates in the progression of catchers v. 1B v. SS, etc. I'm also looking at adding a Defense Independent element to see what that does to things. Any time I finish a step, I'll be posting the results here immediately for review and feedback.
One other thing: one of my goals in producing the SPaNKs is that they'll be easy to produce going forward, meaning that I should be able to produce a set of 2011 projections within a couple of days of the end of the 2010 season (pretty cool, eh?). The framework should allow the data to just be dropped in pretty easily (I hope). If it would make members feel better, I could abstain from trading during the period when new projections are first being produced, rather like the signing freezes we used to have.
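To give a rough feel for the two steps described earlier in this post (park-neutral inputs, then a basic age progression over the last three years), here is a minimal sketch in Python. Everything specific in it is an assumption for illustration: the peak age of 27, the 1% yearly change, the equal weighting of the three seasons, and the function names are all hypothetical and are not the SPaNK values.

```python
def age_factor(age, peak_age=27, yearly_change=0.01):
    """Hypothetical multiplier for one year of aging.

    Players younger than peak_age improve slightly; everyone else declines.
    """
    return 1 + yearly_change if age < peak_age else 1 - yearly_change


def project_rate(seasons, next_age):
    """Project a rate stat for the season in which the player is next_age.

    seasons: list of (age_that_season, park_neutral_rate) tuples.
    Each season's rate is aged forward one year at a time to next_age,
    then the aged rates are averaged with equal weight (an assumption).
    """
    aged_rates = []
    for age, rate in seasons:
        for a in range(age, next_age):
            rate *= age_factor(a)
        aged_rates.append(rate)
    return sum(aged_rates) / len(aged_rates)


# Hypothetical veteran: a .300 park-neutral rate at ages 30, 31 and 32.
veteran = project_rate([(30, 0.300), (31, 0.300), (32, 0.300)], next_age=33)

# Hypothetical youngster: the same rate at ages 22, 23 and 24.
youngster = project_rate([(22, 0.300), (23, 0.300), (24, 0.300)], next_age=25)

print(round(veteran, 4), round(youngster, 4))
```

With identical inputs the veteran projects below .300 and the youngster above it, which is the whole point of the age step; a position-specific version would only need a different age_factor per position.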
- Padres
- Site Admin
- Posts: 4796
- Joined: Sat May 13, 2006 1:00 am
- Location: Wells, Maine
- Name: Jim Berger
Andy Tracy .384/.430/.491 with 23 triples and 0 HRs ... SIM Super Star to be - or really wonky?
Andy Tracy, INF: An International League postseason All-Star in 2009, Tracy, 36, spent the entire season with Lehigh Valley. He led the league in walks (74), finished second in home runs (26) and RBI (96) and was fourth in runs scored (76). Tracy also appeared in nine games with the Phillies, where he made one start and hit .500 (4-8) as a pinch-hitter.
Last edited by Padres on Sun Dec 13, 2009 6:11 pm, edited 1 time in total.
My only problem with this is it's a purely stat-based system. It can't take into account a guy who played a year while injured, and it drastically hurts a guy who has one off year. I feel like any system should have a bit of common sense built into it as well. Though, admittedly, that is often a flaw some projections have as well...
Giants wrote: "My only problem with this is it's a purely stat-based system. It can't take into account a guy who played a year while injured, and it drastically hurts a guy who has one off year. I feel like any system should have a bit of common sense built into it as well. Though, admittedly, that is often a flaw some projections have as well..."
It does consider injuries to some extent already: a crappy injury year is likely to be well short of a full season, which will temper the effect on the numbers.
The thing about "off years" is that they could also be the beginning of the end as well, we don't know until we look back later. The use of 3 years worth of statistics I hope balances out the off year factor and I REALLY don't want to open the can of worms involved with "judgement calls" on individual players. That's a shitstorm that I can see from miles away and plan to stay far, far away from.
Frankly, we don't really know how DMB or ZiPS come up with their numbers. As much as we can all espouse the value of scouting and case by case analysis, it introduces a huge potential for bias, which in this case, is not something anyone wants.
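To illustrate the point about a short injury year having a tempered effect, here is a minimal sketch in Python. It assumes the three seasons are pooled by playing time, which is one plausible reading of the post and not a confirmed detail of the system; the seasons and the function name are made up.

```python
def pooled_average(seasons):
    """Batting average over several seasons, weighted by at-bats.

    seasons: list of (hits, at_bats) tuples.
    """
    total_hits = sum(hits for hits, _ in seasons)
    total_ab = sum(at_bats for _, at_bats in seasons)
    return total_hits / total_ab if total_ab else 0.0


# Hypothetical player: two full seasons at .300, then an
# injury-shortened season of 150 at-bats at .200.
seasons = [(165, 550), (165, 550), (30, 150)]

pooled = pooled_average(seasons)  # each season counts by its at-bats
simple = sum(h / ab for h, ab in seasons) / len(seasons)  # each season counts equally

print(round(pooled, 3))  # 0.288
print(round(simple, 3))  # 0.267
```

Because the bad year covers only 150 of the 1,250 at-bats, it drags the pooled figure down far less than it would if every season counted the same.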
It's not just off years, what about fluke years?
My only point is I don't think a purely statistical approach is a smart one. Is there anyone out there that projects purely based on statistics/formulas? I doubt it. These guys who make it to the point where they are asked to make projections are there because they are baseball experts. I know with all the egos in this league many people think they are experts themselves, but let's be serious. As much as there are flaws with anyone's projections out there, they are done by experts. I would rather have that than any amateur system (no offense, Bren). We can work on the flaws of those systems rather than just throw them out completely. ZiPS doesn't take into account park effects? Let's use the park percentages from DMB and create a formula to do that. I think creating a new system, especially one better known by only a select few in the league, is not a good idea. Just my two cents; I know with this league it doesn't matter. Not going to check back on this thread, so don't bother responding.
What about Fluke years? What’s the difference between a fluke year and an off year? They’re two sides of the same coin, one is good, one is bad. It might be the start of a trend or it might be just a fluke, we don’t know until a year later. I’ve already accounted for that to some extent. There are indicators in the stats which can give us a hint as to whether a player was truly better/worse or just lucky/unlucky, and I fully intend to make use of those indicators, but anything beyond that is a guess, pure and simple.
As for not liking to rely on stats…. PECOTA is arguably the best system out there, does anyone think the stat heads at BP are sitting around and guessing as far as who will perform well and who won’t? Absolutely not, theirs is probably the most purely statistical (and most complex) set of projections out there. DMB wasn’t making guesses or following hunches either. I’ve never seen good write-ups on how CHONE or ZiPS are produced but I can guarantee you they’re not following hunches on who will improve and who will decline. It’s all stats. If you don’t want to rely on stats, you’re gonna be SOL in the projection game.
Last season we got shafted by DMB and had to find a replacement. We used ZiPS, not because they’re the best system out there (I’ve heard GM’s here argue the exact opposite in fact) but because they came in a DMB compatible season disk and had some splits (although a lot of players lacked them). I don’t claim to be able to produce a product on par with what DMB and PECOTA produce, but I’m darn sure that I can do better than what Szymborski put out last year. And if I don’t, then we don’t use it. To the best of my understanding the league has made no commitment to any system for the 2010 season, I’m just trying to give us another (hopefully better) option.