Twins turn up gas on Hot Stove early
The Twins, in contrast to several recent off-seasons, have emerged as one of the most active teams in the early off-season. In less than a week since the wrap of the post-season they have made two moves that quickly answered possible questions about how the Twins will build their lineup for 2010 as well as which players they prioritized keeping most.
Unsurprisingly, they picked up Michael Cuddyer's club option for the 2011 season. They were under a time crunch, since they had to make up their minds on the 2011 option year within 5 days of the end of this year's World Series, and they retain for a relatively competitive price a player who will be 32 in 2011 and who is coming off a very productive and cost-effective year. Their choice boiled down to letting him walk as a free agent after next year, having paid him $24 million for the three years from 2008-2010, or picking up the option and paying him another $10.5 million for the final year when he will be 32. For the $8 million annualized salary they paid him in 2009 they got what was likely an outlier of a season, but getting 32 HR, 73 XBH and 94 RBI for $8 million is an extremely high return on their investment. While his career totals suggest that his SLG of .520 in 2009 is likely about .050 higher than what the Twins can expect out of him in 2010 and 2011, this move was a no-brainer from both a public relations standpoint and a team building standpoint. Choosing not to commit beyond 2010, at a relatively affordable price, to the player who is the only right-handed power bat on the team and who is coming off a stellar, 30+ HR season would show little commitment to building on this year's success in 2010 and beyond.
As the Twins move into their new, largely publicly funded stadium, an important PR goal will be showing that the increased revenues from higher ticket prices, team-controlled luxury suites, et cetera, will be used to keep other teams from poaching our best players once they hit free agency, which is what has kept us in constant semi-rebuilding mode. While Cuddyer is a nice start, obviously the real litmus test here will be whether Joe Mauer can be re-signed. More important than the PR relevance of the move, though, at 32 Cuddyer's production will likely not be significantly worse than his career mean, and with inflation between now and 2011, $10.5 million for a dependable, above average right-handed power bat will at least be justifiable if not somewhat of a bargain, especially considering the need to insert a righty into a lineup whose most talented hitters are all left-handed--Mauer, Morneau and Kubel. The Twins thus keep an important role-player and a guy who, while far from a Gold Glove fielder, has both an above average arm in right and some defensive versatility, which he displayed by doing a competent job filling in for Morneau at first base at the end of this last season.
Much less predictable, and a much less typical move for the Twins to make, especially so early in the off-season, was a somewhat unusual deal in which the Twins sent a young but still very far from developed outfield prospect, Carlos Gomez, to the Brewers for former All-Star shortstop J.J. Hardy, who struggled mightily this past season after appearing in 2008 to be part of a core of young talent around which the Brewers would build their team.
While neither player seems like particularly attractive trade bait (Hardy is coming off an abysmal year and Gomez has not blossomed into a productive hitter since being acquired in the Santana deal), the trade does actually make a fair amount of sense for both clubs, who may both benefit, and in my opinion the Twins in particular could have pulled off a real coup here even if the likely outcome is marginal improvement. I believe this trade has more inherent risk for the Brewers, since Hardy's abilities have been borne out by real world experience and his performing at a high level would be a return to a previous level of performance as opposed to a breakout season, but either team could in retrospect appear very foolish if one of these players surprises and plays especially well. However, I think it is more likely that a year from now this trade will have improved both teams in a measurable but not profound way and will not be seen as a mistake (at least not a major mistake) for either front office.
While Gomez was the most prominent player acquired in the Santana trade and the only player from that deal to see significant MLB playing time, he was in practice a 4th outfielder, and with usual DH Jason Kubel able to play a passable if mediocre right field when an outfielder is injured, he was really one of 5 viable options in the outfield, of which he was by far the weakest at the plate. While his fielding was so stellar as to make up in large part for his lack of production at the plate, and in my opinion justified his playing time over Delmon Young when the latter was struggling offensively, Gomez will likely never be more than a defensive specialist center fielder on a team that lacks a more offensively productive player at that position unless he significantly improves his plate discipline.
While Gomez exuded sheer athleticism and at times was an extremely exciting offensive player (for instance, hitting for the cycle in 2008 and often taking an extra base on sheer speed), in just over 1000 big league at bats he's put up rather mediocre numbers, unable to translate his athletic gifts into MLB production: batting just .248, getting on base in fewer than 3 of 10 trips to the plate (.292 OBP) and OPSing a mere .638. As he will only be 24 next season and has already been playing in the bigs for 3 seasons, it might be the case that he has suffered from overly fast promotion and that he may yet develop a better ability to translate muscles and speed into power, on base percentage, and base-stealing prowess, but so far he has not shown signs of being on the cusp of becoming an important MLB talent.
In Hardy the Twins acquire a player who has proven that he can put together a productive full season at the plate while playing at an above average level at the crucial defensive position of shortstop. He played nearly all of 2007 and 2008 (151 and 146 games respectively), putting up a combined line of .280/.340/.470 over those two full seasons, hitting 26 and 24 HR and making the All-Star team in 2007. In those two seasons his fielding ranged from above average (+7 in 2007, 10th best in MLB at SS, based on The Fielding Bible's +/- system) to outstanding (+19 in 2008, 4th in MLB).
Even if his 2009 struggles at the plate were more than a fluke (his line at the MLB level last year was a disappointing .229/.302/.357, which earned him a demotion to AAA 2 days before he would have accrued enough service time to qualify for free agency after 2010 instead of after 2011), he nets the Twins at least a possible medium-term solution at shortstop. Aside from Orlando Cabrera's brief tenure there in this year's pennant race, that position has been a constant source of instability, featuring a revolving door of players, none of whom were particularly talented hitters, ever since Jason Bartlett was dealt along with Matt Garza for Delmon Young and Brendan Harris.
A note on Hardy's demotion last year: while going from being an All-Star to earning a demotion to AAA doesn't generally make a player more attractive trade fodder, Hardy's demotion could have been strategic on Milwaukee's part. Alcides Escobar's breakout year gave the Brewers the flexibility to demote Hardy, which resulted in the Brewers, or now the Twins, maintaining control over Hardy for two years instead of one, surely a huge factor in this trade.
This trade does not look unbalanced for either team from the outset, as both teams traded from positions of relative strength. The Brewers gave up a player who might have become a divisive clubhouse presence and would likely have ended up a relatively expensive back-up shortstop to the youngster Escobar (Hardy is set to make $4.65 million in 2010 and has another arbitration year in 2011). The Twins will retain control of Hardy for at least two years and will get two full seasons to see whether 2009 was flukishly bad and whether, even if he can't repeat the .821 OPS of his 2008 season, a .750ish OPS-hitting, above average fielding SS could be part of their long-term plans.
In exchange for this buy-low addition, the Twins give up 4 years of control of an athletically gifted prospect who once was projected to have a very high ceiling, but who has seen a lot of that promise fade after about 2 full seasons' worth of at bats over 3 years at the MLB level, where his production has never surpassed mediocre for any extended period of time. On the other hand, while the Brewers are not exactly lacking for an outfield, Mike Cameron is getting older, and having Gomez on the bench gives the Brewers a number of advantageous options. His speed, stellar defense and skill at bunting for hits (bizarrely accompanied by an unacceptable lack of skill at executing a sacrifice bunt) all mean he could find a niche as a bench player in Milwaukee, potentially being used as a pinch runner, late game defensive replacement and perhaps as a pinch hitter in certain situations with less than two outs where the Brewers want to put pressure on a team's infield. He also gives the Brewers outfield depth they were lacking by providing top-tier defense and at least intermittent offensive production as a potential replacement for an injured player.
Gomez's speed and ability to read the ball off the bat have led him to rack up +/- numbers of +29 in full-time play in 2008, easily #1 in MLB, and +17 in part-time play in 2009, #4 in MLB while only playing about 2/3 of a season's worth of innings. While this does not fully make up for his below-average plate performance, the number of runs he takes away from other teams relative to the average CF is significantly above average and partially compensates for a lack of consistent output at the plate thus far in his young career. All in all, even if he doesn't break out and develop significantly, he gives the Brewers a very solid and versatile bench player and a serviceable backup if they have injuries in the outfield.
After the Garza/Bartlett for Young/Harris debacle, this is really the second major trade of the Bill Smith era that is a talent-for-talent deal where both teams could have walked away but in the end thought they would each benefit; the Santana trade certainly might have played out better, but the strategic disadvantage of being unable to re-sign Santana forced Smith to strike some deal sending away his ace pitcher. While the talent involved in this deal is not as valuable or high-profile as in the Tampa Bay deal, in this case the Twins are on the right side of a trade where a proven but somewhat inconsistent MLB talent is dealt for a young, unproven outfield prospect who has shown promise but not real production in the show. In the near term the Twins gain two years of a cost-controlled player who could easily shore up shortstop, or at the outer edge of probability provide above average play there, at a position that has been a consistent liability for the Twins. If Hardy can prove that 2009 was in fact merely a fluke year and Smith can extend him, Smith may come away from this deal looking very smart indeed. If Hardy's struggles prove more permanent, then the Twins are basically where they were before, since both players on the left side of their infield (1 year rentals Orlando Cabrera and Joe Crede) are almost certainly not going to be re-signed as free agents.
In exchange for this low-risk, high-reward potential Smith dealt a nice player in Gomez, but one who, even with injuries in the outfield, was hard to find playing time for due to the glut of more talented outfielders the Twins carry. While Gomez's defense was stellar, unless he can make a drastic improvement at the plate, hitting for more power and a better average and drawing more walks, he will never be the type of valuable everyday CF that the Twins already have in Denard Span.
While the Twins' outfield defense with both Span and Gomez patrolling made it seem that nearly any fly ball that was not a homer would be caught, the Twins will go forward with a more productive offensive outfield and a center fielder who is no defensive slouch in Denard Span, accompanied by Young and Cuddyer in left and right, with Kubel able to fill in if there is an injury.
So while both teams seem like they will be improved by this deal, as they trade from areas of redundancy for areas of want, the Brewers seem to be at much greater risk of looking foolish a few years from now. Hardy was an All-Star the year before last, is a very talented fielder, and if he proves that 2007 and 2008 are better reflections of his actual skill level than a lost year in 2009, the Twins could come away from the deal with a very talented everyday player at a difficult to fill position. If Hardy does not pan out, the Twins will only come away looking like losers in this trade if Gomez dramatically improves his offensive performance, which would mean improving such basic skills as identifying strikes, taking more pitches and working counts, predicting what pitch is likely coming, and protecting the plate with two strikes, and developing power to boot. It's a tall order, but time is on Gomez's side: he's still only 23. However, the chance that a 28 year old former All-Star will play at a reasonably high level after one off year seems much greater than the chance that an offensive work in progress will turn raw athleticism into consistent production.
It's sad to see Go-go go (my one very nice Twins jersey is a Gomez jersey, ironically enough), but I'm very excited that Smith pulled off a seeming coup in getting a player with the upside and proven record of Hardy for a guy who has not shown any real reason to expect development into a team-changing producer.
While I loved Gomez's defense, hustle and occasionally very exciting play, he has never shown an ability to cope with MLB pitching in 1000+ at bats over three seasons; he still does not see enough pitches, looks confused or simply over-matched throughout at bats, and frequently winds up striking out after 3 or 4 pitches, hacking wildly at strike three or conversely being frozen by a pitch far too close to take. Given the shortstop situation in Milwaukee, along with the Brewers' keen decision to keep control of Hardy through 2011, it is obvious that they had an eye on trading him this off-season. However, for Smith to work out a deal for a former All-Star SS in exchange for a redundant and purely speculative talent whose entire hitting approach is seriously flawed is something of a coup, if a small one, as it would seem that Hardy's 2007 and 2008 production, despite his recent struggles, would be enough to net at the very least a single, younger blue chip prospect or a group of prospects. That the deal was worked out without sending away any young pitching is a pleasant surprise. It could turn out that this deal is forgettable if Hardy turns out to be an average or below average SS for two years and Gomez is a passable bench player or fill-in in Milwaukee, but the likely upside seems to be all on the Twins' side. Smith's second decision to pull the trigger on a major trade, in short, seems to have been informed by the backward risk-reward picture embodied in the Garza trade, as this time we are highly unlikely to be the team regretting this trade in a year.
Labels: baseball, twins
The foolishness of populist fervor for CEO pay cuts
Populists left and right are supporting "Pay Czar" Kenneth Feinberg's slashing, by about 90%, of the pay of top executives at 7 companies that have been propped up by a ton of taxpayer bailout money, as well as rules for those firms that are designed to incentivize long-term success by making sure that, for instance, stock-linked compensation is pegged to long-term stock performance, presumably to discourage short-term bets that promise a quick buck by piling on long-term systemic risk. That left-leaning folks would support any move to slash CEO pay is unsurprising, but that folks who are generally in favor of a free market are on board with this is a bit more surprising, although not particularly hard to understand. Bill O'Reilly gave the typical conservative populist justification for this move in a discussion with Neil Cavuto, whose concerns about a slippery slope were dismissed by O'Reilly as paranoid and misplaced.
The conservative explanation generally goes something like this: these executives' companies were saved by the government, and therefore the taxpayers are essentially the ultimate shareholders and the government is sort of a super-board of directors that looks out for those ultimate shareholders. By making sure that money isn't wasted paying these executives the gaudy sums earned by their counterparts at firms that don't owe the government money, the government is ensuring that these firms run lean and mean and pay the taxpayers back more quickly. The reason a conservative like O'Reilly supports the move is that he doesn't want to see taxpayer dollars squandered lining the pockets of fat cat executives who, by inference, when they work for firms that are not subject to this government regulation, are compensated at a level roughly 10-fold greater than what their performance justifies.
Goldman Sachs's notable success, its many powerful alumni in positions of public service, often regulating both their friends and former competitors (e.g. Jon Corzine being a Senator and Governor and Hank Paulson being the head of Treasury), and its receipt of public money (both as one of the 9 banks forced to take government capital to shore up public confidence during the crisis last fall and as a very large creditor of AIG) all mean that it is the poster child for the greedy, corrupt firm whose executives are paid too much because of a friendly system whose rules they control. Despite being held up as one of the worst offenders against any form of social conscience, Goldman, having paid back its government obligations, is more or less back to business as usual, both in terms of its business operations and, more pertinently to why it's being vilified, in terms of compensation for top employees of the firm. After 2008, when they paid out very little (relative to other years) in base comp and no executive bonuses (they were holding TARP cash at the time), they will be back to paying out millions upon millions in comp for top execs this year, with big bonuses too, after a boom year for the firm in which their stock gained ground amid a bull market. One can see why this situation pisses people off: who decides how much Goldman execs get paid? Goldman execs. The fox doesn't just guard the henhouse... he runs it.
The thing is that Goldman Sachs has been a publicly traded company for a decade now, and while its top execs are still called "partners" the name is purely symbolic: the "partners" are not a small group of owners who can act without oversight, but rather they act with the oversight of a board of directors and ultimately the votes of their shareholders.
The situation is complicated by the fact that Goldman's IPO was so recent that a larger than normal fraction of shares is held by current or former Goldman execs, but at other large companies like Morgan Stanley and dozens of other huge corporations that have been publicly traded for a long time, the large majority of shares are generally in the hands of people who don't work for the company. If the route to profitability were to slash executive pay, it seems like that plan would have been pushed for by shareholders at one of the hundreds of institutions set up in this fashion, doesn't it? And yet there has never been a successful push at a major publicly held company, even on Wall Street, even in the wake of this most recent financial crisis (which in the popular narrative was caused by the incompetence of these well paid plutocrats), in which a company's shareholders voted to slash executive compensation in a manner even remotely similar to the government plan. If it made sense, though, you'd think shareholders would be all for this; money not paid out in compensation, especially at a firm without any significant capital costs like a financial firm, would be essentially pure profit and could be distributed to the shareholders as dividends. Yet even as income inequality has risen, as the income of the very top portion of earners has exploded, and as examples have arisen of extremely dire consequences resulting from decisions made by well-paid, strongly recruited chief executives, still no company has taken this route. Many of the firms that were run by these very well paid individuals either took a big hit in earnings and market capitalization, had to be saved by government intervention, or went completely belly up, and yet shareholders at GS, Morgan Stanley or any of the hundreds of other corporations in other sectors fail to see the wisdom of demanding that their employees be paid 10% of what they have previously been paid.
Having known many Yankees fans who were generally left leaning, I find it somewhat surprising that most people I've met who oppose the discrepancy between the pay of, say, a chief executive at a large manufacturing corporation and the pay of a blue collar worker at that same company have never complained to me about the discrepancy between the pay of the players on the Yankees and, say, the food vendors at Yankee Stadium. Now I have no problem with this discrepancy; I recognize that the ability to play baseball at the level of a Major Leaguer is an incredibly rare skill. Ironically, of course, their skill is at playing a trivial game, but since millions of people enjoy watching that game and will pay for the privilege, their skills generate wealth and improve the quality of the lives of people who happily exchange hard earned money for game tickets, Yankees caps, etc. But perhaps it is merely because people intuitively understand that Alex Rodriguez's ability to hit a baseball is exceedingly rare that they do not complain that he is paid tens of thousands of dollars per hour of game time. They understand that the idea that the Yankees would try to slash costs to improve their team is ludicrous. As in real life, in Major League Baseball human capital is mobile; an inability to pay people well will lead to a lack of talent (see the consistent failure of teams like the Pittsburgh Pirates that chronically cannot pay as well as the Yankees), and people interested in seeing the Yankees be successful on the field and at making money obviously think that their massive payroll to get and keep top talent is justified.
In the realm of finance, it seems, people are under the impression that CEOs are replaceable, a dime a dozen, interchangeable, and that systems of evaluating and pricing talent in positions of crucial importance for the success of the organization are disconnected from incentives like seeing the firm succeed or control risks.
But the fact that the Yankees sometimes misallocate large amounts of payroll (see Carl Pavano), that tales of golden parachutes on Wall Street show such systems to be obviously imperfect and sometimes absolutely wrong, and that the Yankees don't always win the World Series just because they pay their players several fold more than most other ball clubs doesn't mean that the entire premise of incentivizing performance with pay is flawed.
For now, the fate of these 7 firms will be interesting to watch. Because they are financial wards of the state, the mobility of their top executives and the autonomy of the firms' employees, both in deciding whether to stay or leave and in day to day operations, are unclear; it would seem to me that the best executives, who suddenly find themselves being compensated for the foreseeable future at a level that a competitor could easily double, triple, or increase nearly 10-fold, would jump ship if they could. If they can't, their only incentive to try to rise within the organization and put forth the extraordinary effort of the stereotypical workaholic businessman glued to his BlackBerry would seem to be maintaining their reputation, so that their pay will rise again once they can either be paid in the old way by their firm after it gets out from under government regulation or leave their current firm. So at best there are reasons for the best workers to try to leave the firm, opportunities for other firms to poach talent at bargain basement prices, and only weak incentives, contingent on the idea that their pay will someday be back to near what it would have been, to continue putting forth the full effort that would increase profits and get the taxpayers their money back. The automakers regulated under this plan can be discounted; they will never be profitable for reasons unrelated to executive compensation.
But the performance going forward of Bank of America will set a bad precedent no matter what happens. If the pay cuts decrease the quality of employees or the quality of their work, then capital will flow out of the firm, the firm will be less profitable, and taxpayers will wait longer to get their money back. Further, if there is a massive flight of talent from these firms, then the incentive for the government will be to somehow level the playing field and limit executive pay more broadly. If the firms exceed expectations, then the government will be emboldened to curb CEO pay more broadly and, more worrisome, the model of a firm "too big to fail" will seem to have been proved viable and government will become inextricably more intertwined with business. The latter scenario seems very hard to fathom, however, and I doubt that this experiment will do anything but vindicate the system, imperfect and probably permeated by a significant fraction of misses though it is, of rewarding with outstanding compensation the outstanding abilities held by a limited number of highly talented individuals with a very rare skill.
While no one questions that Alex Rodriguez has a very rare skill that is therefore worth very high compensation in a free market, seeing why some white guy in a suit is so special is less easy to accept; if it weren't him it would be somebody else, it seems. But the number of individuals with the intelligence, experience, leadership and interpersonal skills to successfully create business strategies for an extremely complex business like an investment bank that deals with billions of dollars daily is small, and those who do so successfully must be compensated well to maintain competitiveness in a global marketplace.
While there will be unsuccessful individuals, and they should be sacked, and while it's no more fair than the gaudy sums that Alex Rodriguez is paid for playing a children's game very very well, you should be paid a competitive rate if you possess the ability to successfully helm a company that facilitates trillions of dollars of global capital flows in a given year and creates massive amounts of wealth for those who work for the firm and, in the case of, say, a successful investment bank or fund management firm, for the businesses that are its clients or the investors who trust it with their money, often including large institutional investors such as state or union pension funds. Is it justified? It depends. But while money talks, it also walks these days. If the U.S., for instance, were (hypothetically, I stress) to cap CEO pay, it seems likely that foreign firms like UBS and Deutsche Bank would feel that the skills of the executives at competing firms like Morgan Stanley and Goldman were valuable enough to pay them a large amount of money to jump ship. It's like the Yankees and Red Sox now, but we could make it like the Yankees and Pirates. Why we would think that would be the way back to competitiveness, I have no idea.
One final comment. To paraphrase the man behind this decision to cut CEO pay at these government-indebted firms, Special Master for compensation Kenneth Feinberg: he doesn't want to be called the Pay Czar because that suggests imperial powers, whereas his job has involved months upon months of negotiations and meetings and haggling, and this number and plan were reached in a process that resembled a negotiation. There's only one problem. The final decision rested in the hands of one Kenneth Feinberg, and while the companies can appeal this decision, the appeal goes to... Kenneth Feinberg. That sounds like an imperial power to me, no?
Who is John Galt?
Labels: ballyhoo, baseball, economics, government, politics
Update and an attempt at Instruction Manual Version 0.1
Intro, caveat, info for geeks
The good news first: I've made significant progress towards getting the program launched towards completion. (If you just can't wait here's a link to the test version of the application, which has no guarantee of functioning as intended.) While probably not dangerous to your computer (it's always practiced safe sex with other programs, so the odds of you getting a virus are pretty low), the program is not yet actually functional (but certain exciting new parts are) and you should probably read on to know where to expect it to stop working and what the issues are... skip ahead to the next section if you're just interested in using the site and not the whole back-story...
The bad news: as mentioned, it's still not even at alpha level functionality, in the sense that while a lot more stuff WORKS there isn't a cohesive whole that *does* anything yet, per se, but you can definitely get a better sense now of what the user experience will be like and how the thing will work than you could before some recent coding.
Piece of bad news #2: if you use Internet Exploder the site will act funky and not quite as described below, since I haven't yet worked out the cross-browser bugs. There are a bajillion reasons you should use Firefox and not IE (and I say this as a Microsoft-defender and Mac-hater) and you can find out how and why you should switch here. So while I've discovered a few minor bugs that pop up infrequently in Firefox (most of which, as far as the JavaScript goes, go away by reloading the page), and I'd be particularly interested to hear about the site just not working in older versions of FF, it should work for the most part in FF. In IE it won't work properly, and I know this and will get to that.
That said, let me get this out of the way right here. Officially as of this blog-post we're open-source!
Click here to download a .zip which exactly mirrors the file structure of all the various files for the site (which, online, are all just dumped unceremoniously into the /ta directory for the most part, except for a few image directories). For any tech-geeks interested in helping out, shoot me an email at adam dot litterman at Gee, mail! dot com with what you're interested in working on... I'll write a bit towards the end about future directions where I want to take this thing, which might whet your interest. As for any JavaScript gurus, the cross-browser issue has to do with the different ways in which IE7 and IE8 parse the Document Object Model. It's a bit frustrating b/c the bug is in a beta version of Yahoo's very awesome free suite of JavaScript user interface tools; I mean that only in the sense that I have no clue how to go into their code and hack it, and I (irrationally) hoped that they had written their code so it was at least IE7/8 & Firefox 3.x compatible, so I have to rewrite a bunch of my code at a fairly deep level. Definitely for the final product, while I obviously will get it to work in IE, the real focus on having everything spot on will be for Firefox since IE sucks--but that's beyond the scope of this article.
New features and instructions on using the site!
OK, so what's new? Well, the basic system of user interaction and identification with the site is essentially completely functional (the one exception is the password reminder page, which I haven't had time to write yet... if you need your password send me an email at adam dot litterman AT g to the mizzail and I'll friggin' look it up for you). But anyways, aside from that item on the to-do list, you can, importantly, sign up and create an account (I decided that you're going to have to at least pick a username and password; for technical reasons to do with how the database will work, it's just too damn confusing to keep track of people without a login page to kind of lasso who they are).
As you'll be warned by the signup form, this ain't online banking... while I'm not a cyber-criminal, it's my database, so don't use your friggin' e-mail address and its password. You don't HAVE to give a real e-mail, but I'm hoping this site will eventually be used by more than a few individuals, so there will be an automated password reminder system, and, you know, the address will need to be real if you want that to work. You can also log out, which deletes your cookies and protects your sensitive, sensitive data about how you think Jermaine Dye will hit for the rest of the year.
OK, great, you can log in, then what? Well, once you sign up and log in you'll see that the top menu bar goes from Joe Mauer's gorgeous visage to the same image with some translucent buttons showing up, the most important of which, for now, is the one that says "Configure a team." This is what the site is built around, and here's how the idea works. You click "Configure a team." You're brought to a page that contains the logos of all 30 MLB teams. Click on any one of them and you're brought to the heart of the simulation, the team configuration page.
There are three main sections to this page. At top is a box with the players currently on the active roster (this is up to the second from MLB.com...and yes, I'm eventually going to try to figure out how to get the whole 40 man roster up there so if for some reason Justin Morneau's on the DL for 2 weeks you can still put him in your simulation). Each one appears in a box with a colored gradient for a background (check out the Yankees'. I don't like the Yankees.) Below the 25 man roster are two boxes: one containing room for 9 players to form a batting order, and another containing a general schematic of how managers use pitchers, with 5 starters, a set-up man and a closer, and then typically 5 other pitchers (sometimes 4, sometimes 6) who are used less the less successful they are. How to populate those spaces will be explained in a minute.
First, it's important to note that to stick a player into a role on the team you will first need to "configure" him. You do this by clicking the gray button labeled "CFG" that appears to the left of each player's name. It pops up an interactive window which allows you to see both his performance this year and over the previous 3 years (or however long he's played in MLB if he's younger), which are all combined to show you a projection for his performance. Please note: this projection is a projection over 700 plate appearances. That is equal to about 4.3 plate appearances per game over a 162-game season, so only players who miss no time and hit fairly high in the lineup achieve 700 plate appearances. If you can wrap your head around thinking of David Eckstein like that, then you can treat his HR numbers accordingly--if not, just look at the rate stats like batting average and so forth. In reality, the projections for both hitters and pitchers are just there to help you intuitively understand the projection you're creating; the actual nuts and bolts are in the Markov chain mathematical model behind the simulation. Something similar happens with pitchers.
Once you click "CFG" you'll see that in addition to this year's stats, the previous years' stats, and this year's current projection there are also two buttons; one says "Configure Prediction" and one says "Use Default Projection." If you click on the former, 3 sliders will pop up which intuitively shape things like a hitter's ability to hit for power or a pitcher's ability to throw with control. As you slide these sliders up or down (left or right, technically) you can see in real time how the underlying values that will go into the simulation would translate into real-world type statistics like batting average, slugging percentage, on base percentage, ERA, WHIP, etc.
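To make the 700-plate-appearance idea concrete, here is a minimal Python sketch of scaling a hitter's raw counting stats to a 700 PA projection. It is an illustration only, not the site's actual projection code: the function name `project_to_700` and the sample stat line are made up for this example, and the site's real projection also blends several seasons, which this sketch does not attempt.

```python
# Illustrative sketch: scale counting stats to a fixed 700-PA projection.
# Names and the sample stat line are hypothetical, not taken from the site.

PA_PER_GAME = 4.3      # figure quoted in the post
SEASON_GAMES = 162
PROJECTION_PA = 700    # projection length used by the site


def project_to_700(counts, plate_appearances, target_pa=PROJECTION_PA):
    """Scale raw counting stats (HR, BB, ...) to target_pa plate appearances."""
    if plate_appearances <= 0:
        raise ValueError("plate_appearances must be positive")
    scale = target_pa / plate_appearances
    return {stat: value * scale for stat, value in counts.items()}


# 4.3 PA per game over a full season comes to just under 700 PA.
full_season_pa = PA_PER_GAME * SEASON_GAMES

# A made-up hitter with 10 HR and 40 BB in 500 PA.
sample = project_to_700({"HR": 10, "BB": 40}, plate_appearances=500)
```

Rate stats such as batting average are unchanged by this kind of scaling, which is why they are the safer thing to read for a part-time player.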
Once you're satisfied you can click "Save and Close," which will save this projection (for now just in the cache of your browser) and also close the window. Once it's closed, the box of the player you just configured will change in appearance; it will have a new background with a similar gradient but a steel-plate-like texture which is meant to visually connote something grippy, since, as you'll see if you mouse over it, the cursor changes to a four-way arrow indicating that the box can now be dragged to one of the lineup spots. If you're happy with the default projection as it is, you can skip that whole business of futzing with the sliders and just hit "Use Default Projection"; essentially it will be the same as if you'd opened the sliders, done nothing, and hit "Save and Close."
Constructing the lineup is straightforward enough. (Leave space #9 blank for the NL teams--even Cardinals fans. Just because your manager insists on increasing the number of plate appearances his team's pitchers get doesn't mean I'm specifically programming that stupid special case into my program.) The pitchers are straightforward enough as far as the starters, set-up man and closer go, and for the 5 relievers just rank them in order of how often you think they'll be used, so that the guy who always gets the call in the 6th or 7th inning of a tight game is reliever #1 and the guy who only gets in for mop-up duty or the 17th inning is reliever #5. Also note that for now the complexity of pitchers hitting is not accounted for, so pitchers won't stick in the lineup spots even on NL teams (just leave #9 blank for now--I'll simulate a league average #9 NL hitter somehow... hopefully it won't leave too big of an artifact), and obviously position players won't stick as pitchers.
What it doesn't do yet but will do
OK, don't go crazy just yet configuring all 30 teams.
What happens right now, unseen (unless you look at the source code of the webpage and know how to read JavaScript), is that when you drag a configured player onto an appropriate target, the page sends a message to the server, without you leaving the page, containing the mathematical bits and pieces (likelihoods that players will triple or walk etc.) that will make up the model. Again, if you want to read more about how a Markov chain of baseball works, this is a good primer, and a couple other blog posts also discuss it in more depth.
My immediate next step in the project (after finishing this blog post, which will form the text on the intro page of the web-site since right now it's just boilerplate text) is to start working on the MySQL database to keep track of that information. The database is already set up to store the data necessary to maintain usernames, passwords and email addresses, plus miscellaneous other data about which I'm curious and which you can provide or not (like where people who use the site live, generally), for an arbitrary number of users. These databases are lean and mean and can be updated dynamically, so I'm pretty sure I'll be able to set it up so that if you configure A-Rod and drag him to the #3 spot in the lineup and then reload the page, he'll still be there, and then if you drag him from the #3 spot to the cleanup spot and reload the page, he'll be in the cleanup spot. In other words, every time you lock a player into place it'll be automatically saved, so there won't be any need to worry about the page crashing or about closing the window while forgetting to save your work. Once you've configured enough of a team (say, the batting lineup and 4 starters and 4 relievers should be enough--the pitching part is more for completeness than necessity) it should be able to do an interesting and robust--but still relatively simple--simulation.
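The save-on-every-drop behavior described above can be sketched in a few lines. This is only a sketch under loud assumptions: the post names PHP and MySQL for the real backend, while this example uses Python's built-in `sqlite3` module as a stand-in so it can run anywhere, and the table name `lineup_slot`, its columns, and the function names are all hypothetical.

```python
# Sketch of autosaving lineup slots. sqlite3 stands in for the MySQL backend
# the post describes; the table, columns and function names are hypothetical.
import sqlite3


def open_db():
    """Create an in-memory database with one row per (user, team, slot)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE lineup_slot (
               username TEXT NOT NULL,
               team     TEXT NOT NULL,
               slot     INTEGER NOT NULL,
               player   TEXT NOT NULL,
               PRIMARY KEY (username, team, slot)
           )"""
    )
    return conn


def save_slot(conn, username, team, slot, player):
    """Record that `player` was dropped into `slot`, vacating his old slot."""
    with conn:  # one transaction: commit on success, roll back on error
        conn.execute(
            "DELETE FROM lineup_slot WHERE username=? AND team=? AND player=?",
            (username, team, player),
        )
        conn.execute(
            "INSERT OR REPLACE INTO lineup_slot (username, team, slot, player) "
            "VALUES (?, ?, ?, ?)",
            (username, team, slot, player),
        )


def load_lineup(conn, username, team):
    """Return {slot: player}, i.e. what a page reload would show."""
    rows = conn.execute(
        "SELECT slot, player FROM lineup_slot "
        "WHERE username=? AND team=? ORDER BY slot",
        (username, team),
    ).fetchall()
    return dict(rows)


conn = open_db()
save_slot(conn, "adam", "NYY", 3, "A-Rod")   # drag A-Rod to the #3 spot
save_slot(conn, "adam", "NYY", 4, "A-Rod")   # then move him to cleanup
lineup_after_reload = load_lineup(conn, "adam", "NYY")
```

Deleting the player's old row and writing the new one inside a single transaction is what keeps a reload from ever showing him in two spots at once.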
By modeling how many runs pitchers allow per inning and how pitchers are generally used by managers (which, for purposes of simulation, is happily rather orthodox and often stupid), then modeling hitting by treating innings as Markov processes, and simulating N games (say, 10,000), we can get a robust idea of how many runs a team should score and allow against a league average opponent. This is interesting on opening day, since at that point in the schedule this approximation is good enough for a first-order idea of who's going to win what. But look at this year and the schedule: with about 60 games left to play the Twins have 0, count 'em, 0 games left against the Yankees, Red Sox and Rays, whereas the White Sox have a 14 game stretch (including 11 on the road) where they play no teams other than the Red Sox (@ for 4 games, hosting 3), @Yankees (3 games), Twins (at the dome for 3 games) and Cubs (1 makeup game which washes out a travel day off). Simulating against the league average team and saying, well, against a league average team over 60 games the Twins win 32 games and the White Sox win 31, isn't particularly interesting at this point in the season. For something better, I'd need a database that included the entire MLB schedule and some way of modeling the performance of opposition teams in a Markov process, so that each game could be simulated, say, 100 times in order to compute things like expected final standings and the probabilities of teams making the playoffs based on different performances of individual players and different batting orders or uses for bullpen arms. I intend to tackle those problems, but getting the nuts and bolts of the website in working order obviously comes first. Look for this stuff to be in like a 1.0 Alpha version by Opening Day 2010. I think right now I'm at like Version 0.3.12 or something, but who's counting?
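For anyone curious what "modeling innings as Markov processes and simulating N games" might look like, here is a minimal Python sketch. It is not the site's code. The state is (outs, runners on base), each plate appearance is a random transition that depends only on that state and the batter's event probabilities, and the baserunning rule is a deliberate simplification (on a hit every runner advances exactly as many bases as the batter; on a walk only forced runners move). The event probabilities in `LEAGUE_AVERAGE` are illustrative placeholders, not real data, and pitching is ignored entirely.

```python
# Sketch: Monte Carlo simulation of a Markov chain model of a batting lineup.
# Probabilities and the baserunning rule are simplifying assumptions.
import random

EVENTS = ("out", "walk", "single", "double", "triple", "homer")
HIT_BASES = {"single": 1, "double": 2, "triple": 3, "homer": 4}


def advance(bases, event):
    """Apply a non-out event. bases = (first, second, third) booleans.

    Returns (new_bases, runs_scored).
    """
    first, second, third = bases
    if event == "walk":  # only forced runners move up
        if first and second and third:
            return (True, True, True), 1
        if first and second:
            return (True, True, True), 0
        if first:
            return (True, True, third), 0
        return (True, second, third), 0
    step = HIT_BASES[event]
    new, runs = [False, False, False], 0
    # Position 0 is the batter; positions 1-3 are the existing runners.
    for pos, occupied in ((0, True), (1, first), (2, second), (3, third)):
        if not occupied:
            continue
        dest = pos + step
        if dest >= 4:
            runs += 1
        else:
            new[dest - 1] = True
    return tuple(new), runs


def play_half_inning(lineup, batter_index, rng):
    """Play until 3 outs. Returns (runs, index of the next batter due up)."""
    outs, bases, runs = 0, (False, False, False), 0
    while outs < 3:
        probs = lineup[batter_index % len(lineup)]
        event = rng.choices(EVENTS, weights=[probs[e] for e in EVENTS])[0]
        if event == "out":
            outs += 1
        else:
            bases, scored = advance(bases, event)
            runs += scored
        batter_index += 1
    return runs, batter_index


def average_runs(lineup, n_games=10_000, seed=0):
    """Average runs per 9-inning game over n_games simulated games."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_games):
        batter = 0
        for _ in range(9):
            runs, batter = play_half_inning(lineup, batter, rng)
            total += runs
    return total / n_games


# Placeholder per-plate-appearance probabilities (they sum to 1).
LEAGUE_AVERAGE = {"out": 0.68, "walk": 0.08, "single": 0.155,
                  "double": 0.045, "triple": 0.005, "homer": 0.035}

if __name__ == "__main__":
    print(average_runs([LEAGUE_AVERAGE] * 9, n_games=1000))
```

Swapping one hitter's dictionary for a better or worse one, or reordering the list, and re-running `average_runs` is the kind of what-if the site is meant to make point-and-click.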
That's why there's currently a space for "My Sims": while the first alpha version of the web-site that's "done" will have the much more bare-bones simulation upon which all others must be built, the goal will eventually be to have several more complicated simulators. Going beyond simulating who will make the playoffs is getting a little too detailed, and this post is long enough, so I'll leave it at that. But if that piques your interest and you enjoy PHP programming, MySQL and no pay, please let me know and you can help me create something with no chance of financial remuneration! But the bare bones simulator ain't too shabby... try to find another website that lets you see what effect shifting Joe Mauer in the order, or him hitting 30 dingers, has on the won/loss record of the team or the average runs scored. -A.L. 7/27/09
Important note
In order to keep track of your movements at all times and control your mind, er, make the fun website work(!), WE WILL leave dangerous "cookies" of information on your hard drive. If you remember the hysteria from the early 2000s: while seemingly an innocuous way to anonymously and durably identify you when you log in by storing 6 characters of data on your computer, this is actually a form of government mind control and/or one link in a vast international criminal conspiracy, I haven't decided which. All of which is a poor attempt at humor which means: you need to have JavaScript (no shit) and cookies enabled for the site to work. It will even throw errors(!) if you don't have your cookies turned on.
Labels: baseball, databases, instructions, markov chain, markov chain web application
Simulator Status Update
So... considering that I was pessimistic that I would have a working alpha prototype version 0.1, let alone 1.0, done before Opening Day 2010, I'm pretty happy with the progress I'm making. In large part it's due to the very user-friendly User Interface for dynamic web applications. So just to remind you what I'm shooting for and how I hope to get there... I had a Java program in college that simulated baseball using a Markov chain model, a natural way to simulate baseball games and something that I briefly wrote about and which is all over the net, both on sites teaching statistics and, interestingly, now on baseball stat-head sites. For the ultimate analysis of the general dynamics of baseball via Markov chain modeling, The Book has dozens of tables showing expected run values for certain events in certain situations based on Markov chains. But a program written in Java that took input in the form of specially formatted text files and gave output that was interesting (like... run through every permutation of those 9 batters and find the optimal batting order) was just way beyond most people's ability to download, install and actually use in any interesting way. Still, I liked the idea of a Markov chain based simulator where you've got control of which players are going to do better or worse than one would predict (if one were neutral towards their ability to over-perform or under-perform relative to their previous experience) and can then potentially use that to generate models of very complex things like... what a team's record will be based on your projections for the second half of the year at the All-Star break.
And with the convergence of buying the domain name and hosting plan for the purpose of distributing that other code (and just having a website to post files), I decided that making a web application with JavaScript and possibly AJAX on the front-end (that is, the page the user sees) and PHP and MySQL on the backend (a very common combination for building interactive web applications) seemed like a good idea. I wrote up a piece outlining my thinking on what the thing could do and set to work. I'm actually amazed I haven't abandoned the project and have gotten this far. If you go play around with the latest stable version of the application or (and this might be broken at any given time) the actual development version, you can see where it's at. If you take a look at, for instance, this debug-mode Mets page, click on the "CFG" configuration button for Johan Santana, play around with the window that pops up, save your work by hitting "Save and Close" and then look at Johan Santana's name, you'll see that two numbers are displayed: the number of innings the program projects Santana will pitch and how many runs per inning he'll allow. This may not look like much, but what it means is that I've:
- Made the front page that lets you pick a team
- Made dynamically generated pages for each team that load the team's roster on the fly
- Allowed each player on the roster to have his stats and a projection of what he'll do based on his past 3 years of data as well as what he's done so far this year
- Displayed this information in a dynamically generated pop-up window which also allows you to modify a number of intuitive features of how a player might over or underperform the projection
- ...and this is what's new: finished the projection equations and the programming for those widgets, so that once you've configured a player the data is accessible to be... used, somehow.
Later today I'll post on what that means and what is left to do in order to actually make this a viable program.
Labels: baseball, markov chain, markov chain web application, musings
I am sorry, David Kalan
I know what it feels like to live and die with a sports team. While hockey is a sport that I enjoy, particularly college hockey, where Princeton has an illustrious history and some recent success and where the Gophers have been perennially dominant, I feel like I will need to learn to love the Wild, and NHL hockey more generally, since I'm moving to Minnesota. But while I've been to my share of Devils games, they are not one of my teams the same way that the Twins or Vikings or Princeton or Minnesota Basketball, Football, Hockey and Lax are. I enjoyed seeing the Devils win the Cup all three times they did it, and went to two of the games in '03 against Anaheim, but while I like seeing them win, with the Twins and Vikings I am euphoric when they win and apoplectic when they lose. Tonight the Devils were beating the Canes 3-2 in game 7 of the playoffs with 4 minutes left, and then 3 minutes left, and then 2 minutes left, and then a pretty bad gut punch came: a backdoor pass to allow the tying goal with 90 seconds left. I was trying to think of comparably bad gut punches I'd experienced; it's a bit different than my frame of reference since I just haven't watched that much hockey and lax, which are the closest in terms of how instantaneously game-play can change. Also, goals are pretty rare in hockey, so my equating it with game 2 of the ALDS between the Twins and Yankees, where the Twins took the lead and were one half inning away from taking a 2-0 lead back to Minnesota in a best of 5 and then lost that game by 2, wasn't really apt, since that was an immediate loss and, besides, it was not an elimination game. I thought of the 2004 or 2005 Princeton-Yale game, where Princeton lost when Yale scored a touchdown on a gut-wrenching last play of a game that, had we won it, would have given us our first bonfire in over a decade.
We got pretty sweet revenge my senior year though, stormed the Yale Bowl and got our bonfire, so uhh, I dunno, it just wasn't the same. But then, almost unimaginably, with 30 seconds left, 60 seconds after a bad gut punch, came an epic, disgusting, inconceivable, Christ-almighty-there-is-God gut punch. Against the greatest goalie of all time... there were two goals... in 60 seconds... that lost the team a series that they had in the bag and left them with nothing but "Wait til next year." The closest thing I can think of is when the Tigers came back against their perennial rivals the Syracuse Orangemen in the 2002 NCAA Finals and, after scoring 4 goals in a row to tie it, lost on a game-winning Syracuse goal. But Syracuse had been winning, we didn't have the game in the bag, and at that time I wasn't so committed to Pton; my sister went there but I didn't make the decision to commit there for probably another couple months after that game. Dave just experienced literally the worst sports gut punch of any friend I've ever had--well, the Yankees losing the 2001 World Series was probably worse, but I'm not really friends with Yankees fans--so... my sympathies, Dave. As I mentioned to him, if a similar level of gut punch happens to me at home (he was at work, you know, working for NHL.com) where, say, the Twins take a one run lead into the bottom of the 9th on the road in Game 7 of the World Series and then get two outs and then, instead of having a rally build, just give up... two... solo... home-runs... to a couple of light-hitting douche-bags, then uhh, you know, I might end up with a broken TV, a lot of broken glass, and a hammer in my hand before my heart exploded.
Labels: baseball, musings, sports, twins
Twins news and notes
Some bizarre facts that may or may not bode well, I can't tell:
--The Twins won their first game of the season this year in which they did not at some point trail. With the win they moved to 8-9 and will be .5 game behind Chicago when Chicago's game finishes up and 1.5 games behind division leader KC. That means that they trailed at some point in all of the first 16 games and yet won 7 of them. As far as a winning percentage in games in which you trail at some point, 7 of 16 or about .438 seems pretty good; on the other hand, trailing at some point in 16 consecutive games to start the year is not a good sign, and doesn't bode well for a team's long-term success.
--The Twins have played a bunch of games against teams that were surprisingly good, mainly their 8 games hosting Seattle and at Toronto. Those two teams have the best records in baseball at this point. What year is this, 1994? So they've gone 3-5 in nearly half their games, against two of the hottest teams to open the season, and along with 2 against Boston that means they've played 10 of their first 16 games against very strong opponents.
--A bunch of Twins pitchers who you really can't conceive of all having awful years have yet to get it together. Liriano has only had one very good start, and it wound up being a loss; beyond that he just hasn't gotten through a start well yet. Tonight's game was Blackburn's first good start, holding the Indians to 1 run over 7 IP. Slowey leads the team in wins with 2, despite the fact that his ERA is nearly 6. Glen Perkins, a guy who many wanted to trade for a sack of balls, has had 3 dominant starts, going 8 innings in each of them, posting a 1.50 ERA and earning a loss, a win and a ND. You've got to expect that Liriano will get things together, as will Slowey and Baker (who has been awful since coming off the DL). Perkins, I predict, will win a Cy Young award.
--There have been some pleasant surprises on the offense, such as Denard Span showing, thus far, no signs of a sophomore slump: as of this writing he has an OBP of .408 with a BA of .323, i.e. he's drawing a ton of walks and hitting well for average, again. Morneau is being like Morneau except more Morneauy, putting up what could be the front end of another MVP season if he keeps hitting like this; 4 HR, 13 RBI, .324/.356/.574 for a .930 OPS after 17 games... yowza. He could finally put up a really massive number of HR, not just general XBH, if he can continue at that pace...right now he's on pace for about 38 HR/124 RBI, 10% of the way through the year. An even more pleasant surprise has been Kubel looking like he was always supposed to, OPSing an insane .926 with 13 RBI, almost the same OPS as Morneau and the same number of RBI (now, he did just have a game where he hit for the cycle, including a grand slam home run to cap it off). But if he can actually prove to be the Morneau-like slugger that he was expected to be before knee injuries and a couple less encouraging years took all the shine off his penny, and Mauer gets healthy, and the starters start pitching better (as they have, to some extent), and the bullpen gets better because the roster move to bring up Mijares after Crain got hurt makes him stick there and they end up punting Dickey or Breslow or Morillo or Guerrier when Crain gets healthy, look for the Twins to have one of the best May records in baseball.
--Speaking of Jose Mijares, he pitched well, allowing one hit and no runs in his return to a big league game after getting called up when Crain got hurt. After how he pitched last fall and this April at Rochester, I think we'll be seeing him a lot more now that the current bullpen has proved a very treacherous bridge to Nathan. What had kept him off the 25 man roster was basically a lack of conditioning: he had gained 5 pounds on his gut and lost 5 MPH on his fastball, and it took him an extra couple weeks to get that worked out. Look for him to not make that mistake again.
--Speaking of the Twins' possible breakout in May, first of all it's good that the division is, as some astute pundits predicted, a total dogfight, with no clear favorite at all at this point in the season, since while the Twins might be one of the better playing teams in baseball, they're gunna have to hope some good teams that started cool or lukewarm stay that way while some teams that started hot cool down. Why? Their next 7 non-AL Central series (not counting a 2 gamer against Baltimore): vs. TB, vs. Sea, @NYY, vs. Milwaukee, vs. Boston, @TB, @Sea. If the Rays aren't the 2008 Rays that's not so bad, but if they are, then that's 20+ games against some pretty damn hot/talented teams.
--Now that all the games involving AL Central teams have gone final, KC is 6-4 over their last 10, and the other 4 teams in the division are 5-5. Talk about a friggin' dog-fight.
--Despite an ML-worst -27 run differential the Twins are 1 game under .500. Does that mean they've been getting lucky, or do they have a knack for winning tight games (see their near .500 WP in games where they trailed at some point)? You think of the adage that there's 50 or so games you'll win, 50 or so games you'll lose, and that it's how you play the middle 62 that are tossups that is what counts...
well, it seems like the Twins have been playing a bunch of those tossups and a bunch of games that they just were gunna lose (9-0 and 12-2 and 6-1 blowouts against ace pitchers like Doc Halladay etc.). Tonight's easy 5-1 win over Cleveland was the first game that felt like one of the 50 "wins." Of course, that adage is true in the sense that almost every team will lose at least 50 games and that often 50 wins will be relatively lopsided and squared away well before the final inning or two. But I think back to last year, and there were so many damn games that you'd put into that "50 losses" column that were winnable as far as the pitching matchup went and were lost due to early errors that put the pitchers in tough spots, leaving the Twins in a basically insurmountable hole early; those shouldn't just be written off as "losses." And of course when you lose the division like they did there were also quite a few games you can think of where one bonehead play led to a loss; get rid of that one play and we're the 2008 Division Champs. So early in the season I'm kind of cognizant of that and hoping that we can start winning a lot of easy games while still winning a lot of the close ones.
--In news totally unrelated to the Twins, it's nice to see the Pirates at 9-6 after 15 and threatening to take the lead in a tie game @ San Diego. I wonder when the last time was that they finished April with 10 wins, let alone a winning record? Of course, neither one of those is a done deal, but they've got the go-ahead run on third base late in this game, and one would hope that if they win this game they wouldn't go 0-5 in their remaining 5 April tilts, as would be necessary for them to end the month with a losing record. If they drop this one, they'll need to go 2-3. Could they finally be a legit enough team, in a mediocre enough division, to post a winning record for the first time since the early years of Clinton's first term? I hope so, but I think not.
Labels: baseball, twins
Simu-live blogging of the Twins home opener
I DVRed the Twins home opener since I'm getting the free preview of the MLB Extra Innings package. I'm not watching it live, so I can't "live blog" it, but I can look at what time things happened and comment on them. I'll put things in Central Daylight Time (since that's the local time where the game is going on and I'm in a Minneapolis state of mind). So away we go... went... er...
19:30 There's an old superstition that the first hitter of the season is an omen that determines the rest of the season. Well, Denard Span getting a walk is exactly what I want to see happen if it means that the young guys will continue to play at the level they were at last year or improve (since Span's huge contribution was having a huge OBP in the leadoff hole). Casilla keeps the great portent coming by following Span's walk (which from a leadoff man is just as good as a single... that's why you put high-OBP guys who draw a lot of walks in the #1 and #2 holes) with a line drive single, which would be a great sign that he can hack it as an offensive player in MLB.
19:31 The Mariners' broadcast team (which I don't love but which isn't awful) says that there's a rumor that Mauer's back injury might sideline him for six months (i.e. the whole season). Twins doctors are saying he could be back soonish (maybe 2 weeks, maybe a month), which wouldn't be bad, but when the joint between your spine and pelvis is messed up... who the hell knows when the hell you're coming back. Mauer's lower back is hurting him a lot, and some guys who aren't lanky MLB catchers coming off 4 years of frequent squatting have years of back pain where they can't find any relief. Cuddyer got totally tooled by Felix Hernandez, who, while he is a dominant pitcher, would not have done the same to Mauer. We really, really need Mauer and I'm afraid if we don't have him for the whole season or even a large fraction of it we're in a lot of trouble.
19:36 Morneau rips a grounder up the middle, it deflects off Hernandez's glove, the second baseman gets it and hurls it top speed to first, and Morneau is called out when the replay proved he was clearly safe. Twins screwed on a call for the first time in 2009. At this rate (.2 innings into one of 162 games) they should expect the total for the year to be 2187 blown calls. This sounds about right if they're going to reach last year's mark.
19:46 On cue, after the Twins get hosed out of a run, a Beltre lead-off double leads to a Mariners run on a sac-fly. Sigh. Other than the Beltre double, though, Liriano looks like he's gotten off to a pretty solid start, with a couple K's.
19:56 Gomez takes away his first double of the season on a hard-hit ball by Kenji Johjima. It's interesting to see Johjima pull a ball deep to the gap when in the WBC he seemed like such a dedicated slap hitter. Casilla follows it up with a nice play. I'm excited about the defense on this team from what I've seen so far.
Punto is much bemoaned, but he, Span and Crede being in the field every day along with average, not mediocre, fielders like Redmond and Morneau, with the only really bad fielder being Cuddyer, makes me hopeful that Twins pitchers' ERAs this year will be below what their peripherals would suggest. Mauer is a fantastic defender and thus, again, the fate of his back injury looms large over the Twins' prospects for the season.
20:03 Punto works a walk in his first AB of the season, which is as good as a single since he's leading off the bottom of the third. He's thought of as a defense-only guy, and while he had the horrible year in '07 where he was nearly under .200 for the season in BA, in '06 and '08 he hit around .290. He's not an awful hitter. Span nearly beats out the throw on a sac-bunt. First sighting of scrappy Twins small ball, and the team "playing the game right," which they almost certainly did not do. Beltre robs Casilla of a hit with a Gold Glove play. If the new Mariners GM hadn't asked for a king's ransom for him it would have been nice to obtain him. He is the difference in this game so far between the Twins leading 1-0 and losing 1-0.
20:06 Since Ken Griffey Jr. is back as a Mariner they show him cutting down Cuddyer in the one game playoff from last year. This is painful to watch. Punto almost gets thrown out creeping down the third base line with two outs. Why? Cuddyer follows it up with a backwards K, the fourth K by King Felix. Looks like we're in for a pitcher's duel. We'll see how the Twins' bullpen looks.
20:11 Casilla continues to flash some impressive leather.
20:15 If Liriano didn't make a lucky fielding play Beltre would have gotten another hit, this time a grounder ripped right up the box.
20:20 Kubel flashes a sign that perhaps the outrageous numbers he was putting up in Spring Training were legit and he's finally ready to be the player many thought he'd be in '04, by getting a good pitch to hit and ripping it to left field on a rope with one out in the 4th.
20:27 Sigh. Ken Griffey Jr. fucks us for the second consecutive game w/ a HR.
20:38 Cuddyer hits a single with the bases loaded to make it 2-1. Morneau then hits into a tailor-made GIDP. Sigh.
20:45 Gutierrez pops a 2 run HR. Bad luck for Liriano, but it came after a missed ball by Casilla, one of several fielders on both teams who've been fooled on ground balls by the lights in the dome. That's strange, and makes me think they changed something about the dome, since I'd never seen that happen before and tonight it happened multiple times to both teams.
20:58 I'm assuming that Liriano is done after 7; he pitched well enough, and one of the runs (the missed ball by Casilla) shouldn't count as an earnie even though it will. 7 IP is good news from any starter, though, and he certainly kept the Twins in the game; a sac-fly would have tied this game and a single would have given them the lead when they had Morneau up with 1 out and the bases loaded, score 2-1. The mistakes to Gutierrez and Griffey were really the two low points that look like they're going to overshadow a pretty good looking start. Fortunately Liriano won't be facing King Felix most starts, but the lack of timely hitting tonight is an unfortunate reminder that the Twins' near record BA w/ RISP last year was a fluke. We'll see if the Twins can put on their rally caps here in the bottom of the 7th. Right as I type that, Redmond bounces into a routine ground ball out. Hopefully Hernandez will be pulled after this inning and hopefully we can get at their weak bullpen in the 8th, since there's two down now.
21:06 Ayala gets his first licks in as a Twin. I'm afraid that he's the default 8th inning man...
hard to tell since we haven't had a lead all game, so this could be him being used as a situational rightie since the lead-off is a RHer in Johjima. He leads off w/ a single. Followed up by a slick little DP from Casilla, who fielded a tough belt-high hop and flipped to Punto, who threw on to Morneau. Hopefully we'll see a lot of those. Ayala retires former teammate Endy Chavez without incident. Against Johjima, Betancourt and Endy he did fine, but you know, they were losing and he'll face sterner challenges should they use him as the eighth inning man. 21:16 King Felix closes out the 8th with his 9th consecutive out, the last one being of Morneau. I'm not too worried about the lack of offense in this game since they are not going to be facing a dominant rising star like this night in, night out. But damn, they got taken to school, especially as he finished out the 8th inning looking like he was just as fresh as in the first. 21:26 Crain is in for the 9th. Leads off the inning with a walk after falling way behind. Terrific. Span gets to a ball that would drop against a team without two of the fastest, best-fielding OFers in left and center, and saves Crain some trouble after Beltre rips one into the gap with a man on. Two outs... can we please get to the bottom of the ninth... 21:29 Terrific. Breslow throws four pitches and departs. (They were all balls.) Final 6-1. Not a bad start from Liriano, and tip your cap to King Felix, who made a great opening day start, unlike Liriano, who on three days' rest was just good. We'll see how the Mauer saga plays itself out. Fortunately we've got 161 games left to make up for this loss, unlike our previous one. Labels: baseball, twins
It's here!
Yeah, yeah, the Phillies lost to the Braves tonight. Terrific. But today is finally, finally, finally a day on which the Twins will play a regular season baseball game, after the travesty of the White Sox beating them 1-0 in Chicago even though the Twins won the season series by going 8-1 at home and 2-7 in Chicago. Joe Mauer and Scott Baker start the season on the DL with some kind of sketchy injuries that may or may not be short little issues of inflammation and soreness; if those injuries linger the Twins could be in trouble. If the Twins can avoid getting bitten by the injury bug any more, and Baker and Mauer are only on the DL for a short stint as the Twins are saying is definitely possible (they are saying Baker could miss as little as one start, and that Mauer could be back within a matter of weeks), then I think the Twins are the best team on paper in the division. But with the Indians having revamped their ailing bullpen and having a bunch of guys who have had at least one great year, like Carmona, Lee, Hafner, Martinez and of course Sizemore, they could be formidable, and if the White Sox can get some consistent starting pitching (unlikely) they could be in the mix. The Tigers have offensive all-stars but I just don't think they've got the pitching, and I'll have to see a Royals team that's competitive for the division before I believe it. But with all the question marks, it begins! Labels: baseball, twins
Musings on baseball such as what does it mean to win the World Series?
I was having a very one-sided chat with my friend Alex and he paid attention to some of my ideas for what my dream Baseball Markov Chain program will look like, which I outlined in my last post. But his attention waned as I expanded on some comments about the general nature of baseball, and he told me that if I kind of summarized it and made it into a blog post he'd read it, since he was, you know, busy "doing his job." At any rate, I had some thoughts about what the point of a baseball season is, and that seemed like an appropriate subject to muse on as I'm going to see my first game of the season tomorrow--the exhibition game between the Mets and Red Sox at Citi Field (seeing the Mets' new stadium should be fun)--and the Twins' first game is Monday. While I was confident for most of the off-season that they'd make the playoffs, recent injuries that could become nagging have hit two crucial players: franchise player and one-of-a-kind in all of baseball history catcher Joe Mauer (two AL batting titles in his first four full years, and the only AL catcher ever to win one) and their rock of a starter, Scott Baker. Worrisome. But what does it mean to make the playoffs? And to get to the top of the heap, to win the ultimate prize, the World Series? I wrote a blog post once over at Battle Your Tail Off about how a strategy of trying to load up your team for a one year run at the playoffs wasn't a great idea, and gave some mathematical backup to it. I thought of it as I wrote to Alex describing how you could write a custom version of the Markov Chain simulator web application I'm starting to work on (and which I might actually make) to answer questions like: how many wins would you expect the Yankees to have if they played in the NL West or Central? What would the probability of a team winning the World Series be if you went back to the two-team, LCS-only model for the playoffs? 
Or if you went totally old-school and eliminated all the divisions, and just had the team in each league who won the most of 162 games win the pennant and go straight to the World Series? And what if the World Series was 9 games, like originally? Or 13 games? Or the other way, if you had a format for MLB's playoffs more like basketball or hockey than baseball? The thing about it is, when I did the calculations linked to above, where I figured out the probability that the best team in the playoffs would end up world champion, that team almost never actually ended up winning the championship. It happened only once, in fact, from 1996-2006, with the '98 Yankees. The notable losers include the 2001 Mariners, arguably one of the best teams in the last 30 years, who went into the playoffs with a 48% chance of winning it all according to my simulation and lost in the ALCS to the Yankees. The most interesting takeaway message was that, for the teams that eventually did win the World Series in 1996-2006, the average probability of winning it all at the start of the playoffs was 13.8%, or 1.3 percentage points higher than the 12.5% (1 in 8) you would expect if all the teams were exactly as good as one another. In other words, it seems like the biggest determinant of who wins the World Series is random chance selecting amongst the 8 teams who make the playoffs. The Twins haven't really had any success in the playoffs other than getting to the ALCS in 2002 and getting a half inning away from taking a 2-0 lead over the Yankees back to Minnesota in 2004, all the way up to losing a friggin' 1 game playoff 1-0 to the White Sox last year, almost the furthest extent to which you can play more than 162 games and do absolutely nothing... get to game 163 and lose in a game that required you to score 2 runs. 
The conclusion of the post, though, was that the Twins' team philosophy, which has had them playing games beyond 162 in every year starting in 2002 other than 2005 and 2007, is the right one, and that trying to load up the team with stars for one year at the cost of losing young talent that sustains the quality of the team is foolish. For instance consider that the 2002 Giants did get close to a World Series title (they lost to the Angels in that year's World Series) by trading to the Twins Francisco Liriano, Joe Nathan and Boof Bonser for A. J. Pierzynski. The Twins have made the playoffs three of the six years since then, and made a one game playoff in one of the three years they missed it, whereas the Giants haven't done either of those things once, in a division that has seen every other team make the playoffs at least once since then, and the one team that hasn't won it, the Rockies, made the World Series as a wild card team. But even with their playoff futility, in general, making the playoffs as often as possible should be the goal of a team like the Twins, with limited resources. The way that teams that aren't from huge markets, like the Cardinals, the Marlins, etc., have won World Series recently has been by, well, frankly getting lucky after making the playoffs. Baseball as a game, when played by teams that are even only relatively close in skill level, has any one game, or even any 7-game series, determined more by random variation in player performance and the timing of that performance than by overall skill level. I mean, a team can score 5 runs with 6 hits, while in that same game the other team could have 14 hits, just scattered about enough that they only led to 1 run. You see these variations in individual games all the time. To me that makes the entire concept of the World Baseball Classic somewhat farcical. 
In a double-elimination tournament whose champion is determined by one-game playoffs, you can get completely counterintuitive results like the Dutch team of minor leaguers or lower level players beating the Dominican Republic team of MLB all-stars two games in a row and eliminating them. The chance of the overall winner being such an unskilled team is small, but the chance that the "best" team (as determined by, say, who would have the best record if these teams played each other over a 162 game season) is also the winner of the tournament seems to be 1/(number of teams who are not horrible). That is, if a team is good enough that it could conceivably win 4 games in a row to win the tournament at the playoff stages, then that team has basically as good a chance of winning as all the others who fulfill that requirement. Over a 162 game season, in a typical year the difference between the best team and the worst team is about 20 percentage points in their chance of winning a given game: in most years, unless there is a historically great or awful team, the top team in a league with 30 teams wins about 60% of its games and the worst wins about 40%. This is over such a large sample of games that their records reflect a pretty good approximation of how good they are. So if we wanted to determine which team was really the best we wouldn't have any leagues or divisions or playoffs; we'd just have all the teams try to play all the other teams an equal amount, and have them play as many games as possible, and then the team with the best record would be champion. But that's not what anybody wants; so the Champion is not the best team. That's a somewhat counter-intuitive but definitive assertion that fans make simply by not preferring a playoff-less system as just described. 
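To put a rough number on how little a short series filters out chance, here is a minimal Python sketch. It assumes each game is an independent coin flip with a fixed win probability, which is a simplification of my own and not the simulation referred to above; the function name is hypothetical. The 60% input is just the season-long figure for a typical best team mentioned above, used as an illustration.

```python
from math import comb


def series_win_probability(p, games=7):
    """Chance that a team winning each game with probability p takes a best-of-`games` series.

    Assumes games are independent and p never changes (an illustrative simplification).
    """
    needed = games // 2 + 1  # e.g. 4 wins in a best-of-7
    q = 1.0 - p
    # The team clinches in exactly needed + k games if it wins the final game
    # and loses exactly k of the needed - 1 + k games before it.
    return sum(comb(needed - 1 + k, k) * p**needed * q**k for k in range(needed))


if __name__ == "__main__":
    for p in (0.50, 0.55, 0.60):
        print(f"per-game {p:.2f}: best-of-7 {series_win_probability(p):.3f}")
```

Under these assumptions even a team that wins 60% of its games against its opponent takes a best-of-7 only about 71% of the time, and playoff teams are usually much closer to each other than that.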
The Champion is, bizarrely, the team that, in dramatic fashion, surely, uses a tiny smidgen of their skill plus a whopping scoop of random chance to win a tournament that isn't nearly long enough to remove the effect of chance, but who made that tournament by being one of the 8 better teams in the league (not best anymore, because with the division structure it's not clear that the 6 division champions or frankly even the two wild card teams are the 8 best teams.) That's fine, I s'pose, but even compared with sports where simply the number of games, the number of playoff teams, and the structure of the playoffs would lead you to assume that chance would play a much greater role, baseball's World Series champions are due much more to random variation (granted, amongst a handful of competitive teams, whose bottom tier changes from year to year, but which has excluded certain teams like the Reds, Pirates and Royals for years and years.) For instance, in football, you only play 16 games, yet we typically trust the winning percentages of the teams (which you would expect to vary more than in baseball due to the law of large numbers) and are probably right in assuming that the teams with the best winning percentages actually are the best teams. And the same teams, in 2 of the other 3 major U.S. sports, the NBA and NFL, have been able to have success in the playoffs much more consistently over the past decade or so than in baseball, aside from the Yankees dynasty. Just think of how few teams have won the NBA's western conference or the Super Bowl in the last 10+ years relative to baseball. 
If you look at the above mentioned data, the one counterpoint you could list, the Yankees' "Dynasty" of the late 90's, was a complete illusion; in three of their four championship years they were not the top team coming into the playoffs, and in those three years their highest probability of winning going into the playoffs was 13% and change, and their lowest was as little as 7%. So... they got lucky. Do we realize that, and is that how we want to crown our champions and "dynasties" in baseball, ever since MLB created the wild-card system? Labels: baseball, musings
Markov Chain Application ideal design
If you could simply describe how a computer application worked and have it automatically spring to life, what follows is how I would describe my dream version of this Markov Chain baseball simulator I want to make. Unfortunately such programs don't exist, and actually getting this to work is a problem even as a toy that's hard-coded for 9 Twins players, let alone integrating support for a pitching staff, platoon splits, and ideally dynamic loading of player statistics (google "Baseball Hack" to get some info on how I might do that) and letting it work for all 30 MLB teams. If it were fun enough and made sensible enough predictions and could be used at any point during the season to predict, say, your team's chances based on how they've played, how you think they'll play, and how their rivals will play, then I think people would find it very cool and would play with it for a while. What my dream Baseball Markov Chain program will look like is more or less as follows: a combination of Ajax and DHTML on the client end to create an attractive and intuitive GUI for people to easily translate their own qualitative predictions into quantitative tweaks to a very stout mathematical model. As we briefly described previously, that model is a Markov Chain model of a baseball game; a half inning of a game is described entirely from the offense's perspective. Using the players' expected statistics, simulate, say, 9 innings, count the number of runs scored, and call that "a game." Then, say, simulate 1,000 games so that we hopefully average out the differences in other teams' pitching and defense, and see how often we get over some amount of runs that we think the Twins will allow. How do we pick that amount? Well, we could use last year's value (but that seemed like an outlier) or we could use a similar set of projections about who will pitch most of the innings and how they'll do to let the user predict how well the pitching will do. 
If a simulated game's Twins runs > average Twins pitchers' runs allowed, then call that a win. After simulating 1,000 or so games, figure out our winning percentage and extrapolate how many games we'd win in a 162 game season. But wait wait wait, you're saying, what was all that stuff about "an attractive and intuitive GUI for people to easily translate their own qualitative predictions into quantitative tweaks"? The qualitative predictions or beliefs that the user has about how a player will do are expressed relative to some known quantity-- perhaps the player's career average or maybe some simple modeled prediction of how they'll do this year. Whether we're using the career average or a certain type of projection will at least be shown to the user (with the player's career/projected stats listed in the drag and drop box), but maybe there will be a drop-down box at the top that lets you pick if you want to use simply their career average, or maybe a simple projection like a 3 year rolling average, or maybe PECOTA's proprietary predictions (if I can get access to those integrated.) But whatever model we use as the default, it will be described in the player's drag and drop box, and inside that box there will be a group of sliders (like a volume control) that will default to "1." They might be labeled things like "Power," "Plate Discipline," "Speed," "Batting Average." So now you kind of see how your qualitative predictions can be input very intuitively: "Delmon Young will finally start hitting for power this year" gets turned into sliding Delmon Young's "Power" slider from 1 to 1.2, or 1.4, and watching how that changes his stats in his drag and drop box to see if you think they're reasonable. Once you have, you will have, behind the scenes, modified the probabilities in the Markov Chain model. 
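Here is a rough Python sketch of how the sliders and the win count described above could turn into numbers. Everything named in it is an assumption for illustration: the outcome names, the mapping from a slider label to the outcomes it scales, the choice to let outs absorb whatever probability the sliders add or remove, and the sample probabilities (which are made up, not anyone's real stats).

```python
# Hypothetical mapping from slider labels to the plate-appearance outcomes they scale.
SLIDER_OUTCOMES = {
    "Power": ("double", "triple", "home_run"),
    "Plate Discipline": ("walk",),
    "Batting Average": ("single",),
}


def apply_sliders(base_probs, sliders):
    """Scale a hitter's per-plate-appearance outcome probabilities by slider values.

    Outs absorb the difference so the probabilities still sum to 1 (an assumption).
    """
    adjusted = {o: p for o, p in base_probs.items() if o != "out"}
    for label, factor in sliders.items():
        for outcome in SLIDER_OUTCOMES[label]:
            if outcome in adjusted:
                adjusted[outcome] *= factor
    not_out = sum(adjusted.values())
    if not_out >= 1.0:
        raise ValueError("sliders leave no probability for outs")
    adjusted["out"] = 1.0 - not_out
    return adjusted


def win_fraction(simulated_runs, runs_allowed):
    """Share of simulated games in which runs scored beat the runs-allowed figure."""
    return sum(r > runs_allowed for r in simulated_runs) / len(simulated_runs)


def projected_wins(fraction, season_games=162):
    """Extrapolate a simulated winning percentage to a full season."""
    return round(fraction * season_games)


if __name__ == "__main__":
    hitter = {"out": 0.68, "walk": 0.08, "single": 0.15,
              "double": 0.05, "triple": 0.01, "home_run": 0.03}
    print(apply_sliders(hitter, {"Power": 1.2}))
    print(projected_wins(win_fraction([3, 5, 6, 2], 4.5)))
```

Letting outs soak up the change is only one possible choice; a "Speed" slider is left out because it would have to change baserunning rather than plate-appearance outcomes.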
Some caveats: a projection for the year is probably much better than a career average; if it's simply a career average you're going to want to downgrade Joe Crede on Speed and Power simply because he's well past his prime and has been injured. So maybe we will use a 3 year rolling average to compute something that we will display, like, "Joe Crede projected stats over 700 Plate Appearances." Now it's not necessary that Joe Crede get that many plate appearances, and I would like to eventually adopt a system where you can set up a bunch of parallel lineups, where a certain player is projected to get so much of the playing time. That would be, like, the very last feature to add, though adding a feature to define a lineup spot as a platoon split would be a lot easier. Labels: baseball, markov chain, markov chain web application
It's all coming together!
Ahh, yes... It's finally coming together. The reason I originally registered this rather cumbersome domain name was because most advanced baseball modeling systems that I was interested in use Markov Chains. A Markov Chain is basically a special case of a graph (a mathematical graph, not a graph as you typically think of it.) A graph is simply a representation of states (nodes) connected by transitions between those states (lines or edges.) A Markov Chain is basically a graph whose edges only go one way and carry probabilities, with the special property that where you go next depends only on the state you're in now. In the case of baseball as a Markov Chain, the basic unit of modeling is the outcome of one half inning. There are 24 (plus one special case) nodes that can occur in a half inning that in the most general sense describe what's going on in the half inning... the bases can be empty, or there can be a runner on first, or runners on first and second, and so on; you can think of these as "0" "1" "12" "123" "13" "2" "23" "3". There are 8 total states regarding the baserunners. And then any of those can occur with zero, one or two outs, plus you can think of a special state with nobody on and three outs that triggers the end of a half-inning. You can think of these individual states as having an overall probability of transitioning to any of the other possible states that can result from a plate appearance (this is why the edges are directed; you can't go from having 2 outs to 1 out, for instance.) And certain transitions (e.g. from "123,0" to "0,0") are associated with a certain number of runs scoring (the only way to go from bases loaded nobody out to bases empty nobody out in one plate appearance would be a grand slam, so 4 runs score.) So if you eventually look at enough empirical data you can generate a generic Markov table that shows you things like... 
the expected number of runs to score in the rest of an inning from any given state; and the value, in runs, of a plate appearance, which is simply equal to PlateAppearanceValue = expectedValue(newState) - expectedValue(oldState) + runsScoredOnPlay (without that last term a solo home run with the bases empty would be worth nothing, since the state doesn't change). You can also generate very basic tables from predictions of these states (versus empirical observation). And if you're interested in doing some really detailed analyses along that line, you can read about them in Tom Tango's The Book. It describes things ranging from where the best place to put a hitter in the lineup is to how valuable a certain play is at certain states in an inning (and which states are more likely to come up with a certain player at bat). For instance, as you probably would realize without computing it, it's much more valuable for your #1 or #2 hitter to be able to draw a walk (since it's equivalent to a single for these players a lot more often) than for the #4 or #5 hitters (who usually come up with players on and who, when they walk, often don't change what is needed to score a run, and just advance to a new state with a weaker hitter at the plate.) You know that stuff intuitively, but if you wanted to know how big those effects are, well, buy The Book. It also goes into details about stuff like fielding, platoon effects, and tons of other things that could theoretically be incorporated into a Markov model but that turn out to affect at most a few games' worth of wins per year, so they are analyzed separately (of course, if you have a manager who makes optimal decisions at every juncture all year that can add up to a lot of games.) I also wanted to have my own web domain so I could have a non-lame, non-free-hosted web page and one that wouldn't change when I switched educational institutions... Princeton gave me a nice free web page, even with access to cgi-scripting, etc., but it then expired a few months after I graduated. 
I work at NYU Medical Center so they don't offer the web services they would to a student, and I assume I'll have a page at the University of Minnesota (which will be pretty permanent since I'll be there for 5 years) but you know... it's probably best to keep the stuff on that web page related to my Ph.D. work. Not, you know, massively intricate web applications involving Markov Chains, Twins' player statistics, and how well the Twins will do this year. I don't know how detailed it's going to be right now, but thus far I have an HTML spreadsheet showing the Twins' players' Markov properties (i.e. given a random state, how likely is, say, Carlos Gomez to hit a triple? Or, when there's a double play possibility, how likely is Nick Punto to hit into one?) to generate an average runs scored per inning. I've also wanted to re-sharpen my skills on things like Perl scripting, as well as learn new web stuff that I haven't done, such as PHP and Ajax JavaScript, to allow users to tailor their predictions and work out a model of how many games they think the Twins will win. For instance, right now, I just have the Twins' likely opening day starters, and in the most simple scenario, I could have an Ajax script with a slider that's equivalent to "to what degree will this player overperform or underperform his career line," which would lead to a prediction of runs scored per game, which can crudely lead to a Won-Loss projection. Of course the model could get much more detailed, with Twins' pitchers having their own Markov models, how much playing time you estimate a pitcher will get, what order the Twins hitters will be in, if a spot will be shared (as it invariably will) to what degree that spot will be split, and how much those splits will lead to correct-handed match-ups, etc. 
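To make the half-inning model concrete, here is a minimal Python sketch of the simulation side (Python rather than the PHP and Perl mentioned here, purely for illustration). The baserunning rules are my own simplifying assumptions: every runner advances exactly as many bases as the hit is worth, walks only move forced runners, and there are no double plays, steals or errors. The outcome names and the sample probabilities are hypothetical, not real player data.

```python
import random

# Bases gained by the batter (and, in this simplified model, by every runner).
HIT_BASES = {"single": 1, "double": 2, "triple": 3, "home_run": 4}


def advance(bases, outcome):
    """Return (new_bases, runs) for a non-out outcome; bases is (first, second, third)."""
    first, second, third = bases
    if outcome == "walk":
        # Only forced runners move; a run scores only with the bases loaded.
        runs = 1 if (first and second and third) else 0
        return (True, second or first, third or (first and second)), runs
    n = HIT_BASES[outcome]
    # Batter starts at position 0, runners at 1-3; position 4 or more means home.
    moved = [pos + n for pos in [0] + [i + 1 for i, on in enumerate(bases) if on]]
    runs = sum(1 for pos in moved if pos >= 4)
    return tuple(b in moved for b in (1, 2, 3)), runs


def simulate_game(lineup, rng, innings=9):
    """Runs scored by one offense; lineup is a list of outcome-probability dicts."""
    runs, batter = 0, 0
    for _ in range(innings):
        bases, outs = (False, False, False), 0
        while outs < 3:  # three outs is the absorbing end-of-half-inning state
            probs = lineup[batter % len(lineup)]
            outcome = rng.choices(list(probs), weights=list(probs.values()))[0]
            batter += 1
            if outcome == "out":
                outs += 1
            else:
                bases, scored = advance(bases, outcome)
                runs += scored
    return runs


if __name__ == "__main__":
    hitter = {"out": 0.68, "walk": 0.08, "single": 0.15,
              "double": 0.05, "triple": 0.01, "home_run": 0.03}
    rng = random.Random(2009)
    games = [simulate_game([hitter] * 9, rng) for _ in range(1000)]
    print("average runs per game:", sum(games) / len(games))
```

The `(bases, outs)` pair plays the role of the 24 states, and the grand slam case from the state description falls out of `advance`: bases loaded plus a home run gives empty bases and 4 runs.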
Basically, you could take your prior assumptions about the Twins and, with an intricate enough Ajax template and a Markov model built with PHP and Perl on the back end, lead yourself to a very, very detailed model of how many games the Twins will win. So that's the goal (not sure that I'm going to implement all those features...). Having just our starting lineup and rotation and your prediction about over/under performance would be pretty good, but I think that's where I'm gunna try to take the site. Of course, the original Java based model that was going to be totally console based crashed and burned (I think the code-base is somewhere on the site) since it was too hard, but you know, hope springs eternal. Labels: baseball, markov chain, site, twins