Some historic data is not point-in-time

WalterW · July 7, 2016, 4:27pm

Steve,

Would a revision-controlled DB satisfy your requirements for debugging simulation changes? In other words, let P123 maintain the DB as they do now, but allow a sim to roll back changes to get access to the development DB.

I don’t think P123 would ever do this. Just sayin.

Walter

Jrinne · July 7, 2016, 4:34pm

I think changes in the algorithm for a factor are a difficult question, and very different from raw data changes.

I will probably be on different sides of this issue in the future. But in the past I strongly favored moving to the newer earnings estimate revision methods. This was, in part, because it improved the data latency issue that Walter raised.

mgerstein · July 7, 2016, 5:05pm

As Georg said in the opening sentence of this thread, “We had this discussion before.”

The complaint raised by Georg in the initial post involved a clear-cut bug correction from nearly a year ago that was explained here: https://www.portfolio123.com/mvnforum/viewthread_thread,9005

And as per the thread cited in that announcement, the fix was made in response to errors properly called to our attention by users:
https://www.portfolio123.com/mvnforum/viewthread_thread,8977

Under no circumstances will we ever consider refraining from fixing an error or a bug. That is what you can and should demand and is critical to the “institutional quality” remark referenced by Georg later in the thread.

If you are using Portfolio123 to develop investment strategies that you expect to apply in the real world with real money, or if you are using the platform to develop knowledge (i.e. academically), you should appreciate the changes rather than be troubled by them. Legitimate models are not typically rendered useless by the bug fixes we make. That was even so when we switched data providers from Reuters to Compustat and wound up with a completely different philosophy of data standardization leading to significant changes in many items (see generally https://www.portfolio123.com/doc/WhyCompustat.pdf).
The data provider that we use is the platinum standard and the dominant choice among institutions and academicians. Our internal P.I.T. engine is likewise state-of-the-art, so much so that Compustat’s Quality Assurance team has come to appreciate us as the hard-core customer that raises issues their other clients routinely miss.
Those in this thread who stated that a major change in results due to a bug fix indicates a problem with the soundness of the model are correct.
As to the live results in SA, these are based on buys and sells at market prices, so they are not impacted by bug fixes.
In theory, we could possibly keep a log of every single change made to every single item, but the cost of doing so and making it available to you in a usable manner would result in considerably higher subscription prices, something that makes no sense because . . . .
The benefits of hyper perfectionism in simulations are non-existent for the sort of modeling that needs to be done to support the goals of investors and academicians. The simulation is not an end unto itself. Its purpose is to provide feedback to you regarding the efficacy of your implementation of your strategic ideas.
If this were nothing more than a competitive video game platform, where there would be a powerful need to preserve the integrity of past user scores, a case could be made that, as Steve said, “modifying data is not risk free.” But this is not a video game platform. We’re not in business to compile and preserve user scores. We’re in business to help you develop actionable investment strategies, and accordingly, the proposition that we should refrain from fixing bugs must be absolutely positively rejected, as we do. Ditto the notion of imposing on you the cost of maintaining a change-log that contributes nothing whatsoever to the business purpose of Portfolio123.

We understand that there are some hard-core believers in views different from those expressed here. So be it. We agree to disagree.

InspectorSector · July 7, 2016, 5:13pm

Unfortunately, it all ties together.

If the performance of a sim changes from one day to the next then either a determination of the root cause is required, or the phenomenon is ignored. If the phenomenon is ignored then there is (substantial) risk that something flaky is going on. “Something flaky” usually ends up costing a fair amount of money before long.

Now, who has determined the root cause of Georg’s problem? The answer is NO ONE. Is this really the mindset that we want to carry forward? Blame a problem on the first excuse we can find and move on? Just keep ignoring it until it explodes in our faces? In all likelihood (in my estimation) there are several contributors to the changing performance problem; changes to the underlying data are ONE ANSWER, not THE ANSWER. But no one can do any kind of investigation until the database is nailed to the ground. For those people who are technically challenged, well, they may not understand this.

Now I have to ask you people posting here, how many P123 bugs have you discovered to date? I have found many, and I have to tell you some of them have cost me dearly. A changing database masks many potential problems that we can’t detect.

Take care
Steve

Jrinne · July 7, 2016, 5:30pm

There were a lot of different posts on different issues.

It would be hard to know what people are agreeing or disagreeing about.

One thing I would agree to disagree with is that a short-term change in a model timing SH says anything about anything. I surely would not say: “…a major change in results due to a bug fix indicates a problem with the soundness of the model are correct.”

I might say something like: it takes too much out-of-sample data for a timing model to prove itself one way or the other, for me personally. Why would I have any opinion at all on Georg’s model based on the small amount of data he presented in the post?

geov · July 7, 2016, 5:42pm

Steve,
I know, of course, what the root cause of the discrepancy between the port and the sim is; I posted it at the beginning of this thread. As one can see, the sim with the revised data performed better than the live portfolio. This leads one to believe that using this particular rule will provide a higher return than what was actually obtained. So this is an embarrassing problem, because it is uncertain whether any of the previous signals would have actually been generated with the then point-in-time data.

In future I will refrain from using any estimated data, because this is the data subject to revisions most of the time.

I agree with Marc that “Under no circumstances will we ever consider refraining from fixing an error or a bug.”
However, there are still errors which have not been fixed, such as the misleading dividend yield of the SA models, or the average returns for the trades reported in the trading stats, for example. These are not important fixes and certainly don’t worry me because I know the underlying root problem, but others may not know this.

But let us not diminish the value of P123 and all the hard work put in by the P123 team to improve this platform all the time. I am certainly very satisfied with what I am getting out of it.

InspectorSector · July 7, 2016, 5:45pm

Walter - the answer is Yes. But that might be difficult to achieve. Another possibility would be to have the database updated once per year, and give users a chance to update their sims. But it is important that other changes are not made at the same time. Another alternative would be to have a separate static “original” database that never changes, which people can run their sims on.

Steve
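
To make the "revision-controlled" and "static original database" ideas above concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not anything P123 is known to implement: the class `RevisionedStore`, its methods, and the ticker, item name and values in the demo are all hypothetical. The idea is that a correction appends a new revision instead of overwriting the old value, so a sim can read the data either as it stands today or as it stood on an earlier date.

```python
from bisect import bisect_right
from collections import defaultdict
from datetime import date


class RevisionedStore:
    """Hypothetical append-only store: corrections add a revision, never overwrite."""

    def __init__(self):
        # key -> parallel lists of revision dates and values, kept sorted by date
        self._dates = defaultdict(list)
        self._values = defaultdict(list)

    def record(self, key, revised_on, value):
        """Store `value` for `key` as a revision made on `revised_on`."""
        idx = bisect_right(self._dates[key], revised_on)
        self._dates[key].insert(idx, revised_on)
        self._values[key].insert(idx, value)

    def as_of(self, key, when):
        """Return the value the database held on `when`, or None if none existed yet."""
        dates = self._dates.get(key, [])
        idx = bisect_right(dates, when)
        return self._values[key][idx - 1] if idx else None

    def latest(self, key):
        """Return the most recent revision, i.e. what a sim run today would see."""
        values = self._values.get(key, [])
        return values[-1] if values else None


# Made-up example: an estimate item that is later corrected by a bug fix.
store = RevisionedStore()
key = ("XYZ", "EPSEstimate", "2015Q2")
store.record(key, date(2015, 7, 1), 1.10)   # value originally in the database
store.record(key, date(2015, 9, 15), 1.25)  # value after the correction

print(store.as_of(key, date(2015, 8, 1)))   # what a sim saw before the fix
print(store.latest(key))                    # what a sim sees after the fix
```

Running the same sim once against `as_of` with a fixed date and once against `latest` would separate the effect of data revisions from any other cause of a change in results. The storage and serving cost of keeping every revision is exactly the trade-off Marc raises above.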

InspectorSector · July 7, 2016, 5:59pm

Georg - that is the problem, but do you or anyone else here truly understand the root cause? The answer has to be NO because no investigation has been done.

This point we can agree on, Marc. I will state my position one more time: a changing database is not to anyone’s true benefit. What is most beneficial is a stationary database and stationary rules. If a rule needs to be fixed, then deprecate it and create a new rule. This is what was done 2+ years ago, but the mindset changed, and not for the better. The bug fix that you refer to from last year had actually been a demonstrated problem for 2 years prior.

Yes, but let’s not give them swelled heads.

Steve