User:Devon McCormick/Research/HoldingWinnersSellingLosers0

From J Wiki

This was originally presented on 4/29/2007 for the NYC Financial Engineering Meetup. It illustrates the use of J to manipulate data quickly and easily in order to explore topics in quantitative financial research. This work continues here.

How Long to Hold Winners and Losers

Background: One well-known behavioral shortcoming of investors is a tendency to take profits too early on investments that are doing well and, conversely, to hold on to losing positions for too long. One of the strengths of quantitative investing is overcoming bad investing behavior by sticking to rules that correct an investor’s natural tendency to do the wrong thing. An example of this is how the regular re-balancing discipline of asset allocation corrects the tendency to over-invest in recent winners and avoid recent losers.

Questions to be addressed: How do we define winners versus losers? Are there simple rules to determine sales? Do the rules work the same – but in reverse – for short positions?

Caveats

Can we identify groups of assets for which the rules work better or worse? What is the form of a good rule (e.g. flat percentage, volatility- or market-related change)? Are there exogenous factors that influence rules? Do similar rules apply if we look at excess-return space? Are the best rules good enough to be worth implementing?

A Preliminary Look at Some Data

Code is included in a fixed (Courier) font; user input is indented three spaces; output is left-justified.

   load '~User/code/SPXData.ijs'
   $CODAT
848

We have data on 848 different companies over the whole period. Here are the first and last elements of DTS, which is sorted in ascending date order.

   ({.,{:)DTS
+----------+----------+
| 1/02/1990|12/03/2004|
+----------+----------+

The data begins at the start of 1990 and goes almost to the end of 2004.

   $COINFO
849 4

We have four columns of data on each company, plus a header row of column titles, which is why COINFO has 849 rows for 848 companies. For example, here are the first three rows of the table:

   3{.COINFO
+------+-------------------+--------+----------+
|Symbol|Company Name       |CUSIP   |Date      |
+------+-------------------+--------+----------+
|MMM   |3M Co.             |88579Y10| 1/02/1990|
+------+-------------------+--------+----------+
|ABT   |Abbott Laboratories|00282410| 1/02/1990|
+------+-------------------+--------+----------+

Selecting only companies with data for the entire period,

   'dattit seldat inftit selinf'=: selCompleteSets ''
   $seldat
245
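The definition of selCompleteSets is not shown here, so the following is only a guess at the kind of filter it might apply: keep the companies whose data table has one row for every date. The names codat, ndays and complete and the toy data are hypothetical.

```j
NB. Hypothetical toy data: three companies with 3, 2 and 3 days of data.
codat=: (i. 3 2) ; (i. 2 2) ; 10 + i. 3 2
ndays=: 3                        NB. number of dates in the full period
complete=: ndays = #&> codat     NB. 1 where a company has a row for every date
seldat=: complete # codat        NB. keep only the complete companies
```

Here complete is 1 0 1, so seldat holds the first and third tables.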

Using a Dense Matrix

This arbitrary selection gives us 245 companies with data for the whole period. It builds in survivor bias, but a dense matrix makes it easier to show off cool matrix abilities without dealing with the real-world messiness of collected data. That is, this data and any conclusions drawn from it are for illustrative purposes only and do not constitute investment advice.

   $clpxs=. ((dattit i. <'Close'){"1 &> seldat)%(dattit i. <'Adjust'){"1 &> seldat
245 3765

So, the 245 x 3765 table "clpxs" consists of prices for 245 companies over 3,765 days.
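The expression that builds clpxs can be read in steps: look up a column by its title, pull that column out of every company's boxed table, then divide element by element. The sketch below does this on made-up data; the column titles and the values in co1 and co2 are hypothetical, and only the shape of the computation follows the line above.

```j
NB. Hypothetical column titles and two toy companies (3 days each).
dattit=: 'Date';'Close';'Volume';'Adjust'
co1=: 3 4 $ 1 10 100 1   2 12 110 1   3 11 120 1
co2=: 3 4 $ 1 40 200 2   2 42 210 2   3 44 220 2
seldat=: co1 ; co2                              NB. one box per company

NB. {"1 picks the indexed column from each row; &> opens each box in turn
NB. and assembles the per-company vectors into a companies-by-days table.
cls=: (dattit i. <'Close')  {"1 &> seldat
adj=: (dattit i. <'Adjust') {"1 &> seldat
clpxs=: cls % adj                               NB. element-wise division
```

With this toy data, clpxs is a 2 by 3 table whose rows are 10 12 11 and 20 21 22.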

To pick a few companies on which to concentrate initially, look at the standard deviations of the prices and choose the companies with the lowest and highest.

   usus sds=. stddev"1 clpxs
4.7264748 46.584648 14.71278 7.0387579
   (sds i. (<./,>./)sds){selinf  NB. lowest and highest stddevs
+---+---------------------------+--------+----------+
|WOR|Worthington Industries Inc.|98181110| 1/02/1990|
+---+---------------------------+--------+----------+
|GLW|Corning Inc.               |21935010| 1/02/1990|
+---+---------------------------+--------+----------+
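The verbs stddev and usus come from scripts that are not shown here. A minimal sketch, assuming stddev is the sample standard deviation and usus returns the minimum, maximum, mean and standard deviation of its argument (which is consistent with the four numbers printed above), might look like this. The toy price table px is hypothetical.

```j
mean=: +/ % #
stddev=: %:@(+/@:*:@(- mean) % <:@#)   NB. sample standard deviation (assumed)
usus=: <./ , >./ , mean , stddev       NB. min, max, mean, stddev (assumed)

px=: 3 3 $ 1 2 3   10 20 30   5 5 5    NB. hypothetical: 3 companies, 3 days
sds=: stddev"1 px                      NB. one standard deviation per row
stats=: usus sds                       NB. summary of the row deviations
idx=: sds i. (<./ , >./) sds           NB. rows with the lowest and highest
```

Here sds is 1 10 0, so idx is 2 1: the third row is the least variable and the second the most. Indexing a company table with idx, as in the session above, then returns those two rows.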

Do the same thing for the standard deviation of the share volume.

   clvol=. ((dattit i. <'Volume'){"1 &> seldat)*(dattit i. <'Adjust'){"1 &> seldat
   usus sds=. stddev"1 clvol
0.097023249 26.049271 1.5918885 2.6359499
   (sds i. (<./,>./)sds){selinf  NB. lowest and highest stddevs
+----+--------------+--------+----------+
|MDP |Meredith Corp.|58943310| 1/02/1990|
+----+--------------+--------+----------+
|INTC|Intel Corp.   |45814010| 1/02/1990|
+----+--------------+--------+----------+

So, we'll experiment a little with just these four companies (Worthington, Corning, Meredith, and Intel or WOR, GLW, MDP, INTC) for now.

First, look at a plot of the prices: [1] [2]
However, we know that it’s more “quant” to look at returns: [3]
Since we selected two of the companies based on the variation in their trading volume, let’s look at that too: [4]
We could also look at the second derivative of price, that is, the change in daily returns: [5]
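The code behind these plots is not shown. As a sketch, assuming simple (not log) returns, the daily returns and their day-to-day change could be computed as below; ret, dret and the toy price vector px are hypothetical names and data.

```j
ret=: <:@(}. % }:)        NB. simple return: (today % yesterday) - 1
dret=: (}. - }:)@ret      NB. change in return from one day to the next

px=: 100 110 99 99        NB. hypothetical prices for one stock
r=: ret px                NB. up 10%, down 10%, unchanged
dr=: dret px              NB. differences between successive returns
```

For a companies-by-days table such as clpxs, the same verbs would be applied to each row with ret"1 and dret"1.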

Now, how do we distinguish stocks in downward price trends from those in upward trends? We could look at the number of days up versus the number of days down in a period: this would be a technical way of doing it. However, a more quant method is to do a linear regression over some period, say 20 days, and look at the slope of the line over that period: if it’s negative, prices are going down and if it’s positive, they’re going up. This is how we might graph this. We’ll only look at about a year’s worth of data since it’s hard to see anything looking at 20-day periods over 15 years.
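Both approaches are short in J. The sketch below is one possible way to do it, not necessarily the code behind the graphs: it fits a least-squares line to each window with matrix divide (%.) against the day numbers 0 1 2 ... and keeps the slope. The name slope, the toy prices and the 3-day window are hypothetical; for the 20-day version the left argument would be 20.

```j
NB. Slope of the least-squares line through y at x = 0 1 2 ...
NB. y %. (1 ,. x) gives the intercept and the slope; {: keeps the slope.
slope=: {:@(] %. 1 ,. i.@#)

px=: 1 2 3 5 4 3              NB. hypothetical prices for one stock
slopes=: 3 slope\ px          NB. slope over each overlapping 3-day window
trend=: * slopes              NB. 1 for an upward window, _1 for downward

NB. The "technical" alternative: count up days and down days.
d=: (}. - }:) px              NB. day-to-day price changes
updown=: (+/ d > 0) , +/ d < 0
```

Here the first three windows slope upward and the last one downward, while updown counts 3 up days and 2 down days. For a table of prices, 20 slope\"1 would apply the same window to every row.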

Here are the 20-day linear regressions of the prices for each of our four stocks over the first year or so: Px1990LinAprox20DayWdw_20.png
Doing the same for returns presents a less clear picture: Rets1990Lin20Dy.png

An Example of Some Problems with Data Consistency

The following may give some idea of how hard it is to get good, consistent data. Here are two samples of purported price and volume data for AT&T, not exactly an obscure stock, for a period of about a month-and-a-half. This particular period was chosen because it covers the time of the company's (reverse) stock split of 1-for-5 shares, so there are some interesting discrepancies between the two versions of the same series.

Price and Volume Data for AT&T from Two Different Sources
            Yahoo Finance               Factset/Compustat      Ratios
Date        Close    Volume  Adj Close  Close  Volume  Adjust  FPx/YPx  F/Yadj  Yvol/Fvol  YV/FVAdj
10/30/2002  25.96   6618700      20.72  66.00   1.908       5     2.54    3.19       3.47      0.69
10/31/2002  25.66   6984600      20.48  65.20   3.095       5     2.54    3.18       2.26      0.45
11/1/2002   27.25  10612200      21.75  67.60   2.819       5     2.48    3.11       3.76      0.75
11/4/2002   27.87  10199400      22.25  69.45   3.277       5     2.49    3.12       3.11      0.62
11/5/2002   27.72   7174500      22.13  71.50   2.594       5     2.58    3.23       2.77      0.55
11/6/2002   27.80   6334400      22.19  70.30   3.089       5     2.53    3.17       2.05      0.41
11/7/2002   26.77   6267400      21.37  68.25   2.989       5     2.55    3.19       2.10      0.42
11/8/2002   27.21   8021400      21.72  69.50   2.279       5     2.55    3.20       3.52      0.70
11/11/2002  26.20   5076400      20.91  67.60   1.665       5     2.58    3.23       3.05      0.61
11/12/2002  25.01  12683200      19.96  69.30    2.65       5     2.77    3.47       4.79      0.96
11/13/2002  24.32  12866800      19.41  67.35   4.057       5     2.77    3.47       3.17      0.63
11/14/2002  24.40   7596800      19.48  68.35   2.499       5     2.80    3.51       3.04      0.61
11/15/2002  25.19   7715400      20.11  69.30   3.209       5     2.75    3.45       2.40      0.48
11/18/2002  25.64   8173000      20.47  67.55   5.065       5     2.63    3.30       1.61      0.32
11/19/2002  25.48   6724500      20.34  27.20  16.506       1     1.07    1.34       0.41      0.41
11/20/2002  26.20   9795800      20.91  27.66  13.585       1     1.06    1.32       0.72      0.72
11/21/2002  27.48  11146700      21.94  28.00  10.683       1     1.02    1.28       1.04      1.04
11/22/2002  27.55   8017700      21.99  27.97   8.639       1     1.02    1.27       0.93      0.93
11/25/2002  28.40   8141600      22.67  27.99   8.436       1     0.99    1.23       0.97      0.97
11/26/2002  27.45   7918400      21.91  27.83    8.28       1     1.01    1.27       0.96      0.96
11/27/2002  28.73   5098900      22.93  28.00   11.55       1     0.97    1.22       0.44      0.44
11/29/2002  28.50   3332300      22.75  28.04   2.157       1     0.98    1.23       1.54      1.54
12/2/2002   28.35   7492300      22.63  28.00   5.366       1     0.99    1.24       1.40      1.40
12/3/2002   26.93   8025100      21.50  28.08   5.273       1     1.04    1.31       1.52      1.52
12/4/2002   26.45   8882400      21.11  28.06   4.125       1     1.06    1.33       2.15      2.15
12/5/2002   26.25   6162700      20.95  27.95   3.943       1     1.06    1.33       1.56      1.56
12/6/2002   26.43   6156600      21.10  28.00   3.927       1     1.06    1.33       1.57      1.57
12/9/2002   25.69   5830800      20.51  27.89   4.453       1     1.09    1.36       1.31      1.31
12/10/2002  25.70   5328900      20.51  26.64   7.496       1     1.04    1.30       0.71      0.71

As you can see, the numbers are very different. Looking at the "adjustment" column for both series, we see that Yahoo is adjusting the price for the dividends (in "Adj Close") whereas Factset has a share multiplier (in "Adjust") to adjust for splits. However, neither the ratios of the unadjusted nor the adjusted prices seem consistent. Not only this, but the ratios of the share volume numbers, which should be affected by splits but not dividends, seem to vary too much to be able to simply reconcile the two series.
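The four ratio columns can be reproduced from the other six. The sketch below uses the first three rows of the table (10/30, 10/31 and 11/1/2002). The variable names and the rounding helper round2 are hypothetical, and reading the Factset volume as millions of shares is an inference from the size of the ratios rather than something the data source states.

```j
NB. First three rows of the AT&T table.
ycls=: 25.96 25.66 27.25            NB. Yahoo Close
yvol=: 6618700 6984600 10612200     NB. Yahoo Volume
yadj=: 20.72 20.48 21.75            NB. Yahoo Adj Close
fcls=: 66.00 65.20 67.60            NB. Factset Close
fvol=: 1.908 3.095 2.819            NB. Factset Volume (assumed millions)
fadj=: 5 5 5                        NB. Factset Adjust

round2=: 0.01 * <.@(0.5 + 100&*)    NB. round to two decimal places

r1=: round2 fcls % ycls                  NB. FPx/YPx
r2=: round2 fcls % yadj                  NB. F/Yadj
r3=: round2 yvol % 1e6 * fvol            NB. Yvol/Fvol
r4=: round2 yvol % 1e6 * fvol * fadj     NB. YV/FVAdj
```

These match the table: r1 is 2.54 2.54 2.48, r2 is 3.19 3.18 3.11, r3 is 3.47 2.26 3.76 and r4 is 0.69 0.45 0.75.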

Concentrating on the volume numbers as they should have fewer confounding factors, we can compare the day-to-day changes in volume to see if there is some obvious correspondence or consistent difference between the series.
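A day-to-day percentage change is each value divided by the previous one, minus one. As a sketch, here it is applied to the first three Yahoo and Factset volume figures from the AT&T table; pctchg, round2 and the variable names are hypothetical.

```j
pctchg=: 100 * <:@(}. % }:)         NB. percent change from the previous day
round2=: 0.01 * <.@(0.5 + 100&*)    NB. round to two decimal places

yvol=: 6618700 6984600 10612200     NB. Yahoo Volume, 10/30 to 11/1/2002
fvol=: 1.908 3.095 2.819            NB. Factset Volume, same dates

ychg=: round2 pctchg yvol           NB. Yahoo changes for 10/31 and 11/1
fchg=: round2 pctchg fvol           NB. Factset changes for 10/31 and 11/1
```

Here ychg is 5.53 51.94 and fchg is 62.21 _8.92. Because the change is a ratio of successive values, the different units of the two volume series do not matter.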

Compare Rates of Change Between Two Datasets
Sources: Yahoo Finance and Factset/Compustat
Date        Yahoo Close  Yahoo Volume  Yahoo AdjClose  Factset Close  Factset Volume
10/31/2002       -1.16%         5.53%          -1.16%         -1.21%          62.21%
11/1/2002         6.20%        51.94%           6.20%          3.68%          -8.92%
11/4/2002         2.28%        -3.89%           2.30%          2.74%          16.25%
11/5/2002        -0.54%       -29.66%          -0.54%          2.95%         -20.84%
11/6/2002         0.29%       -11.71%           0.27%         -1.68%          19.08%
11/7/2002        -3.71%        -1.06%          -3.70%         -2.92%          -3.24%
11/8/2002         1.64%        27.99%           1.64%          1.83%         -23.75%
11/11/2002       -3.71%       -36.71%          -3.73%         -2.73%         -26.94%
11/12/2002       -4.54%       149.85%          -4.54%          2.51%          59.16%
11/13/2002       -2.76%         1.45%          -2.76%         -2.81%          53.09%
11/14/2002        0.33%       -40.96%           0.36%          1.48%         -38.40%
11/15/2002        3.24%         1.56%           3.23%          1.39%          28.41%
11/18/2002        1.79%         5.93%           1.79%         -2.53%          57.84%
11/19/2002       -0.62%       -17.72%          -0.64%        -59.73%         225.88%
11/20/2002        2.83%        45.67%           2.80%          1.69%         -17.70%
11/21/2002        4.89%        13.79%           4.93%          1.23%         -21.36%
11/22/2002        0.25%       -28.07%           0.23%         -0.11%         -19.13%
11/25/2002        3.09%         1.55%           3.09%          0.07%          -2.35%
11/26/2002       -3.35%        -2.74%          -3.35%         -0.57%          -1.85%
11/27/2002        4.66%       -35.61%           4.66%          0.61%          39.49%
11/29/2002       -0.80%       -34.65%          -0.78%          0.14%         -81.32%
12/2/2002        -0.53%       124.84%          -0.53%         -0.14%         148.77%
12/3/2002        -5.01%         7.11%          -4.99%          0.29%          -1.73%
12/4/2002        -1.78%        10.68%          -1.81%         -0.07%         -21.77%
12/5/2002        -0.76%       -30.62%          -0.76%         -0.39%          -4.41%
12/6/2002         0.69%        -0.10%           0.72%          0.18%          -0.41%
12/9/2002        -2.80%        -5.29%          -2.80%         -0.39%          13.39%
12/10/2002        0.04%        -8.61%           0.00%         -4.48%          68.34%

Graphing the pair of daily volume change series:

volumeChangesComparison.png

As we can see, there is a general correspondence, but some puzzling differences remain. However, the biggest difference seems to occur on the date of the reverse split (11/19/2002), so it can be explained by that.

This work continues here.