Are there factors that influence the direction of the market?

Executive summary

We found 2 factors that significantly influence the direction of the S&P futures market: time of the day and downtrends.

The baseline win rate for the S&P futures is 51.6%, but we found time periods with an average win rate of 55.4%, and others with an average as low as 47%. We also found that the market has a mean-reversion tendency: when a 30-minute candle goes down, the next candle has a 53% chance of winning, versus 50% in other circumstances.

We used these two factors to build a system with a 54% win rate, compared with the 51.6% baseline. The system is also more robust to downturns.

Detailed summary

We built a model using the findings mentioned above. The basic idea was to go long during time frames with a 55.4% average win rate and short during time frames with a 47% average win rate. The rationale is that by consistently holding both long and short positions, we should in theory be able to catch both upward and downward movements of the market.

The new model was tested using historical data. Its win rate fell below 50% only 9.6% of the time, versus 33% for the baseline. On the upside, it had a win rate of 55% or more 40.7% of the time, versus 1.8% for the baseline.

Baseline

New Model

Model Building Methodology

In this section, we will explain in detail the methodology we used to build the model.

Definitions

Dependent variable

We used the win rate (%) as the dependent variable of our model. A candle is considered positive, or "won", when the difference between the Close price and the Open price is positive (Close - Open > 0). The win rate is the number of won candles divided by the total number of candles.
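As a minimal sketch of the definition above, the win rate can be computed from a list of (Open, Close) pairs. The function name `win_rate` and the sample prices are hypothetical and only serve as an illustration; they are not taken from our dataset.

```python
from typing import Iterable, Tuple


def win_rate(candles: Iterable[Tuple[float, float]]) -> float:
    """Share of candles whose Close is strictly above their Open."""
    candles = list(candles)
    if not candles:
        raise ValueError("need at least one candle")
    # A candle is "won" only when Close - Open > 0; a flat candle does not count.
    wins = sum(1 for open_, close in candles if close - open_ > 0)
    return wins / len(candles)


# Hypothetical (Open, Close) pairs, for illustration only.
sample = [(100.0, 101.0), (101.0, 100.5), (100.5, 100.5), (100.5, 102.0)]
rate = win_rate(sample)
print(f"{rate:.1%}")
```

Note that under this definition a candle with Close equal to Open counts as not won.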

Independent variables

We came up with a few variables that could help predict the direction and that we had not yet tested systematically:
Quantile size of the previous time period: the size of the candle
Calculation: the absolute value of (High - Low)/Open.
Top 1% of biggest candles, top 2%, top 3%, etc.
We don't care whether the candle is winning or losing; we only care about the size of the candle.
Quantile profit of the previous time period: the size of the profit in %, regardless of direction
Calculation: the absolute value of (Close - Open)/Open.
Top 1%, top 2%, etc. of the biggest profits in absolute percentage terms.
We don't care whether the candle is winning or losing; we only care about the size of the profit.
The Hausdorff / Fractal Dimension: explained in the previous article
Weekday: Monday, Tuesday, etc
Time of the day: 9:30, 10:30, 11:00 etc
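The two quantile variables in the list above can be sketched as follows. This is an illustration under assumptions: the helper names (`candle_size`, `abs_profit`, `top_percent_bucket`), the tie handling, and the sample candles are all hypothetical, and the exact bucketing used in the analysis may differ.

```python
from typing import List


def candle_size(open_: float, high: float, low: float) -> float:
    """Relative candle size: |High - Low| / Open."""
    return abs(high - low) / open_


def abs_profit(open_: float, close: float) -> float:
    """Relative profit without direction: |Close - Open| / Open."""
    return abs(close - open_) / open_


def top_percent_bucket(values: List[float]) -> List[int]:
    """For each value, the smallest k such that the value is in the top k%.

    Ties are broken by original order (an arbitrary assumption).
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i], reverse=True)
    buckets = [0] * n
    for rank, i in enumerate(order):  # rank 0 is the largest value
        buckets[i] = ((rank + 1) * 100 + n - 1) // n  # integer ceiling
    return buckets


# Hypothetical (Open, High, Low, Close) candles, for illustration only.
candles = [
    (100.0, 102.0, 99.0, 101.0),
    (100.0, 101.0, 100.0, 100.5),
    (200.0, 210.0, 200.0, 205.0),
    (50.0, 50.5, 49.5, 50.0),
]
sizes = [candle_size(o, h, l) for o, h, l, c in candles]
size_buckets = top_percent_bucket(sizes)
print(size_buckets)
```

Because both variables describe the previous time period, the bucket of candle i-1 would be used as the feature for candle i.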

In total, we have 8 categories to be tested:
Quantile size of the previous time period
Quantile profit of previous time period
The Hausdorff / Fractal Dimension
Weekday
Time of the day
Position difference
Position direction
Position category

New Variable Definition: Candle Position

Markets are very chaotic and are difficult to analyze using traditional analytics methods. To solve this problem, we came up with a candle classification method to simplify the data.

With this method, any candle is divided into four equal parts, numbered from bottom to top. Each candle is given two numbers: the first corresponds to the part of the candle where the Open was, the second to the part where the Close was. For instance, in the candle below, the Open was in part 4, and the Close was also in part 4. Therefore, we call this candle '4-4'.

Below are a few other examples. It is very important to note that this refers only to the shape of the candle.
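The classification can be sketched in a few lines. The function names `part_of` and `candle_position` are hypothetical, and two details are assumptions not stated in the text: a price sitting exactly on a boundary is assigned to the upper part, and a candle with High equal to Low is rejected because it cannot be divided.

```python
def part_of(price: float, low: float, high: float) -> int:
    """Return the quarter (1 = bottom, 4 = top) of the Low-High range holding price."""
    if high <= low:
        raise ValueError("candle has no range (High must be above Low)")
    frac = (price - low) / (high - low)  # 0.0 at the Low, 1.0 at the High
    # int() truncates; min() keeps a price equal to the High in part 4.
    return min(int(frac * 4) + 1, 4)


def candle_position(open_: float, high: float, low: float, close: float) -> str:
    """Label a candle as 'O-C', the parts containing the Open and the Close."""
    return f"{part_of(open_, low, high)}-{part_of(close, low, high)}"


# Hypothetical candles with Low = 100 and High = 104, for illustration only.
print(candle_position(103.5, 104.0, 100.0, 103.25))
print(candle_position(100.5, 104.0, 100.0, 103.5))
print(candle_position(101.5, 104.0, 100.0, 100.5))
```

The label depends only on where the Open and Close sit relative to the High and Low, which is why it describes the shape of the candle and not its price level.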

Using this classification, we can subdivide candles into more categories. The first is called the "position difference", which is the absolute value of the difference between the 2 values:

Candles 1-4 and 4-1 have a position difference of 3
4-2, 3-1, 1-3, 2-4 have a difference of 2
1-2, 2-3, 3-4, 4-3, 3-2, 2-1 have a difference of 1
1-1, 2-2, 3-3, 4-4 have a difference of 0.

Fun fact, candles that have a position difference of 0 closely resemble what technical analysis describes as "Hammer", "Dojis", "Hanging Man" etc. Here are some examples of candles.

We also added a candle direction. Candles with a position difference of 0 are called "equal", those whose Close is in a higher part than the Open are called "up", and those whose Close is in a lower part are called "down". Very original...
"Equal": 1-1, 2-2, 3-3, 4-4
"Up": 1-2, 1-3, 1-4, 2-3, 2-4, 3-4
"Down": 4-1, 4-3, 4-2, 3-2, 3-1, 2-1
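Both derived variables follow directly from the position label. A minimal sketch, assuming the label is a string such as '4-2'; the function names are hypothetical:

```python
def position_difference(position: str) -> int:
    """Absolute difference between the Open part and the Close part."""
    open_part, close_part = (int(p) for p in position.split("-"))
    return abs(open_part - close_part)


def position_direction(position: str) -> str:
    """'up' if the Close part is above the Open part, 'down' if below, else 'equal'."""
    open_part, close_part = (int(p) for p in position.split("-"))
    if close_part > open_part:
        return "up"
    if close_part < open_part:
        return "down"
    return "equal"


for label in ("1-4", "4-2", "3-3"):
    print(label, position_difference(label), position_direction(label))
```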

Statistical Analysis

We assessed the 8 factors mentioned previously and ran some basic statistical tests to assess their significance. Two factors proved to be particularly significant: position direction and time of the day.

We were surprised to find that when a candle goes down, the following candle has a higher chance of being positive than otherwise. We would have expected the opposite. This seems to imply a certain tendency of the market to revert to the mean.
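One way to check this kind of effect is to group the outcome of each candle by the position direction of the candle before it. The sketch below assumes two parallel lists, one with the direction of each candle and one with whether it was won; the function name and the sample data are hypothetical and do not reproduce our figures.

```python
from collections import defaultdict
from typing import Dict, List


def next_win_rate_by_direction(directions: List[str], wins: List[bool]) -> Dict[str, float]:
    """Win rate of candle i+1, grouped by the position direction of candle i."""
    totals: Dict[str, int] = defaultdict(int)
    won: Dict[str, int] = defaultdict(int)
    # Pair each candle's direction with the outcome of the following candle.
    for prev_direction, next_win in zip(directions, wins[1:]):
        totals[prev_direction] += 1
        won[prev_direction] += int(next_win)
    return {d: won[d] / totals[d] for d in totals}


# Hypothetical series, for illustration only.
directions = ["down", "up", "down", "equal", "up"]
wins = [False, True, False, True, True]
rates = next_win_rate_by_direction(directions, wins)
print(rates)
```

The last candle has no successor, so its direction is not used.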

The position category provides a drill-down into the different candle configurations and their respective win rates. We can see, for instance, that even though the "up" position direction has a low win rate, the categories "1-2" and "2-3", which are both upward, actually have a positive win rate percentage.

Another factor that proved to be particularly robust was the time of the day. We identified five half-hour time periods that had a very strong tendency to be winning (55.4% win rate vs. the 51.6% baseline). Conversely, certain time periods have a low win rate (47% on average).
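The per-period win rates can be obtained by grouping candles by their start time. A minimal sketch, assuming each candle is a (timestamp, Open, Close) tuple; the function name and the sample candles are hypothetical:

```python
from collections import defaultdict
from datetime import datetime
from typing import Dict, Iterable, Tuple


def win_rate_by_time(candles: Iterable[Tuple[datetime, float, float]]) -> Dict[str, float]:
    """Win rate per time-of-day slot, keyed by the candle start time as 'HH:MM'."""
    totals: Dict[str, int] = defaultdict(int)
    wins: Dict[str, int] = defaultdict(int)
    for timestamp, open_, close in candles:
        slot = timestamp.strftime("%H:%M")
        totals[slot] += 1
        wins[slot] += int(close > open_)
    return {slot: wins[slot] / totals[slot] for slot in sorted(totals)}


# Hypothetical 30-minute candles over two days, for illustration only.
sample = [
    (datetime(2021, 3, 1, 9, 30), 100.0, 101.0),
    (datetime(2021, 3, 1, 10, 0), 101.0, 100.0),
    (datetime(2021, 3, 2, 9, 30), 100.0, 100.5),
    (datetime(2021, 3, 2, 10, 0), 100.5, 101.0),
]
by_time = win_rate_by_time(sample)
print(by_time)
```

With only a few candles per slot the rates are very noisy, so a large history is needed before comparing a slot to the baseline.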

Even though some time frames seem to be highly profitable, this doesn't come without problems. Below is a plot of the distribution of weekly win rates for high win rate hours vs. the baseline. At a glance, we can see that the weekly win rate varies much more for high win rate time frames than for the baseline. The reason is that their weekly sample size is much smaller.

Nevertheless, if we look at the green vertical line on the graph, which represents the mean win rate of the baseline, we can see that most of the data for high win rate time periods is distributed to the right of the baseline mean. Many times, we have seen weeks with a 60%, 70%, and even 80% win rate. On the other hand, there have been occurrences where the win rate went as low as 20%. This is rather concerning: a win rate as low as 30% can mean a big drop in the P&L.

To show the high variability of the win rate, we plotted the weekly win rate over time. Red represents values below a 50% win rate and green represents values above it. On the y axis, a value of 2.5 means 50 + 2.5 = 52.5%, and -2.5 means 50 - 2.5 = 47.5%.

For this kind of time series, we prefer using a 4-week rolling average: it simulates the equivalent of a monthly win rate and also averages out outliers. The result is still very noisy. Some periods of high win rates follow periods of very low win rates.
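The two transformations described above, the deviation from 50% and the 4-week rolling average, can be sketched as follows. The weekly win rates are hypothetical values chosen for illustration, and the function names are not from our code.

```python
from typing import List


def deviation_from_50(weekly_win_rates_pct: List[float]) -> List[float]:
    """Express each weekly win rate (in %) as a distance from 50%."""
    return [rate - 50.0 for rate in weekly_win_rates_pct]


def rolling_mean(values: List[float], window: int = 4) -> List[float]:
    """Simple rolling average; the first window - 1 points have no value and are omitted."""
    if window < 1:
        raise ValueError("window must be at least 1")
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


# Hypothetical weekly win rates in %, for illustration only.
weekly = [60.0, 40.0, 55.0, 45.0, 70.0, 30.0]
deviations = deviation_from_50(weekly)
smoothed = rolling_mean(weekly, window=4)
print(deviations)
print(smoothed)
```

In this made-up series the weekly values swing between 30% and 70%, while the 4-week average stays between 50% and 52.5%, which is the smoothing effect we are after.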

Testing the data

We combined position direction and time frames to try to generate a model with a high success rate. Our idea is to go long during the time frames with a high win rate and short during those with a low win rate. Hopefully, this will catch both upswings and downswings and create a more robust model.
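The time frame part of this idea can be sketched as a simple rule. Everything specific here is an assumption: the slot sets `LONG_SLOTS` and `SHORT_SLOTS` are placeholders and not the time periods we identified, the function names are hypothetical, and the position direction filter is left out because its exact combination with the time frames is not detailed in this article.

```python
from typing import Iterable, Optional, Tuple

# Placeholder slots (assumptions for illustration, not the slots found in the study).
LONG_SLOTS = {"10:00", "15:30"}
SHORT_SLOTS = {"11:30"}


def signal(slot: str) -> int:
    """+1 = go long, -1 = go short, 0 = stay out."""
    if slot in LONG_SLOTS:
        return 1
    if slot in SHORT_SLOTS:
        return -1
    return 0


def strategy_win_rate(candles: Iterable[Tuple[str, float, float]]) -> Optional[float]:
    """Win rate over traded candles; a short wins when the candle closes lower."""
    trades = wins = 0
    for slot, open_, close in candles:
        side = signal(slot)
        if side == 0:
            continue  # no position outside the selected slots
        trades += 1
        if side * (close - open_) > 0:
            wins += 1
    return wins / trades if trades else None


# Hypothetical (slot, Open, Close) candles, for illustration only.
sample = [
    ("10:00", 100.0, 101.0),  # long, candle up: win
    ("11:30", 101.0, 100.5),  # short, candle down: win
    ("12:00", 100.5, 101.0),  # not traded
    ("15:30", 101.0, 100.0),  # long, candle down: loss
]
result = strategy_win_rate(sample)
print(result)
```

Measuring the win rate of the trades, and not of the candles, is what lets a short position on a losing candle count as a win.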

For result discussion, see the detailed summary above.

Conclusion

This model is very encouraging for the future. The next step will be to work on profitability, for instance by clarifying how to handle position sizing, targets, and stop losses. We may also use a more advanced machine learning algorithm to improve the direction prediction accuracy.