VectorBT Pro - Plotting Indicators and Visualising Strategy with Cleaned Signals
11 min read

VectorBT Pro - Plotting Indicators and Visualising Strategy with Cleaned Signals

VectorBT Pro - Plotting Indicators and Visualising Strategy with Cleaned Signals

In this blog we will see how to visualize our Double Bollinger Band strategy along with the indicators and the cleaned entries/exits from the simulation. You will master your VectorBT Pro plotting skills by creating your own plot_strategy() function and learn how to go from a basic plot like this ⬇

H4_OHLCV_Only

TO this Fancy Advanced Plot 🎉 😎 with stacked figures, entries/exits and layered indicators

Final_Strategy_Plot

Plotting - Basics 👶

As before we will start with the global settings we will use everywhere for our plotting like this dark theme and figure / width.

## Global Plot Settings for vectorBT
vbt.settings.set_theme("dark")
vbt.settings['plotting']['layout']['width'] = 1280

OHLCV Plot

The code to get a basic OHLCV plot is shown below, to get custom plot titles and other attributes you pass it in the kwargs. To make some sensible visualization and not get a super condensed plot we will use .iloc to slice a small sample of the dataframe.

## Plot OHLCV data first
kwargs1 = {"title_text" : "OHLCV Plot", "title_font_size" : 18}
h4_ohlc_sample = h4_df[["Open", "High", "Low", "Close"]].iloc[100:200]#.dropna()
f = h4_ohlc_sample.vbt.ohlcv.plot(**kwargs1)
f.show()

H4_OHLCV_Only

This gives the plot you saw earlier, and yes we see some ugly gaps in the candlestick data, which we will see how to fix later. Let's try to add the Bollinger Bands indicator on top of this basic candlestick plot

OHLCV Plot with Bollinger Bands

h4_bbands.iloc[100:200].plot(fig = f,
                            lowerband_trace_kwargs=dict(fill=None, name = 'BB_Price_Lower'), 
                            upperband_trace_kwargs=dict(fill=None, name = 'BB_Price_Upper'),
                            middleband_trace_kwargs=dict(fill=None, name = 'BB_Price_Middle')).show()

H4_OHLCV_with_2BB

The fundamental concept you have to understand in layering elements on a figure is to reference the parent figure, notice how the fig = f in the above Bollinger Band plot is referencing the parent figure object f we first created. Also notice that child objects inherit the styling and other attributes kwargs1 you passed to the parent object. You can ofcourse over-ride them by placing another **kwargs in the plot() function call.

Adding RSI on Stacked SubPlots

So we learnt how to add an indicator to the figure, but what if we want to create stacked subplots, like adding an RSI Indicator below the previous plot. This is essentially done using the vbt.make_subplots() function and the use of the add_trace_kwargs argument inside the plot() function.

kwargs1 = {"title_text" : "H4 OHLCV with BBands on Price and RSI", "title_font_size" : 18, 
           "legend" : dict(yanchor="top",y=0.99, xanchor="right",x= 0.25)}

fig = vbt.make_subplots(rows=2,cols=1, shared_xaxes=True, vertical_spacing=0.1)

## Sliced Data
h4_price = h4_df[["Open", "High", "Low", "Close"]]
indices = slice(100,200)
h4_price.iloc[indices].vbt.ohlcv.plot(add_trace_kwargs=dict(row=1, col=1),  fig=fig, **kwargs1) 
h4_bbands.iloc[indices].plot(add_trace_kwargs=dict(row=1, col=1),fig=fig,
                            lowerband_trace_kwargs=dict(fill=None, name = 'BB_Price_Lower'), 
                            upperband_trace_kwargs=dict(fill=None, name = 'BB_Price_Upper'),
                            middleband_trace_kwargs=dict(fill=None, name = 'BB_Price_Middle'))

h4_rsi.iloc[indices].rename("RSI").vbt.plot(add_trace_kwargs=dict(row=2, col=1),fig=fig, **kwargs1 )

h4_bbands_rsi.iloc[indices].plot(add_trace_kwargs=dict(row=2, col=1),limits=(25, 75),fig=fig,
                            lowerband_trace_kwargs=dict(fill=None, name = 'BB_RSI_Lower'), 
                            upperband_trace_kwargs=dict(fill=None, name = 'BB_RSI_Upper'),
                            middleband_trace_kwargs=dict(fill=None, name = 'BB_RSI_Middle'),
                            # xaxis=dict(rangeslider_visible=True) ## Without Range Slider
                            )

fig.update_xaxes(rangebreaks=[dict(values=dt_breaks)])
fig.layout.showlegend = False
fig.show()

Stacked_RSI_Plot


Plotting - Advanced 💪

In this section, we will see how to create your own plot_strategy with all the customizations you would want. Please read the inline comments in the code block below to understand the inner workings and details of the various commands

def plot_strategy(slice_lower : str, slice_upper: str, df : pd.DataFrame , rsi : pd.Series,
                  bb_price : vbt.indicators.factory, bb_rsi : vbt.indicators.factory,  
                  pf: vbt.portfolio.base.Portfolio, entries: pd.Series = None, 
                  exits: pd.Series = None,
                  show_legend : bool = True):
    """Creates a stacked indicator plot for the 2BB strategy.
    Parameters
    ===========
    slice_lower : str, start date of dataframe slice in yyyy.mm.dd format
    slice_upper : str, start date of dataframe slice in yyyy.mm.dd format
    df          : pd.DataFrame, containing the OHLCV data
    rsi         : pd.Series, rsi indicator time series in same freq as df
    bb_price    : vbt.indicators.factory.talib('BBANDS'), computed on df['close'] price
    bb_rsi      : vbt.indicators.factory.talib('BBANDS') computer on RSI
    pf          : vbt.portfolio.base.Portfolio, portfolio simulation object from VBT Pro
    entries     : pd.Series, time series data of long entries
    exits       : pd.Series, time series data of long exits
    show_legend : bool, switch to show or completely hide the legend box on the plot
    
    Returns
    =======
    fig         : plotly figure object
    """
    kwargs1 = {"title_text" : "H4 OHLCV with BBands on Price and RSI", 
               "title_font_size" : 18,
               "height" : 960,
               "legend" : dict(yanchor="top",y=0.99, xanchor="left",x= 0.1)}
    fig = vbt.make_subplots(rows=2,cols=1, shared_xaxes=True, vertical_spacing=0.1)
    ## Filter Data according to date slice
    df_slice = df[["Open", "High", "Low", "Close"]][slice_lower : slice_upper]
    bb_price = bb_price[slice_lower : slice_upper]
    rsi = rsi[slice_lower : slice_upper]
    bb_rsi = bb_rsi[slice_lower : slice_upper]

    ## Retrieve datetime index of rows where price data is NULL
    # retrieve the dates that are in the original datset
    dt_obs = df_slice.index.to_list()
    # Drop rows with missing values
    dt_obs_dropped = df_slice['Close'].dropna().index.to_list()
    # store  dates with missing values
    dt_breaks = [d for d in dt_obs if d not in dt_obs_dropped]

    ## Plot Figures
    df_slice.vbt.ohlcv.plot(add_trace_kwargs=dict(row=1, col=1),  fig=fig, **kwargs1) ## Without Range Slider
    rsi.rename("RSI").vbt.plot(add_trace_kwargs=dict(row=2, col=1), trace_kwargs = dict(connectgaps=True), fig=fig) 

    bb_line_style = dict(color="white",width=1, dash="dot")
    bb_price.plot(add_trace_kwargs=dict(row=1, col=1),fig=fig, **kwargs1,
                lowerband_trace_kwargs=dict(fill=None, name = 'BB_Price_Lower', connectgaps=True, line = bb_line_style), 
                upperband_trace_kwargs=dict(fill=None, name = 'BB_Price_Upper', connectgaps=True, line = bb_line_style),
                middleband_trace_kwargs=dict(fill=None, name = 'BB_Price_Middle', connectgaps=True) )

    bb_rsi.plot(add_trace_kwargs=dict(row=2, col=1),limits=(25, 75),fig=fig,
                lowerband_trace_kwargs=dict(fill=None, name = 'BB_RSI_Lower', connectgaps=True,line = bb_line_style), 
                upperband_trace_kwargs=dict(fill=None, name = 'BB_RSI_Upper', connectgaps=True,line = bb_line_style),
                middleband_trace_kwargs=dict(fill=None, name = 'BB_RSI_Middle', connectgaps=True, visible = False))
    
    ## Plots Long Entries / Exits and Short Entries / Exits
    pf[slice_lower:slice_upper].plot_trade_signals(add_trace_kwargs=dict(row=1, col=1),fig=fig,
                                                   plot_close=False, plot_positions="lines")

    ## Plot Trade Profit or Loss Boxes
    pf.trades.direction_long[slice_lower : slice_upper].plot(
                                        add_trace_kwargs=dict(row=1, col=1),fig=fig,
                                        plot_close = False,
                                        plot_markers = False
                                        )
                                        

    pf.trades.direction_short[slice_lower : slice_upper].plot(
                                            add_trace_kwargs=dict(row=1, col=1),fig=fig,
                                            plot_close = False,
                                            plot_markers = False
                                            )

    if (entries is not None) & (exits is not None):
        ## Slice Entries and Exits
        entries = entries[slice_lower : slice_upper]
        exits = exits[slice_lower : slice_upper]
        ## Add Entries and Long Exits on RSI in lower subplot
        entries.vbt.signals.plot_as_entries(rsi, fig = fig,
                                                add_trace_kwargs=dict(row=2, col=1),
                                                trace_kwargs=dict(name = "Long Entry", 
                                                                  marker=dict(color="limegreen")
                                                                  ))  

        exits.vbt.signals.plot_as_exits(rsi, fig = fig, 
                                            add_trace_kwargs=dict(row=2, col=1),
                                            trace_kwargs=dict(name = "Short Entry", 
                                                              marker=dict(color="red"),
                                                            #   showlegend = False ## To hide this from the legend
                                                              )
                                            )

    fig.update_xaxes(rangebreaks=[dict(values=dt_breaks)])
    fig.layout.showlegend = show_legend  
    # fig.write_html(f"2BB_Strategy_{slice_lower}_to_{slice_upper}.html")
    
    return fig
slice_lower = '2019.11.01'
slice_higher = '2019.12.31'

fig = plot_strategy(slice_lower, slice_higher, h4_data.get(), h4_rsi, 
                           h4_bbands, h4_bbands_rsi, pf,
                           clean_h4_entries, clean_h4_exits,
                           show_legend = True)

# fig.show_svg()
fig.show()

Final_Strategy_Plot

Notes:

  • We are using the H4 timeframe data for our plotting, as the 15T baseline data series results in a very dense looking plot which slows down the interactive plot.
  • We are passing a slice of the time series dataframe in order to avoid a dense, highly cluttered plot. You can make this interactive also if you were to use plotly-dash (a topic for another blog post perhaps 😃 )
  • The gaps in the OHLCV candlesticks on weekends are fixed by storing the dates with missing values in the dt_breaks variable and passing this in the rangebreaks=[dict(values=dt_breaks)] argument in the fig.update_xaxes() method to filter our the missing dates in the x-axes.
  • We fixed the gaps occuring in the bollinger bands (due to weekend market closures) by using the connectgaps = True argument in each _trace_kwargs of the bollinger bands.
  • Note, that you may wonder why we are naming the legends entries and exits in the RSI subplot below as Long Entries and Short Entries and not four distinct labels as in the first subplot.
    • This is because since we set direction = both in the pf.simulation the Short Entry is basically an Exit for the previous Long position and similarly, a Long Entry is basically an exit for the previous short position. This type of nomenclature setting makes the plot also usable for other settings like when you would want to set direction = longonly in the pf.simulation in which case you can just call these legend items Long Entries and Long Exits on the RSI subplot.

Cleaning entries and exits

You might have noticed in the last code block we were using clean_h4_entries and clean_h4_exits. Lets understand why we did that? Before we plot any entries and exits for visual analysis, it is imperative to clean them. We will continue with the entries and exits at the end of our previous tutorial Strategy Development and Signal Generation in order to explain how we can go about cleaning and resampling entry & exit signals.

entries = mtf_df.signal == 1.0
exits = mtf_df.signal == -1.0

The total nr. of actual signals in the entries and exits array is computed by the vbt.signals.total() accessor method.

print(f"Length of Entries (array): {len(entries)} || Length of Exits (array): {len(exits)}" )
print(f"Total Nr. of Entry Signals: {entries.vbt.signals.total()} || \
        Total Nr. of Exit Signals: {exits.vbt.signals.total()}")

Output:

Length of Entries (array): 105188 || Length of Exits (array): 105188
Total Nr. of Entry Signals: 2135 || Total Nr. of Exit Signals: 1568

Why do you want to clean the entries and exits?

From the print statement output we can see that in the entire length of the dataframe 105188 on 15T frequency, we can see that the total number of raw entry and raw exit signals is 2135 and 1568 respectively, which is prior to any cleaning. Currently there are many duplicate signals in the entries and exits, and this discrepancy between entries and exits need to be resolved by cleaning.

Cleaning entry and exit signals is easily done by vbt.signals.clean() accessor and after cleaning each entry signal has a corresponding exit signal. When we validate the total number of clean entry and clean exit signals using the vbt.signals.total() method now, we can now see the cleaned entry and exit signals total to be 268 and 267 respectively.

## Clean redundant and duplicate signals
clean_entries, clean_exits = entries.vbt.signals.clean(exits)
print(f"Length of Clean_Entries Array: {len(clean_entries)} || Length of Clean_Exits Array: {len(clean_exits)}" )
print(f"Total nr. of Entry Signals in Clean_Entry Array: {clean_entries.vbt.signals.total()} || \
        Total nr. of Exit Signals in Clean_Exit Array: {clean_exits.vbt.signals.total()}")

Output

Length of Clean_Entries Array: 105188 || Length of Clean_Exits Array: 105188
Total nr. of Entry Signals in Clean_Entry Array: 268 || Total nr. of Exit Signals in Clean_Exit Array: 267

Could you think of a reason why there is a difference of one signal between the clean_entry and clean_exit signals ?

  • This is happening because the most recent (entry) position that was last opened, has not been closed out.

Cleaning Signals - Visual Difference

The below plots will make it easier when understanding what happens if we use uncleaned signals
Uncleaned Entries & Exits on our RSI Plot
unCleaned_RSI

Cleaned Entries & Exits on our RSI plot
Cleaned_RSI

Resampling entries and exits to H4 timeframe

Resampling any entries/exits is only required when we are concerned with analyzing (plotting, counting etc.) and visualizing our strategy and entries/exits on a timeframe different from the baseline frequency of our strategy. It is always recommended to first clean the raw entries/exits array and then do the resampling.

When do we need to upsample and downsample entries & exits?

  • Downsampling of entries/exists is required if we want to do any visual analysis on a timeframe higher than that our baseline frequency of our strategy.
    • For eg, in the below code we show a case of Downsampling our entries/exits on the 15m timeframe to H4 timeframe. Downsampling comes with the risk of information loss.
  • Upsampling of entries/exits is required, if we want to do any visual analysis on a timeframe lower than that our baseline frequency of our strategy.
    • For example, you have some crossover on 4h frequency and you want to combine the signals with some other crossover on 15 min frequency, then you would need to upsample signals from 4h->15min, which would cause no problems because there is no information loss in upsampling
clean_h4_entries = clean_entries.vbt.resample_apply("4h", "any", wrap_kwargs=dict(dtype=bool))
clean_h4_exits = clean_exits.vbt.resample_apply("4h", "any", wrap_kwargs=dict(dtype=bool))

print(f"Length of H4_Entries (array): {len(clean_h4_entries)} || Length of H4_Exits (array): {len(clean_h4_exits)}" )
print(f"Total nr. of H4_Entry Signals: {clean_h4_entries.vbt.signals.total()} || \
        Total nr. of H4_Exit Signals: {clean_h4_exits.vbt.signals.total()}")

Output

Length of H4_Entries (array): 6575 || Length of H4_Exits (array): 6575
Total nr. of H4_Entry Signals: 263 || Total nr. of H4_Exit Signals: 263

Information Loss during downsampling

From the above output we see the nr. of signals on the clean_h4_entries and clean_h4_exits as 263 respectively, this shows a loss of information during this Downsampling process. This information loss occurs because by aggregating signals during downsampling, our data becomes less granular.

Key Points

  • How information loss occurs during downsampling of entries/exits?
    • Let's say we have [entry, entry, exit, exit, entry, entry] in our signal array, after cleaning we'll get [entry, exit, entry], but after aggregating the original signal sequence you'll get just {entry:4 ; exit:2}, which clearly cannot after cleaning produce the same sequence as on the smaller (more granular) timeframe.
    • The problem here is that the information loss occured during downsampling the cleaned entries and exits, ignores any exit that could have closed the position if you back tested on the original timeframe (15T in our example), that is why we end up seeing 263 || 263 in the above print statement.
  • After downsampling if we want to retrace the order of signals and we want to investigate "Which signal came first in the original timeframe ?", we encounter a irresolvible problem. We can't resample 1m (minutely) data to one year timeframe and then expect the signal count to be the same even though our new data is all fitting in only one bar (1D candle).

By changing the method to sum and removing the wrap_kwargs argument in the vbt.resample_apply() method we can aggregate the signals and show the aggregated nr. of signals.

## sum() will aggregate the signals
h4_entries_agg = clean_entries.vbt.resample_apply("4h", "sum")
h4_exits_agg = clean_exits.vbt.resample_apply("4h", "sum")

## h4_extries_agg is not a vbt object so vbt.signals.total() accessor will not work 
## and thus we use the pandas sum() method in the print statement below
print(f"Length of H4_Entries (array): {len(h4_entries_agg)} || Length of H4_Exits (array): {len(h4_exits_agg)}" )
print(f"Aggregated H4_Entry Signals: {int(h4_entries_agg.sum())} || Aggregated H4_Exit Signals: {int(h4_exits_agg.sum())}")

Output

Length of H4_Entries (array): 6575 || Length of H4_Exits (array): 6575
Aggregated H4_Entry Signals: 268   || Aggregated H4_Exit Signals: 267

Though the result of this aggregation shows the same result we got above when printing the clean_entries.vbt.signals.total() and clean_exits.vbt.signals.total(), this does not help us in inferring the true order of the signals and the information loss created by downsampling cannot be compensated by aggregation.

Do we have to resample entries/exits for running the backtesting simulation?

When you are doing portfolio simulation (backtesting), using the from_signals() which we see in the previous blog post - Strategy Development and Signal Generation, the from_signals method automatically cleans the entries and exits , therefore there is no reason to do resampling or cleaning of any entries and exits.
Irrespective of whether we use the clean entries and exits or just the regular entries and exits in the simulation/backtest when running from_signals() it will make no difference. You can try this for yourself and see the results will be the same.