Dry goods丨How to use Orca to develop quantitative strategies?

This article describes an example of momentum strategy. Momentum strategy is one of the most famous quantitative long and short-term stock strategies. Since Jegadeesh and Titman (1993) first proposed this concept, it has been widely used in academic research and sales. Investors believe in momentum strategies that, among individual stocks, past winners will surpass past losers.

The most commonly used momentum factor is the stock's earnings after excluding the most recent month and the past 12 months. In academic publications, momentum strategies are usually adjusted once a month and the holding period is also one month. In this example, we rebalance 1/21 of our portfolio every day and hold new shares for 21 days. For simplicity, we do not consider transaction costs.

Step 1: Load the stock trading data, clean and filter the data, and then construct a momentum signal for each company's stock in the past 12 months, ignoring the latest month.

def load_price_data(df):
    USstocks = df[df.date.dt.weekday.between(0, 4), df.PRC.notnull(), df.VOL.notnull()][
                   ['PERMNO', 'date', 'PRC', 'VOL', 'RET', 'SHROUT']
               ].sort_values(by=['PERMNO', 'date'])
    USstocks['PRC'] = USstocks.PRC.abs()
    USstocks['MV'] = USstocks.SHROUT * USstocks.PRC
    USstocks['cumretIndex'] = (USstocks + 1)['RET'].groupby('PERMNO', lazy=True).cumprod()
    USstocks['signal'] = (USstocks.shift(21) / USstocks.shift(252) - 1).groupby(
                            'PERMNO', lazy=True)['cumretIndex'].transform()
    return USstocks


df = orca.read_csv('C:/DolphinDB/Orca/databases/USstocks.csv')
price_data = load_price_data(df)

Note: The above code uses two extended functions of Orca.

  1. Orca supports the use of commas instead of & in conditional filter statements, which will be more efficient in some scenarios. See the tutorial .
  2. Orca's groupbyfunctions provide lazy parameters, which transformcan be used in conjunction with the function to realize the context by function of DolphinDB. See the tutorial .

Step 2: Generate a portfolio for the momentum strategy.

First, select the tradable stocks that meet the following conditions: no missing momentum signal value, positive trading volume on the day, market value of more than 100 million US dollars, and a price of more than 5 US dollars per share.

def gen_trade_tables(df):
    USstocks = df[(df.PRC > 5), (df.MV > 100000), (df.VOL > 0), (df.signal.notnull())]
    USstocks = USstocks[['date', 'PERMNO', 'MV', 'signal']].sort_values(by='date')
    return USstocks


tradables = gen_trade_tables(price_data)

Secondly, based on momentum signals, 10 groups of tradable stocks are formulated. Only the 2 most extreme groups (winners and losers) are kept. Suppose we always want to be long by $1 a day and short by $1 a day in 21 days, so we are long $1/21 in the winner group and $1/21 per sky in the loser group. In each group, we can use equal weight or value weight to calculate the weight of each stock on the formation date of the portfolio.

def form_portfolio(start_date, end_date, tradables, holding_days, groups, wt_scheme):
    ports = tradables[tradables.date.between(start_date, end_date)].groupby('date').filter('count(PERMNO) >= 100')
    ports['rank'] = ports.groupby('date')['signal'].transform('rank{
   
   {,true,{groups}}}'.format(groups=groups))
    ports['wt'] = 0.0
    
    ports_rank_eq_0 = (ports['rank'] == 0)
    ports_rank_eq_groups_sub_1 = (ports['rank'] == groups-1)
    if wt_scheme == 1:
        ports.loc[ports_rank_eq_0, 'wt'] = \
            ports[ports_rank_eq_0].groupby(['date'])['PERMNO'].transform(
                r'(PERMNO->-1\count(PERMNO)\{holding_days})'.format(holding_days=holding_days)
            )
        ports.loc[ports_rank_eq_groups_sub_1, 'wt'] = \
            ports[ports_rank_eq_groups_sub_1].groupby(['date'])['PERMNO'].transform(
                r'(PERMNO->1\count(PERMNO)\\{holding_days})'.format(holding_days=holding_days)
            )
    elif wt_scheme == 2:
        ports.loc[ports_rank_eq_0, 'wt'] = \
            ports[ports_rank_eq_0].groupby(['date'])['MV'].transform(
                r'(MV->-MV\sum(MV)\{holding_days})'.format(holding_days=holding_days)
            )
        ports.loc[ports_rank_eq_groups_sub_1, 'wt'] = \
            ports[ports_rank_eq_groups_sub_1].groupby(['date'])['MV'].transform(
                r'(MV->MV\sum(MV)\{holding_days})'.format(holding_days=holding_days)
            )
    ports = ports.loc[ports.wt != 0, ['PERMNO', 'date', 'wt']].sort_values(by=['PERMNO', 'date'])
    ports.rename(columns={'date': 'tranche'}, inplace=True)
    return ports


start_date, end_date = orca.Timestamp("1996.01.01"), orca.Timestamp("2017.01.01")
holding_days = 5
groups = 10
ports = form_portfolio(start_date, end_date, tradables, holding_days, groups, 2)
daily_rtn = price_data.loc[price_data.date.between(start_date, end_date), ['date', 'PERMNO', 'RET']]

Note: The above code uses the extended function of Orca, that is, allows filtertransformand other higher-order functions to accept a string representing a DolphinDB function or script. See tutorial .

Step 3: Calculate the profit/loss for the subsequent 21 days for each stock in our portfolio. The stocks will be closed 21 days after the portfolio formation date.

def calc_stock_pnl(ports, daily_rtn, holding_days, end_date, last_days):
    dates = ports[['tranche']].drop_duplicates().sort_values(by='tranche')

    dates_after_ages = orca.DataFrame()
    for age in range(1, holding_days+1):
        dates_after_age_i = dates.copy()
        dates_after_age_i['age'] = age
        dates_after_age_i['date_after_age'] = dates_after_age_i['tranche'].shift(-age)
        dates_after_ages.append(dates_after_age_i, inplace=True)

    pos = ports.merge(dates_after_ages, on='tranche')
    pos = pos.join(last_days, on='PERMNO')
    pos = pos.loc[(pos.date_after_age.notnull() & (pos.date_after_age <= pos.last_day.clip(upper=end_date))),
                  ['date_after_age', 'PERMNO', 'tranche', 'age', 'wt']]
    pos = pos.compute()
    pos.rename(columns={'date_after_age': 'date', 'wt': 'expr'}, inplace=True)
    pos['ret'] = 0.0
    pos['pnl'] = 0.0

    # use set_index to make it easy to equal join two Frames
    daily_rtn.set_index(['date', 'PERMNO'], inplace=True)
    pos.set_index(['date', 'PERMNO'], inplace=True)
    pos['ret'] = daily_rtn['RET']
    pos.reset_index(inplace=True)
    pos['expr'] = (pos.expr * (1 + pos.ret).cumprod()).groupby(
                    ['PERMNO', 'tranche'], lazy=True).transform()
    pos['pnl'] = pos.expr * pos.ret / (1 + pos.ret)

    return pos


last_days = price_data.groupby('PERMNO')['date'].max()
last_days.rename("last_day", inplace=True)
stock_pnl = calc_stock_pnl(ports, daily_rtn, holding_days, end_date, last_days)

Note: The above code has a pos.compute()sentence that directly calculates the result of an intermediate expression (DataFrame with conditional filtering). Because we only need to assign values ​​to the filtered results.

In addition, the above code calls a set_indexfunction on two DataFrames , and then assigns the columns of one DataFrame to another, which is similar to left join the two DataFrames by index column, and then assigns the corresponding column in the result. If the script pos['ret'] = pos.merge(daily_rtn, on=['date', 'PERMNO'])['RET']is executed directly , a join will be made during calculation and another join will be made during assignment, which will cause unnecessary calculations.

Step 4: Calculate the profit/loss of the portfolio and plot the cumulative return of the momentum strategy over time.

port_pnl = stock_pnl.groupby('date')['pnl'].sum()
cumulative_return = port_pnl.cumsum()
cumulative_return.plot()
plt.show()

Note: The plot function will download the DolphinDB table corresponding to the entire DataFrame to the client, and then align the drawing. You should pay attention to the amount of data when using it to avoid performance problems caused by a large number of network transmissions. See tutorial .

Click here to view the complete code. For how to use the DolphinDB script to implement this strategy, please refer to the official example .

Guess you like

Origin blog.csdn.net/qq_41996852/article/details/111467359