How zipline loads data

0. zipline commands related to bundles

zipline bundles shows the data that has already been ingested:

zipline bundles

zipline clean removes old data that is no longer needed. In the following command, replace bundle and int with your own values:

zipline clean [-b bundle] --keep-last int

zipline ingest fetches data from the specified source. In the following command, replace yourkey and bundle with your own values:

QUANDL_API_KEY=yourkey zipline ingest [-b bundle]

1. How zipline uses this data

The _run function is passed the parameters bundle and bundle_timestamp, and together these two parameters specify a data set. Inside _run we see this code:

if bundle is not None:
    bundle_data = bundles.load(
        bundle,
        environ,
        bundle_timestamp,
    )

    prefix, connstr = re.split(
        r'sqlite:///',
        str(bundle_data.asset_finder.engine.url),
        maxsplit=1,
    )
    if prefix:
        raise ValueError(
            "invalid url %r, must begin with 'sqlite:///'" %
            str(bundle_data.asset_finder.engine.url),
        )
    env = TradingEnvironment(asset_db_path=connstr, environ=environ)
    first_trading_day =\
        bundle_data.equity_minute_bar_reader.first_trading_day
    data = DataPortal(
        env.asset_finder,
        trading_calendar=trading_calendar,
        first_trading_day=first_trading_day,
        equity_minute_reader=bundle_data.equity_minute_bar_reader,
        equity_daily_reader=bundle_data.equity_daily_bar_reader,
        adjustment_reader=bundle_data.adjustment_reader,
    )
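
The URL check in the middle of this snippet can be tried in isolation. The following is a minimal sketch rather than zipline code: the helper name sqlite_path_from_url and the example paths are assumptions made for illustration. It shows that re.split with maxsplit=1 leaves an empty prefix only when the URL begins with 'sqlite:///'.

```python
import re


def sqlite_path_from_url(url):
    """Return the part of a database URL that follows 'sqlite:///'.

    Hypothetical helper that mirrors the check in _run; it is not part
    of zipline.
    """
    # When the URL starts with the marker, the split yields ['', path].
    # When the marker is missing entirely, the split yields one element
    # and the unpacking itself raises ValueError.
    prefix, connstr = re.split(r'sqlite:///', url, maxsplit=1)
    if prefix:
        raise ValueError(
            "invalid url %r, must begin with 'sqlite:///'" % url
        )
    return connstr


path = sqlite_path_from_url('sqlite:////tmp/assets.sqlite')
```

The recovered path is what _run hands to TradingEnvironment as asset_db_path.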

First the data is loaded, then a TradingEnvironment is built, then a DataPortal is built. Let's first look at the load function:

def load(name, environ=os.environ, timestamp=None):
    """Loads a previously ingested bundle.

    Parameters
    ----------
    name : str
        The name of the bundle.
    environ : mapping, optional
        The environment variables. Defaults of os.environ.
    timestamp : datetime, optional
        The timestamp of the data to lookup.
        Defaults to the current time.

    Returns
    -------
    bundle_data : BundleData
        The raw data readers for this bundle.
    """
    if timestamp is None:
        timestamp = pd.Timestamp.utcnow()
    timestr = most_recent_data(name, timestamp, environ=environ)
    return BundleData(
        asset_finder=AssetFinder(
            asset_db_path(name, timestr, environ=environ),
        ),
        equity_minute_bar_reader=BcolzMinuteBarReader(
            minute_equity_path(name, timestr, environ=environ),
        ),
        equity_daily_bar_reader=BcolzDailyBarReader(
            daily_equity_path(name, timestr, environ=environ),
        ),
        adjustment_reader=SQLiteAdjustmentReader(
            adjustment_db_path(name, timestr, environ=environ),
        ),
    )
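
The role of the timestamp argument can be pictured with a small sketch. This is an assumption about the behavior suggested by the docstring and by the name most_recent_data, not zipline's actual implementation: the function most_recent_at_or_before and the sample dates are hypothetical. The idea is that, among several ingestions of the same bundle, the newest one that is not later than the requested timestamp is chosen.

```python
from datetime import datetime


def most_recent_at_or_before(ingest_times, timestamp):
    """Pick the newest ingestion time that is not later than timestamp.

    Hypothetical stand-in for the lookup that load delegates to
    most_recent_data; the real function may differ in its details.
    """
    candidates = [t for t in ingest_times if t <= timestamp]
    if not candidates:
        raise ValueError("no data ingested on or before %s" % timestamp)
    return max(candidates)


# Three made-up ingestion times for one bundle.
ingests = [
    datetime(2018, 1, 1),
    datetime(2018, 2, 1),
    datetime(2018, 3, 1),
]
chosen = most_recent_at_or_before(ingests, datetime(2018, 2, 15))
```

Here chosen is the February ingestion, because the March one is later than the requested timestamp.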

load returns four datasets: two are stored in SQLite (the asset data and the adjustment data) and the other two are stored in Bcolz (the equity_minute_bar data and the equity_daily_bar data). Why both storage formats are used will be discussed later; for now we only need to know that these four datasets are loaded here. After loading, _run constructs a TradingEnvironment, which also holds the asset-related information. The code is as follows:

class TradingEnvironment(object):
    ...

    def __init__(
        self,
        load=None,
        bm_symbol='SPY',
        exchange_tz="US/Eastern",
        trading_calendar=None,
        asset_db_path=':memory:',
        future_chain_predicates=CHAIN_PREDICATES,
        environ=None,
    ):

        ...

        if isinstance(asset_db_path, string_types):
            asset_db_path = 'sqlite:///' + asset_db_path
            self.engine = engine = create_engine(asset_db_path)
        else:
            self.engine = engine = asset_db_path

        if engine is not None:
            AssetDBWriter(engine).init_db()
            self.asset_finder = AssetFinder(
                engine,
                future_chain_predicates=future_chain_predicates)
        else:
            self.asset_finder = None
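
The branch on asset_db_path can be mimicked without SQLAlchemy. The following is a minimal sketch under stated assumptions: resolve_asset_db is a hypothetical function, it returns the URL string instead of calling create_engine, and it checks for str where the original uses string_types.

```python
def resolve_asset_db(asset_db_path=':memory:'):
    """Mimic the asset_db_path branch of TradingEnvironment.__init__.

    Hypothetical helper: for a string it returns the URL that would be
    passed to create_engine; any other object is returned unchanged,
    as if it were already an engine (or None).
    """
    if isinstance(asset_db_path, str):
        return 'sqlite:///' + asset_db_path
    return asset_db_path


default_url = resolve_asset_db()
bundle_url = resolve_asset_db('/tmp/assets.sqlite')
no_engine = resolve_asset_db(None)
```

With the default argument the URL points at an in-memory database. With the path that _run extracted from the bundle, the 'sqlite:///' prefix that was stripped there is put back. Passing None leaves the engine as None, which is the case where asset_finder is set to None.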

Finally, let's focus on DataPortal, because the data that zipline uses is all tied to it. We will introduce it in the next section.
