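The first time you load data from mne's datasets module, the data has to be downloaded. Two things about this are annoying:
- The download is very slow and barely makes any progress
- The default path is ~/mne_data
Naturally we would rather download the data ourselves, put it in a folder of our own choosing, and then tell mne to stop downloading and just look in that folder.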
So how do we do that? First I tracked down the relevant code in mne: the _data_path function in mne/datasets/utils.py.
def _data_path(path=None, force_update=False, update_path=True, download=True,
               name=None, check_version=False, return_version=False,
               archive_name=None):
Inside it I found a few important pieces:
# try to match url->archive_name->folder_name
urls = dict(  # the URLs to use
    brainstorm=dict(
        bst_auditory='https://osf.io/5t9n8/download?version=1',
        bst_phantom_ctf='https://osf.io/sxr8y/download?version=1',
        bst_phantom_elekta='https://osf.io/dpcku/download?version=1',
        bst_raw='https://osf.io/9675n/download?version=2',
        bst_resting='https://osf.io/m7bd3/download?version=3'),
    fake='https://github.com/mne-tools/mne-testing-data/raw/master/'
         'datasets/foo.tgz',
    misc='https://codeload.github.com/mne-tools/mne-misc-data/'
         'tar.gz/%s' % releases['misc'],
    sample='https://osf.io/86qa2/download?version=4',
    somato='https://osf.io/tp4sg/download?version=5',
    spm='https://osf.io/je4s8/download?version=2',
    testing='https://codeload.github.com/mne-tools/mne-testing-data/'
            'tar.gz/%s' % releases['testing'],
    multimodal='https://ndownloader.figshare.com/files/5999598',
    opm='https://osf.io/p6ae7/download?version=2',
    visual_92_categories=[
        'https://osf.io/8ejrs/download?version=1',
        'https://osf.io/t4yjp/download?version=1'],
    mtrf='https://osf.io/h85s2/download?version=1',
    kiloword='https://osf.io/qkvf9/download?version=1',
    fieldtrip_cmc='https://osf.io/j9b6s/download?version=1',
    phantom_4dbti='https://osf.io/v2brw/download?version=1',
)
# filename of the resulting downloaded archive (only needed if the URL
# name does not match resulting filename)
archive_names = dict(
    fieldtrip_cmc='SubjectCMC.zip',
    kiloword='MNE-kiloword-data.tar.gz',
    misc='mne-misc-data-%s.tar.gz' % releases['misc'],
    mtrf='mTRF_1.5.zip',
    multimodal='MNE-multimodal-data.tar.gz',
    opm='MNE-OPM-data.tar.gz',
    sample='MNE-sample-data-processed.tar.gz',
    somato='MNE-somato-data.tar.gz',
    spm='MNE-spm-face.tar.gz',
    testing='mne-testing-data-%s.tar.gz' % releases['testing'],
    visual_92_categories=['MNE-visual_92_categories-data-part1.tar.gz',
                          'MNE-visual_92_categories-data-part2.tar.gz'],
    phantom_4dbti='MNE-phantom-4DBTi.zip',
)
# original folder names that get extracted (only needed if the
# archive does not extract the right folder name; e.g., usually GitHub)
folder_origs = dict(  # not listed means None (no need to move)
    misc='mne-misc-data-%s' % releases['misc'],
    testing='mne-testing-data-%s' % releases['testing'],
)
# finally, where we want them to extract to (only needed if the folder name
# is not the same as the last bit of the archive name without the file
# extension)
folder_names = dict(
    brainstorm='MNE-brainstorm-data',
    fake='foo',
    misc='MNE-misc-data',
    mtrf='mTRF_1.5',
    sample='MNE-sample-data',
    testing='MNE-testing-data',
    visual_92_categories='MNE-visual_92_categories-data',
    fieldtrip_cmc='MNE-fieldtrip_cmc-data',
    phantom_4dbti='MNE-phantom-4DBTi',
)
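After some testing, I can confirm:
- urls: the download URL for each dataset
- archive_names: the filename of the downloaded archive
- folder_names: the directory name the data is extracted to
The logic is that if the expected folder_names entry is not found under the given directory, the download is started. That means we can perfectly well download the archive by hand and extract it wherever we like.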
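The check described above can be sketched roughly as follows. This is only an illustration of the idea, not mne's actual code: `needs_download` is a hypothetical helper name, and the small table copies just the `sample` entry from the excerpt.

```python
import os
import tempfile

# Entry copied from the folder_names table in the excerpt (only `sample`).
FOLDER_NAMES = {"sample": "MNE-sample-data"}


def needs_download(base_dir, name):
    """Return True if the extracted folder for `name` is missing in base_dir.

    Hypothetical helper: it mirrors the idea of the check, not mne's code.
    """
    folder = os.path.join(base_dir, FOLDER_NAMES[name])
    return not os.path.isdir(folder)


# Small demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    before = needs_download(tmp, "sample")  # folder is missing
    os.mkdir(os.path.join(tmp, "MNE-sample-data"))
    after = needs_download(tmp, "sample")   # folder is now present
    print(before, after)
```

So as long as a folder with the expected name already sits in the directory we pass in, nothing should need to be fetched.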
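Take sample as an example. Paste the corresponding link into a browser to get the real download URL, then download it with Xunlei (it turns out Xunlei is still the fastest for this kind of thing; admittedly I have the premium membership, but it is not expensive anyway) and name the file MNE-sample-data-processed.tar.gz. Save it to the directory D:\Data\DataBase\EEGData\mne_data, extract it, and you get: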
/mnt/d/Data/DataBase/EEGData/mne_data ⌚ 22:34:54
$ tree -L 2
.
├── MNE-sample-data
│   ├── MEG
│   ├── SSS
│   ├── subjects
│   └── version.txt
└── MNE-sample-data-processed.tar.gz
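The extraction step can also be done from Python with the standard tarfile module. The sketch below is an assumption about how one might do it, not what was actually used here: `extract_archive` is a hypothetical helper, and the demonstration builds a tiny stand-in archive with the same top-level layout instead of touching the real data.

```python
import os
import tarfile
import tempfile


def extract_archive(archive_path, target_dir):
    """Extract a .tar.gz archive into target_dir; return the top-level names."""
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(target_dir)
    return sorted(os.listdir(target_dir))


# Demonstration with a tiny stand-in archive (not the real dataset).
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "src", "MNE-sample-data")
    os.makedirs(src)
    with open(os.path.join(src, "version.txt"), "w") as f:
        f.write("placeholder\n")

    archive = os.path.join(tmp, "MNE-sample-data-processed.tar.gz")
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(src, arcname="MNE-sample-data")

    target = os.path.join(tmp, "mne_data")
    os.makedirs(target)
    extracted = extract_archive(archive, target)
    has_version = os.path.isfile(
        os.path.join(target, "MNE-sample-data", "version.txt"))
    print(extracted, has_version)
```

For the real archive, `archive_path` and `target_dir` would be the downloaded file and the mne_data directory shown above.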
Then open a Python prompt (IPython here) and point mne at the new path.
In [13]: datasets.sample.data_path(r'D:\Data\DataBase\EEGData\mne_data')
Attempting to create new mne-python configuration file:
C:\Users\ZuoYiping\.mne\mne-python.json
Out[13]: 'D:\\Data\\DataBase\\EEGData\\mne_data\\MNE-sample-data'
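The message in the output suggests the chosen path is saved to mne-python.json, so later calls can find the data without passing the path again. Below is a minimal sketch for checking that. It assumes the file is a flat JSON object and that the key is `MNE_DATASETS_SAMPLE_PATH`; neither is shown in the output above, and `read_sample_path` is a hypothetical helper.

```python
import json
import os
import tempfile


def read_sample_path(config_file, key="MNE_DATASETS_SAMPLE_PATH"):
    """Return the stored sample-data path, or None if it is not set.

    The key name and the flat-JSON layout are assumptions.
    """
    if not os.path.isfile(config_file):
        return None
    with open(config_file, encoding="utf-8") as f:
        config = json.load(f)
    return config.get(key)


# Demonstration with a stand-in config file in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    config_file = os.path.join(tmp, "mne-python.json")
    missing = read_sample_path(config_file)  # file does not exist yet
    with open(config_file, "w", encoding="utf-8") as f:
        json.dump(
            {"MNE_DATASETS_SAMPLE_PATH": r"D:\Data\DataBase\EEGData\mne_data"},
            f)
    stored = read_sample_path(config_file)
    print(missing, stored)
```

On the real machine, `config_file` would be the path printed in the session above.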
Done!