UC Berkeley AI Project: Learning MindsDB

Copyright notice: Wang Jialin's 2018 book 《SPARK大数据商业实战三部曲》, published by Tsinghua University Press. WeChat official account: 从零起步学习人工智能. Original post: https://blog.csdn.net/duan_zhihua/article/details/86582740

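MindsDB aims to make it very simple for developers to use artificial neural networks in their projects. It is built for anyone who has access to data, so that a deep learning neural network can be trained and used with a few lines of code.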

GitHub repository of the UC Berkeley AI Project MindsDB: https://github.com/mindsdb/mindsdb#mindsdb

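Contents of this post:

  • Installing and deploying MindsDB
  • Data source format of the MindsDB example
  • A simple MindsDB code example
  • Results of running the example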

1. Install and deploy MindsDB

G:\ProgramData\Anaconda3\Scripts>pip3 install mindsdb --user
Collecting mindsdb
  Downloading https://files.pythonhosted.org/packages/80/e1/9ef3cc2e6157fb85456b912c6e4b9dc13dbce1c74bf7211de800351cf4f7/mindsdb-0.8.9.8-py3-none-any.whl (112kB)
    100% |████████████████████████████████| 122kB 74kB/s
Collecting attrs>=18.2.0 (from mindsdb)
  Downloading https://files.pythonhosted.org/packages/3a/e1/5f9023cc983f1a628a8c2fd051ad19e76ff7b142a0faf329336f9a62a514/attrs-18.2.0-py2.py3-none-any.whl
Collecting scikit-learn>=0.20.0 (from mindsdb)
  Downloading https://files.pythonhosted.org/packages/c1/1c/8fa5aefe23a2fc254e9faadc10a30052c63d92f05fb59127ff0e65e4171c/scikit_learn-0.20.2-cp36-cp36m-win_amd64.whl (4.8MB)
    100% |████████████████████████████████| 4.8MB 68kB/s
Requirement already satisfied: Jinja2>=2.10 in g:\programdata\anaconda3\lib\site-packages (from mindsdb) (2.10)
Collecting Pillow>=5.3.0 (from mindsdb)
  Downloading https://files.pythonhosted.org/packages/ec/ca/7af5b6628ecf770645f8cc3c9da3c2bb5c5ffc7384a9ff0666fdb818b4d5/Pillow-5.4.1-cp36-cp36m-win_amd64.whl (1.9MB)
    100% |████████████████████████████████| 1.9MB 19kB/s
Requirement already satisfied: torch>=0.4.1 in g:\programdata\anaconda3\lib\site-packages (from mindsdb) (0.4.1)
Collecting incremental>=17.5.0 (from mindsdb)
  Downloading https://files.pythonhosted.org/packages/f5/1d/c98a587dc06e107115cf4a58b49de20b19222c83d75335a192052af4c4b7/incremental-17.5.0-py2.py3-none-any.whl
Collecting PyHamcrest>=1.9.0 (from mindsdb)
  Downloading https://files.pythonhosted.org/packages/9a/d5/d37fd731b7d0e91afcc84577edeccf4638b4f9b82f5ffe2f8b62e2ddc609/PyHamcrest-1.9.0-py2.py3-none-any.whl (52kB)
    100% |████████████████████████████████| 61kB 27kB/s
Requirement already satisfied: Flask>=1.0.2 in g:\programdata\anaconda3\lib\site-packages (from mindsdb) (1.0.2)
Collecting requests>=2.20.0 (from mindsdb)
  Downloading https://files.pythonhosted.org/packages/7d/e3/20f3d364d6c8e5d2353c72a67778eb189176f08e873c9900e10c0287b84b/requests-2.21.0-py2.py3-none-any.whl (57kB)
    100% |████████████████████████████████| 61kB 62kB/s
Requirement already satisfied: python-dateutil>=2.7.3 in g:\programdata\anaconda3\lib\site-packages (from mindsdb) (2.7.3)
Collecting urllib3>=1.23 (from mindsdb)
  Downloading https://files.pythonhosted.org/packages/62/00/ee1d7de624db8ba7090d1226aebefab96a2c71cd5cfa7629d6ad3f61b79e/urllib3-1.24.1-py2.py3-none-any.whl (118kB)
    100% |████████████████████████████████| 122kB 93kB/s
Collecting pandas>=0.23.4 (from mindsdb)
  Cache entry deserialization failed, entry ignored
  Downloading https://files.pythonhosted.org/packages/0e/67/def5bfaf4d3324fdb89048889ec523c0903c5efab1a64c8dbe0ac8eec13c/pandas-0.23.4-cp36-cp36m-win_amd64.whl (7.7MB)
    100% |████████████████████████████████| 7.7MB 25kB/s
Requirement already satisfied: itsdangerous>=0.24 in g:\programdata\anaconda3\lib\site-packages (from mindsdb) (0.24)
Collecting txaio>=18.8.1 (from mindsdb)
  Downloading https://files.pythonhosted.org/packages/e9/6d/e1a6f7835cde86728e5bb1f577be9b2d7d273fdb33c286e70b087d418ded/txaio-18.8.1-py2.py3-none-any.whl
Collecting idna>=2.7 (from mindsdb)
  Downloading https://files.pythonhosted.org/packages/14/2c/cd551d81dbe15200be1cf41cd03869a46fe7226e7450af7a6545bfc474c9/idna-2.8-py2.py3-none-any.whl (58kB)
    100% |████████████████████████████████| 61kB 63kB/s
Collecting tinydb>=3.11.1 (from mindsdb)
  Downloading https://files.pythonhosted.org/packages/d9/2b/98040184cfbf03113736a160ea35aa92dc3619312ba5a4d6cafaf7f81c73/tinydb-3.12.2-py2.py3-none-any.whl
Requirement already satisfied: six>=1.11.0 in c:\users\lenovo\appdata\roaming\python\python36\site-packages (from mindsdb) (1.11.0)
Requirement already satisfied: Werkzeug>=0.14.1 in c:\users\lenovo\appdata\roaming\python\python36\site-packages (from mindsdb) (0.14.1)
Requirement already satisfied: chardet>=3.0.4 in g:\programdata\anaconda3\lib\site-packages (from mindsdb) (3.0.4)
Requirement already satisfied: MarkupSafe>=1.0 in g:\programdata\anaconda3\lib\site-packages (from mindsdb) (1.0)
Collecting tinydb-serialization>=1.0.3 (from mindsdb)
  Downloading https://files.pythonhosted.org/packages/8e/d3/88d7ae1ad819fc7c73dfe8d76e4a73cc476e3024a51f9b763b794272c727/tinydb-serialization-1.0.4.zip
Collecting pymongo>=3.7.1 (from mindsdb)
  Downloading https://files.pythonhosted.org/packages/d8/25/44b0fc81668a883739b108d9bd0c95b24f0b0204cb2dc93e0f259e173670/pymongo-3.7.2-cp36-cp36m-win_amd64.whl (315kB)
    100% |████████████████████████████████| 317kB 4.8MB/s
Collecting tinymongo>=0.2.0 (from mindsdb)
  Downloading https://files.pythonhosted.org/packages/a6/37/a44a36381c30ecdbba6adc435c4e2e0bb387b5907c45f92da03e7a81c261/tinymongo-0.2.0.tar.gz
Requirement already satisfied: setuptools>=21.2.1 in c:\users\lenovo\appdata\roaming\python\python36\site-packages (from mindsdb) (39.1.0)
Collecting python-engineio>=2.3.1 (from mindsdb)
  Downloading https://files.pythonhosted.org/packages/60/f9/9a53733a18c34ed0f279654bca2de7729ba893dfb3c9ec134d7cab7cf614/python_engineio-3.2.3-py2.py3-none-any.whl (115kB)
    100% |████████████████████████████████| 122kB 82kB/s
Requirement already satisfied: torchvision>=0.2.1 in g:\programdata\anaconda3\lib\site-packages (from mindsdb) (0.2.1)
Collecting cython>=0.29.2 (from mindsdb)
  Downloading https://files.pythonhosted.org/packages/a9/8e/40bae51b44ff4ac399c96a7ad9e17da07f0d96d8493c1e3ea29e1b6db420/Cython-0.29.3-cp36-cp36m-win_amd64.whl (1.7MB)
    100% |████████████████████████████████| 1.7MB 85kB/s
Collecting sklearn>=0.0 (from mindsdb)
  Downloading https://files.pythonhosted.org/packages/1e/7a/dbb3be0ce9bd5c8b7e3d87328e79063f8b263b2b1bfa4774cb1147bfcd3f/sklearn-0.0.tar.gz
Collecting openpyxl>=2.5.8 (from mindsdb)
  Downloading https://files.pythonhosted.org/packages/08/8a/509eb6f58672288da9a5884e1cc7e90819bc8dbef501161c4b40a6a4e46b/openpyxl-2.5.12.tar.gz (173kB)
    100% |████████████████████████████████| 174kB 44kB/s
Collecting Twisted>=18.7.0 (from mindsdb)
  Downloading https://files.pythonhosted.org/packages/5d/0e/a72d85a55761c2c3ff1cb968143a2fd5f360220779ed90e0fadf4106d4f2/Twisted-18.9.0.tar.bz2 (3.1MB)
    100% |████████████████████████████████| 3.1MB 23kB/s
Collecting numpy>=1.15.2 (from mindsdb)
  Downloading https://files.pythonhosted.org/packages/31/7e/8905636f7e4f9b9d7078aa0e701500634f832f145855a11beb098d3b0fb1/numpy-1.16.0-cp36-cp36m-win_amd64.whl (11.9MB)
    100% |████████████████████████████████| 11.9MB 65kB/s
Collecting eventlet>=0.24.1 (from mindsdb)
  Downloading https://files.pythonhosted.org/packages/86/7e/96e1412f96eeb2f2eca9342dcc4d5bc9305880a448b603b0a8e54439b71c/eventlet-0.24.1-py2.py3-none-any.whl (219kB)
    100% |████████████████████████████████| 225kB 81kB/s
Collecting zope.interface>=4.5.0 (from mindsdb)
  Downloading https://files.pythonhosted.org/packages/da/08/726e3b0e3bd9912fb530f9864bf9a3af9f9f6a1dfd4cc7854ca14fdab441/zope.interface-4.6.0-cp36-cp36m-win_amd64.whl (133kB)
    100% |████████████████████████████████| 143kB 116kB/s
Collecting python-socketio>=2.0.0 (from mindsdb)
  Downloading https://files.pythonhosted.org/packages/8a/f2/f61d999bdc90c3d0f9c81df029bbb80c227b02d4e794beb38296b12fcfb0/python_socketio-3.1.1-py2.py3-none-any.whl (44kB)
    100% |████████████████████████████████| 51kB 39kB/s
Requirement already satisfied: xlrd>=0.9.0 in g:\programdata\anaconda3\lib\site-packages (from mindsdb) (1.1.0)
Collecting constantly>=15.1.0 (from mindsdb)
  Downloading https://files.pythonhosted.org/packages/b9/65/48c1909d0c0aeae6c10213340ce682db01b48ea900a7d9fce7a7910ff318/constantly-15.1.0-py2.py3-none-any.whl
Collecting wheel>=0.32.2 (from mindsdb)
  Downloading https://files.pythonhosted.org/packages/ff/47/1dfa4795e24fd6f93d5d58602dd716c3f101cfd5a77cd9acbe519b44a0a9/wheel-0.32.3-py2.py3-none-any.whl
Collecting pytz>=2018.5 (from mindsdb)
  Downloading https://files.pythonhosted.org/packages/61/28/1d3920e4d1d50b19bc5d24398a7cd85cc7b9a75a490570d5a30c57622d34/pytz-2018.9-py2.py3-none-any.whl (510kB)
    100% |████████████████████████████████| 512kB 47kB/s
Collecting Click>=7.0 (from mindsdb)
  Downloading https://files.pythonhosted.org/packages/fa/37/45185cb5abbc30d7257104c434fe0b07e5a195a6847506c074527aa599ec/Click-7.0-py2.py3-none-any.whl (81kB)
    100% |████████████████████████████████| 81kB 70kB/s
Requirement already satisfied: scipy>=1.1.0 in g:\programdata\anaconda3\lib\site-packages (from mindsdb) (1.1.0)
Collecting hyperlink>=18.0.0 (from mindsdb)
  Downloading https://files.pythonhosted.org/packages/a7/b6/84d0c863ff81e8e7de87cff3bd8fd8f1054c227ce09af1b679a8b17a9274/hyperlink-18.0.0-py2.py3-none-any.whl
Collecting Automat>=0.7.0 (from mindsdb)
  Downloading https://files.pythonhosted.org/packages/a3/86/14c16bb98a5a3542ed8fed5d74fb064a902de3bdd98d6584b34553353c45/Automat-0.7.0-py2.py3-none-any.whl
Collecting Flask-SocketIO>=3.0.2 (from mindsdb)
  Downloading https://files.pythonhosted.org/packages/7c/7f/b3f90a780209fd93f9c57a9f17de0a7a0030cf7c512b6d2027f71fd7570a/Flask_SocketIO-3.1.2-py2.py3-none-any.whl
Requirement already satisfied: certifi>=2017.4.17 in g:\programdata\anaconda3\lib\site-packages (from requests>=2.20.0->mindsdb) (2018.4.16)
Requirement already satisfied: jdcal in g:\programdata\anaconda3\lib\site-packages (from openpyxl>=2.5.8->mindsdb) (1.4)
Requirement already satisfied: et_xmlfile in g:\programdata\anaconda3\lib\site-packages (from openpyxl>=2.5.8->mindsdb) (1.0.1)
Requirement already satisfied: greenlet>=0.3 in g:\programdata\anaconda3\lib\site-packages (from eventlet>=0.24.1->mindsdb) (0.4.13)
Collecting monotonic>=1.4 (from eventlet>=0.24.1->mindsdb)
  Downloading https://files.pythonhosted.org/packages/ac/aa/063eca6a416f397bd99552c534c6d11d57f58f2e94c14780f3bbf818c4cf/monotonic-1.5-py2.py3-none-any.whl
Collecting dnspython>=1.15.0 (from eventlet>=0.24.1->mindsdb)
  Downloading https://files.pythonhosted.org/packages/ec/d3/3aa0e7213ef72b8585747aa0e271a9523e713813b9a20177ebe1e939deb0/dnspython-1.16.0-py2.py3-none-any.whl (188kB)
    100% |████████████████████████████████| 194kB 71kB/s
Building wheels for collected packages: tinydb-serialization, tinymongo, sklearn, openpyxl, Twisted
  Running setup.py bdist_wheel for tinydb-serialization ... done
  Stored in directory: C:\Users\lenovo\AppData\Local\pip\Cache\wheels\d1\52\d2\9716fc8ef4a1e571c45e0539ee7d2a386ee9e1217226725d7e
  Running setup.py bdist_wheel for tinymongo ... done
  Stored in directory: C:\Users\lenovo\AppData\Local\pip\Cache\wheels\37\04\f4\e86be5480a00ce5d590134ba9157bbc7b9daa91c8b8870aa78
  Running setup.py bdist_wheel for sklearn ... done
  Stored in directory: C:\Users\lenovo\AppData\Local\pip\Cache\wheels\76\03\bb\589d421d27431bcd2c6da284d5f2286c8e3b2ea3cf1594c074
  Running setup.py bdist_wheel for openpyxl ... done
  Stored in directory: C:\Users\lenovo\AppData\Local\pip\Cache\wheels\95\b0\38\e5d13093b588f87177df648c06d07d4b7221f2c17d544cde4c
  Running setup.py bdist_wheel for Twisted ... done
  Stored in directory: C:\Users\lenovo\AppData\Local\pip\Cache\wheels\57\2e\89\11ba83bc08ac30a5e3a6005f0310c78d231b96a270def88ca0
Successfully built tinydb-serialization tinymongo sklearn openpyxl Twisted
distributed 1.21.8 requires msgpack, which is not installed.
tensorflow 1.10.0 has requirement numpy<=1.14.5,>=1.13.3, but you'll have numpy 1.16.0 which is incompatible.
botocore 1.12.4 has requirement urllib3<1.24,>=1.20, but you'll have urllib3 1.24.1 which is incompatible.
Installing collected packages: attrs, numpy, scikit-learn, Pillow, incremental, PyHamcrest, idna, urllib3, requests, pytz, pandas, txaio, tinydb, tinydb-serialization, pymongo, tinymongo, python-engineio, cython, sklearn, openpyxl, zope.interface, constantly, Automat, hyperlink, Twisted, monotonic, dnspython, eventlet, python-socketio, wheel, Click, Flask-SocketIO, mindsdb
  Found existing installation: numpy 1.14.5
    Uninstalling numpy-1.14.5:
      Successfully uninstalled numpy-1.14.5
  Found existing installation: wheel 0.31.1
    Uninstalling wheel-0.31.1:
      Successfully uninstalled wheel-0.31.1
Successfully installed Automat-0.7.0 Click-7.0 Flask-SocketIO-3.1.2 Pillow-5.4.1 PyHamcrest-1.9.0 Twisted-18.9.0 attrs-18.2.0 constantly-15.1.0 cython-0.29.3 dnspython-1.16.0 eventlet-0.24.1 hyperlink-18.0.0 idna-2.8 incremental-17.5.0 mindsdb-0.8.9.8 monotonic-1.5 numpy-1.16.0 openpyxl-2.5.12 pandas-0.23.4 pymongo-3.7.2 python-engineio-3.2.3 python-socketio-3.1.1 pytz-2018.9 requests-2.21.0 scikit-learn-0.20.2 sklearn-0.0 tinydb-3.12.2 tinydb-serialization-1.0.4 tinymongo-0.2.0 txaio-18.8.1 urllib3-1.24.1 wheel-0.32.3 zope.interface-4.6.0
You are using pip version 18.0, however version 18.1 is available.
You should consider upgrading via the 'python -m pip install --upgrade pip' command.

G:\ProgramData\Anaconda3\Scripts>
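
The install log ends with three dependency warnings: distributed 1.21.8 is missing msgpack, and the new numpy and urllib3 versions fall outside the ranges that tensorflow 1.10.0 and botocore 1.12.4 ask for. Below is a minimal sketch for checking what is actually installed and whether a version sits inside a required range. It assumes Python 3.8 or later for `importlib.metadata` (the log above comes from Python 3.6, where the `importlib_metadata` backport would be needed). The helper names are hypothetical, and the comparison handles only plain dotted numeric versions.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_tuple(version):
    """Turn '1.16.0' into (1, 16, 0); non-numeric parts are skipped."""
    return tuple(int(part) for part in version.split(".") if part.isdigit())


def within(version, low, high):
    """True if low <= version <= high, compared numerically part by part."""
    return version_tuple(low) <= version_tuple(version) <= version_tuple(high)


# Report the packages mentioned in the install log.
for name in ("mindsdb", "numpy", "urllib3"):
    print(name, installed_version(name))

# tensorflow 1.10.0 asked for numpy>=1.13.3,<=1.14.5; the log installed 1.16.0.
print(within("1.16.0", "1.13.3", "1.14.5"))  # False
```

If the last check prints `False` for a package you rely on, pinning that package or using a separate environment for MindsDB is one way to avoid the conflict.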
 
 

2. Fork MindsDB from its GitHub repository, https://github.com/mindsdb/mindsdb, into your own GitHub account, then clone the fork to your local machine.

$ cd /d/PycharmProjects/git_UC_Berkeley_mindsdb

lenovo@duanzhihua MINGW64 /d/PycharmProjects/git_UC_Berkeley_mindsdb
$ git init
Initialized empty Git repository in D:/PycharmProjects/git_UC_Berkeley_mindsdb/.git/

lenovo@duanzhihua MINGW64 /d/PycharmProjects/git_UC_Berkeley_mindsdb (master)
$ git status
On branch master

No commits yet

nothing to commit (create/copy files and use "git add" to track)

lenovo@duanzhihua MINGW64 /d/PycharmProjects/git_UC_Berkeley_mindsdb (master)
$ git remote add origin https://github.com/duanzhihua/mindsdb.git

lenovo@duanzhihua MINGW64 /d/PycharmProjects/git_UC_Berkeley_mindsdb (master)
$ git remote -v
origin  https://github.com/duanzhihua/mindsdb.git (fetch)
origin  https://github.com/duanzhihua/mindsdb.git (push)

lenovo@duanzhihua MINGW64 /d/PycharmProjects/git_UC_Berkeley_mindsdb (master)
$ git clone https://github.com/duanzhihua/mindsdb.git
Cloning into 'mindsdb'...
remote: Enumerating objects: 10, done.
remote: Counting objects: 100% (10/10), done.
remote: Compressing objects: 100% (8/8), done.
Receiving objects:  52% (1178/2229), 10.46 MiB | 36.00 KiB/s
remote: Total 2229 (delta 2), reused 6 (delta 2), pack-reused 2219
Receiving objects: 100% (2229/2229), 14.10 MiB | 53.00 KiB/s, done.
Resolving deltas: 100% (1373/1373), done.

lenovo@duanzhihua MINGW64 /d/PycharmProjects/git_UC_Berkeley_mindsdb (master)
$

Next, we work through the example from the MindsDB repository.

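3. Data source for the MindsDB example

Data source: https://raw.githubusercontent.com/mindsdb/mindsdb/master/docs/examples/basic/home_rentals.csv

The data has the following fields: number of rooms, number of bathrooms, square footage, location, days on market, initial price, neighborhood, and rental price.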

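This example uses the MindsDB deep learning framework, which builds and trains a model automatically from a few lines of code, to predict the rental price from fields such as the number of rooms, number of bathrooms, and square footage.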
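
Before training, it can help to look at the columns and confirm that the target column parses as numbers. The sketch below does this with the standard `csv` module on a few made-up rows, so it runs offline. The rows are illustrative and are not taken from home_rentals.csv. The header names other than `number_of_rooms`, `number_of_bathrooms`, `sqft`, and `rental_price` are assumptions based on the field list in this post, and the helper names are hypothetical.

```python
import csv
import io
import urllib.request

DATA_URL = (
    "https://raw.githubusercontent.com/mindsdb/mindsdb/master/"
    "docs/examples/basic/home_rentals.csv"
)

# Made-up rows for illustration only; NOT taken from home_rentals.csv.
# Header names for location, days_on_market, initial_price and neighborhood
# are assumed from the field list in the post.
SAMPLE_CSV = """\
number_of_rooms,number_of_bathrooms,sqft,location,days_on_market,initial_price,neighborhood,rental_price
1,1,600,good,12,2000,downtown,1950
2,1,1190,great,5,3300,uptown,3250
3,2,1500,poor,40,3000,suburb,2800
"""


def fetch_text(url):
    """Download a text file (not called below, so the sketch runs offline)."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8")


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def numeric_column(rows, name):
    """Convert one column to floats; raises ValueError on non-numeric cells."""
    return [float(row[name]) for row in rows]


rows = read_rows(SAMPLE_CSV)
print(list(rows[0].keys()))
print(numeric_column(rows, "rental_price"))
```

To inspect the real file, `read_rows(fetch_text(DATA_URL))` would replace the sample text, provided the machine has network access.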

4. Code of the MindsDB example:

# -*- coding: utf-8 -*-
from mindsdb import MindsDB

# First we initiate MindsDB
mdb = MindsDB()

# We tell mindsDB what we want to learn and from what data
mdb.learn(
    from_data="https://raw.githubusercontent.com/mindsdb/mindsdb/master/docs/examples/basic/home_rentals.csv", # the path to the file where we can learn from, (note: can be url)
    predict='rental_price', # the column we want to learn to predict given all the data in the file
    model_name='home_rentals' # the name of this model
) 

# use the model to make predictions
result = mdb.predict(predict='rental_price', when={'number_of_rooms': 2,'number_of_bathrooms':1, 'sqft': 1190}, model_name='home_rentals')

# you can now print the results
print('The predicted price is ${price} with {conf} confidence'.format(price=result.predicted_values[0]['rental_price'], conf=result.predicted_values[0]['prediction_confidence']))

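5. Compared with writing the data loading, model construction, training, and prediction code for deep learning yourself, MindsDB wraps the whole workflow, so the code that calls its API is short and easy to use. Running the code in Spyder gives the output below: for a home with 2 rooms, 1 bathroom, and 1190 square feet, the predicted rental price is about $3307.38 (reported with 0.12 confidence).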

......
Test Error:0.023462874814867973, Accuracy:0.9894965839689579 | Best Accuracy so far: 0.9928742487986718
Test Error:0.023497262969613075, Accuracy:0.9894160102038013 | Best Accuracy so far: 0.9928742487986718
Test Error:0.02354746311903, Accuracy:0.9893318950384381 | Best Accuracy so far: 0.9928742487986718
Test Error:0.023581644520163536, Accuracy:0.9892346872510738 | Best Accuracy so far: 0.9928742487986718
Test Error:0.02364024706184864, Accuracy:0.9891667810935878 | Best Accuracy so far: 0.9928742487986718
Test Error:0.02365952543914318, Accuracy:0.9891101838926004 | Best Accuracy so far: 0.9928742487986718
Test Error:0.023698417469859123, Accuracy:0.9890618921359042 | Best Accuracy so far: 0.9928742487986718
Test Error:0.02374541200697422, Accuracy:0.9889779962731021 | Best Accuracy so far: 0.9928742487986718
Test Error:0.023786203935742378, Accuracy:0.9888827878312947 | Best Accuracy so far: 0.9928742487986718
Test Error:0.02381254732608795, Accuracy:0.9888063745529171 | Best Accuracy so far: 0.9928742487986718
Test Error:0.02384304068982601, Accuracy:0.988731764880387 | Best Accuracy so far: 0.9928742487986718
Test Error:0.02390199527144432, Accuracy:0.9885929118570124 | Best Accuracy so far: 0.9928742487986718
Test Error:0.02394930087029934, Accuracy:0.9885100610035826 | Best Accuracy so far: 0.9928742487986718
Loading model from store for retrain on new learning rate 0.001
Trained: model home_rentals [OK], TOTAL TIME: 448.31 seconds
[END] ModelTrainer, execution time: 448.314 seconds
[START] StatsLoader
[END] StatsLoader, execution time: 0.060 seconds
[START] DataExtractor
[END] DataExtractor, execution time: 0.003 seconds
[START] DataVectorizer
[END] DataVectorizer, execution time: 0.001 seconds
[START] ModelPredictor
Predict: model home_rentals, epoch 0
Starting model...
Inferring from model and data...
predicting batch...
Predict: model home_rentals [OK], TOTAL TIME: 0.09 seconds
[END] ModelPredictor, execution time: 0.089 seconds
The predicted price is $3307.38 with 0.12 confidence
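
For contrast with the few API calls used in this post, the steps that MindsDB wraps (loading data, building a model, training, predicting) can be sketched by hand. The example below is only an illustration of those steps: it swaps in a one-variable least-squares line in place of a neural network, it is not how MindsDB works internally, and the (sqft, rent) pairs are made up so that they lie exactly on rent = 2 * sqft + 500.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x


def predict(model, x):
    """Apply a fitted (slope, intercept) pair to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Step 1: "load" data (made-up values, not from home_rentals.csv).
sqft = [500, 800, 1000, 1200]
rent = [1500, 2100, 2500, 2900]

# Steps 2 and 3: build and train the model.
model = fit_line(sqft, rent)

# Step 4: predict for a 1190 sqft home.
print(predict(model, 1190))
```

Even this toy version needs separate code for each step, while it ignores the other columns, the train/test split, and any confidence estimate, all of which the `learn` and `predict` calls in this post take care of.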

 https://github.com/duanzhihua/mindsdb

Reposted from blog.csdn.net/duan_zhihua/article/details/86582740