Posted by 芳香千里 on 2019-5-8 21:20:00

Python Data Analysis and Quantitative Trading

<h1>Chapter 1: What to Know Before You Start</h1>
<p><span style="color: rgba(0, 0, 0, 1)">Factors that influence stock prices</span></p>
<div class="cnblogs_code">
<pre><span style="color: rgba(128, 0, 128, 1)">1</span><span style="color: rgba(0, 0, 0, 1)">. Company-specific factors
</span><span style="color: rgba(128, 0, 128, 1)">2</span><span style="color: rgba(0, 0, 0, 1)">. Psychological factors
</span><span style="color: rgba(128, 0, 128, 1)">3</span><span style="color: rgba(0, 0, 0, 1)">. Industry factors
</span><span style="color: rgba(128, 0, 128, 1)">4</span><span style="color: rgba(0, 0, 0, 1)">. Economic factors
</span><span style="color: rgba(128, 0, 128, 1)">5</span><span style="color: rgba(0, 0, 0, 1)">. Market factors
</span><span style="color: rgba(128, 0, 128, 1)">6</span>. Political factors</pre>
</div>
<p>Quantitative investing</p>
<div class="cnblogs_code">
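<pre><span style="color: rgba(0, 0, 0, 1)">Advantages of quantitative investing
</span><span style="color: rgba(128, 0, 128, 1)">1</span><span style="color: rgba(0, 0, 0, 1)">. It avoids subjective emotion, human weaknesses, and cognitive bias, so choices are more objective
</span><span style="color: rgba(128, 0, 128, 1)">2</span><span style="color: rgba(0, 0, 0, 1)">. It can combine observations from many angles with multi-layered models
</span><span style="color: rgba(128, 0, 128, 1)">3</span><span style="color: rgba(0, 0, 0, 1)">. It tracks market changes promptly, keeps finding new statistical models, and looks for trading opportunities
</span><span style="color: rgba(128, 0, 128, 1)">4</span>. Once a strategy is chosen, backtesting can verify how well it works<br>Quantitative strategy<br>  A fixed set of rules that analyzes, judges, and decides, and then trades stocks automatically<br>Strategy life cycle<br>  Form an idea and learn the background<br>  Implement the strategy: Python<br>  Validate the strategy: backtesting, paper trading<br>  Live trading<br>  Refine the strategy or abandon it</pre>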
</div>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190420112514053-1498115908.png" alt=""></p>
<h1>Chapter 2: NumPy, the Core Package for Scientific Computing</h1>
<h2>Quantitative investing and Python</h2>
<p>Why choose Python?</p>
<div class="cnblogs_code">
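<pre>Alternatives: Excel, SAS/<span style="color: rgba(0, 0, 0, 1)">SPSS (statistical packages, no programming), R (narrower in scope: it only does data analysis)
   
Quantitative investing is essentially the process of analyzing data in order to make decisions
Python modules for data processing
    </span><span style="color: rgba(128, 0, 128, 1)">1</span><span style="color: rgba(0, 0, 0, 1)">. NumPy: batch computation on arrays
    </span><span style="color: rgba(128, 0, 128, 1)">2</span><span style="color: rgba(0, 0, 0, 1)">. pandas: flexible tabular computation
    </span><span style="color: rgba(128, 0, 128, 1)">3</span>. Matplotlib: data visualization</pre>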
</div>
<p>How to use Python for quantitative investing</p>
<div class="cnblogs_code">
<pre><span style="color: rgba(0, 0, 0, 1)">Write it yourself
    NumPy </span>+ pandas +<span style="color: rgba(0, 0, 0, 1)"> Matplotlib ...
Online platforms
    JoinQuant (聚宽), Uqer (优矿), RiceQuant (米筐), Quantopian ...
Open-source frameworks
    RQAlpha, QUANTAXIS ...</span></pre>
</div>
<h2>Using IPython</h2>
<div class="cnblogs_code">
<pre>pip3 <span style="color: rgba(0, 0, 255, 1)">install</span> ipython<br>Alternatively, install Anaconda, which bundles IPython, <span style="color: rgba(0, 0, 0, 1)">NumPy, pandas, <span style="color: rgba(0, 0, 0, 1)">Matplotlib, and many other commonly used Python modules and frameworks</span></span></pre>
</div>
<p>It is used the same way as the standard Python interpreter</p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190420120816242-691587597.png" alt=""></p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190420121212369-1745167126.png" alt=""></p>
<p>Tab completion</p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190420121028738-1121365605.png" alt=""></p>
<p>? for introspection: shows detailed information about an object</p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190420121414376-773614514.png" alt=""></p>
<p>? also does wildcard matching to search the namespace</p>
<p>&nbsp;<img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190420121444117-2123049695.png" alt=""></p>
<p>! runs a system shell command</p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190420121613677-2116605518.png" alt=""></p>
<p>Some commands run even without the leading !</p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190420121637100-1059781215.png" alt=""></p>
<p>?? (two question marks) shows more detail, including the source code when it is available</p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190420122213586-860325665.png" alt=""></p>
<p>Keyboard shortcuts</p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190420122339244-1477808928.png" alt=""></p>
<h2>IPython magic commands</h2>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190420153100115-910295506.png" alt=""></p>
<p>%timeit can take a while, because it runs the statement many times to get a stable measurement</p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190420153233895-1965868288.png" alt=""></p>
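<p>Outside IPython, the standard-library timeit module gives a similar measurement. This is a minimal sketch; the statement being timed and the run counts are arbitrary choices for illustration, not values taken from the screenshots above.</p>

```python
import timeit

# Time a small statement: 1000 executions per run, repeated for 5 runs.
runs = timeit.repeat("sum(range(1000))", number=1000, repeat=5)

# The fastest run is usually the most stable figure to compare.
best_per_call = min(runs) / 1000
print(f"best: {best_per_call * 1e6:.2f} microseconds per call")
```

<p>Like %timeit, this executes the statement many times, which is why timing a slow statement can take a while.</p>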
<p>%paste runs the Python code currently on the clipboard</p>
<p>%pdb on makes IPython drop into the debugger automatically after an exception</p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190420153901167-830558575.png" alt=""></p>
<p>Once in the debugger, you can use the usual pdb commands</p>
<p>p (print) is the one used most often</p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190420153719130-1250342781.png" alt=""></p>
<p>% magic commands</p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190420153317536-1565366845.png" alt=""></p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190420154851540-1012543174.png" alt=""></p>
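<p>To browse the command history, use the up and down arrow keys, or run %hist</p>
<p>_ is the most recent output</p>
<p>__ is the output before that</p>
<p>_48 is the output of prompt number 48</p>
<p>_i48 is the input typed at prompt number 48, as a string</p>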
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190420154804202-1653458605.png" alt=""></p>
<p>%bookmark: a bookmarking system for directories</p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190420160101764-1140894902.png" alt=""></p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190420160255838-745274632.png" alt=""></p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190420160405178-1798111661.png" alt=""></p>
<h2>A First Look at IPython Notebook (Jupyter)</h2>
<p>Install Jupyter</p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190420161550339-1817726183.png" alt=""></p>
<p>Start the notebook</p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190420161752389-669464229.png" alt=""></p>
<p>This opens Jupyter's web interface</p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190420161949469-2028773267.png" alt=""></p>
<p>Create a new notebook</p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190420164036129-628953701.png" alt=""></p>
<p>A small problem came up: the code would not run, and the prompt in front of the cell stayed at In[*]</p>
<p>The terminal showed an error message</p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190420164210841-780571291.png" alt=""></p>
<p>Downgrading the package to an earlier version fixed the problem</p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190420164241956-1392173486.png" alt=""></p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190420164309473-188441530.png" alt=""></p>
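<p><strong>You can write blog posts in a notebook</strong>: it supports Markdown, and a notebook can be exported directly to many document formats</p>
<h2>The Main Event: the NumPy Module</h2>
<h3>About NumPy</h3>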
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190420170513434-117772064.png" alt=""></p>
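<h3>An example of why NumPy is worth using</h3>
<p>Example: given the market capitalizations of several multinational companies, convert them to RMB</p>
<p>The plain-Python approach</p>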
<div class="cnblogs_code">
<pre><span style="color: rgba(128, 0, 128, 1)">1</span><span style="color: rgba(0, 0, 0, 1)">. Store the market caps in a list (or some other container)
</span><span style="color: rgba(128, 0, 128, 1)">2</span><span style="color: rgba(0, 0, 0, 1)">. Create a variable that holds the exchange rate
</span><span style="color: rgba(128, 0, 128, 1)">3</span><span style="color: rgba(0, 0, 0, 1)">. Loop over the list
</span><span style="color: rgba(128, 0, 128, 1)">4</span>. Multiply each value by the rate and append it to a new list</pre>
</div>
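<p>The four steps above can be sketched in plain Python as follows. The market caps and the exchange rate are made-up numbers for illustration only.</p>

```python
# Hypothetical market caps (billions of USD) and an assumed USD-to-CNY rate.
market_caps_usd = [120.5, 88.0, 301.2]
usd_to_cny = 6.8

# Walk the list and build a new list of converted values.
market_caps_cny = []
for cap in market_caps_usd:
    market_caps_cny.append(cap * usd_to_cny)

print(market_caps_cny)
```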
<p>With NumPy</p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190420171654665-590860181.png" alt=""></p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190420171721644-631880804.png" alt=""></p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190420171735086-966882553.png" alt=""></p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190420171752030-1994601162.png" alt=""></p>
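<p>For comparison, here is a minimal NumPy sketch of the same conversion. The numbers are again invented for illustration; the point is that one multiplication applies to every element, with no explicit loop.</p>

```python
import numpy as np

# Hypothetical market caps (billions of USD) and an assumed USD-to-CNY rate.
market_caps_usd = np.array([120.5, 88.0, 301.2])
usd_to_cny = 6.8

# Multiplying an array by a scalar converts every element at once.
market_caps_cny = market_caps_usd * usd_to_cny
print(market_caps_cny)
```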
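<p>Example 2: given the price and the quantity of each product, compute the total amount</p>
<p>Keep using a as the prices, and create a second array for the quantity of each product</p>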
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190420172143809-1027400687.png" alt=""></p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190420172252475-991213893.png" alt=""></p>
<p>Compute the amount for each product (price times quantity)</p>
<p>&nbsp;<img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190420172309629-848059958.png" alt=""></p>
<p>Compute the total amount</p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190420172333789-1367308775.png" alt=""></p>
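<p>As a text version of this second example, the sketch below multiplies two arrays element by element and then sums the result. The prices and quantities are hypothetical values, not the ones in the screenshots.</p>

```python
import numpy as np

# Hypothetical prices and quantities, one entry per product.
prices = np.array([10.0, 20.0, 5.5])
quantities = np.array([3, 1, 4])

# Element-wise product: the amount for each product.
amounts = prices * quantities

# Sum of all amounts: the total.
total = amounts.sum()
print(amounts, total)
```

<p>The same total can also be written as prices.dot(quantities), which is the dot product of the two arrays.</p>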
<h3>ndarray: the multidimensional array object</h3>
<p>&nbsp;<img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190420172615547-1340667673.png" alt=""></p>
<h3>ndarray: common attributes</h3>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190420172720142-493005091.png" alt=""></p>
<div class="cnblogs_code">
<pre>In [<span style="color: rgba(128, 0, 128, 1)">26</span><span style="color: rgba(0, 0, 0, 1)">]: a.ndim
Out[</span><span style="color: rgba(128, 0, 128, 1)">26</span>]: <span style="color: rgba(128, 0, 128, 1)">1</span><span style="color: rgba(0, 0, 0, 1)">
In [</span><span style="color: rgba(128, 0, 128, 1)">27</span><span style="color: rgba(0, 0, 0, 1)">]: a.size
Out[</span><span style="color: rgba(128, 0, 128, 1)">27</span>]: <span style="color: rgba(128, 0, 128, 1)">50</span><span style="color: rgba(0, 0, 0, 1)">
In [</span><span style="color: rgba(128, 0, 128, 1)">28</span><span style="color: rgba(0, 0, 0, 1)">]: a.shape
Out[</span><span style="color: rgba(128, 0, 128, 1)">28</span>]: (<span style="color: rgba(128, 0, 128, 1)">50</span><span style="color: rgba(0, 0, 0, 1)">,)
</span>-----------------------------<span style="color: rgba(0, 0, 0, 1)">
In [</span><span style="color: rgba(128, 0, 128, 1)">30</span>]: b = np.array([[<span style="color: rgba(128, 0, 128, 1)">1</span>,<span style="color: rgba(128, 0, 128, 1)">2</span>,<span style="color: rgba(128, 0, 128, 1)">3</span>,],[<span style="color: rgba(128, 0, 128, 1)">4</span>,<span style="color: rgba(128, 0, 128, 1)">5</span>,<span style="color: rgba(128, 0, 128, 1)">6</span><span style="color: rgba(0, 0, 0, 1)">]])
In [</span><span style="color: rgba(128, 0, 128, 1)">31</span><span style="color: rgba(0, 0, 0, 1)">]: b.ndim
Out[</span><span style="color: rgba(128, 0, 128, 1)">31</span>]: <span style="color: rgba(128, 0, 128, 1)">2</span><span style="color: rgba(0, 0, 0, 1)">
In [</span><span style="color: rgba(128, 0, 128, 1)">32</span><span style="color: rgba(0, 0, 0, 1)">]: b.size
Out[</span><span style="color: rgba(128, 0, 128, 1)">32</span>]: <span style="color: rgba(128, 0, 128, 1)">6</span><span style="color: rgba(0, 0, 0, 1)">
In [</span><span style="color: rgba(128, 0, 128, 1)">33</span><span style="color: rgba(0, 0, 0, 1)">]: b.shape
Out[</span><span style="color: rgba(128, 0, 128, 1)">33</span>]: (<span style="color: rgba(128, 0, 128, 1)">2</span>, <span style="color: rgba(128, 0, 128, 1)">3</span><span style="color: rgba(0, 0, 0, 1)">)
</span>-----------------------------<span style="color: rgba(0, 0, 0, 1)">
Three dimensions: think of the extra dimension as the pages of a notebook, where turning a page takes you to another 2-D sheet
In [</span><span style="color: rgba(128, 0, 128, 1)">35</span>]: c = np.array([[[<span style="color: rgba(128, 0, 128, 1)">1</span>,<span style="color: rgba(128, 0, 128, 1)">2</span>,<span style="color: rgba(128, 0, 128, 1)">3</span>,],[<span style="color: rgba(128, 0, 128, 1)">4</span>,<span style="color: rgba(128, 0, 128, 1)">5</span>,<span style="color: rgba(128, 0, 128, 1)">6</span>]],[[<span style="color: rgba(128, 0, 128, 1)">1</span>,<span style="color: rgba(128, 0, 128, 1)">2</span>,<span style="color: rgba(128, 0, 128, 1)">3</span>],[<span style="color: rgba(128, 0, 128, 1)">1</span>,<span style="color: rgba(128, 0, 128, 1)">2</span>,<span style="color: rgba(128, 0, 128, 1)">3</span><span style="color: rgba(0, 0, 0, 1)">]]])

In [</span><span style="color: rgba(128, 0, 128, 1)">36</span><span style="color: rgba(0, 0, 0, 1)">]: c
Out[</span><span style="color: rgba(128, 0, 128, 1)">36</span><span style="color: rgba(0, 0, 0, 1)">]:
array([[[</span><span style="color: rgba(128, 0, 128, 1)">1</span>, <span style="color: rgba(128, 0, 128, 1)">2</span>, <span style="color: rgba(128, 0, 128, 1)">3</span><span style="color: rgba(0, 0, 0, 1)">],
      [</span><span style="color: rgba(128, 0, 128, 1)">4</span>, <span style="color: rgba(128, 0, 128, 1)">5</span>, <span style="color: rgba(128, 0, 128, 1)">6</span><span style="color: rgba(0, 0, 0, 1)">]],

       [[</span><span style="color: rgba(128, 0, 128, 1)">1</span>, <span style="color: rgba(128, 0, 128, 1)">2</span>, <span style="color: rgba(128, 0, 128, 1)">3</span><span style="color: rgba(0, 0, 0, 1)">],
      [</span><span style="color: rgba(128, 0, 128, 1)">1</span>, <span style="color: rgba(128, 0, 128, 1)">2</span>, <span style="color: rgba(128, 0, 128, 1)">3</span><span style="color: rgba(0, 0, 0, 1)">]]])
In [</span><span style="color: rgba(128, 0, 128, 1)">37</span><span style="color: rgba(0, 0, 0, 1)">]: c.shape
Out[</span><span style="color: rgba(128, 0, 128, 1)">37</span>]: (<span style="color: rgba(128, 0, 128, 1)">2</span>, <span style="color: rgba(128, 0, 128, 1)">2</span>, <span style="color: rgba(128, 0, 128, 1)">3</span><span style="color: rgba(0, 0, 0, 1)">)
</span>------------------------------------<span style="color: rgba(0, 0, 0, 1)">
Transposing an array
In [</span><span style="color: rgba(128, 0, 128, 1)">39</span>]: c =<span style="color: rgba(0, 0, 0, 1)"> c.T

In [</span><span style="color: rgba(128, 0, 128, 1)">40</span><span style="color: rgba(0, 0, 0, 1)">]: c
Out[</span><span style="color: rgba(128, 0, 128, 1)">40</span><span style="color: rgba(0, 0, 0, 1)">]:
array([[[</span><span style="color: rgba(128, 0, 128, 1)">1</span>, <span style="color: rgba(128, 0, 128, 1)">1</span><span style="color: rgba(0, 0, 0, 1)">],
      [</span><span style="color: rgba(128, 0, 128, 1)">4</span>, <span style="color: rgba(128, 0, 128, 1)">1</span><span style="color: rgba(0, 0, 0, 1)">]],

       [[</span><span style="color: rgba(128, 0, 128, 1)">2</span>, <span style="color: rgba(128, 0, 128, 1)">2</span><span style="color: rgba(0, 0, 0, 1)">],
      [</span><span style="color: rgba(128, 0, 128, 1)">5</span>, <span style="color: rgba(128, 0, 128, 1)">2</span><span style="color: rgba(0, 0, 0, 1)">]],

       [[</span><span style="color: rgba(128, 0, 128, 1)">3</span>, <span style="color: rgba(128, 0, 128, 1)">3</span><span style="color: rgba(0, 0, 0, 1)">],
      [</span><span style="color: rgba(128, 0, 128, 1)">6</span>, <span style="color: rgba(128, 0, 128, 1)">3</span><span style="color: rgba(0, 0, 0, 1)">]]])

In [</span><span style="color: rgba(128, 0, 128, 1)">41</span>]: c =<span style="color: rgba(0, 0, 0, 1)"> c.T

In [</span><span style="color: rgba(128, 0, 128, 1)">42</span><span style="color: rgba(0, 0, 0, 1)">]: c
Out[</span><span style="color: rgba(128, 0, 128, 1)">42</span><span style="color: rgba(0, 0, 0, 1)">]:
array([[[</span><span style="color: rgba(128, 0, 128, 1)">1</span>, <span style="color: rgba(128, 0, 128, 1)">2</span>, <span style="color: rgba(128, 0, 128, 1)">3</span><span style="color: rgba(0, 0, 0, 1)">],
      [</span><span style="color: rgba(128, 0, 128, 1)">4</span>, <span style="color: rgba(128, 0, 128, 1)">5</span>, <span style="color: rgba(128, 0, 128, 1)">6</span><span style="color: rgba(0, 0, 0, 1)">]],

       [[</span><span style="color: rgba(128, 0, 128, 1)">1</span>, <span style="color: rgba(128, 0, 128, 1)">2</span>, <span style="color: rgba(128, 0, 128, 1)">3</span><span style="color: rgba(0, 0, 0, 1)">],
      [</span><span style="color: rgba(128, 0, 128, 1)">1</span>, <span style="color: rgba(128, 0, 128, 1)">2</span>, <span style="color: rgba(128, 0, 128, 1)">3</span>]]])</pre>
</div>
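<p>For a 3-D array, .T reverses the order of the axes rather than swapping just two of them, which is why transposing twice in the session above returns the original array. A small sketch with an arbitrary array (the values are illustrative):</p>

```python
import numpy as np

# An arbitrary 3-D array: 2 "pages", each with 2 rows and 3 columns.
c = np.arange(12).reshape(2, 2, 3)
print(c.ndim, c.size, c.shape)

# .T reverses the axis order, so shape (2, 2, 3) becomes (3, 2, 2).
t = c.T
print(t.shape)

# Element [i, j, k] of c sits at [k, j, i] in the transpose.
same_element = t[2, 1, 0] == c[0, 1, 2]
```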
<h3>ndarray: data types</h3>
<p>Checking the data type</p>
<div class="cnblogs_code">
<pre>In [<span style="color: rgba(128, 0, 128, 1)">24</span><span style="color: rgba(0, 0, 0, 1)">]: a.dtype
Out[</span><span style="color: rgba(128, 0, 128, 1)">24</span>]: dtype(<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">float64</span><span style="color: rgba(128, 0, 0, 1)">'</span>)</pre>
</div>
<p>&nbsp;<img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190420172958751-1274327114.png" alt=""></p>
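<p>The vast majority of what we use are numeric types, since NumPy exists to do computation</p>
<p>How large can a 64-bit number get? A signed 64-bit integer tops out at 2**63-1; the value computed below, 2**64-1, is the maximum of an unsigned 64-bit integer</p>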
<div class="cnblogs_code">
<pre>In [<span style="color: rgba(128, 0, 128, 1)">25</span>]: <span style="color: rgba(128, 0, 128, 1)">2</span>**<span style="color: rgba(128, 0, 128, 1)">64</span>-<span style="color: rgba(128, 0, 128, 1)">1</span><span style="color: rgba(0, 0, 0, 1)">
Out[</span><span style="color: rgba(128, 0, 128, 1)">25</span>]: <span style="color: rgba(128, 0, 128, 1)">18446744073709551615</span></pre>
</div>
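<p>NumPy can report these limits itself through np.iinfo. A short sketch that checks the signed and unsigned 64-bit maxima against the powers of two mentioned above:</p>

```python
import numpy as np

# Limits of the signed and unsigned 64-bit integer types.
signed_max = int(np.iinfo(np.int64).max)
unsigned_max = int(np.iinfo(np.uint64).max)

print(signed_max == 2**63 - 1)
print(unsigned_max == 2**64 - 1)
```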
<h2>Creating NumPy arrays</h2>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190420174513917-1979439202.png" alt=""></p>
<div class="cnblogs_code">
<pre>In [1]: # One way to create an array of ten zeros
In [2]: import numpy as np
In [3]: a = np.array([0] * 10)
In [4]: a
Out[4]: array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

In [5]: # zeros does the same job
In [6]: b = np.zeros(10)
In [7]: b
Out[7]: array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])

In [8]: # Every element prints as 0., so these are floats; check the dtype
In [9]: b.dtype
Out[9]: dtype('float64')

In [10]: # Pass dtype at creation time to get integers instead of the default
In [11]: c = np.zeros(10, dtype='int')
In [12]: c
Out[12]: array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
In [13]: c.dtype
Out[13]: dtype('int32')

In [14]: # An array of all ones
In [15]: d = np.ones(10)
In [16]: d
Out[16]: array([1., 1., 1., 1., 1., 1., 1., 1., 1., 1.])

In [17]: # empty allocates an array without initializing it, so the values are arbitrary
In [18]: e = np.empty(50)
In [19]: e
Out[19]:
array([1.23004319e-311, 1.23004150e-311, 2.95806213e-311, 1.26927730e-277,
       5.54041819e+228, 2.84855906e-311, 5.97288716e-299, 3.28487474e-311,
       9.43293441e-314, 2.26784710e-308, 1.23004306e-311, 1.23002517e-311,
       3.38460664e+125, 6.69053866e+151, 6.56693077e-085, 1.03564308e-308,
       1.33360293e+241, 1.71632673e+243, 5.96115807e+228, 1.71011791e+214,
       5.67517369e-311, 1.00562508e-248, 2.85308965e-313, 2.14793507e-308,
       1.38760675e+219, 2.92135768e+209, 2.21211602e+214, 2.28723653e-308,
       6.96983359e+228, 1.33360298e+241, 2.11280666e+161, 1.29883065e+219,
       1.11074825e-310, 1.46972270e-200, 4.97508544e-313, 4.65203811e+151,
       4.66820502e+180, 5.61168418e-313, 3.81674046e-308, 1.33360303e+241,
       1.54523733e-310, 5.03961303e-266, 3.99046880e-008, 2.08868046e-310,
       2.53185169e-212, 7.44726967e-251, 1.39069238e-309, 2.75926410e-306,
       4.90398331e-307, 5.23951796e+202])
In [20]: # These values are whatever was left over in that memory. So what is empty good for?
In [21]: # It is meant for arrays you will fill in later: it skips the initialization step that zeros and ones perform, so it is slightly faster

In [22]: # arange accepts a fractional step, which Python's built-in range does not
In [24]: f = np.arange(1, 10, 0.3)
In [25]: f
Out[25]:
array([1. , 1.3, 1.6, 1.9, 2.2, 2.5, 2.8, 3.1, 3.4, 3.7, 4. , 4.3, 4.6,
       4.9, 5.2, 5.5, 5.8, 6.1, 6.4, 6.7, 7. , 7.3, 7.6, 7.9, 8.2, 8.5,
       8.8, 9.1, 9.4, 9.7])

In [26]: # linspace ("linear space") looks similar to arange but works differently: it divides the given range into evenly spaced points, and the last argument is the number of points, i.e. the length of the array
In [31]: k = np.linspace(0, 5, 10)  # equivalently np.linspace(0, 5, num=10)
In [32]: k  # unlike arange, linspace includes the endpoint: note the 5. at the end
Out[32]:
array([0.        , 0.55555556, 1.11111111, 1.66666667, 2.22222222,
       2.77777778, 3.33333333, 3.88888889, 4.44444444, 5.        ])
In [27]: g = np.linspace(1, 100, 100)
In [28]: g
Out[28]:
array([  1.,   2.,   3.,   4.,   5.,   6.,   7.,   8.,   9.,  10.,  11.,
        12.,  13.,  14.,  15.,  16.,  17.,  18.,  19.,  20.,  21.,  22.,
        23.,  24.,  25.,  26.,  27.,  28.,  29.,  30.,  31.,  32.,  33.,
        34.,  35.,  36.,  37.,  38.,  39.,  40.,  41.,  42.,  43.,  44.,
        45.,  46.,  47.,  48.,  49.,  50.,  51.,  52.,  53.,  54.,  55.,
        56.,  57.,  58.,  59.,  60.,  61.,  62.,  63.,  64.,  65.,  66.,
        67.,  68.,  69.,  70.,  71.,  72.,  73.,  74.,  75.,  76.,  77.,
        78.,  79.,  80.,  81.,  82.,  83.,  84.,  85.,  86.,  87.,  88.,
        89.,  90.,  91.,  92.,  93.,  94.,  95.,  96.,  97.,  98.,  99.,
       100.])

In [33]: # eye builds an identity matrix (ones on the diagonal); outside linear algebra you will rarely need it
In [37]: w = np.eye(5)
In [38]: w
Out[38]:
array([[1., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0.],
       [0., 0., 1., 0., 0.],
       [0., 0., 0., 1., 0.],
       [0., 0., 0., 0., 1.]])</pre>
</div>
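<p>The difference between arange and linspace is easiest to see side by side: arange takes a step and excludes the stop value, while linspace takes a count and includes the stop value unless told otherwise. A minimal sketch with arbitrary ranges chosen for illustration:</p>

```python
import numpy as np

# arange: fixed step of 0.5, the stop value 5 is excluded.
stepped = np.arange(0, 5, 0.5)

# linspace: fixed count of 11 points, the stop value 5 is included.
spaced = np.linspace(0, 5, 11)

# endpoint=False makes linspace exclude the stop value as well.
open_ended = np.linspace(0, 5, 10, endpoint=False)

print(len(stepped), stepped[-1])
print(len(spaced), spaced[-1])
```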
<h2>ndarray: batch (element-wise) operations</h2>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190420224414458-1050855440.png" alt=""></p>
<p>Comparison operations produce an array of booleans</p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190420225044579-822022373.png" alt=""></p>
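<p>As a text counterpart to the screenshots, the sketch below applies arithmetic and a comparison to whole arrays at once. The arrays are arbitrary illustrative values.</p>

```python
import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([10, 20, 30, 40])

# Arithmetic between arrays of the same shape works element by element.
sums = a + b
doubled = a * 2
squares = a ** 2

# A comparison also works element by element and yields booleans.
mask = a > 2
print(sums, doubled, squares, mask)
```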
<p>A quick way to build a two-dimensional array</p>
<div class="cnblogs_code">
<pre>In [39]: np.arange(15).reshape((3, 5))
Out[39]:
array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14]])</pre>
</div>
<h2>ndarray: indexing</h2>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190420225354030-130364741.png" alt=""></p>
<div class="cnblogs_code">
<pre>array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14]])

In [40]: a = np.arange(15).reshape((3, 5))

In [41]: a[2, 2]  # row 2, column 2; a[2][2] returns the same element
Out[41]: 12</pre>
</div>
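<p>A small sketch of the two index styles for a 2-D array, using the same 3x5 array as above. Both forms return the same element; the comma form does it in a single indexing step.</p>

```python
import numpy as np

a = np.arange(15).reshape((3, 5))

# Comma form: row index, then column index, in one step.
by_comma = a[2, 2]

# Chained form: first take row 2, then element 2 of that row.
by_chain = a[2][2]

# A single index on a 2-D array returns a whole row.
second_row = a[1]
print(by_comma, by_chain, second_row)
```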
<h2>ndarray: slicing</h2>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190420230122856-310270608.png" alt=""></p>
<p>As with lists, the start index is included and the end index is excluded</p>
<div class="cnblogs_code">
<pre>In [46]: f
Out[46]:
array([1. , 1.3, 1.6, 1.9, 2.2, 2.5, 2.8, 3.1, 3.4, 3.7, 4. , 4.3, 4.6,
       4.9, 5.2, 5.5, 5.8, 6.1, 6.4, 6.7, 7. , 7.3, 7.6, 7.9, 8.2, 8.5,
       8.8, 9.1, 9.4, 9.7])

In [48]: f[:5]  # the original slice bounds were lost in extraction; [:5] is an illustrative stand-in
Out[48]: array([1. , 1.3, 1.6, 1.9, 2.2])</pre>
</div>
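<p>Note that, to save memory, slicing an array returns a view onto the same data rather than a copy, so writing to the slice changes the original array</p>
<p>To leave the original array untouched, call copy() on the slice</p>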
<div class="cnblogs_code">
<pre>In [55]: b = f[:5]  # the original slice bounds were lost in extraction; [:5] is an illustrative stand-in

In [56]: b
Out[56]: array([1. , 1.3, 1.6, 1.9, 2.2])

In [57]: b[0] = 5

In [58]: b
Out[58]: array([5. , 1.3, 1.6, 1.9, 2.2])

In [59]: f
Out[59]:
array([5. , 1.3, 1.6, 1.9, 2.2, 2.5, 2.8, 3.1, 3.4, 3.7, 4. , 4.3, 4.6,
       4.9, 5.2, 5.5, 5.8, 6.1, 6.4, 6.7, 7. , 7.3, 7.6, 7.9, 8.2, 8.5,
       8.8, 9.1, 9.4, 9.7])<br>With copy, the original is left alone<br>b = f[:5].copy()</pre>
</div>
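<p>The view-versus-copy behavior can be checked directly. This sketch uses small arbitrary arrays and np.shares_memory to show when two arrays point at the same data.</p>

```python
import numpy as np

# Writing through a slice (a view) changes the original array.
f = np.arange(10.0)
view = f[:3]
view[0] = 99.0
print(f[0])                        # the original now holds 99.0
print(np.shares_memory(f, view))   # the slice shares memory with f

# Writing to a copy leaves the original untouched.
g = np.arange(10.0)
safe = g[:3].copy()
safe[0] = 99.0
print(g[0])                        # still 0.0
```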
<p>Slicing across rows and columns</p>
<div class="cnblogs_code">
<pre>In [49]: a
Out[49]:
array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14]])
In [50]: # A multidimensional slice reads as [row slice, column slice]
In [54]: a[0:2, 0:2]
Out[54]:
array([[0, 1],
       [5, 6]])</pre>
</div>
<h2>ndarray: boolean indexing</h2>
<p>Task: pick out the numbers greater than 5 from a list</p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190420232313402-1421126718.png" alt=""></p>
<div class="cnblogs_code">
<pre>In [60]: import random
# Plain Python: filter() over a list
In [61]: a = [random.randint(0, 10) for _ in range(20)]  # expression lost in extraction; assumed to be 20 random integers
In [62]: a
Out[62]: [...]  # the 20 values were lost in extraction
In [63]: list(filter(lambda x: x &gt; 5, a))
Out[63]: [6, 8, 6, 7, 6, 8, 6, 10]
# NumPy: boolean indexing
In [64]: a = np.array(a)
In [65]: a[a &gt; 5]
Out[65]: array([ 6,  8,  6,  7,  6,  8,  6, 10])

In [66]: # How boolean indexing works
In [67]: # Step 1: a &gt; 5 builds a boolean mask
In [68]: a &gt; 5
Out[68]:
array([False, False, False,  True, False,  True,  True,  True, False,
       False,  True,  True, False,  True, False, False, False,  True,
       False, False])

In [69]: # Step 2: indexing with the mask returns the value at every position that is True
In [70]: b = np.array([1, 2, 3, 4])  # values lost in extraction; these are illustrative
In [71]: c = np.array([True, False, True, False])  # illustrative mask
In [72]: b[c]
Out[72]: array([1, 3])</pre>
</div>
<p>Task 2: pick out the even numbers greater than 5 from an array</p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190420232647991-2134583192.png" alt=""></p>
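<p>Aside: what is the difference between "and" and "&amp;"? "and" asks for the truth value of each operand as a whole, while "&amp;" is applied to arrays element by element, which is what a combined mask needs</p>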
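<p>A minimal sketch of the difference, on an arbitrary array chosen for illustration: &amp; combines two boolean masks element by element, whereas and tries to reduce each array to a single truth value and fails for arrays with more than one element.</p>

```python
import numpy as np

a = np.array([3, 6, 8, 7, 10, 2])

# Element-wise AND of two masks; the parentheses are required
# because & binds more tightly than the comparisons.
big_even = a[(a > 5) & (a % 2 == 0)]
print(big_even)

# Python's "and" needs one truth value per operand, which is
# ambiguous for a multi-element array, so NumPy raises ValueError.
error_message = None
try:
    (a > 5) and (a % 2 == 0)
except ValueError as exc:
    error_message = str(exc)
print(error_message)
```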
<h2>ndarray: fancy indexing</h2>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190420233317516-371122778.png" alt=""></p>
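<p>Note: in a multidimensional array, putting a fancy index on both sides of the comma does not select a rectangular block; the two index lists are paired element by element instead</p>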
<p>&nbsp;<img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190420235645616-1900539405.png" alt=""></p>
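<p>The sketch below illustrates that pairing behavior on an arbitrary 3x5 array, along with two ways to get the rectangular block. np.ix_ is a standard NumPy helper; the index values are illustrative.</p>

```python
import numpy as np

a = np.arange(15).reshape((3, 5))

# A list of row indexes selects whole rows.
rows = a[[0, 2]]

# Fancy indexes on both sides of the comma are paired up:
# this picks elements (0, 1) and (2, 3), not a 2x2 block.
paired = a[[0, 2], [1, 3]]

# To get the 2x2 block, index in two steps or use np.ix_.
block_two_step = a[[0, 2]][:, [1, 3]]
block_ix = a[np.ix_([0, 2], [1, 3])]
print(paired)
print(block_ix)
```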
<h2>NumPy: universal functions</h2>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190420235808125-934984373.png" alt=""></p>
<h3>abs: absolute values in bulk</h3>
<div class="cnblogs_code">
<pre>In [2]: a = np.arange(-5, 5)
In [3]: a
Out[3]: array([-5, -4, -3, -2, -1,  0,  1,  2,  3,  4])
# The built-in abs works too
In [4]: abs(a)
Out[4]: array([5, 4, 3, 2, 1, 0, 1, 2, 3, 4])
# The explicit form is np.abs
In [5]: np.abs(a)
Out[5]: array([5, 4, 3, 2, 1, 0, 1, 2, 3, 4])</pre>
</div>
<h3>sqrt: square roots</h3>
<div class="cnblogs_code">
<pre># Calling sqrt directly fails: no such name is defined
In [7]: sqrt(a)
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
&lt;ipython-input-7-55c08d4e5fa4&gt; in &lt;module&gt;()
----&gt; 1 sqrt(a)

NameError: name 'sqrt' is not defined
# The math module does have sqrt
In [8]: import math
# This fails as well: math.sqrt handles only one value at a time
In [11]: math.sqrt(a)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
&lt;ipython-input-11-c85d302be686&gt; in &lt;module&gt;()
----&gt; 1 math.sqrt(a)

TypeError: only size-1 arrays can be converted to Python scalars
# np.sqrt works on the whole array; negative numbers have no real square root, so those entries come back as nan
In [10]: np.sqrt(a)
F:\Python36\Scripts\ipython3:1: RuntimeWarning: invalid value encountered in sqrt
Out[10]:
array([       nan,        nan,        nan,        nan,        nan,
       0.        , 1.        , 1.41421356, 1.73205081, 2.        ])</pre>
</div>
<h3>Turning a float into an integer: rounding and truncation</h3>
<div class="cnblogs_code">
<pre>In [12]: a = 1.6
# int() truncates toward zero
In [13]: int(a)
Out[13]: 1
# round() rounds to the nearest integer
In [14]: round(a)
Out[14]: 2
# ceil rounds up
In [15]: math.ceil(a)
Out[15]: 2
# floor rounds down
In [16]: math.floor(a)
Out[16]: 1
# The NumPy versions work on a whole array
In [18]: a
Out[18]: array([-5.5, -4.5, -3.5, -2.5, -1.5, -0.5,  0.5,  1.5,  2.5,  3.5,  4.5])
# Round down
In [19]: np.floor(a)
Out[19]: array([-6., -5., -4., -3., -2., -1.,  0.,  1.,  2.,  3.,  4.])
# Round up
In [22]: np.ceil(a)
Out[22]: array([-5., -4., -3., -2., -1., -0.,  1.,  2.,  3.,  4.,  5.])
# Round to nearest
In [23]: np.round(a)
Out[23]: array([-6., -4., -4., -2., -2., -0.,  0.,  2.,  2.,  4.,  4.])
# rint gives the same result as round
In [20]: np.rint(a)
Out[20]: array([-6., -4., -4., -2., -2., -0.,  0.,  2.,  2.,  4.,  4.])
# Truncate toward zero
In [21]: np.trunc(a)
Out[21]: array([-5., -4., -3., -2., -1., -0.,  0.,  1.,  2.,  3.,  4.])</pre>
</div>
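<p>Note: np.round here uses round-half-to-even ("banker's rounding"): a value exactly halfway between two integers goes to the even one, so 0.5 rounds to 0 and 1.5 rounds to 2. Over a large number of calculations this avoids the systematic upward drift of always rounding halves up.</p>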
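<p>The effect of the tie-breaking rule can be checked on the array used above. The following is a minimal sketch, not part of the original session; the "half up" variant built from <code>np.floor(a + 0.5)</code> is only an illustration of the schoolbook rule for comparison.</p>

```python
import numpy as np

# The array from the session above: -5.5, -4.5, ..., 4.5
a = np.arange(-5.5, 5.5)

# NumPy's rule: exact halves go to the nearest even integer
half_to_even = np.round(a)

# Schoolbook rule (illustration only): exact halves always go upward
half_up = np.floor(a + 0.5)

# Compare the totals: every element here is an exact half,
# so the tie-breaking rule decides the whole result
true_total = a.sum()
even_total = half_to_even.sum()
up_total = half_up.sum()
print(true_total, even_total, up_total)

# Python 3's built-in round() also rounds halves to even
builtin = [round(0.5), round(1.5), round(2.5)]
print(builtin)
```

<p>On this input the half-to-even total stays within 0.5 of the true sum, while rounding every half upward shifts it by 5.5.</p>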
<h3>modf: splitting a float into its fractional and integer parts</h3>
<div class="cnblogs_code">
<pre>In [26]: np.modf(a)
Out[26]:
(array([-0.5, -0.5, -0.5, -0.5, -0.5, -0.5,  0.5,  0.5,  0.5,  0.5,  0.5]),
 array([-5., -4., -3., -2., -1., -0.,  0.,  1.,  2.,  3.,  4.]))

In [27]: x, y = _

In [28]: x
Out[28]: array([-0.5, -0.5, -0.5, -0.5, -0.5, -0.5,  0.5,  0.5,  0.5,  0.5,  0.5])

In [29]: y
Out[29]: array([-5., -4., -3., -2., -1., -0.,  0.,  1.,  2.,  3.,  4.])</pre>
</div>
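<p>A short sketch of how the two parts returned by <code>np.modf</code> relate to the input, assuming the same array as above. The variable names <code>frac</code> and <code>whole</code> are chosen here for illustration.</p>

```python
import numpy as np

a = np.arange(-5.5, 5.5)

# modf returns (fractional parts, integer parts); both keep the sign of the input
frac, whole = np.modf(a)

# The integer part is the value truncated toward zero
same_as_trunc = np.array_equal(whole, np.trunc(a))

# Adding the two parts back together restores the original values
restored = frac + whole
print(same_as_trunc, np.array_equal(restored, a))
```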
<h3>isnan and isinf: detecting special floating-point values</h3>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190421114539048-1480530704.png" alt=""></p>
<div class="cnblogs_code">
<pre>In : a = np.ones(5)

In [32]: a
Out[32]: array([1., 1., 1., 1., 1.])

In [33]: a[1] = 0

In [34]: a
Out[34]: array([1., 0., 1., 1., 1.])

In [36]: b = a/a
F:\Python36\Scripts\ipython3:1: RuntimeWarning: invalid value encountered in true_divide

In [37]: a
Out[37]: array([1., 0., 1., 1., 1.])

In [38]: b
Out[38]: array([ 1., nan,  1.,  1.,  1.])

In [39]: 1 in b
Out[39]: True
<span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> A membership test cannot find nan, because nan is not equal to itself</span>
In [40]: np.nan in b
Out[40]: False
<span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> isnan tests each element for nan</span>
In [41]: np.isnan(b)
Out[41]: array([False,  True, False, False, False])</pre>
</div>
<h4>Using isnan to select values</h4>
<div class="cnblogs_code">
<pre>In [42]: b
Out[42]: array([ 1., nan,  1.,  1.,  1.])

In [43]: b[~np.isnan(b)]
Out[43]: array([1., 1., 1., 1.])</pre>
</div>
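<p>The session above can be replayed as a small script. This is a sketch under the same setup (an array of ones with one zero, divided by itself); <code>np.errstate</code> is used here only to silence the warning, and <code>np.nanmean</code> is an extra shown for illustration.</p>

```python
import numpy as np

a = np.ones(5)
a[1] = 0

# 0/0 is undefined and yields nan; errstate suppresses the RuntimeWarning
with np.errstate(invalid="ignore"):
    b = a / a

# nan never compares equal to anything, including itself,
# which is why an equality-based membership test cannot find it
nan_equals_itself = (np.nan == np.nan)

# isnan builds a boolean mask; ~ inverts it to keep the valid values
mask = np.isnan(b)
clean = b[~mask]
nan_count = int(mask.sum())

# nan-aware reductions skip the nan entries instead of propagating them
mean_ignoring_nan = np.nanmean(b)
print(nan_equals_itself, clean, nan_count, mean_ignoring_nan)
```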
<h4>inf: larger than any finite number</h4>
<div class="cnblogs_code">
<pre>In : np.inf &gt; 1000000000000000000000000000000<span style="color: rgba(0, 0, 0, 1)">
Out[</span>44<span style="color: rgba(0, 0, 0, 1)">]: True

In [</span>45]: float(<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">inf</span><span style="color: rgba(128, 0, 0, 1)">'</span>) &gt; 1000000000000000000000000000000000<span style="color: rgba(0, 0, 0, 1)">
Out[</span>45]: True</pre>
</div>
<h4>Using inf and isinf</h4>
<div class="cnblogs_code">
<pre>In : a = np.array()
In [</span>47]: b = np.array()
In [</span>48]: a/<span style="color: rgba(0, 0, 0, 1)">b
F:\Python36\Scripts\ipython3:</span>1: RuntimeWarning: divide by zero encountered <span style="color: rgba(0, 0, 255, 1)">in</span><span style="color: rgba(0, 0, 0, 1)"> true_divide
Out[</span>48]: array([ 1., inf,1., inf,1<span style="color: rgba(0, 0, 0, 1)">.])
</span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Unlike np.nan, np.inf compares equal to itself</span>
In : np.inf ==<span style="color: rgba(0, 0, 0, 1)"> np.inf
Out[</span>49<span style="color: rgba(0, 0, 0, 1)">]: True

In [</span>50]: c = a/<span style="color: rgba(0, 0, 0, 1)">b
F:\Python36\Scripts\ipython3:</span>1: RuntimeWarning: divide by zero encountered <span style="color: rgba(0, 0, 255, 1)">in</span><span style="color: rgba(0, 0, 0, 1)"> true_divide
In [</span>51<span style="color: rgba(0, 0, 0, 1)">]: c
Out[</span>51]: array([ 1., inf,1., inf,1<span style="color: rgba(0, 0, 0, 1)">.])
</span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Keep the values that are not inf</span>
In : c[c != np.inf]
Out[52]: array([1., 1., 1.])
<span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> The same selection with isinf, inverted with ~</span>
In : c[~np.isinf(c)]
Out[53]: array([1., 1., 1.])</pre>
</div>
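<p>A runnable sketch of the division-by-zero case. The input values of <code>a</code> and <code>b</code> are assumptions chosen so that the quotient matches the output shown above; <code>np.isfinite</code> is added here as a way to drop inf and nan in one step.</p>

```python
import numpy as np

# Assumed inputs: chosen so that a / b gives [1, inf, 1, inf, 1]
a = np.array([3, 4, 5, 6, 7])
b = np.array([3, 0, 5, 0, 7])

# A nonzero value divided by zero yields inf; errstate silences the warning
with np.errstate(divide="ignore"):
    c = a / b

# isinf marks the infinite entries; ~ keeps everything else
not_inf = c[~np.isinf(c)]

# isfinite is False for inf, -inf and nan alike
mixed = np.array([1.0, np.nan, np.inf, -np.inf])
finite_mask = np.isfinite(mixed)
print(c, not_inf, finite_mask)
```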
<h3>Binary functions</h3>
<div class="cnblogs_code">
<pre><span style="color: rgba(0, 0, 0, 1)">add       addition
subtract  subtraction
multiply  multiplication
divide    division
power     exponentiation
mod       modulo</span></pre>
</div>
<h4>maximum: the element-wise maximum of two arrays</h4>
<div class="cnblogs_code">
<pre>In : a
Out[</span>58]: array()

In [</span>59<span style="color: rgba(0, 0, 0, 1)">]: b
Out[</span>59]: array()

In [</span>60<span style="color: rgba(0, 0, 0, 1)">]: np.maximum(a,b)
Out[</span>60]: array()</pre>
</div>
<p>minimum is used the same way as maximum, except that it keeps the smaller value of each pair.</p>
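<p>A minimal sketch of the element-wise comparison. The two arrays are hypothetical values picked for illustration, since the original inputs are not visible in the session above.</p>

```python
import numpy as np

# Hypothetical inputs, for illustration only
a = np.array([3, 4, 5, 6, 7])
b = np.array([5, 1, 5, 9, 0])

# Compare position by position and keep the larger / smaller value
larger = np.maximum(a, b)
smaller = np.minimum(a, b)

# A scalar is broadcast against every element, which caps the array at 5
capped = np.minimum(a, 5)

# Contrast: np.max reduces a single array to one value
overall_max = np.max(a)
print(larger, smaller, capped, overall_max)
```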
<h3>Changing array shape: reshape, resize and ravel</h3>
<div class="cnblogs_code">
<pre>a = np.random.random((3, 2))
a

<span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> reshape returns a reshaped array and leaves the shape of the original unchanged</span>
a.reshape(2, 3)
array([[0.91122299, 0.93234796, 0.86025081],
       [0.33770259, 0.13627525, 0.78460434]])

<span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Inspect a: still 3x2</span>
a
array([[0.91122299, 0.93234796],
       [0.86025081, 0.33770259],
       [0.13627525, 0.78460434]])

<span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> resize changes the original array in place</span>
a.resize(2, 3)
<span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Inspect a: now 2x3</span>
a
array([[0.91122299, 0.93234796, 0.86025081],
       [0.33770259, 0.13627525, 0.78460434]])

<span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Flatten the array into a single row</span>
a.ravel()
array([0.91122299, 0.93234796, 0.86025081, 0.33770259, 0.13627525,
       0.78460434])</pre>
</div>
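<p>The difference between the three methods can be verified with assertions rather than by eye. This sketch uses small integer arrays (illustrative values, not the random data above) so the results are predictable.</p>

```python
import numpy as np

# Illustrative values: 0..5 arranged as 3 rows x 2 columns
a = np.arange(6).reshape(3, 2)

# reshape hands back a new array object; a keeps its own shape
r = a.reshape(2, 3)
shape_of_a = a.shape
shape_of_r = r.shape

# For a contiguous array the reshaped result shares the same data buffer,
# so "unchanged" refers to the shape, not to independence of the data
shares = np.shares_memory(a, r)

# ravel flattens to one dimension
flat = a.ravel()

# resize works in place and returns None
b = np.array([[0, 1], [2, 3], [4, 5]])
result = b.resize(2, 3)
print(shape_of_a, shape_of_r, shares, flat, b.shape, result)
```

<p>Because the reshaped array may share memory with the original, writing into one can be visible through the other; calling <code>.copy()</code> on the result avoids that when independence matters.</p>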
<h3>Joining arrays: vstack and hstack</h3>
<div class="cnblogs_code">
<pre>a = np.random.randint(10,size=(3,3<span style="color: rgba(0, 0, 0, 1)">))
b </span>= np.random.randint(10,size=(3,3<span style="color: rgba(0, 0, 0, 1)">))
a,b
out:
(array([[</span>1, 4, 7<span style="color: rgba(0, 0, 0, 1)">],
      [</span>5, 6, 6<span style="color: rgba(0, 0, 0, 1)">],
      [</span>6, 4, 5<span style="color: rgba(0, 0, 0, 1)">]]),
array([[</span>8, 3, 1<span style="color: rgba(0, 0, 0, 1)">],
      [</span>1, 5, 8<span style="color: rgba(0, 0, 0, 1)">],
      [</span>5, 0, 6<span style="color: rgba(0, 0, 0, 1)">]]))
</span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Stack vertically (b's rows go below a's rows)</span>
<span style="color: rgba(0, 0, 0, 1)">np.vstack((a,b))
out:
array([[</span>1, 4, 7<span style="color: rgba(0, 0, 0, 1)">],
       [</span>5, 6, 6<span style="color: rgba(0, 0, 0, 1)">],
       [</span>6, 4, 5<span style="color: rgba(0, 0, 0, 1)">],
       [</span>8, 3, 1<span style="color: rgba(0, 0, 0, 1)">],
       [</span>1, 5, 8<span style="color: rgba(0, 0, 0, 1)">],
       [</span>5, 0, 6<span style="color: rgba(0, 0, 0, 1)">]])
</span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Stack horizontally (b's columns go to the right of a's columns)</span>
<span style="color: rgba(0, 0, 0, 1)">np.hstack((a,b))
array([[</span>1, 4, 7, 8, 3, 1<span style="color: rgba(0, 0, 0, 1)">],
       [</span>5, 6, 6, 1, 5, 8<span style="color: rgba(0, 0, 0, 1)">],
       [</span>6, 4, 5, 5, 0, 6]])</pre>
</div>
<h3>Splitting arrays: vsplit and hsplit</h3>
<div class="cnblogs_code">
<pre><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Split horizontally (column-wise, along axis 1)</span>
np.hsplit(a, 3)
[array([[1],
       [5],
       [6]]),
 array([[4],
       [6],
       [4]]),
 array([[7],
       [6],
       [5]])]

<span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Split vertically (row-wise, along axis 0)</span>
np.vsplit(a, 3)
[array([[1, 4, 7]]), array([[5, 6, 6]]), array([[6, 4, 5]])]</pre>
</div>
<h3>Array sorting: maximum, minimum and their indices</h3>
<div class="cnblogs_code">
<pre><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Build a sample array</span>
a = np.array(([1, 4, 3], [6, 2, 9], [4, 7, 2]))
a
array([[1, 4, 3],
       [6, 2, 9],
       [4, 7, 2]])

<span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Maximum of each column</span>
np.max(a, axis=0)
array([6, 7, 9])

<span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Minimum of each row</span>
np.min(a, axis=1)
array([1, 2, 2])

<span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Index of the maximum in each column</span>
np.argmax(a, axis=0)
array([1, 2, 1])

<span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Index of the minimum in each row</span>
np.argmin(a, axis=1)
array([0, 1, 2])</pre>
</div>
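<p>The block above finds extreme values and their positions; actual sorting is done with <code>np.sort</code> and <code>np.argsort</code>. A short sketch on the same 3x3 sample array (the choice to order rows by their first column is just an example):</p>

```python
import numpy as np

a = np.array([[1, 4, 3],
              [6, 2, 9],
              [4, 7, 2]])

# Sort the values inside each row (axis=1) or inside each column (axis=0);
# np.sort returns a sorted copy and leaves a untouched
sorted_rows = np.sort(a, axis=1)
sorted_cols = np.sort(a, axis=0)

# argsort returns the indices that would sort the data;
# here they reorder whole rows by the value in the first column
order = np.argsort(a[:, 0])
rows_by_first_col = a[order]
print(sorted_rows, sorted_cols, order, rows_by_first_col, sep="\n")
```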
<p>&nbsp;</p>
<h2>numpy: statistical methods and random number generation</h2>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190421121525524-902070626.png" alt=""></p>
<div class="cnblogs_code">
<pre><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Median of each column</span>
np.median(a, axis=0)

<span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Arithmetic mean of each row</span>
np.mean(a, axis=1)

<span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Average of each column; np.average accepts an optional weights argument, and without weights it equals the arithmetic mean</span>
np.average(a, axis=0)

<span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Variance of each row</span>
np.var(a, axis=1)

<span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Standard deviation of each column</span>
np.std(a, axis=0)</pre>
</div>
<p>&nbsp;</p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190421121558305-1556545535.png" alt=""></p>
<h3>Math time</h3>
<div class="cnblogs_code">
<pre>Data: 1 2 3 4 5
Mean: 3
Variance: subtract 3 from each number, square the differences, add them up, then divide by the count: (4 + 1 + 0 + 1 + 4) / 5 = 2</pre>
</div>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190421122024548-1558395989.png" alt=""></p>
<div class="cnblogs_code">
<pre>Standard deviation: the square root of the variance</pre>
</div>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190421122255356-1114018891.png" alt=""></p>
<p>Variance measures how widely the values in an array are spread around their mean</p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190421122527761-1515266837.png" alt=""></p>
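<p>For normally distributed data, about 95% of the values fall within two standard deviations (not two variances) of the mean</p>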
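<p>The hand calculation for 1 2 3 4 5 can be checked against NumPy, and the share of values near the mean can be estimated by simulation. This is a sketch only: the seed and the sample size are arbitrary assumptions, and the simulated share is an estimate for normally distributed data, not an exact figure.</p>

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5])

# Variance by hand: mean of the squared deviations from the mean
mean = x.mean()
var_by_hand = ((x - mean) ** 2).sum() / x.size
std_by_hand = np.sqrt(var_by_hand)

# np.var and np.std divide by the element count by default (ddof=0)
var_np = np.var(x)
std_np = np.std(x)

# Simulation: draw standard normal samples (seed and size are arbitrary)
rng = np.random.default_rng(0)
sample = rng.normal(loc=0.0, scale=1.0, size=100_000)

# Fraction of samples within two standard deviations of the sample mean
within = np.abs(sample - sample.mean()) <= 2 * sample.std()
share = within.mean()
print(mean, var_by_hand, std_by_hand, share)
```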
<p>Matrix multiplication</p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190422145348786-1596795075.png" alt=""></p>
<h4>Matrix multiplication (note the difference from a*b)</h4>
<div class="cnblogs_code">
<pre><span style="color: rgba(0, 0, 0, 1)">A = np.array([,])<br>B = np.array([,])<br><br>np.dot(A,B)
array([[</span>19, 22<span style="color: rgba(0, 0, 0, 1)">],
       [</span>43, 50]])</pre>
</div>
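<p>To make the contrast with <code>a*b</code> concrete, here is a sketch with two 2x2 matrices. The entries of <code>A</code> and <code>B</code> are assumptions: they are one choice of values that reproduces the product shown above.</p>

```python
import numpy as np

# Assumed values, chosen so that the product matches the output above
A = np.array([[1, 2],
              [3, 4]])
B = np.array([[5, 6],
              [7, 8]])

# Matrix product: each entry is a row of A dotted with a column of B,
# e.g. the top-left entry is 1*5 + 2*7 = 19
matrix_product = np.dot(A, B)

# The @ operator computes the same matrix product
with_operator = A @ B

# The * operator multiplies entries position by position instead
elementwise = A * B
print(matrix_product, elementwise, sep="\n")
```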
<h4>Math functions</h4>
<div class="cnblogs_code">
<pre><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Trigonometric functions (inputs are in radians)</span>
a = np.array([10, 20, 30, 40, 50])

np.sin(a)
array([-0.54402111,  0.91294525, -0.98803162,  0.74511316, -0.26237485])

<span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Exponential function with base e</span>
np.exp(a)
array([2.20264658e+04, 4.85165195e+08, 1.06864746e+13, 2.35385267e+17,
       5.18470553e+21])

<span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Square root</span>
np.sqrt(a)
array([3.16227766, 4.47213595, 5.47722558, 6.32455532, 7.07106781])

<span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Power: cube each element</span>
np.power(a, 3)
array([  1000,   8000,  27000,  64000, 125000])</pre>
</div>
<h3>Random numbers</h3>
<div class="cnblogs_code">
<pre><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Create a 2-D array of random floats</span>
np.random.rand(2, 3)
array([[0.46181641, 0.06400509, 0.93763711],
       [0.67133387, 0.0801051 , 0.81633397]])

<span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Create a 2-D array of random integers</span>
np.random.randint(5, size=(2, 3))
array([[4, 2, 2],
       [4, 0, 0]])</pre>
</div>
<div class="cnblogs_code">
<pre>In : np.random.randint(0,10,10<span style="color: rgba(0, 0, 0, 1)">)
Out[</span>61]: array()
</span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Generate a multi-dimensional random array</span>
In : np.random.randint(0,10,(3,5<span style="color: rgba(0, 0, 0, 1)">))   # or pass size=(3,5), as in the previous example
Out[</span>62<span style="color: rgba(0, 0, 0, 1)">]:
array([[</span>4, 5, 7, 7, 8<span style="color: rgba(0, 0, 0, 1)">],
       [</span>4, 1, 5, 1, 4<span style="color: rgba(0, 0, 0, 1)">],
       [</span>2, 3, 9, 6, 8<span style="color: rgba(0, 0, 0, 1)">]])

</span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Random floats in the interval [0, 1)</span>
In : np.random.rand(10<span style="color: rgba(0, 0, 0, 1)">)
Out[</span>63<span style="color: rgba(0, 0, 0, 1)">]:
array([</span>0.97926997, 0.17454168, 0.52831388, 0.28070782, 0.2715298<span style="color: rgba(0, 0, 0, 1)"> ,
       </span>0.2749287 , 0.44007621, 0.56472258, 0.53291951, 0.30727733<span style="color: rgba(0, 0, 0, 1)">])

</span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Random picks from a given list of values</span>
In : np.random.choice(,10<span style="color: rgba(0, 0, 0, 1)">)
Out[</span>64]: array()

In [</span>65]: np.random.choice(,(2,3<span style="color: rgba(0, 0, 0, 1)">))
Out[</span>65<span style="color: rgba(0, 0, 0, 1)">]:
array([[</span>5, 2, 3<span style="color: rgba(0, 0, 0, 1)">],
       [</span>3, 4, 4<span style="color: rgba(0, 0, 0, 1)">]])

</span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> uniform: uniform distribution; every value in the interval [low, high) is equally likely</span>
In : np.random.uniform(2.0,4.0,10<span style="color: rgba(0, 0, 0, 1)">)
Out[</span>67<span style="color: rgba(0, 0, 0, 1)">]:
array([</span>3.30135597, 2.5034658 , 3.80415042, 3.58323964, 2.82819204<span style="color: rgba(0, 0, 0, 1)">,
       </span>3.45701693, 2.51628589, 3.94588971, 2.46530701, 3.269412<span style="color: rgba(0, 0, 0, 1)">])

In [</span>68]: np.random.uniform(2,4,10<span style="color: rgba(0, 0, 0, 1)">)
Out[</span>68<span style="color: rgba(0, 0, 0, 1)">]:
array([</span>3.99532675, 2.27704994, 2.44378248, 2.33492658, 3.79537452<span style="color: rgba(0, 0, 0, 1)">,
       </span>2.6754694 , 3.04022564, 2.12863367, 3.27047096, 3.70261513<span style="color: rgba(0, 0, 0, 1)">])

In [</span>69]: <span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> numpy.random offers array-oriented counterparts to many functions in Python's random module</span></pre>
</div>
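<p>The outputs above change on every run. For repeatable results a seeded generator can be used. The sketch below swaps in NumPy's newer <code>Generator</code> interface (<code>np.random.default_rng</code>) in place of the <code>np.random.*</code> functions used above; the seed 42 and the pool <code>[2, 3, 4, 5]</code> are arbitrary assumptions.</p>

```python
import numpy as np

# A seeded generator produces the same stream on every run (seed is arbitrary)
rng = np.random.default_rng(42)

ints = rng.integers(0, 10, size=(3, 5))         # integers in [0, 10)
floats = rng.random(10)                         # floats in [0, 1)
picks = rng.choice([2, 3, 4, 5], size=(2, 3))   # draws from a hypothetical pool
unif = rng.uniform(2.0, 4.0, size=10)           # floats in [2.0, 4.0)

# A second generator with the same seed repeats the first draw exactly
rng_again = np.random.default_rng(42)
ints_again = rng_again.integers(0, 10, size=(3, 5))
print(np.array_equal(ints, ints_again))
```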
<h3><strong><strong>fromfunction: </strong>building an array from a custom function</strong></h3>
<div class="cnblogs_code">
<pre>&gt;&gt;&gt; def f(x, y):
...     return 10*x + y
...
&gt;&gt;&gt; b = np.fromfunction(f, (5, 4), dtype=int)
&gt;&gt;&gt; b
array([[ 0,  1,  2,  3],
       [10, 11, 12, 13],
       [20, 21, 22, 23],
       [30, 31, 32, 33],
       [40, 41, 42, 43]])

&gt;&gt;&gt; np.fromfunction(lambda i, j: i + j, (3, 3))
array([[0., 1., 2.],
       [1., 2., 3.],
       [2., 3., 4.]])
<span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Rule: the row and column indices of each element are passed to the function as x and y</span></pre>
</div>
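<p>One detail worth seeing in code: <code>np.fromfunction</code> does not call the function once per element. It passes whole index arrays in a single call, so the function must work on arrays. A sketch showing the equivalent call through <code>np.indices</code>:</p>

```python
import numpy as np

def f(x, y):
    # x and y arrive as full index grids, not as single numbers
    return 10 * x + y

b = np.fromfunction(f, (5, 4), dtype=int)

# np.indices builds the same row and column index grids explicitly
rows, cols = np.indices((5, 4))
same = f(rows, cols)

print(np.array_equal(b, same))
print(rows[:, 0], cols[0, :])
```

<p>A function that only handles scalars (for example one using <code>math.sqrt</code>) would therefore fail here.</p>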
<p>&nbsp;</p>
<p>Many advanced numpy features are not covered here; compared with pandas, numpy is the more low-level package</p>
<p>Next up: pandas</p>
<h1>Chapter 3 - The core data analysis package: pandas</h1>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190421124530048-1151066614.png" alt=""></p>
<h2>Series: a one-dimensional data object</h2>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190421124714542-634469400.png" alt=""></p>
<div class="cnblogs_code">
<pre>In : import pandas as pd

In [73]: pd.Series([2,3,4,5])
Out[73]:
0    2
1    3
2    4
3    5
dtype: int64

In [74]: pd.Series([2,3,4,5],index=['a','b','c','d'])
Out[74]:
a    2
b    3
c    4
d    5
dtype: int64

In [75]: <span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> So a Series behaves like a cross between a list and a dict</span>

In [76]: pd.Series(np.arange(5))
Out[76]:
0    0
1    1
2    2
3    3
4    4
dtype: int32

In [77]: <span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> After a custom index is set, the original positional subscripts still work</span>
In : sr = pd.Series([2,3,4,5],index=['a','b','c','d'])

In [83]: sr
Out[83]:
a    2
b    3
c    4
d    5
dtype: int64

In [84]: sr[2]
Out[84]: 4
<span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> So there are two ways to index: by position, and by label (like a dict key)</span>
<span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Newer pandas releases deprecate this positional fallback; sr.iloc[2] is the explicit form</span></pre>
</div>
<h2>Series: usage features</h2>
<p>&nbsp;<img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190421144515623-1170452106.png" alt=""></p>
<div class="cnblogs_code">
<pre><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Create a Series from a dict</span>
In : sr = pd.Series({'a':1,'b':2})
In [86]: sr
Out[86]:
a    1
b    2
dtype: int64
<span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Index by key</span>
In [88]: sr['a']
Out[88]: 1
<span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Using in (it tests the index labels)</span>
In [89]: 'a' in sr
Out[89]: True
<span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> A Series created from a dict also supports positional indexing</span>
In [87]: sr[1]
Out[87]: 2
<span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> One difference from a dict: a for loop over a dict yields the keys, while a for loop over a Series yields the values</span>
In : for i in sr:
    ...:     print(i)
    ...:
1
2

<span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Get the index</span>
In [91]: sr.index
Out[91]: Index(['a', 'b'], dtype='object')
In [93]: sr.index[0]
Out[93]: 'a'
<span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Get the values</span>
In [94]: sr.values
Out[94]: array([1, 2], dtype=int64)

In [95]: sr.values[0]
Out[95]: 1
<span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Fancy indexing (here a holds the values 3, 4, 5, 6, 7)</span>
In : sr = pd.Series(a,index=['a','b','c','d','e'])

In [102]: sr
Out[102]:
a    3
b    4
c    5
d    6
e    7
dtype: int32

In [103]: sr[['a','e','c']]
Out[103]:
a    3
e    7
c    5
dtype: int32
<span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Slicing by label includes both endpoints</span>
In [106]: sr['b':'d']
Out[106]:
b    4
c    5
d    6
dtype: int32</pre>
</div>
<h3>Series: the integer index problem</h3>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190421150402669-749309298.png" alt=""></p>
<div class="cnblogs_code">
<pre>In : sr = pd.Series(np.arange(10))

In [108]: sr
Out[108]:
0    0
1    1
2    2
3    3
4    4
5    5
6    6
7    7
8    8
9    9
dtype: int32

In [111]: sr2 = sr[5:].copy()
In [112]: sr2
Out[112]:
5    5
6    6
7    7
8    8
9    9
dtype: int32
<span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Here the trouble starts: the integer labels of sr2 no longer start at 0</span>
In [113]: sr2[5]
Out[113]: 5
<span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> The lookup would be ambiguous, so when the index holds integers, an integer subscript is always interpreted as a label</span></pre>
</div>
<h3>The fix: loc and iloc</h3>
<div class="cnblogs_code">
<pre>In [114]: sr2.loc[5]
Out[114]: 5

In [115]: sr2.iloc[-1]
Out[115]: 9
<span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> sr2 holds only 5 elements, so a positional index such as sr2.iloc[5] is out of bounds and raises an IndexError</span></pre>
</div>
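<p>The label/position distinction can be replayed as a script. A minimal sketch, assuming the same ten-element Series; the slice is taken with <code>iloc</code> here so that it is explicitly positional.</p>

```python
import numpy as np
import pandas as pd

sr = pd.Series(np.arange(10))

# Keep the last five elements: labels are 5..9, positions are 0..4
sr2 = sr.iloc[5:].copy()

by_label = sr2.loc[5]        # label 5 is the first element
by_position = sr2.iloc[0]    # position 0 is the same element
last = sr2.iloc[-1]          # negative positions count from the end

# Position 5 does not exist in a five-element Series
try:
    sr2.iloc[5]
    position_error = None
except IndexError as exc:
    position_error = exc

# Label 0 does not exist either, so loc raises a KeyError
try:
    sr2.loc[0]
    label_error = None
except KeyError as exc:
    label_error = exc

print(by_label, by_position, last)
```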
<h2>Series: data alignment</h2>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190421152454836-808421223.png" alt=""></p>
<p>Arithmetic between Series is aligned on index labels</p>
<div class="cnblogs_code">
<pre>In : sr1 = pd.Series(,index=[<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">c</span><span style="color: rgba(128, 0, 0, 1)">'</span>,<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">a</span><span style="color: rgba(128, 0, 0, 1)">'</span>,<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">d</span><span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(0, 0, 0, 1)">])
In [</span>118]: sr2 = pd.Series(,index=[<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">d</span><span style="color: rgba(128, 0, 0, 1)">'</span>,<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">c</span><span style="color: rgba(128, 0, 0, 1)">'</span>,<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">a</span><span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(0, 0, 0, 1)">])

In [</span>119]: sr1+<span style="color: rgba(0, 0, 0, 1)">sr2
Out[</span>119<span style="color: rgba(0, 0, 0, 1)">]:
a    </span>33<span style="color: rgba(0, 0, 0, 1)">
c    </span>32<span style="color: rgba(0, 0, 0, 1)">
d    </span>45<span style="color: rgba(0, 0, 0, 1)">
dtype: int64</span></pre>
</div>
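<p>pandas can also combine Series of different lengths; a label that is missing from either side produces NaN, which marks a <strong>missing value</strong></p>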
<div class="cnblogs_code">
<pre>In : sr1 = pd.Series(,index=[<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">c</span><span style="color: rgba(128, 0, 0, 1)">'</span>,<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">a</span><span style="color: rgba(128, 0, 0, 1)">'</span>,<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">d</span><span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(0, 0, 0, 1)">])

In [</span>121]: sr2 = pd.Series(,index=[<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">d</span><span style="color: rgba(128, 0, 0, 1)">'</span>,<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">c</span><span style="color: rgba(128, 0, 0, 1)">'</span>,<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">a</span><span style="color: rgba(128, 0, 0, 1)">'</span>,<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">b</span><span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(0, 0, 0, 1)">])

In [</span>122]: sr1+<span style="color: rgba(0, 0, 0, 1)">sr2
Out[</span>122<span style="color: rgba(0, 0, 0, 1)">]:
a    </span>33.0<span style="color: rgba(0, 0, 0, 1)"><strong>
b   NaN</strong>
c    </span>32.0<span style="color: rgba(0, 0, 0, 1)">
d    </span>45.0<span style="color: rgba(0, 0, 0, 1)">
dtype: float64</span></pre>
</div>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190421153515029-600560682.png" alt=""></p>
<div class="cnblogs_code">
<pre>In [123]: sr1 = pd.Series([12,23,34],index=['c','a','d'])

In [124]: sr2 = pd.Series([11,20,10],index=['c','a','b'])

In [125]: sr1+sr2
Out[125]:
a    43.0
b     NaN
c    23.0
d     NaN
dtype: float64</pre>
</div>
<p>Sometimes, though, we do not want NaN to appear in the result.</p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190421153635382-308812488.png" alt=""></p>
<div class="cnblogs_code">
<pre>In [126]: sr1.add(sr2,fill_value=0)
Out[126]:
a    43.0
b    10.0
c    23.0
d    34.0
dtype: float64</pre>
</div>
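<p>As a minimal sketch of the same idea, the other arithmetic methods (sub, mul, div) accept fill_value as well. The variable names total and diff are illustrative only; the data repeats the values used above.</p>

```python
import pandas as pd

sr1 = pd.Series([12, 23, 34], index=["c", "a", "d"])
sr2 = pd.Series([11, 20, 10], index=["c", "a", "b"])

# A label missing from one operand is treated as 0 instead of producing NaN.
total = sr1.add(sr2, fill_value=0)
diff = sr1.sub(sr2, fill_value=0)

print(total)
print(diff)
```

<p>Note that fill_value only helps when a label is missing on one side; the result index is still the union of both indexes.</p>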
<p>Series: missing data and how to handle it</p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190421153931786-618040992.png" alt=""></p>
<p>There are two approaches to handling missing data: dropping it and filling it.</p>
<p>To test for missing data, use isnull and notnull.</p>
<div class="cnblogs_code">
<pre>In [127]: sr.isnull()
Out[127]:
0    False
1    False
2    False
dtype: bool</pre>
</div>
<p>Ways to drop missing data</p>
<div class="cnblogs_code">
<pre># Option 1: select with a boolean mask (sr is assumed to hold the sr1+sr2 result from above)
In [132]: sr[sr.notnull()]
Out[132]:
a    43.0
c    23.0
dtype: float64
# Option 2: drop with dropna
In [134]: sr.dropna()
Out[134]:
a    43.0
c    23.0
dtype: float64</pre>
</div>
<p>Ways to fill missing data</p>
<div class="cnblogs_code">
<pre># Fill with fillna
In [133]: sr.fillna(0)
Out[133]:
a    43.0
b     0.0
c    23.0
d     0.0
dtype: float64</pre>
</div>
<p>Sometimes 0 is not a sensible fill value; we can <strong>fill with the mean</strong> instead.</p>
<div class="cnblogs_code">
<pre>In [135]: sr.fillna(<span style="color: rgba(255, 0, 0, 1)">sr.mean()</span>)
Out[135]:
a    43.0
b    33.0
c    23.0
d    33.0
dtype: float64</pre>
</div>
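<p>When computing the mean, pandas skips NaN by default. To stop it from skipping them, pass skipna=False, in which case any NaN makes the result NaN.</p>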
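<p>A short sketch of how skipna changes the mean, using the same values as the session above. The names mean_skipping and mean_strict are made up for this example.</p>

```python
import numpy as np
import pandas as pd

sr = pd.Series([43.0, np.nan, 23.0, np.nan], index=["a", "b", "c", "d"])

mean_skipping = sr.mean()            # NaN ignored: (43 + 23) / 2
mean_strict = sr.mean(skipna=False)  # NaN propagates into the result

# Filling with the NaN-skipping mean reproduces the output shown above.
filled = sr.fillna(mean_skipping)

print(mean_skipping, mean_strict)
print(filled)
```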
<h3>Series summary</h3>
<div class="cnblogs_code">
<pre>What a Series is: a cross between an array and a dictionary
The integer-index pitfall: use loc and iloc
Data alignment: label-based, and it introduces missing values
Handling missing values: drop or fill
How pandas mean treats NaN (skipped by default) and how to use that</pre>
</div>
<h2>DataFrame: a two-dimensional data object</h2>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190421155512261-1423098807.png" alt=""></p>
<div class="cnblogs_code">
<pre># First way to create one: a dict of lists
In [137]: df=pd.DataFrame({'one':[1,2,3],'two':[4,5,6]})

In [138]: df
Out[138]:
   one  two
0    1    4
1    2    5
2    3    6
# Second way to create one: a dict of Series
In [140]: pd.DataFrame({'one':pd.Series([1,2,3],index=['a','b','c']),'two':pd.Series([4,5,6,7],index=['a','b','c','d'])})
Out[140]:
   one  two
a  1.0    4
b  2.0    5
c  3.0    6
d  NaN    7
# There are many other ways to create one...</pre>
</div>
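<p>Two of those other ways, as a rough sketch: a list of row dictionaries and a 2-D NumPy array with explicit labels. The names df_rows and df_array are illustrative.</p>

```python
import numpy as np
import pandas as pd

# From a list of row dictionaries; the keys become column names.
df_rows = pd.DataFrame([{"one": 1, "two": 4}, {"one": 2, "two": 5}])

# From a 2-D NumPy array with explicit row and column labels.
df_array = pd.DataFrame(
    np.arange(6).reshape(3, 2),
    index=["a", "b", "c"],
    columns=["one", "two"],
)

print(df_rows)
print(df_array)
```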
<h3>Reading and writing files</h3>
<p>vim test.csv</p>
<div class="cnblogs_code">
<pre><span style="color: rgba(0, 0, 0, 1)">a,b,c
</span>1,2,3
4,5,6
7,8,9</pre>
</div>
<p>Read the CSV file</p>
<div class="cnblogs_code">
<pre>In [145]: pd.read_csv('test.csv')
Out[145]:
   a  b  c
0  1  2  3
1  4  5  6
2  7  8  9</pre>
</div>
<p>Save to a CSV file</p>
<div class="cnblogs_code">
<pre>In [147]: df
Out[147]:
   a  b  c
0  1  2  3
1  4  5  6
2  7  8  9

In [148]: df.to_csv('test2.csv')</pre>
</div>
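<p>One detail worth seeing: by default to_csv also writes the row index as the first column. The sketch below uses an in-memory string instead of test.csv so that it runs without any files; with_index and without_index are illustrative names.</p>

```python
import io

import pandas as pd

# Same content as test.csv, read from memory instead of disk.
df = pd.read_csv(io.StringIO("a,b,c\n1,2,3\n4,5,6\n7,8,9\n"))

# With no path argument, to_csv returns the CSV text as a string.
with_index = df.to_csv()                 # first column holds the row index
without_index = df.to_csv(index=False)   # only the data columns

print(with_index)
print(without_index)
```

<p>Passing index=False is usually what you want when the index is just the default 0, 1, 2 numbering.</p>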
<h2>DataFrame: common attributes</h2>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190421175036219-174650436.png" alt=""></p>
<p>index returns the row index and values returns the data as a two-dimensional array. Series has the same two attributes, except that its values array is one-dimensional.</p>
<div class="cnblogs_code">
<pre>In [156]: df = _140

In [157]: df
Out[157]:
   one  two
a  1.0    4
b  2.0    5
c  3.0    6
d  NaN    7

In [158]: df.index
Out[158]: Index(['a', 'b', 'c', 'd'], dtype='object')

In [159]: df.values
Out[159]:
array([[ 1.,  4.],
       [ 2.,  5.],
       [ 3.,  6.],
       [nan,  7.]])</pre>
</div>
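<p>T transposes the table: rows become columns and columns become rows. Each column holds a single dtype, so values of different types that end up in the same column after transposing are converted to a common dtype (this happens by default on every transpose).</p>
<p>A dtype can also be specified explicitly.</p>
<div class="cnblogs_code">
<pre>In [160]: df.T
Out[160]:
       a    b    c    d
one  1.0  2.0  3.0  NaN
two  4.0  5.0  6.0  7.0</pre>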
</div>
<p>columns returns the column index</p>
<div class="cnblogs_code">
<pre>In [163]: df.columns
Out[163]: Index(['one', 'two'], dtype='object')</pre>
</div>
<p>Quick summary statistics</p>
<div class="cnblogs_code">
<pre>In [165]: df.describe()
Out[165]:
       one       two
count  3.0  4.000000   # number of non-NaN values
mean   2.0  5.500000   # mean
std    1.0  1.290994   # standard deviation
min    1.0  4.000000   # minimum
25%    1.5  4.750000   # 25th percentile
50%    2.0  5.500000   # median
75%    2.5  6.250000   # 75th percentile
max    3.0  7.000000   # maximum</pre>
</div>
<h2>DataFrame: indexing and slicing</h2>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190421180736776-1235038841.png" alt=""></p>
<div class="cnblogs_code">
<pre># Select the column first, then the row
In [168]: df
Out[168]:
   one  two
a  1.0    4
b  2.0    5
c  3.0    6
d  NaN    7

In [169]: df['one']['a']
Out[169]: 1.0

In [170]: df['one'][1]
Out[170]: 2.0

In [171]: df['one'][0]
Out[171]: 1.0</pre>
</div>
<p>Prefer loc or iloc for selection; chained double brackets are not recommended.</p>
<div class="cnblogs_code">
<pre>In [172]: df.loc['a','one']
Out[172]: 1.0

In [173]: df.loc['a',:]
Out[173]:
one    1.0
two    4.0
Name: a, dtype: float64</pre>
</div>
<p>Selectors can be combined flexibly</p>
<div class="cnblogs_code">
<pre>In [174]: df.loc[['a','c'],:]
Out[174]:
   one  two
a  1.0    4
c  3.0    6</pre>
</div>
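<p>To make the difference between the two accessors concrete, here is a small sketch on the same table: loc selects by label and iloc by integer position. The variable names are illustrative.</p>

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"one": [1.0, 2.0, 3.0, np.nan], "two": [4, 5, 6, 7]},
    index=list("abcd"),
)

by_label = df.loc["b", "one"]          # row label 'b', column label 'one'
by_position = df.iloc[1, 0]            # second row, first column
label_slice = df.loc["a":"c", "two"]   # label slices include the end label
pos_slice = df.iloc[0:2, :]            # positional slices exclude the end

print(by_label, by_position)
print(label_slice)
print(pos_slice)
```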
<h2>DataFrame: data alignment and missing data</h2>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190421183117475-689914436.png" alt=""></p>
<p>By default, DataFrame.dropna removes every row that contains at least one missing value.</p>
<p>Pass how='all' to drop only the rows in which every value is NaN.</p>
<div class="cnblogs_code">
<pre>In [177]: df.loc[['c','d'],'two'] = np.nan

In [178]: df
Out[178]:
   one  two
a  1.0  4.0
b  2.0  5.0
c  3.0  NaN
d  NaN  NaN

In [179]: df.dropna(how='all')
Out[179]:
   one  two
a  1.0  4.0
b  2.0  5.0
c  3.0  NaN
# how defaults to 'any': a row is dropped as soon as it contains a single NaN</pre>
</div>
<p>How do we drop every column that contains a missing value?</p>
<p>The axis parameter selects the axis to operate on. The default, 0, refers to rows; 1 refers to columns.</p>
<div class="cnblogs_code">
<pre>In [184]: df
Out[184]:
   one  two
a  1.0  4.0
b  2.0  5.0
c  3.0  NaN
d  5.0  NaN

In [185]: df.dropna(axis=1)
Out[185]:
   one
a  1.0
b  2.0
c  3.0
d  5.0</pre>
</div>
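<p>The three dropna variants discussed above can be compared side by side. This is a minimal sketch on the same data; fixed is a hypothetical copy in which the NaN in column one has been replaced, mirroring the session above.</p>

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"one": [1.0, 2.0, 3.0, np.nan], "two": [4.0, 5.0, np.nan, np.nan]},
    index=list("abcd"),
)

rows_any = df.dropna()            # default how='any': keeps rows a and b
rows_all = df.dropna(how="all")   # only row d is entirely NaN

# Give row d a value in 'one', then drop the columns that still contain NaN.
fixed = df.copy()
fixed.loc["d", "one"] = 5.0
cols_any = fixed.dropna(axis=1)   # column 'two' is removed

print(rows_any)
print(rows_all)
print(cols_any)
```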
<p>pandas: other commonly used methods</p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190421184528138-2135477873.png" alt=""></p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190421184631253-1370829870.png" alt=""></p>
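<p>When sorting, ascending=False gives descending order, and by names the column (or row) to sort by.</p>
<p>If the column (or row) being sorted contains NaN, those entries are placed at the end by default and <span style="color: rgba(0, 0, 255, 1)"><strong>do not take part in the sort</strong></span>.</p>
<p><span style="color: rgba(0, 0, 255, 1)"><strong>NumPy's universal functions (ufuncs) also work on pandas objects</strong></span></p>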
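<p>A brief sketch of both points, with made-up data: sort_values keeps NaN rows last regardless of direction unless na_position says otherwise, and a NumPy ufunc applied to a Series returns a Series with the same index.</p>

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"one": [3.0, np.nan, 1.0, 2.0], "two": [4, 9, 16, 25]},
    index=list("abcd"),
)

# Descending sort; the NaN row (b) still ends up last.
desc = df.sort_values(by="one", ascending=False)

# na_position='first' moves the NaN row to the top instead.
nan_first = df.sort_values(by="one", na_position="first")

# A NumPy ufunc works element-wise and preserves the index.
roots = np.sqrt(df["two"])

print(desc)
print(nan_first)
print(roots)
```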
<h2>pandas: working with time objects</h2>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190421192831817-429366622.png" alt=""></p>
<p>With datetime, a date string can be parsed into a datetime object</p>
<div class="cnblogs_code">
<pre>In [186]: import datetime

In [187]: datetime.datetime.strptime('2010-01-01','%Y-%m-%d')
Out[187]: datetime.datetime(2010, 1, 1, 0, 0)<br>Mnemonic for strptime: p stands for parse<br>Mnemonic for strftime: f stands for format</pre>
</div>
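<p>The two mnemonics describe a round trip: strptime parses text into a datetime, and strftime formats a datetime back into text. A small standard-library sketch (the output format string is just an example):</p>

```python
from datetime import datetime

# strptime: parse a string according to the given format.
parsed = datetime.strptime("2010-01-01", "%Y-%m-%d")

# strftime: format the datetime back into a string, here day/month/year.
formatted = parsed.strftime("%d/%m/%Y")

print(parsed)
print(formatted)
```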
<p>Not everyone writes dates in this format, though. The dateutil library can recognise many common formats for us.</p>
<div class="cnblogs_code">
<pre>In [190]: import dateutil.parser

In [191]: dateutil.parser.parse('02/03/2010')
Out[191]: datetime.datetime(2010, 2, 3, 0, 0)

In [192]: dateutil.parser.parse('02-03-2010')
Out[192]: datetime.datetime(2010, 2, 3, 0, 0)

In [193]: dateutil.parser.parse('2010-JAN-10')
Out[193]: datetime.datetime(2010, 1, 10, 0, 0)</pre>
</div>
<p>to_datetime in pandas relies on this module and converts a whole batch of strings at once.</p>
<div class="cnblogs_code">
<pre>In [194]: pd.to_datetime(['02-03-2010','2010-JAN-10'])
Out[194]: DatetimeIndex(['2010-02-03', '2010-01-10'], dtype='datetime64[ns]', freq=None)</pre>
</div>
<p>Note: the result is a DatetimeIndex.</p>
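<p>A hedged sketch of batch conversion with an explicit format, which avoids relying on format guessing (recent pandas versions may be stricter about lists that mix formats, as the session above does). The errors='coerce' option turns unparseable entries into NaT instead of raising. The names idx and coerced are illustrative.</p>

```python
import pandas as pd

# All strings share one format, stated explicitly.
idx = pd.to_datetime(["2010-01-01", "2010-02-03"], format="%Y-%m-%d")

# Unparseable entries become NaT (the datetime counterpart of NaN).
coerced = pd.to_datetime(
    ["2010-01-01", "not a date"], format="%Y-%m-%d", errors="coerce"
)

print(idx)
print(coerced)
```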
<h3>Generating time objects: date_range</h3>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190421194957570-1899547785.png" alt=""></p>
<div class="cnblogs_code">
<pre>In [195]: pd.date_range('2010-01-01','2010-05-01')
Out[195]:
DatetimeIndex(['2010-01-01', '2010-01-02', '2010-01-03', '2010-01-04',
               '2010-01-05', '2010-01-06', '2010-01-07', '2010-01-08',
               '2010-01-09', '2010-01-10',
               ...
               '2010-04-22', '2010-04-23', '2010-04-24', '2010-04-25',
               '2010-04-26', '2010-04-27', '2010-04-28', '2010-04-29',
               '2010-04-30', '2010-05-01'],
              dtype='datetime64[ns]', length=121, freq='D')</pre>
</div>
<p>Use periods to specify how many dates to generate.</p>
<p>Run pd.date_range? to see the help text for its parameters.</p>
<div class="cnblogs_code">
<pre>start : str or datetime-like, optional
    Left bound for generating dates.
end : str or datetime-like, optional
    Right bound for generating dates.
periods : integer, optional   <strong>number of dates</strong>
    Number of periods to generate.
freq : str or DateOffset, default 'D'   <strong>frequency: H = hour, W = week, W-MON, W-WED</strong>
    Frequency strings can have multiples, e.g. '5H'. See   // <strong>B = business day</strong>
    :ref:`here &lt;timeseries.offset_aliases&gt;` for a list of   // <strong>1H20min</strong>
    frequency aliases.
tz : str or tzinfo, optional
    Time zone name for returning localized DatetimeIndex, for example
    'Asia/Hong_Kong'. By default, the resulting DatetimeIndex is
    timezone-naive.
normalize : bool, default False
    Normalize start/end dates to midnight before generating date range.
name : str, default None
    Name of the resulting DatetimeIndex.
closed : {None, 'left', 'right'}, optional
    Make the interval closed with respect to the given frequency to
    the 'left', 'right', or both sides (None, the default).</pre>
</div>
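<p>Each element is a Timestamp object, which <strong>to_pydatetime</strong> converts to a standard-library datetime object.</p>
<p>It can also be converted to a string.</p>
<p><strong>The freq parameter of date_range accepts a wide variety of ways to define the interval.</strong></p>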
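<p>A small sketch tying these points together: periods, two of the freq aliases mentioned in the help text (B and W-MON), and the conversion of a Timestamp to a datetime or a string. The variable names are illustrative.</p>

```python
import pandas as pd

days = pd.date_range("2010-01-01", periods=5)                    # 5 calendar days
business = pd.date_range("2010-01-01", "2010-01-10", freq="B")   # weekdays only
mondays = pd.date_range("2010-01-01", periods=3, freq="W-MON")   # weekly, on Mondays

first = days[0]                        # a pandas Timestamp
as_datetime = first.to_pydatetime()    # standard-library datetime
as_text = first.strftime("%Y-%m-%d")   # plain string

print(business)
print(mondays)
print(type(as_datetime), as_text)
```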
<h3>Time series</h3>
<p>The generated time objects can serve as the index of a time series.</p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190421195614007-552567418.png" alt=""></p>
<div class="cnblogs_code">
<pre>In [198]: sr = pd.Series(np.arange(5),index=pd.date_range('2010-01-01',periods=5))

In [199]: sr
Out[199]:
2010-01-01    0
2010-01-02    1
2010-01-03    2
2010-01-04    3
2010-01-05    4
Freq: D, dtype: int32</pre>
</div>
<p>What is this good for? The most direct benefit is that a range of data can be selected by time through the index.</p>
<div class="cnblogs_code">
<pre>In [200]: sr = pd.Series(np.arange(100),index=pd.date_range(<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">2010-01-01</span><span style="color: rgba(128, 0, 0, 1)">'</span>,periods=100<span style="color: rgba(0, 0, 0, 1)">))

In [</span>201]: sr[<span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(128, 0, 0, 1)">2010-03</span><span style="color: rgba(128, 0, 0, 1)">'</span><span style="color: rgba(0, 0, 0, 1)">]
Out[</span>201<span style="color: rgba(0, 0, 0, 1)">]:
</span>2010-03-01    59
2010-03-02    60
2010-03-03    61
2010-03-04    62
2010-03-05    63
2010-03-06    64
2010-03-07    65
2010-03-08    66
2010-03-09    67
2010-03-10    68
2010-03-11    69
2010-03-12    70
2010-03-13    71
2010-03-14    72
2010-03-15    73
2010-03-16    74
2010-03-17    75
2010-03-18    76
2010-03-19    77
2010-03-20    78
2010-03-21    79
2010-03-22    80
2010-03-23    81
2010-03-24    82
2010-03-25    83
2010-03-26    84
2010-03-27    85
2010-03-28    86
2010-03-29    87
2010-03-30    88
2010-03-31    89<span style="color: rgba(0, 0, 0, 1)">
Freq: D, dtype: int32</span></pre>
</div>
<p><strong>Another example: sr['2017':'2018']</strong></p>
<p>This is very convenient.</p>
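<p>As a sketch of this kind of selection on the 100-day series above: a partial date string selects a whole month, and a slice between two date strings includes both endpoints. The names march and window are illustrative.</p>

```python
import numpy as np
import pandas as pd

sr = pd.Series(np.arange(100), index=pd.date_range("2010-01-01", periods=100))

march = sr.loc["2010-03"]                     # every day in March 2010
window = sr.loc["2010-01-30":"2010-02-02"]    # both end dates are included

print(len(march))
print(window)
```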
<h3>The resample function: resampling</h3>
<div class="cnblogs_code">
<pre># Sum by week
In [203]: sr.resample('W').sum()
Out[203]:
2010-01-03      3
2010-01-10     42
2010-01-17     91
2010-01-24    140
2010-01-31    189
2010-02-07    238
2010-02-14    287
2010-02-21    336
2010-02-28    385
2010-03-07    434
2010-03-14    483
2010-03-21    532
2010-03-28    581
2010-04-04    630
2010-04-11    579
Freq: W-SUN, dtype: int32</pre>
</div>
<p>truncate works much like slicing and adds little, since the same values can be selected with a slice.</p>
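<p>A minimal sketch of both functions on the same 100-day series: resample groups the data into weekly bins, to which any aggregation can then be applied, and truncate returns the same values as the equivalent slice. Variable names are illustrative.</p>

```python
import numpy as np
import pandas as pd

sr = pd.Series(np.arange(100), index=pd.date_range("2010-01-01", periods=100))

weekly_sum = sr.resample("W").sum()     # weeks end on Sunday by default
weekly_mean = sr.resample("W").mean()   # same bins, different aggregation

# truncate keeps everything from 'before' through 'after', like a label slice.
cut = sr.truncate(before="2010-01-03", after="2010-01-05")
same = sr.loc["2010-01-03":"2010-01-05"]

print(weekly_sum.head())
print(cut)
```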
<h2>pandas: file handling</h2>
<h3>Reading files</h3>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190421200851634-151588328.png" alt=""></p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190421201028983-472523151.png" alt=""></p>
<p>&nbsp;</p>
<p>Example:</p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190421201744267-739045217.png" alt=""></p>
<p>header=None</p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190421202110625-890641497.png" alt=""></p>
<p>Using names</p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190421202222610-1409065642.png" alt=""></p>
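<p>If a column of a data table contains None, pandas cannot parse that column as numbers, so the whole column gets dtype object and its values are read as strings.</p>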
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190421202637537-1055874272.png" alt=""></p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190421202805213-1098033717.png" alt=""></p>
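<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190421202739366-706216884.png" alt="">This column should have been float,</p>
<p>&nbsp;but because it contains None, it became strings.</p>
<p>NaN can be interpreted as a floating-point number, but None cannot.</p>
<p>Fix: use na_values</p>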
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190421203002872-1542720633.png" alt=""></p>
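<p>A minimal sketch of header=None, names and na_values working together. The CSV text is made up, and because the literal text None may already be treated as missing by default in some pandas versions, the sketch uses a hypothetical marker, <code>missing</code>, to show the same effect.</p>

```python
import io
import pandas as pd

# Made-up CSV text without a header row; "missing" is a hypothetical missing-value marker
raw = "2019-04-01,10.5\n2019-04-02,missing\n2019-04-03,11.2\n"

# header=None: the first line is data, not column names; names supplies the column names
df_default = pd.read_csv(io.StringIO(raw), header=None, names=["date", "price"])

# Same call, but "missing" is declared a missing value, so it is read as NaN
df_fixed = pd.read_csv(io.StringIO(raw), header=None, names=["date", "price"],
                       na_values=["missing"])

print(df_default["price"].dtype)  # not a float column: the marker forces strings
print(df_fixed["price"].dtype)    # float column: NaN is a valid float
```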
<h3>Writing files</h3>
<p>&nbsp;<img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190421201200124-693091061.png" alt=""></p>
<p>File-writing examples:</p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190421222855022-1459267590.png" alt=""></p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190421222601034-1644014706.png" alt=""></p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190421222529044-491810987.png" alt=""></p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190421222657871-1985694468.png" alt=""></p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190421222721611-1703751199.png" alt=""></p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190421201217029-1046806066.png" alt=""></p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190421223230289-906513937.png" alt=""></p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190421223514605-1784029215.png" alt=""></p>
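<p>The screenshots above are not reproduced as text, so here is a small sketch of writing with <code>to_csv</code>. The data is made up, and the parameters shown (<code>columns</code>, <code>na_rep</code>, <code>float_format</code>) are an assumed selection of the commonly used ones; the sketch writes to an in-memory buffer rather than a real file.</p>

```python
import io
import pandas as pd

# Made-up data with one missing close value
df = pd.DataFrame({"open": [10.0, 10.4], "close": [10.3, None]},
                  index=pd.to_datetime(["2019-04-01", "2019-04-02"]))
df.index.name = "date"

# Write only the close column, spell missing values as "NaN", keep two decimals
buffer = io.StringIO()
df.to_csv(buffer, columns=["close"], na_rep="NaN", float_format="%.2f")
text = buffer.getvalue()
print(text)

# Reading the text back with the date column as the index recovers the values
restored = pd.read_csv(io.StringIO(text), index_col="date", parse_dates=True)
```

<p>Passing a file path instead of the buffer writes the same text to disk.</p>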
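<p><strong>Reading Excel files in Python requires an extra module: xlrd for .xls files (newer pandas versions use openpyxl for .xlsx)</strong></p>
<p>...and there is much more to cover.</p>
<p>It takes plenty of practice to master this and make it your own.</p>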
<h1>Chapter 4 - Data visualization package --- matplotlib</h1>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190421224129609-1510183544.png" alt=""></p>
<p>&nbsp;<img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190422063559810-1849959288.png" alt=""></p>
<p>When run from the command line or PyCharm, a window pops up in which you can pan, zoom, and so on.</p>
<h2>The plot function</h2>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190421225207449-1771577956.png" alt=""></p>
<p>&nbsp;plot draws scatter or line charts and takes two arguments (x and y).</p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190422063944174-370680325.png" alt=""></p>
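<p>There is also a third argument, a string that sets the line style (in the example: v is a small triangle marker, the points are joined by a dash-dot line, and the line is red).</p>
<p>The same can be passed as keyword arguments (color='red', marker='^', linestyle='-.').</p>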
<p>&nbsp;<img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190422064320338-105025403.png" alt=""></p>
<p>&nbsp;What if I want to draw several lines?</p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190422070130878-464740061.png" alt=""></p>
<p>Once show is called, all the preceding plot calls appear on a single figure.</p>
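<p>A minimal sketch of the points above: a format string, the keyword-argument form, and two lines on one figure. The data is made up, and the off-screen Agg backend is an assumption so that the code also runs without a display.</p>

```python
import io
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt

x = [1, 2, 3, 4]
plt.figure()

# Format string: r = red, v = triangle-down marker, -. = dash-dot line
line_a, = plt.plot(x, [1, 4, 9, 16], "rv-.")
# The same kind of styling spelled out as keyword arguments
line_b, = plt.plot(x, [2, 3, 5, 8], color="blue", marker="^", linestyle="--")

# Both calls drew on the same axes; render once
ax = plt.gca()
plt.savefig(io.BytesIO(), format="png")
```

<p>In an interactive session, <code>plt.show()</code> would take the place of the final <code>savefig</code> call.</p>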
<h2>Matplotlib - annotating figures</h2>
<p>&nbsp;<img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190422070342515-820956011.png" alt=""></p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190422071007837-222256149.png" alt=""></p>
<p>One way to use plt.legend</p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190422071328437-884095875.png" alt=""></p>
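<p>As a rough illustration of the annotation functions, the sketch below sets a title, axis labels and ticks, and builds a legend from per-line labels. The exact functions in the screenshots are not visible as text, so this selection is an assumption.</p>

```python
import io
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt

plt.figure()
plt.plot([1, 2, 3], [2, 4, 8], label="line A")
plt.plot([1, 2, 3], [3, 3, 3], label="line B")

plt.title("Example title")
plt.xlabel("x axis")
plt.ylabel("y axis")
plt.xticks([1, 2, 3])
# legend() builds its entries from the label= passed to each plot call
legend = plt.legend(loc="upper left")

labels = [text.get_text() for text in legend.get_texts()]
plt.savefig(io.BytesIO(), format="png")
```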
<h2>pandas and Matplotlib</h2>
<p>&nbsp;<img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190422210849899-1427575984.png" alt=""></p>
<p>pandas objects can be plotted directly</p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190422210919037-1052570688.png" alt=""></p>
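<p>A small sketch of plotting straight from a DataFrame, with made-up open and close columns. Each column becomes one line and the index is used for the x axis.</p>

```python
import io
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import numpy as np
import pandas as pd

# Made-up prices indexed by date
df = pd.DataFrame(
    {"open": np.linspace(10, 12, 5), "close": np.linspace(10.5, 11.5, 5)},
    index=pd.date_range("2019-04-01", periods=5),
)

# DataFrame.plot returns the matplotlib Axes it drew on
ax = df.plot(title="open vs close")
ax.figure.savefig(io.BytesIO(), format="png")
```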
<h3>Exercise: plotting mathematical functions</h3>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190422211300668-303249649.png" alt=""></p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190422212843238-1767900692.png" alt=""></p>
<h2>Figures and subplots</h2>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190422213056878-1281202750.png" alt=""></p>
<p>&nbsp;fig.add_subplot(2,2,1): the 2,2 splits the canvas into a 2x2 grid of 4 cells, and the final 1 selects the first position.</p>
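<p>The grid numbering can be sketched as follows: positions are counted row by row from the top-left, so position 1 is the top-left cell and position 4 the bottom-right one. The plotted functions are arbitrary examples.</p>

```python
import io
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(-np.pi, np.pi, 100)

fig = plt.figure()
ax1 = fig.add_subplot(2, 2, 1)   # 2x2 grid, position 1: top-left
ax1.plot(x, np.sin(x))
ax4 = fig.add_subplot(2, 2, 4)   # 2x2 grid, position 4: bottom-right
ax4.plot(x, np.cos(x))

fig.savefig(io.BytesIO(), format="png")
```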
<h2>Plot types supported by Matplotlib</h2>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190422214214044-1590349127.png" alt=""></p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190422214310855-349536172.png" alt=""></p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190422214405793-880401363.png" alt=""></p>
<h3>Bar charts</h3>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190422214817852-1713120129.png" alt=""><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190422215822258-106828358.png" alt=""></p>
<h3>Pie charts</h3>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190422220815765-463226650.png" alt=""></p>
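<h2>Candlestick charts - matplotlib.finance</h2>
<p>The matplotlib.finance subpackage provides many functions for drawing finance-related charts. From matplotlib 2.2 on it is no longer bundled and is available as the separate mpl_finance package.</p>
<p>Drawing a candlestick (K-line) chart: the matplotlib.finance.candlestick_ochl function</p>
<p>Help text for its parameters</p>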
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190422222816715-229643547.png" alt=""></p>
<p>Import the module and add a time field to the data</p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190422223300427-1265708043.png" alt=""></p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190422224403182-137753089.png" alt=""></p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190422224524621-998759668.png" alt=""></p>
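<p>Because matplotlib.finance may not be installed, the sketch below does not call candlestick_ochl. It swaps in plain matplotlib primitives to draw the same kind of chart from made-up data, using the (time, open, close, high, low) column order that candlestick_ochl expects. It is an illustration of the idea, not the code from the screenshots.</p>

```python
import io
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.dates import date2num

# Made-up daily quotes
df = pd.DataFrame(
    {"open": [10.0, 10.4, 10.2, 10.8],
     "close": [10.4, 10.2, 10.8, 10.6],
     "high": [10.5, 10.6, 10.9, 11.0],
     "low": [9.9, 10.1, 10.1, 10.5]},
    index=pd.date_range("2019-04-01", periods=4),
)
# Matplotlib plots dates as floats, so add a numeric time column
df["time"] = date2num(df.index.to_pydatetime())

fig, ax = plt.subplots()
for t, o, c, h, l in df[["time", "open", "close", "high", "low"]].to_numpy():
    color = "red" if c >= o else "green"   # A-share convention: red means up
    ax.vlines(t, l, h, color=color)        # wick: from low to high
    ax.bar(t, abs(c - o), bottom=min(o, c), width=0.6, color=color)  # body
ax.xaxis_date()  # show the float x values as dates

fig.savefig(io.BytesIO(), format="png")
```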
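<h1>Chapter 5 - Financial data analysis in practice</h1>
<h2>Introducing the tushare package</h2>
<p>Tushare is a free, open-source financial data interface package.</p>
<h2>Exercise 1 - Stock data analysis</h2>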
<div class="cnblogs_code">
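<pre><span style="color: rgba(128, 0, 128, 1)">1</span><span style="color: rgba(0, 0, 0, 1)">. Use the tushare package to fetch the historical quotes of a stock
</span><span style="color: rgba(128, 0, 128, 1)">2</span>. Print every date on which the close was more than 3%<span style="color: rgba(0, 0, 0, 1)"> above the open
</span><span style="color: rgba(128, 0, 128, 1)">3</span>. Print every date on which the open was more than 2%<span style="color: rgba(0, 0, 0, 1)"> below the previous day's close (use shift to offset the rows)
</span><span style="color: rgba(128, 0, 128, 1)">4</span>. Suppose that, starting on January 1, 2010, I buy one lot on the first trading day of every month and sell all shares on the last trading day of every year.<br>What is my return as of today?</pre>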
</div>
<p>This uses the tushare interface together with the shift and resample functions.</p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190424230310635-1665647588.png" alt=""></p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190424230329237-1018335777.png" alt=""></p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190424230349218-939850052.png" alt=""></p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190424230434058-522801295.png" alt=""></p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190424230449720-1479260033.png" alt=""></p>
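<p>A compact sketch of tasks 2 to 4 on made-up quotes. It assumes the data has <code>open</code> and <code>close</code> columns and a date index, as tushare's historical quotes would after loading; the tushare call itself is left out because it needs network access. One lot is taken as 100 shares, and shares still held in the final year are valued at the last available close.</p>

```python
import pandas as pd

# Made-up quotes standing in for the DataFrame fetched with tushare
df = pd.DataFrame(
    {"open":  [10.0, 10.0, 10.1, 11.0, 12.0, 12.0],
     "close": [10.4, 10.1, 11.0, 11.1, 12.0, 12.5]},
    index=pd.to_datetime(["2010-11-01", "2010-12-01", "2010-12-31",
                          "2011-01-04", "2011-02-01", "2011-02-02"]),
)

# Task 2: close more than 3% above the open
up_days = df[(df["close"] - df["open"]) / df["open"] > 0.03].index

# Task 3: open more than 2% below the previous close (shift moves closes down one row)
prev_close = df["close"].shift(1)
gap_down_days = df[(df["open"] - prev_close) / prev_close < -0.02].index

# Task 4: buy 100 shares at the open of each month's first trading day,
# sell everything at the close of each year's last trading day
profit = 0.0
for year, year_df in df.groupby(df.index.year):
    first_opens = year_df["open"].resample("MS").first().dropna()
    shares = 100 * len(first_opens)
    cost = (first_opens * 100).sum()
    profit += year_df["close"].iloc[-1] * shares - cost

print(list(up_days.date), list(gap_down_days.date), profit)
```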
<h2>&nbsp;Exercise 2 - Finding historical golden cross and death cross dates</h2>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190425212637115-201534254.png" alt=""></p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190425213123665-676442956.png" alt=""></p>
<p>Writing the code</p>
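<p>One possible way to find the crossing dates, sketched on made-up closing prices. A golden cross is taken as the day the short moving average moves from not above to above the long one, and a death cross as the reverse. The windows of 2 and 4 days are an assumption chosen to suit the tiny sample; real data would use longer windows such as 5 and 10.</p>

```python
import pandas as pd

# Made-up closing prices
close = pd.Series([10, 9, 8, 7, 8, 10, 12, 13, 11, 8, 6, 5],
                  index=pd.date_range("2019-04-01", periods=12), dtype=float)

ma_short = close.rolling(2).mean()
ma_long = close.rolling(4).mean()

above = ma_short > ma_long                     # short MA above long MA today
was_above = above.shift(1, fill_value=False)   # the same comparison one day earlier

# Only compare days where both today's and yesterday's long MA exist
valid = ma_long.notna() & ma_long.shift(1).notna()

golden = close[valid & above & ~was_above].index   # crossed upwards
death = close[valid & ~above & was_above].index    # crossed downwards

print(list(golden.date), list(death.date))
```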
<p>&nbsp;</p>
<h2>A first quantitative strategy</h2>
<p>Coded and backtested on JoinQuant (聚宽)</p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190426220950187-900648160.png" alt=""></p>
<p>&nbsp;The initialize function</p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190427092708394-1144224374.png" alt=""></p>
<p>The handle_data function is executed once per unit of time during the backtest.</p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190427101803744-500269552.png" alt=""></p>
<p>Strategy implementation</p>
<div class="cnblogs_code">
<pre># Import the JoinQuant library
from jqdata import *

# Initialization: set the benchmark and other options
def initialize(context):
    # 1. Use all CSI 300 constituents as the stock pool
    g.security = get_index_stocks('000300.XSHG')
    # Benchmark return: buy and hold the index
    set_benchmark('000300.XSHG')
    set_option('use_real_price', True)
    set_order_cost(OrderCost(open_tax=0, close_tax=0.001, open_commission=0.0003, close_commission=0.0003, close_today_commission=0, min_commission=5), type='stock')

# Backtest function
def handle_data(context, data):
    # How much of each stock to buy: available cash / number of stocks to buy
    # In general, sell first and buy afterwards
    tobuy = []
    for stock in g.security:
        # Today's opening price of the stock
        p = get_current_data()[stock].day_open
        # Number of shares held (0 means the stock is not held)
        amount = context.portfolio.positions[stock].total_amount
        # Average cost of the position
        cost = context.portfolio.positions[stock].avg_cost
        # 3. Take profit: the price is up 25% from cost, so close the position
        if amount &gt; 0 and p &gt;= cost * 1.25:
            order_target(stock, 0)
        # 4. Stop loss: the price is down 10% from cost, so sell
        if amount &gt; 0 and p &lt;= cost * 0.9:
            order_target(stock, 0)
        # 2. The price is no more than 10 yuan and nothing is held: mark for buying
        if p &lt;= 10.0 and amount == 0:
            tobuy.append(stock)
    # Split the available cash evenly across the stocks to buy
    if tobuy:
        cost_per_stock = context.portfolio.available_cash / len(tobuy)
        for per in tobuy:
            order_value(per, cost_per_stock)</pre>
</div>
<h2>Dual moving average strategy - the simplest case, a single stock</h2>
<div class="cnblogs_code">
<pre># Import the JoinQuant library
from jqdata import *

# Initialization: set the benchmark and other options
def initialize(context):
    set_benchmark('000300.XSHG')  # benchmark: buy and hold the index
    set_option('use_real_price', True)
    set_order_cost(OrderCost(open_tax=0, close_tax=0.001, open_commission=0.0003, close_commission=0.0003, close_today_commission=0, min_commission=5), type='stock')

    # Stock selection
    g.security = ['601318.XSHG']
    g.p1 = 5    # short moving-average window
    g.p2 = 10   # long moving-average window


def handle_data(context, data):
    for stock in g.security:
        # Golden cross: the 5-day MA is above the 10-day MA and nothing is held
        # Death cross: the 5-day MA is below the 10-day MA and the stock is held
        # Fetch the history
        df = attribute_history(stock, g.p2)
        m10 = df['close'].mean()
        m5 = df['close'][-g.p1:].mean()

        if m10 &gt; m5 and stock in context.portfolio.positions:
            # Death cross: sell
            order_target(stock, 0)
        if m10 &lt; m5 and stock not in context.portfolio.positions:
            # Golden cross: buy with 80% of the available cash
            order_value(stock, context.portfolio.available_cash * 0.8)
</pre>
</div>
<p>&nbsp;Adding other curves to the backtest chart</p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190427115619035-511446360.png" alt=""></p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190427115708209-1398406559.png" alt=""></p>
<h2>Factor-based stock selection strategy</h2>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190427115926196-974577243.png" alt=""></p>
<p>Querying financial data</p>
<p>get_fundamentals</p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190427152559226-1841408059.png" alt=""></p>
<p>Writing the strategy</p>
<div class="cnblogs_code">
<pre># Import the JoinQuant library
from jqdata import *

# Initialization: set the benchmark and other options
def initialize(context):
    set_benchmark('000300.XSHG')  # benchmark: buy and hold the index
    set_option('use_real_price', True)
    set_order_cost(OrderCost(open_tax=0, close_tax=0.001, open_commission=0.0003,
                             close_commission=0.0003, close_today_commission=0, min_commission=5), type='stock')

    # Stock universe
    g.security = get_index_stocks('000300.XSHG')

    # Data source: the valuation table, listed under the Data tab on the official site
    g.q = query(valuation).filter(valuation.code.in_(g.security))

    # The portfolio has to be rebalanced periodically. Two options:
    #   1. Keep a day counter, add 1 per day in handle_data, and rebalance when days % 30 == 0
    #      (once every 30 trading days)
    #   2. Call run_monthly(handle, 1), where handle is the rebalancing function and
    #      1 means the first trading day (once a month)
    run_monthly(handle, 1)

    # Hold at most 20 stocks
    g.N = 20

def handle(context):
    # Note: some methods may raise errors, depending on the third-party library versions the platform supports
    df = get_fundamentals(g.q)[['code', 'market_cap']]
    df = df.sort_values('market_cap').iloc[:g.N, :]
    # The newly selected stock pool
    to_hold = df['code'].values

    # Of the stocks already held, keep those still selected and sell the rest, then add the new ones
    for stock in context.portfolio.positions:
        # A held stock that is not in to_hold: sell it
        if stock not in to_hold:
            order_target(stock, 0)

    # Selected stocks that are not held yet
    to_buy = [stock for stock in to_hold if stock not in context.portfolio.positions]

    if to_buy:
        cash_per_stock = context.portfolio.available_cash / len(to_buy)
        for per in to_buy:
            order_value(per, cash_per_stock)</pre>
</div>
<p>Remember to filter out suspended stocks.</p>
<p>Take the top 30, filter out the suspended (paused) ones, then take the top 20.</p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190427162539536-733168994.png" alt=""></p>
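<p>The over-select, filter, then trim idea can be sketched in plain pandas. The data is made up, and the <code>paused</code> column is a hypothetical stand-in for however the platform reports suspension; the sizes are scaled down from 30 and 20 to 4 and 2.</p>

```python
import pandas as pd

# Made-up fundamentals; `paused` is a hypothetical flag for suspended stocks
df = pd.DataFrame({
    "code": ["A", "B", "C", "D", "E", "F"],
    "market_cap": [5.0, 1.0, 3.0, 2.0, 6.0, 4.0],
    "paused": [False, True, False, False, False, True],
})

N = 2           # target number of holdings (20 in the text)
CANDIDATES = 4  # wider candidate list (30 in the text)

# Take more small-cap candidates than needed
candidates = df.sort_values("market_cap").head(CANDIDATES)
# Drop the suspended ones, then keep the N smallest that remain
to_hold = candidates[~candidates["paused"]].head(N)["code"].tolist()

print(to_hold)
```

<p>If more than CANDIDATES - N candidates are suspended, fewer than N stocks remain, so the margin should be chosen with that in mind.</p>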
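<h2>Multi-factor stock selection strategy</h2>
<p>Small market capitalization</p>
<p>High return on equity (ROE)</p>
<h3>How to combine several factors at once</h3>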
<p>...</p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190427164433155-956339690.png" alt=""></p>
<h3>Background - standardization</h3>
<p>&nbsp;Standardization and normalization are data preprocessing methods.<img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190427170624754-1113946404.png" alt=""></p>
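<p>A short sketch of the two preprocessing methods, assuming they refer to min-max normalization and z-score standardization (the formulas in the image are not available as text). The ROE values are made up.</p>

```python
import pandas as pd

# Made-up ROE values for four hypothetical stocks
roe = pd.Series([5.0, 10.0, 20.0, 25.0], index=["A", "B", "C", "D"])

# Min-max normalization: rescale the values to the range [0, 1]
min_max = (roe - roe.min()) / (roe.max() - roe.min())

# Z-score standardization: shift to zero mean and scale to unit standard deviation
z_score = (roe - roe.mean()) / roe.std()

print(min_max)
print(z_score)
```

<p>Either method puts factors with different units on a comparable scale before they are combined into one score.</p>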
<p>Implementation</p>
<div class="cnblogs_code">
<pre># Import the JoinQuant library
from jqdata import *

# Initialization: set the benchmark and other options
def initialize(context):
    set_benchmark('000300.XSHG')  # benchmark: buy and hold the index
    set_option('use_real_price', True)
    set_order_cost(OrderCost(open_tax=0, close_tax=0.001, open_commission=0.0003,
                             close_commission=0.0003, close_today_commission=0, min_commission=5), type='stock')

    # Stock universe
    g.security = get_index_stocks('000002.XSHG')

    # Market cap comes from the valuation table (see the Data tab on the official site);
    # roe is in the indicator table
    g.q = query(valuation, indicator).filter(valuation.code.in_(g.security))

    # The portfolio has to be rebalanced periodically. Two options:
    #   1. Keep a day counter, add 1 per day in handle_data, and rebalance when days % 30 == 0
    #      (once every 30 trading days)
    #   2. Call run_monthly(handle, 1), where handle is the rebalancing function and
    #      1 means the first trading day (once a month)
    # Hold at most 20 stocks
    g.N = 20

    run_monthly(handle, 1)


def handle(context):
    # Note: some methods may raise errors, depending on the third-party library versions the platform supports
    df = get_fundamentals(g.q)[['code', 'market_cap', 'roe']]
    # Min-max normalization
    df['market_cap'] = (df['market_cap'] - df['market_cap'].min()) / (df['market_cap'].max() - df['market_cap'].min())
    df['roe'] = (df['roe'] - df['roe'].min()) / (df['roe'].max() - df['roe'].min())
    # Add a score column: higher ROE is better and smaller market cap is better, so a higher score is better
    df['score'] = df['roe'] - df['market_cap']
    # Take the last 20 rows, which have the highest scores
    df = df.sort_values('score').iloc[-g.N:, :]

    # The newly selected stock pool
    to_hold = df['code'].values

    # Of the stocks already held, keep those still selected and sell the rest, then add the new ones
    for stock in context.portfolio.positions:
        # A held stock that is not in to_hold: sell it
        if stock not in to_hold:
            order_target(stock, 0)

    # Selected stocks that are not held yet
    to_buy = [stock for stock in to_hold if stock not in context.portfolio.positions]

    if to_buy:
        cash_per_stock = context.portfolio.available_cash / len(to_buy)
        for per in to_buy:
            order_value(per, cash_per_stock)
</pre>
</div>
<p>&nbsp;You can also add weights and more factors.</p>
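<p>One way the weighting could look, sketched in plain pandas on made-up data. The weights are hypothetical, and a negative weight marks a factor where smaller is better.</p>

```python
import pandas as pd

# Made-up factor values for four hypothetical stocks
df = pd.DataFrame({
    "code": ["A", "B", "C", "D"],
    "market_cap": [100.0, 200.0, 300.0, 400.0],
    "roe": [10.0, 30.0, 20.0, 5.0],
})

def min_max(s):
    """Rescale a Series to the range [0, 1]."""
    return (s - s.min()) / (s.max() - s.min())

# Hypothetical weights: ROE counts for more, and market cap counts against the score
weights = {"roe": 0.7, "market_cap": -0.3}

score = sum(w * min_max(df[col]) for col, w in weights.items())
top = df.assign(score=score).nlargest(2, "score")["code"].tolist()

print(top)
```

<p>Adding another factor then only needs one more column and one more entry in <code>weights</code>.</p>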
<h2>&nbsp;Mean reversion theory</h2>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190427173525623-371103927.png" alt=""></p>
<p>The mean reversion strategy is a stock selection strategy.</p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190427174114219-813567171.png" alt=""></p>
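<p>The selection rule can be sketched outside the platform with made-up prices: compute how far each stock's current price sits below its moving average, as a fraction of that average, and pick the stocks with the largest deviation. The tickers, prices and window length here are hypothetical.</p>

```python
import pandas as pd

# Made-up closing prices over a 5-day window for three hypothetical stocks
closes = pd.DataFrame({
    "A": [10.0, 10.0, 10.0, 10.0, 10.0],
    "B": [20.0, 20.0, 20.0, 20.0, 20.0],
    "C": [8.0, 8.0, 8.0, 8.0, 8.0],
})
# Made-up opening prices for today
today_open = pd.Series({"A": 9.0, "B": 21.0, "C": 6.0})

ma = closes.mean()                  # moving average per stock
ratio = (ma - today_open) / ma      # positive: price is below its average

# nlargest picks the most undervalued stocks without a full sort
to_hold = ratio.nlargest(2).index.tolist()

print(to_hold)
```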
<p>Implementation</p>
<div class="cnblogs_code">
<pre># Import the platform library
from jqdata import *
import pandas as pd

# Initialization: benchmark, pricing mode, trading costs
def initialize(context):
    set_benchmark('000300.XSHG')  # benchmark: CSI 300
    set_option('use_real_price', True)
    set_order_cost(OrderCost(open_tax=0, close_tax=0.001, open_commission=0.0003,
                             close_commission=0.0003, close_today_commission=0,
                             min_commission=5), type='stock')

    # Stock universe
    g.security = get_index_stocks('000002.XSHG')

    # Moving-average window, in days
    g.ma_days = 30

    # Number of stocks to hold
    g.stock_num = 10

    # Rebalance on the first trading day of each month
    run_monthly(handle, 1)

def handle(context):
    sr = pd.Series(index=g.security, dtype=float)
    current_data = get_current_data()
    for stock in sr.index:
        ma = attribute_history(stock, g.ma_days)['close'].mean()
        p = current_data[stock].day_open
        # Deviation: how far today's open sits below the moving average
        sr[stock] = (ma - p) / ma

    # nlargest is faster than a full sort; this is the new target pool
    to_hold = sr.nlargest(g.stock_num).index.values

    # Sell holdings that are no longer in the target pool
    for stock in context.portfolio.positions:
        if stock not in to_hold:
            order_target(stock, 0)

    # Buy the targets we do not hold yet, splitting available cash equally
    to_buy = [stock for stock in to_hold if stock not in context.portfolio.positions]

    if to_buy:
        cash_per_stock = context.portfolio.available_cash / len(to_buy)
        for per in to_buy:
            order_value(per, cash_per_stock)</pre>
</div>
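<p>The ranking step can be checked offline, without the JoinQuant platform. The following is a minimal sketch in plain Python; the stock codes and prices are made up for illustration, and the function names are not part of any library.</p>
<div class="cnblogs_code">
<pre>from statistics import mean


def deviation_ratio(closes, price):
    """Return (MA - P) / MA for one stock."""
    ma = mean(closes)
    return (ma - price) / ma


def pick_most_deviated(history, prices, n):
    """Return the n codes whose price is furthest below the moving average."""
    ratios = {code: deviation_ratio(history[code], prices[code]) for code in history}
    return sorted(ratios, key=ratios.get, reverse=True)[:n]


# Hypothetical closing-price windows and current prices
history = {"A": [10.0, 10.0, 10.0], "B": [20.0, 20.0, 20.0], "C": [5.0, 5.0, 5.0]}
prices = {"A": 9.0, "B": 21.0, "C": 4.0}

top = pick_most_deviated(history, prices, 2)
print(top)</pre>
</div>
<p>Stock C trades 20% below its average and A 10% below, so they are selected; B trades above its average and is left out.</p>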
<h2>Bollinger Bands strategy</h2>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190427181600180-622272660.png" alt=""></p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190427182306739-761275512.png" alt=""></p>
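<p>The band multiplier N (g.k in the code) should be neither too small nor too large. If it is too small, the price crosses the bands constantly; if it is too large, the bands might as well not exist, because the price rarely touches them. The upper and lower bands can use different values of N.</p>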
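<p>To get a feel for how the multiplier changes the bands, here is a small offline sketch in plain Python. The prices are hypothetical, and the parameter names k_up and k_down are chosen for this example only; the sample standard deviation is used because that is what pandas Series.std() computes by default.</p>
<div class="cnblogs_code">
<pre>from statistics import mean, stdev


def bollinger_bands(closes, k_up=2.0, k_down=2.0):
    """Return (lower, middle, upper) for a window of closing prices."""
    ma = mean(closes)
    sd = stdev(closes)  # sample standard deviation, like pandas Series.std()
    return ma - k_down * sd, ma, ma + k_up * sd


# Hypothetical closing prices, for illustration only
closes = [8.0, 10.0, 12.0]

# The same multiplier on both sides: a larger k gives wider bands
for k in (0.5, 2.0, 4.0):
    lower, middle, upper = bollinger_bands(closes, k, k)
    print(k, lower, upper)

# Different multipliers for the upper and lower band
lower, middle, upper = bollinger_bands(closes, k_up=2.0, k_down=1.0)
print(lower, middle, upper)</pre>
</div>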
<p>Implementation</p>
<div class="cnblogs_code">
<pre># Import the platform library
from jqdata import *

# Initialization: benchmark, pricing mode, trading costs
def initialize(context):
    set_benchmark('000300.XSHG')  # benchmark: CSI 300
    set_option('use_real_price', True)
    set_order_cost(OrderCost(open_tax=0, close_tax=0.001, open_commission=0.0003,
                             close_commission=0.0003, close_today_commission=0,
                             min_commission=5), type='stock')

    # The single stock to trade
    g.security = '600036.XSHG'

    g.M = 20  # moving-average window; 20 worked well in testing
    g.k = 2   # band width in standard deviations; 1.7 reportedly works well too

# Strategy logic
def handle_data(context, data):
    sr = attribute_history(g.security, g.M)['close']
    ma = sr.mean()
    std = sr.std()
    up = ma + g.k * std
    down = ma - g.k * std
    p = get_current_data()[g.security].day_open
    cash = context.portfolio.available_cash
    # Price below the lower band and no position: buy with all available cash
    if p &lt; down and g.security not in context.portfolio.positions:
        order_value(g.security, cash)
    # Price above the upper band while holding: sell everything
    elif p &gt; up and g.security in context.portfolio.positions:
        order_target(g.security, 0)</pre>
</div>
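<p>You can go on to try other stocks, or several stocks at once.</p>
<p>With several stocks, you have to decide how to allocate capital among them.</p>
<p>Try a range of parameter values and compare the results.</p>
<p>When the bands are narrow, volatility is low and the stock is less suited to short-term trading; band width can also be used as a factor.</p>
<p>Add a stop-loss rule.</p>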
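<p>One possible way to turn band width into a factor is to measure it relative to the middle band, so that stocks at different price levels are comparable. The sketch below is plain Python and runs offline; the definition (upper - lower) / middle, the stock labels and the prices are assumptions made for this example.</p>
<div class="cnblogs_code">
<pre>from statistics import mean, stdev


def band_width(closes, k=2.0):
    """Relative width of the bands: (upper - lower) / middle."""
    ma = mean(closes)
    return 2 * k * stdev(closes) / ma


# Hypothetical price windows for two stocks
windows = {
    "quiet": [10.0, 10.1, 9.9, 10.0, 10.0],
    "volatile": [10.0, 12.0, 8.0, 11.0, 9.0],
}
widths = {code: band_width(closes) for code, closes in windows.items()}

# Narrow bands first: these are the low-volatility candidates
narrowest_first = sorted(widths, key=widths.get)
print(narrowest_first)</pre>
</div>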
<h2>PEG strategy</h2>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190427184332476-1458241501.png" alt=""></p>
<h3>What is the price-to-earnings ratio (P/E)?</h3>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190427184511079-1302027280.png" alt=""></p>
<h3>How the PEG strategy works</h3>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190427185248325-1738574197.png" alt=""></p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190427185550916-920790160.png" alt=""></p>
<h3>Stock selection with PEG</h3>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190427185653831-598483507.png" alt=""></p>
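<p>Implementation</p>
<p>P/E comes in a static and a dynamic form. We use pe_ratio from the valuation table. (JoinQuant's data dictionary describes pe_ratio as the TTM P/E; the static, last-fiscal-year figure is pe_ratio_lyr.)</p>
<p>The earnings growth rate is inc_net_profit_year_on_year (year-on-year net profit growth, in percent), found in the indicator table.</p>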
<div class="cnblogs_code">
<pre># Import the platform library
from jqdata import *

# Initialization: benchmark, pricing mode, trading costs
def initialize(context):
    set_benchmark('000300.XSHG')  # benchmark: CSI 300
    set_option('use_real_price', True)
    set_order_cost(OrderCost(open_tax=0, close_tax=0.001, open_commission=0.0003,
                             close_commission=0.0003, close_today_commission=0,
                             min_commission=5), type='stock')

    # Stock universe
    g.security = get_index_stocks('000300.XSHG')
    g.q = query(valuation.code, valuation.pe_ratio,
                indicator.inc_net_profit_year_on_year
                ).filter(valuation.code.in_(g.security))
    # Number of stocks to hold
    g.N = 20
    # Rebalance on the first trading day of each month
    run_monthly(handle, 1)

def handle(context):
    df = get_fundamentals(g.q)
    # Drop stocks whose PEG would be negative
    df = df[(df['pe_ratio'] &gt; 0) &amp; (df['inc_net_profit_year_on_year'] &gt; 0)].copy()
    # PEG: the growth field is already in percent, so divide P/E by it directly
    # (a constant scale factor would not change the ranking anyway)
    df['peg'] = df['pe_ratio'] / df['inc_net_profit_year_on_year']
    df = df.sort_values('peg')
    # The g.N stocks with the lowest PEG
    to_hold = df['code'][:g.N].values
    print(to_hold)

    # Sell holdings that are no longer in the target pool
    for stock in context.portfolio.positions:
        if stock not in to_hold:
            order_target(stock, 0)

    # Buy the targets we do not hold yet, splitting available cash equally
    to_buy = [stock for stock in to_hold if stock not in context.portfolio.positions]

    if to_buy:
        cash_per_stock = context.portfolio.available_cash / len(to_buy)
        for per in to_buy:
            order_value(per, cash_per_stock)</pre>
</div>
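<p>The PEG arithmetic itself is simple enough to verify by hand. Here is a minimal offline sketch in plain Python, assuming the growth rate is given in percent (as with inc_net_profit_year_on_year) and that negative values have already been filtered out. The stock labels and numbers are hypothetical.</p>
<div class="cnblogs_code">
<pre>def peg(pe_ratio, growth_pct):
    """PEG = P/E divided by the earnings growth rate expressed in percent."""
    return pe_ratio / growth_pct


# Hypothetical (P/E, growth in percent) pairs, all positive
stocks = {"A": (10.0, 20.0), "B": (30.0, 15.0), "C": (8.0, 40.0)}

pegs = {code: peg(pe, growth) for code, (pe, growth) in stocks.items()}

# Lower PEG means a cheaper price for each unit of growth
lowest = sorted(pegs, key=pegs.get)[:2]
print(pegs)
print(lowest)</pre>
</div>
<p>Stock B has a P/E of 30 against 15% growth, giving a PEG of 2.0, so it is ranked last even though its growth is respectable.</p>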
<h2>Momentum and reversal strategies</h2>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190427230223998-1396108737.png" alt=""></p>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190427230255359-1320214746.png" alt=""></p>
<p>Implementation</p>
<div class="cnblogs_code">
<pre>import jqdata

def initialize(context):
    set_option('use_real_price', True)
    set_order_cost(OrderCost(open_tax=0, close_tax=0.001, open_commission=0.0003,
                             close_commission=0.0003, close_today_commission=0,
                             min_commission=5), type='stock')

    g.benchmark = '000300.XSHG'
    # Number of stocks to hold
    g.N = 10
    set_benchmark(g.benchmark)
    # Rebalance on the first trading day of each month
    run_monthly(handle, 1)

def handle(context):
    stocks = get_index_stocks('000300.XSHG')
    # Closing prices over the look-back window. attribute_history covers one
    # stock over time; history covers many stocks at once.
    # Transpose so that each row is a stock and each column a date.
    df_close = history(30, field='close', security_list=list(stocks)).T
    # Return over the window: (last close - first close) / first close
    df_close['ret'] = (df_close.iloc[:, -1] - df_close.iloc[:, 0]) / df_close.iloc[:, 0]
    # ascending=False sorts best first: momentum strategy
    # ascending=True sorts worst first: reversal strategy
    sorted_stocks = df_close.sort_values('ret', ascending=False).index

    to_hold = sorted_stocks[:g.N]

    # Sell holdings that are no longer in the target pool
    for stock in context.portfolio.positions:
        if stock not in to_hold:
            order_target(stock, 0)

    # Buy the targets we do not hold yet, splitting available cash equally
    to_buy = [stock for stock in to_hold if stock not in context.portfolio.positions]

    if to_buy:
        cash_per_stock = context.portfolio.available_cash / len(to_buy)
        for per in to_buy:
            order_value(per, cash_per_stock)</pre>
</div>
<p>The conclusion from these backtests: in the A-share market, the reversal strategy outperforms the momentum strategy.</p>
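<p>The only difference between the two strategies is the sort direction on the period return. A small offline sketch in plain Python makes this concrete; the stock labels and prices are hypothetical.</p>
<div class="cnblogs_code">
<pre>def period_return(closes):
    """(last close - first close) / first close."""
    return (closes[-1] - closes[0]) / closes[0]


# Hypothetical closing prices over the look-back window
history = {
    "A": [10.0, 11.0, 12.0],   # up 20%
    "B": [10.0, 9.5, 9.0],     # down 10%
    "C": [10.0, 10.2, 10.5],   # up 5%
}
returns = {code: period_return(closes) for code, closes in history.items()}

n = 1
# Momentum: buy the recent winners (sort descending)
momentum_pick = sorted(returns, key=returns.get, reverse=True)[:n]
# Reversal: buy the recent losers (sort ascending)
reversal_pick = sorted(returns, key=returns.get)[:n]
print(momentum_pick, reversal_pick)</pre>
</div>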
<h2>Alpaca trading rule</h2>
<p><img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190428214532050-1009530002.png" alt=""> <img src="https://img2018.cnblogs.com/blog/1355675/201904/1355675-20190428214736656-1678388865.png" alt=""></p>
<p>Implementation</p>
<div class="cnblogs_code">
<pre># Import the platform library
from jqdata import *

# Initialization: benchmark, pricing mode, trading costs
def initialize(context):
    set_benchmark('000300.XSHG')  # benchmark: CSI 300
    set_option('use_real_price', True)
    set_order_cost(OrderCost(open_tax=0, close_tax=0.001, open_commission=0.0003,
                             close_commission=0.0003, close_today_commission=0,
                             min_commission=5), type='stock')

    # Stock universe
    g.security = get_index_stocks('000300.XSHG')
    # Look-back window for the return, in days
    g.period = 30
    # Number of stocks to hold
    g.N = 10
    # Number of stocks to swap at each rebalance
    g.change = 1
    # Flag: on the first run we buy all g.N stocks
    g.init = True
    run_monthly(handle, 1)

def get_sorted_stocks(context, stocks):
    df_close = history(g.period, field='close', security_list=stocks).T
    # Return over the window: (last close - first close) / first close
    df_close['ret'] = (df_close.iloc[:, -1] - df_close.iloc[:, 0]) / df_close.iloc[:, 0]
    # ascending=False sorts best first (momentum); ascending=True gives reversal
    sorted_stocks = df_close.sort_values('ret', ascending=False)
    return sorted_stocks.index.values

def handle(context):
    if g.init:
        # First run: buy the top g.N stocks with 90% of cash, split equally
        stocks = get_sorted_stocks(context, g.security)[:g.N]
        cash = context.portfolio.available_cash * 0.9 / len(stocks)
        for stock in stocks:
            order_value(stock, cash)
        g.init = False
        return

    # Assumed step (not in the original listing): sell the g.change
    # worst performers among the current holdings
    held = get_sorted_stocks(context, list(context.portfolio.positions.keys()))
    for stock in held[-g.change:]:
        order_target(stock, 0)

    # Refill from the whole universe, best-ranked first, until g.N are held
    stocks = get_sorted_stocks(context, g.security)
    for stock in stocks:
        if len(context.portfolio.positions) &gt;= g.N:
            break
        if stock not in context.portfolio.positions:
            order_value(stock, context.portfolio.available_cash * 0.9)</pre>
</div>
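<p>A single rebalancing step can be tried offline in plain Python. The sketch below follows one reading of the rule: sell the worst-performing holdings and replace them with the best-ranked stocks not currently held. The replacement rule, the function name and the sample returns are assumptions made for this example, not something taken from the platform.</p>
<div class="cnblogs_code">
<pre>def alpaca_rebalance(holdings, returns, change=1):
    """Return (to_sell, to_buy, new_holdings) for one rebalancing step."""
    # Rank current holdings from worst to best and drop the worst ones
    ranked_held = sorted(holdings, key=returns.get)
    to_sell = ranked_held[:change]
    kept = [s for s in holdings if s not in to_sell]
    # Replace them with the best-performing stocks that are not held
    candidates = sorted((s for s in returns if s not in holdings),
                        key=returns.get, reverse=True)
    to_buy = candidates[:len(to_sell)]
    return to_sell, to_buy, kept + to_buy


# Hypothetical period returns for a five-stock universe
returns = {"A": 0.10, "B": -0.05, "C": 0.02, "D": 0.20, "E": -0.10}
holdings = ["A", "B", "C"]

to_sell, to_buy, new_holdings = alpaca_rebalance(holdings, returns, change=1)
print(to_sell, to_buy, new_holdings)</pre>
</div>
<p>With these numbers the step sells B, the weakest holding, and buys D, the strongest stock outside the portfolio.</p>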
<h2>Building a simple backtesting framework</h2>
<p>Framework components</p>
<ol>
<li>Storing contextual state: context</li>
<li>Fetching data</li>
<li>Order functions</li>
<li>User interface</li>
<li>...</li>
</ol>
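<p>To show how the components in the list fit together, here is a deliberately minimal sketch in plain Python. It is not the framework built in the original course: the class and function names, the in-memory price data and the absence of fees are all simplifying assumptions. The user interface mirrors the initialize / handle_data pair used in the strategies earlier in this article.</p>
<div class="cnblogs_code">
<pre>class Context:
    """Account state shared between the framework and the strategy."""

    def __init__(self, cash):
        self.cash = cash
        self.positions = {}  # stock code mapped to number of shares
        self.date = None


def order(context, prices, code, amount):
    """Buy (positive amount) or sell (negative amount) at today's price."""
    price = prices[code]
    held = context.positions.get(code, 0)
    # Clamp: cannot sell more than held, cannot spend more than the cash on hand
    amount = max(amount, -held)
    amount = min(amount, int(context.cash // price))
    context.cash -= amount * price
    context.positions[code] = held + amount


def run_backtest(data, cash, initialize, handle_data):
    """data: list of (date, {code: close}) in time order.

    Returns the total account value at the end of each day.
    """
    context = Context(cash)
    initialize(context)
    values = []
    for date, prices in data:
        context.date = date
        handle_data(context, prices)
        holdings = sum(prices[c] * n for c, n in context.positions.items())
        values.append(context.cash + holdings)
    return values


# User side: a buy-and-hold strategy on one stock
def initialize(context):
    context.code = "600036.XSHG"


def handle_data(context, prices):
    # Buy 100 shares on the first day, then hold
    if not context.positions:
        order(context, prices, context.code, 100)


# Hypothetical daily closing prices
data = [
    ("2019-01-02", {"600036.XSHG": 10.0}),
    ("2019-01-03", {"600036.XSHG": 11.0}),
    ("2019-01-04", {"600036.XSHG": 9.0}),
]
values = run_backtest(data, 10000.0, initialize, handle_data)
print(values)</pre>
</div>
<p>A fuller framework would load prices from a data source, apply commissions and taxes, and compare the value series against a benchmark; the control flow would stay the same.</p>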
Source: https://www.cnblogs.com/yxiaodao/p/10732824.html