Python [Pandas]: Intersection, Union, and Difference
<h2 id="前堤条件对于colums都相同的dataframe做过滤的时候">Prerequisite: filtering DataFrames that share the same columns</h2><p>Create two DataFrames with identical structure (column names); df1 and df2 have one row in common.</p>
<pre><code class="language-python">import pandas as pd
df1=pd.DataFrame([['a',10,'男'],['b',11,'女']],columns=['name','age','gender'])
df2=pd.DataFrame([['a',10,'男']],columns=['name','age','gender'])
df1
</code></pre>
<div>
<style scoped="">.dataframe tbody tr th:only-of-type { vertical-align: middle }
.dataframe tbody tr th { vertical-align: top }
.dataframe thead th { text-align: right }</style>
<table border="1" class="dataframe">
<thead>
<tr style="text-align: right">
<th></th>
<th>name</th>
<th>age</th>
<th>gender</th>
</tr>
</thead>
<tbody>
<tr>
<th>0</th>
<td>a</td>
<td>10</td>
<td>男</td>
</tr>
<tr>
<th>1</th>
<td>b</td>
<td>11</td>
<td>女</td>
</tr>
</tbody>
</table>
</div>
<pre><code class="language-python">df2
</code></pre>
<div>
<style scoped="">.dataframe tbody tr th:only-of-type { vertical-align: middle }
.dataframe tbody tr th { vertical-align: top }
.dataframe thead th { text-align: right }</style>
<table border="1" class="dataframe">
<thead>
<tr style="text-align: right">
<th></th>
<th>name</th>
<th>age</th>
<th>gender</th>
</tr>
</thead>
<tbody>
<tr>
<th>0</th>
<td>a</td>
<td>10</td>
<td>男</td>
</tr>
</tbody>
</table>
</div>
<h3 id="取交集">Intersection</h3>
<pre><code class="language-python">pd.merge(df1,df2,on=['name','age','gender'])
</code></pre>
<div>
<style scoped="">.dataframe tbody tr th:only-of-type { vertical-align: middle }
.dataframe tbody tr th { vertical-align: top }
.dataframe thead th { text-align: right }</style>
<table border="1" class="dataframe">
<thead>
<tr style="text-align: right">
<th></th>
<th>name</th>
<th>age</th>
<th>gender</th>
</tr>
</thead>
<tbody>
<tr>
<th>0</th>
<td>a</td>
<td>10</td>
<td>男</td>
</tr>
</tbody>
</table>
</div>
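<p>A minimal sketch of the same intersection, written out in full. It relies on two documented defaults of <code>pd.merge</code>: <code>how='inner'</code>, and joining on all shared columns when <code>on</code> is omitted. The trailing <code>drop_duplicates()</code> is an addition not in the original post; it is assumed to be wanted when either frame may contain repeated rows, since a merge would otherwise multiply them.</p>

```python
import pandas as pd

df1 = pd.DataFrame([['a', 10, '男'], ['b', 11, '女']], columns=['name', 'age', 'gender'])
df2 = pd.DataFrame([['a', 10, '男']], columns=['name', 'age', 'gender'])

# An inner join on every shared column keeps only rows present in both frames.
# drop_duplicates() is an extra safeguard (an assumption, not part of the original recipe).
intersection = pd.merge(df1, df2, how='inner').drop_duplicates().reset_index(drop=True)
print(intersection)
```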
<h3 id="取并集">Union</h3>
<pre><code class="language-python">pd.merge(df1,df2,on=['name','age','gender'],how='outer')
</code></pre>
<div>
<style scoped="">.dataframe tbody tr th:only-of-type { vertical-align: middle }
.dataframe tbody tr th { vertical-align: top }
.dataframe thead th { text-align: right }</style>
<table border="1" class="dataframe">
<thead>
<tr style="text-align: right">
<th></th>
<th>name</th>
<th>age</th>
<th>gender</th>
</tr>
</thead>
<tbody>
<tr>
<th>0</th>
<td>a</td>
<td>10</td>
<td>男</td>
</tr>
<tr>
<th>1</th>
<td>b</td>
<td>11</td>
<td>女</td>
</tr>
</tbody>
</table>
</div>
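<p>As an alternative sketch (not the method used in the original post), a union can also be built by stacking the two frames with <code>pd.concat</code> and dropping repeated rows. This assumes that "union" here means the set of distinct rows across both frames.</p>

```python
import pandas as pd

df1 = pd.DataFrame([['a', 10, '男'], ['b', 11, '女']], columns=['name', 'age', 'gender'])
df2 = pd.DataFrame([['a', 10, '男']], columns=['name', 'age', 'gender'])

# Stack both frames, then keep the first occurrence of each distinct row.
union = pd.concat([df1, df2], ignore_index=True).drop_duplicates().reset_index(drop=True)
print(union)
```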
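<h3 id="取差集">Difference</h3>
<pre><code class="language-python"># DataFrame.append was removed in pandas 2.0, so concatenate with pd.concat
diff=pd.concat([df1,df2])
# keep=False drops every row that appears more than once, i.e. the rows shared by df1 and df2
# (this equals df1 - df2 only because every row of df2 also exists in df1)
diff=diff.drop_duplicates(subset=['name','age','gender'],keep=False)
diff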
</code></pre>
<div>
<style scoped="">.dataframe tbody tr th:only-of-type { vertical-align: middle }
.dataframe tbody tr th { vertical-align: top }
.dataframe thead th { text-align: right }</style>
<table border="1" class="dataframe">
<thead>
<tr style="text-align: right">
<th></th>
<th>name</th>
<th>age</th>
<th>gender</th>
</tr>
</thead>
<tbody>
<tr>
<th>1</th>
<td>b</td>
<td>11</td>
<td>女</td>
</tr>
</tbody>
</table>
</div>
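<p>The concatenate-and-drop approach yields a symmetric difference when df2 contains rows that df1 lacks. The sketch below shows one possible way to get a one-sided difference (rows of df1 that are not in df2) using the <code>indicator</code> parameter of <code>merge</code>. The extra row <code>['c', 12, '男']</code> in df2 is hypothetical and added only to make the distinction visible.</p>

```python
import pandas as pd

df1 = pd.DataFrame([['a', 10, '男'], ['b', 11, '女']], columns=['name', 'age', 'gender'])
# Hypothetical df2: the second row does not exist in df1.
df2 = pd.DataFrame([['a', 10, '男'], ['c', 12, '男']], columns=['name', 'age', 'gender'])

# indicator=True adds a '_merge' column telling where each row was found.
merged = df1.merge(df2, how='left', indicator=True)
diff = merged[merged['_merge'] == 'left_only'].drop(columns='_merge').reset_index(drop=True)

# For comparison: concat + drop_duplicates(keep=False) keeps rows unique to either frame.
symmetric = pd.concat([df1, df2]).drop_duplicates(keep=False).reset_index(drop=True)

print(diff)
print(symmetric)
```

<p>With these inputs, <code>diff</code> holds only the row for <code>b</code>, while <code>symmetric</code> holds the rows for both <code>b</code> and <code>c</code>.</p>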
Source: https://www.cnblogs.com/kaerxifa/p/13155768.html