# Libraries

```python
import pandas as pd
import matplotlib
import matplotlib.pyplot as plt
import seaborn as sns
```

# Data

```python
df = sns.load_dataset("tips")
df.head()
```

|   | total_bill | tip | sex | smoker | day | time | size |
|---|-----------:|----:|-----|--------|-----|------|-----:|
| 0 | 16.99 | 1.01 | Female | No | Sun | Dinner | 2 |
| 1 | 10.34 | 1.66 | Male | No | Sun | Dinner | 3 |
| 2 | 21.01 | 3.50 | Male | No | Sun | Dinner | 3 |
| 3 | 23.68 | 3.31 | Male | No | Sun | Dinner | 2 |
| 4 | 24.59 | 3.61 | Female | No | Sun | Dinner | 4 |

# Univariate analysis

## Discrete variable

```python
df["sex"].value_counts(normalize=False, sort=True, ascending=False)
```

    sex
    Male      157
    Female     87
    Name: count, dtype: int64

To get proportions instead of raw counts, pass `normalize=True` (multiply by 100 for percentages):

```python
df["sex"].value_counts(normalize=True, sort=True, ascending=False)
```

    sex
    Male      0.643443
    Female    0.356557
    Name: proportion, dtype: float64

```python
# With pandas
df["sex"].value_counts().plot(kind="bar")
plt.show()
```

![png](analyse%20univariée%20-%20MeP_9_0.png)

```python
# With Seaborn
sns.countplot(data=df, x="sex", hue="sex")
plt.show()
```

![png](analyse%20univariée%20-%20MeP_10_1.png)

## Continuous variable

### Statistics

```python
df["tip"].describe()
```

    count    244.000000
    mean       2.998279
    std        1.383638
    min        1.000000
    25%        2.000000
    50%        2.900000
    75%        3.562500
    max       10.000000
    Name: tip, dtype: float64

### Histogram

```python
# With pandas
df["tip"].plot(kind="hist", bins=10)
plt.show()
```

![png](analyse%20univariée%20-%20MeP_15_0.png)

```python
# With Seaborn
sns.displot(data=df, x="tip", bins=10)
plt.show()
```

![png](analyse%20univariée%20-%20MeP_16_1.png)

### Boxplot

```python
# With pandas
df["tip"].plot(kind="box")
plt.show()
```

![png](analyse%20univariée%20-%20MeP_18_0.png)

```python
# With Seaborn
fig, ax = plt.subplots(figsize=(10, 2))
# Figure-level alternative (creates its own figure):
# sns.catplot(data=df, y="tip", kind="box")
sns.boxplot(x=df["tip"], ax=ax)
plt.show()
```

![png](analyse%20univariée%20-%20MeP_19_0.png)
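
The box plots above are drawn from the same quartiles that `describe()` reports. With the default settings of both matplotlib and Seaborn (`whis=1.5`), the whiskers stop at the most extreme points lying within 1.5 × IQR of the box, and anything beyond is drawn as an individual point. The sketch below reproduces that rule by hand. It is only an illustration: the sample is an assumption made here to avoid downloading the dataset, built from the five tips shown by `df.head()` plus the maximum of 10.00 reported by `describe()`, and the names `lower_fence`, `upper_fence`, and `outliers` are hypothetical.

```python
import pandas as pd

# Illustrative sample only (not the full dataset): the five tips shown by
# df.head() plus the maximum value reported by describe().
tips = pd.Series([1.01, 1.66, 3.50, 3.31, 3.61, 10.00], name="tip")

# Quartiles and interquartile range, as in the 25% / 75% rows of describe().
q1 = tips.quantile(0.25)
q3 = tips.quantile(0.75)
iqr = q3 - q1

# Fences used by the default box plot rule (whis=1.5).
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Values outside the fences are the points drawn beyond the whiskers.
outliers = tips[(tips < lower_fence) | (tips > upper_fence)]

print(f"Q1={q1:.4f}, Q3={q3:.4f}, IQR={iqr:.4f}")
print(f"fences=({lower_fence:.4f}, {upper_fence:.4f})")
print("outliers:", outliers.tolist())
```

To apply the same check to the real data, replace `tips` with `df["tip"]`; the rest of the code is unchanged.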