In this article, we look at how to plot numerical variables in Python using pandas. The pandas DataFrame class has a member named plot. Calling the scatter() method on that plot member draws a scatter plot of two variables, that is, two columns of a pandas DataFrame.
Syntax
A scatter plot of two numerical columns is drawn with DataFrame.plot.scatter(), passing the column names as x and y:
import pandas as pd
import matplotlib.pyplot as plt
# Create a sample DataFrame
data = {'X': [1, 2, 3, 4, 5], 'Y': [10, 20, 30, 40, 50]}
df = pd.DataFrame(data)
# Plot the data
df.plot.scatter(x='X', y='Y')
plt.show()
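Beyond x and y, plot.scatter() accepts a few optional arguments that are often useful. The sketch below is illustrative only: the extra column 'Z' and the chosen marker size and colormap are assumptions made for this example, not part of the data used elsewhere in the article.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Same sample data as above, plus a hypothetical third column 'Z'
# that is used only to drive the marker colour.
df = pd.DataFrame({
    'X': [1, 2, 3, 4, 5],
    'Y': [10, 20, 30, 40, 50],
    'Z': [5, 3, 8, 1, 9],
})

# s sets the marker size, c names the column mapped to colour,
# and colormap selects the colour scale used for that column.
ax = df.plot.scatter(x='X', y='Y', s=60, c='Z', colormap='viridis')

if __name__ == "__main__":
    plt.show()
```

Mapping a third column to colour is one way to show an extra numerical variable on the same two axes.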
Example
Here is an example that plots the same two columns by passing them directly to Matplotlib's plt.scatter(), then adds axis labels and a title:
import pandas as pd
import matplotlib.pyplot as plt
# Create a sample DataFrame
data = {'X': [1, 2, 3, 4, 5], 'Y': [10, 20, 30, 40, 50]}
df = pd.DataFrame(data)
# Plot the data
plt.scatter(df['X'], df['Y'])
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.title('Plot of X and Y')
plt.show()
This will create a scatter plot of the two numerical variables X and Y.
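The labels and title in the example above can also be set while staying with plot.scatter(), because that method returns the Matplotlib Axes it draws on. A minimal sketch, reusing the same sample data:

```python
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({'X': [1, 2, 3, 4, 5], 'Y': [10, 20, 30, 40, 50]})

# plot.scatter() returns the Matplotlib Axes object it drew on,
# so labels and the title can be set on that object afterwards.
ax = df.plot.scatter(x='X', y='Y')
ax.set_xlabel('X-axis')
ax.set_ylabel('Y-axis')
ax.set_title('Plot of X and Y')

if __name__ == "__main__":
    plt.show()
```

Without the set_xlabel() and set_ylabel() calls, pandas labels the axes with the column names 'X' and 'Y'.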
In this article, we have learned how to plot numerical variables in pandas using the plot.scatter() method. We can use this method to visualize relationships between different variables or columns of our DataFrame.
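To compare several column pairs, one possible approach is to draw each pair on its own subplot through the ax argument of plot.scatter(). In this sketch the column 'W' and its values are hypothetical, added only to give a second relationship to look at.

```python
import pandas as pd
import matplotlib.pyplot as plt

# 'W' is a hypothetical extra column added for illustration.
df = pd.DataFrame({
    'X': [1, 2, 3, 4, 5],
    'Y': [10, 20, 30, 40, 50],
    'W': [9, 7, 6, 3, 1],
})

# One row with two subplots; each scatter call targets one Axes via ax=.
fig, (left, right) = plt.subplots(1, 2, figsize=(8, 3))
df.plot.scatter(x='X', y='Y', ax=left)   # Y rises as X rises
df.plot.scatter(x='X', y='W', ax=right)  # W falls as X rises
fig.tight_layout()

if __name__ == "__main__":
    plt.show()
```

Placing the plots side by side makes it easier to see that one pair rises together while the other moves in opposite directions.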
References
- Pandas Documentation: https://pandas.pydata.org/docs/
- Matplotlib Documentation: https://matplotlib.org/stable/tutorials/introductory/pyplot.html