In data analysis, labeling the clusters directly on a plot makes the patterns and trends in the data easier to identify. Here are 7 labeling styles you can use:
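The snippets in this article rely on three names they never define: data, groups and customPalette. Below is a minimal sketch of one possible setup, so the snippets have something to run against. The cluster centers, point counts and hex colors are invented for illustration and are not taken from the original analysis.

```python
import numpy as np
import pandas as pd

# Hypothetical setup: the names data, groups and customPalette come from the
# snippets in this article; every value below is an assumption.
rng = np.random.default_rng(0)

# Assumed cluster centers, keyed by cluster label
groups = {'A': (2, 2), 'B': (3, 4), 'C': (4, 4), 'D': (4, 1)}

# Draw 100 normally distributed points around each center
frames = []
for label, (cx, cy) in groups.items():
    frames.append(pd.DataFrame({
        'x': rng.normal(cx, 0.3, size=100),
        'y': rng.normal(cy, 0.3, size=100),
        'label': label,
    }))
data = pd.concat(frames, ignore_index=True)

# One color per cluster, in the same order as groups.keys()
customPalette = ['#630C3A', '#39C8C6', '#D3500C', '#FFB139']
```

With this in place, data has the columns x, y and label, and customPalette[i] lines up with the i-th key of groups, which is how the loops in each style index the colors.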
STYLE 1: LEGEND
The first style uses a legend to show the cluster names. A legend can be built from boxes or symbols that represent each cluster; the snippet below builds one from text, writing each name in its cluster's color.
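# facet is assumed to be a seaborn FacetGrid (e.g. returned by sns.lmplot);
# pad, y and sep are layout values defined elsewhere
facet.ax.text(pad, y, 'Distributions of points in clusters:',
              ha='right', va='bottom', color='black')
# write each cluster name in its own color, spaced sep apart
for i, label in enumerate(groups.keys()):
    text = facet.ax.text(pad+((i+1)*sep), y, label,
                         ha='right', va='bottom',
                         color=customPalette[i])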
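The same text legend can be drawn on a plain matplotlib Axes, without a seaborn facet. The sketch below is one way to do it, not the article's original code: cluster_names and palette are stand-ins for groups and customPalette, and the pad, y and sep values are assumptions expressed in axes coordinates.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to view the plot
import matplotlib.pyplot as plt

# Hypothetical labels and colors, standing in for groups and customPalette
cluster_names = ['A', 'B', 'C', 'D']
palette = ['#630C3A', '#39C8C6', '#D3500C', '#FFB139']

fig, ax = plt.subplots(figsize=(5, 5))

# Layout values in axes coordinates (0 to 1); assumptions, tune to taste
pad, y, sep = 0.55, 1.02, 0.08

# Title of the text legend, right-aligned at x = pad, just above the axes
ax.text(pad, y, 'Distributions of points in clusters:',
        ha='right', va='bottom', color='black', transform=ax.transAxes)

# One colored name per cluster, spaced sep apart to the right of the title
legend_texts = []
for i, name in enumerate(cluster_names):
    t = ax.text(pad + (i + 1) * sep, y, name,
                ha='right', va='bottom', color=palette[i],
                transform=ax.transAxes)
    legend_texts.append(t)
```

Passing transform=ax.transAxes keeps the legend in the same place regardless of the data range, which is why the positions are given as fractions of the axes rather than as data values.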
STYLE 2: LABELS ABOVE CLUSTER
The second style draws each cluster name on top of its cluster, at the mean position of the cluster's points.
plt.figure(figsize=(5,5))
for i, label in enumerate(groups.keys()):
    # plot the points of this cluster in its palette color
    plt.scatter(x=data.loc[data['label']==label, 'x'],
                y=data.loc[data['label']==label,'y'],
                color=customPalette[i],
                alpha=0.7)
    # annotate at the cluster mean; index the mean by column name,
    # since positional [0]/[1] on a labeled Series is deprecated in pandas
    plt.annotate(label,
                 (data.loc[data['label']==label,['x','y']].mean()['x'],
                  data.loc[data['label']==label,['x','y']].mean()['y']),
                 horizontalalignment='center',
                 verticalalignment='center',
                 size=20, weight='bold',
                 color=customPalette[i])
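The snippet above filters data and recomputes the mean twice per cluster. As an alternative, the means can be computed once for all clusters with a pandas groupby. The sketch below uses a tiny invented dataset with the same column names; it only illustrates the lookup and is not part of the original code.

```python
import pandas as pd

# Tiny invented dataset with the same columns the article's snippets use
data = pd.DataFrame({
    'x': [1.0, 2.0, 3.0, 5.0],
    'y': [1.0, 3.0, 4.0, 6.0],
    'label': ['A', 'A', 'B', 'B'],
})

# Mean x and y per cluster, computed in a single pass
centroids = data.groupby('label')[['x', 'y']].mean()

# Look up each centroid by label and column name
for label, row in centroids.iterrows():
    print(label, row['x'], row['y'])
```

Inside the plotting loop, centroids.loc[label, 'x'] and centroids.loc[label, 'y'] can then replace the repeated data.loc[...].mean() expressions.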
STYLE 3: LABELS NEXT TO CLUSTER
The third style places each cluster name next to its cluster, at coordinates chosen by hand and stored in the labels dictionary.
labels = {'A': (1.25,1),
          'B': (2.25,4.5),
          'C': (4.75,3.5),
          'D': (4.75,1.5)}
plt.figure(figsize=(5,5))
for i, label in enumerate(groups.keys()):
    plt.scatter(x=data.loc[data['label']==label, 'x'],
                y=data.loc[data['label']==label,'y'],
                color=customPalette[i],
                alpha=0.7)
    plt.annotate(label,
                 labels[label],
                 horizontalalignment='center',
                 verticalalignment='center',
                 size=20, weight='bold',
                 color=customPalette[i])
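Hand-picked coordinates like those in the labels dictionary have to be updated whenever the data changes. One possible alternative, sketched here under assumptions, is to anchor each label at the right edge of its cluster and push it outward by a fixed offset using the xytext and textcoords='offset points' arguments of annotate. The dataset, the palette dictionary and the 12-point offset are invented for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to view the plot
import matplotlib.pyplot as plt
import pandas as pd

# Tiny invented dataset and colors (assumptions, not the article's data)
data = pd.DataFrame({
    'x': [1.0, 2.0, 3.0, 5.0],
    'y': [1.0, 3.0, 4.0, 6.0],
    'label': ['A', 'A', 'B', 'B'],
})
palette = {'A': '#630C3A', 'B': '#39C8C6'}

fig, ax = plt.subplots(figsize=(5, 5))
annotations = {}
for label, pts in data.groupby('label'):
    ax.scatter(pts['x'], pts['y'], color=palette[label], alpha=0.7)
    # Anchor at the cluster's right edge, at its mean height
    anchor = (pts['x'].max(), pts['y'].mean())
    # Shift the text 12 points to the right of the anchor
    annotations[label] = ax.annotate(
        label, xy=anchor,
        xytext=(12, 0), textcoords='offset points',
        ha='left', va='center',
        size=20, weight='bold', color=palette[label])
```

Because the offset is measured in points rather than data units, the gap between cluster and label stays the same when the axes are rescaled. Labels can still collide when clusters sit side by side, so manual coordinates remain useful in crowded plots.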
STYLE 4: LABELS NEXT TO CLUSTER
The fourth style also places the cluster names next to each cluster, again at hand-picked coordinates stored in the labels dictionary.
labels = {'A': (1.25,1),
          'B': (2.25,4.5),
          'C': (4.75,3.5),
          'D': (4.75,1.5)}
plt.figure(figsize=(5,5))
for i, label in enumerate(groups.keys()):
    plt.scatter(x=data.loc[data['label']==label, 'x'],
                y=data.loc[data['label']==label,'y'],
                color=customPalette[i],
                alpha=0.7)
    plt.annotate(label,
                 labels[label],
                 horizontalalignment='center',
                 verticalalignment='center',
                 size=20, weight='bold',
                 color=customPalette[i])
STYLE 5: LABELS CENTERED ON CLUSTER MEANS
The fifth style centers each cluster name on the cluster's mean. The points are drawn mostly transparent (alpha=0.20) so that the label stands out against them.
plt.figure(figsize=(5,5))
for i, label in enumerate(groups.keys()):
    plt.scatter(x=data.loc[data['label']==label, 'x'],
                y=data.loc[data['label']==label,'y'],
                color=customPalette[i],
                alpha=0.20)
    plt.annotate(label,
                 data.loc[data['label']==label,['x','y']].mean(),
                 horizontalalignment='center',
                 verticalalignment='center',
                 size=20, weight='bold',
                 color=customPalette[i])
STYLE 6: LABELS CENTERED ON CLUSTER MEANS (2)
The sixth style also centers each cluster name on the cluster's mean, but draws the points fully opaque (alpha=1).
plt.figure(figsize=(5,5))
for i, label in enumerate(groups.keys()):
    plt.scatter(x=data.loc[data['label']==label, 'x'],
                y=data.loc[data['label']==label,'y'],
                color=customPalette[i],
                alpha=1)
    plt.annotate(label,
                 data.loc[data['label']==label,['x','y']].mean(),
                 horizontalalignment='center',
                 verticalalignment='center',
                 size=20, weight='bold',
                 color=customPalette[i])
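With fully opaque points, a label drawn in the same color as its cluster can be hard to read. One common remedy, shown here as a sketch and not as part of the original style, is to outline the text with a white stroke using matplotlib.patheffects. The points, color and stroke width below are assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to view the plot
import matplotlib.pyplot as plt
import matplotlib.patheffects as pe

# A few invented points for a single cluster
xs = [1.0, 1.2, 0.9, 1.1]
ys = [1.0, 0.9, 1.1, 1.2]
color = '#630C3A'

fig, ax = plt.subplots(figsize=(5, 5))
ax.scatter(xs, ys, color=color, alpha=1)

# Center the label on the cluster mean and outline it in white
center = (sum(xs) / len(xs), sum(ys) / len(ys))
ann = ax.annotate(
    'A', center,
    ha='center', va='center',
    size=20, weight='bold', color=color,
    path_effects=[pe.withStroke(linewidth=4, foreground='white')])
```

The stroke is drawn behind the glyphs, so the label keeps its cluster color while gaining a contrasting halo.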
STYLE 7: LABELS WITH ICON
The seventh style shows each cluster name together with a symbol (icon), placed at the cluster's mean.
plt.figure(figsize=(5,5))
for i, label in enumerate(groups.keys()):
    plt.scatter(x=data.loc[data['label']==label, 'x'],
                y=data.loc[data['label']==label,'y'],
                color=customPalette[i],
                alpha=0.7)
    # icon is assumed to be a dict mapping each label to a symbol string,
    # defined elsewhere; the mean is indexed by column name because
    # positional [0]/[1] on a labeled Series is deprecated in pandas
    plt.annotate(label + '\n' + icon[label],
                 (data.loc[data['label']==label,['x','y']].mean()['x'],
                  data.loc[data['label']==label,['x','y']].mean()['y']),
                 horizontalalignment='center',
                 verticalalignment='center',
                 size=20, weight='bold',
                 color=customPalette[i])
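The snippet above reads from an icon mapping that the article does not define. A minimal sketch of what it could look like follows, assuming icon is a dictionary from cluster label to a one-character Unicode symbol. The symbols and the cluster centers are invented for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to view the plot
import matplotlib.pyplot as plt

# Hypothetical icon mapping: one Unicode symbol per cluster label
icon = {'A': '\u25CF',   # black circle
        'B': '\u25A0',   # black square
        'C': '\u25B2',   # black up-pointing triangle
        'D': '\u2605'}   # black star

# Hypothetical cluster means, standing in for the values computed from data
centers = {'A': (1, 1), 'B': (3, 4)}

fig, ax = plt.subplots(figsize=(5, 5))
ax.set_xlim(0, 5)
ax.set_ylim(0, 5)

# Two-line annotation: the label on top, its symbol underneath
texts = {}
for label, (cx, cy) in centers.items():
    texts[label] = ax.annotate(label + '\n' + icon[label], (cx, cy),
                               ha='center', va='center',
                               size=20, weight='bold')
```

Whether a given symbol renders depends on the font in use, so it is worth checking the output when choosing less common characters.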
With the right labeling style, the patterns and trends in clustered data become much easier to identify.