Sentiment Analysis of Skincare Active Ingredient Topics using Latent Dirichlet Allocation and InSet Lexicon on Twitter Social Media
Abstract
The cosmetic industry, which includes skincare, grew by up to 9.61%, according to data from the Central Statistics Agency (BPS). As the sector expanded, so did the production of skincare products featuring active ingredients, and with it the use of those ingredients. Twitter data on active skincare ingredients was collected, forming a large dataset that required methods for analyzing both topics and opinions. To identify latent topics, topic modeling with Latent Dirichlet Allocation (LDA) was employed. Before topic modeling, the data was first clustered with K-Means to divide the large dataset into more specific groups. Sentiment analysis was then carried out using the InSet Lexicon. The research produced four clusters, each of which underwent topic modeling with LDA. Cluster 1 revealed a topic focused on alpha arbutin, with sentiment results of 42.5% positive, 45% negative, and 12.5% neutral. Cluster 2 centered on retinol and AHA BHA, with sentiment results of 41.36% positive, 46.99% negative, and 12.13% neutral. Cluster 3 covered salicylic acid and hyaluronic acid, with sentiment results of 40.57% positive, 42.62% negative, and 16.80% neutral. Lastly, Cluster 4 discussed the "Skintific" clay mask containing mugwort, with sentiment results of 41.67% positive, 43.94% negative, and 14.39% neutral. The findings are expected to help companies in the skincare industry update their business strategies.
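The lexicon-based sentiment step can be illustrated with a minimal sketch. It assumes a lexicon that maps each word to a signed polarity weight, and it labels a tweet by the sign of the summed weights; that decision rule, the function names, and the tiny `TOY_LEXICON` with its weights are all hypothetical illustrations, not the actual InSet entries or the exact procedure used in this study.

```python
from collections import Counter

# Hypothetical toy lexicon (word -> polarity weight). These entries and
# weights are illustrative assumptions, not taken from the InSet Lexicon.
TOY_LEXICON = {
    "bagus": 3,     # "good"
    "cocok": 2,     # "suitable"
    "cerah": 2,     # "bright"
    "jerawat": -3,  # "acne"
    "iritasi": -4,  # "irritation"
    "perih": -3,    # "stinging"
}

LABELS = ("positive", "negative", "neutral")


def score_tweet(tokens, lexicon):
    """Sum the polarity weights of the tokens; unknown words count as 0."""
    return sum(lexicon.get(token, 0) for token in tokens)


def label_tweet(tokens, lexicon):
    """Assumed rule: the sign of the summed score decides the label."""
    score = score_tweet(tokens, lexicon)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def sentiment_shares(tweets, lexicon):
    """Return the percentage of tweets per label, rounded to 2 decimals."""
    if not tweets:
        return {label: 0.0 for label in LABELS}
    # Whitespace tokenization is a simplification; real tweets would need
    # cleaning (URLs, mentions, slang normalization) before scoring.
    counts = Counter(label_tweet(t.lower().split(), lexicon) for t in tweets)
    return {label: round(100 * counts[label] / len(tweets), 2) for label in LABELS}


if __name__ == "__main__":
    # Made-up example tweets for one cluster, not data from the study.
    cluster_tweets = [
        "alpha arbutin bagus cocok",
        "kulit iritasi perih",
        "beli serum kemarin",
        "bagus tapi iritasi",
    ]
    print(sentiment_shares(cluster_tweets, TOY_LEXICON))
```

Applying `sentiment_shares` to the tweets of each K-Means cluster separately would yield per-cluster percentage breakdowns of the kind reported above.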