Consumer products often advertise features like "protection against germs", "gentle on fabric", "natural ingredients only", "rich in vitamin C", or "high energy efficiency", making it crucial to analyse and compare these claims and ingredient lists across a wide range of product categories. The aim is to develop analytics and market insights that identify trends across consumer products in categories such as personal care, food and beverages, and household cleaning supplies. To achieve this, automated clustering of product claims and ingredients is needed, followed by feature extraction to identify the key characteristics that define each group.
This thesis will involve developing and evaluating methods to first cluster similar product claims and ingredients into meaningful groups, and then extract important features from each cluster to identify trends, common themes, and unique selling points. This will provide valuable market insights and help assess how products are positioned relative to one another.
The project will begin with two datasets: a product claims dataset containing around 300,000 claims from a variety of consumer products, and an ingredients dataset containing the ingredient lists of these products.
The first step is to research and apply clustering techniques to group similar product claims. The focus will be on finding the most suitable clustering algorithms and optimizing them to ensure meaningful groupings. Various methods will be compared to determine which approach works best for the given data. Once the clustering is complete, feature extraction methods will be applied to identify key characteristics of the clusters of product claims. The goal is to derive insights that are specific to each group, highlighting common keywords such as "protection", "natural ingredients", or "efficiency".
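As an illustration of the feature extraction step, here is a minimal, dependency-free sketch that ranks keywords per cluster with a TF-IDF style score, treating each cluster as one document. It assumes cluster labels have already been assigned by some clustering method. The function names, the example clusters, and the smoothed IDF formula are assumptions made for this sketch, not part of the project description, and a real implementation would likely rely on an established machine learning library.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a claim and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def top_terms_per_cluster(clusters, k=2):
    """Return the k highest-scoring terms for each cluster.

    clusters: dict mapping a cluster label to a list of claim strings
    (hypothetical input format chosen for this sketch).
    """
    # Term counts per cluster, treating each cluster as a single document.
    term_counts = {
        label: Counter(tok for claim in claims for tok in tokenize(claim))
        for label, claims in clusters.items()
    }
    # Number of clusters in which each term occurs.
    cluster_freq = Counter()
    for counts in term_counts.values():
        cluster_freq.update(counts.keys())
    n_clusters = len(term_counts)

    top = {}
    for label, counts in term_counts.items():
        total = sum(counts.values())
        scores = {
            # Term frequency times a smoothed inverse cluster frequency.
            term: (count / total)
            * (math.log((1 + n_clusters) / (1 + cluster_freq[term])) + 1)
            for term, count in counts.items()
        }
        # Sort by descending score, breaking ties alphabetically.
        ranked = sorted(scores, key=lambda term: (-scores[term], term))
        top[label] = ranked[:k]
    return top


example_clusters = {
    "hygiene": ["protection against germs", "kills germs fast",
                "germs protection for hands"],
    "natural": ["natural ingredients only", "made with natural ingredients"],
}
top_terms = top_terms_per_cluster(example_clusters, k=2)
print(top_terms)
```

Terms that occur in many clusters receive a lower weight, so the top terms tend to be the ones that distinguish a group from the others.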
The ingredients dataset will also be clustered to identify common groupings and to standardize ingredient lists across different products (e.g., "distilled water" and "aqua" both become "water"), allowing for clearer analysis and comparison.
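The standardization idea can be sketched as a lookup from ingredient variants to a canonical name. The mapping below is hand-written and contains only the two examples from the text. In the project such a mapping would presumably be derived from the ingredient clusters. The names `CANONICAL_NAMES` and `normalize_ingredient_list`, and the comma-separated input format, are assumptions made for illustration.

```python
import re

# Hypothetical variant -> canonical name mapping (assumption: in practice
# this would be produced by clustering the ingredients dataset).
CANONICAL_NAMES = {
    "aqua": "water",
    "distilled water": "water",
}


def normalize_ingredient_list(raw):
    """Split a comma-separated ingredient string and map known variants
    to their canonical names. Unknown ingredients are kept as-is."""
    names = []
    for part in raw.split(","):
        # Collapse repeated whitespace, trim, and lowercase.
        name = re.sub(r"\s+", " ", part).strip().lower()
        if name:
            names.append(CANONICAL_NAMES.get(name, name))
    return names


normalized = normalize_ingredient_list("Aqua, Glycerin,  Distilled  Water")
print(normalized)
```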
Throughout the project, different clustering and feature extraction methods will be compared using appropriate metrics to evaluate their performance. The research will involve identifying the best approaches, optimizing their parameters, and assessing their performance.
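One commonly used metric for comparing clusterings is the silhouette score, which relates each item's average distance to its own cluster to its average distance to the nearest other cluster. The posting does not name a specific metric, so the following is only one possible choice. This minimal sketch computes the score from scratch, using Jaccard distance on token sets as an assumed distance measure. The function names and example claims are illustrative.

```python
import re


def token_set(text):
    """Represent a claim as the set of its lowercase alphabetic tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def jaccard_distance(a, b):
    """1 minus the Jaccard similarity of two token sets."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def mean_silhouette(claims, labels):
    """Mean silhouette score of a labelling; higher means tighter,
    better separated clusters. Requires at least two clusters."""
    sets = [token_set(c) for c in claims]
    members_by_label = {}
    for idx, label in enumerate(labels):
        members_by_label.setdefault(label, []).append(idx)
    if len(members_by_label) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for i, label in enumerate(labels):
        own = [j for j in members_by_label[label] if j != i]
        if not own:
            # By convention, a singleton cluster scores 0.
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the same cluster.
        a = sum(jaccard_distance(sets[i], sets[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(jaccard_distance(sets[i], sets[j]) for j in members) / len(members)
            for other, members in members_by_label.items()
            if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


claims = [
    "protection against germs",
    "germs protection",
    "natural ingredients only",
    "only natural ingredients",
]
good_score = mean_silhouette(claims, [0, 0, 1, 1])
bad_score = mean_silhouette(claims, [0, 1, 0, 1])
print(good_score, bad_score)
```

A labelling that groups the two germ-related claims together and the two ingredient-related claims together scores higher than one that mixes them, which is the kind of comparison needed when choosing between methods or parameter settings.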
We are looking for a motivated student with a strong interest in machine learning, natural language processing (NLP), and data science. Knowledge of clustering techniques and feature extraction is beneficial. This thesis is suitable for students of computer science, data science, or a related field with experience in Python and machine learning frameworks.
The thesis project can be published and used in your personal portfolio as well as in company marketing. Include your résumé/CV and portfolio in your application.
Full Time