Building a tool to automatically extract web data and images, analyse them and present the results in an easy-to-use dashboard.
Ecommerce sites change every hour, and it is difficult to keep up to date with the offers and promotions presented by different online stores. Our client needed a tool to analyse multiple websites many times per day and show which products were being advertised and where, how much screen real estate each advert was given, and what text was used in every advert. The extracted data would be used to build charts in an easy-to-use Tableau dashboard.
We built an analysis tool around a serverless ETL web scraping pipeline that analyses the smartphone market daily, tracking price fluctuations across different sector-leading online stores.
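As an illustration of the price tracking described above, here is a minimal sketch of how day-over-day price changes could be derived from scraped snapshots. The record layout, store names, product names and the `price_changes` function are all hypothetical and are not taken from the actual project.

```python
from collections import defaultdict

# Hypothetical scraped records: (date, store, product, price).
snapshots = [
    ("2020-01-01", "store-a", "Phone X", 699.0),
    ("2020-01-02", "store-a", "Phone X", 649.0),
    ("2020-01-01", "store-b", "Phone X", 709.0),
    ("2020-01-02", "store-b", "Phone X", 709.0),
]


def price_changes(records):
    """Return the change between consecutive snapshots per (store, product)."""
    history = defaultdict(list)
    for date, store, product, price in records:
        history[(store, product)].append((date, price))

    changes = []
    for (store, product), points in history.items():
        points.sort()  # ISO dates sort chronologically as strings
        for (_, previous), (date, current) in zip(points, points[1:]):
            if previous == 0:
                continue  # avoid dividing by zero on a bad scrape
            changes.append({
                "date": date,
                "store": store,
                "product": product,
                "change": round(current - previous, 2),
                "change_pct": round((current - previous) / previous * 100, 2),
            })
    return changes


changes = price_changes(snapshots)
```

Each resulting row is flat, so it could be written to a table or file that a dashboard reads directly.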
The tool also included an image recognition component, which used a data dictionary to identify the most promoted devices across the stores.
All the data is then loaded into a business intelligence tool for analysis.
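To give an idea of how a data dictionary can turn recognised advert text into device counts, here is a small sketch. It assumes the text has already been extracted from the images by an OCR step; the dictionary contents, the alias lists and the `count_promoted_devices` function are hypothetical examples, not the project's real data.

```python
import re
from collections import Counter

# Hypothetical data dictionary: canonical device name -> aliases seen in adverts.
DEVICE_DICTIONARY = {
    "Phone X": ["phone x", "phonex"],
    "Phone Y Pro": ["phone y pro", "y pro"],
}


def normalise(text):
    """Lower-case the text and collapse punctuation into single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def count_promoted_devices(advert_texts):
    """Count how many adverts mention each device in the dictionary."""
    counts = Counter()
    for text in advert_texts:
        padded = f" {normalise(text)} "  # padding gives whole-word matching
        for device, aliases in DEVICE_DICTIONARY.items():
            if any(f" {normalise(alias)} " in padded for alias in aliases):
                counts[device] += 1
    return counts


# Example text as it might come back from an OCR step.
ocr_texts = [
    "NEW Phone-X from 649!",
    "Save on the Y Pro today",
    "PhoneX deals",
    "Free delivery",
]
counts = count_promoted_devices(ocr_texts)
```

Calling `counts.most_common()` then ranks the devices by the number of adverts that mention them.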
Technologies: Node.js, Python, AWS Step Functions, AWS Lambda, Business Intelligence Tools, OCR, OpenCV, Amazon Rekognition.
This product was built in stages following a lean development process. The MVP of the tool extracted text from just one site, with later versions analysing flat images, screen real estate and the advert text content itself. This data was then loaded into Tableau, where dashboards were created so users could see real-time data on how their products were being advertised across various platforms.
The data has proved invaluable, allowing product owners to see why demand is changing for their products on a day-to-day basis rather than having to wait for store reports.
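The screen real estate measure mentioned above can be thought of as the fraction of a page that each advert occupies. The sketch below shows one simple way to compute it from advert bounding boxes. It assumes the boxes and page dimensions are already known in pixels and that adverts do not overlap; the `AdvertBox` type and `real_estate_share` function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class AdvertBox:
    """Bounding box of one advert on a page, in pixels (hypothetical layout)."""
    product: str
    x: int
    y: int
    width: int
    height: int


def real_estate_share(boxes, page_width, page_height):
    """Return the fraction of the page area occupied by each product's adverts.

    Boxes are clipped to the page; overlapping adverts are not de-duplicated.
    """
    page_area = page_width * page_height
    if page_area <= 0:
        raise ValueError("page dimensions must be positive")

    shares = {}
    for box in boxes:
        visible_w = max(0, min(box.x + box.width, page_width) - max(box.x, 0))
        visible_h = max(0, min(box.y + box.height, page_height) - max(box.y, 0))
        area_share = visible_w * visible_h / page_area
        shares[box.product] = shares.get(box.product, 0.0) + area_share
    return shares


boxes = [
    AdvertBox("Phone X", x=0, y=0, width=1000, height=400),
    AdvertBox("Phone Y", x=0, y=400, width=500, height=400),
    AdvertBox("Phone X", x=500, y=400, width=600, height=400),  # clipped at the page edge
]
shares = real_estate_share(boxes, page_width=1000, page_height=2000)
```

In this example the two "Phone X" adverts together cover about 30% of the page and the "Phone Y" advert about 10%.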
We turned to Secret Source with a rough sketch of what we were looking to build, and a guide on the questions we were looking to answer.
From that topline overview, a project team was formed. Through their expertise, Secret Source were able to quickly extract the pertinent information from our organisation and translate that data into a detailed sprint program.
The program was delivered on time and within budget. The process was smooth and, most importantly, it was an enjoyable experience in what was unknown territory.
I am really looking forward to building on this foundation with Secret Source as we broaden the scope of the data we are interested in, and I highly recommend them as a strong partner.
As serverless solutions have improved with the growth of AWS, Azure and Google Cloud, demand for serverless projects has increased and our team has been growing. Working on projects like this, where we get to leverage the vast power of AWS, is just the type of work we love doing, and we are looking forward to developing this product further and expanding its functionality. We have at least another three months of development in the pipeline, adding features to the current product before we move into new products and markets.