
Wednesday, January 25, 2023

Ranking Models: NDCG, in Towards Data Science

I have used this in practice. Good to see this overview.

Demystifying NDCG, in Towards Data Science, by Aparna Dhinakaran

How to best use this important metric for monitoring ranking models

Ranking models underpin many aspects of modern digital life, from search results to music recommendations. Anyone who has built a recommendation system understands the many challenges that come from developing and evaluating ranking models to serve their customers.

While these challenges start in data preparation and model training and continue through development and deployment, what tends to give data scientists and machine learning engineers the most trouble is maintaining their ranking models in production. Models are notoriously difficult to maintain in production because they are constantly changing as they adapt to dynamic environments.

In order to break down how to monitor normalized discounted cumulative gain (NDCG) for ranking models in production, this post covers:

What is NDCG and where is it used?

The intuition behind NDCG

What is NDCG@K?

How does NDCG compare to other metrics?

How is NDCG used in model monitoring?

After tackling these main questions, your team will be able to achieve real-time monitoring and root cause analysis using NDCG for ranking models in production.

What Is NDCG and Where Is It Used?

Normalized discounted cumulative gain is a measure of ranking quality. ML teams often use NDCG to evaluate the performance of a search engine, recommendation system, or other information retrieval system. Search engines are popular with companies whose applications interact directly with customers, like Alphabet, Amazon, Etsy, Netflix, and Spotify, to name a few.

The value of NDCG is determined by comparing the relevance of the items returned by the search engine to the relevance of the items that a hypothetical “ideal” search engine would return. For example, if you search “Hero” on a popular music streaming app, you might get 10+ results with the word “Hero” in the song title, artist name, or album title.

The relevance of each song or artist is represented by a score (also known as a “grade”) that is assigned to that result for the search query. The scores of these recommendations are then discounted based on their position in the search results: did they get recommended first or last? The discounted scores are then summed and divided by the maximum possible discounted score, which is the discounted score that would be obtained if the search engine returned the documents in the order of their true relevance.
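The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the article's own code. It assumes the commonly used log2(position + 1) discount and a linear gain (the raw grade), neither of which is specified in this excerpt, and it takes the "ideal" ordering to be the same returned items sorted by grade. The function names and the example grades are hypothetical.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain for grades listed in ranked order.

    Assumption: the grade at 1-based position p is divided by log2(p + 1).
    """
    return sum(
        rel / math.log2(pos + 1)
        for pos, rel in enumerate(relevances, start=1)
    )


def ndcg(relevances):
    """DCG of the actual order divided by DCG of the ideal order.

    Assumption: the ideal order is the same items sorted by grade,
    highest first. Returns 0.0 when every grade is zero.
    """
    ideal_dcg = dcg(sorted(relevances, reverse=True))
    if ideal_dcg == 0:
        return 0.0
    return dcg(relevances) / ideal_dcg


# Hypothetical grades for four results, in the order they were shown.
shown = [1, 3, 0, 2]
print(f"NDCG = {ndcg(shown):.3f}")
```

A list that is already in ideal order scores 1.0, and moving a highly graded result further down the list lowers the score, which matches the intuition that position matters.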

If a user wants the song “My Hero” by Foo Fighters, for example, the closer that song is to the top of the results, the better the search will be for that user. Ultimately, the relative order of returned results or recommendations is important for customer satisfaction. .... (more below at link)
