Information Retrieval (IR) and NLU
There are a lot of explanations elsewhere, so here I’d like to share some example questions in an interview setting.
Last week we joined the hype for ChatGPT. Following the same train of thought, let’s think more about information retrieval and natural language understanding problems.
What is Information Retrieval (IR)?
How do Natural Language Understanding (NLU) and Information Retrieval fit together?
One of my friends is an expert in IR and I’ve also run into some use cases in the past. The following are examples of some dedicated roles for IR:
- Google IR Product: Job Description
- LinkedIn IR Engineer: Job Description
- Wayfair Search Engineer: Job Description
- TikTok MLE Search: Job Description
- Airbnb Search Engineer: Job Description
Here are some tips for readers’ reference:
Question 1:
Information retrieval (IR) is finding material (usually documents) of an unstructured nature (usually text) that satisfies an information need from within large collections (usually stored on computers).
Watch how Omar Khattab from Stanford unpacks this definition:
Question 2:
First of all, queries and documents are often expressed in natural language. So we naturally want to understand a query’s meaning and intent, and a document’s contents and topics, to be able to match queries to documents effectively. This form of understanding is critical, although you can go pretty far on many IR tasks by intelligently matching terms at the lexical level.
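To make "matching terms at the lexical level" concrete, here is a minimal sketch of a TF-IDF-style scorer. It is only an illustration, not how any particular search engine works: the function names (`tokenize`, `rank_by_lexical_match`), the whitespace tokenizer, and the exact weighting formula are all assumptions chosen for brevity.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase + whitespace split; real systems use richer tokenizers.
    return text.lower().split()


def rank_by_lexical_match(query, docs):
    """Return (doc_index, score) pairs sorted by descending lexical match score."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(docs)

    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    results = []
    for idx, tokens in enumerate(tokenized):
        term_freq = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if term in term_freq:
                # Rare terms get a larger weight than common ones.
                idf = math.log(1 + n_docs / doc_freq[term])
                score += term_freq[term] * idf
        results.append((idx, score))

    return sorted(results, key=lambda pair: -pair[1])


docs = [
    "the cat sat on the mat",
    "dogs chase cats in the park",
    "stock markets fell sharply today",
]
ranked = rank_by_lexical_match("cat mat", docs)
print(ranked)
```

Note that the second document scores zero for the query "cat mat" even though it mentions "cats". That gap between exact term matching and meaning is exactly where NLU can help.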
On the other hand, IR can contribute to NLU in three exciting ways:
- IR provides a rich source for creating challenging and realistic NLU tasks, ones where finding information from a large corpus is essential.
- IR offers a powerful tool to make NLU models for existing tasks more accurate and more effective.
- IR can often lend us a nice framework for evaluating NLU systems whenever the output domain is large, as in search, or whenever low latency is important. Both are key characteristics of IR.
Watch how Omar Khattab from Stanford explains this (remember to watch a bit longer to hear the full story!):
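As one possible illustration of the evaluation point above, the sketch below computes two common ranking metrics, mean reciprocal rank (MRR) and Success@k. These are standard IR metrics, but picking them here is my own assumption rather than something taken from the lecture, and the document IDs and function names are made up for the example.

```python
def reciprocal_rank(ranked_ids, relevant_ids):
    # 1 / rank of the first relevant result, or 0.0 if none was retrieved.
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs):
    # runs: list of (ranked_ids, relevant_ids) pairs, one per query.
    return sum(reciprocal_rank(r, rel) for r, rel in runs) / len(runs)


def success_at_k(ranked_ids, relevant_ids, k):
    # True if any relevant document appears in the top k results.
    return any(doc_id in relevant_ids for doc_id in ranked_ids[:k])


# Hypothetical system outputs for three queries.
runs = [
    (["d3", "d1", "d2"], {"d1"}),  # first relevant hit at rank 2
    (["d5", "d4"], {"d5"}),        # first relevant hit at rank 1
    (["d7", "d8"], {"d9"}),        # no relevant hit
]
mrr = mean_reciprocal_rank(runs)
print(mrr)
print(success_at_k(["d3", "d1", "d2"], {"d1"}, k=1))
```

Metrics like these only look at where the correct items land in a ranked list, so they stay usable even when the set of possible outputs is far too large to score exhaustively.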
Happy practicing!
Thanks for reading my newsletter. You can follow me on LinkedIn or Twitter @Angelina_Magr!
Note: There are different angles to answer an interview question. The author of this newsletter does not try to find a reference that answers a question exhaustively. Rather, the author would like to share some quick insights and help the readers to think, practice and do further research as necessary.
Source of quotes/videos: NLU and Information Retrieval | Stanford CS224U Natural Language Understanding | Spring 2021 by Omar Khattab
Source of images/Good reads: Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze, Introduction to Information Retrieval, Cambridge University Press, 2008.