What makes BERT unique? In short, BERT provides “context”: it represents a word differently depending on the sentence around it, unlike static word embeddings such as Word2Vec and GloVe. Previously, I introduced BERT in Say Hello to BERT! and What Problems Will BERT Aim to Solve?
BERT stands for Bidirectional Encoder Representations from Transformers. To show how it improves natural language understanding, I will break the name down for you.
B is for Bi-directional
- What did most previous language models have in common? They read text in one direction only, either left to right or right to left. To work out the context of a word, they could use the words on just one side of it.
- BERT, however, uses a bi-directional approach: it looks at the words on both sides of a word at the same time. Now, BERT can read a sentence as a whole.
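To make the difference concrete, here is a minimal sketch in plain Python. It is not BERT itself; the function names `left_to_right_context` and `bidirectional_context` are made up for this illustration. It only shows which words each kind of model is allowed to look at when it interprets the word “bank”.

```python
def left_to_right_context(tokens, index):
    """Words a one-directional (left-to-right) model can see for tokens[index]."""
    return tokens[:index]


def bidirectional_context(tokens, index):
    """Words a bi-directional model can see: every word except the target itself."""
    return tokens[:index] + tokens[index + 1:]


sentence = "he went to the bank to deposit money".split()
target = sentence.index("bank")

print(left_to_right_context(sentence, target))
# ['he', 'went', 'to', 'the']
print(bidirectional_context(sentence, target))
# ['he', 'went', 'to', 'the', 'to', 'deposit', 'money']
```

The clue that this “bank” is a financial one (“deposit money”) sits to the right of the word, so only the bi-directional view gets to use it.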
ER means Encoder Representations
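- What goes in comes out as a representation. BERT is an encoder: whatever text you put in, BERT turns each word into a numerical representation that captures its meaning in context. BERT does not decode or generate text on its own; other systems, such as a search engine, use those representations to solve the actual task.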
T stands for Transformers
- Lastly, BERT is built on the Transformer architecture and trained as a masked language model (MLM). Before, it was hard for a machine to learn the context of a sentence, and previous models struggled with natural language understanding.
- For example, search engines could easily mix up the pronouns used. Who is “he”, “she”, “they” or “we” in this sentence? Machines could easily get confused about the meaning, especially when multiple pronouns are involved.
- This is where the “transformers” come in handy. A Transformer relates every word in a sentence to every other word, which helps search engines keep track of pronouns and meanings, whether they are directly stated or implied. This way, BERT can better tell who is the sender and who is the receiver in a sentence, and understand the context as well.
- Masked language modeling is how BERT learns to do this, and it stops search engines from taking words too literally. During training, some words are hidden behind a “mask”, and BERT has to read between the lines, using the words on both sides, to guess the missing word.
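The masking step described above can be sketched in a few lines of Python. This is a simplified illustration, not BERT's actual training code: the names `mask_tokens`, `mask_rate` and `seed` are hypothetical, the sentence is split on spaces rather than into BERT's subword pieces, and every chosen word is simply replaced with `[MASK]` (the original BERT recipe masks about 15% of tokens but sometimes swaps in a random word or leaves the word unchanged instead).

```python
import random

MASK = "[MASK]"


def mask_tokens(tokens, mask_rate=0.15, seed=0):
    """Hide a random subset of tokens and remember what was hidden.

    Returns the masked sentence and a dict {position: original word}.
    Simplification: every chosen token becomes [MASK].
    """
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    n_to_mask = max(1, round(len(tokens) * mask_rate))
    positions = sorted(rng.sample(range(len(tokens)), n_to_mask))

    masked = list(tokens)
    answers = {}
    for pos in positions:
        answers[pos] = masked[pos]  # the word the model would have to guess
        masked[pos] = MASK
    return masked, answers


sentence = "the man went to the store to buy a gallon of milk".split()
masked, answers = mask_tokens(sentence)

print(" ".join(masked))  # the sentence with some words hidden
print(answers)           # the hidden words, keyed by position
```

During training, the model sees only the masked sentence and is scored on how well it recovers the words in `answers`, which forces it to rely on the surrounding context in both directions.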
BERT in the World of Google Search
Better understanding of human language
- BERT is there to decipher the nuances of how people speak, how they type, and how they communicate in general. When Google understands human language on a deeper level, it is a huge breakthrough for online search.
Seeks to improve conversational search
- As I said earlier, it is NOT just about how humans type their queries. It is also about how humans phrase their queries in voice search.
Expand SEO to an international level
- BERT was pre-trained on English text, including roughly 2,500 million words from English Wikipedia. In Search, BERT may be mono-linguistic now… but who knows in the future?
- BERT could learn multiple languages and carry what it learns from English over to them. Yes, multilingual BERT is a real possibility: Google has already published a multilingual BERT model trained on Wikipedia text in roughly 100 languages, even if at this point Search can’t yet use BERT to fully understand every major language. Eventually, it will come.
Solves ambiguous and confusing questions
- Currently, people complain about how search results are ranked. With BERT, the hope is that ambiguity will be reduced. Google can now better understand words in context and is better equipped to resolve ambiguous queries.