Natural Language Processing (NLP)


What is Natural Language Processing (NLP)?

Natural Language Processing (NLP), not to be confused with Neuro-Linguistic Programming (also NLP!), is a branch of AI that enables computers to process and understand human language in both written and spoken forms. NLP sits at the intersection of AI and computational linguistics and draws on research from both fields.

Legal NLP is the application of NLP methods to the legal domain. Rather than thinking about AI as a mystical being that will magically solve all of your problems, it’s more useful to think about the concrete tasks that approaches like NLP handle really well. If you are familiar with the types of tasks NLP can help with, it will be easier to spot opportunities in your day-to-day work where AI could be useful.

In particular, given enough relevant data, Legal NLP can perform the following tasks really well:

  • Named-Entity Recognition (NER) - In legal circles, this is more commonly known as Extraction. The goal here is for the model to automatically extract an entity name, a phrase of text or even a particular clause. Extraction is useful for going from unstructured data, in the form of contracts, to structured data, in the form of a table of fields. This enables higher-level analytics on the data, which can drive business insights. Extraction is also often used to reduce the manual effort needed to carry out document review.
  • Relation Extraction - Once entities have been extracted, a model can relate relevant entities to one another. For example, within a contract, a business is usually referred to by both its formal name (e.g. Google) and a reference name (e.g. The Company). A relation extraction model can automatically connect these two names as mentions of the same entity, so it understands who The Company refers to. (In NLP research, linking mentions of the same entity in this way is usually called coreference resolution.)
  • Classification - With a large collection of documents, classification models can be used to organise the documents against a pre-existing taxonomy.
  • Clustering - With a large collection of documents, clustering can be used to automatically group similar documents together. Clustering is an unsupervised method, so all that is needed to perform it is a dataset of documents; no labelled examples are required.
  • Semantic Similarity - Given two passages of text, a model can determine how similar or dissimilar they are to one another. The two passages of text could be two clauses or two documents. Semantic similarity is useful for detecting deviations in drafting or automatically surfacing risky positions during drafting and review.
  • Semantic Search - With a large collection of clauses and documents, being able to search through them well is a necessity. Semantic search goes beyond keywords and exact matching. For example, when searching for “criminal offences” with semantic search, “homicide” should also be retrieved in the results since the two are semantically related.
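
To make the first two tasks in the list above more concrete, here is a minimal Python sketch of what going from unstructured contract text to a structured record can look like. It is only an illustration under loud assumptions: the sample clause is invented, the names `CLAUSE`, `ALIAS_PATTERN`, `DATE_PATTERN` and `extract_fields` are hypothetical, and simple hand-written patterns stand in for a trained extraction model, which a real Legal NLP system would use instead.

```python
import re

# Invented sample clause, for illustration only.
CLAUSE = (
    'This Agreement is made on 12 March 2021 between Google LLC '
    '(the "Company") and Jane Smith (the "Consultant").'
)

# Hand-written stand-in for a trained model: matches
#   <Formal Name> (the "<Reference Name>")
ALIAS_PATTERN = re.compile(
    r'(?P<formal>[A-Z][\w&.]*(?: [A-Z][\w&.]*)*) \(the "(?P<alias>[^"]+)"\)'
)

# Matches dates written like "12 March 2021" (one format only).
DATE_PATTERN = re.compile(r'\b\d{1,2} [A-Z][a-z]+ \d{4}\b')


def extract_fields(text):
    """Turn one clause of free text into a small structured record."""
    date_match = DATE_PATTERN.search(text)
    # Each match links a formal name to the reference name used for it.
    parties = [
        {"formal_name": m.group("formal"), "reference_name": m.group("alias")}
        for m in ALIAS_PATTERN.finditer(text)
    ]
    return {
        "agreement_date": date_match.group(0) if date_match else None,
        "parties": parties,
    }


record = extract_fields(CLAUSE)
print(record)
```

Each record produced this way is one row in a table of fields, which is what makes higher-level analytics across many contracts possible. Hand-written patterns like these break as soon as the drafting varies, which is exactly where a model trained on enough relevant data tends to help.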

These are just a few tasks that could be useful in Legal NLP. Stable solutions exist for each of these tasks - the models perform well provided the data is relevant and sufficient. It’s just a matter of spotting these tasks in your day-to-day work and thinking about how NLP might help.
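
As one example of spotting such a task, the deviation check described under Semantic Similarity can be sketched in a few lines of Python. This is a simplified sketch, not a production approach: it scores clauses by word overlap (cosine similarity over word counts), whereas a genuinely semantic model would compare learned representations and could relate phrases that share no words at all. The clauses, the function names and the 0.8 threshold are all assumptions made up for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between the word-count vectors of two texts."""
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)


def flag_deviations(standard, drafts, threshold=0.8):
    """Return the names of draft clauses scoring below the threshold."""
    return [
        name
        for name, text in drafts.items()
        if cosine_similarity(standard, text) < threshold
    ]


# Invented clauses, for illustration only.
STANDARD = "Either party may terminate this agreement on thirty days written notice."
DRAFTS = {
    "clause_a": "Either party may terminate this agreement on thirty days written notice.",
    "clause_b": "The supplier may terminate immediately without notice.",
}

flagged = flag_deviations(STANDARD, DRAFTS)
print(flagged)
```

Here the draft that matches the standard wording passes, while the draft that changes who can terminate and on what notice is flagged for review. In practice the threshold would need tuning on real documents.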

Related Terms