Regex

What is Regex?

In 1951, the American mathematician Stephen Cole Kleene, a contemporary of Alan Turing, introduced regular expressions as an algebraic notation for describing a computational model of the human nervous system.

Building on Kleene’s work, Ken Thompson, a computer scientist and Unix pioneer, implemented regular expressions in a text editor in 1968 so that users could match patterns in text files. From there, regex was widely adopted across computing. More than 50 years later, it still quietly serves many applications. More recently, with the rise in popularity of AI and NLP, regex has helped bootstrap applications that do not yet have enough data to train machine learning models.

A regular expression, or regex, is a sequence of characters that specifies a search pattern. Regex has its own distinct vocabulary and syntax; in effect, it is a language in its own right for encoding patterns in text. Regex patterns can be used to search passages of text and extract information from them.
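To make the idea of a search pattern concrete, here is a minimal sketch using Python's built-in re module. The sample sentence and the date format it looks for are assumptions chosen purely for illustration; they are not taken from any particular application.

```python
import re

# Hypothetical sample text, invented for this illustration.
text = "The agreement was signed on 2021-03-15 and amended on 2022-11-01."

# The pattern reads: four digits, a hyphen, two digits, a hyphen, two digits.
# \b marks a word boundary so the pattern does not match inside a longer number.
# The parentheses create groups so year, month, and day can be pulled out separately.
date_pattern = re.compile(r"\b(\d{4})-(\d{2})-(\d{2})\b")

# Search: locate the first place in the text where the pattern occurs.
first_match = date_pattern.search(text)
first_date = first_match.group(0) if first_match else None

# Extract: collect every occurrence, plus the captured parts of each one.
all_dates = [m.group(0) for m in date_pattern.finditer(text)]
date_parts = [m.groups() for m in date_pattern.finditer(text)]

print(first_date)
print(all_dates)
print(date_parts)
```

The pattern string is the "sequence of characters" described above: symbols such as \d, {4}, and \b belong to the regex vocabulary, while the hyphens are matched literally.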

Regex remains a powerful tool for matching patterns in text, and even with the rise of NLP it still has a role to play, especially in the legal space. Legal documents contain many well-structured elements that follow common patterns, such as term definitions, legislation citations, and statute codes. These are all good candidates for capture and extraction with regex.
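As a rough sketch of how this might look in practice, the Python example below extracts a defined term and two statute citations from a short passage. The passage, the assumed definition style ("Term" means ...), and the assumed citation format (title, "U.S.C.", section sign, section number) are illustrative simplifications; real legal text varies far more, and production patterns would need to be broader.

```python
import re

# Hypothetical passage, written for this illustration only.
text = (
    '"Confidential Information" means any non-public data disclosed by either party. '
    "Unauthorized access may violate 18 U.S.C. § 1030 and 17 U.S.C. § 506."
)

# Assumed definition style: a quoted term followed by the word "means".
# The group captures the text between the double quotes.
definition_pattern = re.compile(r'"([^"]+)"\s+means\b')

# Assumed citation style: <title> U.S.C. § <section>, with an optional
# trailing lowercase letter on the section number.
citation_pattern = re.compile(r"\b(\d+)\s+U\.S\.C\.\s+§\s*(\d+[a-z]?)")

# findall returns the captured groups for every match.
defined_terms = definition_pattern.findall(text)
citations = citation_pattern.findall(text)

print(defined_terms)
print(citations)
```

Each citation comes back as a (title, section) pair, which makes it straightforward to normalize or look up the references afterwards.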