I have introduced Core ML for general usage and the Vision framework for image analysis in earlier articles of this series.

In this one, I will show you natural language processing in iOS 11. The main API is NSLinguisticTagger, which is far more than a tagger. Natural language processing is another specific area where machine learning is applied. NSLinguisticTagger becomes more powerful in iOS 11, which is why I include it in this series even though it is not new.

In code that uses NSLinguisticTagger, we combine different tag schemes and options to analyze text in different ways. A tag scheme describes the kind of result we want from analyzing the text. Tag options control the enumeration, for example which items to omit, such as punctuation and whitespace. I will introduce tag schemes, options, and their combinations in the following examples.

Using NSLinguisticTagger takes three steps. First, we initialize an instance with the tag schemes we need and further define the behavior with options for each scheme.

Second, we feed the tagger with text.

Third, we enumerate the tags and handle each result.
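The three steps can be sketched as follows. This is a minimal example rather than a definitive implementation: the sample sentence and the variable names are my own assumptions, and the tokenType scheme is used only because it is the simplest one to demonstrate.

```swift
import Foundation

// Sample text (an assumption for illustration).
let text = "Apple released iOS 11 in September."

// 1. Initialize the tagger with the schemes we intend to query.
//    The integer `options` parameter of this initializer is passed as 0.
let tagger = NSLinguisticTagger(tagSchemes: [.tokenType], options: 0)

// 2. Feed the tagger with text.
tagger.string = text

// 3. Enumerate the tags over the whole string and handle each result.
//    NSRange counts UTF-16 code units, hence `utf16.count`.
let range = NSRange(location: 0, length: text.utf16.count)
let options: NSLinguisticTagger.Options = [.omitPunctuation, .omitWhitespace]
var tokens: [String] = []

tagger.enumerateTags(in: range, unit: .word, scheme: .tokenType, options: options) { tag, tokenRange, _ in
    let token = (text as NSString).substring(with: tokenRange)
    tokens.append(token)
    print("\(token): \(tag?.rawValue ?? "no tag")")
}
```

In this sketch the options that omit punctuation and whitespace are supplied at enumeration time, so the same tagger can be enumerated again with a different option set.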

Let’s see what NSLinguisticTagger can do with examples. Please copy the code snippets into a playground to get the results.

Language Identification

The scheme here is NSLinguisticTagScheme.language. NSLinguisticTagger analyzes the text to get the dominant language.
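A minimal sketch of language identification might look like this. The helper name `dominantLanguage(of:)` and the sample sentence are hypothetical; the tagger's `dominantLanguage` property is the API introduced in iOS 11.

```swift
import Foundation

// Hypothetical helper: returns the dominant language code of `text`, if any.
func dominantLanguage(of text: String) -> String? {
    let tagger = NSLinguisticTagger(tagSchemes: [.language], options: 0)
    tagger.string = text
    // The result is a language code such as "en".
    return tagger.dominantLanguage
}

let sentence = "Natural language processing helps apps understand what people write."
print(dominantLanguage(of: sentence) ?? "unknown")
```

Longer input generally gives the tagger more evidence to work with, so very short strings may be identified less reliably.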

Tokenization

With the tag scheme NSLinguisticTagScheme.tokenType, we get the type of every token. Punctuation and whitespace are omitted with the omitPunctuation and omitWhitespace options. Please try different options to see different results.
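Here is a sketch of tokenization, assuming a hypothetical helper `tokenTypes(in:options:)` and a made-up sample string. It takes the options as a parameter so the effect of different option sets is easy to compare.

```swift
import Foundation

// Hypothetical helper: returns each token together with its token type.
func tokenTypes(in text: String,
                options: NSLinguisticTagger.Options) -> [(token: String, type: String)] {
    let tagger = NSLinguisticTagger(tagSchemes: [.tokenType], options: 0)
    tagger.string = text
    let range = NSRange(location: 0, length: text.utf16.count)
    var results: [(token: String, type: String)] = []

    tagger.enumerateTags(in: range, unit: .word, scheme: .tokenType, options: options) { tag, tokenRange, _ in
        guard let tag = tag else { return }
        let token = (text as NSString).substring(with: tokenRange)
        results.append((token: token, type: tag.rawValue))
    }
    return results
}

let sample = "Hello, world! This is a playground."

// Words only: punctuation and whitespace tokens are skipped.
for entry in tokenTypes(in: sample, options: [.omitPunctuation, .omitWhitespace]) {
    print("\(entry.token): \(entry.type)")
}

// No options: punctuation and whitespace tokens are reported as well.
for entry in tokenTypes(in: sample, options: []) {
    print("'\(entry.token)': \(entry.type)")
}
```

Comparing the two loops shows what the omit options remove from the enumeration.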

Lemmatization

With the tag scheme NSLinguisticTagScheme.lemma, NSLinguisticTagger gives us the lemma, that is, the base form, of each word token.
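Lemmatization can be sketched along the same lines. The helper `lemmas(in:)` and the sample sentence are assumptions for illustration. The lemma is kept optional because the tagger may not provide one for every token.

```swift
import Foundation

// Hypothetical helper: pairs each word with its lemma, when one is available.
func lemmas(in text: String) -> [(word: String, lemma: String?)] {
    let tagger = NSLinguisticTagger(tagSchemes: [.lemma], options: 0)
    tagger.string = text
    let range = NSRange(location: 0, length: text.utf16.count)
    let options: NSLinguisticTagger.Options = [.omitPunctuation, .omitWhitespace]
    var results: [(word: String, lemma: String?)] = []

    tagger.enumerateTags(in: range, unit: .word, scheme: .lemma, options: options) { tag, tokenRange, _ in
        let word = (text as NSString).substring(with: tokenRange)
        // For the lemma scheme, the tag's raw value is the lemma itself.
        results.append((word: word, lemma: tag?.rawValue))
    }
    return results
}

for entry in lemmas(in: "The children were running through the gardens.") {
    print("\(entry.word) -> \(entry.lemma ?? "(no lemma)")")
}
```

If the output shows no lemmas at all, trying a longer sentence is a reasonable first step, since the tagger needs enough text to work out the language.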

NameType

With the scheme nameType, NSLinguisticTagger helps us identify whether a token is a named entity, such as a personal name or a place name.
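A sketch of named entity recognition follows. The helper `namedEntities(in:)` and the sample sentence are hypothetical. I also add the joinNames option, which is my own choice here, so that a multi-word name is reported as one token.

```swift
import Foundation

// Hypothetical helper: returns the tokens tagged as named entities.
func namedEntities(in text: String) -> [(name: String, type: String)] {
    let tagger = NSLinguisticTagger(tagSchemes: [.nameType], options: 0)
    tagger.string = text
    let range = NSRange(location: 0, length: text.utf16.count)
    let options: NSLinguisticTagger.Options = [.omitPunctuation, .omitWhitespace, .joinNames]
    // Only these tags denote named entities; other tokens are skipped.
    let entityTags: [NSLinguisticTag] = [.personalName, .placeName, .organizationName]
    var results: [(name: String, type: String)] = []

    tagger.enumerateTags(in: range, unit: .word, scheme: .nameType, options: options) { tag, tokenRange, _ in
        guard let tag = tag, entityTags.contains(tag) else { return }
        let name = (text as NSString).substring(with: tokenRange)
        results.append((name: name, type: tag.rawValue))
    }
    return results
}

for entity in namedEntities(in: "Tim Cook introduced the new iPhone at Apple Park in Cupertino.") {
    print("\(entity.name): \(entity.type)")
}
```

Which names are actually recognized depends on the tagger's model, so the results are worth checking against your own text.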

LexicalClass

To get each token’s lexical class (its part of speech, such as noun or verb), we use NSLinguisticTagScheme.lexicalClass.
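Lexical class tagging can be sketched like this, again with a hypothetical helper, `lexicalClasses(in:)`, and a made-up sample sentence.

```swift
import Foundation

// Hypothetical helper: pairs each word with its lexical class.
func lexicalClasses(in text: String) -> [(word: String, lexicalClass: String)] {
    let tagger = NSLinguisticTagger(tagSchemes: [.lexicalClass], options: 0)
    tagger.string = text
    let range = NSRange(location: 0, length: text.utf16.count)
    let options: NSLinguisticTagger.Options = [.omitPunctuation, .omitWhitespace]
    var results: [(word: String, lexicalClass: String)] = []

    tagger.enumerateTags(in: range, unit: .word, scheme: .lexicalClass, options: options) { tag, tokenRange, _ in
        guard let tag = tag else { return }
        let word = (text as NSString).substring(with: tokenRange)
        results.append((word: word, lexicalClass: tag.rawValue))
    }
    return results
}

for entry in lexicalClasses(in: "The quick brown fox jumps over the lazy dog.") {
    print("\(entry.word): \(entry.lexicalClass)")
}
```

Removing the omit options from this sketch makes punctuation and whitespace appear in the output with their own classes.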

That’s all for NSLinguisticTagger. Thanks for your time.