NRP : 08.04.111.00043
Exercise 1.1
1. Collect the documents to be indexed:

2. Tokenize the text, turning each document into a list of tokens:

3. Do linguistic preprocessing, producing a list of normalized tokens, which
are the indexing terms:
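
The three steps above can be sketched in a few lines of Python. This is only a minimal illustration, not the method required by the exercise: the two toy documents, the helper names tokenize and normalize, and the choice of lowercasing as the only normalization are all assumptions made for this example.

```python
import re

def tokenize(text):
    """Step 2: split one document into a list of raw tokens.
    Assumption: a token is a run of letters or digits."""
    return re.findall(r"[A-Za-z0-9]+", text)

def normalize(tokens):
    """Step 3: produce normalized tokens (the indexing terms).
    Assumption: lowercasing stands in for fuller linguistic preprocessing."""
    return [t.lower() for t in tokens]

# Step 1: collect the documents to be indexed (hypothetical toy collection).
docs = {
    1: "The quick Brown fox",
    2: "Brown foxes run quickly",
}

# Apply steps 2 and 3 to every document, keyed by document ID.
terms = {doc_id: normalize(tokenize(text)) for doc_id, text in docs.items()}

for doc_id, term_list in terms.items():
    print(doc_id, term_list)
```

Lowercasing alone leaves "fox" and "foxes" as different terms; a stemmer or lemmatizer would be needed to merge them.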

Exercise 1.7
1. tangerine 46653 OR trees 316812 = (1234568) AND
marmalade 107913 OR skies 271658 = (012356789) AND
kaleidoscope 87009 OR eyes 213312 = (0123789)
2. (1234568) AND (012356789) AND (0123789)
3. (1234568) AND (012356789) = (123568), then
(123568) AND (0123789) = (1238)
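
The arithmetic above can be checked with a short Python sketch. It assumes the reading used in this answer, namely that each number is treated as a set of its digits, with OR as set union and AND as set intersection. The helper names digits and show are made up for this example.

```python
def digits(number):
    """Treat a number string as the set of its distinct digits (assumed reading)."""
    return set(number)

def show(digit_set):
    """Render a digit set as a sorted string, matching the notation above."""
    return "".join(sorted(digit_set))

# Step 1: evaluate each OR group as a union.
g1 = digits("46653") | digits("316812")    # tangerine OR trees
g2 = digits("107913") | digits("271658")   # marmalade OR skies
g3 = digits("87009") | digits("213312")    # kaleidoscope OR eyes

# Steps 2 and 3: intersect the groups from left to right.
partial = g1 & g2
result = partial & g3

print(show(g1), show(g2), show(g3))
print(show(partial), show(result))
```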