1. Discuss an efficient algorithm to implement the OR operator in Boolean retrieval with two inverted lists that are available in sorted form.
2. Show that a dictionary implemented as a hash table with linear probing requires constant expected time for insertions and lookups, provided that the fraction of the table that is full stays bounded away from 1. Derive the expected number of probes per lookup in terms of the fraction of the table that is full.
3. Write a computer program that builds a hash-based dictionary and an inverted index from a document-term matrix.
4. Suppose that the inverted index also contains positional information. Show that the size of the inverted index is proportional to the number of tokens in the corpus.
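
As a starting point for Exercise 1, the following is a minimal sketch of one possible approach: a two-pointer merge of the two sorted lists. It assumes that each inverted list is a Python list of integer document identifiers in ascending order with no duplicates. The function name `union_postings` and the sample lists are hypothetical and chosen only for illustration.

```python
from typing import List


def union_postings(a: List[int], b: List[int]) -> List[int]:
    """Return the sorted union (Boolean OR) of two sorted posting lists.

    Assumption: each input is sorted in ascending order and contains
    no duplicate document identifiers.
    """
    result: List[int] = []
    i = j = 0
    # Advance one pointer per list, always emitting the smaller identifier.
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            # The document contains both terms: emit it once, advance both.
            result.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            result.append(a[i])
            i += 1
        else:
            result.append(b[j])
            j += 1
    # At most one of the lists still has unprocessed entries; append them.
    result.extend(a[i:])
    result.extend(b[j:])
    return result


# Hypothetical posting lists for two query terms.
merged = union_postings([1, 3, 5, 9], [2, 3, 8])
print(merged)
```

Each identifier is visited once, so the running time of this sketch is linear in the combined length of the two lists, and the output is itself sorted and can be merged again with the list of a third term.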