Assignment 2: Short Message Service

Introduction

SMS is a popular mobile service which allows asynchronous communication of at most 160 symbols per message. However, text input is complicated since many mobile devices have only a limited set of keys. The first generation of input systems required users to repeatedly press a single key to input the correct symbol. After a while, the dictionary-based T9 system was introduced and it meant a significant reduction in key presses per message (although it introduced some other problems). In this assignment, you will implement Standard ML functions in the context of SMS input processing.

This assignment should be solved individually.

Part I: Keyboard Optimization

Many mobile keyboards use the keys numbered 2-9 for letter input as is shown below. Button 0 is used for space on this particular device. The standard layout is simply a linear enumeration of the letters. A QWERTY-like layout enumerates the letters in the order found on a QWERTY keyboard. There are many other possible layouts and the final question in this part asks you to come up with an optimal layout (optimal in a specific sense).

Standard Layout:
+-------+-------+-------+
|1:     |2: ABC |3: DEF |
+-------+-------+-------+
|4: GHI |5: JKL |6: MNO |
+-------+-------+-------+
|7:PQRS |8: TUV |9:WXYZ |
+-------+-------+-------+
|       |0: sp  |       |
+-------+-------+-------+
QWERTY-like Layout:
+-------+-------+-------+
|1:     |2: QWE |3: RTY |
+-------+-------+-------+
|4: UIO |5: PAS |6: DFG |
+-------+-------+-------+
|7:HJKL |8: ZXC |9:VBNM |
+-------+-------+-------+
|       |0: sp  |       |
+-------+-------+-------+

From the configuration above we can see that to input the letter L we need 3 key presses ("555"), while the letter T requires a single key press ("8"). Below you find a list of the required functionality. We will test your system using these functions so you must respect the specified restrictions on identifier names and calling sequences. Everything internal to your system is up to you to design.

  1. Write a function to parse a keyboard layout specification and produce an internal layout representation. A specification is represented as a string with vertical bars separating the different keys from one another. The first group is for key 2, the last for key 9. The order in each group represents the order in which the letters are cycled through while repeatedly pressing the key. The standard layout is therefore represented by "ABC|DEF|GHI|JKL|MNO|PQRS|TUV|WXYZ". Note that only keys 2-9 are used for letters and key 0 is always space. Usage:
    val layout = parseLayout(specification);
  2. Write a function to convert a layout back to the specification form. Usage:
    val specification = layoutToSpecification(layout);
  3. Write a function to count the number of key presses required to input a given SMS message. Usage:
    val n = countKeyPresses(layout, "DO YOU WANT PIZZA TONIGHT");
    If layout is the standard layout then n = 47.
  4. Write a function to determine which keyboard layout is best suited (requires the least number of key presses) for the task of writing a given set of SMS messages. The set of messages are represented as a list of strings. Usage:
    val (layout, n) = chooseLayout(layouts, messages);
    The function returns the best layout and the number of key presses required to input all the given messages.
  5. Construct an algorithm to find a better layout (preferably an optimal layout) than the standard one for a given set of SMS messages. Usage:
    val layout = optimizeLayout(messages);

Part II: T9

An input system based on a dictionary of common words and their frequencies can be used to reduce the number of key presses by suggesting the most likely word from the set of possible words. For example, if the user pressed the keys in the following order: "43556" a likely word would be "HELLO". You are given a limited dictionary with frequencies and your task is to implement a function that constructs a minimum key sequence that produces the input message. For example, assume the system suggested "HELLO" already after "435" had been entered. Thus, only three key presses were required instead of the five to input all letters explicitly. Below is a list of the required functionality for this part of the assignment.

  1. Design an internal representation for a dictionary. Write a function to construct such a representation. Usage:
    val dictionary = makeDictionary(layout, wordfreq);
    Your representation must be efficient: the time to lookup the subset of words that share the same key sequence prefix should be proportional to the length of the prefix (or the size of the subset), not the size of the dictionary.
  2. Write a function that outputs the SMS message that is produced by the given key sequence. When the user enters OK, space, or completes the message, the system chooses the exact match with the highest frequency or (if no exact match exists) the longest word with the highest frequency. If this criterion does not result in an unambiguous word in the dictionary the system outputs a single question mark. Some examples (assuming layout represents the standard layout): Usage:
    val message = T9(layout, dictionary, sequence);
  3. Write a function that constructs a minimum key sequence that produces the given SMS message. An example is given below, assuming the dictionary is defined as val dictionary = makeDictionary(layout, [("THERE",1), ("HERO",2), ("HELLO",3)]); and that layout is the standard layout. Usage:
    val sequence = T9Sequence(layout, dictionary, message);

Files

  1. Layout Specifications (download)
  2. Messages (download)
  3. Small T9 Dictionary (download)

Peer Review (Code)

Before handing in the final report and code, we will arrange a peer review session where you get to comment on other students code and receive feedback on your own code. Therefore, there is a separate deadline for handing in code before the peer review session (see the schedule for details). You should evaluate the code and comment on readability, design, compliance with the specification, bugs, etc. Write down your comments and hand over one copy to the authors and one to the teachers during the peer review session.

Below is a checklist of things to consider.

Please note that the peer review session is a compulsory part of the course.

Report

Your report should clearly explain all important aspects of your solution. These include, but are not limited to, the design of internal data representations and overviews of complex algorithms and data flows. Imagine that the reader is an employer who wants to evaluate your communication skills.

Some tips: