Eurecom Dictionary (DRAFT)
General overview
The objective of this project is to complete the C code of a dictionary management system. This system is built upon two components:
- A file, storing words (in alphabetical order, or not),
- A C program called words, performing operations on the dictionary. This program reads arguments from the command line, including commands to be performed on the dictionary (e.g., looking for a word), and returns an answer in the terminal.
System specification
Overview
A dictionary is first stored on disk in a text file. Let's start with this reference file: words.txt. This text file contains a list of English words, one word per line, sorted in alphabetical order. A program called words, which you have to write, takes as input two references:
- A reference to a "command file" containing a list of commands, one command per line: these commands are to be executed by words. Valid commands are given later.
- A reference to a dictionary file (e.g., to words.txt)
A valid execution of words is:
$ ./words fileOfCommands words.txt
In the file repository (gitlab) that we provide (see below), the dictionary file is in the subdirectory data, words is expected to be generated in bin by the Makefile, and examples of command files are in examples. Thus, in our file repository, a valid execution of words looks like:
$ cd bin&&./words ../examples/commands1 ../data/words.txt
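The invocation forms shown in this document can be handled with a small argument parser. The following is a minimal sketch, not the code provided in the repository: the struct options type and the parse_args function are hypothetical names, and the sketch assumes that the -debug option described in the Commands section, when present, comes before the two file references.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical container for the parsed command line. */
struct options {
    bool debug;                   /* true when -debug was given */
    const char *command_file;     /* first positional argument */
    const char *dictionary_file;  /* second positional argument */
};

/* Returns 0 on success, -1 when the arguments do not match
 * "words [-debug] fileOfCommands dictionaryFile". */
int parse_args(int argc, char **argv, struct options *opts)
{
    assert(opts != NULL);
    int i = 1;                    /* argv[0] is the program name */

    opts->debug = false;
    opts->command_file = NULL;
    opts->dictionary_file = NULL;

    /* Assumption: -debug, if present, is the first argument. */
    if (i < argc && strcmp(argv[i], "-debug") == 0) {
        opts->debug = true;
        i++;
    }
    if (argc - i != 2)
        return -1;                /* wrong number of file references */

    opts->command_file = argv[i];
    opts->dictionary_file = argv[i + 1];
    return 0;
}
```

A main function would typically call parse_args(argc, argv, &opts) first and stop with an error when it returns -1.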
Commands
The commands to implement are given in the following table. We assume that a dictionary named D has been provided as the second argument to words. Do use exactly the specified format for the outputs, since we will check your outputs with our own tests. If you want to print extra information, do use a -debug option that activates extra outputs. A command must return either a number, ok, true, false, or error.
$ ./words -debug fileOfCommands words.txt
| Command | Action on D | Output (if no error) |
| add word1 word2 word3 ... | adds words to D | ok |
| remove word1 word2 ... | removes words from D | ok |
| size | Total number of words in D (int) | |
| advancedsize ab | Total number of words starting with "ab" in D (int) | |
| save filename | saves D in "filename" in alphabetical order | ok |
| search word | prints true if the word was found, false otherwise | |
| advancedsearchv1 w?r?d | prints true if at least one word was found, false otherwise | |
| help | help on commands | |
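Each line of the command file has to be split into a command name and its arguments before it can be executed. The sketch below shows one possible way to do this. The enum command type and the parse_command function are hypothetical names, and the sketch assumes that command names are lowercase and separated from their arguments by spaces or tabs.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical identifiers for the commands of the table. */
enum command {
    CMD_ADD, CMD_REMOVE, CMD_SIZE, CMD_ADVANCEDSIZE, CMD_SAVE,
    CMD_SEARCH, CMD_ADVANCEDSEARCHV1, CMD_HELP, CMD_UNKNOWN
};

/* Splits `line` in place: the command name stays at the start of `line`,
 * and *args points to the rest of the line (possibly an empty string). */
enum command parse_command(char *line, char **args)
{
    /* Same order as the enum above, so the index is the command id. */
    static const char *const names[] = {
        "add", "remove", "size", "advancedsize", "save",
        "search", "advancedsearchv1", "help"
    };
    assert(line != NULL && args != NULL);

    line[strcspn(line, "\r\n")] = '\0';   /* drop the end-of-line marker */
    size_t name_len = strcspn(line, " \t");
    *args = line + name_len;
    if (**args != '\0') {
        **args = '\0';                    /* terminate the command name */
        (*args)++;
        *args += strspn(*args, " \t");    /* skip extra separators */
    }
    for (size_t i = 0; i < sizeof names / sizeof names[0]; i++)
        if (strcmp(line, names[i]) == 0)
            return (enum command)i;
    return CMD_UNKNOWN;
}
```

A switch on the returned value can then call the function implementing each command, and CMD_UNKNOWN can be reported as an error.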
For each command that could not be executed because of an error, "error" is output to the terminal. It may optionally be followed by a message, e.g., "error: memory allocation problem". Any other error or information message you may decide to use in your program must be sent to stderr (and not to stdout).
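The separation between command results and other messages could be implemented with a few helpers such as the ones sketched below. All function names are hypothetical. The sketch assumes that the "error" result of a command goes to stdout, like the other command outputs, and that debug messages go to stderr.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Builds "error" or "error: <message>" in `buf`; `message` may be NULL. */
void format_error(char *buf, size_t size, const char *message)
{
    assert(buf != NULL && size > 0);
    if (message != NULL && strlen(message) > 0)
        snprintf(buf, size, "error: %s", message);
    else
        snprintf(buf, size, "error");
}

/* Reports a failed command. Assumption: command results are read on stdout. */
void report_command_error(const char *message)
{
    char line[256];
    format_error(line, sizeof line, message);
    puts(line);
}

/* Writes an informational message to stderr, and only in debug mode,
 * so that stdout contains nothing but the expected command outputs. */
void debug_log(int debug_enabled, const char *message)
{
    if (debug_enabled)
        fprintf(stderr, "debug: %s\n", message);
}
```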
Implementation
Valid characters for words
All valid characters are the ones present in words.txt. This includes:
- Uppercase and lowercase letters
- Digits 0-9
- The punctuation characters ! & ' , - . /
$ man ascii
You can easily find this list of valid characters by running this bash command on the words.txt file:
$ grep -o . data/words.txt | sort -u
Or, another way to do it:
$ cat data/words.txt | fold -w1 | sort | uniq
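A word given to a command such as add can be checked against this character set before it is inserted. The following is a minimal sketch with hypothetical function names. It assumes an ASCII execution environment, and rejecting the empty word is a choice of this sketch, not a requirement stated above.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* True for ASCII letters, digits and the punctuation ! & ' , - . / */
bool is_valid_char(char c)
{
    if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
        (c >= '0' && c <= '9'))
        return true;
    /* strchr also matches the terminating '\0', so exclude it explicitly. */
    return c != '\0' && strchr("!&',-./", c) != NULL;
}

/* True when `word` is non-empty and made only of valid characters. */
bool is_valid_word(const char *word)
{
    assert(word != NULL);
    if (word[0] == '\0')
        return false;             /* assumption: an empty word is rejected */
    for (const char *p = word; *p != '\0'; p++)
        if (!is_valid_char(*p))
            return false;
    return true;
}
```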
Compilation and execution
Your implementation must compile and run on the Linux PCs in the laboratory rooms: these PCs will be used to evaluate your work. So, even if the project compiles on your own computer, be sure that it also compiles on Eurecom PCs. We will evaluate the compilation of your code, and then the execution of your program using the tests we provide with the project. The performance of your implementation is very important as well. Thus, for the provided examples, we will evaluate how much time the execution of your program takes. Using the following command, we can compute how much time your program takes to execute the commands given in the commands1 file:
$ cd bin&&time ./words ../examples/commands1 ../data/words.txt
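Since execution time is evaluated, the choice of data structure matters. One possible approach, among others such as hash tables or tries, is to keep the words in a sorted array so that search runs in logarithmic time. The sketch below uses the standard qsort and bsearch functions. The function names are hypothetical, and the order used is the byte order of strcmp, which may differ from the order of words.txt (for instance with respect to uppercase letters). This is why the array is sorted again after loading.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Comparator for arrays of `const char *`, usable by qsort and bsearch. */
static int compare_words(const void *a, const void *b)
{
    const char *const *wa = a;
    const char *const *wb = b;
    return strcmp(*wa, *wb);
}

/* Sorts the array in strcmp order (needed once after loading). */
void sort_words(const char **words, size_t count)
{
    assert(words != NULL);
    qsort(words, count, sizeof words[0], compare_words);
}

/* Binary search in an array previously sorted with sort_words. */
bool contains_word(const char **sorted_words, size_t count, const char *word)
{
    assert(sorted_words != NULL && word != NULL);
    return bsearch(&word, sorted_words, count, sizeof sorted_words[0],
                   compare_words) != NULL;
}
```

With this layout, add and remove have to keep the array sorted, which costs a linear-time shift per word. Whether this trade-off is acceptable depends on the command files used for the evaluation.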
We have prepared a C environment (Makefile, libraries, include) for the project in the following git repository: https://gitlab.eurecom.fr/ludovic.apvrille/basicos_project_fall2025_forstudents. Clone it as follows:
$ git clone git@gitlab.eurecom.fr:ludovic.apvrille/basicos_project_fall2025_forstudents.git
Once cloned, have a look at the README.md file at the root of this git repository.
Bonus
- advancedsearchv2. Extend the advancedsearchv1 command as follows. When no word is found, it still returns false. But now, when it returns true, you should print on the same line all identified words, separated by a space. There must also be a space between true and the first word.
- advancedsearchv3 command: this new command supports "?" and "*" when looking for words. "*" matches zero or more characters. For instance: "h*l?" should match "hello" and "hell" (and probably many others).
- Implementing a command to merge a referenced dictionary file with the opened dictionary.
- Automatically compressing/uncompressing dictionary files.
| Command | Action on D | Output (if no error) |
| advancedsearchv2 w?r?d | true + list of words if at least one word was found, false otherwise | |
| advancedsearchv3 w*r?d | true + list of words if at least one word was found, false otherwise | |
| merge refToFile | adds all words of refToFile to D | ok |
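The pattern matching needed by the advancedsearch commands can be sketched as follows. The function name wildcard_match is hypothetical, and the sketch assumes that "?" matches exactly one character, which is consistent with "h*l?" matching both "hello" and "hell". It is one classical iterative technique with backtracking on the last "*", not necessarily the expected solution.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* True when `word` matches `pattern`, where '?' stands for exactly one
 * character (assumption) and '*' for zero or more characters. */
bool wildcard_match(const char *pattern, const char *word)
{
    assert(pattern != NULL && word != NULL);
    const char *star = NULL;    /* position of the last '*' seen in pattern */
    const char *retry = NULL;   /* start of the text absorbed by that '*' */

    while (*word != '\0') {
        if (*pattern == '*') {
            star = pattern++;
            retry = word;               /* first try: '*' matches nothing */
        } else if (*pattern == '?' || *pattern == *word) {
            pattern++;
            word++;
        } else if (star != NULL) {
            pattern = star + 1;         /* backtrack: '*' absorbs one more */
            word = ++retry;
        } else {
            return false;
        }
    }
    while (*pattern == '*')             /* trailing '*' match the empty string */
        pattern++;
    return *pattern == '\0';
}
```

Patterns without "*" are handled by the same function, so it could serve advancedsearchv1 and advancedsearchv2 as well. Applying it to every word of D gives a linear scan per query.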
Deadlines, deliverables and organization
Program the specified system, and provide the source files in C, the Makefile and one report in markdown format, by Dec. 11th, 2025, 18h00 CET. Advice on how to write a report is given here. Everything should be available in the gitlab of your project. No commit after this date will be considered. If your code does not compile, don't expect to have a grade over 5. In your gitlab, commit only source files, never commit what can be obtained by compilation.
All members of the group must work on the general definition of the dictionary handling (loading in a data structure), and the implementation of the core functions (command analysis). Then, the implementation of commands must be split between the members of the group, as follows:
- Member #1 (leader): data structure management and size
- Member #2: remove and advancedsize
- Member #3: add and save
- Member #4: search and advancedsearchv1
Grading
10 points are given to the whole group, 10 points are given to each student.
- 5 points for the report. The report covers the work of the whole group, but must emphasize the work done by each group member.
- 5 points for the global approach: global algorithm, common code, regular progression on the project.
- 10 individual points for personal implementation. This grade is individual. If your implementation in C is not ready by the deadline, provide the corresponding algorithm in pseudocode: you can get up to five points with pseudocode.
How to proceed?
Making groups
- Form 17 groups of 4 students and one group of 3 students. Once your group has been established, I would like the designated group leader to send me an email providing the full names and email addresses of all the group members. In the subject line of your email, please include the phrase "[BasicOS] New group". Upon receiving your email, I will confirm your group's formation and assign you a group number.
- Set up a gitlab project (use gitlab.eurecom.fr for this) called EXACTLY basicos2025-teamXX (with XX your group number; for group 1: basicos2025-team01) and send me the link to your gitlab: I will follow your progress that way. Do allow me, as well as Sophie Coudert and Yahya Frioui, to clone your gitlab project.
How to work?
- First session: understand the repository we provide (.c and .h files, and the Makefile). Clearly define the general structure of your project.
- First and second sessions: program together the main framework: handling of arguments, handling of main commands, handling of errors.
- Third session: individually program the commands you are meant to implement. Make individual tests.
- Fourth session: integrate all commands together, and check that all the tests you provide run: a part of your grade is based on this. Improve program performance. Provide at least 5 more tests with more than 10 commands each.
- Your git must be ready (=STRICT DEADLINE) on the 11th of December, 18h00, Paris time.

