Project 1: Introduction and Bootloading

Due: Tuesday, October 2, 11:59:59 p.m.


Intermediate Deadline: Before class, Wed, Sept 26, you should be able to execute bochs and display one character in the upper left corner.

Objective

This project will familiarize you with the boot process and also with the tools and the emulator that you will use in subsequent projects. The end product for this project will be (1) a very small kernel that will print out "Hello World" to the screen and hang up (i.e., enter an infinite loop) and (2) a shell script for creating & running a bootable disk image containing that kernel.

Background

When a computer is turned on, it goes through a process known as booting. The computer starts executing a small program, known as the bootstrap program, which is contained in the BIOS (basic input/output system), which comes with the computer and is stored in ROM. The BIOS bootstrap program reads the first sector of the boot disk and loads it into memory. The first sector of the boot disk will contain a small program called the bootloader. After loading the bootloader into memory, the BIOS performs a jump to the address where it placed the bootloader, which starts the bootloader program executing. When executed, the bootloader loads and executes the operating system kernel--a larger program that comprises the bulk of the operating system.

Tools

You will use the following tools to complete this and subsequent projects:

bochs & bochs-x - An x86 processor emulator with X GUI (pronounced "box")
bcc (Bruce Evans's C Compiler) - A 16-bit C compiler
as86 and ld86 - A 16-bit x86 assembler and linker
gcc/g++ - The standard GNU C/C++ compilers (32- or 64-bit output)
nasm - The Netwide Assembler
hexedit/hexdump - Utilities that allow you to edit/view a file in hexadecimal byte-by-byte
dd - A standard low-level copying utility
A simple text-based editor of your choice (e.g., atom, jedit, gedit, vim, xemacs)

All of these tools are available for free for Mac and Linux (but not for Windows). They may already be installed (or part of the project). If not, you are welcome to download and install them on your own Mac/Linux machines. However, I cannot provide any significant support for the download and installation of the tools on your personal machines.

Step 0. Set up

You can develop either on a lab machine or on your own machine using a VM.

We have created a VM image with everything set up and ready to go. VirtualBox is recommended for running the virtual machine. It is already installed if you're using a lab machine. If you are asked to upgrade Ubuntu, say no.

What is a Virtual Machine?

A virtual machine has all the same capabilities as a desktop or laptop computer, but runs virtually inside another host operating system. For example, I have a Mac laptop that can run Ubuntu or Windows inside a virtual machine. This is a way to run multiple OSes on the same machine without the hassle of dual-booting. The downside is in performance: now your computer has to run two operating systems at the same time! If you find your VM is too slow, adjust the settings for memory and video memory. If your host machine is becoming too slow, be sure to close the VM when you're not working to free up resources.

Our VM

If you're on your own machine, download the VM. If you're on the lab machines, you need to use one of the machines in the advanced lab and use /csdept/courses/cs330/handouts/csci330.ova. Import the appliance into VirtualBox. If you're having issues with Windows, check out this page.

A screenshot of the VM running is on the right. We're running Ubuntu 18. The username is wluos, the password is OS@wlu. (Having the username and password does not get you any special privileges on the Linux lab machines.)

A number of text editors are available: emacs, vim, jedit, and atom. Figure out your favorite. (atom seems slow to start but then is okay.)

Want to install something not in the VM image? Use the sudo apt install command or Ubuntu's software manager (in the dock).

Using Git

We will use GitHub Classroom to distribute starter code for Project1. Get comfortable with git on the command line, since we won't be using an IDE. Learn Git Branching may help -- you probably just need Main: Introduction Sequence and Remote: Push & Pull -- Git Remotes

Git is a powerful tool, and there is a lot that you can do. However, since you are using this repository just to share code between you and me, you should be able to focus just on a simple typical workflow and be fine.

You will need to create a GitHub account if you don't already have one. (This should be a free account because it's for education use.)

Set up to use git in the VM

Set up git to know who you are by running the commands below. (Update the information in the commands.)
git config --global user.email "you@example.com"
git config --global user.name "Your Name"

By default, atom is used as the editor for git commits. If you want to change the editor that is used for git, see these examples.

Cloning the git project

Go to GitHub Classroom to get to the assignment and create your own repository.

Copy your repository URL.

In your VM, in a terminal window, run the command:
git clone <your repository URL>

If you like the atom text editor, it has built-in git integration.

Step 1. Getting Started with the Project

In the project1_username directory, you will find the following files:

bootload.asm - assembly code for the boot loader.
kernel.asm - assembly language routines you will use in your kernel.
kernel.c - a simple kernel that displays a white A in the upper left-hand corner
opsys.bxrc - bochs configuration file.
test.img - bootable 1.44MB floppy image for testing bochs.

Booting bochs

In a terminal window,

Change into your project1 directory
Copy test.img into a file called floppya.img
Type the command: bochs -f opsys.bxrc -q

This command will run bochs using the opsys.bxrc configuration file. The configuration file tells bochs information like what drives, peripherals, video, and memory the simulated computer has. It also tells bochs where to boot from, in this case from the file floppya.img. The floppya.img file is a 1.44MB floppy disk image, which is a file that contains exactly the number of bytes that can be stored on a 1.44MB floppy disk. This particular image contains the assembled boot loader and a small kernel that simply prints out "Bochs works!".

If all goes according to plan, when bochs boots from floppya.img, the bootloader will load and execute this small kernel, and you will see the message "Bochs works!" appear in the bochs window. If you do not see the expected results, repeat the above steps, double-checking your work.

Stop the emulator by pressing the power button.

The Bootloader

We will be booting bochs from a 1.44MB floppy disk image. A 1.44MB floppy disk has 2880 sectors of 512 bytes. Thus, the bootloader is required to be exactly 512 bytes long (one sector) and be loaded into sector 0 of the boot disk image. In addition, the last two bytes of sector 0 must be 0x55 followed by 0xAA, which indicates to the BIOS that the disk is a boot disk. Since not much can be done with a 510-byte program, the purpose of the bootloader is to load the larger operating system from the disk to memory and start it running.

Since a bootloader has to be very small and handle such operations as setting up registers, it does not make sense to write it in any language other than assembly. Consequently, you are not required to write a bootloader in this project; one is supplied to you in the file bootload.asm. You will, however, need to assemble it and install it into sector 0 of your boot disk image.

I encourage you to open the bootload.asm file in a text editor and study its contents. It is a very small program that does three things:

The bootloader sets up the segment registers and the stack to memory address 0x10000. This is where it puts the kernel in memory.
It reads 10 sectors (5120 bytes) from the disk starting at sector 3 and puts them at address 0x10000. This would be fairly complicated if it had to talk to the disk driver directly, but fortunately the BIOS already has a disk read function. The disk read function is accessed by putting the parameters into various registers and calling Interrupt 13 (hex). After the interrupt, the program at sectors 3-12 is now in memory at address 0x10000.
It jumps to 0x10000, starting whatever program it just placed there. In this project that program will be a small kernel that prints "Hello World." In future projects, it will be your actual OS kernel.
Notice that, after the jump instruction, the assembly file pads the remaining bytes of the sector with 0 and then sets the last two bytes to 0x55 followed by 0xAA, telling the BIOS that this is a valid boot sector.

NOTE: the assembly code contains the number 0xAA55. The Intel architecture stores data in little endian form, i.e., the least significant bytes are stored at lower addresses, which means the hex digits will be stored from right to left as we go from lower addresses to higher addresses.
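The byte order described in the note can be sketched in C. This is a host-side illustration only (compiled with gcc, not bcc); store_le16 is a hypothetical helper name and is not part of the project files.

```c
#include <stdint.h>

/* Stores a 16-bit value the way x86 does: the least significant
 * byte goes to the lower address, the most significant byte to the
 * higher address. store_le16 is a hypothetical name for this sketch. */
void store_le16(uint16_t value, unsigned char out[2])
{
    out[0] = (unsigned char)(value & 0xFF); /* low byte first   */
    out[1] = (unsigned char)(value >> 8);   /* high byte second */
}
```

Calling store_le16(0xAA55, bytes) leaves 0x55 in bytes[0] and 0xAA in bytes[1], which matches the order the BIOS expects at the end of sector 0.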

Assembling the Bootloader

To install the bootloader, you first have to assemble it. The bootloader is written in x86 assembly language understandable by the nasm assembler. To assemble it, use the command:

nasm bootload.asm

The nasm assembler generates the output file bootload, which contains the actual machine language program that is understandable by the computer.

You can look at the bootload file with the hexdump utility. (Alternatively, you could run hexdump with the -x option, which will display in two-byte hexadecimal.) You will see a few lines of numbers, which are the machine language instructions in hexadecimal. Below that you will see a lot of 00s. Near the end, you will see the magic number 55 AA indicating that it is a boot sector.

Disk Images

We will use the unix dd utility to create disk images. The first thing we'll do is create a disk image filled with all 0's. To do this, use the command:

dd if=/dev/zero of=floppya.img bs=512 count=2880

The above command will copy count=2880 sectors of bs=512 bytes/sector from the input file if=/dev/zero and put it in the output file of=floppya.img. 2880 is the number of sectors on a 1.44MB 3.5" floppy, and /dev/zero is a Unix special file that contains only zeros. You should end up with a 1.44MB file named floppya.img that is filled with zeros.

We will also use the dd utility to copy the bootload program to sector 0 of the floppya.img disk image. To do this, use the command:

dd if=bootload of=floppya.img bs=512 count=1 conv=notrunc seek=0

The additional parameters to dd indicate that we skip seek=0 sectors of the output file before writing and that we do not truncate the image after writing (conv=notrunc). If you look at floppya.img now with hexdump, the contents of bootload are contained in the first 512 bytes of floppya.img and the rest of the file is filled with 0's.
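The arithmetic behind the bs, count, and seek parameters can be written out as a small host-side C sketch. The function names here are hypothetical and exist only for illustration; the constants come from the dd commands above.

```c
#define SECTOR_SIZE  512L   /* bytes per sector (bs=512)            */
#define SECTOR_COUNT 2880L  /* sectors in the image (count=2880)    */

/* Total size of the floppy image in bytes. */
long image_size_bytes(void)
{
    return SECTOR_SIZE * SECTOR_COUNT;
}

/* Byte offset in the output file at which dd starts writing
 * when invoked with seek=seek_sectors and bs=512. */
long seek_offset_bytes(long seek_sectors)
{
    return seek_sectors * SECTOR_SIZE;
}

/* Last sector covered when `count` sectors are read starting
 * at sector `first`. */
long last_sector(long first, long count)
{
    return first + count - 1;
}
```

For example, seek=0 writes at byte 0 of the image, while seek=3 would write at byte 1536, and a 10-sector read starting at sector 3 ends at sector 12.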

If you want, you can try booting bochs using floppya.img. However, nothing meaningful will happen because the bootloader in sector 0 will just load sectors 3-12, which contain all 0's, and then attempt to run them.

In the next part of the assignment, you'll write your "Hello World" program and put it into sector 3 of floppya.img so that it runs when bochs is booted.

Step 2. A Hello World Kernel

For this project, the kernel should contain a main function that simply prints out “Hello” in white letters on a black background at the top left corner of the screen and then enters an infinite while loop. You will write your program in C and save its source code in a file named kernel.c.

When writing C programs for an existing operating system such as Mac OS, Linux or Windows, you can use functions such as printf or putchar that display text on the screen (similar to System.out.println in Java). When these functions are compiled, they use a system call to the operating system, which ultimately handles the display of the characters. But, since we don't have an OS yet (you haven't written it!), we can't have system calls. Thus, there is no way to compile calls to functions such as printf and putchar that rely on system calls for their operation.

Since you cannot use the printf or putchar functions, you will have to write the characters you want to display directly to video memory. Video memory starts at memory address 0xB8000. Every pair of bytes in video memory corresponds to one character position on the screen. In text mode, the screen is organized as 25 lines with 80 characters per line. Each character takes up two bytes of video memory:

the first byte is the ASCII code for the character
the second byte tells what color to use to draw the character

The memory is organized line-by-line.

Thus, to draw a white letter 'A' on a black background at the beginning of the third line down, you would do the following:

Compute the address relative to the beginning of video memory:
Since one line is 80 characters long, the beginning of the third line down would be 80*(3-1) = 160
Multiply that relative location by 2 bytes per character: 160*2 = 320
Convert that to hexadecimal (e.g., using a converter): 320 = 0x140
Note: your implementation will not literally have to do this.
Add that to the starting address of video memory (0xB8000) to get the memory address: 0xB8000+0x140 = 0xB8140
Write 0x41, the ASCII code for the letter 'A', to address 0xB8140.
ASCII: 65, binary: 0100 0001, hex: 0x41
Write the color white (0x0F) to address 0xB8141
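The address computation in the steps above can be sketched as a C function. This is a host-side illustration under the assumption that rows and columns are counted from 0 (so the third line down is row 2); video_address and the constant names are hypothetical, not part of the supplied files.

```c
#define VIDEO_BASE     0xB8000L /* start of text-mode video memory */
#define SCREEN_COLS    80       /* characters per line             */
#define BYTES_PER_CELL 2        /* character byte + color byte     */

/* Returns the address of the character byte for the cell at
 * (row, col), both 0-based (an assumption of this sketch).
 * The color byte lives at the returned address + 1. */
long video_address(int row, int col)
{
    long cell = (long)row * SCREEN_COLS + col; /* cells before this one */
    return VIDEO_BASE + cell * BYTES_PER_CELL;
}
```

With these assumptions, video_address(2, 0) yields 0xB8140, the same address computed by hand above.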

See the A on the third line, on the left, in the image at right?

The 16-bit C compiler that we are using provides no built-in mechanism for writing bytes directly to memory. To allow you to write bytes to memory from your kernel, you are provided with an assembly file kernel.asm that contains the function putInMemory with the following signature:

void putInMemory(int segment, int offset, char b)

segment - the most significant hex digit of the address times 0x1000.
offset - The four least significant hex digits of the address.
b - the byte to be written (e.g., the ASCII code of a character, or a color code).

For example, to write the character 'A' to address 0xB8140, you could call:
putInMemory(0xB000, 0x8140, 65);

Fortunately, you will not need to translate characters to ASCII manually. In C, a character is equivalent to its ASCII value. Thus, the above line can be written as:
putInMemory(0xB000, 0x8140, 'A');

Alternatively, you can let C do the hex conversion for you, e.g., putInMemory(0xB000, 0x8000 + 320, 'A');
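To see how an address such as 0xB8140 splits into the segment and offset arguments, here is a host-side C sketch. The helper names are hypothetical; the split follows the parameter descriptions above, and the recombination uses the standard x86 real-mode rule that the physical address is segment * 16 + offset.

```c
/* Segment as described above: the most significant hex digit of a
 * five-digit address, times 0x1000. */
unsigned int address_segment(unsigned long address)
{
    return (unsigned int)((address >> 16) & 0xF) * 0x1000u;
}

/* Offset: the four least significant hex digits of the address. */
unsigned int address_offset(unsigned long address)
{
    return (unsigned int)(address & 0xFFFFu);
}

/* Real-mode recombination: segment * 16 + offset. */
unsigned long physical_address(unsigned int segment, unsigned int offset)
{
    return (unsigned long)segment * 16UL + offset;
}
```

For 0xB8140 this gives segment 0xB000 and offset 0x8140, and recombining them returns 0xB8140.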

You should now be able to write a kernel that prints out "Hello" at the top left corner of the screen before entering an infinite while loop.

Compiling the Kernel

The bochs emulator, as well as a physical PC machine, starts up in 16-bit mode. This means that our kernel must be a 16-bit program, as opposed to a 32-bit or 64-bit program. The implication of writing a 16-bit program is that we cannot use the standard GNU gcc C compiler because it generates 32- or 64-bit machine language. Instead we will use the bcc compiler that generates 16-bit machine language code. Unfortunately, bcc is fairly primitive and requires that we use early C syntax.

The most significant aspect of using bcc is that all local variables used in a function must be defined before any statements in the function.
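As a small illustration of that rule, the function below declares all of its local variables before its first statement. It is an arbitrary example (sum_to is a hypothetical name unrelated to the kernel) written so that it compiles with early C syntax.

```c
/* Returns 1 + 2 + ... + n. All locals are declared at the top of
 * the function, before any statement, as early C syntax requires. */
int sum_to(int n)
{
    int i;      /* declarations first ...        */
    int total;

    total = 0;  /* ... statements only afterward */
    for (i = 1; i <= n; i++) {
        total = total + i;
    }
    return total;
}
```

Declaring a variable in the middle of the body, for example writing int i inside the for statement, is the kind of construct to avoid.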

As an aside, modern 32-bit or 64-bit operating systems get around the fact that machines boot in 16-bit mode by using a secondary bootloader. The secondary bootloader is a 16-bit program that loads the 32- or 64-bit kernel into memory and then switches the machine into 32 or 64-bit mode before executing the kernel. Our OS will be a 16-bit OS so we will not have to worry about switching modes.

To compile your kernel use the command:
bcc -ansi -c -o kernel.o kernel.c

The -c flag tells the compiler to produce an object file without linking, so no functions from pre-existing C libraries that might rely on system calls (e.g., printf or putchar) are pulled in. The -ansi flag tells it to use standard ANSI C syntax and the -o flag tells it to produce an output file called kernel.o.

The kernel.o file is not your final machine code file, however. Recall that you are using the putInMemory function from the kernel.asm file. For your C program to be able to call putInMemory, you will also need to assemble the kernel.asm file and then link it with your kernel.o file.

To assemble the kernel.asm file use the command:
as86 kernel.asm -o kernel_asm.o

To link the kernel.o and kernel_asm.o files into the executable kernel file use the command:
ld86 -o kernel -d kernel.o kernel_asm.o

The file kernel is your program in machine code. To run it, you will need to copy it into the disk image at sector 3, where the bootloader is expecting to find it (in later projects you will find out why sector 3 and not sector 1). To copy the kernel file to sector 3, use the command:
dd if=kernel of=floppya.img bs=512 conv=notrunc seek=3

Now, if you run bochs, the bootloader will load your kernel from sector 3 and execute it. If your kernel program is correct, you will see "Hello" printed in the top left corner of the screen.

Common Issue

Sometimes floppya.img gets corrupted from mistakes you make. If that's the case, start over: zero it out, then add the bootload program at sector 0, then add the kernel at sector 3. You could put all of these steps in build.sh (see next step).

Running the Kernel: Shell Scripts & Makefiles

Producing a final bootable floppy disk image requires you to type quite a few commands. Instead of typing them all each time we change the kernel, create bash scripts and/or create a Makefile for this step.

Bash Script

Create two bash scripts, one called build.sh that builds the OS and the image on the floppy and one called run.sh that runs the emulator. Make the scripts executable. Your build script should not continue if compilation fails.

If you put the correct commands into your shell script, running build.sh will compile your kernel, link it with the putInMemory function, and produce a new bootable disk image.

Makefile

If using a Makefile, create a build.sh script that runs the proper target and create a run.sh script that runs the emulator. Make the scripts executable.

This is a good time to do a commit, if you haven't already, AND push your code to GitHub. Remember that you should only commit source files and scripts, not any of the generated files. The commit message will be written in atom (unless you changed the editor). If in atom, write your comment, save the file, and close the file.

Step 3. Kernel Improvements

Implement a function putChar in kernel.c that displays a character in a specified color at a specified location on the screen. The putChar function should accept as parameters
- the character to be printed
- the color in which to print it
- the row and column at which to print the character as decimal integers (not hexadecimal)
Modify your main function so that, in addition to printing "Hello" in the upper left corner of the screen, it uses the putChar function to display "Hello World" in white on a red background at the center of the screen. The video memory's color codes are shown below. The background color goes in the high-order nibble of the byte for the color, while the foreground color goes in the low-order nibble of the byte. For example, 0000 0001 = 0x01 will produce blue text on a black background.
```
0      0000      black
1      0001      blue
2      0010      green
3      0011      cyan
4      0100      red
5      0101      magenta
6      0110      brown
7      0111      light gray
8      1000      dark gray
9      1001      light blue
A      1010      light green
B      1011      light cyan
C      1100      light red
D      1101      light magenta
E      1110      yellow
F      1111      white
  
```
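The packing of background and foreground into one color byte can be sketched in C as follows. make_color and the COLOR_ constant names are hypothetical; the values are taken from the table above.

```c
#define COLOR_BLACK 0x0
#define COLOR_BLUE  0x1
#define COLOR_RED   0x4
#define COLOR_WHITE 0xF

/* Packs a background and foreground color code (0-15 each) into
 * one byte: background in the high nibble, foreground in the low. */
int make_color(int background, int foreground)
{
    return ((background & 0xF) << 4) | (foreground & 0xF);
}
```

For example, make_color(COLOR_BLACK, COLOR_BLUE) gives 0x01 (blue on black, as in the example above), and make_color(COLOR_RED, COLOR_WHITE) gives 0x4F (white on red).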
Implement a function putStr in kernel.c that displays a string in a specified color at a specified location on the screen. The putStr function should accept as parameters
- the terminated string to be printed,
- the color in which to print it,
- the row and column at which to start printing the string as decimal integers (not hexadecimal)
Each successive character in the string should be printed one column to the right of the previous one.
Demonstrate your function works: Modify your main function so that it uses the putStr function to display "Hello World" in red on a white background in the lower right corner of the screen (this didn't show up in the screenshot) and your name in white in the top left.
If the end of a line is reached, the next character should appear on the following line. No characters should be printed past the end of the screen. Modify your putStr function so that it correctly handles that a long string wraps on the screen.
Demonstrate that putStr can handle that case.
What should happen when putStr gets a newline ('\n') character? Does your current implementation do that? If not, update your function.
Demonstrate that putStr can correctly handle that case.
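One possible way to think about wrapping and newlines is as a cursor update that runs after each character. The sketch below is only one design choice, not the required one: advance_cursor is a hypothetical helper, rows and columns are assumed to be 0-based, and a newline is assumed to move the cursor to the start of the next line.

```c
#define SCREEN_ROWS 25
#define SCREEN_COLS 80

/* Moves the cursor (*row, *col) past one character.
 * Assumption: '\n' jumps to column 0 of the next row; any other
 * character moves one column right and wraps at the line end.
 * Returns 1 if the new position is still on the screen, 0 if it
 * has run past the last row (nothing more should be printed). */
int advance_cursor(int *row, int *col, char ch)
{
    if (ch == '\n') {
        *col = 0;
        *row = *row + 1;
    } else {
        *col = *col + 1;
        if (*col >= SCREEN_COLS) {
            *col = 0;
            *row = *row + 1;
        }
    }
    return *row < SCREEN_ROWS;
}
```

A string-printing loop could stop as soon as this helper returns 0, which keeps characters from being written past the end of the screen.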

Tips:

When adding functions to your kernel, add prototypes at the top of the file and then put your function definitions after main. You will see really odd behavior otherwise. Example of a prototype:
void putChar(char ch, int color, int row, int column);
You can iterate over every character until you see the end of the string, rather than needing to get the string's length (which can be error prone).
Do put an infinite loop at the end of main.
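The tip about iterating until the end of the string can be illustrated on the host with a stand-in for video memory. Everything here is hypothetical scaffolding for the sketch: fake_screen replaces real video memory, and write_string is not the putStr you are asked to write (it always starts at the first cell and ignores wrapping and newlines).

```c
#define FAKE_CELLS (80 * 25)

/* Stand-in for text-mode video memory: 2 bytes per cell. */
static unsigned char fake_screen[FAKE_CELLS * 2];

/* Copies s into consecutive cells starting at cell 0, stopping at
 * the terminating '\0' (or when the fake screen is full).
 * Returns the number of characters written. No length is computed
 * up front; the loop simply watches for the terminator. */
int write_string(char *s, char color)
{
    int i;

    for (i = 0; s[i] != '\0' && i < FAKE_CELLS; i++) {
        fake_screen[2 * i] = (unsigned char)s[i];      /* character */
        fake_screen[2 * i + 1] = (unsigned char)color; /* color     */
    }
    return i;
}
```

In the real kernel, the two array writes would be calls to putInMemory with the appropriate segment and offset.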

Cleaning up your code

Great! You got it working! Before you submit, review your code. Are there any good places for constants (#define)? Is your code written in a (relatively) understandable way? Any extra code that you no longer need? Do you have appropriate comments? Do you explain the whys of what you're doing?

Checking your work

It is recommended that you clone (or re-clone) your project in a new directory, and then make sure that you can build and execute your kernel. (This is, after all, exactly the process that I'll use.)

Submission

GitHub Classroom will make a snapshot of your repository at the deadline. In addition to the originally supplied files, your repository should contain your source code (namely, kernel.c) and Bash scripts and, optionally, your Makefile. (Note that this does not include anything that was built or generated, like bootload or kernel or floppya.img.)

I should be able to pull your code and run build.sh and then run.sh to see your kernel running.

Grading

Your project will be assessed on its correctness as well as its style. Your kernel.c file and bash scripts should be nicely formatted and well-documented.

(50) Basic functionality: writing "Hello World"; build.sh and run.sh scripts
(40) Kernel improvements
(10) Code style: proper comments, format

Acknowledgement

This assignment, as well as the accompanying files and source code, has been adapted with minor changes from those developed by Michael Black at American University. His paper "Build an operating system from scratch: a project for an introductory operating systems course" can be found in the ACM Digital Library.

CSCI 330: Operating Systems