Windowing

Creating a Window on Windows OS

Reading time: 16 mins.

At the end of this chapter, we will have a fully functional C++ app showing you how to display an image and draw on the screen using the mouse. This will take no more than roughly 200 lines of code, which is nothing in coding terms, and will be compilable with a single command.

Again, this is not a lesson on Windows programming, so we won't go into much detail about how things work, just enough to get you started. Readers who want to explore the API-specific parts of the sample in more depth should turn to resources dedicated to that topic. The Microsoft website does a reasonably good job here and provides some good step-by-step examples.

Additionally, and more fundamentally, Scratchapixel's mission is not to teach how to use APIs, which is typically left to tutorials, but rather to explain the algorithms used by these APIs and how they work. This includes a general focus on algorithms and techniques used in computer graphics.

Enter the Infinite Loop

Now let's dive into it. Native Windows programs often replace the usual int main() entry point with a function called WinMain. This is not required, and since we prefer a traditional program structure, we will stick to main:

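int main(int argc, char** argv) {
	HINSTANCE hInstance = GetModuleHandle(NULL);
	CreateAndRegisterWindow(hInstance);
	MSG msg = {}; // Zero-initialized: msg.message is tested below even when no message was read
	while (1) {
		// Process every message currently waiting in the queue
		while(PeekMessage(&msg, nullptr, 0, 0, PM_REMOVE) != 0) {
			TranslateMessage(&msg);
			DispatchMessage(&msg);
			if (msg.message == WM_QUIT) {
				break;
			}
		}
		// break only leaves the inner loop, so we test again here
		if (msg.message == WM_QUIT)
			break;
		DoSomeWork();
	}
	return 0;
}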

The only reason you might want to use WinMain is that it receives an hInstance value directly, whereas with main you have to obtain it yourself by calling GetModuleHandle(NULL). That call is so simple that it is not a good reason in itself for using WinMain. Next, we call the function that creates the window; we will dive into it shortly. Once the window is up and running, we enter our infinite loop. We check for messages, process them, and when no messages are left in the queue, we do some work. As mentioned in the previous chapter, this is where we will be doing our rendering. Finally, note that if one of the messages asks the app to quit (WM_QUIT), we break from the inner message loop and then from the outer loop as well, effectively quitting the app. Two breaks are needed because a break statement only exits the innermost loop. We also stop processing messages right away because, while quitting is generally quitting, Windows can still generate messages (notably WM_TIMER) after a WM_QUIT has been received, so more messages might otherwise be received after it.

TranslateMessage and DispatchMessage are native Windows calls. TranslateMessage turns raw keyboard messages into character messages, and DispatchMessage hands each message to the window procedure registered for the target window (WndProc in our case), which is where we will add all of our logic for processing things such as mouse or keyboard events. More on this when we get to creating the window.
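To make the control flow of this loop easier to follow without the Windows headers, here is a minimal, portable sketch. Everything in it is hypothetical: Message, kQuit and RunLoop are stand-ins for MSG, WM_QUIT and our main function, the deque plays the role of the message queue, and handler plays the role of WndProc. A max_frames bound is added only so that the sketch always terminates.

```cpp
#include <cassert>
#include <deque>
#include <functional>

// Hypothetical stand-ins for MSG and WM_QUIT (not the real Windows types).
struct Message { int id; };
constexpr int kQuit = 0;

// Drains the queue the way the PeekMessage loop does, calls `work` once per
// outer iteration, and returns how many times `work` ran before a quit
// message was seen (or before `max_frames` was reached).
int RunLoop(std::deque<Message>& queue,
            const std::function<void(const Message&)>& handler,
            const std::function<void()>& work,
            int max_frames) {
    assert(max_frames >= 0);
    int frames = 0;
    bool quit = false;
    while (!quit && frames < max_frames) {
        // Inner loop: process every pending message.
        while (!queue.empty()) {
            Message msg = queue.front();
            queue.pop_front();
            if (msg.id == kQuit) {  // Quit request: stop draining...
                quit = true;
                break;
            }
            handler(msg);           // Plays the role of DispatchMessage.
        }
        if (quit) break;            // ...and leave the outer loop as well.
        work();                     // Plays the role of DoSomeWork().
        ++frames;
    }
    return frames;
}
```

Messages queued before the quit request reach the handler, anything queued after it is left untouched, and no work is done once the quit request has been seen.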

Now let's see what the CreateAndRegisterWindow function looks like.

Creating the window

Here is the code:

void CreateAndRegisterWindow(HINSTANCE hInstance) {
	WNDCLASSEX wc = {0};
	wc.cbSize = sizeof(WNDCLASSEX);
	wc.lpfnWndProc = WndProc;
	wc.hInstance = hInstance;
	wc.lpszClassName = CLASSNAME;
	wc.hCursor = LoadCursor(nullptr, IDC_ARROW); // Set the default arrow cursor
	wc.hIcon = LoadIcon(nullptr, IDI_APPLICATION); // Predefined icons are loaded with a null instance
	wc.hbrBackground = (HBRUSH)(COLOR_WINDOW + 1);
	wc.lpszMenuName = nullptr;
	wc.hIconSm = LoadIcon(nullptr, IDI_APPLICATION); // Small version of the default icon

	if (!RegisterClassEx(&wc)) {
		MessageBox(nullptr, L"Window Registration Failed", L"Error",
			MB_ICONEXCLAMATION | MB_OK);
		PostQuitMessage(1); // Make the message loop in main exit
		return;
	}

	hwnd = CreateWindowEx(
		WS_EX_CLIENTEDGE,
		CLASSNAME,
		L"Foo",
		WS_OVERLAPPEDWINDOW & ~WS_THICKFRAME & ~WS_MAXIMIZEBOX, // non-resizable
		CW_USEDEFAULT, CW_USEDEFAULT, win_width, win_height,
		nullptr, nullptr, hInstance, nullptr);

	if (hwnd == nullptr) {
		MessageBox(nullptr, L"Window Creation Failed", L"Error",
			MB_ICONEXCLAMATION | MB_OK);
		PostQuitMessage(1); // Make the message loop in main exit
		return;
	}

	InitializeOffScreenDC(hwnd);

	ShowWindow(hwnd, SW_SHOWDEFAULT); // or use WS_VISIBLE but more control with this option
	UpdateWindow(hwnd);
}

Creating a window is straightforward. First, we need to set some fields of the WNDCLASSEX structure, which, as the name suggests, is used to register our window's class. The concept behind window classes is to define a set of behaviors that windows of a certain class will share, and WNDCLASSEX describes the characteristics of a window class. The three most important components of the structure to focus on for now are:

lpfnWndProc: Essentially, this is a pointer to the function that will process messages sent to our windows. As seen in our example, the function's name is WndProc, where we will handle events such as key or mouse inputs. We will delve into this in the next section.
hInstance: This is a handle to the instance containing the window procedure for the class. The hInstance parameter obtained from GetModuleHandle represents a module's instance handle. In Windows programming, a module can be an executable (.exe) file or a dynamic link library (.dll) file. The instance handle is a unique value assigned by the operating system to a module when it is loaded into the address space of a process. This handle is utilized by the system and applications to identify the module's instance.
lpszClassName: This is a string that specifies your window class name. In our example code, we have set it to myapp_window, but it could be anything else you want, as long as it is unique within the application context. When registering a window class with the RegisterClassEx function, the system uses this name, together with the module's instance handle, to identify the class. If the same module has already registered a class with that name, RegisterClassEx fails. This means that within the same module (DLL or executable), each window class must have a unique name.

After setting the members of the WNDCLASSEX instance, we register our class and check for success or failure. If successful, we proceed to the next step: creating the window itself. CreateWindowEx has several parameters, all of which are documented online, so we won't go into much detail here. Just note that we pass the class name registered in the previous step, set the window's title (Foo), and pass style flags that, in our example, prevent the window from being resized. We let Windows choose the window's position on the screen by passing CW_USEDEFAULT for x and y, and we pass the window's width and height explicitly.
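The style expression passed to CreateWindowEx is ordinary bit masking: start from the full set of "overlapped window" flags and clear the two bits that make the window resizable. The sketch below reproduces it with plain constants so it can be compiled anywhere. The kWs* names are hypothetical mirrors of the WS_* macros, and their values are assumed to match those in the Windows SDK headers.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical mirrors of the WS_* style bits (values assumed from winuser.h).
constexpr std::uint32_t kWsOverlapped  = 0x00000000;
constexpr std::uint32_t kWsCaption     = 0x00C00000;  // Title bar
constexpr std::uint32_t kWsSysMenu     = 0x00080000;  // Window menu and close button
constexpr std::uint32_t kWsThickFrame  = 0x00040000;  // Sizing border
constexpr std::uint32_t kWsMinimizeBox = 0x00020000;
constexpr std::uint32_t kWsMaximizeBox = 0x00010000;
constexpr std::uint32_t kWsOverlappedWindow =
    kWsOverlapped | kWsCaption | kWsSysMenu |
    kWsThickFrame | kWsMinimizeBox | kWsMaximizeBox;

// Same expression as in the listing: clear the sizing border and the
// maximize box, keep everything else.
constexpr std::uint32_t NonResizableStyle() {
    return kWsOverlappedWindow & ~kWsThickFrame & ~kWsMaximizeBox;
}

// True when every bit of `flag` is set in `style`.
constexpr bool HasStyle(std::uint32_t style, std::uint32_t flag) {
    return (style & flag) == flag;
}

static_assert(!HasStyle(NonResizableStyle(), kWsThickFrame),
              "the sizing border must be cleared");
```

The resulting window keeps its title bar, window menu and minimize button, but can no longer be resized by dragging its border or maximized.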

That's it. Then, we call ShowWindow and UpdateWindow, both Windows API calls. The first is responsible for displaying the window on the screen, and the second forces a redraw of the window's content.

Easy breezy.

The only function we haven't yet explained is InitializeOffScreenDC(hwnd), as it isn't directly related to the window creation process. Recall that our program's goal is to display an image. This function is instrumental in achieving that, but before we delve into it, let's first review the WndProc function.

Handling The Window's Events

Here is the code:

LRESULT CALLBACK WndProc(HWND hWnd, UINT msg, WPARAM wParam, LPARAM lParam) {
	switch(msg) {
		case WM_CLOSE:
			// Delete the off-screen DC first: a bitmap should not be
			// deleted while it is still selected into a DC
			CleanupOffScreenDC();
			if (hBitmap != nullptr) {
				DeleteObject(hBitmap);
				hBitmap = nullptr;
			}
			DestroyWindow(hWnd);
			break;
		case WM_DESTROY:
			PostQuitMessage(0);
			break;
		case WM_LBUTTONDOWN:
			is_drawing = true;
			break;
		case WM_LBUTTONUP:
			is_drawing = false;
			break;
		case WM_MOUSEMOVE: {
			// GET_X_LPARAM and GET_Y_LPARAM are declared in windowsx.h
			int xpos = GET_X_LPARAM(lParam);
			int ypos = GET_Y_LPARAM(lParam);
			if (is_drawing) {
				SetPixelColor(pBits, win_width, xpos, ypos, 255, 0, 0);
				// No background erase needed: WM_PAINT overwrites the whole image
				InvalidateRect(hWnd, NULL, FALSE);
			}
			break;
		}
		case WM_ERASEBKGND:
			return 1; // Indicate that background erase is handled
		case WM_PAINT:
			{
				PAINTSTRUCT ps;
				HDC hdc = BeginPaint(hWnd, &ps);
				// Copy the off-screen bitmap to the window
				BitBlt(hdc, 0, 0, win_width, win_height, hdcOffscreen, 0, 0, SRCCOPY);
				EndPaint(hWnd, &ps);
			}
			break;
		default:
			return DefWindowProc(hWnd, msg, wParam, lParam);
	}
	return 0;
}

The type of event sent to our window is stored in the msg variable. The handling approach is straightforward: we use a switch-case construct to check the type of event received and associate specific code with each event type we're interested in. Windows provides dozens, if not hundreds, of event types; the full list can be found in Microsoft's documentation. The ones we'll start with are prefixed with WM_, which stands for Window Message. This includes all keyboard and mouse events, as well as messages like WM_PAINT, which is sent to the window when the system requires it to repaint (after being obscured by another window, for instance), WM_CLOSE, which is sent when the user asks to close the window, and WM_DESTROY, which is sent while the window is being destroyed.

Note that our code doesn't handle the WM_SIZE event because our window is not resizable.

We won't delve deeply into this straightforward code. However, it's worth discussing how the program is structured to help us achieve our goal, which is, in addition to displaying an image, to enable drawing on the image. We want to do this to demonstrate how mouse and key events can be captured for navigation through a 3D scene, which will be the topic of our next lesson.

There's not much to say about WM_CLOSE or WM_DESTROY other than some cleanup is necessary when exiting the app. More interestingly, let's look at WM_LBUTTONDOWN and WM_LBUTTONUP, triggered when you press and release the left mouse button, respectively. Pressing the button sets a boolean variable is_drawing to true, indicating to the app that any subsequent mouse movement will be used to draw on the image. Releasing the button sets is_drawing to false, mimicking the behavior of a paint program's brush tool. The WM_MOUSEMOVE event is triggered when the mouse moves. We obtain the x and y mouse positions, which are packed into lParam, using the GET_X_LPARAM and GET_Y_LPARAM macros (declared in windowsx.h). For more details on this, consider checking a tutorial on Windows programming; it's quite straightforward. If is_drawing is true (meaning the user is holding down the left mouse button while moving the mouse), we draw into the buffer holding our image data (more on this in the next section). We then call the native Windows function InvalidateRect, which tells Windows that the window's content needs to be redrawn.
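For readers curious about what the two macros do, the sketch below unpacks a mouse position the same way, in portable C++. The helper names XFromLParam, YFromLParam and MakeLParam are hypothetical, and std::intptr_t stands in for LPARAM. The x coordinate lives in the low 16 bits and the y coordinate in the next 16 bits, and both are signed, since a coordinate can be negative (for instance while the mouse is captured and the cursor leaves the client area).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper mirroring GET_X_LPARAM: low 16 bits, read as signed.
inline int XFromLParam(std::intptr_t lparam) {
    return static_cast<std::int16_t>(
        static_cast<std::uint16_t>(lparam & 0xFFFF));
}

// Hypothetical helper mirroring GET_Y_LPARAM: next 16 bits, read as signed.
inline int YFromLParam(std::intptr_t lparam) {
    return static_cast<std::int16_t>(
        static_cast<std::uint16_t>((lparam >> 16) & 0xFFFF));
}

// Packs two coordinates into one value; used here only to build test input.
inline std::intptr_t MakeLParam(int x, int y) {
    const std::uint32_t low = static_cast<std::uint16_t>(x);
    const std::uint32_t high = static_cast<std::uint16_t>(y);
    return static_cast<std::intptr_t>((high << 16) | low);
}
```

Reading the two halves as unsigned values instead would turn a position such as -3 into 65533, which is why the signed conversion matters.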

As expected, this triggers a WM_PAINT event. Its handler follows the usual pattern: the drawing code is enclosed within a BeginPaint/EndPaint pair. hdc is a handle to the window's device context, which we need to pass to drawing calls like BitBlt, used here to copy the off-screen bitmap to the window.
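Conceptually, a BitBlt call with SRCCOPY copies a rectangle of pixels from one surface to another. The portable sketch below shows the idea on raw memory for 3-byte pixels. CopyPixels is a hypothetical helper, not part of the Windows API, and it leaves out everything the real call handles for us, such as clipping and pixel format conversion.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: copies a width x height block of 3-byte pixels from
// the top-left corner of `src` to the top-left corner of `dst`.
// Strides are expressed in bytes per row and may include padding.
void CopyPixels(const std::uint8_t* src, int src_stride,
                std::uint8_t* dst, int dst_stride,
                int width, int height) {
    assert(src != nullptr && dst != nullptr);
    for (int y = 0; y < height; ++y) {
        // One row at a time, because the two surfaces may pad rows differently.
        std::memcpy(dst + y * dst_stride, src + y * src_stride,
                    static_cast<std::size_t>(width) * 3);
    }
}
```

In our program the source is the off-screen bitmap and the destination is the window, and both are addressed through device contexts rather than raw pointers.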

If we're not processing an event ourselves, we simply reroute it to a default message handling function, DefWindowProc (provided by Windows, not our own creation).

Of course, it's up to you to now expand on this basic example. For instance, if you need to incorporate the middle and right mouse buttons to extend what your own app can do, then all you need to do is search the documentation to find out which messages these events correspond to (in this example, this would be WM_RBUTTONDOWN for the right mouse button and WM_MBUTTONDOWN for the middle mouse button). There are countless more options here, leading to a great number of possibilities. Let your imagination go wild and be creative!

That's the gist of it. How much simpler could it be? All that's left at this stage is to see how we can load an image and create a Windows-compatible bitmap object, as well as how to draw into this image, and then we will be done.

Creating a Windows Compatible Bitmap

Here is the code we will be using:

// Bytes per row of a 24-bit DIB: rows are padded to a multiple of 4 bytes
int DibStride(int width) {
	return (width * 3 + 3) & ~3;
}

auto CreateBitmapFromRGB(char* pData, int width, int height) 
	-> std::pair<HBITMAP, void*> {
	BITMAPINFO bmi = {0};
	bmi.bmiHeader.biSize = sizeof(bmi.bmiHeader);
	bmi.bmiHeader.biWidth = width;
	bmi.bmiHeader.biHeight = -height; // Negative indicates top-down bitmap
	bmi.bmiHeader.biPlanes = 1;
	bmi.bmiHeader.biBitCount = 24; // 24 bits per pixel, stored as blue, green, red
	bmi.bmiHeader.biCompression = BI_RGB;

	HDC hdc = GetDC(nullptr);
	void* pBits = nullptr;
	HBITMAP hbm = CreateDIBSection(hdc, &bmi, DIB_RGB_COLORS, &pBits, nullptr, 0);
	if (hbm != nullptr) {
		// Copy row by row: source rows are tightly packed, DIB rows may be padded
		const int stride = DibStride(width);
		for (int y = 0; y < height; ++y) {
			std::memcpy(static_cast<char*>(pBits) + y * stride,
				pData + y * width * 3, width * 3);
		}
	}
	ReleaseDC(nullptr, hdc);
	return {hbm, pBits};
}

void InitializeOffScreenDC(HWND hwnd) {
	const size_t num_bytes = static_cast<size_t>(win_width) * win_height * 3;
	std::unique_ptr<char[]> raw_data(new char[num_bytes]);
	std::memset(raw_data.get(), 0x0, num_bytes); // Black image if loading fails

	std::ifstream ifs("./sample.pbm", std::ios::binary);
	std::string header;
	int width = 0, height = 0, max_value = 0; // Third header number is the maximum channel value
	ifs >> header;
	ifs >> width >> height >> max_value;
	ifs.ignore(); // Skip the single whitespace character that ends the header
	// The sample image is expected to have the same size as the window
	if (ifs && width == static_cast<int>(win_width) && height == static_cast<int>(win_height)) {
		ifs.read(raw_data.get(), num_bytes);
		// The file stores red, green, blue; the DIB expects blue, green, red
		for (size_t i = 0; i < num_bytes; i += 3) {
			std::swap(raw_data[i], raw_data[i + 2]);
		}
	}
	ifs.close();

	auto bitmap_data = CreateBitmapFromRGB(raw_data.get(), win_width, win_height);
	hBitmap = bitmap_data.first;
	pBits = bitmap_data.second;

	HDC hdc = GetDC(hwnd);
	hdcOffscreen = CreateCompatibleDC(hdc);
	SelectObject(hdcOffscreen, hBitmap);
	ReleaseDC(hwnd, hdc);
}

void CleanupOffScreenDC() {
	if (hdcOffscreen) {
		DeleteDC(hdcOffscreen); // Delete the off-screen DC
		hdcOffscreen = nullptr;
	}
}

void SetPixelColor(void* pBits, int width, int x, int y, uint8_t red, uint8_t green, uint8_t blue) {
	if (!pBits) return; // Ensure we have a valid pointer.
	// Ignore positions that fall outside the image.
	if (x < 0 || x >= width || y < 0 || y >= static_cast<int>(win_height)) return;

	int pixel_index = y * DibStride(width) + x * 3; // Padded rows, 3 bytes per pixel.

	uint8_t* pPixel = static_cast<uint8_t*>(pBits) + pixel_index;

	pPixel[0] = blue;
	pPixel[1] = green;
	pPixel[2] = red;
}

Let's begin with InitializeOffScreenDC(), which, if you recall, is called from CreateAndRegisterWindow right after the window is created. This is where we load the image data and generate our Windows bitmap object. By a Windows object, we mean a representation of the image data that is compatible with how Windows handles image data for display on the screen, using built-in native calls such as BitBlt, which we discussed in the previous section.

This process is not overly complex. We will use the .ppm or .pbm format, which we introduced in the lesson on images. As a quick reminder, PPM and PBM stand for Portable Pixmap and Portable Bitmap (not to be confused with the BMP format, which is Windows' native bitmap format). More precisely, it's an image format used to store raster graphics, that is, images made out of pixels. It is the simplest image format in the world, and probably the entire universe. It comes in two flavors: ASCII and binary. In our case, we will use an image whose data are stored in the binary version of the file format. Again, check the lesson on images if you need a refresher or an introduction to this file format. The image data are stored as char, so each pixel is stored as three chars for the red, green, and blue channels, respectively, with values in the range [0..255].
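As a reminder of what reading such a file involves, here is a minimal sketch of a reader for the binary color flavor of the format, assuming the usual layout: the magic string P6, then the width, the height and the maximum channel value, then a single whitespace character followed by the raw pixel bytes. PpmImage and ReadPpm are hypothetical names, and comment lines in the header are deliberately not handled.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical container for a decoded image.
struct PpmImage {
    int width = 0;
    int height = 0;
    int max_value = 0;               // Maximum channel value, 255 for 8-bit data
    std::vector<unsigned char> rgb;  // width * height * 3 bytes, top row first
};

// Reads a binary PPM ("P6") with 8-bit channels from `in`.
// Returns false if the header is malformed or the pixel data is truncated.
// Sketch only: '#' comment lines in the header are not supported.
bool ReadPpm(std::istream& in, PpmImage& image) {
    std::string magic;
    if (!(in >> magic >> image.width >> image.height >> image.max_value)) {
        return false;
    }
    if (magic != "P6" || image.width <= 0 || image.height <= 0 ||
        image.max_value != 255) {
        return false;
    }
    in.ignore(1);  // Single whitespace byte between the header and the pixels
    image.rgb.resize(static_cast<std::size_t>(image.width) * image.height * 3);
    in.read(reinterpret_cast<char*>(image.rgb.data()),
            static_cast<std::streamsize>(image.rgb.size()));
    return in.gcount() == static_cast<std::streamsize>(image.rgb.size());
}
```

When reading from a file, the stream should be opened with std::ios::binary so that pixel bytes that happen to look like line endings are left untouched.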

Then we pass the raw image data to a function called CreateBitmapFromRGB. This is where we create a Windows-compatible bitmap object using the CreateDIBSection Windows native API call. DIB stands for device-independent bitmap, which says it all. A 24-bit DIB stores each pixel in blue, green, red order, which is why InitializeOffScreenDC swaps the first and third byte of every pixel before handing the data over. The function returns two things: the bitmap itself, and a void pointer to the bitmap image data. We will use this pointer in our custom SetPixelColor function to write into the image at the mouse pixel positions.
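One detail of the DIB memory layout is worth spelling out: each row of a DIB is padded to a multiple of 4 bytes, so the byte offset of a pixel depends on a row stride rather than on width * 3 alone. The two coincide only when width * 3 is itself a multiple of 4. The portable sketch below converts tightly packed RGB pixels into that padded, blue-green-red layout. DibStride24 and PackRgbAsDib24 are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Bytes per row of a 24-bit DIB: 3 bytes per pixel, rounded up to a
// multiple of 4.
inline std::size_t DibStride24(int width) {
    return (static_cast<std::size_t>(width) * 3 + 3) & ~static_cast<std::size_t>(3);
}

// Converts tightly packed, top-down RGB pixels into the layout of a
// top-down 24-bit DIB: blue, green, red order with padded rows.
std::vector<std::uint8_t> PackRgbAsDib24(const std::vector<std::uint8_t>& rgb,
                                         int width, int height) {
    assert(width > 0 && height > 0);
    assert(rgb.size() == static_cast<std::size_t>(width) * height * 3);
    const std::size_t stride = DibStride24(width);
    std::vector<std::uint8_t> dib(stride * height, 0);
    for (int y = 0; y < height; ++y) {
        const std::uint8_t* src = rgb.data() + static_cast<std::size_t>(y) * width * 3;
        std::uint8_t* dst = dib.data() + y * stride;
        for (int x = 0; x < width; ++x) {
            dst[x * 3 + 0] = src[x * 3 + 2];  // Blue
            dst[x * 3 + 1] = src[x * 3 + 1];  // Green
            dst[x * 3 + 2] = src[x * 3 + 0];  // Red
        }
    }
    return dib;
}
```

With this layout, the pixel at (x, y) starts at byte y * stride + x * 3.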

The only thing you need to know about our implementation is the use of the hdcOffscreen object. It is a memory device context that is compatible with the device context of our device (our window, to put it simply). We tie our bitmap to that object with SelectObject(hdcOffscreen, hBitmap). The reason we do that is so we can "paint" into the bitmap off-screen and only present it to the screen during a WM_PAINT event. This is sometimes referred to as double-buffering, but the term is misleading here. Double-buffering generally refers to a method used in real-time graphics in which we work with more than one image at a time, generally two, sometimes three. We draw into image 1, then present image 1 to the screen. While image 1 is on the screen, we draw into image 2, and when drawing is complete, we present image 2 to the screen, which allows us to return to drawing in image 1, and so on. If we were drawing into an image at the same time it's being presented to the screen, this might create visual artifacts. Double-buffering eliminates this problem. What we are doing here is not quite the same, even though you might see it referred to in the literature as double-buffering. We prefer to refer to it simply as a form of off-screen rendering or drawing.
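To make the contrast concrete, here is a minimal sketch of what double-buffering proper looks like, reduced to two byte buffers that trade places. DoubleBuffer is a hypothetical class written for illustration. In a real-time application the presentation step is handled by the graphics API rather than by a swap of two vectors.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical illustration of double-buffering: we only ever draw into the
// back buffer, and only ever display the front buffer.
class DoubleBuffer {
public:
    explicit DoubleBuffer(std::size_t size) : front_(size, 0), back_(size, 0) {}

    // Image currently being drawn into.
    std::vector<std::uint8_t>& back() { return back_; }

    // Image currently presented to the screen (read-only).
    const std::vector<std::uint8_t>& front() const { return front_; }

    // Called once drawing is complete: the finished image becomes visible
    // and the previously visible image becomes the new drawing target.
    void Present() { std::swap(front_, back_); }

private:
    std::vector<std::uint8_t> front_;
    std::vector<std::uint8_t> back_;
};
```

Our sample, by contrast, has a single image: we draw into it and copy it to the window whenever a WM_PAINT event arrives.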

The SetPixelColor function is pretty straightforward. We call it when we receive a WM_MOUSEMOVE event while the left mouse button is held down, and it writes a red pixel into the image at the current mouse position.

Compile

The complete source code and a sample image can be found on the GitHub repository. Note that creating your own pbm image is straightforward. You can use GIMP or Photoshop for this task.

To compile, open a Git Bash terminal and enter the following command:

clang++ -std=c++23 -o window.exe window.cc -luser32 -lgdi32

That's all there is to it. user32 and gdi32 are two Windows libraries involved in the windowing system: user32 provides the window and message functions, and gdi32 provides drawing functions such as BitBlt. We encourage you to do your own research on these libraries if you're interested.

Start the app. You should see the image of a corgi, and you can now paint on the image with the left mouse button.

Comments

The Unicode versions of Windows calls like RegisterClassEx or CreateWindowEx take strings made of wchar_t characters. While this is not a lesson on C++ programming, it's worth noting that this is why all strings start with the letter L. This prefix signals to the compiler that each character in the string should be stored using a wide character type (wchar_t) instead of the regular char type. In other words, the L prefix marks a wide character literal or a wide string literal. Note that this assumes the UNICODE macro is defined before windows.h is included, since that is what makes names such as CreateWindowEx resolve to their wide-character versions.
Generally, in Scratchapixel's coding style, we tend to adhere to Google's coding style rules. However, Windows tends to have its own style, and variables predominantly use camel case. Since most examples you will find on the internet follow this convention, we also tried to adhere to it in our example to make it easier for you to navigate between our sample and other examples you might find online. So be it.

Conclusion

As you can see, there's really nothing magical about Windows. It's quite straightforward to both display an image in a window and handle all kinds of events, as long as you have someone to explain it to you. Thanks, Scratchapixel.

This will constitute the foundation of our next lesson, in which we'll learn how to use mouse and keyboard events to control the motion of a 3D camera, allowing us to move through the scene. Exciting stuff! See you in the next lesson.