This project was a technical interview take-home assignment meant to evaluate my abilities in C++, GStreamer, and systems architecture, as well as my overall approach to engineering decisions and design tradeoffs. It was a fun project that was well received by the interview team, although I ultimately decided not to join the company. Since one of the requirements of the assignment was to give this project an open source license, it felt wrong to leave it in a private repository collecting dust when it contains a lot of information about the GStreamer framework that may be helpful to others. I decided to publish it publicly after removing any information identifying the company. My hope is that this repository can serve as a helpful reference for other developers navigating the steep learning curve of GStreamer.
The assignment required designing a prototype system, built entirely from scratch, to simulate the integration of a camera into a larger software ecosystem. The goal was to implement two Linux-based C++ services — a CameraService responsible for emulating a virtual camera, and an EndpointService that consumes its video stream — communicating over a local network. The CameraService exposes an API supporting standard device commands such as INIT, START_STREAM, STOP_STREAM, and STATUS, while GStreamer handles the underlying RTSP video pipeline. Both services are designed to run as systemd-managed processes on Ubuntu, include unit tests, and follow a modular architecture. Given the short timeframe to complete the assignment, several tradeoffs and design considerations were made, which are discussed throughout this README.
- Operating System: Ubuntu 20.04 or later
- Processor: x86_64 architecture
- Memory: Minimum 4GB RAM
- Storage: If provisioning a VM, allocate at least 15GB of total disk space for the project and dev dependencies.
Update your system packages:
```bash
sudo apt update && sudo apt upgrade -y
```

Install the following dependencies using apt:

```bash
sudo apt install -y build-essential cmake make libspdlog-dev libgstreamer1.0-dev \
  libgstreamer-plugins-base1.0-dev libgstreamer-plugins-bad1.0-dev gstreamer1.0-plugins-base \
  gstreamer1.0-plugins-good gstreamer1.0-plugins-bad \
  gstreamer1.0-plugins-ugly gstreamer1.0-libav gstreamer1.0-tools gstreamer1.0-x gstreamer1.0-alsa \
  gstreamer1.0-gl gstreamer1.0-gtk3 gstreamer1.0-qt5 gstreamer1.0-pulseaudio libgstrtspserver-1.0-dev \
  libboost-all-dev libcurl4-openssl-dev socat lcov
```

Crow is a C++ micro web framework used to implement the CameraService REST API. You can download the Crow headers from the official GitHub releases page.
- Download Deb Package: Crow
```bash
sudo dpkg -i <Crow_Deb_File>.deb
```

- Ensure `crow.h` is in `/usr/include/` or `/usr/local/include/`.
Alternative Installation Methods: Linux Setup
After installing the dependencies, you can build the project using the provided scripts. The project uses CMake for building and managing dependencies.
Standard build:

```bash
./scripts/build.sh
```

Build with coverage:

```bash
./scripts/build_coverage.sh
```

Coverage reports will be generated in the coverage directory.

Clean build (add `coverage` as an argument to the script to build with coverage):

```bash
./scripts/cleanbuild.sh
```
After building, you can install the project using the provided deployment scripts. Run the following commands from the project root:
- Run Install
- Run Services
- You can skip to the Auto Script section to automatically handle the rest and start streaming immediately. Keep reading for manual setup and usage.
Install
```bash
sudo ./deployment/deploy.sh install
```

The following should be run without sudo.

Set up the systemd user services:

```bash
./deployment/deploy.sh service
```

Start the services:

```bash
./deployment/deploy.sh start
```

Check the status of the services:

```bash
./deployment/deploy.sh status
```

Stop the services:

```bash
./deployment/deploy.sh stop
```

The following should be run with sudo.
Uninstall
```bash
sudo ./deployment/deploy.sh uninstall
```

To run the unit tests, execute the following command:

```bash
cd build
ctest --output-on-failure
```

If you built with `build_coverage.sh` or passed the `coverage` argument to `cleanbuild.sh`, a coverage report will be generated in the coverage directory. You can view the HTML report by opening `build/coverage/html/index.html` in your web browser.
The CLI script offers a command-line interface to interact with the EndpointService. It allows you to control the camera, stream video, and manage the virtual camera device.
The auto_stream.sh script automates the setup and streaming of the virtual camera. It will:
- Check if the systemd services are running and automatically start them for you
- Initialize the camera
- Enable streaming
- Start the media playback
```bash
./scripts/auto_stream.sh
```

For manual control, you can use the CLI script located in `CLITool/EndpointCLI.sh`. This script allows you to interact with the EndpointService directly from the command line.
```bash
./CLITool/EndpointCLI.sh <command>
```

Supported commands:

- `init` - Initialize the Virtual Camera
- `play` - Start playing the media stream
- `stop` - Stop the media stream
- `enable` - Tell the Virtual Camera to start streaming
- `disable` - Tell the Virtual Camera to stop streaming
- `status` - Get the current status of the Virtual Camera
```bash
# Initialize the virtual camera
./CLITool/EndpointCLI.sh init
# Check current status
./CLITool/EndpointCLI.sh status
# Enable streaming on the virtual camera
./CLITool/EndpointCLI.sh enable
# Start playing the media stream locally
./CLITool/EndpointCLI.sh play
# Stop the media stream
./CLITool/EndpointCLI.sh stop
# Disable streaming on the virtual camera
./CLITool/EndpointCLI.sh disable
```

The CLI will provide feedback on each command, including success or error messages.
```
# Example output
$ ./EndpointCLI.sh play
Sending command: play
SUCCESS: Video is playing
```

CameraService simulates a remote camera system. It exposes a simple REST API at http://localhost:8080/ for controlling the camera and streaming video. REST was chosen for its simplicity and ease of integration with various clients, allowing CameraService and a client like EndpointService to run on different machines. This implementation doesn't expose the API to the public internet, but it could be extended to do so with proper security measures.
The main endpoint is http://localhost:8080/dev/virtualcamera, which accepts commands in JSON format to control the camera. The API Endpoint Commands section below provides details on the available commands and their expected responses.
When the service starts, it waits for EndpointService to send the INIT command. Upon receiving it, the camera is initialized and the RTSP server is started on port 8554. The stream will not begin until the START_STREAM command is received. There is currently no way to deactivate the RTSP server without stopping the service, but this functionality could be added in the future.
CameraService supports up to 10 concurrent streams and provides basic camera status information. The limit of 10 was chosen to represent a low-powered system, typical of a small drone or IoT device. Because we're simulating a live feed, we use gst_rtsp_media_factory_set_shared() to reuse an existing pipeline for all clients. For a live feed, there's no need to create a new instance of the stream for each connection. Sharing the existing stream significantly reduces resource usage.
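For reference, the server-side wiring for a shared factory in gst-rtsp-server looks roughly like the sketch below. The mount path and port come from this project; the function and variable names are illustrative rather than the service's actual code.

```cpp
#include <gst/gst.h>
#include <gst/rtsp-server/rtsp-server.h>

// Sketch: start an RTSP server on port 8554 and mount a shared factory at /stream.
void startRtspServer() {
    GstRTSPServer *server = gst_rtsp_server_new();
    gst_rtsp_server_set_service(server, "8554");  // RTSP port

    GstRTSPMountPoints *mounts = gst_rtsp_server_get_mount_points(server);
    GstRTSPMediaFactory *factory = gst_rtsp_media_factory_new();

    // Reuse one pipeline for every client instead of spawning a new one per connection.
    gst_rtsp_media_factory_set_shared(factory, TRUE);

    gst_rtsp_mount_points_add_factory(mounts, "/stream", factory);
    g_object_unref(mounts);

    // Attach to the default main context; clients can connect to
    // rtsp://<host>:8554/stream once the launch pipeline (shown below) is set.
    gst_rtsp_server_attach(server, NULL);
}
```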
The RTSP server defaults to using UDP transport, but will fall back to TCP if UDP is unavailable. This setup provides optimal performance in most cases, while still offering compatibility with clients that only support TCP. While UDP offers lower latency, one drawback is that the stream may start without an I-frame, resulting in a gray or corrupted screen for a few seconds until a new I-frame arrives.
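No extra code is strictly required for this behavior, since gst-rtsp-server negotiates the transport with each client. If you wanted to constrain the allowed transports explicitly, the media factory exposes a setting for it; an illustrative one-liner:

```cpp
// Illustrative: permit only UDP and TCP-interleaved transports for streams from this factory.
gst_rtsp_media_factory_set_protocols(
    factory, static_cast<GstRTSPLowerTrans>(GST_RTSP_LOWER_TRANS_UDP | GST_RTSP_LOWER_TRANS_TCP));
```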
When the STOP_STREAM command is received, CameraService stops the RTSP stream and releases any associated resources. It also force-closes all existing sessions, as the RTSP server remains active until the service itself is stopped.
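One plausible way to force-close the remaining clients is to walk the server's client list with a filter callback and ask the server to remove each one. The snippet below is a sketch of that general technique, not necessarily how the service implements it:

```cpp
// Filter callback: instruct the server to remove (and thereby close) every connected client.
static GstRTSPFilterResult closeClient(GstRTSPServer *server, GstRTSPClient *client, gpointer user_data) {
    return GST_RTSP_FILTER_REMOVE;
}

void closeAllClients(GstRTSPServer *server) {
    // The returned list only holds clients we explicitly kept a reference to, so it is empty here.
    GList *kept = gst_rtsp_server_client_filter(server, closeClient, nullptr);
    g_list_free_full(kept, g_object_unref);
}
```

The factory's launch pipeline itself is a simple test source: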
```cpp
gst_rtsp_media_factory_set_launch(factory_,
"( videotestsrc is-live=true pattern=ball ! "
"video/x-raw,width=640,height=480,framerate=30/1 ! "
"x264enc tune=zerolatency bitrate=2000 ! "
"rtph264pay name=pay0 pt=96 config-interval=1 )");We use a simple GStreamer pipeline that generates a test video stream to maximize portability for this demo. Manually fine-tuned pipelines can be unstable across machines with varying hardware configurations. The videotestsrc element simulates a live video source, and the ball pattern is used to provide a slightly more visually interesting output than the default test pattern. It also serves as a good way to verify that the shared stream is functioning correctly. Any additional clients that connect using the same client pipeline should appear to be in sync.
- Implement a more flexible video encoding configuration to support various resolutions and bitrates.
  - Adding a name to the `x264enc` element would allow us to set properties like `bitrate` and `tune` dynamically.
- Add security features such as authentication and authorization for the CameraService API.
- Add endpoints for different streaming resolutions and bitrates to allow clients to request specific configurations.
- Allow for specifying a specific port we want to use for the RTSP server, rather than defaulting it to 8554.
  - `startRTSPServer` allows for passing in a port, but we do not expose a way to set this in the API.
Send a POST request to http://localhost:8080/dev/virtualcamera with a JSON body containing a "command" field. Supported commands:
- `INIT`
  Initializes the camera.
  Request: `{ "command": "INIT" }`
  Response: `{ "status": "OK", "message": "Camera initialized" }`

- `START_STREAM`
  Starts the video stream.
  Request: `{ "command": "START_STREAM" }`
  Response: `{ "status": "OK", "message": "Streaming started", "stream_url": "rtsp://localhost:8554/stream" }`
  If the camera is not initialized: `{ "status": "ERROR", "message": "Camera not initialized" }`

- `STOP_STREAM`
  Stops the video stream.
  Request: `{ "command": "STOP_STREAM" }`
  Response: `{ "status": "OK", "message": "Streaming stopped" }`
  If the camera is not initialized: `{ "status": "ERROR", "message": "Camera not initialized" }`

- `STATUS`
  Gets the current status of the camera.
  Request: `{ "command": "STATUS" }`
  Response: `{ "status": "OK", "camera_status": "active/inactive", "streaming": "true/false" }`

Error Responses:

- Unknown command: `{ "status": "ERROR", "message": "Unknown command: <command>" }`
- Missing command field: `{ "status": "ERROR", "message": "Command must contain 'command' field" }`
- Internal error: `{ "status": "ERROR", "message": "Error processing request", "error": "<error details>" }`
EndpointService is a local service that interacts with the CameraService to stream video. It creates and listens on a Unix socket at /tmp/endpointservice.sock to receive commands related to video playback and camera control. It acknowledges command success or failure upon receiving a request. Unix sockets were chosen for IPC between the CLI tool and EndpointService because communication is only needed between processes on the same machine.
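A minimal sketch of listening on such a socket with the standard POSIX API is shown below; only the socket path comes from the project, and the actual EndpointService implementation may differ.

```cpp
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>
#include <cstdio>
#include <cstring>

// Illustrative: create a listening Unix domain socket at /tmp/endpointservice.sock.
int listenOnEndpointSocket() {
    const char *path = "/tmp/endpointservice.sock";

    int fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0) { perror("socket"); return -1; }

    sockaddr_un addr{};
    addr.sun_family = AF_UNIX;
    std::strncpy(addr.sun_path, path, sizeof(addr.sun_path) - 1);

    unlink(path);  // remove a stale socket file left over from a previous run
    if (bind(fd, reinterpret_cast<sockaddr *>(&addr), sizeof(addr)) < 0 || listen(fd, 4) < 0) {
        perror("bind/listen");
        close(fd);
        return -1;
    }

    // Each accepted connection carries one command from the CLI tool (sent via socat).
    return fd;
}
```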
EndpointService is designed to handle a single streaming instance per machine. Only one EndpointService can be active at a time. It uses GStreamer to manage video playback and streaming. When a stream is requested via the play command, it checks whether CameraService is initialized and currently streaming. If it is, EndpointService starts a GStreamer pipeline to play the video stream; otherwise, it returns an error message indicating what went wrong.
The GStreamer class within EndpointService is responsible for managing the GStreamer pipeline and handling video playback. Upon initialization, the client can specify a callback function that will be called when video playback stops. This enables automatic resource cleanup if playback ends unexpectedly—due to a server timeout or internal error. This callback is necessary because GStreamer pipelines cannot be stopped from their own thread; attempting to do so results in a deadlock. Although the main event loop can be stopped, the stream will continue running without proper processing. The callback allows us to handle this scenario gracefully.
The pipeline created by the GStreamer class is monitored by the bus_callback method, which listens for messages from the GStreamer bus. Our implementation handles GST_MESSAGE_EOS (end of stream), GST_MESSAGE_ERROR, and GST_MESSAGE_STATE_CHANGED. Other message types are simply logged by name.
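In outline, that bus watch looks something like the sketch below. The structure follows the description above, but the function and callback names are illustrative rather than the project's exact code.

```cpp
// Illustrative bus watch: handle EOS, ERROR, and STATE_CHANGED; log everything else by name.
static gboolean bus_callback(GstBus *bus, GstMessage *msg, gpointer user_data) {
    switch (GST_MESSAGE_TYPE(msg)) {
        case GST_MESSAGE_EOS:
            g_print("End of stream\n");
            // Hypothetical stop callback so the owner can tear the pipeline down safely.
            // notifyPlaybackStopped(user_data);
            break;
        case GST_MESSAGE_ERROR: {
            GError *err = nullptr;
            gchar *debug = nullptr;
            gst_message_parse_error(msg, &err, &debug);
            g_printerr("Pipeline error: %s\n", err->message);
            g_clear_error(&err);
            g_free(debug);
            // notifyPlaybackStopped(user_data);
            break;
        }
        case GST_MESSAGE_STATE_CHANGED: {
            GstState oldState, newState, pending;
            gst_message_parse_state_changed(msg, &oldState, &newState, &pending);
            g_print("State changed: %s -> %s\n",
                    gst_element_state_get_name(oldState),
                    gst_element_state_get_name(newState));
            break;
        }
        default:
            g_print("Bus message: %s\n", GST_MESSAGE_TYPE_NAME(msg));
            break;
    }
    return TRUE;  // keep the watch installed
}
```

The watch would be installed on the pipeline's bus with `gst_bus_add_watch()`, and the EOS/ERROR branches are where the stop callback described above would fire.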
The isStreaming() method checks if the GStreamer pipeline is currently playing video. This method was added because playVideo() only confirms whether the pipeline and streaming thread were successfully started—it does not wait for the video to actually begin playing. isStreaming() provides a way to check the real-time playback state.
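Checking the live state of a pipeline is typically done by querying the element state. A minimal sketch, assuming a `pipeline_` member, might look like this:

```cpp
// Illustrative: report whether the pipeline has actually reached the PLAYING state.
bool GStreamer::isStreaming() const {
    if (!pipeline_) return false;

    GstState current = GST_STATE_NULL;
    GstState pending = GST_STATE_VOID_PENDING;
    // Zero timeout: return the current state immediately rather than waiting for a transition.
    gst_element_get_state(pipeline_, &current, &pending, 0);
    return current == GST_STATE_PLAYING;
}
```

The client playback pipeline itself is built from a short launch string: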
```cpp
std::string pipeline = "rtspsrc location=" + videoUri +
" latency=100 retry=5 buffer-mode=1 drop-on-latency=true "
"! rtph264depay "
"! avdec_h264 "
"! autovideosink";A very simple pipeline was used for this demo. While more specific pipelines can offer better performance, they are often much less portable across different devices. In this example, we reduce the built-in latency to 100ms, as we don’t have significant overhead and the demo is intended to run on relatively powerful hardware.
One tradeoff of using lower latency and enabling drop-on-latency is that the video may initially appear gray or corrupted for a few seconds. This typically occurs when the I-frame is discarded or hasn’t arrived yet—especially on slower machines where startup delays can cause the decoder to miss the first keyframe.
- Allow for CameraService endpoints to be configured via environment variables or a configuration file, rather than hardcoding them in the code. This would allow users to easily change the CameraService URL without modifying the code.
- `playVideo` could be enhanced to wait until the video is actually playing before returning, rather than returning immediately and using a timeout to check playback status. This would notify the CLI at the exact moment playback starts and avoid false negative responses when video startup exceeds the timeout.
- The EndpointService is currently very manual in its operation. An `autoSetup()` method and command could be added to check the status of the CameraService, automatically initialize the camera, start streaming, and play a default video. This would simplify the setup process for users.
- `init`
  - Description: Initializes the remote camera.
  - Request: `{ "command": "init" }`
  - Response: Success or error message indicating initialization status.

- `play`
  - Description: Starts video playback from the remote camera.
  - Request: `{ "command": "play" }`
  - Response: Success or error message indicating playback status.

- `stop`
  - Description: Stops the current video playback.
  - Request: `{ "command": "stop" }`
  - Response: Success or error message indicating stop status.

- `enable`
  - Description: Enables the camera stream and returns the stream URL if successful.
  - Request: `{ "command": "enable" }`
  - Response: Success message with the stream URL, or an error message if enabling fails.

- `disable`
  - Description: Disables the camera stream.
  - Request: `{ "command": "disable" }`
  - Response: Success or error message indicating disable status.

- `status`
  - Description: Retrieves the current status of the camera and stream.
  - Request: `{ "command": "status" }`
  - Response: Success message with a JSON-encoded status object.
  - Example Response: `{ "camera_status": "active", "streaming": true }`

- If the `"command"` field is missing or unrecognized, the service responds with: `{ "status": "ERROR", "message": "Unknown command" }`
The EndpointCLI tool provides a command-line interface for interacting with the CameraService and EndpointService. It is a shell script that accepts a command-line argument to control EndpointService via a Unix socket. It allows users to control the camera, stream video, and manage the virtual camera device directly from the terminal.
A shell script was chosen as a simple and lightweight way to provide a CLI interface without requiring additional dependencies or setup, allowing focus to remain on the core functionality of CameraService and EndpointService. The script uses socat under the hood to send commands to EndpointService's Unix socket, which in turn communicates with CameraService.
When a command is sent, EndpointService processes it and returns a simple success or error response along with a message. This feedback informs the user whether the command was successful or if an error occurred, and provides details on what went wrong. We increased socat’s default timeout to 2 seconds to account for potential delays in processing commands, particularly during video playback.
Refer to the CLI Script & Usage section above for details on how to use the CLI tool.
- The CLI tool could be enhanced to run persistently and monitor the socket for incoming messages from the EndpointService. This would allow the CLI to provide real-time updates on the status of the video stream and camera service to the client.
- Allow logging levels to be set via environment variables. This would allow users to control the verbosity of logs without modifying the code.
- Add an alias for the CLI tool to the user's shell configuration file (e.g., `.bashrc`, `.zshrc`) to make it more convenient to use, allowing it to be called `camera_control` instead of `EndpointCLI.sh`. This would let users run commands like `camera_control play` from anywhere on the machine instead of running it from the project directory.
RTSP doesn't natively support adaptive bitrate streaming, but there are potential workarounds. If continuing with RTSP, you could implement manual adaptive bitrate streaming by creating a new command to notify the Virtual Camera to adjust its bitrate—dynamically grabbing the encoder and setting a new bitrate via g_object_set(). Bitrate changes are generally safe to perform on the fly, as decoders can typically adapt.
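As a rough illustration of that approach (assuming the encoder in the launch string has been given a name such as `enc`, which the current pipeline does not yet do; see the CameraService future work above), the bitrate change itself could look like this:

```cpp
// Illustrative: adjust the encoder bitrate on a live pipeline.
// Assumes the launch string names the encoder, e.g. "x264enc name=enc tune=zerolatency ...".
void setStreamBitrate(GstElement *pipeline, guint newBitrateKbps) {
    GstElement *enc = gst_bin_get_by_name(GST_BIN(pipeline), "enc");
    if (!enc) return;

    // x264enc exposes "bitrate" in kbit/s; decoders generally adapt to the change on the fly.
    g_object_set(enc, "bitrate", newBitrateKbps, NULL);
    gst_object_unref(enc);
}
```

With gst-rtsp-server, a handle to the live pipeline could be obtained from the factory's "media-configure" signal via gst_rtsp_media_get_element(), which is where a bitrate command arriving over the API could be applied.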
Changing resolution is trickier. GStreamer pipelines aren’t well-suited to mid-stream resolution changes. While it’s possible with renegotiation of caps, users may still experience a brief pause or flicker in the video stream.
Streaming was restricted to using RTSP for this assignment. RTSP is simpler to implement but is slower and less feature-rich than some modern solutions. Future development should consider switching to WebRTC for its lower latency, native adaptive bitrate streaming, and better support for real-time communication.
For (Company Name Redacted)'s use case, based on our discussion, WebRTC would be the optimal solution. In addition to lower latency streams, WebRTC has bidirectional communication capabilities and a data channel. This means we could directly control the gimbal and other camera features without needing to integrate additional services.