Enhanced live classroom experience at scale with the WebRTC-HLS stack

July 31, 2023 · 6 min read



Live classrooms have become the norm for learning since the pandemic. Typically, a teacher presents the class while students listen and ask questions. However, this is often not the case in countries like India, where students from tier 2 and tier 3 cities face network issues. In this blog, we will talk about how 100ms solves this problem for large EdTech classrooms by using HLS and WebRTC together.

Problem with the current stack

When it comes to live classrooms, WebRTC is commonly considered the optimal solution. WebRTC is a real-time communication protocol that enables users to interact with audio-video capabilities while maintaining low latency. This is the same technology that powers one-on-one video calls and video conferences in apps like Google Meet, Skype,…

While WebRTC might seem like a no-brainer answer to EdTech live classrooms, it is not without its own flaws. There are mainly two cases where audio-video issues start to become visible:

  1. Bad network: When the users on the call are experiencing a poor network connection, which is super common in countries like India.

  2. High Scale: When there are too many users on the call.

In the context of live classrooms, scalability aside, users on a bad network repeatedly face issues such as blurred video content, skipped or black frames, and choppy audio:

  1. Blurred content on video - The text content and figures being shown during classes look blurred and the overall video quality is subpar.

  2. Skipped or Black frames - The video appears stuck, fast-forwards, or suddenly goes black during the class.

  3. Choppy Audio - Unclear or breaking audio.

Some apps, like Zoom, use a modified version of WebRTC. However, they still suffer from the same problems that come with WebRTC and, in some cases, a few extra ones.

A tale of latency and quality

Before we move on further, it’s essential to understand the two key factors to consider when selecting a technology for live video use cases: latency and audio/video quality.

  1. Latency: Latency refers to the time it takes for media to travel from the source to the receiver. Low latency is essential for real-time communication, enabling seamless interaction between participants.

  2. Audio/video quality: Video quality is determined by factors such as resolution, frame rate, and bitrate. Higher quality visuals contribute to a more engaging and immersive learning experience.

An ideal solution would offer both low latency and high video quality. In practice, however, there is often a trade-off: you either prioritize low latency and sacrifice video quality to keep communication real-time, or you focus on delivering high video quality and accept increased latency.

WebRTC - low latency, subpar quality

WebRTC, designed for two-way communication, aims to minimize latency. However, this focus on low latency can result in compromises in video quality. Some examples of how WebRTC compromises on video quality include:

  1. Packet loss and frame drops: Network limitations, such as limited bandwidth or high latency, can lead to packet loss and dropped frames. These issues can cause a degraded video experience, impacting the clarity and smoothness of the visuals.

  2. Adaptive bitrate streaming challenges: WebRTC employs adaptive bitrate streaming to adjust video quality dynamically based on network conditions. However, the adaptive streaming algorithm may not respond quickly enough to network changes, leading to temporary compromises in video quality.

  3. Reduced resolution and compression artifacts: In low-bandwidth situations, WebRTC may dynamically adjust the bitrate and compression settings of the video stream. This can reduce the video resolution and introduce compression artifacts, which lowers the level of detail and the overall quality of the video.
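
As a rough illustration of the third point, the sketch below picks the highest quality level that fits within an estimated bandwidth. The ladder values, the `headroom` factor, and the `pick_rung` helper are hypothetical and are not taken from WebRTC or the 100ms SDK. Real WebRTC stacks adjust bitrate continuously from congestion-control feedback rather than from a fixed table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rung:
    label: str
    height: int        # vertical resolution in pixels
    bitrate_kbps: int  # target video bitrate


# Hypothetical quality ladder, ordered from highest to lowest quality.
LADDER = [
    Rung("720p", 720, 1500),
    Rung("360p", 360, 600),
    Rung("180p", 180, 200),
]


def pick_rung(estimated_kbps: float, headroom: float = 0.8) -> Rung:
    """Return the best rung whose bitrate fits in a fraction of the estimate."""
    budget = estimated_kbps * headroom
    for rung in LADDER:
        if rung.bitrate_kbps <= budget:
            return rung
    return LADDER[-1]  # nothing fits: fall back to the lowest rung


for kbps in (3000, 900, 150):
    print(kbps, "kbps ->", pick_rung(kbps).label)
```

On a weak connection the sender ends up on the lowest rung, which is where text and figures on a shared screen start to look blurred.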

WebRTC as a protocol was not designed for scalability, but rather for peer-to-peer communication. It was later adapted to support multiple peers in a call. Therefore, scaling a WebRTC call is not an easy task and, after a certain point, it’s just a question of making further compromises.
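
To get a feel for why peer-to-peer does not scale on its own, here is a small sketch that counts media streams under the simplifying assumption of a pure mesh, where every participant sends directly to every other participant. The `mesh_stream_counts` helper is hypothetical, and production systems typically route media through a server instead, so treat the numbers as an upper-bound illustration only.

```python
def mesh_stream_counts(participants: int) -> tuple[int, int]:
    """Return (uploads per participant, total streams) for a full mesh."""
    if participants < 2:
        return 0, 0
    uploads_per_participant = participants - 1
    total_streams = participants * uploads_per_participant
    return uploads_per_participant, total_streams


for n in (2, 10, 100):
    uploads, total = mesh_stream_counts(n)
    print(f"{n} participants: {uploads} uploads each, {total} streams in total")
```

The total grows quadratically with the number of participants, which is far beyond what a student on a weak connection can upload.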

HLS - high latency, high quality

In live streaming use cases where the users are only on the receiving end, the most popular technology is HLS. It is the same technology that powers Super Bowl and IPL match live streams at scale. Unlike WebRTC, this comes with a latency of ~8 seconds but offers the best video quality possible. The latency, however, can be further reduced through optimizations or by moving to LL-HLS at a later stage. HLS offers certain advantages over WebRTC in terms of video quality:

  1. No Frame or Packet Loss: HLS is designed for reliable delivery. It runs over HTTP, whose underlying transport retransmits lost packets instead of skipping them. It also divides the content into small, individual files (segments) that viewers download independently, which minimizes the risk of losing a video frame or packet during streaming.
  2. Content caching and CDN support: HLS leverages standard HTTP infrastructure and is compatible with Content Delivery Networks (CDNs). CDNs can cache HLS segments, reducing the load on the origin server and improving video delivery performance. This optimized content delivery contributes to better video quality and reduces the potential impact of network congestion.
  3. Scalability: Through the use of CDNs, HLS can efficiently deliver content to a large audience (on the scale of millions) without straining server resources.
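
The ~8 second latency mentioned above can be reasoned about with some back-of-the-envelope arithmetic. The sketch below assumes latency is roughly the segment duration multiplied by the number of segments the player buffers, plus a fixed overhead for encoding and delivery. The specific numbers and the `estimate_hls_latency` helper are illustrative assumptions, not 100ms measurements.

```python
def estimate_hls_latency(
    segment_seconds: float,
    buffered_segments: int,
    pipeline_overhead_seconds: float,
) -> float:
    """Approximate glass-to-glass latency of an HLS stream in seconds."""
    return segment_seconds * buffered_segments + pipeline_overhead_seconds


# Assumed: 2-second segments, 3 segments buffered, 2 seconds of overhead.
print(estimate_hls_latency(2.0, 3, 2.0))

# Assumed: shorter 1-second segments with the same buffer and overhead.
print(estimate_hls_latency(1.0, 3, 2.0))
```

Under these assumptions, shortening the segments is one of the optimizations that brings the latency down.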

In the next section, we will explore how 100ms utilizes HLS and WebRTC technologies simultaneously to address the network issues students face in large EdTech classrooms, bringing the best of both worlds.

Enhancing live classroom experiences

Let’s consider a more detailed example of a live classroom with a tutor and a group of students. By default, the students are muted and can only interact through chat messages. If necessary, they can raise their hand and, once accepted by the tutor, come “on stage” with audio and video enabled. When they have finished interacting, the tutor can move them “off stage” to rejoin the rest of the students. There are three types of users in the call:

  1. Tutor - The tutor who has admin access over the entire call, with audio and video capabilities.

  2. Students - Students who can listen to the tutor and interact through chat messages, raise their hand, and so on.

  3. On-stage participants - Students who have audio and video capabilities to interact with the tutor in near real-time.
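
The three user types can be written down as a small permissions table. This is a minimal sketch with hypothetical names (`Role`, `ROLES`, `Transport`), not the 100ms role configuration. The transport column anticipates the setup described in the next section, and the chat permissions for the tutor and on-stage participants are assumptions.

```python
from dataclasses import dataclass
from enum import Enum


class Transport(Enum):
    WEBRTC = "webrtc"
    HLS = "hls"


@dataclass(frozen=True)
class Role:
    name: str
    transport: Transport
    can_publish_av: bool   # may send audio and video
    can_chat: bool
    can_raise_hand: bool
    is_admin: bool


ROLES = {
    # Tutor: admin over the whole call, publishes audio and video.
    "tutor": Role("tutor", Transport.WEBRTC, True, True, False, True),
    # Student: watches and listens, chats, and can raise a hand.
    "student": Role("student", Transport.HLS, False, True, True, False),
    # On-stage participant: a student temporarily given audio and video.
    "on_stage": Role("on_stage", Transport.WEBRTC, True, True, False, False),
}

publishers = [name for name, role in ROLES.items() if role.can_publish_av]
print(publishers)
```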

Live classroom with WebRTC and HLS

The tutor joins over a WebRTC connection, which is then live streamed over HLS. By default, the students join as HLS viewers, much like watching a class on a YouTube Live stream in low-latency mode. The only difference in this setup is that students watching the live stream can become part of the stream and interact with the tutor anytime they want.

Any number of students can raise their hand and become on-stage participants, moving from HLS to the WebRTC connection for direct interaction with the tutor. During this transition, the on-stage participant skips 8 seconds of the live stream. On-stage participants become part of the HLS stream alongside the tutor, so the remaining students can see them. Once their interaction concludes, they can go back to being HLS viewers. Because the HLS stream is 8 seconds behind, it would replay the last 8 seconds of their own time on stage, so a timer can be displayed for that period instead.

Here’s the on-stage participant flow:

  1. At time T, the student is watching the HLS stream, while the tutor is at time T+8 seconds in the WebRTC connection.
  2. At time T+8 seconds, the tutor asks students to raise their hands if they have any doubts. Students have an 8-second window to raise their hand.
  3. At time T, a student raises their hand, and the tutor is notified.
  4. Upon the tutor's approval, the student becomes an on-stage participant, transitioning from HLS to the WebRTC connection.
  5. The student, who was at time T (in the HLS stream), moves to the WebRTC connection at T+8 seconds. In other words, the student skips 8 seconds of the stream and jumps ahead to the tutor's time of T+8 seconds.
  6. The student who became the on-stage participant engages in interaction with the tutor, and once finished, returns to being an HLS viewer.
  7. When the on-stage participant reverts to being an HLS viewer, they jump back to time T, aligned with the other students. The student then waits 8 seconds before continuing the live stream, to avoid watching their own time on stage.
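
The flow above can be sketched as a tiny model of a viewer switching between HLS and WebRTC. The `Viewer` class and its method names are hypothetical and do not reflect the 100ms SDK. The only value taken from the article is the 8-second HLS delay.

```python
from dataclasses import dataclass

HLS_DELAY_SECONDS = 8.0  # latency figure quoted in the article


@dataclass
class Viewer:
    mode: str = "hls"  # "hls" or "webrtc"

    def position(self, live_time: float) -> float:
        """Stream time the viewer sees when the tutor is at live_time."""
        if self.mode == "hls":
            return live_time - HLS_DELAY_SECONDS
        return live_time

    def go_on_stage(self, live_time: float) -> float:
        """Switch to WebRTC and return the seconds of stream skipped."""
        skipped = live_time - self.position(live_time)
        self.mode = "webrtc"
        return skipped

    def go_off_stage(self, live_time: float) -> float:
        """Switch back to HLS and return how long to show the wait timer."""
        self.mode = "hls"
        return live_time - self.position(live_time)


student = Viewer()
print(student.position(100.0))      # watching HLS, behind the tutor
print(student.go_on_stage(100.0))   # seconds skipped when joining the stage
print(student.go_off_stage(160.0))  # seconds to wait after leaving the stage
```

The timer duration equals the HLS delay, because the stream the student returns to still shows content they have already experienced live.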

Next Steps

Are you interested in building a live classroom using the WebRTC+HLS setup, but don't want to worry about protocols and optimizations? Check out 100ms SDKs, with built-in support for both WebRTC and HLS. The best part is, switching between WebRTC and HLS is as simple as making a function call.

Feel free to book a call with us to understand more.