Computer Architecture: QoS (Quality of Service)



Just 30 years ago, in the early 1980s, Quality of Service applied mainly to phone calls, both local and international.

Local phone calls in a small town have little traffic to contend with. International calls, on the other hand, suffer time delays that introduce latency into the call.

However, local phone calls in a metropolitan city with millions of neighbors can be delayed, dropped, or even disconnected because of reduced bandwidth and fluctuating priorities in the telephone switch network. These networks have limited bandwidth and must constantly manage phone calls. Thus, one needs to “adjust” calls or packets according to priorities.


The same principle applies to computers, where multiple devices want access to a limited resource: the memory. The bottleneck is that all the components of a computer want to communicate with each other, and the common area they use to share data is the main memory. Technically it is called dynamic random access memory, or DRAM for short, a type of technology that packs the most memory into a given physical space.

Without a common area, the devices could not stay synchronized with each other. (Keeping them synchronized is another topic, involving data coherency and semaphores.) Access to the main memory is managed by two main blocks: the switch fabric and the arbiter.



How to solve bandwidth limitations and latency issues?

The same principle as a phone call applies to computers: we want to maximize bandwidth and minimize latency so that we can have more calls and less delay, respectively. This is done with the mechanisms of QoS (Quality of Service).

But before we describe these mechanisms, we need to understand the devices that want access to the switch fabric and main memory. Devices requesting service are called “Initiators”. This makes sense, since the device is the one that wants to make the call and initiates the dialing.

Initiator Types

Real-time: Devices that capture digital photos require real-time access to memory to process and store the captured image. Any limitation of bandwidth or added latency could result in a slightly corrupted or distorted image. Similarly, the display refresh for your smartphone LCD needs immediate access to memory, or else you might see portions of the display glitch. In the accompanying photo, the black dots indicated by the red arrow are a visual artifact that shows up in a single display frame, which lasts less than the blink of an eye. This is not acceptable. (The photo is used for non-commercial purposes.)


Latency-sensitive: The CPU works best when it is not delayed by latency in accessing the main memory; otherwise “bubbles” will occur in the processor pipeline. Bubbles are time periods when nothing is being processed in the pipeline.


Best-effort: Devices such as portable hard drives cause no adverse visual effect if their data is not transferred at maximum speed. It just means the user may be delayed by an extra second or tens of seconds, which is tolerable.

QoS mechanisms

  • Traffic control
  • Topology
  • Priority mechanisms

Traffic control is further segmented into these concepts:

  • Splitting breaks up the bursts from the initiators into smaller packets.  One large burst would otherwise cause delays or increased latency for the other devices.
  • Shaping limits the number of packets that can be pending.
  • Pending transactions: a limit on the number of transactions that can be pending at once (a transaction can have multiple packets).
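
Splitting and the pending-transaction limit can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `split_burst` and `PendingLimiter`, and the idea of measuring a burst in "beats", are hypothetical and do not come from any particular interconnect.

```python
def split_burst(burst_len, max_packet_len):
    """Split one burst of `burst_len` beats into packets of at most
    `max_packet_len` beats each (hypothetical helper for illustration)."""
    if burst_len < 0 or max_packet_len <= 0:
        raise ValueError("burst_len must be >= 0 and max_packet_len > 0")
    packets = []
    remaining = burst_len
    while remaining > 0:
        size = min(remaining, max_packet_len)
        packets.append(size)
        remaining -= size
    return packets


class PendingLimiter:
    """Allows at most `max_pending` transactions to be in flight at once."""

    def __init__(self, max_pending):
        self.max_pending = max_pending
        self.in_flight = 0

    def try_issue(self):
        # Refuse a new transaction while the initiator is at its limit.
        if self.in_flight >= self.max_pending:
            return False
        self.in_flight += 1
        return True

    def complete(self):
        # Called when a response returns, freeing one slot.
        if self.in_flight == 0:
            raise RuntimeError("no transaction in flight")
        self.in_flight -= 1
```

With these assumptions, `split_burst(10, 4)` yields packets of 4, 4 and 2 beats, so another initiator can be served between packets instead of waiting for all 10 beats.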


Topology is further segmented into these concepts:

  • Concurrency involves duplicating logic in a crossbar-switch-like structure so that transfers can proceed on parallel paths.
  • Arbitration levels can be flat, so that all paths have equal access, or tiered (layered) to favor certain paths and reduce access for others.
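
The difference between flat and tiered arbitration can be sketched as follows. This is a simplified model, not a description of any specific arbiter: it assumes flat arbitration is done round-robin and that a tiered arbiter serves the highest tier with an active request, rotating within that tier. The function names are hypothetical.

```python
def round_robin_pick(requests, last_granted):
    """Flat arbitration: grant the next requesting port after `last_granted`,
    so every port gets equal access over time."""
    n = len(requests)
    for offset in range(1, n + 1):
        port = (last_granted + offset) % n
        if requests[port]:
            return port
    return None  # nobody is requesting


def tiered_pick(requests, tiers, last_granted):
    """Tiered arbitration: `tiers` lists port numbers from the highest tier
    to the lowest. The first tier with an active request wins; ports inside
    that tier are rotated in round-robin order."""
    for tier in tiers:
        active = [p for p in tier if requests[p]]
        if active:
            later = [p for p in active if p > last_granted]
            return min(later) if later else min(active)
    return None
```

For example, with ports 0, 2 and 3 requesting and port 0 granted last, the flat arbiter grants port 2 next, while a tiered arbiter with `tiers=[[3], [0, 1, 2]]` grants port 3 because it sits alone in the top tier.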




  • Local buffers can absorb traffic bursts from the initiator and cope with periods of reduced bandwidth.
  • FIFOs are best used where bandwidth is reduced, such as when a faster clock domain transfers data to a slower clock domain.
  • Rate adapters are used to avoid long wait cycles when a slower clock domain transfers data to a faster clock domain.
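
To see why a FIFO helps when a faster clock domain feeds a slower one, here is a rough simulation of how full the FIFO gets during one burst. It is a sketch under simplifying assumptions: the writer pushes one word per write-clock cycle, the reader pops one word per read-clock cycle, and synchronizer delays are ignored. The function name and parameters are hypothetical.

```python
def peak_fifo_occupancy(burst_len, write_mhz, read_mhz):
    """Return the highest number of words held in the FIFO while a burst of
    `burst_len` words is written at `write_mhz` and drained at `read_mhz`."""
    occupancy = 0
    peak = 0
    credit = 0  # reader progress, in units of 1/write_mhz of a word
    for _ in range(burst_len):
        occupancy += 1                 # writer pushes one word this cycle
        peak = max(peak, occupancy)
        credit += read_mhz             # reader advances by read/write words
        if credit >= write_mhz:        # reader has completed one full cycle
            credit -= write_mhz
            occupancy -= 1             # reader pops one word
            if occupancy == 0:
                credit = 0             # an idle reader cannot bank progress
    return peak
```

In this model, an 8-word burst written at 200 MHz and read at 100 MHz peaks at 5 words, so a shallower FIFO would force the writer to stall. With equal clocks the FIFO never holds more than one word.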

Priority Mechanisms

  • Priority within the packets
  • Sideband priority at the interface channel, which indicates the overall priority level of the pending packets and notifies the downstream arbiter of increased priority.
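
The two priority mechanisms can be combined in a small model. This is only a sketch: it assumes the arbiter uses the higher of the in-packet priority and the sideband level as the effective priority, which is one plausible policy rather than a documented one. The `Port` fields and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Port:
    name: str
    packet_priority: int        # priority carried inside the head packet
    sideband_urgency: int = 0   # level signalled beside the channel


def effective_priority(port):
    # Assumption: the sideband level can raise, but never lower, the
    # priority the arbiter sees for this port.
    return max(port.packet_priority, port.sideband_urgency)


def pick(ports):
    """Grant the port with the highest effective priority
    (the first such port wins a tie)."""
    return max(ports, key=effective_priority)
```

For example, a display port whose packets carry priority 1 loses to a CPU port at priority 2, but wins once it raises its sideband level to 3, say because its buffer is about to run empty.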


