We have a complicated radio that loses a very small number of frames in a part of the system that's effectively a black box. I was looking for correlations with the loss over time and eventually was able to build a graphical representation of the timing and data flow that makes patterns in the loss really easy to see. The first few causes of the problem fell away rather easily and were visible as stripes in time and frequency, but the more complicated ones have taken some extra effort. The novelty came from thinning the data down and presenting the schedules (which change a lot) without hiding any of the main event, which is the frame loss. I also have kind of a neat chart that shows what the probability distribution looks like when various classes of error are considered alone. It makes good use of whatever he calls it when you can visually compare complex patterns without needing as much context for successive figures. All of the time-domain plots have a few things in the background that show overall traffic through the radio, measured both internally and externally, so it's easy to see correlations with load and the like.
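To give a rough idea of the "stripes in time and frequency" part, here is a minimal sketch, not the actual tooling. It assumes each lost frame is logged as a hypothetical `(timestamp_seconds, channel_index)` pair; the function names, bin sizes, and the 3x-median threshold are all made up for illustration.

```python
def loss_grid(events, t_bin, n_t, n_f):
    """Count lost frames per (time bin, frequency channel) cell.

    events: iterable of (timestamp_seconds, channel_index) pairs (assumed format).
    """
    grid = [[0] * n_f for _ in range(n_t)]
    for t, ch in events:
        ti = int(t // t_bin)
        if 0 <= ti < n_t and 0 <= ch < n_f:
            grid[ti][ch] += 1
    return grid


def stripes(grid, factor=3.0):
    """Return (time_rows, freq_cols) whose totals exceed factor * median total."""
    row_totals = [sum(row) for row in grid]
    col_totals = [sum(col) for col in zip(*grid)]

    def hot(totals):
        median = sorted(totals)[len(totals) // 2]
        # Floor the median at 1 so an all-quiet background still has a threshold.
        threshold = factor * max(median, 1)
        return [i for i, v in enumerate(totals) if v > threshold]

    return hot(row_totals), hot(col_totals)


# Made-up data: channel 2 drops 5 frames in every one-second bin,
# plus a single stray loss on channel 5.
events = [(t + 0.5, 2) for t in range(10)] * 5 + [(3.2, 5)]
grid = loss_grid(events, t_bin=1.0, n_t=10, n_f=8)
time_stripes, freq_stripes = stripes(grid)
```

A hot column is a stripe in frequency (one channel losing frames across all time), and a hot row is a stripe in time (one interval losing frames across all channels). Plotting the grid as an image makes the same thing visible by eye.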
Some of the little things are fun because there are five or six different algorithms interacting - TCP or video flow control, then weighted fair queuing and QoS on the router, then flow control to the router and multiple-path routing choices, then link-level ARQ, then framing and bus access, etc. Sometimes there's coupling in ways that are totally unexpected, and it leads to all sorts of neat problems when you're trying to get the last 10% of performance.
-Peter