Investigation: RobotUnit publishing performance
@paus.fabian and I had a look at the `RobotUnitModulePublisher`.

At the moment, the `RobotUnitModulePublisher` appears to do the following in `publishUnitUpdates()`:
- For each sub unit:
  - call `update()` on the sub unit (blocking):

```cpp
for (const RobotUnitSubUnitPtr& rsu : UnitsAttorneyForPublisher::GetSubUnits(this))
{
    if (rsu && rsu->getObjectScheduler() && rsu->getProxy())
    {
        const auto begInner = TimeUtil::GetTime(true);
        rsu->update(controlThreadOutputBuffer, activatedControllers);
        const auto endInner = TimeUtil::GetTime(true);
        timingMap["UnitUpdate_" + rsu->getName()] =
            new TimedVariant{TimestampVariant{endInner - begInner}, lastControlThreadTimestamp};
    }
}
```
The sub units perform (multiple) network calls, sometimes while holding a local data mutex (e.g., the platform and kinematic sub units). The Ice calls made in the `update()` functions are always synchronous (to IceStorm) and not always batched (the kinematic unit, for example, is batched). The fact that only the kinematic unit is batched may indicate that this problem has occurred in the past as well, but was only addressed in one place.
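For illustration, a minimal sketch of how batching works in Ice; the proxy parameter is a placeholder, and `ice_ping()` stands in for the generated Slice operations a real sub unit would invoke:

```cpp
#include <Ice/Ice.h>

// Sketch only: 'topic' stands for whatever publisher proxy a sub unit uses.
void publishBatched(const Ice::ObjectPrx& topic)
{
    // Invocations on a batch-oneway proxy are buffered client-side
    // instead of being sent immediately.
    Ice::ObjectPrx batched = topic->ice_batchOneway();

    batched->ice_ping(); // queued locally, no wire traffic yet
    batched->ice_ping(); // queued as well

    // A single network message carries all queued invocations.
    batched->ice_flushBatchRequests();
}
```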
This could potentially be improved by:

- Always batching the network calls.
- Running the network call asynchronously, i.e. using `begin_ice_flushBatchRequests()`. The resulting async result should be kept and checked on the next iteration. At the least, we could check whether the previous network call is done and, if not, emit a warning.
The updates of all sub units would then run in parallel instead of sequentially one after another, so their durations would no longer add up within a single update (think 3 × 50 ms × number of sub units).
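A minimal sketch of this idea, assuming the Ice C++98 AMI mapping; the member name `lastFlush` and the warning mechanism are placeholders:

```cpp
#include <Ice/Ice.h>
#include <iostream>

// Sketch only: 'lastFlush' would be a member of the publisher,
// kept across update iterations.
Ice::AsyncResultPtr lastFlush;

void flushAsync(const Ice::ObjectPrx& batchedProxy)
{
    // Check whether the flush started in the previous iteration has finished.
    if (lastFlush && !lastFlush->isCompleted())
    {
        // The previous network call is still in flight; warn instead of
        // blocking the publisher thread (real code would use ARMARX_WARNING).
        std::cerr << "Previous batch flush has not completed yet" << std::endl;
        return;
    }
    // Start the next flush without waiting for it to complete.
    // (Eventually, end_ice_flushBatchRequests() should be called on the
    // result to surface any transport errors.)
    lastFlush = batchedProxy->begin_ice_flushBatchRequests();
}
```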
The RobotUnitModulePublisher
seems to log the timing information for each sub unit (using the format "UnitUpdate_" + rsu->getName()
) in the robot unit observer. (The observer queue, filled by robotUnitObserver->offerOrUpdateDataFieldsFlatCopy_async()
, may also fill up, but this is not tracked at the moment.)
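The fill level could be made trackable, e.g. with a counter of outstanding updates. This is a purely hypothetical sketch (names and threshold invented), independent of the concrete observer API:

```cpp
#include <atomic>

// Hypothetical: incremented when an async observer update is started,
// decremented when its completion callback fires.
std::atomic<int> pendingObserverUpdates{0};

void onObserverUpdateStarted()   { ++pendingObserverUpdates; }
void onObserverUpdateCompleted() { --pendingObserverUpdates; }

bool observerQueueFillingUp()
{
    // Arbitrary threshold: a steadily growing count means the observer
    // cannot keep up with the publishing rate of the control thread.
    return pendingObserverUpdates.load() > 10;
}
```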
FYI @dreher @fabian.reister