Foreword

This project aims to enable robots to effectively control their gaze by implementing a biologically inspired gaze control system. The approach is grounded in neuroscience research, particularly in the study of visual attention mechanisms. The foundational concept originates from computational models of visual attention, as introduced in the seminal work:

Itti, L., Koch, C., & Niebur, E. (1998). A model of saliency-based visual attention for rapid scene analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(11), 1254-1259.

This paper proposed a model in which attention is guided by a combination of bottom-up (stimulus-driven) and top-down (task-driven) processes. The central idea is that gaze selection emerges from the interplay of spontaneous exploration, salient stimuli, and task-specific objectives.

In this project, we implement a similar multi-layered system. Gaze targets are categorized into one of three layers:

  • Random: This layer represents spontaneous gaze movements that are not influenced by external stimuli or specific tasks. It mimics natural exploratory behavior, allowing the robot to scan its environment without a predefined objective.
  • Stimulus-based: This layer focuses on gaze targets that are driven by external stimuli. These targets are selected based on their saliency, such as bright colors, sudden movements, or other visually prominent features in the environment.
  • Task-driven: This layer prioritizes gaze targets that are directly related to the robot's current task or objective. These targets are determined by the specific requirements of the task, ensuring that the robot's gaze aligns with its operational goals.

Additionally, within each layer, gaze targets are assigned both a priority and a duration. The priority determines the importance of a target relative to others within the same layer, while the duration specifies how long the target remains active. A central component, the GazeScheduler, dynamically evaluates all available gaze targets and selects the one with the highest priority for execution. This ensures that the robot's gaze is always directed toward the most relevant target based on the current context.

Publications

This work is described in the following publications:


List of papers that build upon this work.

Terms and Concepts

Gaze targets

A gaze target consists of the following main properties:

  • Name: A string representing the name of the gaze target.
  • Position: The spatial position of the gaze target. It can be specified either in the global frame or relative to the robot or to an object in the scene.
  • Priority: Defines the layer of the gaze target (random, stimulus-based, or task-driven) and its priority within that layer (a scalar value; the higher the value, the more important the target).
  • Duration: Specifies how long the gaze target remains active.
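
As a rough illustration of these properties, the following C++ sketch groups them into a single data structure. The type and field names (and the ordering of the layers in the enum) are assumptions made for this example only, not the actual types of this package:

```cpp
#include <chrono>
#include <string>

#include <Eigen/Core>

// Illustrative sketch only; names and types are assumptions, not this package's API.

// The three layers a gaze target can belong to (assumed lowest to highest precedence).
enum class GazeTargetLayer
{
    Random,         // spontaneous exploration without external trigger
    StimulusBased,  // driven by salient external stimuli
    TaskDriven      // required by the robot's current task
};

struct GazeTarget
{
    std::string name;                       // human-readable name of the target
    Eigen::Vector3f position;               // spatial position of the target
    std::string frame = "Global";           // reference frame: global, a robot node, or an object
    GazeTargetLayer layer;                  // layer the target belongs to
    float priority = 0.0f;                  // importance within the layer (higher = more important)
    std::chrono::milliseconds duration{0};  // how long the target stays active
};
```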

Gaze scheduler

The GazeScheduler is the central component responsible for managing and prioritizing gaze targets. It receives gaze targets from multiple sources through the memory system, and dynamically schedules them based on a strict priority system.

The selected gaze target will then be sent to a robot-specific gaze controller.
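
As a minimal sketch of the strict-priority selection described above, and building on the hypothetical GazeTarget type from the previous example, the function below picks the single target to execute. It assumes that task-driven targets always outrank stimulus-based ones, which in turn outrank random ones, and that the scalar priority only decides between targets in the same layer; this ordering is an assumption for illustration, not a statement about the actual GazeScheduler implementation.

```cpp
#include <algorithm>
#include <optional>
#include <utility>
#include <vector>

// Illustrative sketch of strict-priority selection; not the actual GazeScheduler code.
std::optional<GazeTarget> selectActiveTarget(const std::vector<GazeTarget>& targets)
{
    // Rank a target by (layer, in-layer priority); a higher tuple compares greater.
    const auto rank = [](const GazeTarget& t)
    { return std::make_pair(static_cast<int>(t.layer), t.priority); };

    const auto it = std::max_element(targets.begin(), targets.end(),
                                     [&](const GazeTarget& a, const GazeTarget& b)
                                     { return rank(a) < rank(b); });

    if (it == targets.end())
    {
        return std::nullopt;  // no active targets: fall back to the robot's idle gaze
    }
    return *it;
}
```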

Gaze controller

The robot-specific gaze controller receives a gaze target as either a global position or a position relative to a robot node. It then uses an Inverse-Kinematics-based approach to control the robot's gaze towards the target.

Gaze control strategies

The following strategies are currently implemented for controlling the robot's gaze:

  • Atan2: This strategy calculates the pitch and yaw angles required for the robot's head to focus on a target; a rough sketch follows after this list. It is suitable for robots whose head kinematics align with these two joints, allowing direct usage.

  • GazeIK: A general-purpose solution that leverages inverse kinematics (IK) based on a virtual linkage model. For more details, refer to the implementation in Simox.

  • Hemisphere: An extension of the Atan2 strategy, specifically designed for a hemisphere joint configuration. In this setup, the two degrees of freedom control the pitch and roll of the robot's head. To maintain an upright head posture, roll is constrained to zero, and both degrees of freedom are controlled by the same value.
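
As a rough sketch of the Atan2 strategy only (not this package's implementation), the function below derives yaw and pitch angles for a target position already expressed in the gaze frame. The axis convention assumed here (x forward, y to the left, z up) is an assumption and will differ between robots:

```cpp
#include <cmath>

struct PanTilt
{
    float yaw;    // rotation left/right towards the target (rad)
    float pitch;  // rotation up/down towards the target (rad)
};

// Illustrative Atan2 gaze strategy: assumes the target is given in the gaze frame
// with x pointing forward, y to the left, and z up.
PanTilt gazeAnglesAtan2(float x, float y, float z)
{
    PanTilt angles;
    angles.yaw = std::atan2(y, x);
    angles.pitch = std::atan2(z, std::sqrt(x * x + y * y));
    return angles;
}
```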

Structure of this Package

On the code level, this package contains the following libraries and components:

Libraries

Core libraries:

Generating gaze targets:

Client API to interface with GazeScheduler:

Access of functionality through skills:

Components

Core components

Gaze target providers

Skills

Examples

Skills

view_selection_skill_provider

The following skills are implemented in the library skills and made available through the view_selection_skill_provider.

Basic skills

Skill Name           Parameters         Description
LookAt               ARON, header, ...  Allows setting a gaze target.
SetCustomGazeTarget  None               Allows setting a custom gaze target. TODO: check how it differs from LookAt.
ResetGazeTargets     None               Removes all gaze targets from the GazeScheduler. Only the idle target (e.g., looking ahead) will remain.

Skills for predefined directions:

Skill Name        Parameters  Description
LookUp            None        Directs the robot's gaze upward.
LookDown          None        Directs the robot's gaze downward.
LookDownstraight  None        Directs the robot's gaze straight down.
LookLeft          None        Directs the robot's gaze to the left.
LookRight         None        Directs the robot's gaze to the right.
LookAhead         None        Directs the robot's gaze straight ahead.

Skills for scene exploration (looking around):

Skill Name                       Parameters  Description
LookForObjects                   None
LookForObjectsWithKinematicUnit  None        Searches for objects using the robot's kinematic unit. This skill is deprecated.
ScanLocation                     None
ScanLocationsForObject           None

Focusing on specific elements in the scene:

Skill Name                    Parameters  Description
LookAtArticulatedObjectFrame  None        Directs the robot's gaze to a specific frame of an articulated object.
LookAtObject                  None        Directs the robot's gaze to a specific object.
LookAtHumanFace               None        Directs the robot's gaze to a human's face.
LookAtHumanHand               None        Directs the robot's gaze to a human's hand.

Usage Example

The following examples show how the client API can be used to send gaze targets to the GazeScheduler.

GazeScheduler client example

Code: component scheduler_example.

It shows how different gaze targets are scheduled and executed by the robot.
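
For the general flow only, the snippet below sketches how a client could construct a target and hand it to the scheduler. The GazeSchedulerClient interface and its submitTarget method are hypothetical names introduced for this illustration; refer to the scheduler_example component for the actual client API.

```cpp
#include <chrono>

// Hypothetical client interface, for illustration only; see the scheduler_example
// component for the real client API.
struct GazeSchedulerClient
{
    virtual void submitTarget(const GazeTarget& target) = 0;
    virtual ~GazeSchedulerClient() = default;
};

void lookAtCup(GazeSchedulerClient& client)
{
    GazeTarget target;
    target.name = "cup_on_table";
    target.position = Eigen::Vector3f(650.0f, -120.0f, 900.0f);  // assumed global position
    target.frame = "Global";
    target.layer = GazeTargetLayer::TaskDriven;
    target.priority = 10.0f;                    // outranks lower-priority task-driven targets
    target.duration = std::chrono::seconds(5);  // the target expires after five seconds

    client.submitTarget(target);  // hypothetical call handing the target to the scheduler
}
```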

Object tracking example

It queries the object memory for specific known objects and directs the robot's gaze toward them. The gaze dynamically adjusts as the objects' positions change, ensuring continuous focus on the targets.
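
No code reference is listed for this example yet. As a sketch of the idea, using the same hypothetical types as above, a tracking loop can periodically re-query the object's position and re-submit the gaze target with a short duration, so the scheduler always holds an up-to-date target; the queryObjectPosition callback below is an illustrative stand-in for the actual object-memory query.

```cpp
#include <chrono>
#include <functional>
#include <optional>
#include <string>
#include <thread>

// Hypothetical tracking loop; the memory query and client interface are
// illustrative stand-ins, not the actual APIs of this package.
void trackObject(GazeSchedulerClient& client,
                 const std::function<std::optional<Eigen::Vector3f>(const std::string&)>& queryObjectPosition,
                 const std::string& objectName)
{
    while (true)
    {
        if (const auto position = queryObjectPosition(objectName))  // latest known position
        {
            GazeTarget target;
            target.name = "track_" + objectName;
            target.position = *position;
            target.frame = "Global";
            target.layer = GazeTargetLayer::TaskDriven;
            target.priority = 5.0f;
            target.duration = std::chrono::milliseconds(500);  // short: refreshed every cycle

            client.submitTarget(target);
        }
        std::this_thread::sleep_for(std::chrono::milliseconds(100));  // ~10 Hz update rate
    }
}
```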

Scenarios

How To

(prefer the doxygen documentation)

=> add link to howto's in documentation