Graduation Year


Document Type




Degree Granting Department

Mechanical Engineering

Major Professor

Rajiv Dubey


Hidden Markov Model, human in the loop, human robot interaction, machine learning, robotics, teleoperation


In this dissertation work, a methodology is proposed to enable a robot to identify the object to be grasped and its intended grasp configuration while a human is teleoperating the robot towards the desired object. Based on the detected object and grasp configuration, the human is assisted in the teleoperation task. The environment is unstructured and consists of a number of objects, each with various possible grasp configurations. The identification of the object and the grasp configuration is carried out in real time by recognizing the intention behind the human motion. Simultaneously, the human user is assisted in preshaping over the desired grasp configuration. This is done by scaling up the components of the remote arm end-effector motion that lead to the desired grasp configuration and simultaneously attenuating the components that are in perpendicular directions. The complete process occurs while the user manipulates the master device, without having to interact with another interface.
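The motion assistance described above can be illustrated with a minimal sketch. It assumes the commanded end-effector velocity is split into a component along a reference direction (pointing towards the desired grasp configuration) and a perpendicular remainder. The function name `assist_velocity` and the gain values are hypothetical and are not taken from the dissertation.

```python
import numpy as np

def assist_velocity(v_cmd, ref_dir, gain_along=1.5, gain_perp=0.3):
    """Scale a commanded velocity relative to a reference direction.

    v_cmd      : commanded end-effector translation velocity (3-vector)
    ref_dir    : direction towards the desired grasp configuration (3-vector)
    gain_along : illustrative gain (> 1) for the helpful component
    gain_perp  : illustrative gain (< 1) for the perpendicular component
    """
    d = np.asarray(ref_dir, dtype=float)
    d = d / np.linalg.norm(d)          # unit reference direction
    v = np.asarray(v_cmd, dtype=float)
    v_along = np.dot(v, d) * d         # component along the reference direction
    v_perp = v - v_along               # component perpendicular to it
    return gain_along * v_along + gain_perp * v_perp

# The x component (towards the goal) is amplified, the y component is damped.
assisted = assist_velocity([1.0, 1.0, 0.0], [1.0, 0.0, 0.0])
```

With both gains set to 1 the function returns the command unchanged, which corresponds to the unassisted case.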

Intention recognition from motion is carried out by using Hidden Markov Model (HMM) theory. First, the objects are classified based on their shapes. Then, the grasp configurations are preselected for each object class. The selection of grasp configurations is based on human knowledge of robust grasps for the various shapes. Next, an HMM for each object class is trained by having a skilled teleoperator perform repeated preshape trials over each grasp configuration of the object class under consideration. The grasp configurations are modeled as the states of each HMM, whereas the projections of the translation and orientation vectors onto each reference vector are modeled as observations. The reference vectors are the ideal translation and rotation trajectories that lead the remote arm end-effector towards a grasp configuration. During an actual grasping task performed by a novice or a skilled user, the trained model is used to detect their intention. The output probability of the HMM associated with each object in the environment is computed as the user is teleoperating towards the desired object. The object associated with the HMM that has the highest output probability is taken as the desired object. The most likely Viterbi state sequence of the selected HMM gives the desired grasp configuration. Since an HMM is associated with every object, objects can be shuffled around, added or removed from the environment without the need to retrain the models. In other words, the HMM for each object class needs to be trained only once by a skilled teleoperator.
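The two-step selection described above (pick the object whose HMM has the highest output probability, then read the grasp configuration from the Viterbi path) can be sketched as follows. This is only an illustration under stated assumptions: observations are assumed to be discretized into integer symbols, each model is a hypothetical tuple `(pi, A, B)` of initial, transition and emission probabilities, and the last state of the Viterbi path is taken as the current grasp configuration. None of these choices, nor the function names, come from the dissertation.

```python
import numpy as np

def forward_loglik(pi, A, B, obs):
    """Scaled forward algorithm: returns log P(obs | model)."""
    pi, A, B = (np.asarray(x, dtype=float) for x in (pi, A, B))
    alpha = pi * B[:, obs[0]]
    c = alpha.sum()
    loglik = np.log(c)
    alpha = alpha / c                      # rescale to avoid underflow
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
        c = alpha.sum()
        loglik += np.log(c)
        alpha = alpha / c
    return loglik

def viterbi(pi, A, B, obs):
    """Most likely state sequence, computed in the log domain."""
    pi, A, B = (np.asarray(x, dtype=float) for x in (pi, A, B))
    with np.errstate(divide="ignore"):     # log(0) = -inf is acceptable here
        log_pi, log_A, log_B = np.log(pi), np.log(A), np.log(B)
    delta = log_pi + log_B[:, obs[0]]
    back = []
    for o in obs[1:]:
        scores = delta[:, None] + log_A    # scores[i, j]: from state i to j
        back.append(scores.argmax(axis=0))
        delta = scores.max(axis=0) + log_B[:, o]
    path = [int(delta.argmax())]
    for bp in reversed(back):              # follow back-pointers
        path.append(int(bp[path[-1]]))
    return path[::-1]

def recognize(models, obs):
    """Return (object name, index of the inferred grasp configuration)."""
    best = max(models, key=lambda name: forward_loglik(*models[name], obs))
    return best, viterbi(*models[best], obs)[-1]

# Toy models with made-up probabilities: two grasp states, three symbols.
models = {
    "cup": ([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]],
            [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1]]),
    "plate": ([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]],
              [[0.1, 0.1, 0.8], [0.2, 0.2, 0.6]]),
}
obj, grasp = recognize(models, [1, 1, 1, 1])
```

Because each object carries its own model, adding or removing an object in this sketch only changes the `models` dictionary; no parameters are re-estimated.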

The intention recognition algorithm was validated by having novice users, as well as the skilled teleoperator, grasp objects with different grasp configurations from a dishwasher rack. Each object had various possible grasp configurations. The proposed algorithm was able to successfully detect the operator's intention and identify the object and the grasp configuration of interest. This methodology of grasping was also compared with unassisted mode and maximum-projection mode. In the unassisted mode, the operator teleoperated the arm without any assistance or intention recognition. In the maximum-projection mode, the maximum projection of the motion vectors was used to determine the intended object and the grasp configuration of interest. Six healthy individuals and one wheelchair-bound individual each executed twelve pick-and-place trials in intention-based assisted mode and unassisted mode. In these trials, they picked up utensils from the dishwasher and laid them on a table located next to it. The relative positions and orientations of the utensils were changed at the end of every third trial. It was observed that the subjects were able to pick and place the objects 51% faster and with fewer movements using the proposed method than with the unassisted method. They found it much easier to execute the task using the proposed method and experienced lower mental and overall workloads.

Two able-bodied subjects also executed three preshape trials over three objects in intention-based assisted mode and maximum-projection mode. For one of the subjects, the objects were shuffled at the end of the six trials and she was asked to carry out three more preshape trials in the two modes. This time, however, the subject was made to change her intention when she was about to preshape to the grasp configurations. It was observed that intention recognition was consistently accurate throughout the trajectory in the intention-based assisted method, except at a few points. However, in the maximum-projection method the intention recognition was consistently inaccurate and fluctuated. This often caused the subject to be assisted in the wrong directions and led to extreme frustration. The intention-based assisted method was faster and required fewer hand movements. The accuracy of the intention-based method did not change when the objects were shuffled. It was also shown that the model for intention recognition can be trained by a skilled teleoperator and be used by a novice user to efficiently execute a grasping task in teleoperation.
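For comparison, the maximum-projection baseline mentioned above can be read as choosing, at each instant, the candidate whose reference vector receives the largest projection of the current motion vector. The abstract does not spell out the exact rule, so the sketch below is one plausible interpretation; the function name and the `(object, grasp)` keys are hypothetical.

```python
import numpy as np

def max_projection_choice(motion, reference_vectors):
    """Pick the candidate with the largest projection of the motion vector.

    motion            : current end-effector motion vector (3-vector)
    reference_vectors : dict mapping an (object, grasp) key to the vector
                        that leads towards that grasp configuration
    """
    m = np.asarray(motion, dtype=float)
    best_key, best_proj = None, -np.inf
    for key, ref in reference_vectors.items():
        r = np.asarray(ref, dtype=float)
        proj = float(np.dot(m, r / np.linalg.norm(r)))  # scalar projection
        if proj > best_proj:
            best_key, best_proj = key, proj
    return best_key, best_proj

refs = {("cup", "handle"): [1.0, 0.0, 0.0], ("plate", "rim"): [0.0, 1.0, 0.0]}
first, _ = max_projection_choice([0.9, 0.2, 0.0], refs)
second, _ = max_projection_choice([0.2, 0.9, 0.0], refs)
```

In this reading the decision depends only on the instantaneous motion sample, so two consecutive samples can yield different answers. That is consistent with the fluctuation reported for this mode, whereas the HMM accumulates evidence over the whole observation sequence.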